I’ve noticed an increase in the number of projects using monorepos. Unfortunately, I see little discussion of their dangers or of how to deal with those dangers. Few people jump out of a plane without a parachute, or without training and guidance to go along with the parachute. But when it comes to monorepos, we all too often do the software equivalent.
In the Go community this is more prevalent than in some others. I think that comes from the influence of Google, which uses a monorepo for its proprietary code. Google has had many years to learn how to use them well (hopefully). Not everyone has the benefit of that.
It’s worth noting that I’m not telling anyone whether they should use monorepos or not. I’ve used them numerous times, though these days I tend to use them far less. These are the things I wish I’d known before using them. I just don’t want folks to walk in blind.
Loss of Version Information
So, you’re vendoring the dependencies you use. When your application is built and runs tests you don’t have to spend time pulling down external packages. That’s great.
What versions are the dependencies you’re using? Maybe you have the commit ids for the packages you use. Say you’re using version 1.2.3 of a package, but all you have is a commit id as a hash, and there’s a security issue. The advisory says versions prior to 1.3.0 are vulnerable but later ones are not. You have just a commit id. How do you know if you’re vulnerable and should update?
I’ve seen a number of monorepos where the details of imported libraries are lost. Without those details it can be hard to deal with updates around things like security issues.
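To make the point concrete, here is a minimal Go sketch of what keeping the version next to the commit id buys you. The `vendoredDep` type, the package path, and the commit value are hypothetical, and the comparison only handles plain MAJOR.MINOR.PATCH strings (no pre-release tags), so treat it as an illustration rather than a full semantic version parser.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// vendoredDep is a hypothetical manifest entry that records the release
// version alongside the commit id, instead of the commit id alone.
type vendoredDep struct {
	Path    string
	Version string // e.g. "1.2.3"
	Commit  string
}

// parseVersion splits a plain MAJOR.MINOR.PATCH string into integers.
// It deliberately does not handle pre-release or build metadata.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("expected MAJOR.MINOR.PATCH, got %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("bad component %q in %q: %w", p, v, err)
		}
		out[i] = n
	}
	return out, nil
}

// isBefore reports whether version v is older than version fixed.
func isBefore(v, fixed string) (bool, error) {
	a, err := parseVersion(v)
	if err != nil {
		return false, err
	}
	b, err := parseVersion(fixed)
	if err != nil {
		return false, err
	}
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i], nil
		}
	}
	return false, nil // equal versions are not "before"
}

func main() {
	// Hypothetical entry; the path and commit are made up for illustration.
	dep := vendoredDep{Path: "example.com/some/pkg", Version: "1.2.3", Commit: "0123abc"}
	vulnerable, err := isBefore(dep.Version, "1.3.0")
	if err != nil {
		fmt.Println("cannot compare:", err)
		return
	}
	fmt.Printf("%s %s (commit %s) vulnerable: %t\n", dep.Path, dep.Version, dep.Commit, vulnerable)
}
```

With only the commit id, there is nothing to feed into a check like `isBefore`. With the version recorded, the question "are we affected?" becomes a one-line comparison.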
Forking Packages
Those vendored libraries, stored in your VCS, are an opportune target. You’ve got a change that would be so nice to have, and it’s tempting to just make it in your vendored copy. Now you have what you want.
Sometime later someone else comes along and updates the version of that vendored library. That change is gone. How long until someone detects it? Now what?
Don’t think this will happen to you? I’ve seen it. More than once. Even with a documented process and patch files stored to re-apply after updates.
SDK to API Mapping
One of the common themes I’ve been seeing lately is that the SDK, REST library, or language binding lives in the same repo as the application it talks to.
For developers who work on both, this can be incredibly useful. But there are still a couple of potential pitfalls.
- When the SDK, REST library, or language binding is tightly coupled to the exact API version being served by the server in the same commit, you have a problem. The API should be versioned, and communication should happen based on that version rather than on any particular implementation.
- Many developers who consume the SDK, REST library, or language binding don’t want the server. Having to download it along with the client can be an annoyance. It’s something I’ve heard numerous times in “parking lot conversations”.
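As a rough illustration of the first point, the Go sketch below shows a client that talks to a named API version instead of to whatever the server happens to implement at the same commit. The `/v1/status` and `/v2/status` routes, the response bodies, and the `newServer` and `call` helpers are all assumptions made up for this example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newServer registers one handler per API version. Adding v2 leaves the
// v1 contract untouched, so clients pinned to v1 keep working.
func newServer() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/v1/status", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	})
	mux.HandleFunc("/v2/status", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, `{"status":"ok"}`)
	})
	return mux
}

// call issues a GET for the given API version and path directly against
// the handler (no network involved) and returns the status code and body.
func call(h http.Handler, apiVersion, path string) (int, string) {
	req := httptest.NewRequest(http.MethodGet, "/"+apiVersion+path, nil)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	server := newServer()

	// An SDK built against v1 depends only on the v1 contract.
	code, body := call(server, "v1", "/status")
	fmt.Println("v1:", code, body)

	// A newer SDK can opt in to v2 on its own schedule.
	code, body = call(server, "v2", "/status")
	fmt.Println("v2:", code, body)
}
```

When the client depends on a version like this, it no longer matters whether the SDK and server live in the same repo or were built from the same commit.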
Build And Test Bloat
Ever seen a build and test run take a long time? I’m talking an hour or longer. I’m aware of multiple monorepos where this is the case.
When a codebase is broken up into libraries, each can be tested independently, and only when it changes. When you have a monorepo, that is harder to do in practice. Folks will often build and test everything in one big shot. Now apply those large test runs to every push. The CI/CD process can become painful.
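One common way to soften this is to test only what a push actually touched. The Go sketch below shows the simplest version of that idea, under a big assumption: each top-level directory is an independent library. The `affectedDirs` helper and the file paths are hypothetical, and a real setup would also need to test everything that depends on a changed library, which this sketch does not track.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// affectedDirs maps changed file paths to the top-level directories that
// contain them, so only those directories need a build and test run.
// Assumption: each top-level directory is an independent library.
// Reverse dependencies are NOT considered here.
func affectedDirs(changed []string) []string {
	seen := map[string]bool{}
	for _, p := range changed {
		dir, _, found := strings.Cut(p, "/")
		if !found {
			continue // file at the repo root; not part of any library
		}
		seen[dir] = true
	}
	dirs := make([]string, 0, len(seen))
	for d := range seen {
		dirs = append(dirs, d)
	}
	sort.Strings(dirs)
	return dirs
}

func main() {
	// Hypothetical list of files changed by a push.
	changed := []string{"sdk/client.go", "sdk/client_test.go", "README.md"}
	for _, d := range affectedDirs(changed) {
		fmt.Printf("go test ./%s/...\n", d)
	}
}
```

Here a push that only touches the SDK results in a test run for the SDK alone, rather than for the whole repository.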
Conclusion
Know what you’re getting into with monorepos. These are just a few of the issues I’ve encountered. There are more. Monorepos aren’t a silver bullet. If you use them, I recommend being active and intentional about avoiding the pitfalls. Otherwise you could end up needing to untangle problems later.