FubuMVC Lessons Learned — Misadventures in DevOps with Ripple, Nuget, TeamCity, and Gems
tl;dr: Large .Net codebases are a challenge, strong naming in .Net is awful, Nuget out of the box breaks down when things are changing rapidly, and the csproj file format is problematic. We remedied some, but certainly not all, of these issues with a tool we built called Ripple.
At this point, we have collectively decided that FubuMVC did absolutely nothing right with regard to documentation, samples, getting started tutorials, and generally making it easier for new users to get going, so I’m going to automatically delete any comment deriding us yet again on these topics. We did, however, do a number of things in the purely technical realm that might still be of use to other folks and that’s what I’m concentrating on throughout the rest of these posts.
I got to speak at the MonkeySpace conference last year (had a blast, highly recommend it this year in Dublin, I’m hoping to go again) and one of my talks was about our experiences with dependency management across the FubuMVC projects. Just a sampling of the slides included “A Comedy of Errors”, “Merge Hell”, and “Strong Naming Woes.” To put it mildly, we’ve had some technical difficulties with Nuget, TeamCity, Git, our own Ripple tool, and .Net CLR mechanics in general that have caused me to use language that my grandmother would not approve of.
.Net Codebases Do Not Scale
There’s a line of thinking out there that says that you can happily use dynamic languages for smaller applications but when you get into bigger codebases you have to graduate to a grown-up, statically typed language. While I don’t have any firsthand knowledge about how well a Ruby on Rails or a Python Django codebase will scale in size, I can tell you that a .Net codebase can become a severe productivity problem as it becomes much larger. Visual Studio.Net grinds down, ReSharper gets slower, your build script gets slower, compile times get longer, and basically every single feedback mechanism that a good development team wants to use gets slower.
The FubuMVC ecosystem became very large within just a year of constant development. Take a brief glance at the sheer number of active projects we have going at the moment and the test counts.* The sheer size of the codebase became a problem for us and I felt like the slower build times were slowing down development.
Split up Large Codebases into Separate Cohesive Repositories
The main FubuMVC GitHub repository quickly became quite large and sluggish. If you’ll take a look at a very old tagged branch from that time, you can probably see why. The main FubuMVC.Core library was already getting big just by itself and the repository also included several related libraries and their associated testing libraries — and my consistent experience over the years has been that the number of projects in a solution to compile seems to make more difference in compile times than the raw number of lines of code.
The very obvious thing to do was to split off the ancillary libraries like FubuCore, FubuLocalization, and FubuValidation into their own git repositories. Great and all, but the next issue was that FubuMVC was dependent upon the now upstream build products from FubuCore and FubuLocalization. So what to do? The old way was to just check the FubuCore and FubuLocalization assemblies into the FubuMVC repository, but as I’ll discuss later, we found that to be problematic with git. More importantly, even though in a perfect world the upstream projects were stable and would never introduce breaking changes, we would absolutely need a quick way to migrate changes from upstream to test against the downstream FubuMVC codebase.
As part of the original effort to break up the codebases, Joshua Flanagan and I worked on a set of tools that we eventually named “Ripple”** to automate the flow of build products from upstream to downstream in cascading automated builds (by “cascading” I mean that a successful build of FubuCore would trigger a new CI build of FubuMVC using the latest FubuCore build products). Ripple originally worked in two modes. First, a “local” ripple that acted as a little bit of glue to build locally on your box, copy the build products to the right place in the downstream code, and run the downstream build to check for any breaking changes without having to push any code changes to GitHub. Second, a Nuget-based workflow that allowed us to consume the very latest Nuget version from the upstream builds in the downstream builds. More on Ripple below.
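To make “cascading” concrete, here is a minimal Python sketch of working out which downstream builds a successful upstream build should trigger, and in what order. This is not how Ripple or TeamCity actually implement it. The repository names come from this post, but the exact dependency edges (in particular FubuLocalization depending on FubuCore) and the `cascade` helper are assumptions made for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Assumed dependency edges: repository -> the upstream repositories it consumes.
DEPENDS_ON = {
    "FubuCore": set(),
    "FubuLocalization": {"FubuCore"},
    "FubuMVC": {"FubuCore", "FubuLocalization"},
}


def cascade(changed, depends_on=DEPENDS_ON):
    """List the downstream builds to trigger after `changed` builds, upstream first."""
    # Collect everything that directly or transitively depends on `changed`.
    affected = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for repo, deps in depends_on.items():
            if current in deps and repo not in affected:
                affected.add(repo)
                frontier.append(repo)

    # Order only the affected repositories so each one builds after its upstreams.
    subgraph = {repo: depends_on[repo] & affected for repo in affected}
    return list(TopologicalSorter(subgraph).static_order())


print(cascade("FubuCore"))
```

The ordering matters once there is more than one level: a downstream build that runs before its own upstream has finished rebuilding is testing against stale build products.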
Once this infrastructure was in place we were able to break the codebase into smaller, more cohesive codebases and reap the rewards of faster build times and smaller codebases. Or at least we would have been, if that new tooling hadn’t been so damn problematic, as I’ll describe in the sections below.
Thoughts on breaking up a codebase:
The following is a mixed bag of my thoughts on when and whether you should break up a large codebase. Unfortunately, there is no black and white answer. I’m still glad that we went through the effort of breaking up the main FubuMVC codebase, but in retrospect I would not have gone as far as we did in splitting up the main FubuMVC.Core library and I’ve actually partially reversed that trend for the 2.0 release.
- Don’t break things up before what would be the upstream library or package has a stable API
- Things that are tightly coupled and often need to change together should be built and released together
- You better have a good way to automate cascading builds in continuous integration across related codebases before you even attempt to split up a codebase
- It was helpful to pull out parts of the codebase that were relatively stable while isolating subsystems that were evolving much more quickly
- Sometimes breaking up a larger library into smaller, more cohesive libraries makes the functionality more discoverable. The FubuCore library for instance has support for command line tools, model binding, reflection helpers, and an implementation of a dependency graph. We theorized over the years that we should have broken up FubuCore to make it more obvious what the various functions were.
- Many .Net developers seem to be almost allergic to having more than a couple dependencies and we got feedback over the years that some folks really didn’t like how starting a new FubuMVC application required so many different libraries. The fact that Nuget put it all together for you was irrelevant. Unfortunately, I think this issue militates against getting too slap happy with dividing up your code repository and assemblies.
- It was a little challenging to do release management across so many different repositories. Even though we could conceivably release packages separately for upstream and downstream products, I usually ended up doing the releases together. I don’t have a great answer for this problem, and I no longer need one now that we’re shutting things down ;)
- Don’t attempt to build a very large codebase with a very small team, no matter how good or passionate said team is
Don’t put binaries in Git
Git does NOT like it when you check your binaries into source control the way that we used to in the pre-Nuget/Subversion days. We found this out the hard way when I was at Dovetail and we were rev’ing FubuMVC very hard while committing the FubuMVC binaries into our application’s git repository. The end result was that the Java (and yes, I see where the problem might have been) client that TeamCity used for Git just absolutely choked and flopped over with out-of-memory exceptions. It turned out that Jenkins *could* handle our git repository, but there’s still a very noticeable performance lag with git repos that have way too many revisions of binary dependencies.
Other git clients can handle the binaries, but there’s a very noticeable hit to Git’s near legendary performance moving from a repository with almost all text files to a codebase that commits its binaries *cough* MassTransit (at the time that I wrote this draft) *cough*.
Enter ripple restore for cascading builds
In the end, we built the “ripple restore” feature (Ripple’s analogue to Nuget Package Restore, but for the record, we built our feature before the Nuget team did and Ripple is significantly faster than Nuget’s;) to find and explode out Nuget dependencies declared inside the codebase at build time as a precursor to compiling in our rake scripts on either the CI server or a local developer box. We no longer have to commit any binaries to the repository that are delivered via Nuget and the impact on Git repository performance, especially for fresh clones, is very noticeable.
Ripple treats Nuget dependencies as either a “Fixed” dependency that is locked to a specific version or a “Float” dependency that is always going to resolve to the very latest published version. In the case of FubuMVC’s dependencies today, the internal FubuCore dependency is a “Float”, while the external dependencies like NUnit, Katana, and WebDriver are “Fixed.” When the FubuMVC build on our TeamCity server runs, it always runs against the very latest version of FubuCore. Moreover, we use the cascading build feature of TeamCity to trigger FubuMVC builds whenever a FubuCore build succeeds. This way we have very rapid feedback whenever an upstream change in FubuCore breaks something downstream in FubuMVC — and while I wish I could say that I was so good that that never happens, it certainly does.
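The “Fixed” versus “Float” distinction can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Ripple’s real code: the `resolve` function, the dictionary shape of a dependency, and the version numbers in `FEED` are all made up for the example.

```python
def parse_version(text):
    """Turn "1.2.0.10" into (1, 2, 0, 10) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def resolve(dependency, feed):
    """Pick the version to restore for one dependency.

    `dependency` is a dict with "name", "mode" ("Fixed" or "Float") and, for
    Fixed dependencies, "version". `feed` maps package names to the versions
    currently published. Both shapes are hypothetical.
    """
    published = feed[dependency["name"]]
    if dependency["mode"] == "Fixed":
        # A Fixed dependency is locked: use exactly the declared version.
        if dependency["version"] not in published:
            raise LookupError(f"{dependency['name']} {dependency['version']} is not on the feed")
        return dependency["version"]
    if dependency["mode"] == "Float":
        # A Float dependency always follows the newest published version.
        return max(published, key=parse_version)
    raise ValueError(f"Unknown mode: {dependency['mode']}")


# Illustrative feed contents only.
FEED = {
    "FubuCore": ["1.2.0.9", "1.2.0.10"],
    "NUnit": ["2.5.10", "2.6.3"],
}

print(resolve({"name": "FubuCore", "mode": "Float"}, FEED))
print(resolve({"name": "NUnit", "mode": "Fixed", "version": "2.5.10"}, FEED))
```

Note that the comparison is numeric rather than alphabetical. Comparing version strings as text would rank “1.2.0.9” above “1.2.0.10”, which is exactly the wrong answer for a floating dependency.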
Awesome, we’ve got cascading builds and a relatively quick feedback loop between our upstream and downstream builds. Except now we ran into some new problems.
Nuget and CsProj Merge Hell
Ironically, Phil Haack has a blog post out this week on how bad merge conflicts are in csproj files. I say ironically because the last time I saw Phil I was trying to argue with him that the way that Nuget embeds the package version number in the exploded file paths was a big mistake, specifically because it caused us no end of horrendous merge conflicts when we were updating Nugets rapidly. When we were doing development across repositories it wasn’t that uncommon for the same Nuget dependency to get updated in different feature branches, causing some of the worst merge conflicts you can possibly imagine in the csproj and Nuget Packages.config files.
Josh Arnold beat this issue permanently in Ripple 2.0 by using a very different workflow than Nuget out of the box. The first step was to eliminate the !@#$%ing version number in our Nuget /packages folder by moving the version requirement to the level of the codebase instead of being project by project (another flaw in OOTB Nuget in my opinion). Doing that meant that the csproj files would only change on Nuget package updates if the Nuget packages in question changed their own structure. Bang, a whole bunch of merge issues went away just like that.
The second thing we did was to eliminate the Packages.config Xml files in each project folder and replace it with a simple flat file analogue that listed each project’s dependencies in alphabetic order. That change also helped reduce the number of merge conflicts.
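Both changes are easier to see in a small Python sketch. The folder layout is simplified (real packages usually have a framework folder under lib), and the `hint_path` and `render_dependencies` helpers are hypothetical names for illustration, not Ripple’s actual file format or API.

```python
def hint_path(package, assembly, version=None):
    """Build the assembly path a csproj reference would point at.

    With a version, this mimics the stock layout where the version number is
    part of the folder name. Without one, it mimics a version-less folder.
    """
    folder = package if version is None else f"{package}.{version}"
    return "\\".join(["..", "packages", folder, "lib", assembly])


def render_dependencies(names):
    """Render a project's dependencies as a flat file: one name per line, sorted."""
    return "".join(f"{name}\n" for name in sorted(names, key=str.lower))


# Versioned folders: every package update rewrites the path inside the csproj,
# so two branches updating the same package both edit the same line.
print(hint_path("FubuCore", "FubuCore.dll", "1.0.0"))
print(hint_path("FubuCore", "FubuCore.dll", "1.0.1"))

# Version-less folder: the path is the same before and after an update.
print(hint_path("FubuCore", "FubuCore.dll"))

# Sorted flat file: the output does not depend on the order names were added.
print(render_dependencies(["NUnit", "FubuCore"]), end="")
```

With a stable path, a package update leaves the csproj untouched unless the package itself changes shape. With a sorted, one-name-per-line file, two branches that add different dependencies tend to touch different lines, which gives the merge far less to fight over.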
The end result was that we were able to move and revision faster and more effectively across multiple code repositories. I still think that was a very big win.
Let’s just say this bluntly: it’s a big anti-pattern to have any kind of central file that has to be changed frequently and simultaneously by multiple people doing what should be parallel work. Be it ORM mapping, IoC container configuration, the blasted csproj files, or some kind of routing table, it’s a hurtful design that causes project friction, and Xml makes it so much worse. I think Microsoft tools to this day do not take adequate precautions to avoid merge conflict problems (EF configuration, *.csproj files, Web.config).
TeamCity as a Nuget Server
It’s easy to use and quick to set up, but I’m recommending that you don’t use it for much. I feel like its performance didn’t hold up very well as the feed got bigger, and it would often “lose its Nugets,” forcing you to rebuild the Nuget index before the feed would work again. The rough thing was that the feed wouldn’t fail outright; it would just return very old results and cause plenty of havoc for our users that depended on the edge feed. To keep the performance at a decent level, we had to set up archive rules to delete all but, say, 10 versions of each Nuget. Deleting the old versions caused obvious trouble.
If I had to do it again, I would have opted for many fewer builds and made a point of treating builds that were triggered by cascading builds differently from builds that were triggered by commits to source control. As it is, we publish new Nuget packages every single time a CI build succeeds. In a better world we would have only published new Nugets on builds caused by changes to the source code repository.
Reproducibility of Builds with Floating Dependencies
The single worst thing we did that I’ve always regretted is not creating perfectly reproducible builds with our floating ripple dependencies. To make things fully reproducible, we would have needed to be able to build a specific version of a codebase by using the exact same versions of all of its dependencies that were used at the time of the CI build. I.e., when I try to work with FubuMVC #1200, we need it to be using the exact same version of FubuCore that was used inside the TeamCity CI build. We got somewhat close. Ripple would build a history digest of its dependencies and have that published to TeamCity’s artifacts, but we were archiving the artifacts to keep the build server running faster. We also set up tagging on successful builds to add the build number to the GitHub repositories after successful builds (that’s an old part of the Extreme Programming playbook too, but we didn’t get that going upfront and I really wish we had). What we probably needed was some additional functionality in Ripple to take our published dependency history and completely rebuild everything back to the way we needed it. I’m still not exactly sure what we should have done to alleviate this issue.
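One possible shape for that missing functionality, sketched in Python: record what every floating dependency resolved to during a build, then turn that record back into locked versions later. This is a thought experiment, not something Ripple did. The JSON layout, the function names, and the version numbers are all assumptions, and the digest would have to be stored somewhere that never gets archived (for example next to the build tag).

```python
import json


def write_digest(build_number, resolved):
    """Serialize the versions a build actually used, keyed by package name."""
    return json.dumps(
        {"build": build_number, "dependencies": dict(sorted(resolved.items()))},
        indent=2,
    )


def pin_from_digest(digest_text):
    """Turn a stored digest into locked dependencies for rebuilding that exact build."""
    digest = json.loads(digest_text)
    return [
        {"name": name, "mode": "Fixed", "version": version}
        for name, version in digest["dependencies"].items()
    ]


# Hypothetical record of what the floating dependencies resolved to in build 1200.
digest = write_digest(1200, {"FubuLocalization": "0.9.5.3", "FubuCore": "1.2.0.10"})

for dependency in pin_from_digest(digest):
    print(dependency)
```

The important property is that a float gets frozen at the moment it is resolved, so reproducing an old build never has to ask the feed what “latest” meant at the time.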
This was mostly an issue for teams that wanted to try to reproduce problems with older versions of FubuMVC that were well behind the current edge version. One of the things that I think helped sink FubuMVC was that so many of the teams that were very active in our community early on stopped contributing and being involved and we were stuck trying to support lots of old versions even while we were trying to push to the magic SemVer 1.0 release.
Nuget vs. gems vs. git submodules for build time dependencies
In my last post in this series I got plenty of abuse from people who think that having to install Ruby and learn how to type “rake” at the command prompt was too big of a barrier for .Net developers (that was sarcasm by the way). Some commenters thought that we should have been using absolutely nothing but Nuget to fulfill build time dependencies on tools that we used within the automated builds themselves. There’s just one little problem with that understandable ideal: Nuget was a terrible fit for command line executables within an automated build.
We use a couple different command line tools from within our Rake scripts:
- Ripple for our equivalent of Nuget package restore and publishing build products as Nuget packages
- FubuDocs for publishing documentation (and yes, it’s occurred to me many times that I spent much more time creating a new tool for publishing technical documentation than I did writing docs but we all know which activity was much more fun to do)
Yet again, having the package version number as part of the Nuget package folder made using command line tools resolved by Nuget a minor nightmare. We had an awkward Ruby function that could magically determine the newest version of the Nuget package to find the right path to the bottles.exe/ripple.exe/fubudocs.exe tools. In retrospect, we could have used the Nugets as is to continue distributing the executables after Ripple 2.0 fixed the predictable Nuget path problem, but we also wanted to be able to use these tools directly from the command line.
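For a sense of what that lookup involves, here is a rough Python equivalent. It is not the original Ruby function. The `newest_tool_path` name, the assumption that the executable sits in a tools folder inside the package, and the decision to ignore prerelease suffixes are all choices made for this sketch.

```python
import re
from pathlib import Path


def newest_tool_path(packages_dir, package, exe):
    """Find `exe` inside the highest-versioned `<package>.<version>` folder."""
    # Match only folders named exactly "<package>.<numeric version>".
    # Prerelease suffixes such as "-alpha" are deliberately not handled here.
    pattern = re.compile(rf"^{re.escape(package)}\.(\d+(?:\.\d+)*)$", re.IGNORECASE)

    candidates = []
    for folder in Path(packages_dir).iterdir():
        match = pattern.match(folder.name)
        if folder.is_dir() and match:
            # Compare versions as tuples of ints so 1.10.0 beats 1.9.0.
            version = tuple(int(part) for part in match.group(1).split("."))
            candidates.append((version, folder))

    if not candidates:
        raise FileNotFoundError(f"No {package} package found under {packages_dir}")

    _, newest = max(candidates, key=lambda candidate: candidate[0])
    return newest / "tools" / exe
```

A build script would call something like `newest_tool_path("packages", "Ripple", "ripple.exe")` before shelling out to the tool. Once the folder name no longer contains a version, all of this collapses into a fixed path.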
As it turned out, using Ruby gems to distribute and install our .Net executables was much more effective than Nuget was. For one, gems is well integrated with Rake which we were already using. Gems also has the ability to place a shim for an executable onto your Windows PATH, making our custom tools easier to use at the command line.
And yes, we could have used Chocolatey to distribute our executables, but at one point we were much more invested in making our ecosystem be cross platform and Chocolatey is strictly Windows only where gems is happily cross-platform. Because Rob Reynolds is just that awesome, you can actually use Chocolatey to install our gems and it’ll even install Ruby for you if it’s not already there.
And yeah, in the very early days because FubuMVC actually predates Nuget, we tried to distribute shared build utilities via Git submodules. The less said about this approach, the better. I’ll never willingly do that again.
Topics for Another Day:
I’m trying to keep my newfound blogging resurgence going, but we’ll see how it goes. The stuff below got cut from this post for length:
- Why semantic versioning is so important
- My recommendations for improving Nuget (the Nuget team is asking the Ripple team for input and I’m trying to oblige)
- One really long rant about how strong naming is completely broken
- Why and how we think Ripple improves upon Nuget
- Branch by feature across multiple repositories with Ripple
* For my money, the number of unit tests is my favorite metric for judging how large and complicated a codebase is, but only measured in the large. Like all other metrics, I’d take this one with a grain of salt, and it’s probably also useless if the developers know that the unit test count is being used as a metric.
** Get it? Changes “ripple” from one repo to another. I love the song “Ripple” by the Grateful Dead and it’s also pretty likely that I was listening to that song the day we came up with the name.