Communication, Teams

Please, break the build!

You know the feeling when you hit the button and push the code to the server. You’re done! You’ve implemented a new feature that’s going to be ready for the end-user. Soon. You see that the code has reached the build machine, and a new build gets scheduled. You get up to take a quick break and end up chatting with a few colleagues on the way. When you get back you see there are 10 messages in your inbox and the same number of notifications on your team chat.

Looking around, you see that the scheduled build has failed. And now there are 5 other scheduled builds that ended up failing. It looks like you’ve just broken the build. The rest of the team of 20 developers are now blocked from committing their code.

You sit down feeling…

Now, most developers who have been in any decent-sized team will recognise the scenario above. Perhaps they’ve even gone as far as pushing code as the last thing they did before leaving for home? The question is, in what state of mind is the developer? Have they just committed a cardinal sin within the team, or is this just business as usual?

Let’s explore the reactions our developer could have and how that reflects on their team.

…scared! aka “Don’t break the build!”

In some cases our developer would freak out. They know that all those messages are people “shouting” over electronic channels. Perhaps there’s a silly hat waiting for them the next day, indicating that they broke the build and need to be ridiculed about it.

When there are fingers pointed at the developer every time, new behaviours may emerge. A sense of fear. Fear of being pointed out as the build breaker, again. Fear of looking like “less” in front of the others on the team. Fear that the build statistics will find their way back to the performance review.

Fear has a way of finding its way back into the work, and of course it affects the developer in question negatively.

On the more positive side, having a strong regime for code pushes and deployments will ensure that the build pipeline is green most of the time, which in turn means there will be a clear path to production for a hotfix.

…indifferent aka “Oh, the build server is always broken”

At the other extreme you have a situation where the developer actually doesn’t have that many notifications on their machine. Nobody seems to care that the build is broken, and even though it’s red, no one is doing anything about it. They’re actually piling up more and more work for the server to churn through.

A situation like this indicates a general attitude of not caring for the build pipeline, and probably a low sense of quality in the software itself. Because the build server is “always” broken, a red build is treated like the boy who cried “Wolf!”: something to be ignored.

There is no real positive side here, except that it’s easy to merge your work onto the trunk to be sent to the build server. The bad part is that the code is probably bug-ridden, since developers don’t seem to be maintaining functioning tests, or it’s just hard to run tests on the build server. There also doesn’t seem to be a culture of learning and adapting based on failures.

…determined aka “Please, break the build (then fix it)”

In this scenario, our developer looks at the requests on their monitor and realises that the last code push has broken the build. It looks like it was a broken test. Our developer jumps into the team chat and announces that the build is broken, and that they are on the case to unblock it.

In this team the developer didn’t fear being shamed, nor did they feel the urge to just force-push the code anyway and hope someone else would fix it. A developer is expected to have a proactive mindset, follow up on their build and take responsibility if it doesn’t run green. If that doesn’t happen then another developer can revert the offending commit, allowing others to deploy safely. Then the fix can be made without the stress of knowing the entire team is waiting for them.
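The expectation described above can be sketched as a tiny decision rule. This is only an illustration of the team agreement, not a real CI integration: the names BuildStatus, Action and next_action are hypothetical, and how a team learns that the author has "claimed" the fix (a chat message, a ticket, a flag on the build) is an assumption left open here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    NOTHING = "build is green, nothing to do"
    AUTHOR_FIXES = "author has announced they are fixing the build"
    REVERT = "someone else reverts the offending commit"


@dataclass
class BuildStatus:
    # Hypothetical snapshot of the pipeline; field names are illustrative only.
    green: bool
    offending_commit: Optional[str] = None
    claimed_by_author: bool = False


def next_action(status: BuildStatus) -> Action:
    """Return what the team agreement says should happen next."""
    if status.green:
        return Action.NOTHING
    if status.claimed_by_author:
        # The author followed up on their build and took responsibility.
        return Action.AUTHOR_FIXES
    # Nobody has claimed the red build: unblock the team first, fix later.
    return Action.REVERT
```

The point of the rule is its order: unblocking the team comes before finding the fix, so a revert by a colleague is a routine step rather than a reprimand.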

A team with this attitude can more easily focus on the “why”. This means they can learn from the errors occurring, and make the needed changes to the process of the team to compensate.

So…

Each of these team cultures produces completely different results.

A strict team with low fault tolerance can lead to a lack of experimentation and a focus on always making sure that your code will not break the build. This could lead to an increase in quality, but perhaps also a decrease in morale and creativity?

A careless team could lead to a flawed approach to code quality, where the build server serves as the second compiler, and sometimes also the first. Developers are indifferent and don’t care about the build. This can lead to many packages going out to customers with bugs. Bugs that could have been caught on the build server.

A team with an emphasis on safety hits a sweet spot where developers know the importance of keeping the build green. They also know it’s a safety harness that catches the mistakes developers make. A team with this mentality has the chance to help each other and make sure the end users get a product they can enjoy.

Rounding up

Safety within a team allows the developers to have real conversations. Conversations about improving and adapting. But safety isn’t a prerequisite for a good build setup. Any of the teams above can have a good build setup and still lack confidence in the product they’re shipping, or lack people who trust each other.

There are many variations though, and the real world is a lot more nuanced than can be depicted in this article. I’m still a fan of breaking the build one time too many rather than one time too few. At the end of the day it’s all about feedback loops and getting answers as close to the time the code was written as possible. Most importantly, it’s about learning and adjusting your team and process. The build warning is just one of many signs.

What’s your approach towards breaking the build? Yay! Nay! Or just Meh? Please leave your thoughts and perspectives in the comments or reach out to me directly.

 

  • Being able to build locally in the same way as you do on the CI server is the best tool to prevent broken builds. This is what we use in many projects.

    • Yes, that’s a great way to have a sense of stability. Yet at the same time, it also requires having a fast build / test-suite.

      But then there are certain jobs a build server does that usually aren’t run on a dev machine, like creating artefacts, running integration / smoke / API tests, creating the package and also a deployment, if that’s part of the build.

      Are those also part of what gets run locally?

      • Cross-component tests are usually not run on the CI build but rather during the nightly build, along with the deployment. If there’s a good level of isolation, all tests can be run locally. Of course there’ll be a risk of breaking contracts, but this is more about discipline than technology. More often than not the build is broken far before the integration part.

        • An interesting discussion turned up on Twitter: https://twitter.com/rickasaurus/status/821979998575874048

          Have you tried building pre-merge and post-merge-pre-sync? So basically you can’t push code that doesn’t merge without a successful build / passing tests.

          • We use Bamboo and there it is technically possible. We can have builds created per branch automatically and then we can enable the merge check. TeamCity can also do gated merge builds. But we do not use it. We still fight over packages versus project references; this drives me mad.

          • Check this by Mathias Verraes: http://verraes.net/2016/04/code-reviews-and-blame-culture/. He writes about pull requests. We were discussing these and I personally believe this is the way to go. Others are concerned about people being blamed by the one who is wearing the police hat. But this article states what I assumed. As soon as the merge (and “breaking the build”) becomes a shared responsibility, the blame culture can be decreased. Yes, one wrote something that breaks the build, but the others let it through.

  • Since we push constantly it’s not as big of a deal when our build fails, and the failure is usually caught on CI before we even merge to master. However, this post still applies. Sometimes I find myself feeling guilty when I have to go back over a branch and rework it until it passes for the merge. I will try to bring a more positive mindset about it though.

    For my non-dev colleagues I just point out that the break caught an issue that would have otherwise been discovered in production.

    Great post