For Cloud-Native Applications, Testing In Production Goes From Punchline To Best Practice

As software takes over more of IT, developers are taking ownership of more of the software delivery process, and they are moving faster. In this shifting reality of increasing velocity and complexity, more software can mean more risk. But beyond that linear increase, the very nature of risk is changing.

Until very recently, developers worked on monolithic applications, which kept the scope and rate of change manageable. When developers have a vision of the desired state, they can test for it in the place where changes are validated; for most of the past two decades, that place has been the Continuous Integration/Continuous Delivery (CI/CD) pipeline.


Microservices increase complexity by shifting change validation from the pipeline into the network: if my application is made up of 100 services, the pipeline won’t build and test them all at the same time. Inherent to both the concept and the benefit of microservices, some services will be going through the pipeline while most others are in a different state, from being rewritten to running in production. In this situation, a single desired state can be a phantom, so we need a new testing paradigm.
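To make the point concrete, here is a toy sketch (the service names and states are hypothetical, not drawn from the article) of why only a sliver of the fleet is ever validated as a unit at any moment:

```python
# Toy model of a microservices fleet: each service advances through
# delivery independently, so the system as a whole never matches one
# frozen "desired state".
fleet_state = {
    "checkout": "in_pipeline",     # being built and tested right now
    "payments": "in_production",
    "search": "being_rewritten",
    "inventory": "in_production",
}

def services_validated_together(fleet: dict) -> list:
    """Only the services currently in the pipeline are validated as a
    unit; everything else runs whatever version production happens to
    have at that moment."""
    return sorted(name for name, state in fleet.items()
                  if state == "in_pipeline")

print(services_validated_together(fleet_state))  # only "checkout"
```

A change to "checkout" is therefore validated against whatever versions its neighbours happen to be running, not against a snapshot of the whole system.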

From punchline to best practice

It used to be the case that to scare off a nosy manager, developers would use the phrase “we test in production”. This was a concept so absurd it couldn’t possibly be taken literally, and would be sure to end the conversation. Now, with cloud-native applications, testing in production can be seen as the least scary option of them all.

When there is a constant generation of change that still needs validation, production environments become the main place where a never-ending series of ‘desired states’ exist. With the right tools for observability and for quick reaction to issues, risk could be managed here. (Assuming, of course, that I have the right people and processes in place first.)
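One common mechanism for managing that risk is a percentage-based rollout gate, often implemented with feature flags. The sketch below is a minimal, hypothetical illustration, not any specific vendor's API: a change is exposed to a deterministic slice of users, and the slice is widened only while the observability tooling shows the system staying healthy.

```python
import hashlib

def is_enabled(user_id: str, rollout_percent: int) -> bool:
    """Deterministically bucket a user into [0, 100) so the same user
    always sees the same variant while a change rolls out."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return bucket < rollout_percent

# At 0% nobody sees the change; at 100% everyone does. Rolling back is
# just setting the percentage back to zero: no redeploy required.
assert not is_enabled("user-42", 0)
assert is_enabled("user-42", 100)
```

The hash makes exposure stable per user across requests, which matters when you are watching production metrics to decide whether to widen or roll back.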

Companies like Honeycomb herald this approach. When dealing with complex systems, they argue, most problems result from the convergence of numerous simultaneous failures. Honeycomb therefore focuses on an observable state, in order to understand exactly what is happening at any given moment.

This is observing state, not validating change. A different approach comes from where we validated change before: the CI/CD pipeline.


Putting change and state together

In an eBook published this week, Rob Zuber, CTO of developer tools company CircleCI, discusses the role of CI/CD in testing microservices.

As more and more companies outside of the tech industry come to see software as a competitive differentiator, Zuber sees greater use of third-party services and tools, a proliferation of microservices architectures, and larger data sets.

In practice, Zuber claims, breaking change into smaller increments can look like deploying multiple changes, something that CI/CD tools naturally handle well. Being able to evolve your production environment quickly and with confidence is more important than ever, and the key is using a combination of tools that validate change and tools that observe state.

CircleCI itself, says Zuber, uses its own technology for CI/CD, plus other tools to tie change and state validation together: crash reporting, observability, and centralised logging are examples. At the center of this stack sits the CI/CD tool, which in their view is better placed for the role than source control tools or operational systems.

A full view of the software delivery cycle

Both approaches seem to agree that neither observing state nor validating change is enough on its own: good coding practices, source control management, and solid deployment practices are all crucial parts of the same picture. Together, these elements can lead to better cost and risk decisions, more robust code, and better products. Testing is another area of software delivery being utterly transformed by cloud-native technologies and open source, and it should be examined with new realities and new risks in mind.

(Originally posted on