You Can't Test All of Your Serverless Application, but You Should Monitor It. Here's Why.

If you're not monitoring everything, good luck!

Mar 29, 2022

This is the fifth and last post in the series, “5 Strategies for Serverless Testing”. Check out the first, second, third, and fourth posts. If you like this content, feel free to share and subscribe!

With distributed systems, rigorously testing everything is not always possible. That’s why monitoring everything is key to understanding, and then addressing, the weakest links of the whole platform.

1. Monitoring is a crucial way to ensure the quality of your application

Monitoring an application is what allows you to keep an eye on (and improve upon):

application metrics (for example, how many payments are performed every hour, or how many successful signups happen every day)
system metrics (for example, deployments, scaling up/down, or configuration changes)
platform metrics (for example, how many queries are run against the database every hour, or the average response time of a given Lambda)

Application metrics are about how your platform behaves and is perceived by the end-user. System metrics are about outside forces and events that may or may not have an impact on performance. Platform metrics are about the quality of your infrastructure.
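To make the three categories a bit more concrete, here is a minimal sketch of how a Lambda function could publish a metric by printing a structured log line in the CloudWatch Embedded Metric Format. The namespace `ExampleShop`, the `Service` dimension, and the metric names are assumptions made up for illustration; they are not taken from any real system.

```python
import json
import time


def build_emf_record(namespace, service, metric_name, value, unit="Count"):
    """Build one Embedded Metric Format (EMF) record as a plain dict."""
    return {
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # milliseconds since epoch
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    "Dimensions": [["Service"]],
                    "Metrics": [{"Name": metric_name, "Unit": unit}],
                }
            ],
        },
        # The dimension and metric values live at the top level of the record.
        "Service": service,
        metric_name: value,
    }


def emit_metric(namespace, service, metric_name, value, unit="Count"):
    """Print the record as one JSON line and return it for inspection."""
    record = build_emf_record(namespace, service, metric_name, value, unit)
    print(json.dumps(record))
    return record


# Application metric (hypothetical name): one completed payment.
emit_metric("ExampleShop", "payments", "PaymentsCompleted", 1)

# Platform metric (hypothetical name): how long a handler took.
emit_metric("ExampleShop", "payments", "HandlerDuration", 182.5, unit="Milliseconds")
```

Printing to standard output is enough inside Lambda because the logs end up in CloudWatch Logs, where records in this format can be turned into metrics. Outside Lambda you would need an agent or a direct API call instead.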

At a fundamental level, monitoring is the key to ensuring your standards are consistently maintained in those three areas. This holds regardless of whether you run a monolith or a distributed system such as microservices.

If you don’t know what’s going on in your application, you can’t make meaningful improvements. On the other hand, if you can measure how long it takes to do X, or how many customers get to complete flow Y every day, then you can set out a goal to improve on those numbers (if they are not satisfactory).
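As a rough illustration of "measuring how long it takes to do X", here is a small sketch of a timing helper. The `timed` context manager, the in-memory `durations_ms` store, and the `checkout_flow` label are all hypothetical; a real setup would ship these numbers to a monitoring backend rather than keep them in a dict.

```python
import time
from contextlib import contextmanager

# In-memory store of samples, keyed by operation name (illustration only).
durations_ms = {}


@contextmanager
def timed(name, store=durations_ms):
    """Record how long the wrapped block takes, in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the block raises, so failed attempts are measured too.
        elapsed_ms = (time.perf_counter() - start) * 1000
        store.setdefault(name, []).append(elapsed_ms)


def average_ms(name, store=durations_ms):
    """Return the mean duration recorded for one operation."""
    samples = store[name]
    return sum(samples) / len(samples)


# Usage: wrap the operation you care about.
with timed("checkout_flow"):
    time.sleep(0.01)  # stand-in for real work

print(f"checkout_flow average: {average_ms('checkout_flow'):.1f} ms")
```

Once a number like this exists, you can track it over time and decide whether it needs improving.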

2. Distributed systems require greater and more specific forms of monitoring

With a distributed system such as a microservice architecture, things get even more complex. The intuitively obvious reason for this increased complexity is that you now have many more moving parts.

If a single function or container fails to respond, the whole application may be in trouble. A slow response time may be due to just one misconfigured (or under-provisioned) microservice.

With a medium-complexity monolith, you can make blanket statements such as “our application is too slow”; the solution to that might just be to add some caching, increase the size of the instance where the code is running, or scale up the database.

With microservices (and serverless), blanket statements like that don’t usually work or mean much. You need to be more specific. The solution to your performance issues is rarely going to be to just scale up all your DynamoDB tables or increase the size of all your Lambda functions.

As your system increases in size and complexity, you want to be clear and precise about:

what exactly the issue is, and
how you can address it without being wasteful

Scaling up all your services can be expensive, so you need to know exactly where the pressure points are.
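One way to find those pressure points is to compare a high percentile of each service's latency against a budget, instead of looking at a single platform-wide average. The sketch below assumes you have already collected latency samples per service; the service names, the sample values, and the 500 ms budget are invented for the example.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def find_pressure_points(latencies_by_service, threshold_ms, pct=95):
    """Return the services whose chosen percentile exceeds the budget."""
    slow = {}
    for service, samples in latencies_by_service.items():
        value = percentile(samples, pct)
        if value > threshold_ms:
            slow[service] = value
    return slow


# Hypothetical latency samples in milliseconds, one list per service.
latencies = {
    "signup-lambda": [40, 42, 45, 47, 50, 52, 55, 58, 60, 61],
    "payments-lambda": [120, 130, 150, 900, 950, 1000, 1100, 1200, 1300, 1400],
    "email-lambda": [20, 22, 25, 25, 27, 30, 31, 33, 35, 36],
}

print(find_pressure_points(latencies, threshold_ms=500))
# prints {'payments-lambda': 1400}
```

Here only one of the three services is over budget, so that is the one worth tuning or scaling; the other two can stay as they are.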

3. Not everything can be tested, but everything can be caught!

We have talked before about the major gains that can be achieved by switching your focus from mostly unit tests to mostly integration tests.

The reality, though, is that some things are very hard to write integration tests for. The same applies to end-to-end tests. This is especially true if third-party APIs or components are involved.

There is a deeper conversation to be had here as to what can be done in those difficult-to-test scenarios. On a case-by-case basis, it may just make sense for you to defer putting tests in place for those areas of the system.

Regardless, the one thing you cannot afford to postpone is a clear view of what is (and isn’t) happening. Everything can be caught, and if a problem is caught early enough you may still be able to do something about it and diminish its impact on the end-user.
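For a hard-to-test dependency such as a third-party API, one possible approach is to record the outcome of every call and raise a flag when the recent error rate crosses a threshold. This is only a sketch under stated assumptions: `ErrorRateMonitor`, `call_with_monitoring`, and `flaky_provider` are hypothetical names, and in practice a managed alarm on a metric would usually play this role.

```python
from collections import deque


class ErrorRateMonitor:
    """Track the error rate over the most recent calls."""

    def __init__(self, window_size=20, max_error_rate=0.2):
        self.outcomes = deque(maxlen=window_size)  # True means success
        self.max_error_rate = max_error_rate

    def record(self, succeeded):
        self.outcomes.append(bool(succeeded))

    def error_rate(self):
        if not self.outcomes:
            return 0.0
        failures = sum(1 for ok in self.outcomes if not ok)
        return failures / len(self.outcomes)

    def should_alert(self):
        return self.error_rate() > self.max_error_rate


def call_with_monitoring(monitor, func, *args, **kwargs):
    """Call func, record whether it succeeded, and re-raise any failure."""
    try:
        result = func(*args, **kwargs)
    except Exception:
        monitor.record(False)
        raise
    monitor.record(True)
    return result


def flaky_provider(should_fail):
    """Stand-in for a third-party API that sometimes fails."""
    if should_fail:
        raise RuntimeError("provider unavailable")
    return "ok"


monitor = ErrorRateMonitor(window_size=10, max_error_rate=0.2)
for should_fail in [False, False, True, False, True, True, False, False, False, False]:
    try:
        call_with_monitoring(monitor, flaky_provider, should_fail)
    except RuntimeError:
        pass  # the failure has already been recorded by the monitor

print(monitor.error_rate(), monitor.should_alert())
# prints 0.3 True
```

The integration itself is still untested, but a spike in failures becomes visible quickly enough to react, for example by retrying later or switching to a fallback.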

The takeaway

Monitoring is a crucial part of ensuring that your desired standards are maintained across the various areas of your system.
Distributed systems require much more granular and sophisticated forms of monitoring.
Testing everything may sometimes be too expensive or downright impossible, but good monitoring can always be done and can help you deliver an acceptable experience to your customers.

The Serverless Mindset
