Deep Dive: Learning About Your API Behavior on K8s
This is a guest post by Bruno Félix. Bruno is CTO at Waterdog, a digital consultancy company.
“Tests are green, everything looks good, I’ll just push this change and quickly grab something to eat before starting the next task.” Twenty minutes later, Slack lights up with reports of several services experiencing issues. “Uh-oh, this is not good”, and it wasn’t…
In this post, I’ll share an all-too-common story about how a typo almost got the better of us, how “free” API specs could have prevented headache and heartache, and how to set up a tool by a company called Akita to do this. This post explores how to generate and compare OpenAPI specs for a service just by analyzing the network traffic, without touching the application’s code. While it’s possible to run Akita in CI/CD, staging, or production, we’ll focus on integrating with Kubernetes, the most popular container orchestration system in use today, for staging and production deployments.
And if this sounds like something you want to try, sign up for the Akita beta.
How a typo almost got the better of us
At the time, we were working with a client in the financial services industry. The team was two days away from an important milestone, which for organizational reasons (AKA politics) could not be postponed. Things were proceeding at a brisk pace, and despite some delays, we were confident we could meet the deadline: the remaining work was relatively straightforward, so it ought to be a smooth ride.
Having a massive dumpster fire at this stage was a nasty surprise, and it gave everyone pause. What could it be? Of the many changes we had pushed recently, which one triggered this? What changed? Are the services healthy, or is something crashing? Is a third-party service misbehaving?
After going through the logs of the different services, we narrowed the range of possible explanations down to API changes in the internal services. Digging around some more, we finally concluded that the cause was a relatively minor change in the service containing the “reference information” for the financial products we were handling - the change I had pushed before grabbing something to eat.
Sometimes it’s the little things that get you. In this case the change was relatively trivial: renaming a field to fix a typo and adding a new field. That was enough to halt our progress for the better part of a day, two days away from an unmovable deadline featuring a demo to key stakeholders.
I’m sure a lot of people reading this have faced similar situations. Despite our best efforts, and our attempts at being disciplined in our practice, our process and tools did not catch a trivial mistake. The fact is that many of the tools we have - IDEs, typecheckers, debuggers, static analysis tools - are not geared towards this brave new world where we’re all building distributed systems. We need new tools, and we needed them yesterday!
Akita is part of this new generation of tools that embraces the kinds of software a lot of us are building: services that expose APIs, communicating via network protocols, and featuring complex interaction patterns, where the behaviour of the system is more than the sum of its parts.
Specs for free
Specifications are valuable technical artifacts that can play a pivotal role in establishing common ground, across individuals and teams, about the capabilities a system provides. That said, developers often adopt specs only grudgingly, because they can be expensive to create and maintain: they are prone to becoming outdated, and keeping everything in sync takes considerable effort.
This state of affairs is even worse in the world of “legacy” software. Especially in sectors outside software (e.g. manufacturing, transportation, etc.), it’s not uncommon to find organizations relying on systems where the existing API documentation hasn’t been updated since the system was originally developed, and where the people who implemented it have long since moved on.
Fortunately the API tooling landscape is changing, and Akita gives you the capability of inferring API models from network traffic. Akita API models contain information about endpoints, data types, and data formats—and are stored as annotated API specs. Generating API models is extremely valuable for a number of reasons:
- Map out the scope and structure of your APIs, without touching the application code, when the only artifacts that exist are old or outdated specs and documentation
- Detect drift between implementation and specification
- Detect regressions across different versions of an implementation
- Generate API specs that you can use for client generation, documentation, and more
Detecting drift is one of the reasons why I’m excited about Akita. At Waterdog we’ve seen several situations where one team introduces some trivial change, but forgets to update the specification, and this mismatch between spec and implementation cascades and affects other teams consuming that service. This is especially frustrating if teams or team members are spread around different time zones. Having to wait for a team four or five hours away from you to come online, so you can make progress on a given task is not only a waste, but also an incredible source of frustration, especially if you’re hard pressed to deliver results.
In the next section, we’re going to do a deep dive on how to set up Akita in a non-trivial scenario, hopefully closer to what can be encountered in production. The application under test is a small web application that exposes a minimal REST API, and depends on a database and another service. For the demo, we will run this setup in a Kubernetes cluster and generate an OpenAPI spec without touching any of the application code. Let’s get started!
Technical deep dive
Requirements
To follow along, we recommend you have access to a Kubernetes cluster. There are several options available if you want to run one locally:
- Minikube: https://minikube.sigs.k8s.io/docs/start/
- Microk8s: https://microk8s.io/
You’ll need to set up your Akita account (unsurprisingly), so please make sure you have joined the beta, and have access to Akita UI.
Also make sure to download the Akita CLI.
Running in Kubernetes
With this out of the way, let’s dive into the service we are going to use, and infer some APIs!
The system under test
In this example we tried to create a toy system that matches some of the things you’ll find in a more realistic scenario. We’ve built a simple todo service that exposes a REST API, is backed by Redis (for persistent storage), and collaborates with another internal service, the statistics service, via its REST API to keep track of some statistics.
You can find the code in this repo, so go ahead and clone it.
The example is composed of a `src` folder, where we can find the code for both the todo service (`todo-service.py`) and the statistics service (`statistics-service.py`), and a `service.yaml` file with our Kubernetes setup. The `service-with-akita.yaml` file contains the finalized example for your reference.
On the `service.yaml` file we can find the configurations for the various elements in the system:
- Redis: the service and the stateful set
- Statistics service: the service and deployment
- Todo service: config map, service, and deployment
Caveat emptor: Do yourself a favor and don’t even try to use this Redis setup for any kind of production work!
Running the example
Let’s go ahead and run this in our cluster with the following command:
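```shell
# Create the Redis, statistics, and todo services defined in the manifest
kubectl apply -f service.yaml
```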
After a few moments everything should be up and running; running `kubectl get services` should show the three services declared in `service.yaml`.
For simplicity’s sake we’ve decided to expose the todo service and statistics service with the type `NodePort`, which simply means that each service is reachable through a port on the node in the 30000-32767 range.
In particular, for this example the todo service is reachable at port 30123 and the statistics service is reachable at port 30456. If you open a browser (or your tool of choice like cURL) on http://localhost:30123/todos you should see a JSON response indicating an empty todo list:
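The exact payload depends on the example code, but the response should look something along the lines of:

```json
{"todos": []}
```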
So far so good, everything seems to be running.
You can play around a bit with the API, by creating new todo items via a POST request to http://localhost:30123/todos
List todos:
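With cURL, for example:

```shell
# Fetch the current list of todo items
curl http://localhost:30123/todos
```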
Create a new item:
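Assuming a JSON body with a `title` field (check `todo-service.py` for the exact schema the service expects):

```shell
# Create a new todo item; the field names here are illustrative
curl -X POST -H "Content-Type: application/json" \
  -d '{"title": "Buy milk"}' \
  http://localhost:30123/todos
```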
Enter Akita
What we want to do next is set up Akita so that it runs alongside our todo service, as a sidecar, and is able to capture incoming and outgoing traffic in order to infer the API of the service.
In order to proceed we need an API key ID and secret, as well as an Akita service.
Head over to the Akita UI, log in, and if you’re a first time user you will be greeted with the onboarding flow. If you already went through the onboarding flow, you can repeat that flow to set up a new service via a call to action in the “Dashboard” section.
The onboarding flow is a four step process:
- Step 1: choose “API”, as that is the most suitable option for the current example.
- Step 2: You will need to enter the service name. We recommend using “k8s-integration”, since that matches with what’s in the example code. Once you’re done click “Create service”.
- Step 3: Select “Docker” and click “Generate API key”. Copy the API key ID and secret; you’ll need them in a moment.
- Step 4: You will now see a message indicating that Akita is waiting to receive API traffic.
Updating the Kubernetes cluster
With the new Akita service and credentials, it’s time to update our Kubernetes cluster. The first step is to create a new config map where we store the API Key ID and Secret we generated earlier. Create a new `akita-secrets.yaml` file and add the following content:
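A minimal Secret manifest might look like the following (the secret and key names here are illustrative - whatever you choose must match what the deployment references later):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: akita-secrets
type: Opaque
data:
  # values must be base64-encoded (see the note below)
  api-key-id: <base64-encoded API key ID>
  api-key-secret: <base64-encoded API key secret>
```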
Note: keep in mind that you can encode a string using `echo -n "my string" | base64`
Add the secrets to the Kubernetes cluster by running `kubectl apply -f akita-secrets.yaml`
The next step is to set up Akita to run alongside the todo service. In order to do that, we need to edit `service.yaml` and add a new container to the todo-service-deployment (line 46):
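A sketch of what that sidecar container might look like (the image name, arguments, and secret key names are assumptions - check the Akita docs and `service-with-akita.yaml` for the exact invocation):

```yaml
# Illustrative Akita sidecar; names and flags may differ from the
# current Akita CLI
- name: akita
  image: akitasoftware/cli:latest
  args: ["learn", "--service", "k8s-integration"]
  env:
    - name: AKITA_API_KEY_ID
      valueFrom:
        secretKeyRef:
          name: akita-secrets
          key: api-key-id
    - name: AKITA_API_KEY_SECRET
      valueFrom:
        secretKeyRef:
          name: akita-secrets
          key: api-key-secret
  lifecycle:
    preStop:
      exec:
        # give the agent a chance to shut down cleanly and upload the trace
        command: ["/bin/sh", "-c", "kill -2 1 && sleep 10"]
```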
A few important highlights:
- The pre-stop hook is essential for shutting down Akita cleanly and generating the API trace
- The Akita credentials are read from the previously created secret
- The last argument is the service name, so make sure it matches the service you created
Save the changes to the file and deploy the modified configuration using `kubectl apply -f service.yaml`
Note: A complete example featuring this change can be found in `service-with-akita.yaml`
Your first trace from a Kubernetes service
After a few moments everything should be up and running; `kubectl get services` should again show the three services declared in `service.yaml`, and the todo-service-deployment pod should show 2/2 in the READY column (meaning both containers in the pod are running).
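You can check the pod status with:

```shell
# The todo-service pod should report 2/2 in the READY column,
# confirming that the Akita sidecar is running alongside the app
kubectl get pods
```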
Run some requests through the API, and if you head to Akita UI you should now see a preview of your API spec! At this point we are almost done.
Once you’re done playing around with the system you can stop your Kubernetes cluster.
Generating a spec from a trace
Head to Akita UI, expand the “Services” dropdown in the left hand side menu, and click on the service you previously created. This will show you the service dashboard. Click on the traces tab, and you should see at least one trace (possibly more if you have stopped and started the pod/cluster multiple times).
To generate a spec we are going to use the Akita CLI. Note the name of the trace you wish to generate the spec from and run the following command, replacing `traceName` accordingly: `akita apispec --traces akita://k8s-integration:trace:traceName --service k8s-integration` (see the docs for more options)
Head over to your service in Akita UI, and on the API Specs tab you should see an entry for the newly inferred API model!
If you click on it you’ll be able to explore not only the API itself, but also the outbound calls, in particular those to the statistics service.
Diffing specs
Another interesting feature of Akita is its API diffing capability. Some attentive readers might have noticed that the todo service also features a delete endpoint that we haven’t exercised yet.
Start Kubernetes again, create a few todo items, and then issue a DELETE request to http://localhost:30123/todos/{id} (note: the id of a todo item is visible when listing items)
Delete an item:
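```shell
# Delete a todo item by id (replace 42 with an actual id from the list response)
curl -X DELETE http://localhost:30123/todos/42
```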
Head over to the traces tab for your service in Akita UI, and you should see a new trace there. Generate a new spec, using the same command used above, and you should have two specs.
In Akita UI, on the specs tab, click on the “Diff” button and choose the two specs you want to diff. You should be able to see a diff like this:
Note that the diff picked up the new delete endpoint. This can be a really powerful way to gain better situational awareness of changes, especially when combined with your CI/CD pipeline.
Takeaways
We’ve demonstrated how to get started using Akita with Kubernetes, and how to automatically generate OpenAPI specifications without touching any application code. This traffic-watching approach provides a structured, human-friendly summary of what’s going on in your system, improving your situational awareness without requiring significant amounts of development time.
The ability to infer API specifications eliminates much of the overhead required to keep specs in sync with the code, and the problems that follow when they drift. Coupled with the ability to visualize changes over time via diffing, it can help your organization treat specs as first-class engineering artifacts.
The current crop of development tools and practices was built for a time when most software ran in the same process. While still very much relevant, a lot of the software being built nowadays is distributed, spread across different machines, runtimes and organizational boundaries. Now more than ever in the software world, complexity lies in the interactions between components, often over unreliable networks, and subject to the weird and wonderful properties and boundaries of distributed systems (e.g. CAP theorem, difficulties asserting causality, etc).
Akita is part of a new generation of tools that embrace this heterogeneous environment and seeks to help you make sense of your software ecosystem, through the combination of structure and observability.
We’re still early in our journey, but if, like us, you believe software development shouldn’t be painful and fraught with incidents, and that there has to be a better way to reason about and build software, join us in the Akita beta.
With thanks to Mark Gritter and Jean Yang for comments. Photo by Luke Jones on Unsplash. You can read more about Akita on K8s in the docs here.