Observability: what it is, what it is not
Observability: ever since the word appeared on our screens, everyone has been redefining it to their own advantage. A vendor that sells log and trace services, for example, will talk only about those features, as if they covered the whole concept. Yet today no single solution covers all of observability.
A specialist’s job
We can propose the most integrated definition possible. For us, as measurement experts, observability is the collection and use of all the measurements and data needed to detect a slowdown or an outage and, above all, to determine its cause. This includes:
- System metrics from monitoring;
- Functional usage;
- Visible and non-visible transactions (status, result, execution time).
This discipline, reserved for specialists, rests on three pillars:
- Metrics and monitoring: actionable information about your application and its performance. Not necessarily the CPU graph, but, for example, DEM measurements;
- Logging. Events must be recorded;
- Tracing. The goal is to capture what happens during an execution, especially in distributed systems (where a single request is sent to several microservices).
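To make the three pillars concrete, here is a minimal sketch of one request producing all three signals. Everything in it is an assumption made for illustration: the in-memory sinks, the helper names (record_metric, log_event, Span) and the handle_order function are hypothetical, and a real system would send each signal to a dedicated backend.

```python
import json
import time
import uuid
from collections import defaultdict

# Hypothetical in-memory sinks, one per pillar (illustration only).
METRICS = defaultdict(list)  # pillar 1: metric name -> list of samples
LOGS = []                    # pillar 2: structured event records
SPANS = []                   # pillar 3: timed steps of a request


def record_metric(name, value):
    """Store one sample for a named metric."""
    METRICS[name].append(value)


def log_event(trace_id, message, **fields):
    """Record an event as a JSON line that carries the trace identifier."""
    LOGS.append(json.dumps({"trace_id": trace_id, "message": message, **fields}))


class Span:
    """Context manager that times one step of a request."""

    def __init__(self, trace_id, name):
        self.trace_id = trace_id
        self.name = name

    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        SPANS.append({
            "trace_id": self.trace_id,
            "name": self.name,
            "duration_ms": (time.perf_counter() - self.start) * 1000,
            "status": "error" if exc_type else "ok",
        })
        return False  # never swallow the exception


def handle_order(order_id):
    """Hypothetical request handler that emits a trace, a log and a metric."""
    trace_id = uuid.uuid4().hex
    start = time.perf_counter()
    with Span(trace_id, "check_stock"):
        pass  # a call to a first microservice would go here
    with Span(trace_id, "charge_payment"):
        pass  # a call to a second microservice would go here
    log_event(trace_id, "order processed", order_id=order_id)
    record_metric("order_duration_ms", (time.perf_counter() - start) * 1000)
    return trace_id
```

The shared trace_id is the design choice that matters here: it lets a log line and the spans of the same request be tied back together when looking for a cause.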
The right approach
The right solution is to choose a specialist for each of the pillars mentioned. But first, we must make sure the application is fully “observable”. Everything described above is possible only if logging and tracing were built in from the development stage, through the application's technical choices. For a large share of applications, this is not the case.
Each new service or feature must be developed with this need for observability in mind. Observability must be continuous or it will not exist at all! The biggest mistake would be to bolt observability onto an unobservable application.
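One way to build this in from the development stage is to make instrumentation the default for every new function. The sketch below is only an illustration under assumptions: the observed decorator, the logger name and the compute_quote function are hypothetical and use nothing beyond Python's standard logging module.

```python
import functools
import logging
import time

# Hypothetical logger name; the application decides where these records go.
logger = logging.getLogger("app.observability")


def observed(func):
    """Log the name, outcome and duration of every call to func."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        status = "ok"
        try:
            return func(*args, **kwargs)
        except Exception:
            status = "error"
            raise  # the caller still sees the original exception
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            logger.info("call=%s status=%s elapsed_ms=%.2f",
                        func.__name__, status, elapsed_ms)

    return wrapper


@observed
def compute_quote(amount, rate):
    """Hypothetical business function added as a new feature."""
    return amount * rate
```

With this approach a new feature reports its status and execution time as soon as it is written, rather than being retrofitted later.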
Observability is the ability to measure the internal states of a system by examining its outputs. A system is considered “observable” if its current state can be estimated from its outputs alone, namely the sensor data.
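This definition comes from control theory, where it has a precise test for linear systems. As a hedged illustration, the sketch below assumes a two-state discrete system x[k+1] = A x[k] with a single output y[k] = C x[k]; the function names and the position/velocity example are assumptions chosen for the demonstration. Such a system is observable when the matrix formed by stacking C and C·A has a nonzero determinant.

```python
def row_times_matrix(row, matrix):
    """Return the product row @ matrix for a 1xN row and an NxN matrix."""
    n = len(matrix)
    return [sum(row[k] * matrix[k][j] for k in range(n)) for j in range(n)]


def is_observable_2_states(A, C):
    """Rank test for a 2-state, single-output linear system.

    Builds the observability matrix [C; C*A] and checks that its
    determinant is nonzero. Integer inputs keep the comparison exact;
    floating-point inputs would need a tolerance instead.
    """
    O = [C, row_times_matrix(C, A)]
    det = O[0][0] * O[1][1] - O[0][1] * O[1][0]
    return det != 0


# State = (position, velocity); position grows by the velocity at each step.
A = [[1, 1],
     [0, 1]]

measures_position = [1, 0]  # the sensor reports position only
measures_velocity = [0, 1]  # the sensor reports velocity only

print(is_observable_2_states(A, measures_position))  # True
print(is_observable_2_states(A, measures_velocity))  # False
```

Measuring position lets the velocity be deduced from successive readings, so the full state can be estimated. Measuring only velocity never reveals the position: no amount of analysis of that output makes the system observable, which is the situation of an application that was not instrumented.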
AI, which is very often put forward, also has its limitations. It can only improve the handling of a situation that has already occurred, which limits its deployment so far. Each context is new, and autonomous intelligence solutions are not yet powerful enough.
If the application allows it, the MIP solution can meet all the objectives related to the first pillar.
What ROI for observability (business vs. technical perfection)?
The question then arises: what is the ROI of developing functions after the fact to improve the observability of my application if they are not directly useful to the user?
No solution to date can make a system 100% observable. There will always be a gap. And trying to close that gap, like chasing 101% availability, will sometimes cost more than tolerating a drop in performance.
One priority: the user!
The user’s point of view is therefore essential. Users can give a very clear indication of where to look for the problem. Involving them helps build an observability that is directly useful.
MIP and its partners can thus help build observability with transparency and pragmatism.