Distributed tracing
You have applied the [[Microservice architecture]] pattern. Requests often span multiple services. Each service handles a request by performing one or more operations, e.g. querying a database or publishing messages.
How do you understand the behavior of an application and troubleshoot problems?
External monitoring only tells you the overall response time and number of invocations; it gives no insight into the individual operations
Any solution should have minimal runtime overhead
Log entries for a request are scattered across numerous logs
Instrument services with code that
Assigns each external request a unique external request id
Passes the external request id to all services that are involved in handling the request
Includes the external request id in all log messages
Records information (e.g. start time, end time) about the requests and operations performed when handling an external request in a centralized service
This instrumentation might be part of the functionality provided by a [[Microservice Chassis]] framework.
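The instrumentation steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Spring Cloud Sleuth or any particular chassis implements them: the header name `X-External-Request-Id`, the `handle_request` and `outgoing_headers` helpers, and the in-memory `recorded_operations` list (standing in for the centralized service) are all hypothetical.

```python
import contextvars
import logging
import time
import uuid

# Hypothetical header name; real tracers define their own propagation headers.
REQUEST_ID_HEADER = "X-External-Request-Id"

# Holds the external request id of the request currently being handled.
current_request_id = contextvars.ContextVar("current_request_id", default=None)

# Stand-in for the centralized service that records requests and operations.
recorded_operations = []


class RequestIdFilter(logging.Filter):
    """Copies the current external request id onto every log record."""

    def filter(self, record):
        record.request_id = current_request_id.get()
        return True


logger = logging.getLogger("tracing_sketch")
logger.addFilter(RequestIdFilter())


def handle_request(headers, service_name, operation):
    """Run `operation` under the caller's request id, or a new one at the edge."""
    # Reuse the id passed by the caller; assign a unique one if there is none.
    request_id = headers.get(REQUEST_ID_HEADER) or uuid.uuid4().hex
    token = current_request_id.set(request_id)
    start = time.time()
    try:
        return operation()
    finally:
        # Record start and end times for this service's part of the request.
        recorded_operations.append({
            "request_id": request_id,
            "service": service_name,
            "start": start,
            "end": time.time(),
        })
        current_request_id.reset(token)


def outgoing_headers():
    """Headers to attach to calls made to downstream services."""
    return {REQUEST_ID_HEADER: current_request_id.get()}


def order_operation():
    # The simulated downstream call carries the same external request id.
    return handle_request(outgoing_headers(), "inventory-service", lambda: "reserved")


result = handle_request({}, "order-service", order_operation)
```

After the call, `recorded_operations` holds one entry per service, both carrying the same request id. A log formatter that includes `%(request_id)s` would put that id into every log message as well.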
The Spring Cloud Sleuth dependencies are configured in build.gradle.
[[RabbitMQ]] is used to deliver traces to [[Zipkin]].
The services are deployed with various [[Spring Cloud Sleuth]]-related environment variables set in docker-compose.yml.
These properties enable Spring Cloud Sleuth and configure it to sample all requests. They also tell Spring Cloud Sleuth to deliver traces to [[Zipkin]] via [[RabbitMQ]] running on the host called rabbitmq.
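To make "sample all requests" concrete, here is a minimal sketch of a probability-based sampling decision. The `should_sample` function and its `probability` parameter are hypothetical and are not Spring Cloud Sleuth's API; the sketch only assumes that a probability of 1.0 corresponds to tracing every request.

```python
import random


def should_sample(probability, rng=random):
    """Decide whether to trace a request, given a sampling probability in [0, 1]."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0.0 and 1.0")
    # rng.random() returns a float in [0.0, 1.0), so a probability of 1.0
    # always samples and a probability of 0.0 never does.
    return rng.random() < probability
```

Sampling every request is convenient for an example application. A lower probability trades completeness of the traces for less runtime overhead and less trace data to aggregate and store.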
The Zipkin server is a simple Spring Boot application.
It is deployed using Docker.
This pattern has the following benefits:
It provides useful insight into the behavior of the system including the sources of latency
This pattern has the following issues:
Aggregating and storing traces can require significant infrastructure
[[Log aggregation]] - the external request id is included in each log message
The example application uses client-side service discovery. It is written in Scala and uses Spring Boot and Spring Cloud as the [[Microservice chassis]]. They provide various capabilities including [[Spring Cloud Sleuth]], which provides support for distributed tracing. It instruments Spring components to gather trace information and can deliver it to a [[Zipkin]] server, which gathers and displays traces.
It enables developers to see how an individual request is handled by searching across aggregated logs for its external request id
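As a rough illustration of that search, the sketch below filters log lines collected from several services by external request id. The `find_request_entries` function, the service names, and the log lines are all made up for this example; a real log aggregation system would provide its own query interface.

```python
def find_request_entries(logs_by_service, request_id):
    """Return (service, line) pairs whose log line mentions the request id."""
    matches = []
    for service, lines in logs_by_service.items():
        for line in lines:
            # Naive substring match; a real system would index a dedicated field.
            if request_id in line:
                matches.append((service, line))
    return matches


# Made-up log lines for illustration only.
logs_by_service = {
    "order-service": ["req=abc123 creating order", "req=zzz999 creating order"],
    "inventory-service": ["req=abc123 reserving stock"],
}
entries = find_request_entries(logs_by_service, "abc123")
```

Here `entries` contains one line from each service, which together show how the request with id `abc123` was handled.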
- service for recording and displaying tracing information
- standardized API for distributed tracing