Current Planning

Overview

After extensive investigation into how to implement end-to-end observability (FE, API Gateway, MW, BE) in Grafana, we concluded that we are not yet in a position to enable it fully.

The main reasons are:

  • Tempo, the tracing backend used by Grafana, is still a beta product. It has evolved quickly over the last year but is not fully there yet.
  • The Tempo instance included in our Grafana Cloud account is quite limited: some tooling available in the open-source version, such as debugging and metrics, is missing. We would therefore probably need to host our own Tempo instance on our AWS side, at least during the integration work.
  • Related to the previous point, the Grafana Cloud instance accepts only OTLP/gRPC. The open-source version supports several other protocols, such as OTLP/HTTP, which simplifies sending traces considerably.
  • Tempo accepts only hexadecimal strings as trace IDs, whereas AWS X-Ray auto-instrumentation uses a custom header, X_AMZN_TRACE_ID, whose trace IDs contain dashes (for example 1-6077002b-21ca433815c335581682f516), so we would need to transform them before ingestion.
  • For Middleware, the current integration between Grafana and AWS does not provide the smooth observability we get with dedicated servers. In other words, Grafana would probably work well with our BE/FE systems, but the MW would remain a gap that does not fit into the solution. (There are plugins and data sources that can retrieve information from CloudWatch or X-Ray, but they offer little beyond querying those resources.)
  • The next GrafanaCONline (GrafanaCONline 2021, 7th to 17th of June 2021) will introduce Grafana 8 and include many sessions on new features in Tempo, Loki and other Grafana Labs products, so it is worth waiting to see what news Grafana brings.
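
To illustrate the trace ID transformation mentioned in the list above, here is a minimal Python sketch. It assumes the X-Ray trace ID has the shape 1-<8 hex digits>-<24 hex digits> (as in the example above) and that dropping the version prefix and the dashes yields an acceptable 32-character hexadecimal ID. The function names (xray_to_hex_trace_id, root_from_header) are hypothetical, and whether this mapping suits our ingestion path would still need to be verified against Tempo.

```python
import re

# Assumed X-Ray trace ID shape: version "1", 8 hex digits, 24 hex digits.
_XRAY_TRACE_ID = re.compile(r"1-([0-9a-fA-F]{8})-([0-9a-fA-F]{24})")


def xray_to_hex_trace_id(xray_trace_id: str) -> str:
    """Convert '1-6077002b-21ca...' into a 32-character lowercase hex string."""
    match = _XRAY_TRACE_ID.fullmatch(xray_trace_id.strip())
    if match is None:
        raise ValueError(f"not an X-Ray trace ID: {xray_trace_id!r}")
    # Drop the version prefix and dashes; keep the two hex segments.
    return (match.group(1) + match.group(2)).lower()


def root_from_header(header_value: str) -> str:
    """Extract the Root field from a value like 'Root=1-...;Parent=...;Sampled=1'."""
    for part in header_value.split(";"):
        key, _, value = part.strip().partition("=")
        if key == "Root":
            return value
    raise ValueError("no Root field in trace header")


if __name__ == "__main__":
    header = "Root=1-6077002b-21ca433815c335581682f516;Sampled=1"
    print(xray_to_hex_trace_id(root_from_header(header)))
```

Validating the shape before stripping the dashes means malformed IDs are rejected early rather than being passed on to the tracing backend.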

So, as an interim solution, the proposal is to use Amazon X-Ray, as explained in the XRay document.