
Observability

DocsGPT bundles the OpenTelemetry SDK and auto-instrumentation packages in application/requirements.txt, so they install with the rest of the backend deps. Telemetry is off by default; opt in by prefixing the launch command with opentelemetry-instrument and setting the OTLP env vars.

Auto-instrumentation covers Flask, Starlette, Celery, SQLAlchemy, psycopg, Redis, requests, and Python logging. LLM/retriever calls are not captured at this layer; see Going further below.

Enabling

Set these env vars in your .env (or compose environment: block):

OTEL_SDK_DISABLED=false
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
OTEL_EXPORTER_OTLP_ENDPOINT=https://your-collector.example.com
OTEL_EXPORTER_OTLP_HEADERS=Authorization=Bearer%20<token>
OTEL_TRACES_EXPORTER=otlp
OTEL_METRICS_EXPORTER=otlp
OTEL_LOGS_EXPORTER=otlp
OTEL_PYTHON_LOG_CORRELATION=true
OTEL_RESOURCE_ATTRIBUTES=service.name=docsgpt-backend,deployment.environment=prod
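As a quick sanity check before launching, the variables above can be inspected from Python. This is a minimal sketch, not part of DocsGPT: the helper names (EXPECTED_VARS, parse_resource_attributes, check_otel_env) are hypothetical, and treating the endpoint and protocol as "expected" simply mirrors the list on this page rather than any SDK requirement.

```python
import os

# Hypothetical preflight helper; names here are illustrative, not DocsGPT code.
# These are the variables this page tells you to set, not SDK-mandated ones.
EXPECTED_VARS = (
    "OTEL_EXPORTER_OTLP_ENDPOINT",
    "OTEL_EXPORTER_OTLP_PROTOCOL",
)


def parse_resource_attributes(raw):
    """Split 'k1=v1,k2=v2' into a dict, skipping empty or malformed pairs."""
    attrs = {}
    for pair in raw.split(","):
        key, sep, value = pair.partition("=")
        if sep and key.strip():
            attrs[key.strip()] = value.strip()
    return attrs


def check_otel_env(env):
    """Return a list of human-readable warnings; an empty list means OK."""
    problems = []
    if env.get("OTEL_SDK_DISABLED", "false").strip().lower() == "true":
        problems.append("OTEL_SDK_DISABLED is true, so nothing will be exported")
    for name in EXPECTED_VARS:
        if not env.get(name):
            problems.append(f"{name} is not set")
    attrs = parse_resource_attributes(env.get("OTEL_RESOURCE_ATTRIBUTES", ""))
    if "service.name" not in attrs and not env.get("OTEL_SERVICE_NAME"):
        problems.append("no service name (OTEL_SERVICE_NAME or service.name)")
    return problems


if __name__ == "__main__":
    for problem in check_otel_env(os.environ):
        print("warning:", problem)
```

Run it under the same environment as the backend (for example via dotenv run --) so it sees the values the process will actually receive.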

Then prefix the process command with opentelemetry-instrument. The simplest way is a compose override (no image rebuild):

# deployment/docker-compose.override.yaml
services:
  backend:
    command: >
      opentelemetry-instrument gunicorn -w 1 -k uvicorn_worker.UvicornWorker
      --bind 0.0.0.0:7091 --config application/gunicorn_conf.py
      application.asgi:asgi_app
    environment:
      - OTEL_SERVICE_NAME=docsgpt-backend
  worker:
    command: opentelemetry-instrument celery -A application.app.celery worker -l INFO -B
    environment:
      - OTEL_SERVICE_NAME=docsgpt-celery-worker

For local dev, prepend dotenv run -- so the OTEL_* vars from .env reach opentelemetry-instrument before it boots the SDK:

dotenv run -- opentelemetry-instrument flask --app application/app.py run --port=7091
dotenv run -- opentelemetry-instrument celery -A application.app.celery worker -l INFO --pool=solo
Note: Logs are exported in-process when OTEL_LOGS_EXPORTER=otlp is set. application/core/logging_config.py detects the flag and preserves the OTEL log handler. Without it, logging writes only to stdout.
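The "detect the flag, preserve the handler" behavior can be pictured roughly as follows. This is an illustrative sketch only and is not the contents of application/core/logging_config.py: the function names are hypothetical, and identifying the OTEL handler by its class living in an opentelemetry module is an assumption made for the example.

```python
import logging
import sys


def otel_logs_enabled(env):
    """True when 'otlp' appears in OTEL_LOGS_EXPORTER (comma-separated)."""
    raw = env.get("OTEL_LOGS_EXPORTER", "")
    return "otlp" in [item.strip().lower() for item in raw.split(",")]


def configure_logging(logger, env):
    """Reset handlers to stdout, keeping OTEL handlers when export is on."""
    kept = []
    if otel_logs_enabled(env):
        # Assumption: the OTEL handler's class is defined in an
        # 'opentelemetry...' module, so the module name identifies it.
        kept = [
            handler
            for handler in logger.handlers
            if type(handler).__module__.startswith("opentelemetry")
        ]
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
    logger.addHandler(logging.StreamHandler(sys.stdout))
    for handler in kept:
        logger.addHandler(handler)
```

The point of the pattern is ordering: opentelemetry-instrument attaches its handler before application code runs, so a logging setup that clears all handlers would otherwise silently drop log export.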

Backend examples

Axiom

OTEL_EXPORTER_OTLP_ENDPOINT=https://api.axiom.co
OTEL_EXPORTER_OTLP_HEADERS=Authorization=Bearer%20xaat-XXXX,X-Axiom-Dataset=docsgpt
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf

%20 is the URL-encoded space between Bearer and the token. Create the dataset in the Axiom UI before sending.
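If you generate the header value from a script rather than typing it by hand, the encoding can be produced with the standard library. A minimal sketch; otlp_headers is a hypothetical helper name, and xaat-XXXX is the same placeholder token used above.

```python
from urllib.parse import quote


def otlp_headers(headers):
    """Render a dict as 'k1=v1,k2=v2' with percent-encoded values."""
    return ",".join(
        f"{key}={quote(value, safe='')}" for key, value in headers.items()
    )


value = otlp_headers(
    {"Authorization": "Bearer xaat-XXXX", "X-Axiom-Dataset": "docsgpt"}
)
print(value)  # Authorization=Bearer%20xaat-XXXX,X-Axiom-Dataset=docsgpt
```

Encoding each value separately keeps the commas and equals signs that delimit the pairs intact while escaping any that appear inside a value.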

Self-hosted OTLP collector / Jaeger / Tempo

OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
OTEL_EXPORTER_OTLP_PROTOCOL=grpc
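A common misconfiguration is pairing a protocol with the other protocol's port: by convention OTLP collectors listen on 4317 for gRPC and 4318 for HTTP. The check below is a small sketch with a hypothetical helper name (port_mismatch); a collector can be configured to use other ports, so treat a hit as a hint rather than an error.

```python
from urllib.parse import urlsplit

# Conventional OTLP ports: 4317 for gRPC, 4318 for HTTP.
DEFAULT_PORTS = {"grpc": 4317, "http/protobuf": 4318}


def port_mismatch(endpoint, protocol):
    """Return a warning string if the port looks like the other protocol's."""
    port = urlsplit(endpoint).port
    if port is None:
        return None  # no explicit port in the URL, nothing to compare
    for other, default in DEFAULT_PORTS.items():
        if other != protocol and port == default:
            return (
                f"port {port} is the usual {other} port, "
                f"but protocol is {protocol}"
            )
    return None


print(port_mismatch("http://otel-collector:4317", "grpc"))
print(port_mismatch("http://otel-collector:4317", "http/protobuf"))
```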

Honeycomb / Grafana Cloud / Datadog

Each vendor publishes a single-line OTEL_EXPORTER_OTLP_ENDPOINT plus OTEL_EXPORTER_OTLP_HEADERS recipe. Drop both values in alongside the service-name override.

Caveats

  • The Dockerfile uses gunicorn -w 1. If you raise worker count, move SDK init into a post_worker_init hook to avoid one-thread-per-process exporter contention.
  • asgi.py wraps Flask in Starlette's WSGIMiddleware. Both instrumentors are installed, so each request produces a Starlette span enclosing a Flask span. Drop opentelemetry-instrumentation-flask from requirements.txt if the duplication is noisy.
  • OTEL packages add ~50 MB to the image. They install on every build, but the runtime cost is zero unless you prefix the command with opentelemetry-instrument and set the OTLP env vars.
  • The OTEL exporter ecosystem currently caps protobuf at <7, so the backend runs on protobuf 6.x. This will catch up in a future OTEL release.
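For the multi-worker caveat above, a per-worker initialization hook might look like the sketch below. It is an illustration under several assumptions, not DocsGPT's actual gunicorn_conf.py: it covers traces only, it uses the HTTP exporter (so it assumes OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf), and it assumes nothing else has already set a global tracer provider in the worker process.

```python
# Sketch of a hook for a gunicorn config file. Gunicorn calls
# post_worker_init(worker) inside each worker after it has been initialized.


def post_worker_init(worker):
    # Imports are deferred so the SDK is only loaded inside worker processes.
    from opentelemetry import trace
    from opentelemetry.exporter.otlp.proto.http.trace_exporter import (
        OTLPSpanExporter,
    )
    from opentelemetry.sdk.resources import Resource
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor

    # Resource.create() also merges OTEL_RESOURCE_ATTRIBUTES and
    # OTEL_SERVICE_NAME from the environment.
    resource = Resource.create({"process.pid": worker.pid})
    provider = TracerProvider(resource=resource)

    # With no arguments the exporter reads the OTEL_EXPORTER_OTLP_* env vars.
    # Each worker gets its own batch processor and export thread.
    provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
    trace.set_tracer_provider(provider)
```

Creating the processor after the fork matters because the batch processor's background thread does not survive a fork; a provider built in the master process would leave workers with no thread draining their span queue.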