# Observability

DocsGPT bundles the OpenTelemetry SDK and auto-instrumentation packages in `application/requirements.txt`; they install with the rest of the backend dependencies. Telemetry is off by default; opt in by prefixing the launch command with `opentelemetry-instrument` and setting the OTLP env vars.

Auto-instrumentation covers Flask, Starlette, Celery, SQLAlchemy, psycopg, Redis, requests, and Python logging. LLM/retriever calls are not captured at this layer; see *Going further* below.
## Enabling

Set these env vars in your `.env` (or compose `environment:` block):

```bash
OTEL_SDK_DISABLED=false
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
OTEL_EXPORTER_OTLP_ENDPOINT=https://your-collector.example.com
OTEL_EXPORTER_OTLP_HEADERS=Authorization=Bearer%20<token>
OTEL_TRACES_EXPORTER=otlp
OTEL_METRICS_EXPORTER=otlp
OTEL_LOGS_EXPORTER=otlp
OTEL_PYTHON_LOG_CORRELATION=true
OTEL_RESOURCE_ATTRIBUTES=service.name=docsgpt-backend,deployment.environment=prod
```

Then prefix the process command with `opentelemetry-instrument`. The simplest way is a compose override (no image rebuild):
```yaml
# deployment/docker-compose.override.yaml
services:
  backend:
    command: >
      opentelemetry-instrument gunicorn -w 1 -k uvicorn_worker.UvicornWorker
      --bind 0.0.0.0:7091 --config application/gunicorn_conf.py
      application.asgi:asgi_app
    environment:
      - OTEL_SERVICE_NAME=docsgpt-backend
  worker:
    command: opentelemetry-instrument celery -A application.app.celery worker -l INFO -B
    environment:
      - OTEL_SERVICE_NAME=docsgpt-celery-worker
```

For local dev, prepend `dotenv run --` so the `OTEL_*` vars from `.env` reach `opentelemetry-instrument` before it boots the SDK:
```bash
dotenv run -- opentelemetry-instrument flask --app application/app.py run --port=7091
dotenv run -- opentelemetry-instrument celery -A application.app.celery worker -l INFO --pool=solo
```

Logs are exported in-process when `OTEL_LOGS_EXPORTER=otlp` is set; `application/core/logging_config.py` detects the flag and preserves the OTel log handler. Without it, logging writes only to stdout.
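As a rough illustration of that detection (an assumed sketch, not the actual contents of `application/core/logging_config.py`), the logic amounts to keeping the handler the agent attached only when log export is on:

```python
# Assumed sketch: when OTEL_LOGS_EXPORTER != "otlp", replace any handlers
# opentelemetry-instrument attached to the root logger with a plain
# stdout handler; otherwise leave the OTel handler in place.
import logging
import os
import sys

def configure_logging() -> logging.Logger:
    root = logging.getLogger()
    otlp_logs = os.environ.get("OTEL_LOGS_EXPORTER", "").lower() == "otlp"
    if not otlp_logs:
        root.handlers = [logging.StreamHandler(sys.stdout)]
    root.setLevel(logging.INFO)
    return root
```

The real module may differ in detail; the point is that the env var, not the presence of the agent, decides whether logs leave the process.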
## Backend examples

### Axiom

```bash
OTEL_EXPORTER_OTLP_ENDPOINT=https://api.axiom.co
OTEL_EXPORTER_OTLP_HEADERS=Authorization=Bearer%20xaat-XXXX,X-Axiom-Dataset=docsgpt
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
```

`%20` is the URL-encoded space between `Bearer` and the token. Create the dataset in the Axiom UI before sending.
### Self-hosted OTLP collector / Jaeger / Tempo

```bash
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
OTEL_EXPORTER_OTLP_PROTOCOL=grpc
```

### Honeycomb / Grafana Cloud / Datadog

Each vendor publishes a single-line `OTEL_EXPORTER_OTLP_ENDPOINT` plus `OTEL_EXPORTER_OTLP_HEADERS` recipe; drop them in alongside the service-name override.
## Caveats

- The Dockerfile uses `gunicorn -w 1`. If you raise the worker count, move SDK init into a `post_worker_init` hook to avoid one-thread-per-process exporter contention.
- `asgi.py` wraps Flask in Starlette's `WSGIMiddleware`. Both instrumentors are installed, so each request produces a Starlette span enclosing a Flask span. Drop `opentelemetry-instrumentation-flask` from `requirements.txt` if the duplication is noisy.
- OTEL packages add ~50 MB to the image. They install on every build; the runtime cost is zero unless you put `opentelemetry-instrument` on the command and set the OTLP env vars.
- The OTEL exporter ecosystem currently caps `protobuf` at `<7`, so the backend runs on protobuf 6.x. This will catch up in a future OTEL release.