
PostgreSQL for User Data

DocsGPT is progressively moving user data (conversations, agents, prompts, preferences, etc.) from MongoDB to PostgreSQL, one collection at a time. Each collection is guarded by a feature flag so you can opt in and roll back instantly. MongoDB stays the source of truth until you cut over reads; vector stores (VECTOR_STORE=pgvector, faiss, qdrant, mongodb, …) are unaffected.

ℹ️

The Status table below lists which collections are available today; it is the only part of this page that changes from release to release.

Setup

  1. Run Postgres 13+. Native install, Docker, or managed (Neon, RDS, Supabase, Cloud SQL…) all work. You’ll need the pgcrypto and citext extensions, both standard contrib modules available everywhere.

  2. Create a database and role (skip if your managed provider gave you these):

    CREATE ROLE docsgpt LOGIN PASSWORD 'docsgpt';
    CREATE DATABASE docsgpt OWNER docsgpt;
  3. Set POSTGRES_URI in .env. Any standard Postgres URI works; DocsGPT normalizes it internally.

    POSTGRES_URI=postgresql://docsgpt:docsgpt@localhost:5432/docsgpt
    # Append ?sslmode=require for managed providers that enforce SSL.
  4. Apply the schema (idempotent; safe to re-run):

    python scripts/db/init_postgres.py
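Before running the schema script, it can help to sanity-check the URI you put in .env. A minimal sketch using only the Python standard library (this is an illustration of the URI shape, not DocsGPT's actual normalization code):

```python
from urllib.parse import urlsplit

def check_postgres_uri(uri: str) -> dict:
    """Parse a Postgres URI and return its components for a quick sanity check."""
    parts = urlsplit(uri)
    if parts.scheme not in ("postgresql", "postgres"):
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    query = dict(p.split("=", 1) for p in parts.query.split("&") if p)
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port or 5432,          # Postgres default port
        "database": parts.path.lstrip("/"),
        "sslmode": query.get("sslmode"),     # None unless ?sslmode=... is set
    }

print(check_postgres_uri("postgresql://docsgpt:docsgpt@localhost:5432/docsgpt"))
```

If this raises or prints an empty database name, the URI will not work for the schema script either.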

Migrating data

Two global flags, no per-collection knobs: every collection marked ✅ in the Status table is handled automatically.

  1. Enable dual-write. Writes go to both Mongo and Postgres; Mongo remains source of truth. Set the flag in .env and restart:

    USE_POSTGRES=true
  2. Backfill existing data. Idempotent; re-run any time to re-sync drifted rows. Without arguments it backfills every registered table; pass --tables to limit the run.

    python scripts/db/backfill.py --dry-run       # preview everything
    python scripts/db/backfill.py                 # real run, everything
    python scripts/db/backfill.py --tables users  # only specific tables
  3. Cut over reads once you trust the Postgres state:

    READ_POSTGRES=true

    Rollback is instant: unset READ_POSTGRES and restart. Dual-write keeps Postgres up to date so you can flip back and forth.
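The dual-write and read-cutover behavior described above can be sketched as follows. This is a hypothetical class, not DocsGPT's actual implementation; the key properties are that Mongo is always written first and a Postgres failure is logged rather than raised:

```python
import logging

logger = logging.getLogger("dualwrite")

class DualWriteStore:
    """Illustrative dual-write wrapper around two key-value stores."""

    def __init__(self, mongo, postgres, use_postgres=False, read_postgres=False):
        self.mongo = mongo
        self.postgres = postgres
        self.use_postgres = use_postgres      # mirrors the USE_POSTGRES flag
        self.read_postgres = read_postgres    # mirrors the READ_POSTGRES flag

    def save(self, key, doc):
        self.mongo[key] = doc                 # source of truth: always written
        if self.use_postgres:
            try:
                self.postgres[key] = doc      # best-effort mirror write
            except Exception:
                logger.warning("postgres write failed for %s; mongo stays authoritative", key)

    def load(self, key):
        # Flipping read_postgres off instantly reverts reads to Mongo.
        return self.postgres[key] if self.read_postgres else self.mongo[key]
```

Because `save` keeps both stores in step while `use_postgres` is on, toggling `read_postgres` in either direction is safe at any time, which is exactly why rollback is instant.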

⚠️

Don’t decommission MongoDB until every collection you use is fully cut over. During the migration window, Mongo is still required.

Status

Last updated: 2026-04-10

| Collection | Status |
| --- | --- |
| users | ✅ Phase 1 |
| prompts, user_tools, feedback, stack_logs, user_logs, token_usage | ⏳ Phase 1 |
| agents, sources, attachments, memories, todos, notes, connector_sessions, agent_folders | ⏳ Phase 2 |
| conversations, pending_tool_state, workflows | ⏳ Phase 3 |

Schemas for every row above already exist after init_postgres.py runs. What’s landing progressively is the application-level dual-write wiring and the backfill logic for each collection. Once a collection is ✅, enabling USE_POSTGRES=true and running python scripts/db/backfill.py picks it up automatically; no per-collection config change is needed.
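The idempotency the backfill relies on can be illustrated with a small sketch: rows are upserted only when they differ, so re-running on already-synced data is a no-op. The function below is illustrative only and uses plain dicts in place of the real databases:

```python
def backfill(mongo: dict, postgres: dict, tables=None) -> int:
    """Copy drifted or missing rows from mongo-shaped data into postgres-shaped
    data. Returns how many rows were written; a second run returns 0.
    Illustrative sketch, not DocsGPT's scripts/db/backfill.py."""
    synced = 0
    for table, rows in mongo.items():
        if tables and table not in tables:   # mirrors the --tables filter
            continue
        pg_rows = postgres.setdefault(table, {})
        for row_id, doc in rows.items():
            if pg_rows.get(row_id) != doc:   # upsert only on drift
                pg_rows[row_id] = doc
                synced += 1
    return synced
```

The drift check is what makes "re-run any time" safe: identical rows are skipped, so repeated runs converge rather than duplicate.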

Troubleshooting

  • relation "..." does not exist: run python scripts/db/init_postgres.py.
  • FATAL: role "docsgpt" does not exist: run the CREATE ROLE / CREATE DATABASE statements from step 2 as a Postgres superuser.
  • SSL errors on a managed provider: append ?sslmode=require to POSTGRES_URI.
  • Dual-write warnings in the logs: these are non-fatal by design. Mongo is the source of truth, so the user-facing request still succeeds. Re-run the backfill to re-sync whichever rows drifted.
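For the SSL case, a small helper can patch an existing URI instead of editing it by hand. A sketch using only the standard library (a hypothetical convenience, not part of DocsGPT):

```python
from urllib.parse import urlsplit, urlunsplit

def require_ssl(uri: str) -> str:
    """Append sslmode=require to a Postgres URI unless an sslmode is already set."""
    parts = urlsplit(uri)
    if "sslmode=" in parts.query:
        return uri  # respect an explicit setting, whatever it is
    query = f"{parts.query}&sslmode=require" if parts.query else "sslmode=require"
    return urlunsplit(parts._replace(query=query))

print(require_ssl("postgresql://docsgpt:docsgpt@db.example.com/docsgpt"))
```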