# PostgreSQL for User Data
DocsGPT is progressively moving user data (conversations, agents, prompts, preferences, etc.) from MongoDB to PostgreSQL, one collection at a time. Each collection is guarded by a feature flag so you can opt in and roll back instantly. MongoDB stays the source of truth until you cut over reads; vector stores (`VECTOR_STORE=pgvector`, `faiss`, `qdrant`, `mongodb`, …) are unaffected.

Which collections are available today is listed in the Status table below. That table is the only part of this page that changes from release to release.
## Setup

1. Run Postgres 13+. Native install, Docker, or managed (Neon, RDS, Supabase, Cloud SQL, …) all work. You'll need the `pgcrypto` and `citext` extensions, both standard contrib modules available everywhere.

2. Create a database and role (skip if your managed provider gave you these):

   ```sql
   CREATE ROLE docsgpt LOGIN PASSWORD 'docsgpt';
   CREATE DATABASE docsgpt OWNER docsgpt;
   ```

3. Set `POSTGRES_URI` in `.env`. Any standard Postgres URI works; DocsGPT normalizes it internally.

   ```bash
   POSTGRES_URI=postgresql://docsgpt:docsgpt@localhost:5432/docsgpt
   # Append ?sslmode=require for managed providers that enforce SSL.
   ```

4. Apply the schema (idempotent, safe to re-run):

   ```bash
   python scripts/db/init_postgres.py
   ```
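Step 3 notes that DocsGPT normalizes the URI internally. As a rough illustration only (this is not DocsGPT's actual code, and `normalize_postgres_uri` is a hypothetical name), normalization could mean accepting the common `postgres://` scheme alias that some managed providers hand out and rewriting it to the `postgresql://` form most drivers expect:

```python
from urllib.parse import urlsplit, urlunsplit

def normalize_postgres_uri(uri: str) -> str:
    """Hypothetical sketch: rewrite the `postgres://` scheme alias
    to `postgresql://`; leave already-normal URIs untouched."""
    parts = urlsplit(uri)
    scheme = "postgresql" if parts.scheme == "postgres" else parts.scheme
    return urlunsplit((scheme, parts.netloc, parts.path, parts.query, parts.fragment))
```

Either spelling of the scheme should therefore work in `.env`.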
## Migrating data

Two global flags, no per-collection knobs: every collection marked ✅ in the Status table is handled automatically.

1. **Enable dual-write.** Writes go to both Mongo and Postgres; Mongo remains the source of truth. Set the flag in `.env` and restart:

   ```bash
   USE_POSTGRES=true
   ```

2. **Backfill existing data.** Idempotent; re-run any time to re-sync drifted rows. Without arguments it backfills every registered table; pass `--tables` to limit it.

   ```bash
   python scripts/db/backfill.py --dry-run        # preview everything
   python scripts/db/backfill.py                  # real run, everything
   python scripts/db/backfill.py --tables users   # only specific tables
   ```

3. **Cut over reads** once you trust the Postgres state:

   ```bash
   READ_POSTGRES=true
   ```

   Rollback is instant: unset `READ_POSTGRES` and restart. Dual-write keeps Postgres up to date, so you can flip back and forth.
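The dual-write and read-cutover behavior described above can be sketched in a few lines. This is an illustrative model with in-memory stores, not DocsGPT's implementation; the class and flag names mirror the env vars but are otherwise made up:

```python
import logging

logger = logging.getLogger("dualwrite")

class DualWriteStore:
    """Sketch of the dual-write pattern: Mongo is the source of truth,
    Postgres is a best-effort mirror, and a read flag picks the backend."""

    def __init__(self, mongo, postgres, use_postgres=False, read_postgres=False):
        self.mongo = mongo            # source of truth
        self.postgres = postgres      # mirror
        self.use_postgres = use_postgres    # USE_POSTGRES
        self.read_postgres = read_postgres  # READ_POSTGRES

    def save(self, key, doc):
        self.mongo[key] = doc         # must succeed; errors propagate
        if self.use_postgres:
            try:
                self.postgres[key] = doc   # mirror write
            except Exception:
                # Non-fatal by design: the backfill re-syncs drifted rows.
                logger.warning("postgres dual-write failed for %s", key)

    def get(self, key):
        store = self.postgres if self.read_postgres else self.mongo
        return store.get(key)
```

Because the Mongo write happens first and Postgres failures are swallowed, flipping `READ_POSTGres` off again always returns you to a consistent view.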
Don't decommission MongoDB until every collection you use is fully cut over. During the migration window, Mongo is still required.
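The backfill step's idempotency (safe re-runs that converge on the same state) comes from upserting rather than inserting. A minimal sketch with in-memory dicts standing in for Mongo collections and Postgres tables; the function name and arguments are hypothetical, not the real script's interface:

```python
def backfill(mongo_collections, postgres_tables, tables=None, dry_run=False):
    """Sketch of an idempotent backfill: copy every row from Mongo into
    Postgres, overwriting on conflict so repeated runs converge."""
    targets = tables if tables is not None else list(mongo_collections)
    copied = 0
    for name in targets:
        for key, doc in mongo_collections[name].items():
            if not dry_run:
                # Upsert: last write wins, so re-running re-syncs drifted rows.
                postgres_tables.setdefault(name, {})[key] = dict(doc)
            copied += 1
    return copied
```

In real SQL this maps onto `INSERT ... ON CONFLICT ... DO UPDATE`, which is why re-running after drift is safe.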
## Status

Last updated: 2026-04-10

| Collection | Status |
|---|---|
| users | ✅ Phase 1 |
| prompts, user_tools, feedback, stack_logs, user_logs, token_usage | ⏳ Phase 1 |
| agents, sources, attachments, memories, todos, notes, connector_sessions, agent_folders | ⏳ Phase 2 |
| conversations, pending_tool_state, workflows | ⏳ Phase 3 |
Schemas for every row above already exist after `init_postgres.py` runs. What lands progressively is the application-level dual-write wiring and the backfill logic for each collection. Once a collection is ✅, enabling `USE_POSTGRES=true` and running `python scripts/db/backfill.py` picks it up automatically, with no per-collection config change.
## Troubleshooting

- `relation "..." does not exist`: run `python scripts/db/init_postgres.py`.
- `FATAL: role "docsgpt" does not exist`: run the `CREATE ROLE`/`CREATE DATABASE` statements from step 2 as a Postgres superuser.
- SSL errors on a managed provider: append `?sslmode=require` to `POSTGRES_URI`.
- Dual-write warnings in the logs: expected and non-fatal. Mongo is the source of truth, so the user-facing request still succeeds. Re-run the backfill to re-sync whichever rows drifted.
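Appending `?sslmode=require` by hand is easy to get wrong when the URI already has query parameters. A small helper sketch (hypothetical, not part of DocsGPT) that adds the parameter only when no `sslmode` is already set:

```python
def with_sslmode(uri: str) -> str:
    """Append sslmode=require unless the URI already specifies an sslmode."""
    if "sslmode=" in uri:
        return uri  # respect an explicit setting, e.g. sslmode=disable
    sep = "&" if "?" in uri else "?"
    return uri + sep + "sslmode=require"
```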