
Migrating from MongoDB to PostgreSQL

By Admin · Mar 15, 2026 · Updated Apr 25, 2026

Migrating from MongoDB to PostgreSQL involves transforming a schema-less document model into a relational schema, converting BSON data types, and rewriting application queries. PostgreSQL's JSONB support makes this transition smoother by allowing a gradual migration where some data remains semi-structured. This guide covers planning, execution, and validation of the migration.

Why Migrate from MongoDB to PostgreSQL?

  • ACID transactions — full multi-table transaction support without the complexity of MongoDB's multi-document transactions
  • SQL ecosystem — rich tooling, reporting capabilities, and developer familiarity
  • Cost — PostgreSQL is fully open source with no commercial licensing tiers
  • JSONB — PostgreSQL handles semi-structured data almost as well as MongoDB, with SQL power on top
  • Joins — replace application-level joins ($lookup) with efficient database-level joins

Migration Strategy Overview

  1. Schema analysis — understand your MongoDB collections and document structures
  2. Schema design — create PostgreSQL tables (relational + JSONB hybrid)
  3. Data export — dump MongoDB data as JSON
  4. Data transformation — convert and load into PostgreSQL
  5. Application migration — update queries and data access layer
  6. Validation — verify data integrity and performance

Step 1: Analyze MongoDB Schema

// Use a schema-analysis tool such as variety.js (a shell script you run
// against a collection) or mongodb-schema (npm install -g mongodb-schema)

// Or manually analyze in mongosh:
var fields = {};
db.users.find().limit(1000).forEach(doc => {
    Object.keys(doc).forEach(key => {
        var type = typeof doc[key];
        if (Array.isArray(doc[key])) type = "array";
        if (doc[key] instanceof ObjectId) type = "ObjectId";
        if (doc[key] instanceof Date) type = "Date";
        fields[key] = fields[key] || {};
        fields[key][type] = (fields[key][type] || 0) + 1;
    });
});
printjson(fields);
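
The same histogram can be computed offline against a newline-delimited export (the format produced in Step 3 below), which is handy for very large collections where you would rather sample a dump than query the live database. A minimal sketch; the /tmp/users.jsonl path mirrors the export step, and the type rules mirror the mongosh version above:

#!/usr/bin/env python3
# analyze_schema.py -- offline version of the mongosh histogram above
import json
from collections import defaultdict

fields = defaultdict(lambda: defaultdict(int))

with open('/tmp/users.jsonl') as f:
    for line in f:
        doc = json.loads(line)
        for key, value in doc.items():
            if isinstance(value, list):
                t = 'array'
            elif isinstance(value, dict) and '$oid' in value:
                t = 'ObjectId'   # mongoexport's Extended JSON form
            elif isinstance(value, dict) and '$date' in value:
                t = 'Date'
            elif isinstance(value, dict):
                t = 'object'
            else:
                t = type(value).__name__
            fields[key][t] += 1

print(json.dumps(fields, indent=2))

Fields that show up with more than one type in the output are the ones that need an explicit conversion rule before Step 4.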

Step 2: Design PostgreSQL Schema

The key decision is which embedded documents to normalize into separate tables versus keeping as JSONB:

-- MongoDB document:
-- {
--   _id: ObjectId("..."),
--   email: "user@example.com",
--   name: "John Doe",
--   address: { street: "123 Main", city: "NYC", zip: "10001" },
--   orders: [{ product: "Widget", qty: 3, price: 9.99 }],
--   tags: ["premium", "early-adopter"],
--   metadata: { source: "api", version: 2, custom: {...} }
-- }

-- PostgreSQL schema:
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    mongo_id VARCHAR(24) UNIQUE,  -- Keep for reference during migration
    email VARCHAR(255) UNIQUE NOT NULL,
    name VARCHAR(255) NOT NULL,
    -- Normalize address into columns (frequently queried)
    address_street VARCHAR(255),
    address_city VARCHAR(100),
    address_zip VARCHAR(20),
    -- Keep tags as array
    tags TEXT[] DEFAULT '{}',
    -- Keep flexible metadata as JSONB
    metadata JSONB DEFAULT '{}',
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Normalize orders into a separate table (one-to-many)
CREATE TABLE orders (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID REFERENCES users(id) ON DELETE CASCADE,
    product VARCHAR(255) NOT NULL,
    quantity INT NOT NULL,
    price DECIMAL(10,2) NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_users_tags ON users USING GIN(tags);
CREATE INDEX idx_users_metadata ON users USING GIN(metadata);
CREATE INDEX idx_orders_user_id ON orders(user_id);

Step 3: Export Data from MongoDB

# Export each collection as JSON
mongoexport --uri="mongodb://user:pass@localhost/mydb" \
    --collection=users \
    --out=/tmp/users.json \
    --jsonArray

# For large collections, omit --jsonArray: the default output is
# newline-delimited JSON (one document per line), which streams better
mongoexport --uri="mongodb://user:pass@localhost/mydb" \
    --collection=users \
    --out=/tmp/users.jsonl
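
mongoexport also accepts a --query filter for partial exports. When you need more control, such as an incremental export of recently changed documents, a short pymongo script does the same job. A sketch, assuming pymongo is installed and that your documents carry an updated_at field (hypothetical; substitute whatever change marker your schema actually has):

#!/usr/bin/env python3
# export_users.py -- filtered export via pymongo
from datetime import datetime, timezone
from pymongo import MongoClient
from bson import json_util  # serializes ObjectId/Date as Extended JSON

client = MongoClient("mongodb://user:pass@localhost/mydb")
db = client.get_default_database()

# Hypothetical filter: only documents changed since the last export run
since = datetime(2026, 1, 1, tzinfo=timezone.utc)
cursor = db.users.find({"updated_at": {"$gte": since}})

with open('/tmp/users.jsonl', 'w') as f:
    for doc in cursor:
        f.write(json_util.dumps(doc) + '\n')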

Step 4: Transform and Load Data

#!/usr/bin/env python3
# migrate_users.py
import json
import psycopg2
from psycopg2.extras import execute_values, Json

conn = psycopg2.connect("dbname=myapp user=postgres")
cur = conn.cursor()

with open('/tmp/users.jsonl') as f:
    batch = []
    for line in f:
        doc = json.loads(line)
        addr = doc.get('address', {})
        row = (
            doc['_id']['$oid'] if isinstance(doc['_id'], dict) else str(doc['_id']),
            doc.get('email'),
            doc.get('name', ''),
            addr.get('street'),
            addr.get('city'),
            addr.get('zip'),
            doc.get('tags', []),
            Json(doc.get('metadata', {})),
        )
        batch.append(row)

        if len(batch) >= 5000:
            execute_values(cur, """
                INSERT INTO users (mongo_id, email, name, address_street,
                    address_city, address_zip, tags, metadata)
                VALUES %s ON CONFLICT (mongo_id) DO NOTHING
            """, batch)
            conn.commit()
            batch = []

    if batch:
        execute_values(cur, """
            INSERT INTO users (mongo_id, email, name, address_street,
                address_city, address_zip, tags, metadata)
            VALUES %s ON CONFLICT (mongo_id) DO NOTHING
        """, batch)
        conn.commit()

cur.close()
conn.close()
print("Migration complete")

Step 5: Query Migration

-- MongoDB: db.users.find({ email: "user@example.com" })
SELECT * FROM users WHERE email = 'user@example.com';

-- MongoDB: db.users.find({ tags: "premium" })
SELECT * FROM users WHERE 'premium' = ANY(tags);

-- MongoDB: db.users.find({ "metadata.source": "api" })
SELECT * FROM users WHERE metadata->>'source' = 'api';

-- MongoDB: db.users.find({ "address.city": "NYC" }).sort({ name: 1 })
SELECT * FROM users WHERE address_city = 'NYC' ORDER BY name;

-- MongoDB: db.users.aggregate([
--   { $lookup: { from: "orders", localField: "_id", foreignField: "user_id", as: "orders" } }
-- ])
-- FILTER + COALESCE return [] (not [null]) for users with no orders,
-- matching $lookup's behavior
SELECT u.*,
       COALESCE(json_agg(o.*) FILTER (WHERE o.id IS NOT NULL), '[]'::json) AS orders
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
GROUP BY u.id;

-- MongoDB: db.users.updateOne({ email: "user@example.com" }, { $set: { "metadata.verified": true } })
UPDATE users SET metadata = jsonb_set(metadata, '{verified}', 'true')
WHERE email = 'user@example.com';

-- MongoDB: db.users.updateOne({ email: "user@example.com" }, { $push: { tags: "verified" } })
UPDATE users SET tags = array_append(tags, 'verified')
WHERE email = 'user@example.com';
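
When porting these queries into application code, two details are worth knowing: the containment operator @> is the JSONB form that can use the GIN index from Step 2 (the ->> equality form cannot; it would need its own expression index), and psycopg2's Json wrapper handles parameter encoding. A sketch of the data-access side:

# App-layer equivalent of: db.users.find({ "metadata.source": "api" })
import psycopg2
from psycopg2.extras import Json

conn = psycopg2.connect("dbname=myapp user=postgres")
cur = conn.cursor()

# @> (containment) can use the GIN index on metadata;
# metadata->>'source' = 'api' would need a separate expression index.
cur.execute(
    "SELECT id, email FROM users WHERE metadata @> %s",
    (Json({"source": "api"}),),
)
for row in cur.fetchall():
    print(row)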

Step 6: Validation

# Compare counts
mongosh mydb --eval "db.users.countDocuments()"
psql myapp -c "SELECT COUNT(*) FROM users"

# Spot-check random documents
mongosh mydb --eval "db.users.find().skip(Math.floor(Math.random() * 10000)).limit(5).pretty()"
# Compare with PostgreSQL results for the same mongo_ids

# Verify JSONB data integrity
psql myapp -c "SELECT COUNT(*) FROM users WHERE metadata IS NOT NULL AND metadata != '{}'::jsonb"

Best Practices

  • Use JSONB columns for truly flexible/sparse data — do not over-normalize
  • Keep the mongo_id column during migration for cross-referencing; remove it once migration is validated
  • Migrate in batches of 5,000-10,000 documents to manage memory and transaction size
  • Run both databases in parallel during transition to validate results before cutover (see the dual-write sketch after this list)
  • Use PostgreSQL arrays for simple lists (tags) and JSONB for nested structures (metadata)
  • Create GIN indexes on JSONB and array columns for query performance
  • Test all application queries against PostgreSQL before switching production traffic
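
The parallel-run bullet deserves a concrete shape. One common pattern is a thin dual-write layer: writes go to both stores, reads stay on MongoDB, and PostgreSQL failures are logged rather than surfaced until cutover. Everything below (the class name, the example method) is illustrative, not a prescribed design:

# dual_write.py -- illustrative dual-write wrapper for the parallel-run phase
import logging
import psycopg2
from pymongo import MongoClient

log = logging.getLogger("migration")

class UserStore:
    """Writes to both databases; reads stay on MongoDB until cutover."""

    def __init__(self):
        self.mongo = MongoClient("mongodb://user:pass@localhost/mydb")
        self.users = self.mongo.get_default_database().users
        self.pg = psycopg2.connect("dbname=myapp user=postgres")

    def set_verified(self, email: str) -> None:
        self.users.update_one({"email": email},
                              {"$set": {"metadata.verified": True}})
        try:
            with self.pg, self.pg.cursor() as cur:
                cur.execute(
                    "UPDATE users SET metadata = jsonb_set(metadata, '{verified}', 'true') "
                    "WHERE email = %s",
                    (email,),
                )
        except Exception:
            # During the parallel run, log and continue; MongoDB remains
            # the source of truth until cutover.
            log.exception("PostgreSQL dual-write failed for %s", email)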
