PostgreSQL: The Database That Does Everything (And Why It's My Default Now)
I started with MySQL. I dabbled with SQLite. Then I tried PostgreSQL, and I understood why experienced developers get weirdly passionate about it. JSON columns, full-text search, window functions, CTEs, JSONB indexing—Postgres isn't just a database, it's a Swiss Army data platform. Here's why I switched and never looked back.
The Recommendation That Changed Everything
It started with a throwaway comment in a code review.
I’d pushed a MySQL migration that created a metadata column as TEXT, with a plan to store JSON and parse it in the application layer. My reviewer—a backend lead with fifteen years of experience—left one comment:
“Why not use Postgres? You’d get a native JSONB column with indexing. No parsing in the app layer.”
I didn’t know what JSONB was. I didn’t know what “indexing JSON” meant. I barely knew what PostgreSQL was beyond “that other SQL database.”
But I trusted this person, so I spent a weekend installing PostgreSQL and migrating my side project from MySQL. By Monday, I was a convert. By the end of the month, I was one of those annoying people who brings up Postgres in every database conversation.
Here’s why.
The First Surprise: It’s Just SQL (But More)
The transition from MySQL to PostgreSQL was smoother than I expected. SQL is SQL. The basics are identical:
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
created_at TIMESTAMP DEFAULT NOW()
);
INSERT INTO users (name, email) VALUES ('Saurav', 'saurav@example.com');
SELECT * FROM users;
SERIAL instead of AUTO_INCREMENT, a handful of differences in type names and defaults. Minor dialect stuff. If you know MySQL, you can read and write PostgreSQL in an afternoon.
But then I started discovering the features MySQL didn’t have, and the gap widened into a chasm.
JSONB: When Relational Meets Flexible
This was the feature that hooked me. Remember my metadata column problem? In MySQL, I’d store JSON as a text string:
-- MySQL approach: JSON as text
CREATE TABLE products (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(255),
metadata TEXT -- Store JSON as a string, pray it's valid
);
INSERT INTO products (name, metadata) VALUES (
'Widget',
'{"color": "red", "weight": 150, "tags": ["sale", "new"]}'
);
-- To query by color? Parse JSON in the app layer.
-- Or use MySQL's JSON functions (limited, slow without indexes)
In PostgreSQL:
-- Postgres approach: JSONB is a first-class data type
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(255),
metadata JSONB
);
INSERT INTO products (name, metadata) VALUES (
'Widget',
'{"color": "red", "weight": 150, "tags": ["sale", "new"]}'
);
-- Query JSON fields directly
SELECT name, metadata->>'color' AS color
FROM products
WHERE metadata->>'color' = 'red';
-- Query nested values
SELECT name FROM products
WHERE metadata @> '{"tags": ["sale"]}';
-- Index JSON fields for fast lookups (GIN covers containment and key-existence)
CREATE INDEX idx_products_metadata ON products USING GIN (metadata);
-- The ->> equality query above needs an expression index instead:
CREATE INDEX idx_products_color ON products ((metadata->>'color'));
The @> operator means “contains.” I’m asking: “give me products whose metadata contains a tags array with ‘sale’ in it.” With the GIN index in place, that check is fast. No full table scan. One caveat: the default GIN index accelerates containment (@>) and key-existence operators, not ->> text comparisons, which is why the color query gets its own expression index.
This solved a problem I’d been hitting for months: data that was mostly structured but had some fields that varied between records. Product attributes (shoes have sizes, laptops have RAM), user preferences, API response caching—all cases where you need the reliability of a relational database with the flexibility of a document store.
People told me “use MongoDB for flexible data.” Postgres said “I can do both.”
Arrays: A Column Can Hold Multiple Values
MySQL doesn’t have array columns. If a post has multiple tags, you need a junction table:
-- MySQL: three tables for posts with tags
CREATE TABLE posts (id INT PRIMARY KEY, title VARCHAR(255));
CREATE TABLE tags (id INT PRIMARY KEY, name VARCHAR(100));
CREATE TABLE post_tags (post_id INT, tag_id INT); -- Junction table
PostgreSQL:
-- Postgres: arrays are a data type
CREATE TABLE posts (
id SERIAL PRIMARY KEY,
title VARCHAR(255),
tags TEXT[] -- Array of text values
);
INSERT INTO posts (title, tags) VALUES
('Learning Postgres', ARRAY['database', 'postgresql', 'tutorial']),
('Why I Switched', ARRAY['database', 'mysql', 'migration']);
-- Find posts with a specific tag
SELECT title FROM posts WHERE 'database' = ANY(tags);
-- Find posts with ALL of these tags
SELECT title FROM posts WHERE tags @> ARRAY['database', 'postgresql'];
-- Index the array for fast lookups
CREATE INDEX idx_posts_tags ON posts USING GIN(tags);
Is this always better than a junction table? No. For complex many-to-many relationships with additional data on the relationship (like “when was this tag added?”), junction tables are still the right call. But for simple lists of values that belong to a record, arrays are cleaner and faster.
Full-Text Search: No Elasticsearch Needed
I was about to add Elasticsearch to a project just for search functionality. Then I discovered PostgreSQL’s built-in full-text search.
-- Add a search vector column
ALTER TABLE posts ADD COLUMN search_vector tsvector;
-- Populate it from title and content
UPDATE posts SET search_vector =
setweight(to_tsvector('english', title), 'A') ||
setweight(to_tsvector('english', content), 'B');
-- Create an index for fast search
CREATE INDEX idx_posts_search ON posts USING GIN(search_vector);
-- Search!
SELECT title, ts_rank(search_vector, query) AS rank
FROM posts, to_tsquery('english', 'database & postgresql') query
WHERE search_vector @@ query
ORDER BY rank DESC;
The setweight function lets me prioritize title matches over content matches. The ts_rank function scores results by relevance. The GIN index makes it fast.
Is this as powerful as Elasticsearch? No. Elasticsearch has better fuzzy matching, better faceted search, better scaling for terabytes of text. But for a blog, a documentation site, or an internal tool? PostgreSQL’s full-text search is more than enough, and it’s one less service to deploy, monitor, and maintain.
I removed Elasticsearch from that project. The search was 90% as good with zero additional infrastructure.
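One refinement: the UPDATE above populates the vector once, and it goes stale as rows change. On PostgreSQL 12 or newer, a generated column can keep it in sync automatically. A sketch, assuming a posts table with title and content columns:

```sql
-- Generated column: Postgres recomputes the vector on every INSERT/UPDATE.
-- (Requires PostgreSQL 12+; the two-argument to_tsvector is immutable,
-- which is what generated columns demand.)
ALTER TABLE posts
  ADD COLUMN search_vector tsvector
  GENERATED ALWAYS AS (
    setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
    setweight(to_tsvector('english', coalesce(content, '')), 'B')
  ) STORED;
```

On older versions, a trigger using tsvector_update_trigger does the same job, just more verbosely.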
CTEs: Queries That Read Like Prose
Common Table Expressions changed how I write complex queries. Instead of nested subqueries that make your eyes cross, you build queries step by step:
-- Without CTEs: nested, hard to read
SELECT department, avg_salary
FROM (
SELECT department, AVG(salary) as avg_salary
FROM (
SELECT * FROM employees WHERE status = 'active'
) active_employees
GROUP BY department
) dept_averages
WHERE avg_salary > 80000;
-- With CTEs: clear, step-by-step
WITH active_employees AS (
SELECT * FROM employees WHERE status = 'active'
),
department_averages AS (
SELECT department, AVG(salary) AS avg_salary
FROM active_employees
GROUP BY department
)
SELECT department, avg_salary
FROM department_averages
WHERE avg_salary > 80000;
Same result. But the CTE version reads top to bottom like a recipe: first get active employees, then calculate department averages, then filter. Each step has a name. Each step is readable independently.
MySQL didn’t get CTEs (recursive ones included) until version 8.0 in 2018; PostgreSQL has had them since 8.4, back in 2009. Recursive CTEs are genuinely powerful for tree-structured data:
-- Get a category and all its subcategories (tree traversal)
WITH RECURSIVE category_tree AS (
-- Start with the root category
SELECT id, name, parent_id, 0 AS depth
FROM categories
WHERE id = 1
UNION ALL
-- Recursively find children
SELECT c.id, c.name, c.parent_id, ct.depth + 1
FROM categories c
JOIN category_tree ct ON c.parent_id = ct.id
)
SELECT * FROM category_tree ORDER BY depth, name;
Try doing that with a regular query. I’ll wait.
Window Functions: Analytics Without GROUP BY Pain
This is the feature I use daily and wonder how I lived without.
Say I want each employee’s salary and how it compares to their department average:
-- Without window functions: awkward self-join
SELECT e.name, e.salary, d.avg_salary
FROM employees e
JOIN (
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
) d ON e.department = d.department;
-- With window functions: one pass, clean
SELECT
name,
department,
salary,
AVG(salary) OVER (PARTITION BY department) AS dept_average,
salary - AVG(salary) OVER (PARTITION BY department) AS diff_from_average,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
FROM employees;
One query. Each employee’s row includes their salary, their department’s average, the difference, and their rank within the department. No self-joins. No subqueries. No GROUP BY collapsing rows.
Window functions operate over a set of rows but don’t collapse them. You keep every row and add computed values. Once this clicked, I started seeing uses for it everywhere:
-- Running total of orders by date
SELECT
order_date,
amount,
SUM(amount) OVER (ORDER BY order_date) AS running_total
FROM orders;
-- Percentage of total revenue per product
SELECT
product_name,
revenue,
ROUND(revenue * 100.0 / SUM(revenue) OVER (), 2) AS pct_of_total
FROM product_sales;
-- Previous and next values (for comparison)
SELECT
month,
revenue,
LAG(revenue) OVER (ORDER BY month) AS prev_month,
revenue - LAG(revenue) OVER (ORDER BY month) AS growth
FROM monthly_revenue;
LAG gives you the previous row’s value. LEAD gives you the next. RANK, ROW_NUMBER, DENSE_RANK for rankings. SUM, AVG, COUNT as window functions for running aggregates. This is a language for data analysis built right into the database.
The Things That Made Me Stay
Constraints That Actually Constrain
PostgreSQL takes data integrity seriously. Beyond basic NOT NULL and UNIQUE:
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
user_id INTEGER REFERENCES users(id) ON DELETE CASCADE,
status VARCHAR(20) CHECK (status IN ('pending', 'shipped', 'delivered', 'cancelled')),
quantity INTEGER CHECK (quantity > 0),
total DECIMAL(10, 2) CHECK (total >= 0),
shipped_at TIMESTAMP,
-- Custom constraint: shipped_at must be set when status is 'shipped'
CONSTRAINT shipped_must_have_date
CHECK (status != 'shipped' OR shipped_at IS NOT NULL)
);
That last constraint is business logic enforced at the database level. “If the status is ‘shipped’, there must be a shipped date.” Try inserting a shipped order without a date and PostgreSQL rejects it. The application can have bugs. The database won’t.
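To see it in action, here’s a hypothetical insert that trips the constraint (assuming a user with id 1 exists):

```sql
-- status is 'shipped' but shipped_at is NULL, so Postgres rejects the row
INSERT INTO orders (user_id, status, quantity, total)
VALUES (1, 'shipped', 2, 19.99);
-- ERROR: new row violates check constraint "shipped_must_have_date"
```

Add shipped_at = NOW() and the same insert goes through.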
ENUM Types That Are Actual Types
CREATE TYPE order_status AS ENUM ('pending', 'shipped', 'delivered', 'cancelled');
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
status order_status DEFAULT 'pending'
);
-- Adding a new value later
ALTER TYPE order_status ADD VALUE 'refunded' AFTER 'cancelled';
In MySQL, ENUMs are column-level. In PostgreSQL, they’re database-level types. Define once, use in many tables. Add values without altering every table.
UPSERT: Insert or Update in One Statement
One of my most common patterns:
-- Insert a user. If their email already exists, update their name instead.
INSERT INTO users (email, name) VALUES ('saurav@example.com', 'Saurav Sitaula')
ON CONFLICT (email)
DO UPDATE SET name = EXCLUDED.name, updated_at = NOW();
One query. No “check if exists, then insert or update” logic in the application. No race conditions between the check and the insert. The database handles it atomically.
LISTEN/NOTIFY: Built-In Pub/Sub
This one blew my mind:
-- In one connection
LISTEN new_order;
-- In another connection
NOTIFY new_order, '{"order_id": 123, "total": 29.99}';
-- First connection receives the notification instantly
A basic pub/sub system built into the database. No Redis, no RabbitMQ—just PostgreSQL. For simple real-time features (new order notifications, cache invalidation signals), this is often enough.
PostgreSQL vs MySQL: The Honest Comparison
I’m not going to pretend MySQL is bad. It’s not. Billions of sites run on MySQL. But after using both extensively, here’s my honest take:
MySQL is easier to start with. The installation is simpler. The documentation is more beginner-friendly. The defaults are more forgiving. If you’re learning SQL for the first time, MySQL is a fine choice.
PostgreSQL is better to grow into. As your queries get complex, as your data models get sophisticated, as your performance requirements increase, PostgreSQL has more tools. JSONB, arrays, window functions, CTEs, full-text search, custom types, advanced indexing—these aren’t niche features. They’re tools I reach for weekly.
MySQL is faster for simple reads. For basic SELECT * FROM users WHERE id = 1, MySQL’s simpler architecture can be marginally faster. The difference is negligible for most applications.
PostgreSQL is faster for complex queries. The query planner is more sophisticated. Complex JOINs, subqueries, and analytical queries perform better. For reporting and analytics workloads, PostgreSQL pulls ahead significantly.
MySQL has a larger beginner ecosystem. More shared hosting support, more WordPress tutorials, more “getting started” content.
PostgreSQL has a larger professional ecosystem. More extensions (PostGIS for geospatial, TimescaleDB for time-series, pgvector for AI embeddings), more advanced tooling, and increasingly more cloud support.
What I Wish I’d Known Earlier
- Start with PostgreSQL if you can. Basic SQL transfers in both directions, but PostgreSQL features like JSONB, arrays, and window functions don’t exist in MySQL (or exist only in limited form).
- EXPLAIN ANALYZE is your best friend. Not just EXPLAIN: EXPLAIN ANALYZE actually runs the query and shows real execution times. It’s the fastest path to understanding why a query is slow.
- JSONB doesn’t mean “use Postgres as a document database.” Use relational columns for structured data. Use JSONB for genuinely flexible data that varies between records. Putting everything in JSONB defeats the purpose of a relational database.
- Extensions are superpowers. pg_trgm for fuzzy string matching. PostGIS for geospatial queries. pgvector for AI embedding similarity search. pg_stat_statements for query performance monitoring. The extension ecosystem is massive.
- Connection pooling matters. PostgreSQL creates a new process for each connection (unlike MySQL’s threads). For web applications, use PgBouncer or your framework’s built-in pooling. Without it, you’ll hit connection limits fast.
- The community is incredible. The PostgreSQL mailing lists, the documentation (genuinely the best database docs I’ve read), the conference talks—the community is passionate, helpful, and deeply knowledgeable.
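For the pooling point above, a minimal PgBouncer sketch. The database name, paths, and pool sizes are placeholders, not recommendations; tune them for your workload:

```ini
; pgbouncer.ini -- minimal sketch
[databases]
myapp = host=127.0.0.1 port=5432 dbname=myapp

[pgbouncer]
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction   ; share server connections between transactions
max_client_conn = 500     ; client connections PgBouncer will accept
default_pool_size = 20    ; actual Postgres connections per database/user
```

The app connects to port 6432 instead of 5432; hundreds of client connections get multiplexed over a handful of real Postgres backends.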
Why It’s My Default
Every new project starts with PostgreSQL now. Not because I evaluated alternatives and chose it. Because I stopped needing to evaluate.
Need JSON flexibility? Postgres has JSONB. Need search? Postgres has full-text search. Need geospatial? Postgres has PostGIS. Need analytics? Postgres has window functions. Need a queue? Postgres has LISTEN/NOTIFY (and SKIP LOCKED for job queues). Need vector similarity for AI features? Postgres has pgvector.
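The SKIP LOCKED job-queue pattern mentioned above looks like this. The jobs table and its columns are illustrative, but the locking pattern is the standard one:

```sql
CREATE TABLE jobs (
  id BIGSERIAL PRIMARY KEY,
  payload JSONB NOT NULL,
  status TEXT NOT NULL DEFAULT 'pending'
);

-- A worker atomically claims one pending job. Concurrent workers
-- skip rows another worker has already locked instead of blocking,
-- so no two workers ever grab the same job.
UPDATE jobs SET status = 'running'
WHERE id = (
  SELECT id FROM jobs
  WHERE status = 'pending'
  ORDER BY id
  FOR UPDATE SKIP LOCKED
  LIMIT 1
)
RETURNING id, payload;
```

Pair it with LISTEN/NOTIFY to wake workers when new jobs arrive, and you have a working queue with zero extra infrastructure.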
The answer to “should I add another service for this?” keeps being “Postgres already does it.”
I’m not saying it’s perfect. Connection management is fussy. The replication setup is more complex than MySQL’s. The learning curve for advanced features is real. But the depth of what a single PostgreSQL instance can do—relational data, document data, full-text search, geospatial queries, time-series, pub/sub, job queues—is unmatched.
I started with MySQL and I’m grateful for that foundation. I used SQLite and I still reach for it when simplicity matters. But PostgreSQL is where I build things I care about.
It does everything. And it does everything well.
P.S. — There’s a running joke in the Postgres community: “What’s the best NoSQL database? PostgreSQL.” It’s funny because it’s true. JSONB with GIN indexes handles document-style workloads that used to require MongoDB. And you still get transactions, joins, and constraints. Best of both worlds. The Postgres community is very smug about this. Justifiably so.
Saurav Sitaula
Software Architect • Nepal