Database Changes in Live Systems


Why schema changes break production and how experienced teams avoid it.

Introduction: Databases remember everything

You can redeploy code in minutes.
You can roll back services.
You can toggle feature flags.

But your database?

It remembers every decision you ever made.

Most production failures related to databases don’t come from performance or capacity.
They come from schema changes that didn’t respect existing data.

If APIs punish careless change, databases are unforgiving.


Why database changes are uniquely dangerous

Database changes affect:

  • Running code

  • Historical data

  • Batch jobs

  • Reports

  • External integrations

And they do it all at once.

Unlike code, database changes are often:

  • Irreversible

  • Shared across services

  • Hard to test realistically

This is why experienced engineers treat database evolution as a slow, deliberate process.


The biggest misconception: “It’s just a small migration”

There is no such thing as a small migration in production.

A single ALTER TABLE can:

  • Lock tables

  • Break reads

  • Corrupt assumptions

  • Expose hidden bugs

If a change touches existing data, it is not small.


Example 1: The dangerous NOT NULL

You add a new column:

ALTER TABLE orders
ADD COLUMN source_type VARCHAR(20) NOT NULL;

Looks reasonable.

But:

  • Old rows don’t have this value

  • Inserts fail

  • Reads fail

  • Reports fail

The system breaks in places you didn’t expect.

The safer approach (always)

  1. Add the column as nullable

  2. Deploy code that handles NULL

  3. Backfill existing data

  4. Add NOT NULL constraint later

This is not overengineering.
This is production hygiene.
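The failure and the safe path can both be reproduced end to end. A minimal sketch using SQLite through Python's sqlite3 (the article's DDL is Postgres-flavored, so types and error messages differ here; the 'unknown' backfill value is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO orders (id) VALUES (1)")  # pre-existing row

# The dangerous version: adding NOT NULL directly is rejected,
# because the existing row has no value for the new column.
try:
    conn.execute("ALTER TABLE orders ADD COLUMN source_type TEXT NOT NULL")
except sqlite3.OperationalError as e:
    print("rejected:", e)

# Steps 1 and 3 of the safe path: nullable column first, then backfill.
# (Steps 2 and 4 -- NULL-tolerant code, then the constraint -- come later.)
conn.execute("ALTER TABLE orders ADD COLUMN source_type TEXT")
conn.execute("UPDATE orders SET source_type = 'unknown' WHERE source_type IS NULL")
```

Note that the database itself refuses the unsafe version here; in engines that accept it (with a default, or with implicit table rewrites), the breakage just moves to read paths and locks instead.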


Expand → Migrate → Contract (non-negotiable pattern)

This pattern should be burned into memory.

Step 1: Expand

Add new schema elements without breaking existing ones.

ALTER TABLE orders
ADD COLUMN source_type VARCHAR(20);

Step 2: Migrate

Gradually update:

  • Old data

  • Old code paths

  • Old queries

Step 3: Contract

Remove or restrict only after verification.

ALTER TABLE orders
ALTER COLUMN source_type SET NOT NULL;

Skipping steps is how outages happen.
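The Migrate step in particular is usually run in small batches, so that no single transaction holds locks long enough to block traffic. A sketch, again in SQLite via sqlite3; the batch size and the 'legacy' placeholder value are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, source_type TEXT)")
conn.executemany("INSERT INTO orders (id) VALUES (?)", [(i,) for i in range(1, 1001)])

BATCH = 100  # small batches keep each transaction (and its locks) short

while True:
    with conn:  # one transaction per batch
        cur = conn.execute(
            "UPDATE orders SET source_type = 'legacy' "
            "WHERE id IN (SELECT id FROM orders WHERE source_type IS NULL LIMIT ?)",
            (BATCH,),
        )
    if cur.rowcount == 0:
        break  # nothing left to backfill

remaining = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE source_type IS NULL"
).fetchone()[0]
```

Because each batch commits independently, the backfill can be paused, resumed, or rate-limited at any point, which is exactly what a migration needs to be.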


Example 2: Renaming a column (don’t)

Renaming feels harmless:

ALTER TABLE users
RENAME COLUMN mobile TO phone;

But:

  • Old code still queries mobile

  • Old reports fail

  • Old scripts break silently

Safer alternative

  1. Add new column phone

  2. Keep both in sync

  3. Update code gradually

  4. Remove mobile later

Yes, it’s extra work.
So is recovering from production incidents.
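One way to keep both columns in sync during the transition is a database trigger, so that old code writing mobile transparently updates phone. A sketch in SQLite via sqlite3 (the trigger name is hypothetical, and a real migration would also need an insert trigger, plus the reverse direction once new code starts writing phone):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, mobile TEXT)")
conn.execute("INSERT INTO users (id, mobile) VALUES (1, '555-0100')")

# Step 1: add the new column and copy existing values across.
conn.execute("ALTER TABLE users ADD COLUMN phone TEXT")
conn.execute("UPDATE users SET phone = mobile")

# Step 2: keep both columns in sync while old code still writes `mobile`.
conn.executescript("""
CREATE TRIGGER sync_phone AFTER UPDATE OF mobile ON users
BEGIN
    UPDATE users SET phone = NEW.mobile WHERE id = NEW.id;
END;
""")

# Old code updates `mobile`; new code reading `phone` still sees the value.
conn.execute("UPDATE users SET mobile = '555-0199' WHERE id = 1")
```

Only after access logs show nothing reads mobile anymore does step 4, dropping the old column, become safe.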


Historical data is the real enemy

Most schema changes fail because of old data, not new code.

Old data:

  • Has missing fields

  • Has unexpected values

  • Was created under different rules

Your new logic must handle:

  • Nulls

  • Defaults

  • Inconsistent formats

Example 3: New validation on old data

You add validation:

email must not be empty

But historical data contains:

email = ""

Now:

  • Reads fail

  • Jobs crash

  • Admin screens break

New rules must coexist with old reality.
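In practice that means enforcing the new rule on writes, while reads tolerate the old reality. A minimal sketch in Python with sqlite3; validate_email and read_email are illustrative helpers, not a real framework API:

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO users (id, email) VALUES (1, '')")  # legacy row

def validate_email(email: str) -> str:
    """Enforce the new rule on writes only."""
    if not email:
        raise ValueError("email must not be empty")
    return email

def read_email(user_id: int) -> Optional[str]:
    """Reads tolerate old reality: treat a legacy '' the same as missing."""
    row = conn.execute("SELECT email FROM users WHERE id = ?", (user_id,)).fetchone()
    return row[0] or None

# New writes are rejected, but the legacy row does not crash the read path.
try:
    validate_email("")
except ValueError:
    pass
```

The split matters: validation belongs at the write boundary, while every read path (jobs, reports, admin screens) must survive data created under the old rules.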


Zero-downtime migrations are a discipline

Zero-downtime is not a tool.
It’s a mindset.

It means:

  • No locks that block traffic

  • No assumptions about clean data

  • No “quick fixes” during business hours

Rule:
If a migration cannot be safely paused or rolled back, it’s not ready.


Anti-patterns that hurt the most

Anti-pattern 1: Coupling code deploy with schema change

Deploying code and schema together assumes:

  • No rollback

  • No partial failure

  • Perfect timing

Reality disagrees.

Schema first.
Code second.
Cleanup last.


Anti-pattern 2: Deleting columns too early

Columns are rarely “unused”.
They are just used quietly.

Before deleting:

  • Log access

  • Search queries

  • Check reports

  • Wait longer than you think

Deletion is easy.
Recovery is not.


Anti-pattern 3: One migration script for everything

Large migrations fail mid-way.

Split them:

  • One change per script

  • One responsibility per migration

  • Clear rollback steps

Small migrations fail less catastrophically.


How experienced teams treat database changes

They:

  • Treat migrations as production code

  • Review them carefully

  • Test them on real-like data

  • Run them slowly and deliberately

They assume:

  • Data is messy

  • Code will be rolled back

  • Something will go wrong

And they plan accordingly.


A simple checklist before database changes

Before touching the database, ask:

  1. What old data exists?

  2. What code reads this table?

  3. Can this change be additive?

  4. What happens if this runs slowly?

  5. How do we roll back safely?

If rollback is unclear, stop.


Final thought: databases punish arrogance

Databases are honest.

They don’t care about deadlines, intentions, or confidence.
They only care about correctness.

If you respect existing data, databases will serve you well.
If you rush change, they will expose it—usually in production.

More such articles:

https://medium.com/techwasti

https://www.youtube.com/@maheshwarligade

https://techwasti.com/series/spring-boot-tutorials

https://techwasti.com/series/go-language
