jkisolo.com

Efficient Change Tracking in PostgreSQL: A Comprehensive Guide

Written on

Chapter 1: Introduction to Change Tracking

Have you ever looked at a row in a database and wondered, "When did this change?" or "What did it look like yesterday?" I certainly have.

When I configure new PostgreSQL databases, I also implement a straightforward system to monitor all modifications made. This setup allows us to view the history of any specific row—showing the previous state, the current state, the timestamp of the change, and sometimes even the identity of the individual who made it.

This system has proven invaluable on multiple occasions when diagnosing bugs and unusual behaviors. By combining this data within the database with application logs, I can obtain a comprehensive view of how the system interacted with the data.

How Does It Work?

There are various methods to achieve this, but I find my approach to be both simple and efficient—two qualities I highly value!

I utilize a database trigger that activates a function following each insert, update, or delete operation on a table. This function collects information about the change (e.g., the state before and after the operation) and records it into a designated db_audit table that I have established for this purpose.

Diagram illustrating the change tracking process

The entire process operates within the database server, eliminating the need for any alterations to application logic, and it will also log any manual changes made directly to the database. Once implemented, this system runs automatically and can adapt to schema modifications of the monitored tables.

It's important to note that this may introduce some latency in database write operations, as each write now records changes to two tables instead of one. However, in my experience, I haven't encountered any noticeable impact—just a factor to consider for applications or tables that undergo heavy write operations.

Chapter 2: Implementation Code

Here’s how to set it up.

The Audit Table Definition

CREATE TABLE db_audit (

"id" UUID PRIMARY KEY DEFAULT gen_random_uuid(),

"timestamp" TIMESTAMP NOT NULL DEFAULT now(),

"user_id" TEXT NOT NULL,

"table" TEXT NOT NULL,

"row_id" UUID NOT NULL,

"op" TEXT NOT NULL,

"before" TEXT NOT NULL,

"after" TEXT NOT NULL

);

The user_id indicates who made the change (as recognized by PostgreSQL), which may or may not be useful depending on your situation. If there's only one database user, this will always be the same. The table represents the name of the table being audited, while row_id is the identifier of the modified row. The op signifies the type of operation (INSERT, UPDATE, or DELETE), and before and after contain JSON representations of the row's state pre- and post-change. Although I use TEXT here, you could switch to JSONB if you plan to execute queries on it (which might slightly affect write performance).

Trigger Function for Change Logging

CREATE OR REPLACE FUNCTION audit_db_change()

RETURNS TRIGGER AS $$

DECLARE

_row_id uuid;

_before text;

_after text;

BEGIN

_row_id = COALESCE(NEW.id, OLD.id, '00000000-0000-0000-0000-000000000000');

_before = COALESCE(row_to_json(OLD), '{}');

_after = COALESCE(row_to_json(NEW), '{}');

IF _before IS DISTINCT FROM _after THEN

INSERT INTO public.db_audit ("user_id", "table", "row_id", "op", "before", "after")

VALUES (user, TG_TABLE_NAME, _row_id, TG_OP, _before, _after);

END IF;

RETURN NULL; -- Returning NULL as this executes AFTER the operation

END;

$$ language 'plpgsql';

This function gathers data about the change and inserts it into the db_audit table.

Trigger Creation Command

CREATE TRIGGER table_name_audit_trigger

AFTER INSERT OR UPDATE OR DELETE

ON table_name

FOR EACH ROW EXECUTE PROCEDURE audit_db_change();

You will need to replicate this for each table you wish to monitor—just replace table_name with the appropriate table name.

Important Notes

Triggers are specific to individual tables, so remember to establish a trigger for each new table you create. Although this may seem cumbersome, it allows you to selectively audit tables, which can be beneficial, especially for those that do not require monitoring, such as log tables.

This setup presumes that every table you are tracking includes a column named id, which serves as a sufficient identifier for locating the corresponding row. If you have tables with composite primary keys, adjustments may be necessary to accommodate that.

Furthermore, since the row_id column in the audit table has a defined type, it's expected that all tables you are monitoring utilize that same type for their identifier column. You can choose any type you prefer (I used UUID here), as long as it's consistent across all tables.

You could certainly establish multiple tables and functions for different types, should the need arise.

Conclusion

And that’s all there is to it! While alternative methods may exist (perhaps involving PostgreSQL's replication capabilities?), this approach remains straightforward and has been tremendously helpful for troubleshooting issues within the backend services I oversee.

Have you ever implemented a similar system? How did you approach it?

Chapter 3: Real-Time Monitoring Techniques

Explore how to utilize Web Sockets for real-time updates in PostgreSQL with the video titled "Web Sockets and Real-Time Updates for Postgres with pg_eventserv."

Chapter 4: Advanced Change Data Capture

Learn about real-time change data capture techniques with PostgreSQL in the video titled "Real-Time Change Data Capture with PostgreSQL."

Share the page:

Twitter Facebook Reddit LinkIn

-----------------------

Recent Post:

Energizing Your Life: Transforming Laziness into Productivity

Discover three key habits that can boost your energy levels, enhance productivity, and help you overcome laziness.

Optimizing Endurance Training: A Review of Modern Techniques

A concise review of a valuable resource for runners seeking to enhance their training through scientific principles.

The Beauty of Being a Nobody: Finding Happiness Within

Explore the importance of authenticity and self-worth in a world focused on validation and status.

Navigating the Hidden Challenges of Fitness: What You Need to Know

Explore the often overlooked challenges of fitness that everyone should know about to ensure a healthy and sustainable fitness journey.

Unlocking Success: 3 Essential Rules for a Fulfilling Life

Discover the three crucial principles to achieve success in life by embracing determination, organization, and openness to new opportunities.

Innovative Side Hustles Gen Z is Embracing You Should Try!

Discover the exciting side hustles Gen Z is exploring, from social media influencing to digital art sales, and learn how to join the movement.

Balancing Work Ethics and Life: A Deeper Look

Exploring the interplay between work ethics and work-life balance, and how sacrifices in work lead to broader societal advancements.

Creating AWS Lambda Layers: A Comprehensive Guide

Discover how to effectively create and manage AWS Lambda layers for your functions.