Sep 21 2017

Efficiently fetching updated rows with Postgres

Let's say you've got a web app that's currently fetching all its data from an API, or doing everything on the server with server-rendered views, but you want to start storing all your users' data locally. Maybe you're making a native mobile app, or you want to build a Progressive Web App like all the cool kids.

You don't necessarily care about synchronising offline edits or anything like that; not yet, anyway. All you want to do is store a read-only copy of your users' data on their devices, so you don't have to hit the network and make them wait every time they launch your app, then periodically fetch whatever's changed from the server.

The simple (but wrong) solution

The obvious thing to do here is to just store a last-modified timestamp on each row, making it DEFAULT to now() and updating it in an ON UPDATE trigger. Then, when the client wants to fetch everything that's changed on the server, it sends an if-modified-since timestamp of the last time it checked, and the server sends back every row newer than that timestamp.

This will work in testing, and if each row is only normally exposed to a single human user, it'll work in production most of the time. But there's a race condition here:

  1. Server A opens a transaction and writes Row A, firing our trigger and updating the last-modified timestamp.
  2. Server B receives an update request from Client B, and selects all updated rows.
  3. Server A commits.

Server B won't see Row A, because it wasn't committed yet. Next time Client B fetches an update, though, it'll send an if-modified-since timestamp newer than Row A's last-modified timestamp, so it'll never see Row A. (At least, not until Row A gets updated next.)

Note that Server A and Server B don't need to be distinct physical servers; they could be two separate worker processes, threads or coroutines on the same machine.

Barking up the wrong tree

What if, instead of our client sending the timestamp of the last time it checked, it sends the newest last-modified timestamp it already has? That solves the scenario above, but it doesn't solve the race condition. Consider:

  1. Server A writes Row A.
  2. Server B writes Row B, and commits.
  3. Server C receives an update request from Client C, and selects all updated rows.
  4. Server A commits.

In this scenario, Client C will never see Row A, because it's already got a copy of Row B which is newer.

Transaction IDs to the rescue

Fortunately, Postgres has functionality that can help us out here. Let's use a to-do list as an example:

create table todo_items (
  id serial primary key,
  done bool not null default false,
  name varchar(255) not null,
  last_updated_txid bigint not null

create function add_txid()
returns trigger as '
    new.last_updated_txid := txid_current();
    return new;
' language 'plpgsql';

create trigger todo_items_txid
before insert or update on todo_items
for each row
execute procedure add_txid();

This is basically the same strategy as we were using before, except instead of storing a timestamp on each row, we're storing the ID of the transaction we're currently in, which is a 64-bit integer that increases monotonically and is guaranteed to never wrap throughout the life of a Postgres installation. (Internally, Postgres uses a 32-bit ID which does wrap, but it tracks each wrap in order to provide a larger number with that guarantee to us, the user.)

The important bit happens when we go to fetch updated data. Along with all of our updated rows, we also send back to the client the output of this query:

select txid_snapshot_xmin(txid_current_snapshot());

This gives us the current xmin—the ID of the oldest transaction that is currently active. Every transaction with an ID smaller than this number is guaranteed to already be either committed or rolled back, so when our client sends that value back to us on its next fetch, we know we're definitely giving it every change they don't already have. We can see this behaviour by opening multiple Postgres client sessions side by side:

-- Session A
insert into todo_items (name) values ('Buy some milk'), ('Walk the dog');

-- Session B
insert into todo_items (name) values ('Clean the kitty litter');
update todo_items set done = true where id = 2;

-- Session C
insert into todo_items (name) values ('Write a blog post');

-- Session A
select * from todo_items;
-- +------+--------+-------------------+---------------------+
-- | id   | done   | name              | last_updated_txid   |
-- |------+--------+-------------------+---------------------|
-- | 1    | False  | Buy some milk     | 3751                |
-- | 2    | False  | Walk the dog      | 3751                |
-- | 4    | False  | Write a blog post | 3753                |
-- +------+--------+-------------------+---------------------+
select txid_snapshot_xmin(txid_current_snapshot());
-- 3752

-- Session B

-- Then, when Client A wants to update its data...
select * from todo_items where last_updated_txid >= 3752;
-- +------+--------+------------------------+---------------------+
-- | id   | done   | name                   | last_updated_txid   |
-- |------+--------+------------------------+---------------------|
-- | 3    | False  | Clean the kitty litter | 3752                |
-- | 2    | True   | Walk the dog           | 3752                |
-- | 4    | False  | Write a blog post      | 3753                |
-- +------+--------+------------------------+---------------------+

This is effectively playing out the second scenario we described above, and you can see that the changes we made in Session B are successfully retrieved.

Note that the "Write a blog post" entry gets sent to the client twice; where the timestamp method makes data invisible to clients in this edge case, this one sends data the client already has, which is much less undesirable.

For the general case, we are still only sending updated rows, as shown by the "Buy some milk" row.


What's this like in production?

I don't know, you tell me! I've got some project ideas that I want to try building as PWAs that store a cache of user data locally in IndexedDB, and this is probably how I'll sync them. For now, this is just an idea I've had bouncing around in my head that I wanted to get onto paper.

Thanks to Matt Magin for his feedback on this idea, and initially pointing out to me why the timestamp solution wouldn't work, and to Nic Crouch for his feedback and insight on a draft version of this article.