Wednesday, April 2, 2014

Putting stats_temp_directory on a ramdisk

This hack is an old chestnut among PostgreSQL performance tuners, but it doesn't seem to be widely known elsewhere. That's a shame, because it's pure win, and it's ridiculously easy to set up. You don't even need to restart PostgreSQL.

Here's the situation: PostgreSQL writes certain temporary statistics. These go in the dir given by the stats_temp_directory setting. By default, that's pg_stat_tmp in the data dir. Temp files get written a lot, but there's no need for them to persist.

That makes them perfect candidates for a ramdisk (a.k.a. RAM drive). A ramdisk is a chunk of memory treated as a block device by the OS. Because it's RAM, it's super-fast. As far as the app is concerned, the ramdisk just holds a filesystem that it can read and write like any other. Moreover, PostgreSQL generally only needs a few hundred kilobytes for stats_temp_directory; any modern server can fit that in RAM.

In Linux, you set up a ramdisk like this:

As root:

'mkdir /var/lib/pgsql_stats_tmp' [1]

'chmod 777 /var/lib/pgsql_stats_tmp'

'chmod +t /var/lib/pgsql_stats_tmp'

Add this line to /etc/fstab. That 2G is an upper limit; the system will use only as much as it needs.

tmpfs /var/lib/pgsql_stats_tmp tmpfs size=2G,uid=postgres,gid=postgres 0 0

'mount /var/lib/pgsql_stats_tmp'

Then, as postgres:

Change the stats_temp_directory setting in postgresql.conf:

stats_temp_directory = '/var/lib/pgsql_stats_tmp'

Tell PostgreSQL to re-read its configuration:

'pg_ctl -D YOUR_DATA_DIR reload'

And that's it!

Other operating systems have different ways to set up ramdisks. Perhaps I'll cover them in a later post.

[1] The directory /var/lib/pgsql_stats_tmp is an arbitrary choice, but it works well for Debian's filesystem layout.

9 comments:

  1. Can you post some benchmark results or statistics showing the performance benefit?

    ReplyDelete
  2. Or just /dev/shm or /run/shm..

    The speed improvements can be massive. We went from 1800 iops to 100, writes from ~100mb/s to mere kb/s.

    ReplyDelete
  3. Colin: I don't have such data on hand, but I'll grab some next time I tune a suitable system.

    ReplyDelete
  4. I generally just set stats_temp_directory='/run/postgresql'. That directory always exists and has the right privileges on Debian/Ubuntu/Arch machines, and /run uses tmpfs in most setups. Not sure about other distros.

    ReplyDelete
  5. Shouldn't the OS just avoid hitting the disk for highly transient data ?

    ReplyDelete
  6. Here's what I added to my PostgreSQL installation script on CentOS.

    echo "tmpfs $PGDATA/pg_stat_tmp tmpfs size=1G,uid=postgres,gid=postgres 0 0" >> /etc/fstab
    mount $PGDATA/pg_stat_tmp

    ReplyDelete
  7. Since the update interval is at 500ms per default (see http://www.postgresql.org/docs/9.3/static/monitoring-stats.html#MONITORING-STATS-VIEWS) I wouldn't expect seeing more than a few iops spent on updating the statistics files, unless you've got a huge number of active databases.

    ReplyDelete
  8. From the docs:
    "stats_temp_directory (string)
    Sets the directory to store temporary statistics data in. This can be a path relative to the data directory or an absolute path. The default is pg_stat_tmp. Pointing this at a RAM-based file system will decrease physical I/O requirements and can lead to improved performance. This parameter can only be set in the postgresql.conf file or on the server command line."

    pg_ctl reload will not update this setting, you must restart

    ReplyDelete
  9. @Noah,

    Changing this setting only requires a SIGHUP (pg_ctl reload)
    postgres=# select name,context from pg_settings where name = 'stats_temp_directory';
    name | context
    ----------------------+---------
    stats_temp_directory | sighup
    (1 row)

    ReplyDelete