3
votes

We have a PostgreSQL instance running in a VM in the Google Cloud. The nature of the queries that we run involves lots of PostgreSQL temporary table space. (5 or 6 or more TB of disk I/O every day)

This I/O continues to be a major bottleneck in our database. Currently I have it all happening on an SSD persistent disk - not because we need to save any of the data in the event of a reboot, but because PostgreSQL lays out a file structure on the disk that it then uses for the temporary tables and if the file structure is missing when the database starts up, it isn't very good.

What I'd like to do is configure the temporary tablespace on the local SSD's because of their much higher I/O throughput. Unfortunately, they get wiped out on every reboot. I'd like a simple way to be able to re-layout the disk after reboot and before PostgreSQL starts back up.

I could tar up the empty file structure and then write a script that untars it after every boot. Does that make sense? Is there a better way/best practice for doing this?

What would be awesome is if there was a PostgreSQL extension out there that did this magically.

Ideas?

1
We were running the whole PG database on local SSDs on GCE for ~1 year. Speed was great. But later suddenly one of tables was corrupted and we did not find any explanation for it. Google explains that local SSDs do not have any error correction. And some comments on web seem to suggest that others had this problem too. So local SSDs are really fast but not 100% reliable over longer time...JosMac
But does that matter for temp space?rotten
Have you considered exporting the structure on every reboot rather than doing the tar up if that is suitable in your case?Taher
@nezda we have hot standby replica but only on Persistent SSDs. Our database on local SSDs was intended for heavy nightly calculations - to gain better speed. Beside it we maintained another the same database with persistent disks. After problems with files consistency on local SSDs we started to use Bigquery - price is similar, runtime much better.JosMac
We ran into driver issues with the SSD drives that caused random errors that the database could not recover from when it was under high load. We had to back out of the high performance volatile drive configuration and return to persistent drives in the interest of stability. I do not recommend doing this any more unless you don't mind the database locking up occasionally.rotten

1 Answers

5
votes

I dug a bit into my previous tests and here is some summary:

PostgreSQL tablespace is just a directory - no big deal. Plus - if you will use it only as temporary table space there will be no persistent file left when you shutdown database.

You can create tablespace for temp tables on any location you want and then go to this location and check directory structure to see what PG created. But you must do under OS because PG will show you only tablespace main directory - both \db+ in psql or select oid, spcname, pg_tablespace_location(oid) from pg_tablespace; work the same way.

My example:

  • (I used /tempspace/pgtemp as presumed mounting point) CREATE TABLESPACE p_temp OWNER xxxxxx LOCATION '/tempspace/pgtemp'; created in my case structure /tempspace/pgtemp/PG_10_201707211
  • I set temp_tablespaces = 'pg_temp' in postgresql.conf and reloaded configuration.
  • When I used create temp table .... PG added another subdirectory - /tempspace/pgtemp/PG_10_201707211/16393 = oid of schema - but this does not matter for temp tablespace because if this subdirectory will be missing PG will create it.
  • PG created in this subdir files for temp table.
  • When I closed this session files for temp table were gone.

Now I stopped PG and tested what would happened if directories will be missing:

  • I deleted PG_10_201707211 with its subdir
  • started PG and log showed message LOG: could not open tablespace directory "pg_tblspc/166827/PG_10_201707211": No such file or directory but PG started
  • I tried to create temp table - I got error message ERROR: could not create directory "pg_tblspc/166827/PG_10_201707211/16393": No such file or directory SQL state: 58P01
  • Now (with running PG) I issued these commands in OS:
    • sudo mkdir -p /tempspace/pgtemp/PG_10_201707211
    • sudo chown postgres:postgres -R /tempspace/pgtemp
    • sudo chmod 700 -R /tempspace/pgtemp
  • I tried to create temp table again and insert and select values and everything worked OK

So conclusion is - since PG tablespace is no "big magic" just directories you can simply create bash script running on linux startup which will check (and mount if necessary) local SSD and create necessary directories for PG temp tablespace.