A better backup with PostgreSQL using pg_dump
This is generously borrowed from the PostgreSQL Docs, and updated to something that represents a modern approach to PostgreSQL backups. This documentation has always bothered me because it should have been re-written years ago. Yes I plan on submitting a more comprehensive version as a patch but I don't have time to push it into DocBook right now. If someone else wants to grab it, please do. Yes, I really do believe the use of plain text backups is a mistake. Yes I realize PostgreSQL has the limitation of not being able to backup the cluster in anyway but plain text.

The standard for portable backups with PostgreSQL is pg_dump and pg_dumpall. When used properly pg_dump will create a portable and highly customizable backup file that can be used to restore all or part of a single database. The pg_dump application acts as a standard PostgreSQL client. This means that you can perform this backup procedure from any remote host that has access to the database. You do not need to be a super user to use pg_dump but you must have read (and EXECUTE for functions) access to every object within the database. Backups created by pg_dump are internally consistent, meaning, the dump represents a snapshot of the database at the time pg_dump began running. The backup will not block other operations on the database while it is working. (Exceptions are those operations that need to operate with an exclusive lock, such as most forms of ALTER TABLE.) The minimum useful syntax for pg_dump is:

pg_dump dbname > outfile

However, the backup created from this method has limited usefulness. It can be used to restore a single database in full. A more useful and proper form of PostgreSQL backup syntax looks like this:

pg_dump -U $username --format=c --file=$mydatabase.sqlc $dbname

The options in detail are:


-U, --username=NAME connect as specified database user
-F, --format=c|t|p output file format (custom, tar, plain text)
-f, --file=FILENAME output file name

The most important of which is --format. By default pg_dump uses the plain text format. The plain text format is useful for very small databases with a minimal number of objects but other than that, it should be avoided. The custom format allows for a wealth of customizability. Using the custom format you are able to restore single objects from a backup. For example to restore only a specified index from a backup file:


pg_restore -U $username --dbname=$dbname --index=$indexname

If you wanted to restore only a single function:


pg_restore -U $username --dbname=$dbname --function=$functionname(args)

If you wanted to restore only a single table:


pg_restore -U $username --dbname=$dbname --table=$tablename

For more information on all the pg_dump options, please see the reference page. Restoring the dump The command used to restore a backup file is pg_restore. It has similar options to pg_dump. A simple restore:

pg_restore -U$username --dbname=$databasename $filename

Where filename is the name of the backup file.

Do not confuse --file with $filename. The --file option is used to turn a custom format backup into a plain text backup. The value of --file will be used as the output file for that transformation.

If you make the mistake of creating a plain text backup, pg_restore can not be used as a restoration mechanism. You can use psql to restore it:

psql $dbname < $backupfile

Backing up every database The "postgresql" way of backing up every database is to use the command pg_dumpall. Unfortunately pg_dumpall can only create plain text backups and should be considered deprecated. However it is the only way to backup the globals in your cluster. A reasonable backup strategy to backup your globals and produce a flexible backup of every database in the cluster would look like this:



pg_dumpall -g -U$username --file=$globals.sql;
psql -AtU postgres -c "SELECT datname FROM pg_database \
WHERE NOT datistemplate"| \
while read f;
do pg_dump -Upostgres --format=c --file=$f.sqlc $f;
done;

If someone knows of some Windows code that produces a similar result, it would be great if you would share.

Remember, pg_dumpall creates a plain text backup. This means you will need to use psql to restore the globals backup file.

After restoring a backup, make sure you run ANALYZE to update the statistics. I know this isn't as comprehensive as it could be, but hey, its just a blog.