How to work with Docker data volumes and make PostgreSQL work with them.

Challenge

When you work with Docker containers it is sometimes difficult to get persistence right. Where is stuff saved? What can I delete? What do I need to back up?
What if an upgrade comes along, do I lose my data?

Prerequisites

Solution: Docker Data Volumes

I found the problem out the hard way. I lost all my test data because of an upgrade. I totally forgot that if you delete a container you lose the data stored with it. So I had a new goal: get the whole setup working again, but now with external data.

Why not put the data on the native file system? Well, because that is more difficult to set up, since native file systems differ from host to host.

In this blog post I will try to explain how to make this all work with the standard postgres image.

Case: Make postgres use a data volume

I have the standard postgres docker image and want to externalise its data to a data volume and be able to:

  • Make backups
  • Restore backups
    etc.

Step 1: Create a data-only Docker container

It is actually very easy to create a data volume.

This command will do it:

docker run --name data -v /data busybox true

This command will create a named container called “data” with a volume at /data, based on the busybox image.
If you have another container that needs a volume at /data, you can make it use the one you just created with the --volumes-from option.

You can inspect images to see what they are using for volumes.

docker inspect -f "{{ .Config.Volumes }}" postgres

with result:
map[/var/lib/postgresql/data:{}]

This means that the postgres image has one volume configured and it is located at: /var/lib/postgresql/data
Now we have something to work with.
If you do not configure a data volume, postgres will store its data at the mentioned location, but that data is then tied to the container: if you throw away the container because of e.g. an upgrade, you will lose the data.
Separation of concerns baby… so we want a data volume.

docker run --name ivonet-postgres-data -v /var/lib/postgresql/data busybox true

If you list all your containers with docker ps -a you will see that it created the ivonet-postgres-data container.

Step 2: Have fun with a data volume

Just to get used to some tricks… a bit of playing around to get familiar with it all.

docker start ivonet-postgres-data

will not do much ;-) and if you run docker ps it will not show up, because the command it runs is true, which exits immediately…
But how do you access the /var/lib/postgresql/data volume then?
Remember the --volumes-from parameter I mentioned earlier?

docker run --rm -it --volumes-from ivonet-postgres-data busybox /bin/sh

This will start a shell in a busybox container with the volumes from ivonet-postgres-data attached. If you ls /var/lib/postgresql you will see a data folder, and yes, that is the folder of your data volume.

Now, still in the shell, do the following:

cd /var/lib/postgresql/data
touch "HelloWorld"

Now leave the shell by pressing ctrl-d or typing exit.
You should now have created a file called HelloWorld in the /var/lib/postgresql/data folder of your data volume.

Step 3: Let's make backups work

docker run --rm -it -v $(pwd):/backup --volumes-from ivonet-postgres-data busybox tar -vpczf /backup/backup-postgres-data-$(date +"%Y-%m-%d").tar.gz /var/lib/postgresql/data

What does this command do?

  • run a busybox container that will be removed when it exits
  • mount the current host directory at /backup in the container
  • attach the volumes of ivonet-postgres-data
  • create a tar.gz of the postgres data folder
  • quit

The result of this command in this example will be a file called something like backup-postgres-data-2015-12-11.tar.gz in the folder you ran the command in (host folder).

If you extract it you will find the HelloWorld file you created earlier.
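
If you make backups more than once, it may help to wrap the command in a small function. The following is only a sketch: the function name backup_volume and the DOCKER override variable are my own inventions for illustration, not something Docker provides. Setting DOCKER=echo prints the command instead of running it, which is handy for a dry run. I left out -it here because the backup does not need an interactive terminal.

```shell
# Hypothetical helper (not part of Docker): archive a path from a data
# container into a dated tar.gz in the current host directory.
# DOCKER can be overridden, e.g. DOCKER=echo, to print instead of run.
backup_volume() {
  data_container=$1   # e.g. ivonet-postgres-data
  data_path=$2        # e.g. /var/lib/postgresql/data
  archive_name="backup-postgres-data-$(date +%Y-%m-%d).tar.gz"

  ${DOCKER:-docker} run --rm \
    -v "$(pwd)":/backup \
    --volumes-from "$data_container" \
    busybox tar -czpf "/backup/$archive_name" "$data_path"
}
```

Usage would look like: backup_volume ivonet-postgres-data /var/lib/postgresql/data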

Step 4: Let's restore a backup

To test this we of course first need to remove the HelloWorld file. Or you can remove the whole data volume and recreate it from scratch. You should now have the skills :-)

You can also do it this way:

docker run --rm -it -v $(pwd):/backup --volumes-from ivonet-postgres-data busybox rm -rfv /var/lib/postgresql/data/HelloWorld

Now the restore:

docker run --rm -it -v $(pwd):/backup --volumes-from ivonet-postgres-data busybox tar -C / -xzvf /backup/backup-postgres-data-2015-12-11.tar.gz

Check whether it worked. The data folder should contain the HelloWorld file again.

docker run --rm -it -v $(pwd):/backup --volumes-from ivonet-postgres-data busybox ls -lsa /var/lib/postgresql/data
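
Why does the restore use tar -C / ? When tar archives an absolute path it drops the leading slash from the member names, so the archive contains var/lib/postgresql/data/... and has to be unpacked relative to the root. The sketch below shows the same effect without Docker, using a temporary directory as a stand-in for the volume. All names in it (work, first_member, restore_ok) are made up for the demonstration.

```shell
# Stand-alone demonstration (no Docker needed) of tar's handling of
# absolute paths, using a temporary directory instead of the real volume.
work=$(mktemp -d)
mkdir -p "$work/src/data" "$work/restore"
touch "$work/src/data/HelloWorld"

# Archive an absolute path; tar drops the leading "/" from member names
# (the notice it prints about that is silenced here).
tar -czf "$work/backup.tar.gz" "$work/src/data" 2>/dev/null

# The first member name is relative: it does not start with a slash.
first_member=$(tar -tzf "$work/backup.tar.gz" | head -n 1)
echo "first member: $first_member"

# Unpacking relative to a chosen root recreates the full path beneath it.
tar -C "$work/restore" -xzf "$work/backup.tar.gz"
if [ -f "$work/restore$work/src/data/HelloWorld" ]; then
  restore_ok=yes
else
  restore_ok=no
fi
echo "restored: $restore_ok"

rm -rf "$work"
```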

Now remove the HelloWorld file again because Postgres needs an empty folder to initialize its database.
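
If you want to be sure the folder really is empty before starting postgres, a check along these lines could help. It is only a sketch, and is_empty_dir is a name I made up. Hidden files count too, which is why it uses ls -A.

```shell
# Hypothetical helper: succeeds only when the directory exists and has
# no entries at all, hidden files included.
is_empty_dir() {
  [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

# Inside the data volume the same test could be run via busybox, e.g.:
#   docker run --rm --volumes-from ivonet-postgres-data busybox \
#     sh -c '[ -z "$(ls -A /var/lib/postgresql/data)" ] && echo empty'
```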

Step 5: Now create a postgres with it all

Really easy step :-)

docker run --name ivonet-postgres --volumes-from ivonet-postgres-data -p 5432:5432 -e POSTGRES_DB=ivonet -e POSTGRES_USER=ivonet -e POSTGRES_PASSWORD=ivonet -d postgres

You can look at the log:

docker logs ivonet-postgres
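
Instead of watching the log by hand, you could poll until the server answers. This is a rough sketch that assumes the pg_isready tool is available inside the postgres container (it ships with PostgreSQL 9.3 and later). The function name wait_for_postgres and the DOCKER override are hypothetical. Note that it can report ready a little early while the image is still running its first-time initialization.

```shell
# Hypothetical helper: poll pg_isready inside the container once per
# second until it succeeds or the number of tries runs out.
# DOCKER can be overridden (e.g. DOCKER=true) for a dry run.
wait_for_postgres() {
  pg_container=$1        # e.g. ivonet-postgres
  max_tries=${2:-30}     # default: give up after about 30 seconds
  try=1
  while [ "$try" -le "$max_tries" ]; do
    if ${DOCKER:-docker} exec "$pg_container" pg_isready -q; then
      return 0
    fi
    try=$((try + 1))
    sleep 1
  done
  return 1
}
```

Usage would look like: wait_for_postgres ivonet-postgres && echo "postgres is up"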

Now if it all went well you can look at the data folder again:

docker run --rm -it -v $(pwd):/backup --volumes-from ivonet-postgres-data busybox ls -lsa /var/lib/postgresql/data

It should be filled with something like:

total 124
4 drwx------ 18 999 root 4096 Dec 11 20:14 .
4 drwxr-xr-x 3 root root 4096 Dec 11 20:16 ..
4 -rw------- 1 999 999 4 Dec 11 20:14 PG_VERSION
4 drwx------ 6 999 999 4096 Dec 11 20:14 base
4 drwx------ 2 999 999 4096 Dec 11 20:15 global
4 drwx------ 2 999 999 4096 Dec 11 20:14 pg_clog
4 drwx------ 2 999 999 4096 Dec 11 20:14 pg_dynshmem
8 -rw------- 1 999 999 4496 Dec 11 20:14 pg_hba.conf
4 -rw------- 1 999 999 1636 Dec 11 20:14 pg_ident.conf
4 drwx------ 4 999 999 4096 Dec 11 20:14 pg_logical
4 drwx------ 4 999 999 4096 Dec 11 20:14 pg_multixact
4 drwx------ 2 999 999 4096 Dec 11 20:14 pg_notify
4 drwx------ 2 999 999 4096 Dec 11 20:14 pg_replslot
4 drwx------ 2 999 999 4096 Dec 11 20:14 pg_serial
4 drwx------ 2 999 999 4096 Dec 11 20:14 pg_snapshots
4 drwx------ 2 999 999 4096 Dec 11 20:14 pg_stat
4 drwx------ 2 999 999 4096 Dec 11 20:16 pg_stat_tmp
4 drwx------ 2 999 999 4096 Dec 11 20:14 pg_subtrans
4 drwx------ 2 999 999 4096 Dec 11 20:14 pg_tblspc
4 drwx------ 2 999 999 4096 Dec 11 20:14 pg_twophase
4 drwx------ 3 999 999 4096 Dec 11 20:14 pg_xlog
4 -rw------- 1 999 999 88 Dec 11 20:14 postgresql.auto.conf
24 -rw------- 1 999 999 21288 Dec 11 20:14 postgresql.conf
4 -rw------- 1 999 999 37 Dec 11 20:14 postmaster.opts
4 -rw------- 1 999 999 85 Dec 11 20:14 postmaster.pid

Recap

  • Now you can create a data volume
  • look at it
  • manipulate it
  • back it up
  • restore it
  • use it
  • have fun with it!

Of course the backup you made is a copy of the raw data files and not an actual export (dump) of the database.
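
For an actual export you could run pg_dump inside the running postgres container. The sketch below assumes the container, user and database names used earlier in this post, and that local connections inside the container do not ask for a password, which I believe is the default of the standard postgres image. The function name dump_database and the DOCKER override are hypothetical.

```shell
# Hypothetical helper: write a plain SQL dump of one database to stdout
# by running pg_dump inside the running postgres container.
# DOCKER can be overridden (e.g. DOCKER=echo) for a dry run.
dump_database() {
  pg_container=$1   # e.g. ivonet-postgres
  pg_user=$2        # e.g. ivonet
  pg_db=$3          # e.g. ivonet
  ${DOCKER:-docker} exec "$pg_container" pg_dump -U "$pg_user" "$pg_db"
}
```

Usage would look like: dump_database ivonet-postgres ivonet ivonet > ivonet-dump.sql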

Reference documentation