In this post I look at using Docker to restore a Postgres dump file to a Postgres database running in the cloud on AWS RDS.
Keep it clean
One of the big selling points of docker, for me, is that I can have lots of apps and utils running in nice containers on my dev laptop, without having to install them locally. This ensures my laptop stays nice and responsive and I don’t clutter/break my laptop with lots of weird dependencies and running processes that I’m then too scared to delete.
Postgres is a good example – I don’t want to install it locally, but I do need access to the command line tools like psql and pg_restore, to be able to work with my databases effectively.
One way of accessing these tools would be to ssh onto the AWS cloud instances, but there’s a bunch of reasons most pertinently security (not to mention the faff) why you’d want to avoid that every time you want to run some sql. So let’s look at how we use Docker to ease the pain instead.
Start Me Up
With Docker installed you can build this simple Dockerfile to create a local Postgres container. The User and Password env vars aren’t strictly required, however, if you want to actually connect to the containerised DB, it’s pretty handy
|ENV POSTGRES_USER postgres|
|ENV POSTGRES_PASSWORD password|
You can build, run and connect to the container as follows (assumes you are on Mac)
|vi Dockerfile # Copy Docker commands listed above into your local Dockerfile|
|docker build -t postgres-db .|
|docker run -d -v /Users/iainhunter/dev/docker/postgres/data-load:/data-loader -p 5432:5432 postgres-db|
|docker exec -it <imageId> bash|
Note line 4 where I map the data-load dir I created at line 1 to a new directory called data-loader inside my container. This means that when I copy the Postgres dump file into my local data-load directory, it will be available to the postgres tools available in the container.
Line 6 allows me to connect to the container, swap the imageId for your locally running containerID.
Restoring your database with pg_restore
I’ll assume you already have a Postgres database set up within the AWS cloud. So now we have connected to our container, we can use pg_restore to use restore our dumpfile into AWS (note this command will prompt you for the admin password)
|pg_restore –host=<youHost>.eu-west-1.rds.amazonaws.com –port=5432 –username=<yourAdminUser> –password –dbname=<yourDB> /data-loader/dumpfile.dmp|
A note on schemas
If you’re doing a partial restore, you may want to restore your dumpfile to a separate schema. Unfortunately there appears to be no way to do this from the command line. What you have to do is to rename the public schema, create a new public schema and restore into that, then reverse the process.
This StackOverflow answer outlines the process.
You should now have a complete restore of your dumpfile in the cloud. Please add comments if anything is unclear.