Replication with PostgreSQL on Ubuntu

Feb 13, 2018 • #database #postgresql #replication

In this guide, you will learn how to setup replication between two PostgreSQL servers on Ubuntu. Along with the how and why of what you will learn.

Why Replication?
Ubuntu and PostgreSQL Version Compatibility
Install Prerequisites
Install PostgreSQL
Create Replication User
Replica Connection Permission
Configuration for Creating Archive Files for Replication
Make Folder for Saving Archive To
Restart Primary Server
Replication Server
Configure Replica postgresql.config
Sync Primary Data to Replica
Configure recovery.conf for Continuing Replication
Restart Replica Server
Test Replciation is Working
Final Thoughts

Why Replication?

Database replication is fairly important when you want to scale your database horizontally. Services like Amazon RDS make it super easy. However, that can be expensive, or as I found in my case, I couldn't use RDS. We ended up needing to use a plugin that wasn't available for RDS so we have to host our own.

The process of setting up replication contains multiple steps, and if you get them wrong it wont work, and wont easily tell you why. Through this guide I will show you what you need to do to setup replication with ubuntu. Also, hopefully, enough information to figure it out for other distributions as well. I also hope to explain what is going on so you can understand the why of what you are doing.

Ubuntu and PostgreSQL Version Compatibility

Below are the versions of postgres and ubuntu I have tested this guide on to make sure that it works. As new versions of both come out I hope to test them to keep this guide up-to-date.

	Ubuntu 14.04	Ubuntu 16.04	Ubuntu 18.04
PostgreSQL - 10.0
PostgreSQL - 9.6
PostgreSQL - 9.5

Install Prerequisites

With ubuntu we need to add the apt repository so that we can install postgres. One of the frustrating sides to notice in this is the pgdg if you don't get that it wont work. However, you should just be able to copy and paste the following 4 lines and it should work no matter your version of ubuntu.

sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/ $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'

sudo apt-get install wget ca-certificates
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -

sudo apt-get update

Install PostgreSQL

Next is to actually install PostgreSQL. Through the rest of this guide I'll add the line or file you need to use for each version of postgres that you want to use. Only use that line or frustration will happen. Now all we need to do is a simple apt-get install and we are good to go.

sudo apt-get install postgresql-10
sudo apt-get install postgresql-9.6
sudo apt-get install postgresql-9.5

Create Replication User

Now that Postgres is installed we need to create our user for doing replication. Postgres requires a user with replication privileges in order to pull data over. By convention people use a user called replication with the replication permission.

First thing we need to do is log in to postgres, and the easiest way to do that is to switch users to the postgres user. Once you are the postgres user use psql to log in and execute the CREATE USER command.

sudo su - postgres
psql
CREATE USER replication REPLICATION LOGIN ENCRYPTED PASSWORD 'password1';

CREATE USER will create the replication user, set the replication permission, set the user to be able to login, and encrypt the password in the database. It is recommended to use a different password than password1. Also please do encrypt the password because if you don't it can lead to headaches later on with other tool configurations.

Replica Connection Permission

Now that we have our user we need to give them permission to actually connect to the primary postgres server. We do that in the pg_hba.conf file. In the file we declare what user, database, and where they are connecting from. In this case our replica server is at ip 192.168.1.102 so we are explicitly telling our server to allow connections from the user replication from that ip only. The more servers or users you have the more you add to this list.

/etc/postgresql/10/main/pg_hba.conf
/etc/postgresql/9.6/main/pg_hba.conf
/etc/postgresql/9.5/main/pg_hba.conf

host replication replication 192.168.1.102/32 md5

Configuration for Creating Archive Files for Replication

One of the most critical parts of configuring replicaiton is the saving of data to archive files so they can be pulled over to replicas. The following configuration does just this.

So that you know WAL stand for write-ahead logging. It is very useful for a few things, but we are only going to focus on replication to other servers in this guide.

The first step is to set the wal_level. I liken it to logging_level it is how you configure the intensity, might be a bad analogy, but it is how I keep it in my head. We want to set it to replica for postgre 9.6+, for 9.5 hot_standby. We follow the wal_level with a few other settings that are good defaults.

/etc/postgresql/10/main/postgresql.conf
/etc/postgresql/9.6/main/postgresql.conf

listen_addresses = '*'
wal_level = replica  # In releases prior to 9.6, this parameter also allowed the values archive and hot_standby. These are mapped to replica.
max_wal_senders = 2
wal_keep_segments = 32
archive_mode    = on
archive_command = 'cp -i %p /var/lib/postgresql/10/main/archive/%f'  # Use with PostgreSQL 10
archive_command = 'cp -i %p /var/lib/postgresql/9.6/main/archive/%f' # Use with PostgreSQL 9.6

/etc/postgresql/9.5/main/postgresql.conf

listen_addresses = '*'
wal_level = hot_standby
max_wal_senders = 2
wal_keep_segments = 32
archive_mode    = on
archive_command = 'cp -i %p /var/lib/postgresql/9.5/main/archive/%f'

To me the key to understanding and getting replication working is the archive_command this is where you set the location of the archive files for our replicas to restore from. In the case of our archive folder we have it in the base of the postgres main install folder. This is an ideal location because later when we pull the data it will pull not only the archive but a similar set of configurations as well. That way we get a good configuration and setup for our replica.

Make Folder for Saving Archive To

So with the last paragraph in mind we need to make the archive folder, and it needs to have the owner be the postgres user. You can either use chown or temporarily change to the postgres user to create the folder.

sudo su - postgres
mkdir /var/lib/postgresql/10/main/archive
mkdir /var/lib/postgresql/9.6/main/archive
mkdir /var/lib/postgresql/9.5/main/archive

Restart Primary Server

At this point our primary server is done being configured, and we just need to restart postgres.

sudo systemctl restart postgresql  # or
sudo service postgresql restart

Replication Server

The replication server(s) is as simple as a standard PostgreSQL server with a couple of configuration tweaks. First install postgres like we have done above, and then follow along below. It is fairly similar in that you install, tweak configuration, run a sync command, create a file and restart the server.

Configure Replica postgresql.config

Much like with the primary server just edit the postgresql.conf file and add the following lines to the bottom. This sets up that we are going to pull data from the primary server.

/etc/postgresql/10/main/postgresql.conf
/etc/postgresql/9.6/main/postgresql.conf

wal_keep_segments = 32
wal_level = replica
hot_standby = on
wal_log_hints = on

/etc/postgresql/9.5/main/postgresql.conf

wal_keep_segments = 32
wal_level = hot_standby
hot_standby = on
wal_log_hints = on

Sync Primary Data to Replica

Now that our configuration is set we need to pull over all of our data, and some base files. We are running a main setup for postgres so we need to remove that folder and all of its contents. Then we need to pull the data over from the primary server. The first is a simple rm -rf command. The later uses pg_basebackup

pg_basebackup is called with the host you want to pull data from. Then the directory we want to pull over. Since we deleted the main folder we need to be sure to pull that one from the primary server. This will pull in the base files, plus all of the files in the archive folder which we will use for restoring to our replica. You also need to do this as the postgres user so that it sets proper permissions.

sudo su - postgres

rm -rf /var/lib/postgresql/10/main/
rm -rf /var/lib/postgresql/9.6/main/
rm -rf /var/lib/postgresql/9.5/main/

pg_basebackup -h 192.168.1.103 -D /var/lib/postgresql/10/main -P -U replication
pg_basebackup -h 192.168.1.103 -D /var/lib/postgresql/9.6/main -P -U replication
pg_basebackup -h 192.168.1.103 -D /var/lib/postgresql/9.5/main -P -U replication

Configure recovery.conf for Continuing Replication

At this point we have installed postgres, configured it, and pulled over all of the data. Now we need to add the file that will connect and constantly pull data over. It will also do the restore command of the data. The file needs to be named recovery.conf, note the y, and it needs to go in the main folder we pulled over. This file needs to be added after a sync, and does not belong on the primary server.

The contents set the connection info to the primary server, along with password. It also sets the command for restoring files. This works since we are pulling WAL logs. It is just moving them to where postgres will restore them automatically. This file needs to be owned by the postgres user.

/var/lib/postgresql/10/main/recovery.conf
/var/lib/postgresql/9.6/main/recovery.conf
/var/lib/postgresql/9.5/main/recovery.conf

standby_mode = 'on'
primary_conninfo = 'host=192.168.1.103 port=5432 user=replication password=password1'
trigger_file = '/var/tmp/standby.trig'
restore_command = 'cp /var/lib/postgresql/10/main/archive/%f %p'
restore_command = 'cp /var/lib/postgresql/9.6/main/archive/%f %p'
restore_command = 'cp /var/lib/postgresql/9.5/main/archive/%f %p'

Note the trigger_file if you create that file it will stop the replication process on the replica server.

Restart Replica Server

Now our replica is configured so we just need to restart. After a restart it should start the process of restoreing replicated data, and pullig over new data as well.

sudo systemctl restart postgresql

Test Replciation is Working

Your database server should be replicating, but lets test it to be sure. It is as simple as creating a new database on the primary server, then logging in on the replica and validating it is there.

On Primary

sudo su - postgres
psql
create database test1;

On Replica

sudo su - postgres
psql
\l

Final Thoughts

This is just the tip of the iceberg with regards to what you can do with replication of postgresql, hopefully though it gives you a jumping off point. This is great for creating a primary server to write to and a replica you can read from so you can distribute some of the load of using a database. Another great thing about the read-only replica is it is ideal for setting up your database backups to pull from there, again reducing load on the primary.

At this point you should be good to go. The next thing I suggest is taking a look at pg-pool for doing connection pooling between both of the servers. It is a great tool, and I hope to come out with a guide on it soon as well.

Contents