In this guide, you will learn how to setup replication between two PostgreSQL servers on Ubuntu. Along with the how and why of what you will learn.
- Why Replication?
- Ubuntu and PostgreSQL Version Compatibility
- Install Prerequisites
- Install PostgreSQL
- Create Replication User
- Replica Connection Permission
- Configuration for Creating Archive Files for Replication
- Make Folder for Saving Archive To
- Restart Primary Server
- Replication Server
- Configure Replica postgresql.config
- Sync Primary Data to Replica
- Configure recovery.conf for Continuing Replication
- Restart Replica Server
- Test Replciation is Working
- Final Thoughts
Database replication is fairly important when you want to scale your database horizontally. Services like Amazon RDS make it super easy. However, that can be expensive, or as I found in my case, I couldn't use RDS. We ended up needing to use a plugin that wasn't available for RDS so we have to host our own.
The process of setting up replication contains multiple steps, and if you get them wrong it wont work, and wont easily tell you why. Through this guide I will show you what you need to do to setup replication with ubuntu. Also, hopefully, enough information to figure it out for other distributions as well. I also hope to explain what is going on so you can understand the why of what you are doing.
Below are the versions of postgres and ubuntu I have tested this guide on to make sure that it works. As new versions of both come out I hope to test them to keep this guide up-to-date.
|Ubuntu 14.04||Ubuntu 16.04||Ubuntu 18.04|
|PostgreSQL - 10.0|
|PostgreSQL - 9.6|
|PostgreSQL - 9.5|
With ubuntu we need to add the apt repository so that we can install postgres. One of the frustrating sides to notice in this is the
pgdg if you don't get that it wont work. However, you should just be able to copy and paste the following 4 lines and it should work no matter your version of ubuntu.
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/ $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list' sudo apt-get install wget ca-certificates wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add - sudo apt-get update
Next is to actually install PostgreSQL. Through the rest of this guide I'll add the line or file you need to use for each version of postgres that you want to use. Only use that line or frustration will happen. Now all we need to do is a simple
apt-get install and we are good to go.
sudo apt-get install postgresql-10 sudo apt-get install postgresql-9.6 sudo apt-get install postgresql-9.5
Now that Postgres is installed we need to create our user for doing replication. Postgres requires a user with replication privileges in order to pull data over. By convention people use a user called
replication with the
First thing we need to do is log in to postgres, and the easiest way to do that is to switch users to the
postgres user. Once you are the
postgres user use
psql to log in and execute the
CREATE USER command.
sudo su - postgres psql CREATE USER replication REPLICATION LOGIN ENCRYPTED PASSWORD 'password1';
CREATE USER will create the replication user, set the replication permission, set the user to be able to login, and encrypt the password in the database. It is recommended to use a different password than
password1. Also please do encrypt the password because if you don't it can lead to headaches later on with other tool configurations.
Now that we have our user we need to give them permission to actually connect to the primary postgres server. We do that in the
pg_hba.conf file. In the file we declare what user, database, and where they are connecting from. In this case our replica server is at ip
192.168.1.102 so we are explicitly telling our server to allow connections from the user
replication from that
ip only. The more servers or users you have the more you add to this list.
host replication replication 192.168.1.102/32 md5
One of the most critical parts of configuring replicaiton is the saving of data to archive files so they can be pulled over to replicas. The following configuration does just this.
So that you know
WAL stand for
write-ahead logging. It is very useful for a few things, but we are only going to focus on replication to other servers in this guide.
The first step is to set the
wal_level. I liken it to
logging_level it is how you configure the intensity, might be a bad analogy, but it is how I keep it in my head. We want to set it to
replica for postgre 9.6+, for 9.5
hot_standby. We follow the
wal_level with a few other settings that are good defaults.
listen_addresses = '*' wal_level = replica # In releases prior to 9.6, this parameter also allowed the values archive and hot_standby. These are mapped to replica. max_wal_senders = 2 wal_keep_segments = 32 archive_mode = on archive_command = 'cp -i %p /var/lib/postgresql/10/main/archive/%f' # Use with PostgreSQL 10 archive_command = 'cp -i %p /var/lib/postgresql/9.6/main/archive/%f' # Use with PostgreSQL 9.6
listen_addresses = '*' wal_level = hot_standby max_wal_senders = 2 wal_keep_segments = 32 archive_mode = on archive_command = 'cp -i %p /var/lib/postgresql/9.5/main/archive/%f'
To me the key to understanding and getting replication working is the
archive_command this is where you set the location of the archive files for our replicas to restore from. In the case of our archive folder we have it in the base of the postgres main install folder. This is an ideal location because later when we pull the data it will pull not only the archive but a similar set of configurations as well. That way we get a good configuration and setup for our replica.
So with the last paragraph in mind we need to make the
archive folder, and it needs to have the owner be the
postgres user. You can either use
chown or temporarily change to the
postgres user to create the folder.
sudo su - postgres mkdir /var/lib/postgresql/10/main/archive mkdir /var/lib/postgresql/9.6/main/archive mkdir /var/lib/postgresql/9.5/main/archive
At this point our primary server is done being configured, and we just need to restart postgres.
sudo systemctl restart postgresql # or sudo service postgresql restart
The replication server(s) is as simple as a standard PostgreSQL server with a couple of configuration tweaks. First install postgres like we have done above, and then follow along below. It is fairly similar in that you install, tweak configuration, run a sync command, create a file and restart the server.
Much like with the primary server just edit the
postgresql.conf file and add the following lines to the bottom. This sets up that we are going to pull data from the primary server.
wal_keep_segments = 32 wal_level = replica hot_standby = on wal_log_hints = on
wal_keep_segments = 32 wal_level = hot_standby hot_standby = on wal_log_hints = on
Now that our configuration is set we need to pull over all of our data, and some base files. We are running a
main setup for postgres so we need to remove that folder and all of its contents. Then we need to pull the data over from the primary server. The first is a simple
rm -rf command. The later uses
pg_basebackup is called with the host you want to pull data from. Then the directory we want to pull over. Since we deleted the
main folder we need to be sure to pull that one from the primary server. This will pull in the base files, plus all of the files in the
archive folder which we will use for restoring to our replica. You also need to do this as the
postgres user so that it sets proper permissions.
sudo su - postgres rm -rf /var/lib/postgresql/10/main/ rm -rf /var/lib/postgresql/9.6/main/ rm -rf /var/lib/postgresql/9.5/main/ pg_basebackup -h 192.168.1.103 -D /var/lib/postgresql/10/main -P -U replication pg_basebackup -h 192.168.1.103 -D /var/lib/postgresql/9.6/main -P -U replication pg_basebackup -h 192.168.1.103 -D /var/lib/postgresql/9.5/main -P -U replication
At this point we have installed postgres, configured it, and pulled over all of the data. Now we need to add the file that will connect and constantly pull data over. It will also do the restore command of the data. The file needs to be named
recovery.conf, note the
y, and it needs to go in the
main folder we pulled over. This file needs to be added after a sync, and does not belong on the primary server.
The contents set the connection info to the primary server, along with password. It also sets the command for restoring files. This works since we are pulling WAL logs. It is just moving them to where postgres will restore them automatically. This file needs to be owned by the
standby_mode = 'on' primary_conninfo = 'host=192.168.1.103 port=5432 user=replication password=password1' trigger_file = '/var/tmp/standby.trig' restore_command = 'cp /var/lib/postgresql/10/main/archive/%f %p' restore_command = 'cp /var/lib/postgresql/9.6/main/archive/%f %p' restore_command = 'cp /var/lib/postgresql/9.5/main/archive/%f %p'
trigger_file if you create that file it will stop the replication process on the replica server.
Now our replica is configured so we just need to restart. After a restart it should start the process of restoreing replicated data, and pullig over new data as well.
sudo systemctl restart postgresql
Your database server should be replicating, but lets test it to be sure. It is as simple as creating a new database on the primary server, then logging in on the replica and validating it is there.
sudo su - postgres psql create database test1;
sudo su - postgres psql \l
This is just the tip of the iceberg with regards to what you can do with replication of postgresql, hopefully though it gives you a jumping off point. This is great for creating a primary server to write to and a replica you can read from so you can distribute some of the load of using a database. Another great thing about the read-only replica is it is ideal for setting up your database backups to pull from there, again reducing load on the primary.
At this point you should be good to go. The next thing I suggest is taking a look at
pg-pool for doing connection pooling between both of the servers. It is a great tool, and I hope to come out with a guide on it soon as well.