To simplify the installation process, I will perform all actions as root. You should use a non-privileged user with sudo when installing. We'll skip this question, as well as many other security aspects.
1.1 Installing etcd and patroni
Let's start by installing the PostgreSQL cluster with Patroni.
Prepare the nodes by installing the necessary packages:
# apt-get update
# apt-get -y install ntp python3.8 python3-pip python3-apt unzip
Configure your timezone:
# dpkg-reconfigure tzdata
Then install etcd from Ubuntu’s repos:
# apt-get -y install etcd
Then install patroni itself:
# pip3 install patroni python-etcd psycopg2-binary
1.2 Installing PostgreSQL from the official repository
Installing PostgreSQL from the official repo allows us to install the latest stable version.
Just follow the instructions from the official website:
# sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
# wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add -
# apt-get update
# apt-get -y install postgresql
Remove the PostgreSQL data created by the package installation, because the Patroni cluster will create its own configs and databases:
# systemctl stop postgresql
# rm -rf /var/lib/postgresql/13/main/
Disable postgresql service, since the cluster will be started by patroni:
# systemctl disable postresql
1.3 Configuring patroni, etcd, and PostgreSQL
Here is an etcd config template:
# cat /etc/default/etcd.yaml
name: 'node1'
data-dir: /var/lib/etcd/default
listen-peer-urls: http://10.5.0.4:2380
listen-client-urls: http://10.5.0.4:2379,http://127.0.0.1:2379
initial-advertise-peer-urls: http://10.5.0.4:2380
initial-cluster: node1=http://10.5.0.4:2380,node2=http://10.5.0.2:2380,node3=http://10.5.0.3:2380
initial-cluster-state: 'new'
advertise-client-urls: http://10.5.0.4:2379
log-outputs: [stderr]
log-level: debug
initial-cluster-token: 'etcd-external-cluster'
Just replace names and IP addresses with yours.
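Since only the node name and IP differ between nodes, you can template the config instead of editing it by hand on each node. A minimal sketch (the variable values are the example ones from this article; the output path is /tmp here for illustration, on a real node it would be /etc/default/etcd.yaml):

```shell
# Per-node values -- substitute your own node name and IP.
NODE_NAME=node1
NODE_IP=10.5.0.4
# The full member list is identical on every node.
CLUSTER='node1=http://10.5.0.4:2380,node2=http://10.5.0.2:2380,node3=http://10.5.0.3:2380'

# Render the etcd config from the variables above.
cat > /tmp/etcd.yaml <<EOF
name: '${NODE_NAME}'
data-dir: /var/lib/etcd/default
listen-peer-urls: http://${NODE_IP}:2380
listen-client-urls: http://${NODE_IP}:2379,http://127.0.0.1:2379
initial-advertise-peer-urls: http://${NODE_IP}:2380
initial-cluster: ${CLUSTER}
initial-cluster-state: 'new'
advertise-client-urls: http://${NODE_IP}:2379
initial-cluster-token: 'etcd-external-cluster'
EOF
```

Run it once per node with the matching NODE_NAME and NODE_IP, or drive it from an Ansible template as suggested later.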
Then take the following Patroni config and adapt it to your environment (bear in mind it's YAML, so indentation matters):
# cat /etc/patroni.yml
scope: postgres
name: node1

restapi:
  listen: 10.5.0.4:8000
  connect_address: 10.5.0.4:8000
  certfile: /etc/ssl/certs/ssl-cert-snakeoil.pem
  keyfile: /etc/ssl/private/ssl-cert-snakeoil.key

etcd:
  protocol: http
  hosts: 10.5.0.3:2379,10.5.0.4:2379,10.5.0.2:2379

bootstrap:
  dcs:
    ttl: 100
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        wal_level: hot_standby
        hot_standby: true
        wal_keep_segments: 8
        max_wal_senders: 10
        max_replication_slots: 5
        checkpoint_timeout: 30
  initdb:
    - encoding: UTF8
    - data-checksums
  users:
    admin:
      password: ifHefshio
      options:
        - createrole
        - createdb
    replicator:
      password: ifHefshio
      options:
        - replication

postgresql:
  listen: 0.0.0.0:5432
  connect_address: 10.5.0.4:5432
  data_dir: /var/lib/postgresql/13/main/
  config_dir: /etc/postgresql/13/main/
  bin_dir: /usr/lib/postgresql/13/bin
  pgpass: /tmp/pgpass
  authentication:
    replication:
      username: replicator
      password: ifHefshio
    superuser:
      username: admin
      password: ifHefshio
  parameters:
    unix_socket_directories: '/var/run/postgresql/'
    stats_temp_directory: '/var/run/postgresql/13-main.pg_stat_tmp'

tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false

log:
  dir: /var/log/postgresql
  level: INFO
Further, we need to allow connecting to our PostgreSQL instances from the local network:
# vi /etc/postgresql/13/main/pg_hba.conf
I added the following two lines:
host replication replicator 10.5.0.0/24 md5
host all all 10.5.0.0/24 md5
The lines above permit connections to any of the PostgreSQL nodes from the local network. If you are not going to connect to the nodes directly, you can allow only the load balancer's IP address.
Now the time has come to add systemd units for etcd and patroni:
# cat /lib/systemd/system/etcd.service
[Unit]
Description=etcd - highly-available key value store
Documentation=https://github.com/coreos/etcd
Documentation=man:etcd
After=network.target
Wants=network-online.target

[Service]
Type=notify
User=etcd
PermissionsStartOnly=true
ExecStart=/usr/bin/etcd --config-file /etc/default/etcd.yaml
Restart=on-abnormal
#RestartSec=10s
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
Alias=etcd2.service
# cat /lib/systemd/system/patroni.service
[Unit]
Description=Runners to orchestrate a high-availability PostgreSQL
After=syslog.target network.target etcd.service
Requires=etcd.service

[Service]
Type=simple
User=postgres
Group=postgres
ExecStart=/usr/local/bin/patroni /etc/patroni.yml
KillMode=process
TimeoutSec=30
Restart=no

[Install]
WantedBy=multi-user.target
You should enable both services so they come up at boot:
# systemctl enable etcd
# systemctl enable patroni
Note. Since you have to install at least three nodes, a convenient way to install and configure the Patroni cluster is to create an Ansible playbook (or use something similar).
Note. Pay attention: Patroni depends on etcd, so if etcd fails to start, Patroni won't be launched either. Occasionally this is useful, e.g. to keep Patroni from starting on a specific node while you perform manual actions such as database recovery. Take it into account!
1.4 Starting the PostgreSQL cluster with Patroni
First of all, the etcd service must be running before Patroni starts, since Patroni is what starts PostgreSQL.
# systemctl start etcd
The action above should be done on all three nodes.
Let’s check if the etcd cluster is up:
root@node1:~# etcdctl cluster-health
member e65bd2725f0955f is healthy: got healthy result from http://10.5.0.4:2379
member 2b6d6c9d377f653a is healthy: got healthy result from http://10.5.0.2:2379
member c496e9114bd232df is healthy: got healthy result from http://10.5.0.3:2379
cluster is healthy
If you see something like that, the etcd cluster has been built and is running.
Leader election and other cluster algorithms are implemented in etcd, and Patroni relies on them. To see which etcd node is the leader, type in a console:
root@node1:~# etcdctl member list
e65bd2725f0955f: name=node1 peerURLs=http://10.5.0.4:2380 clientURLs=http://10.5.0.4:2379 isLeader=false
2b6d6c9d377f653a: name=node2 peerURLs=http://10.5.0.2:2380 clientURLs=http://10.5.0.2:2379 isLeader=true
c496e9114bd232df: name=node3 peerURLs=http://10.5.0.3:2380 clientURLs=http://10.5.0.3:2379 isLeader=false
The next step is to start Patroni together with PostgreSQL. When Patroni is launched, it automatically starts PostgreSQL, which initializes the database. Patroni then creates the users specified in the config, renames postgresql.conf to postgresql.base.conf, and finally writes its own postgresql.conf with cluster-specific settings, which includes postgresql.base.conf. Therefore, if you need to change some of the PostgreSQL settings, say the timezone, you should modify postgresql.base.conf (or manage the parameters through Patroni itself).
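For reference, the postgresql.conf that Patroni generates looks roughly like the fragment below (reconstructed for illustration; the exact contents and parameter list vary by Patroni version and your config):

```
# Do not edit this file manually!
# It will be overwritten by Patroni!
include 'postgresql.base.conf'

hot_standby = 'on'
max_replication_slots = '5'
max_wal_senders = '10'
wal_level = 'hot_standby'
```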
# systemctl start patroni
Do it on all nodes!
Let’s check if PostgreSQL is up:
# systemctl status patroni
You'll see the patroni process, something like:
471 /usr/bin/python3 /usr/local/bin/patroni /etc/patroni.yml
Moreover, you should see PostgreSQL processes.
Okay, that covers the process, but how do we check that the database cluster is working properly?
There is a command-line interface to patroni:
root@node1:~# patronictl -c /etc/patroni.yml list
+ Cluster: postgres (6987392780241750765) ---+---------+----+-----------+
| Member | Host     | Role    | State   | TL | Lag in MB |
+--------+----------+---------+---------+----+-----------+
| node1  | 10.5.0.4 | Leader  | running | 17 |         0 |
| node2  | 10.5.0.2 | Replica | running | 17 |         0 |
| node3  | 10.5.0.3 | Replica | running | 17 |           |
+--------+----------+---------+---------+----+-----------+
This utility is also used for cluster management (switchover, failover, etc.), but I won't cover that here.
1.5 Load balancer
One of the requirements is to use an HA load balancer. Cloud providers usually supply load balancers and guarantee their high availability.
When creating a load balancer, pay attention to the following:
Set the health check path to /master and fill in the expected response code field with 200.
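This health check works because Patroni's REST API (listening on port 8000 in our config) answers GET /master with HTTP 200 only on the current leader, while replicas answer 503. A minimal sketch of the same probe, runnable by hand against a live cluster:

```shell
# Returns success (exit 0) only if the given node is the current primary.
# The port 8000 and IPs are the ones from the Patroni config in this article.
is_primary() {
  # $1 is a node IP; curl prints only the HTTP status code
  status=$(curl -s -o /dev/null -w '%{http_code}' "http://$1:8000/master")
  [ "$status" = "200" ]
}

# Example usage against a live cluster:
#   is_primary 10.5.0.4 && echo "10.5.0.4 is the leader"
```

The load balancer performs essentially this probe against all three targets and routes traffic only to the node that answers 200.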
Attach your LB to the local network; the chosen targets should be your three nodes.
I'll provide a few screenshots from Hetzner; other providers (AWS, GCP, etc.) have the same settings, and their interfaces look similar.
Test your patroni cluster:
# psql -U admin -h 10.5.0.10 -W -d postgres
Where 10.5.0.10 is the load balancer’s IP address.
You can find the admin password in the Patroni config.
Create a user and database for airflow:
psql> CREATE DATABASE airflow;
psql> CREATE USER airflow WITH PASSWORD 'airflow';
psql> GRANT ALL PRIVILEGES ON DATABASE airflow TO airflow;
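Looking ahead: Airflow will reach this database through the load balancer address 10.5.0.10 from above. The relevant airflow.cfg line would look roughly like this (illustration only; the section name depends on the Airflow version, and Airflow configuration is covered later in this series):

```
# airflow.cfg ([core] in older Airflow releases, [database] in newer ones)
sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@10.5.0.10:5432/airflow
```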
Celery can be installed either from OS packages or from the pip repository. The preferred approach is to install it using pip:
# pip3 install celery
You should install Celery on all nodes. The installed package does not need any configuration.
To install RabbitMQ, type the following in a console (on all three nodes):
# apt-get -y install rabbitmq-server
Then enable systemd unit:
# systemctl enable rabbitmq-server.service
and start it:
# systemctl start rabbitmq-server.service
Then we need to configure the RabbitMQ cluster.
To configure the broker we’ll use CLI.
The following actions should be done on one node, say on node1:
# rabbitmqctl add_user airflow cafyilevyeHa
# rabbitmqctl set_user_tags airflow administrator
# rabbitmqctl add_vhost /
# rabbitmqctl set_permissions -p / airflow ".*" ".*" ".*"
# rabbitmqctl delete_user guest
where airflow is the user and cafyilevyeHa is its password.
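These credentials will later make up the Celery broker URL on the Airflow side. As an illustration only (the actual Airflow/Celery configuration comes in the next part; the trailing // denotes the default / vhost created above):

```
# airflow.cfg, [celery] section (hypothetical snippet)
broker_url = amqp://airflow:cafyilevyeHa@node1:5672//
```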
Now, let’s create the RabbitMQ cluster.
It's important to add all the nodes to /etc/hosts on every node:
10.5.0.4 node1
10.5.0.2 node2
10.5.0.3 node3
First, we need to enable passwordless SSH access between the cluster nodes: generate SSH keys and put them into the authorized_keys files on all three nodes. You can reuse a single generated key pair.
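The key setup above can be sketched as follows (a minimal illustration; node2 and node3 are the hostnames from /etc/hosts above):

```shell
# Generate one ed25519 key pair with no passphrase (run once, on node1).
mkdir -p "$HOME/.ssh"
ssh-keygen -q -t ed25519 -N '' -f "$HOME/.ssh/id_ed25519"

# Then append the public key to root's authorized_keys on every node,
# for example:
#   ssh-copy-id root@node2
#   ssh-copy-id root@node3
```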
Copy the Erlang cookie from one node to the others (in the example below we copy from node1):
# scp /var/lib/rabbitmq/.erlang.cookie root@node2:/var/lib/rabbitmq/
# scp /var/lib/rabbitmq/.erlang.cookie root@node3:/var/lib/rabbitmq/
The cookie is used for authentication between RabbitMQ nodes.
Check that the nodes are working independently: on each node in turn, run the command below to check the status; you'll see that the cluster has not been created yet:
# rabbitmqctl cluster_status
After checking prerequisites, it’s time to add nodes to the cluster.
Perform the following actions on node2 and node3:
# rabbitmqctl stop_app
# rabbitmqctl reset
# rabbitmqctl join_cluster rabbit@node1
# rabbitmqctl start_app
When the cluster has been created, you can check its status:
# rabbitmqctl cluster_status
As you’ve created the cluster, you’ll see something like this:
root@node1:~# rabbitmqctl cluster_status
Cluster status of node rabbit@node1 …

Basics

Cluster name: rabbit@node1

Disk Nodes

rabbit@node1
rabbit@node2
rabbit@node3

Running Nodes

rabbit@node1
rabbit@node2
rabbit@node3

Versions

rabbit@node1: RabbitMQ 3.8.2 on Erlang 22.2.7
rabbit@node2: RabbitMQ 3.8.2 on Erlang 22.2.7
rabbit@node3: RabbitMQ 3.8.2 on Erlang 22.2.7
You can also check the status in the web interface (it requires the management plugin: # rabbitmq-plugins enable rabbitmq_management). Create an SSH tunnel:
# ssh <ipaddress-node1> -L 15672:localhost:15672
In your browser's address bar, enter http://localhost:15672.
You’ll see the state of the cluster and nodes.
Note. There is a way to enable peer auto-discovery, but it's out of the scope of this article.
See you in the third part of the tutorial.