
Mastering PostgreSQL Replication: Setting Up a Master-Slave Configuration and Accessing it from Go

Tech Stuff Jan 18, 2023

Have you ever worked on a project with a single database and wished you could squeeze more performance out of it? That's where master-slave replication comes in. In this configuration, a primary database, or "master," handles all write operations, while one or more secondary databases, or "slaves," handle read operations.

In this post, we'll focus on setting up that configuration with PostgreSQL, the well-known open-source database. But what's the benefit of such a configuration in the first place, you might ask? In short, a master-slave PostgreSQL setup can improve the performance, availability, and scalability of your database. Read requests are distributed across multiple servers, which reduces the load on any single database and improves response times for users. It also provides a level of redundancy: even if the master goes down, the slaves can still serve read requests and keep the system running.

Additionally, this setup gives you an up-to-date copy of the data on a slave node in case the master fails, lets you spread traffic across nodes, and gives you a fail-over plan for emergencies. Master-slave replication is a powerful tool for scaling your database and ensuring high availability, and it can be combined with other techniques, such as load balancing, to improve performance and scalability even further.

For the sake of simplicity, we're going to use Docker for our small setup, along with the Bitnami PostgreSQL image, since it ships pre-configured for master-slave replication.

version: '3'
services:
  master:
    image: 'bitnami/postgresql:latest'
    environment:
      - POSTGRESQL_REPLICATION_MODE=master
      - POSTGRESQL_REPLICATION_USER=replicator
      - POSTGRESQL_REPLICATION_PASSWORD=password
      # The Bitnami image refuses to start without a superuser password
      - POSTGRESQL_PASSWORD=my_password
    ports:
      - "5432:5432"
    volumes:
      - master-data:/bitnami/postgresql
  slave:
    image: 'bitnami/postgresql:latest'
    deploy:
      # Run two read replicas of this service
      replicas: 2
    environment:
      - POSTGRESQL_REPLICATION_MODE=slave
      - POSTGRESQL_MASTER_HOST=master
      - POSTGRESQL_MASTER_PORT_NUMBER=5432
      - POSTGRESQL_REPLICATION_USER=replicator
      - POSTGRESQL_REPLICATION_PASSWORD=password
      - POSTGRESQL_PASSWORD=my_password
    depends_on:
      - master
    ports:
      # A host port range, so each replica can claim its own port
      - "5433-5434:5432"
    # No persistent volume here: replicas re-sync from the master at
    # startup, and two replicas must not share one data directory.
volumes:
  master-data:

In this setup we have one primary "master" service and a secondary "slave" service, which Docker Compose scales out to two replicas via deploy.replicas. Note that the replicas don't mount a persistent volume: each one re-syncs from the master when it starts, and a single named volume shared between two replicas would point them both at the same data directory. You can raise the replica count if you need more read capacity.
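Once the stack is up (docker compose up -d), a quick way to confirm replication works end to end is to write a row on the master and read it back from a replica. Here's a minimal sketch in Go, assuming the ports and the my_password superuser password from the compose file above; the throwaway ping table exists only for this check:

package main

import (
    "database/sql"
    "fmt"
    "time"

    _ "github.com/lib/pq"
)

func main() {
    // The master is published on 5432, the replicas on 5433-5434.
    master, err := sql.Open("postgres", "host=localhost port=5432 user=postgres password=my_password sslmode=disable")
    if err != nil {
        panic(err)
    }
    defer master.Close()

    replica, err := sql.Open("postgres", "host=localhost port=5433 user=postgres password=my_password sslmode=disable")
    if err != nil {
        panic(err)
    }
    defer replica.Close()

    // Writes go to the master only; the replicas are read-only.
    if _, err := master.Exec("CREATE TABLE IF NOT EXISTS ping (id serial PRIMARY KEY)"); err != nil {
        panic(err)
    }
    if _, err := master.Exec("INSERT INTO ping DEFAULT VALUES"); err != nil {
        panic(err)
    }

    // Give streaming replication a moment, then read from the replica.
    time.Sleep(time.Second)

    var count int
    if err := replica.QueryRow("SELECT count(*) FROM ping").Scan(&count); err != nil {
        panic(err)
    }
    fmt.Println("rows visible on the replica:", count)
}

If the count comes back non-zero, changes made on the master are reaching the replica.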

BUT ...

As mentioned before, this setup helps with scalability, but how well it works depends on how quickly the primary "master" streams changes to the secondary databases. Data written to the primary should show up in the replicas as fast as possible. The time it takes for a master PostgreSQL server to send an update to a slave can vary depending on several factors, such as the size of the update, the network latency between the master and slave servers, and the load on the master at the time of the update.

In general, if the network connection between the master and slave servers is fast and the load on the master server is low, updates should be sent relatively quickly. However, if the network connection is slow or the load on the master server is high, it may take longer for updates to be sent.

In our case everything runs on a single Docker network, so updates should propagate quickly. In practice, though, you should monitor the replication lag: the pg_stat_replication view on the master shows each connected replica and how far it has fallen behind. The performance of the containers, the host they run on, and the overall health of the network environment should also be taken into account.
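To make this concrete, here's a minimal sketch, again assuming the master is reachable on localhost:5432 with the my_password superuser password from our compose file, that queries pg_stat_replication and prints each replica's lag in bytes:

package main

import (
    "database/sql"
    "fmt"

    _ "github.com/lib/pq"
)

func main() {
    // Only the master tracks its replicas, so connect there.
    db, err := sql.Open("postgres", "host=localhost port=5432 user=postgres password=my_password sslmode=disable")
    if err != nil {
        panic(err)
    }
    defer db.Close()

    // pg_stat_replication has one row per connected standby.
    // Comparing the master's current WAL position with the standby's
    // replay position gives the lag in bytes.
    rows, err := db.Query(`
        SELECT coalesce(client_addr::text, ''), state,
               coalesce(pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn), 0)::bigint AS lag_bytes
        FROM pg_stat_replication`)
    if err != nil {
        panic(err)
    }
    defer rows.Close()

    for rows.Next() {
        var addr, state string
        var lagBytes int64
        if err := rows.Scan(&addr, &state, &lagBytes); err != nil {
            panic(err)
        }
        fmt.Printf("replica %s: state=%s, lag=%d bytes\n", addr, state, lagBytes)
    }
}

A lag_bytes value that keeps growing is the signal to investigate network or load problems before piling on more replicas.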

Unlocking the Potential of Dual Database Instances: One Step Closer to Utilizing Them

Now that we've created multiple instances of the database, we need a way to use them from a service's point of view. In this post we'll use Go, directing write operations to the primary database and read operations to the secondary "slave" database.

Here's an example in Go that uses both instances, the master for writes and a replica for reads:

package main

import (
    "database/sql"

    _ "github.com/lib/pq"
)

var (
    readDB  *sql.DB
    writeDB *sql.DB
)

func init() {
    var err error

    // Connect to a replica for reads (published on port 5433 in our compose file)
    readDB, err = sql.Open("postgres", "host=localhost port=5433 user=postgres password=my_password dbname=postgres sslmode=disable")
    if err != nil {
        panic(err)
    }

    // Connect to the master for writes (published on port 5432 in our compose file)
    writeDB, err = sql.Open("postgres", "host=localhost port=5432 user=postgres password=my_password dbname=postgres sslmode=disable")
    if err != nil {
        panic(err)
    }
}

func main() {
    // Route queries to the replica through the "read" connection
    rows, err := readDB.Query("SELECT * FROM users")
    if err != nil {
        panic(err)
    }
    defer rows.Close()

    // Route updates to the master through the "write" connection
    _, err = writeDB.Exec("UPDATE users SET name='new_name' WHERE id=1")
    if err != nil {
        panic(err)
    }
}


In the above example, the init() function establishes the two database connections at the start of the program, and main() demonstrates how to use the readDB connection for queries and the writeDB connection for updates. Keep in mind that sql.Open only validates its arguments and connects lazily, so it's worth calling Ping() on both handles at startup to fail fast on bad credentials or unreachable hosts.
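As your codebase grows, you probably won't want every caller juggling two handles. A small routing wrapper keeps the read/write split in one place; this is a minimal sketch, and the DB type and its methods are illustrative names of our own, not part of any library:

package db

import "database/sql"

// DB routes reads to a replica and writes to the master.
type DB struct {
    Read  *sql.DB // connection pool pointing at a replica
    Write *sql.DB // connection pool pointing at the master
}

// Query runs read-only statements against the replica.
func (d *DB) Query(query string, args ...interface{}) (*sql.Rows, error) {
    return d.Read.Query(query, args...)
}

// Exec runs writes against the master.
func (d *DB) Exec(query string, args ...interface{}) (sql.Result, error) {
    return d.Write.Exec(query, args...)
}

One caveat worth remembering: because the replication is asynchronous, a read issued immediately after a write may not see that write on the replica yet, so flows that must read their own writes should go to the master.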

Until next time, database friends! 🤙


Meron Hayle

Hi there, I'm Meron, a software engineer, an entrepreneur, and an artist, also known as a ninja in the art world.