CBDR
CBDR is a backup and recovery tool for SynxDB and Apache Cloudberry™ (Incubating), built on top of WAL-G. It provides a simple command-line interface for performing backup and recovery operations, helping ensure data safety and enabling disaster recovery.
CBDR continuous archiving recovery is a disaster recovery solution based on WAL log archiving. It combines physical full backups with WAL archive files to achieve continuous data protection and Point-in-Time Recovery (PITR) for the database cluster. CBDR makes off-site disaster recovery for MPP databases possible: users can deploy a cluster with fewer servers but the same number of instances at a disaster recovery site, use CBDR for incremental data replication and recovery across data centers, and run the disaster recovery cluster as a read-only service (hot standby), making fuller use of system resources.
CBDR offers the following features:
Full backup: Supports full backup of the entire database cluster.
Incremental backup: Backs up only the changes made since the last backup.
Backup listing: Displays all available backups.
Data recovery: Restores data from a specified backup.
Continuous archiving and recovery (PITR): Achieves continuous data protection and point-in-time recovery through physical full backups and WAL archive files, providing better RTO and RPO.
Hot standby: Supports providing read-only query services on the disaster recovery cluster to improve resource utilization.
Storage support: Supports only S3-compatible object storage, not local storage.
Configuration management: Generates and manages the configuration files needed for backup and restore.
Tip
Compared to peer tools like gpbackup and gprestore, CBDR also supports storing backups to S3, multiple compression algorithms (lz4, lzma, zstd, brotli), and backup encryption.
Full backup and restore procedure
Before using CBDR to back up or restore a SynxDB cluster, make sure the following requirements are met:
SynxDB is properly installed and running.
The wal-g binary is installed under /usr/local/bin/ or /usr/bin/.
If using S3 storage, the appropriate credentials have been configured.
The general procedure for performing a backup using CBDR is as follows:
Backup process
Create a backup configuration file named config.yaml. Assume the file is located at /path/to/config.yaml. For the configuration file template, see Configuration file reference.
Distribute the configuration file to all Segment nodes and update the archive command in postgresql.conf:
cbdr configure backup --config=/path/to/config.yaml
Restart the SynxDB cluster:
gpstop -ari
Perform the backup:
cbdr backup --config=/path/to/config.yaml
View the list of available backups:
cbdr backup-list --config=/path/to/config.yaml
Restore process
Prepare a new SynxDB cluster as the target for restoration, and create the required configuration file config.yaml. Assume the file is located at /path/to/config.yaml.
Generate the restore configuration file restore_cfg.json. Before running this command, make sure the new cluster is reachable:
cbdr configure restore --config=/path/to/config.yaml --restore-config=/path/to/restore_cfg.json
Delete all existing data directories on the new cluster, including both Coordinator and Segment nodes. For example:
rm -rf /data202502111728221784/coordinator/gpseg-1
rm -rf /data202502111728221784/segment/gpseg-0
rm -rf /data202502111728221784/segment/gpseg-1
rm -rf /data202502111728221784/segment/gpseg-2
Perform the restore:
cbdr restore --config=/path/to/config.yaml --restore-config=/path/to/restore_cfg.json
Start the Coordinator node in admin mode and update the gp_segment_configuration system table to set the correct hostname, address, mirror, and other fields:
gpstart -c -a
If the following error occurs during startup, use ps -ef | grep postgres to check whether the Coordinator process is running. If it is, you can safely ignore the error:
gpstart failed. (Reason='connection to server at "localhost" (127.0.0.1), port 7000 failed: FATAL: the database system is not accepting connections DETAIL: Hot standby mode is disabled.')
Once the Coordinator starts successfully, you can query the segment configuration:
select * from gp_segment_configuration;
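After inspecting the configuration, the host-related fields can be rewritten with SQL. The sketch below only writes a hypothetical fix script to a file for review; the hostnames and dbid values are illustrative, and updating system catalogs in Greenplum-derived databases is assumed to require allow_system_table_mods. Verify every value against your own gp_segment_configuration output before running anything like this with psql.

```shell
# Hedged example: SQL that rewrites host fields after a cross-site restore.
# 'dr-host-1' and 'dbid = 2' are placeholders, not values from this document.
cat > /tmp/fix_segment_config.sql <<'EOF'
SET allow_system_table_mods = true;
UPDATE gp_segment_configuration
   SET hostname = 'dr-host-1', address = 'dr-host-1'
 WHERE dbid = 2;
EOF

cat /tmp/fix_segment_config.sql   # review first, then run it in admin mode
```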
If you have exited admin mode, restart the cluster in admin mode:
gpstop -c -a
Start the full cluster:
gpstart -a
You might encounter the following error during startup:
invalid IP mask "trust": Name or service not known
This happens because the pg_hba.conf file generated by WAL-G is missing CIDR masks (for example, /32). You need to manually fix the configuration files for the Coordinator, Segment, and Mirror nodes.
Example of incorrect configuration:
host all all 192.168.199.42 trust
host all gpadmin 192.168.192.159 trust
host all gpadmin 192.168.197.5 trust
Corrected configuration:
host all all 192.168.199.42/32 trust
host all gpadmin 192.168.192.159/32 trust
host all gpadmin 192.168.197.5/32 trust
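Instead of editing each file by hand, the /32 fix can be applied mechanically. The sketch below operates on a sample file under /tmp (a placeholder path; point it at your real pg_hba.conf on each node); the sed pattern only touches host lines whose IPv4 address is followed by whitespace, so entries that already carry a mask are left alone.

```shell
# Hypothetical sample of the malformed entries WAL-G can generate.
cat > /tmp/pg_hba_sample.conf <<'EOF'
host all all 192.168.199.42 trust
host all gpadmin 192.168.192.159 trust
host all gpadmin 192.168.197.5 trust
EOF

# Append /32 to any bare IPv4 host address; addresses already followed by
# a '/mask' do not match (the pattern requires whitespace after the address).
sed -E 's#^(host[[:space:]].*[[:space:]])([0-9]{1,3}(\.[0-9]{1,3}){3})([[:space:]]+)#\1\2/32\4#' \
    /tmp/pg_hba_sample.conf > /tmp/pg_hba_fixed.conf

cat /tmp/pg_hba_fixed.conf
```

Review the rewritten file before copying it back over the original on the Coordinator, Segment, and Mirror nodes.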
Attention
Before backing up, make sure the database is running, configuration is correct, and there is enough available disk space.
Prepare the restore environment in advance. Do not interrupt the restore process. After restoring, always verify data integrity.
For storage management, regularly clean up invalid backups and monitor storage usage. If using S3, ensure a stable network connection.
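The disk-space precaution above can be automated as a pre-backup guard. This is a minimal sketch, not part of CBDR: DATA_DIR and MIN_FREE_KB are placeholders to tune for your cluster (the demo uses /tmp and a tiny threshold so it runs anywhere).

```shell
# Hedged sketch: fail fast if the data directory's filesystem is low on space.
DATA_DIR=/tmp
MIN_FREE_KB=1024   # demo threshold: 1 MB; set this to your expected backup size

# POSIX df -Pk prints available kilobytes in column 4 of the second line.
avail_kb=$(df -Pk "$DATA_DIR" | awk 'NR==2 {print $4}')
if [ "$avail_kb" -lt "$MIN_FREE_KB" ]; then
    echo "insufficient free space under $DATA_DIR: ${avail_kb} KB" >&2
    exit 1
fi
echo "disk check passed: ${avail_kb} KB free under $DATA_DIR"
```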
Incremental backup and restore procedure
Before performing an incremental backup, make sure that you have completed at least one full backup.
On the source cluster, run the following command to start an incremental backup based on a specific full backup. Example:
cbdr backup --config=/path/to/config.yaml --delta-from-name=backup_20250409T153036Z
Attention
If you have not run the cbdr configure backup command on the current machine, or if you have run it before but the configuration file has changed, run cbdr configure backup --config=/path/to/config.yaml first. This command distributes the /path/to/config.yaml file from the Coordinator node to the same file path on all Segment nodes.
If you have run the cbdr configure backup command on the current machine and the configuration file has not changed since then, you can simply run cbdr backup.
View all available backups (including full and incremental):
cbdr backup-list --config=/path/to/config.yaml
Sample output:
backup_name                                  modified                   wal_file_name             storage_name
backup_20250409T153036Z                      2025-04-09T15:31:36+08:00  ZZZZZZZZZZZZZZZZZZZZZZZZ  default
backup_20250409T153136Z_D_20250409T153036Z   2025-04-09T15:32:36+08:00  ZZZZZZZZZZZZZZZZZZZZZZZZ  default
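When scripting incremental backups (for example, to pick a base for --delta-from-name), the newest backup name can be extracted from a captured listing with standard text tools. This sketch parses a file containing the illustrative output shown above; it assumes the real cbdr backup-list output keeps the same column order.

```shell
# Capture of the illustrative backup-list output (in practice, redirect
# `cbdr backup-list --config=...` into this file).
cat > /tmp/backup_list.txt <<'EOF'
backup_name modified wal_file_name storage_name
backup_20250409T153036Z 2025-04-09T15:31:36+08:00 ZZZZZZZZZZZZZZZZZZZZZZZZ default
backup_20250409T153136Z_D_20250409T153036Z 2025-04-09T15:32:36+08:00 ZZZZZZZZZZZZZZZZZZZZZZZZ default
EOF

# Skip the header, sort by the "modified" column (ISO timestamps sort
# lexicographically), take the last line, and print its backup name.
latest=$(tail -n +2 /tmp/backup_list.txt | sort -k2 | tail -n 1 | awk '{print $1}')
echo "$latest"
```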
Prepare a new SynxDB cluster as the target for restore. The preparation process is the same as for full backup restore.
Run the restore on the new cluster:
cbdr restore backup_20250409T153136Z_D_20250409T153036Z \
    --config=/path/to/config.yaml \
    --restore-config=/path/to/restore_cfg.json
Continuous archiving recovery (PITR) and hot standby procedure
Compared to traditional incremental backups, continuous archiving recovery offers a more lightweight and frequent restore point creation capability, providing better Recovery Time Objective (RTO) and Recovery Point Objective (RPO).
This procedure allows you to set up a disaster recovery (DR) cluster that continuously pulls WAL archive logs from the primary cluster and can optionally provide read-only query services (hot standby).
Steps for backing up primary cluster
Prepare configuration file: Prepare the config.yaml configuration file. For specific parameters, refer to Configuration file reference.
Configure backup: Distribute the backup configuration file to all Segment nodes and modify the archive command in postgresql.conf.
Restart the cluster:
gpstop -ari
Create a base backup: Perform a full backup to serve as the base for continuous archiving.
cbdr backup --full=true --config=/path/to/config.yaml
Create restore points on demand: Create restore points on the primary cluster as needed. A restore point is a specific time marker to which the DR cluster can recover.
cbdr create-restore-point "rp1" --config=/path/to/config.yaml
cbdr create-restore-point "rp2" --config=/path/to/config.yaml
Steps for restoring recovery cluster and setting up hot standby
Generate recovery configuration: On the new DR cluster, generate the recovery configuration file restore_cfg.json.
cbdr configure restore --config=/path/to/config.yaml --restore-config=/path/to/restore_cfg.json
Perform initial restore: Ensure that all data directories on the recovery cluster are empty, then perform an initial restore to the latest base backup.
rm -rf /path/to/all/data/dirs/*
cbdr restore --restore-config=/path/to/restore_cfg.json --config=/path/to/config.yaml
Set up hot standby mode: After running this command, the recovery cluster will start up in hot standby mode and can serve read-only queries.
cbdr read-replica --config=/path/to/config.yaml --restore-config=/path/to/restore_cfg.json
Continuous recovery (track restore points): Use the follow-primary command to make the recovery cluster track the restore points of the primary cluster.
# The first time you run follow-primary, the cluster will start (if not already
# running), recover to the specified restore point (e.g., "rp1"), and then pause.
# At this point, it can accept read-only queries.
cbdr follow-primary "rp1" --config=/path/to/config.yaml --restore-config=/path/to/restore_cfg.json

# ...after a new restore point "rp2" is created on the primary cluster...

# 1. First, specify the next target restore point "rp2"
#    (this command does not immediately start replaying logs).
cbdr follow-primary "rp2" --config=/path/to/config.yaml --restore-config=/path/to/restore_cfg.json

# 2. Then, run replay-resume to make the cluster continue replaying logs
#    from "rp1" until it reaches "rp2" and pauses again.
cbdr replay-resume --config=/path/to/config.yaml --restore-config=/path/to/restore_cfg.json

# ...repeat this process to track "rp3", "rp4", etc.
(Optional) Promote to primary: When the primary cluster fails and you need to switch the DR cluster to be the new primary, run this command.
Attention
After running promote, the cluster becomes read-write and stops replaying WAL logs. This is an irreversible operation: the cluster can no longer be used as a DR cluster for archive recovery.
cbdr promote --config=/path/to/config.yaml --restore-config=/path/to/restore_cfg.json
gpstop -ari
Configuration file reference
CBDR uses YAML configuration files. Currently, it only supports configuring S3 storage parameters and does not support storing backups on the local file system. The following is a sample configuration.
# Database connection settings
PGHOST: "localhost"
PGPORT: 7000
PGUSER: "gpadmin"
PGDATABASE: "postgres"
# Concurrency settings
GOMAXPROCS: 6
# Relative path to the recovery configuration file
WALG_GP_RELATIVE_RECOVERY_CONF_PATH: "conf.d/recovery.conf"
# Polling interval for segment status
WALG_GP_SEG_POLL_INTERVAL: "1m"
# Compression method: supports lz4, lzma, zstd, brotli
WALG_COMPRESSION_METHOD: "lz4"
# Upload and download concurrency
WALG_UPLOAD_CONCURRENCY: 5
WALG_DOWNLOAD_CONCURRENCY: 5
# Retry attempts for file download
WALG_DOWNLOAD_FILE_RETRIES: 5
# Required settings for using S3 storage
WALE_S3_PREFIX: "xxxxxxxxxxxxx"
AWS_ENDPOINT: "xxxxxxxxxxxxx"
AWS_SECRET_ACCESS_KEY: "xxxxxxxxxxxxx"
AWS_ACCESS_KEY_ID: "xxxxxxxxxxxxx"
# Directory for log files
WALG_GP_LOGS_DIR: "/var/log/cbdr"
# Incremental backup limit: maximum number of incremental backups allowed
# after a full backup. For example, if set to 6, a new full backup will be
# forced after 6 consecutive incremental backups to prevent long backup chains.
WALG_DELTA_MAX_STEPS: 10
Command usage
The basic syntax for running CBDR commands is:
cbdr <command> [options] --config=<config_file>
The following sections describe the main usage of each CBDR command.
Configure commands
The cbdr configure command is used to distribute the backup configuration to all Segment nodes or to generate a restore configuration file.
# Distribute the backup configuration and update the archive command in postgresql.conf
cbdr configure backup --config=<config_file>
# Generate the restore configuration file
cbdr configure restore --config=<config_file> --restore-config=<restore_config_file>
The restore configuration file can be automatically generated using cbdr configure restore (requires the cluster to be reachable), or it can be written manually. A sample JSON format is shown below:
{
"segments": {
"-1": {
"hostname": "localhost",
"port": 7000,
"data_dir": "/tmp/tests/gpdemo/datadirs1/qddir/demoDataDir-1"
},
"0": {
"hostname": "localhost",
"port": 7002,
"data_dir": "/tmp/tests/gpdemo/datadirs1/dbfast1/demoDataDir0"
},
"1": {
"hostname": "localhost",
"port": 7003,
"data_dir": "/tmp/tests/gpdemo/datadirs1/dbfast2/demoDataDir1"
},
"2": {
"hostname": "localhost",
"port": 7004,
"data_dir": "/tmp/tests/gpdemo/datadirs1/dbfast3/demoDataDir2"
}
}
}
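When the restore configuration is written by hand, a typo in the JSON will only surface when cbdr reads it. A minimal sanity check, sketched here with python3's stdlib json.tool (the file content below is a shortened, illustrative variant of the sample above; paths and ports are placeholders):

```shell
# Hedged sketch: author a restore config and verify it parses as JSON.
cat > /tmp/restore_cfg.json <<'EOF'
{
  "segments": {
    "-1": { "hostname": "localhost", "port": 7000, "data_dir": "/data/coordinator/gpseg-1" },
    "0":  { "hostname": "localhost", "port": 7002, "data_dir": "/data/segment/gpseg0" }
  }
}
EOF

# json.tool exits non-zero on malformed JSON, so this catches typos early.
python3 -m json.tool /tmp/restore_cfg.json > /dev/null && echo "restore_cfg.json is valid JSON"
```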
Cluster backup
The cbdr backup command is used to back up the database cluster. Syntax:
cbdr backup [options] --config=<config_file>
Optional parameters:
--permanent: Marks the backup as permanent. It cannot be deleted unless forced.
--full: Performs a full backup.
--add-user-data=<json>: Attaches custom metadata to the backup in JSON format.
--delta-from-user-data=<json>: Specifies the base backup for an incremental backup by metadata.
--delta-from-name=<backup_name>: Specifies the base backup for an incremental backup by name.
View backup list
The cbdr backup-list command displays the list of available backups.
cbdr backup-list --config=<config_file> [options]
Optional parameters:
--pretty: Outputs the list in a more readable format.
--json: Outputs the list in JSON format.
--detail: Displays detailed information.
Restore command
The cbdr restore command restores a backup to the target cluster. Usage:
cbdr restore <backup_name> --config=<config_file> [--restore-config=<restore_config_file>] [--target-user-data=<json>]
Parameter descriptions:
backup_name: (Optional) The name of the backup to restore. If omitted, the latest backup is restored by default.
--restore-config: Path to the restore configuration file.
--target-user-data: Restores a backup that matches the specified user-defined metadata.
--restore-point: (Deprecated; use the follow-primary procedure instead.) Restores to a specific restore point.
Delete command
The cbdr delete command removes an existing backup.
cbdr delete --config=<config_file> [--confirm] [--force-delete]
Parameter descriptions:
--confirm: Must be explicitly set to execute the deletion.
--force-delete: Forces deletion even for permanent backups.
Continuous recovery and restore point commands
These commands are used for the “continuous archiving recovery (PITR) and hot standby” procedure.
Create a restore point: Create a time marker on the primary cluster for continuous recovery.
cbdr create-restore-point <restore-point-name> --config=<config_file>
View restore points: View all created restore points.
cbdr restore-point-list --config=<config_file> [--pretty] [--json] [--detail]
Set hot standby mode: Run on the recovery cluster to put it in hot standby mode, allowing it to handle read-only queries.
cbdr read-replica --config=<config_file> --restore-config=<restore_config_file>
Continuous recovery: Run on the recovery cluster to make it follow the primary cluster’s restore points.
cbdr follow-primary <restore-point-name> --config=<config_file> --restore-config=<restore_config_file>
The first time you run this, it will start the cluster (if not already running) and recover to the specified restore point, then pause.
Subsequent executions are used to specify the next target restore point (but recovery does not start immediately).
Resume replay: After specifying a new restore point with follow-primary on the recovery cluster, run this command to make the cluster continue replaying WAL logs to the next target restore point.
cbdr replay-resume --config=<config_file> --restore-config=<restore_config_file>
Promote to primary: Run on the recovery cluster to promote the hot standby cluster to a primary cluster, making it read-write. This operation is irreversible.
cbdr promote --config=<config_file> --restore-config=<restore_config_file>