Install Monitoring Console

DBCC is a console tool for monitoring SynxDB clusters and databases. It provides dashboards and data to help you better diagnose the current status of your clusters and databases.

This document describes how to install and deploy the DBCC component to provide monitoring services for your SynxDB cluster.

DBCC consists of two main components: DBCC Server and DBCC Agent. DBCC Server is the server side of the console; it receives metrics and displays them in a unified view. DBCC Agent runs on each cluster node to collect and report data.

DBCC Agent monitors the system and services of the local node and reports data to DBCC Server via a gRPC port. DBCC Server receives, processes, and visualizes the operational status and performance metrics of all nodes in the cluster, and presents them on the monitoring dashboards.

Prerequisites

  • A SynxDB database cluster is installed and configured, and the gpperfmon service is enabled on all nodes.

  • You have obtained the DBCC installation package and its dependencies.

  • The target server is a Linux system that supports systemd.

  • You have root or sudo privileges.

  • curl is installed (for health checks).

  • glibc 2.17 or later is installed.

  • Hardware requirements: at least 1 CPU core and 2 GB of memory.
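Several of the prerequisites above can be checked from a shell before you start. The following is a minimal sketch, not a tool shipped with DBCC: the function names version_ge and check_prereqs are hypothetical, and the memory threshold is an assumption that leaves some slack because the kernel reports slightly less than the installed RAM.

```shell
# Hypothetical pre-flight check; not part of the DBCC installation package.

# Return 0 if version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

check_prereqs() {
    local glibc mem_kb status=0

    # systemd creates this directory when it is the running init system.
    [ -d /run/systemd/system ] || { echo "systemd not detected"; status=1; }

    # curl is used for health checks.
    command -v curl >/dev/null 2>&1 || { echo "curl not found"; status=1; }

    # getconf prints something like "glibc 2.17"; keep only the version field.
    glibc=$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')
    version_ge "${glibc:-0}" 2.17 || { echo "glibc ${glibc:-unknown} is older than 2.17"; status=1; }

    # At least 1 CPU core.
    [ "$(nproc)" -ge 1 ] || { echo "no CPU core detected"; status=1; }

    # MemTotal is in kB; 1800000 kB is an assumed threshold for a 2 GB host.
    mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    [ "${mem_kb:-0}" -ge 1800000 ] || { echo "less than about 2 GB of memory"; status=1; }

    return "$status"
}
```

Running check_prereqs prints one line per failed check and returns a non-zero status if any check fails. It does not verify the gpperfmon service or the installation package.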

Tip

gpperfmon is a performance monitoring component that helps administrators and developers view, analyze, and diagnose various runtime metrics of the cluster. To enable gpperfmon in SynxDB, follow these steps:

  1. Initialize the monitoring service:

    gpperfmon_install --enable --password <db_password> --port <db_port>
    
  2. Restart the database:

    gpstop -ar
    

Step 1: (Optional) Configure before installation

Before running the deploy.sh installation script, you can perform the following configurations as needed. All these settings apply to files within the installation package and will take effect during the first installation.

Customize ports

You can customize the default network ports used by the services before running the deploy.sh script. Find the following variables in the deploy.sh script and change their assigned values to your desired ports.

DBCC_SERVER_HTTP_PORT=8080        # DBCC Server Web UI and API
DBCC_SERVER_GRPC_PORT=28080       # For DBCC Agent connections
DBCC_SERVER_MANAGEMENT_PORT=18080 # DBCC Server Actuator/management endpoints
PROMETHEUS_PORT=9090              # Prometheus Web UI and API
ALERTMANAGER_PORT=9093            # Alertmanager Web UI and API

For example, to change the main DBCC Server HTTP port from 8080 to 8888, find the following line:

DBCC_SERVER_HTTP_PORT=8080

And change it to:

DBCC_SERVER_HTTP_PORT=8888
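If you prefer to script this change instead of editing the file by hand, a sed one-liner wrapped in a small function is one option. This is a sketch only: set_deploy_port is a hypothetical helper, and it assumes the variable is assigned at the start of a line exactly as shown above and that GNU sed is available.

```shell
# set_deploy_port FILE VARIABLE PORT
# Hypothetical helper: rewrite a port assignment in deploy.sh in place.
set_deploy_port() {
    local file=$1 var=$2 port=$3

    # Reject anything that is not a valid TCP port number.
    case $port in
        ''|*[!0-9]*) echo "invalid port: $port" >&2; return 1 ;;
    esac
    [ "$port" -ge 1 ] && [ "$port" -le 65535 ] || { echo "port out of range: $port" >&2; return 1; }

    # Fail early if the variable is not assigned in the file.
    grep -q "^${var}=" "$file" || { echo "$var not found in $file" >&2; return 1; }

    # Replace only the digits after "=", so a trailing comment is preserved.
    # The original file is kept as FILE.bak.
    sed -i.bak -E "s/^(${var}=)[0-9]+/\1${port}/" "$file"
}
```

For example, set_deploy_port deploy.sh DBCC_SERVER_HTTP_PORT 8888 makes the same change as the manual edit described above.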

Configure coordinator auto-failover

DBCC supports configuring automatic failover for the coordinator node to enhance database high availability. You can enable and configure this feature by modifying the configuration file in the installation package before installing DBCC Server for the first time.

To enable coordinator auto-failover, edit the config/dbcc-server/application.yml file in the installation package and add or modify the following configuration:

dbcc:
  coordinatorAutoFailover:
    enabled: true  # Defaults to false
    alert:
      confirmationDuration: 5  # Unit: seconds

Parameter descriptions:

  • enabled: Enables or disables the auto-failover feature. Set to true to enable, or false to disable. The default is false.

  • confirmationDuration: The duration (in seconds) for which an alert condition must be met before triggering a failover. This helps prevent false positives caused by network jitter or transient failures.

    Tip

    The confirmationDuration parameter determines how long a failure state must persist before a genuine alert and failover process is triggered. It translates to the for field in Prometheus alert rules. For example, confirmationDuration: 5 generates an alert rule with for: 5s.

    How to set a value for confirmationDuration:

    • A larger value: Can effectively prevent false failovers caused by transient issues (like network jitter), making failure detection more stable but slowing down the failover response.

    • A smaller value: Speeds up the failover response but might trigger unnecessary failovers due to transient failures. A value of 5 seconds is recommended.

Configure scan table task

DBCC supports a scheduled task that periodically scans tables to collect bloat and skew metrics. The scan results are displayed on the Recommendations page. You can enable and configure this task by modifying the config/dbcc-server/application.yml file in the installation package:

dbcc:
  scanTable:
    enabled: false           # Enables or disables the scan table task (default: false)
    cron: "0 0 2 * * ?"      # Cron expression for the task execution time (default: 2:00 AM daily)

Parameter descriptions:

  • enabled: Enables or disables the scheduled scan table task. Set to true to enable, or false to disable. The default is false.

  • cron: A cron expression that defines when the scan task runs. The default value "0 0 2 * * ?" means the task runs daily at 2:00 AM.
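The default expression has six fields rather than the five used by a classic crontab. Assuming it follows the common six-field layout with a leading seconds field (which matches the documented meaning of the default), the fields can be labeled as in the sketch below. The describe_cron function is a hypothetical illustration and does not validate field values.

```shell
# describe_cron EXPRESSION
# Hypothetical helper: label the fields of a six-field cron expression,
# assuming the order second, minute, hour, day-of-month, month, day-of-week.
describe_cron() {
    local sec min hour dom mon dow extra

    # read splits the expression on whitespace without expanding "*".
    read -r sec min hour dom mon dow extra <<EOF
$1
EOF

    if [ -z "$dow" ] || [ -n "$extra" ]; then
        echo "expected exactly 6 fields: $1" >&2
        return 1
    fi

    printf 'second=%s minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
        "$sec" "$min" "$hour" "$dom" "$mon" "$dow"
}
```

Under the same assumption, an expression such as "0 30 3 * * ?" would run the scan daily at 3:30 AM.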

After making changes to the configuration, restart the DBCC Server for the changes to take effect:

sudo ./deploy.sh restart dbcc-server

Disable multi-language support

By default, DBCC Server enables multi-language support (i18n). If you want to disable this feature and set a fixed default language, you can modify the config/dbcc-server/application.yml file in the installation package:

dbcc:
  i18n:
    enabled: false          # Set to false to disable i18n
    default-language: en-US # Set the default language

Modify default administrator credentials

The default administrator username and password for DBCC Server are both admin. Although you can modify the password via the UI after the first login, you can also pre-set credentials by modifying the config/dbcc-server/application.yml file in the installation package:

spring:
  security:
    user:
      name: admin      # Change to your preferred username
      password: admin  # Change to your preferred password

Adjust monitoring sensitivity (Prometheus)

You can adjust the core parameters of Prometheus to change the response speed of failure detection. These parameters are located in the config/prometheus/prometheus.yml file within the installation package.

Tip

Reducing these intervals speeds up failure detection, but it also increases the system load and network overhead on the Server host. Adjust them cautiously, based on your hardware configuration and monitoring requirements.

global:
  scrape_interval: 5s      # Data scraping interval
  scrape_timeout: 2s       # Scraping timeout
  evaluation_interval: 15s # Alert rule evaluation interval

Parameter descriptions:

  • scrape_interval: The time interval for Prometheus to scrape metrics from each Agent node. A smaller value means faster failure detection.

  • evaluation_interval: The time interval for Prometheus to evaluate alert rules. This affects the delay between a failure occurring and the alert firing.

  • scrape_timeout: The timeout for a single metric scrape. If an Agent node does not respond within this time, Prometheus marks the scrape as failed.

Adjust Agent data collection frequency

You can adjust how often the Agent collects and exports data, which directly affects how fresh the data scraped by Prometheus is. These parameters are located in the config/dbcc-agent/config.yml file within the Agent installation package.

databaseMetrics:
  coordinatorUpCollectInterval: 60s # Coordinator status collection interval
  exportInterval: 5s                # Metric export interval

Parameter descriptions:

  • coordinatorUpCollectInterval: The time interval for the Agent to execute SELECT 1 to detect Coordinator connectivity. A smaller value means more frequent and timely detection of database status, but also increases database load.

  • exportInterval: The time interval for the Agent to write collected metrics to a local text file. Prometheus obtains data by scraping this file, so this value directly affects the “freshness” of the data that Prometheus can obtain. Reducing this value makes the data more current, but it increases disk I/O load.
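The Agent intervals, the Prometheus intervals, and confirmationDuration all add to the time between a coordinator failure and the resulting alert. The sketch below is a back-of-the-envelope estimate, not a formula taken from DBCC: it assumes the stages run strictly one after another and that each one waits a full interval. The helper names are hypothetical, and only whole seconds ("5s" or "5") and minutes ("2m") are handled.

```shell
# to_seconds DURATION
# Convert "5", "5s", or "2m" to whole seconds; reject anything else.
to_seconds() {
    case $1 in
        ''|[sm]|*[!0-9sm]*|*[sm]?*) echo "unsupported duration: $1" >&2; return 1 ;;
        *s) echo "${1%s}" ;;
        *m) echo $(( ${1%m} * 60 )) ;;
        *)  echo "$1" ;;
    esac
}

# estimate_detection_delay DURATION...
# Add up the given intervals as a rough estimate of the detection delay.
estimate_detection_delay() {
    local total=0 d s
    for d in "$@"; do
        s=$(to_seconds "$d") || return 1
        total=$(( total + s ))
    done
    echo "$total"
}

# Example with the default values from this document, in this order:
# coordinatorUpCollectInterval, exportInterval, scrape_interval,
# evaluation_interval, confirmationDuration.
#   estimate_detection_delay 60s 5s 5s 15s 5
```

With the defaults, the largest contribution comes from coordinatorUpCollectInterval, so under these assumptions lowering the Prometheus intervals alone changes the estimate only slightly. Because Prometheus checks alert rules only at each evaluation, the actual delay can differ from this sum.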

Step 2: Install DBCC components

After completing the necessary pre-configurations, you can start installing the DBCC Server and Agent components.

Install Server

Upload the DBCC Server installation tool to the target server, enter the tool directory, and execute the following command:

sudo ./deploy.sh install

Attention

Before executing this command, please ensure that all necessary configurations from “Step 1” have been completed. The installation script will apply the pre-set configurations to the system.

This command performs the following actions:

  1. Create relevant directories.

  2. Copy binary and configuration files.

  3. Add systemd service files.

  4. Start DBCC Server/Prometheus/AlertManager services.

After deploying DBCC Server, you can access the DBCC control panel by browsing to the following address:

http://<DBCC_SERVER_IP>:8080

<DBCC_SERVER_IP> is the IP address of the server where DBCC Server is installed. The default port is 8080. If you changed the DBCC_SERVER_HTTP_PORT variable before installation, use your customized port instead.

Tip

DBCC Server uses the default administrator username admin and password admin.

It is recommended to modify the default password immediately after the first login. The steps are as follows:

  1. Log in to the DBCC control panel.

  2. Click the user avatar in the top right corner, and select Modify Password from the dropdown menu.

  3. In the Modify Password window, enter the old password and new password, and click Confirm to complete the modification.
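To confirm from the command line that the Server web port answers after installation, a small curl loop can help. The sketch below is not part of deploy.sh: the function names, the retry count, and the delay between attempts are illustrative assumptions. It treats any HTTP response as success, because the control panel may answer with a redirect to a login page.

```shell
# dbcc_ui_url HOST [PORT]
# Build the control panel address; the port defaults to 8080.
dbcc_ui_url() {
    echo "http://$1:${2:-8080}"
}

# wait_for_dbcc_ui HOST [PORT] [ATTEMPTS]
# Hypothetical helper: poll the control panel until it answers.
wait_for_dbcc_ui() {
    local url attempts=${3:-10} i=1
    url=$(dbcc_ui_url "$1" "$2")

    while [ "$i" -le "$attempts" ]; do
        # -s hides progress output, -o discards the body,
        # --max-time bounds each attempt to 5 seconds.
        if curl -s -o /dev/null --max-time 5 "$url"; then
            echo "DBCC Server answered at $url"
            return 0
        fi
        sleep 3
        i=$(( i + 1 ))
    done

    echo "no response from $url after $attempts attempts" >&2
    return 1
}
```

For example, wait_for_dbcc_ui 192.168.0.1 polls the default port, and wait_for_dbcc_ui 192.168.0.1 8888 polls a customized one.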

Install Agent

Upload the DBCC Agent installation tool to the server of each SynxDB node, and execute the following command in the tool directory on each node:

sudo ./deploy.sh install --console-url <DBCC_SERVER_IP>:28080

Here, <DBCC_SERVER_IP> is the IP address of the server where DBCC Server is installed. The default gRPC port is 28080. If you changed the DBCC_SERVER_GRPC_PORT variable before installation, use your customized port instead. For example:

sudo ./deploy.sh install --console-url 192.168.0.1:28080

This command performs the following actions:

  1. Create necessary directories.

  2. Copy binary files and configuration files.

  3. Set up systemd service files.

  4. Start the Agent service.

Attention

The console-url parameter is crucial for communication between the Agent and DBCC Server. This parameter consists of the server IP address and the server gRPC port (that is, DBCC_SERVER_GRPC_PORT, default value 28080).

The console-url parameter format is <server_ip>:<grpc_port>, for example, 192.168.0.1:28080.

The console URL is stored in the Agent configuration file, located at /etc/dbcc/dbcc-agent/config.yml, for example:

# Console configuration
console:
    # Console service address
    url: "192.168.0.1:28080"
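Because a malformed console-url only shows up later as an Agent that cannot report data, it can be worth checking the value before running the installer. The following sketch checks only the <server_ip>:<grpc_port> shape described above. valid_console_url is a hypothetical helper, and it does not verify that the host part is a real address or that the server is reachable.

```shell
# valid_console_url VALUE
# Hypothetical helper: return 0 if VALUE looks like <server_ip>:<grpc_port>.
valid_console_url() {
    local host port

    case $1 in
        *://*) return 1 ;;   # the documented format has no scheme prefix
        *:*)   ;;            # a colon must separate host and port
        *)     return 1 ;;
    esac

    host=${1%:*}     # everything before the last colon
    port=${1##*:}    # everything after the last colon

    [ -n "$host" ] || return 1

    # The port must be all digits and within the TCP port range.
    case $port in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$port" -ge 1 ] && [ "$port" -le 65535 ]
}
```

For example, valid_console_url 192.168.0.1:28080 succeeds, while a value with an http:// prefix or without a port is rejected.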

Next steps

After deploying DBCC, you can open the control panel to view the operational status of each node in the cluster, along with performance metrics such as CPU, memory, and disk I/O. For details, see Use the panel to view cluster monitoring data.

Appendix: Daily maintenance and reference

Maintenance commands

You can use the deploy.sh script to manage the lifecycle of the Server and Agent.

Server

# Start all services
sudo ./deploy.sh start

# Start specific services
sudo ./deploy.sh start dbcc-server
sudo ./deploy.sh start prometheus
sudo ./deploy.sh start alertmanager

# Stop all services
sudo ./deploy.sh stop

# Stop specific services
sudo ./deploy.sh stop prometheus

# Restart all services
sudo ./deploy.sh restart

# Restart specific services
sudo ./deploy.sh restart alertmanager

# Check the status of all services
sudo ./deploy.sh status

# Check the status of specific services
sudo ./deploy.sh status alertmanager

# View version information of the monitoring server package and its components
sudo ./deploy.sh version

# View version information of specific services, for example, alertmanager
sudo ./deploy.sh version alertmanager

Agent

# Start the Agent service
sudo ./deploy.sh start

# Stop the Agent service
sudo ./deploy.sh stop

# Restart the Agent service
sudo ./deploy.sh restart

# Check the Agent service status
sudo ./deploy.sh status

# View version information of the Agent package and its components
sudo ./deploy.sh version

Configuration file path reference

DBCC Server

  • Main configuration file: /etc/dbcc/dbcc-server/application.yml

    Tip

    You can customize the server’s logo, icon, and name by adding the following configurations to this configuration file:

    dbcc:
        title: 'custom name'             # Custom name
        distribution:
            distributor: 'distributor'   # Only supports SynxDB or BlueBerry
    
  • Prometheus configuration files:

    • Main configuration: /etc/dbcc/prometheus/prometheus.yml

    • Alert rules: /etc/dbcc/prometheus/alert_rule.yml

    • Scraping targets: /etc/dbcc/prometheus/scrape_config.yml

  • AlertManager configuration file: /etc/dbcc/alertmanager/alertmanager.yml

DBCC Agent

Agent configuration file: /etc/dbcc/dbcc-agent/config.yml

Default port reference

Component        Port    Description
DBCC Server UI   8080    Web control panel access port
DBCC gRPC        28080   Port used for Agent to report data
Prometheus       9090    Prometheus control panel
AlertManager     9093    Alert control panel

Troubleshooting

Server service fails to start

  1. Check service status: sudo ./deploy.sh status dbcc-server.

  2. View detailed logs: sudo journalctl -u dbcc-server -f.

  3. Verify port availability: sudo netstat -tulpn | grep 8080.

  4. Check for configuration file errors: cat /etc/dbcc/dbcc-server/application.yml.

Server health check fails

  1. Ensure the service is running.

  2. Check if the port is accessible.

  3. View errors in the service logs: sudo journalctl -u dbcc-server -f.

  4. Verify the configuration file is correct.

Agent service fails to start

  1. Check service status: sudo ./deploy.sh status.

  2. View detailed logs: sudo journalctl -u dbcc-agent -f.

  3. Verify configuration: cat /etc/dbcc/dbcc-agent/config.yml.

  4. Check if the console URL is correctly set and the server is reachable: ping <server_ip>.
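Note that ping only shows that the host answers ICMP; it does not show that the gRPC port accepts connections. As a complementary check, the sketch below reads the console URL from the Agent configuration file and tries to open a TCP connection to it. The function names are hypothetical, the parsing assumes the url key is written on one line as in the example configuration shown earlier, and the connection test relies on bash and the coreutils timeout command being available.

```shell
# read_console_url CONFIG_FILE
# Print the value of the first "url:" key, without surrounding quotes.
read_console_url() {
    sed -n -E 's/^[[:space:]]*url:[[:space:]]*"?([^"#[:space:]]+)"?.*$/\1/p' "$1" | head -n 1
}

# can_reach_port HOST PORT
# Return 0 if a TCP connection can be opened within 3 seconds.
# bash opens a socket when a redirection targets /dev/tcp/HOST/PORT.
can_reach_port() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# check_agent_link [CONFIG_FILE]
# Hypothetical helper combining the two steps above.
check_agent_link() {
    local url
    url=$(read_console_url "${1:-/etc/dbcc/dbcc-agent/config.yml}")
    [ -n "$url" ] || { echo "no console url found" >&2; return 1; }

    if can_reach_port "${url%:*}" "${url##*:}"; then
        echo "TCP connection to $url succeeded"
    else
        echo "cannot connect to $url" >&2
        return 1
    fi
}
```

A successful TCP connection only shows that something is listening on that port; it does not prove that the listener is the DBCC Server gRPC endpoint.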

Agent health check fails

  1. Ensure the service is running.

  2. Check if the DBCC Server is accessible.

  3. View errors in the service logs: sudo journalctl -u dbcc-agent -f.

  4. Verify the console URL in the configuration file is correct.