PostgreSQL
Overview
PostgreSQL is a powerful, open-source relational database known for its reliability, extensibility, and standards compliance. In Zenskar, you can use PostgreSQL as a data source to import transactional, usage, or customer data for metering, billing, and analytical reporting.
Prerequisites
Create a dedicated read-only PostgreSQL user for replicating data. Alternatively, you can use an existing user:
CREATE USER <user_name> PASSWORD 'your_password_here';Provide the user read-only access to relevant schemas and tables:
GRANT USAGE ON SCHEMA <schema_name> TO <user_name>;
GRANT SELECT ON ALL TABLES IN SCHEMA <schema_name> TO <user_name>;
ALTER DEFAULT PRIVILEGES IN SCHEMA <schema_name> GRANT SELECT ON TABLES TO <user_name>;You must re-run the above command for each schema from which you expect to replicate data.
You must allow inbound traffic from Zenskar IP addresses .
Set up a PostgreSQL data-source connector via Zenskar dashboard
- Log into your Zenskar dashboard.
- In the left side bar, click Usage > Data Sources.
- In the top-right corner, click + ADD DATA SOURCE.
- On the Add New Data Source page, configure the PostgreSQL connector.
- Click on the SAVE SOURCE button.
Connector configuration
General configuration
| Field | Description | Required |
|---|---|---|
| Source Name | Enter a unique name for this data source. | Yes |
| Source Type | Select PostgreSQL from the dropdown menu. | Yes |
Connector configuration
| Field | Description | Required |
|---|---|---|
| Host | The hostname or IP address of the PostgreSQL server. | Yes |
| Port | The port on which PostgreSQL listens for incoming connections. | Yes |
| Schemas | The schema you want to replicate in Zenskar. | Yes |
| Database name | The name of the PostgreSQL database to connect to. | Yes |
| Username | The username of the PostgreSQL database. | Yes |
| Password | The password for the database user. | Yes |
| SSL Modes | SSL configuration mode: Available options are: Disable, Allow, Prefer, Require, Verify-CA, and Verify-Full. | Conditional |
Set up a PostgreSQL data-source connector via API
The
connector_configobject is the only part of the request that differs across connector types (such as BigQuery, Snowflake, or Redshift). The create data-source connector API reference provides a generic overview, while this document explains the PostgreSQL-specific structure ofconnector_config.
Request example
curl --location 'https://api.zenskar.com/datasources' \
-H 'x-api-key: <your-api-key>' \
-H 'apiversion: 20240301' \
-H 'Content-Type: application/json' \
-d '{
"name":"test_postgresql",
"connector_type":"Postgres",
"destination":"Postgres",
"status":"active",
"connector_config":{
"port":5432,
"schemas":["public"],
"ssl":true,
"ssl_mode":{
"mode":"require",
"client_certificate":"certificate",
"client_key":"key"},
"database":"db",
"username":"zenskar",
"password":"password",
"host":"acme"},
"source_definition_id":"decd3333-5647-4c0b-adf4-da0e75f5a750",
"remote_conn":true
}' PostgreSQL-specific connector configuration
The connector_config object contains configuration fields specific to the PostgreSQL data source.
| Field | Description | Required |
|---|---|---|
port | The port on which the PostgreSQL server listens for incoming connections (default: 5432). | Yes |
schemas | List of schemas to be replicated from the PostgreSQL database. | Yes |
ssl | Indicates whether SSL should be used for the connection (true or false). | Yes |
ssl_mode.mode | SSL mode to use for the connection. Supported values: disable, allow, prefer, require, verify-ca, verify-full. | Conditional (required if ssl = true) |
ssl_mode.client_certificate | The client certificate to use when connecting over SSL. | Conditional (required for verify-ca and verify-full) |
ssl_mode.client_key | The private key associated with the client certificate. | Conditional (required for verify-ca and verify-full) |
database | The name of the PostgreSQL database to connect to. | Yes |
username | The username used to authenticate with the PostgreSQL database. | Yes |
password | The password for the specified database user. | Yes |
host | The hostname or IP address of the PostgreSQL server. | Yes |
source_definition_id | The unique identifier for the PostgreSQL connector definition in Zenskar. | Yes |
remote_conn | Indicates whether a remote (SSH tunnel or bastion) connection should be used. | No |
Addendum
SSL modes in PostgreSQL
| SSL mode | Description |
|---|---|
| disable | SSL is disabled. The connection between the client (Zenskar) and the PostgreSQL server is unencrypted. |
| allow | Tries to connect without SSL first; if that fails, retries with SSL. This mode allows either encrypted or unencrypted connections depending on server configuration. |
| prefer | (Default behavior in many clients.) Tries to connect with SSL first, but falls back to a non-SSL connection if SSL isn’t available. |
| require | Always uses SSL to encrypt the connection, but does not validate the server certificate. Ensures encryption but not identity verification. |
| verify-ca | Establishes an SSL connection and verifies that the server’s certificate is signed by a trusted Certificate Authority (CA). Protects against untrusted certificates. |
| verify-full | Establishes an SSL connection, verifies the certificate’s CA, and checks that the server hostname matches the certificate. This is the most secure mode. |
If SSL mode is set to Require, the connection will fail if the PostgreSQL instance doesn’t support SSL.
Data replication
Change Data Capture (CDC) replication
Zenskar uses logical replication of PostgreSQL’s write-ahead log (WAL) to incrementally capture data changes, including deletions.
You should configure the PostgreSQL data source with CDC when:
- You need a record of deletions.
- Your database is large (500 GB or more).
- Your table has a primary key but lacks a suitable incremental cursor field (for example,
updated_at).
CDC must connect to the primary database. Configuring CDC on a replica host will result in failures.
Set up CDC
-
Grant
REPLICATIONprivileges to the read-only user:ALTER USER <user_name> REPLICATION; -
Enable logical replication on your PostgreSQL database.
-
For self-hosted or VM environments (bare metal, EC2, GCE, Docker): Edit the
postgresql.conffile:Parameter Description Set value to wal_levelType of logging in the WAL logicalmax_wal_sendersMaximum number of WAL sender processes Minimum 1max_replication_slotsMaximum number of replication slots allowed 1(or higher if other services read from WAL) -
For AWS RDS or Aurora:
- Go to the Configuration tab of your DB cluster.
- Locate your cluster parameter group and set
rds.logical_replication = 1. - Restart your instance manually or during the next maintenance window.
-
For Azure Database for PostgreSQL:
-
Change the replication mode to
logicalfrom the Azure portal, or run:az postgres server configuration set --resource-group <group> \ --server-name <server> \ --name azure.replication_support \ --value logical az postgres server restart --resource-group <group> --name <server>
-
-
-
Create a dedicated replication slot for Zenskar using the
pgoutputplugin:SELECT pg_create_logical_replication_slot('zenskar_slot', 'pgoutput');The output will include the slot name to be filled in the Zenskar setup page.
-
Define replication identities for all tables you want to replicate:
ALTER TABLE <table_name> REPLICA IDENTITY DEFAULT;Use
FULLfor tables with large or TOASTable fields:ALTER TABLE <table_name> REPLICA IDENTITY FULL; -
Create a publication including all tables to be replicated:
CREATE PUBLICATION zenskar_publication FOR TABLE <tbl1, tbl2, tbl3>;The publication name is customizable.
The Zenskar UI allows table selection for CDC, but if a table is not part of the publication, it will not be replicated. If a table lacks a replication identity, Zenskar can create it automatically if the user has sufficient permissions.
Xmin replication
The xmin method is a cursorless replication mode that tracks inserts and updates using PostgreSQL’s internal xmin system column.
Use xmin when:
- There’s no suitable cursor field (for example,
updated_at). - You want to replace a full-refresh sync.
- You are replicating PostgreSQL tables smaller than 500 GB.
- You are not replicating non-materialized views (these are unsupported by xmin).
Updated 13 days ago
