Documentation — Keel Database Proxy

Overview

Keel is a high-performance, database-agnostic connection pooler and proxy written in modern C (C23). It supports both PostgreSQL (v3 wire protocol) and MySQL (client/server protocol) from a single binary, with native io_uring support, transparent read/write routing, prepared-statement virtualization, and full TLS/mTLS.

The core design principle is share-nothing: each worker thread owns its reactor, session slab, backend pool, and timer wheel — eliminating cross-thread locking in the fast path.

ℹ️

Version v0.2-alpha

For v0.2-alpha, the recommended production deployment is mode = pool with prepared_statement = virtualize and experimental_features = false. Routing and sharding features are available but are still in the hardening and experimental stages, respectively.

Production Status

Status	Features
Stable	PostgreSQL pool mode, prepared-statement virtualization, admin inspection, basic metrics
Hardening	Smart routing, SSV, Patroni failover, transaction tracking
Experimental	Sharding, scatter-merge, multi-shard 2PC, WAL/GTID catch-up probes, cluster compression

Installation

Debian / Ubuntu

Terminal

wget https://github.com/virtlabs-io/keel/releases/download/v0.2.0/keel_0.2.0_amd64.deb
sudo dpkg -i keel_0.2.0_amd64.deb

Build from Source

Terminal

# Dependencies
sudo apt install cmake gcc libssl-dev liburing-dev

# Clone and build
git clone https://github.com/virtlabs-io/keel.git
cd keel
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)
sudo make install

Docker

Terminal

docker pull ghcr.io/virtlabs-io/keel:latest

Build Options

CMake Option	Default	Description
`KEEL_ENABLE_LUA`	ON	Embed Lua 5.4/LuaJIT scripting engine
`KEEL_ENABLE_PYTHON`	ON	Embed CPython 3.x scripting engine
`KEEL_ENABLE_IOURING`	ON	Enable io_uring backend (Linux 5.6+)
`KEEL_ENABLE_SANITIZERS`	OFF	Enable ASan/UBSan (development builds)
`CMAKE_BUILD_TYPE`	Release	`Release` / `Debug` / `RelWithDebInfo`

Quick Start

The fastest path to running Keel as a PostgreSQL connection pooler:

/etc/keel/keel.ini

[keel]
experimental_features = false

[worker_group.main]
protocol          = postgresql
bind_addr         = 0.0.0.0
bind_port         = 6432
num_workers       = 0             # auto: one per CPU core
mode              = pool          # production default
prepared_statement = virtualize

min_pool_size     = 10
max_pool_size     = 50

[worker_group.main.servers]
primary = host=127.0.0.1 port=5432 dbname=mydb user=app password=secret role=RW weight=100

Terminal

# Start
sudo systemctl start keel

# Connect through Keel (port 6432) instead of direct (5432)
psql "host=localhost port=6432 dbname=mydb user=app"

# Verify via the Prometheus metrics endpoint on the admin port
curl http://localhost:9187/metrics | grep keel_connections

Architecture

Keel uses a share-nothing, per-worker architecture. Each worker thread is a completely independent unit — it owns its reactor, session slab, backend pool, and timer wheel. There is no cross-thread locking in the query path.

Worker Components

#	Component	Description
01	Reactor	io_uring ring / kqueue / epoll — drives all I/O asynchronously
02	Memory Arena	64 KB bump-allocated arena for request-scoped allocations
03	Session Slab	O(1) fixed-size allocator for client sessions (~400 B each)
04	Backend Pool	Per-worker pool of backend connections (RW/RO/WO node pools) with async refill
05	Timer Wheel	Refill (100 ms), prune (30 s), idle checks — all reactor-driven
06	Pipe Pool	Pre-allocated pipes for `splice(2)` zero-copy (Linux)

Request Flow

Query path

Client
  → accept (SO_REUSEPORT) → Worker assignment
  → Startup + Auth (SCRAM-SHA-256 / MD5 / caching_sha2 / LDAP / PAM / mTLS / Cloud IAM)
  → Session created from slab
  → Query Rules (declarative INI rules — before SQL parse)
  → SQL Lexer → Parser → Query Tree    [skipped in PROXY mode]
  → Throttle check (token-bucket rate limiter)
  → OSC / NOTIFY / LISTEN / keel.* GUC intercept
  → Router (R/W split, shard dispatch, hook chain, plugin)
  → Backend Pool borrow (or wait queue + async refill)
  → Backend Protocol (proxy query, relay results, zero-copy splice)
  → Backend Pool return (transaction complete)
  → Stats update (counters, histogram, tracing span)

Connection Multiplexing

Workers use SO_REUSEPORT on the listen socket so the kernel distributes new connections across workers without any userspace balancing. Each worker independently runs its own io_uring ring, so there are zero cross-worker lock acquisitions on the query hot path.
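
The kernel-side distribution described above can be tried with a few lines of Python. This is only an illustrative sketch, not Keel's C implementation: the helper name `make_listener` is hypothetical, and it assumes a Linux or BSD host where `socket.SO_REUSEPORT` is available.

```python
import socket

def make_listener(port: int, host: str = "127.0.0.1") -> socket.socket:
    """Create a listening TCP socket that shares its port via SO_REUSEPORT."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Every socket that shares the port must set the option before bind().
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen(128)
    return sock

# One listener per "worker"; the kernel decides which one gets each new connection.
first = make_listener(0)                  # port 0: let the kernel pick a free port
shared_port = first.getsockname()[1]
second = make_listener(shared_port)       # would fail with EADDRINUSE without the option
same_port = second.getsockname()[1] == shared_port

first.close()
second.close()
```

Because each listener has its own accept queue, no userspace lock or dispatcher thread is needed to hand connections to workers.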

Reactor Model

Keel abstracts platform I/O behind a unified keel_reactor_t API. Three backends are supported, selected automatically at startup:

Backend	Platform	Requirements	Notes
io_uring	Linux	Kernel 5.6+, `liburing`	Primary. Linked SQEs, registered FDs, zero-copy splice
epoll	Linux	Kernel 2.6.1+	Fallback. O(1) fd registration, full feature parity
kqueue	macOS / BSD	—	Primary on macOS. EVFILT_READ/WRITE/TIMER

io_uring Optimizations

Registered FDs — file descriptors pre-registered with the ring, skipping per-operation fd table lookups
Linked SQEs — backend connect + SCRAM auth chained as atomic io_uring sequences (IOSQE_IO_LINK), eliminating one syscall per round-trip
MSG_PEEK + Splice bypass — when fast_network_path = on, DataRow frames are spliced directly from backend socket → kernel pipe → client socket with zero userspace copies
Single-shot accept — fair connection distribution across workers

💡

Zero poll() on the hot path

The entire connect/auth/query lifecycle runs without a single poll() syscall when using the io_uring backend. All waits are submitted as ring operations.

Connection Pooling

Keel supports transaction pooling as the primary mode: backend connections are borrowed from the pool at the start of each transaction and returned when the transaction completes. This allows hundreds of application threads to share a small pool of backend connections.

Pool Lifecycle

Client sends a query
Engine borrows a backend connection from the per-worker pool
If no idle connection is available, a waiter is queued and an async refill is triggered
Before forwarding the query, the engine syncs any session state differences (SET parameters, search_path) to the borrowed backend
Query is forwarded; results relayed back to client
On ReadyForQuery('I'), the backend connection is returned to the pool
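
The borrow, wait, and return steps above can be modelled roughly as follows. This is a simplified sketch under stated assumptions: `WorkerPool` and its fields are hypothetical names, the refill is shown synchronously (Keel's is asynchronous), and state sync is left out.

```python
from collections import deque

class WorkerPool:
    """Toy per-worker pool: idle connections plus a FIFO queue of waiting clients."""

    def __init__(self, max_size: int):
        self.max_size = max_size
        self.idle = deque()      # backend connections ready to borrow
        self.waiters = deque()   # clients queued because nothing was idle
        self.total = 0           # connections opened so far
        self.handed_off = []     # (client, conn) pairs served directly from a release

    def borrow(self, client: str):
        """Return a connection, or None after queueing the client."""
        if self.idle:
            return self.idle.popleft()
        if self.total < self.max_size:
            # Stand-in for the async refill: open a new backend connection.
            self.total += 1
            return f"conn-{self.total}"
        self.waiters.append(client)
        return None

    def release(self, conn: str) -> None:
        """Called when the transaction ends, i.e. on ReadyForQuery('I')."""
        if self.waiters:
            # The longest-waiting client gets the connection first.
            self.handed_off.append((self.waiters.popleft(), conn))
        else:
            self.idle.append(conn)

pool = WorkerPool(max_size=1)
c1 = pool.borrow("client-a")       # opens conn-1
queued = pool.borrow("client-b")   # pool exhausted: client-b waits
pool.release(c1)                   # conn-1 goes straight to client-b
```

Serving waiters before refilling the idle list keeps wait times bounded by transaction length rather than by connection setup time.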

Pool Sizing

Parameter	Type	Default	Description
`min_pool_size`	int	5	Minimum idle backend connections kept warm per worker
`max_pool_size`	int	50	Maximum backend connections per worker
`idle_timeout`	duration	5m	Idle backend connection eviction time
`max_connection_age_s`	int	—	Max age of a backend connection before replacement
`pool_wait_timeout_ms`	int	5000	Max time a client waits for a pool borrow before error

⚠️

Pool size is per-worker

With num_workers = 4 and max_pool_size = 50, the maximum total number of backend connections is 4 × 50 = 200. Plan your PostgreSQL max_connections accordingly.
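
A quick way to check the arithmetic before deploying is sketched below. The function names and the `reserved` parameter (connections you keep free for superusers, migrations, or other clients) are illustrative assumptions, not Keel settings.

```python
def max_backend_connections(num_workers: int, max_pool_size: int) -> int:
    """Upper bound on backend connections one worker group can open."""
    return num_workers * max_pool_size

def fits_postgres(total: int, pg_max_connections: int, reserved: int = 0) -> bool:
    """True if the pool ceiling plus reserved slots stays within max_connections."""
    return total + reserved <= pg_max_connections

total = max_backend_connections(num_workers=4, max_pool_size=50)
print(total)                                                        # 200
print(fits_postgres(total, pg_max_connections=100))                 # False
print(fits_postgres(total, pg_max_connections=250, reserved=10))    # True
```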

DISCARD ALL on Return

When a dirty backend connection (one that has SET parameters or other state) is returned to the pool, Keel sends DISCARD ALL asynchronously and only returns it to the available pool after PostgreSQL confirms readiness with ReadyForQuery('I'). This state is tracked by the CLEANING pool entry state.

Session State — SSV

Semantic State Virtualization (SSV) is Keel's consistency model for preserving session state (GUC parameters, search_path, prepared statements) across transaction-pooled backend reassignments.

How It Works

Each client session tracks a sorted key-value state profile (SET parameters, search_path, etc.)
Each backend connection also tracks its current server-side state
On pool borrow, Keel computes a two-pointer merge diff between client state and backend state
If the states match (XXHash64 fingerprint check), the borrow is on the fast path — zero SQL sent
If they differ, minimal SQL (SET x = 'v') is generated and sent as a pre-query phase

State diff example

Client state:   search_path='myschema', statement_timeout='5000', work_mem='256MB'
Backend state:  search_path='public',                              work_mem='128MB'
───────────────────────────────────────────────────────────────────────────────────
Sync SQL:       SET search_path='myschema'; SET statement_timeout='5000'; SET work_mem='256MB';
                -- search_path and work_mem differ; statement_timeout is missing on the backend

70–90% of borrows hit the hash-match fast path (zero sync SQL needed).
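
A two-pointer merge diff over two key-sorted state profiles might look like the sketch below. It rests on assumptions: the function name `state_diff` is hypothetical, keys present only on the backend are cleared with RESET, and values are interpolated without escaping, which real code must not do.

```python
def state_diff(client: list, backend: list) -> list:
    """Walk two key-sorted lists of (name, value) pairs and return the sync SQL."""
    sql, i, j = [], 0, 0
    while i < len(client) or j < len(backend):
        if j == len(backend) or (i < len(client) and client[i][0] < backend[j][0]):
            # Key exists only on the client: the backend is missing it.
            name, value = client[i]
            sql.append(f"SET {name} = '{value}'")
            i += 1
        elif i == len(client) or backend[j][0] < client[i][0]:
            # Key exists only on the backend: leftover state from another session.
            sql.append(f"RESET {backend[j][0]}")
            j += 1
        else:
            # Same key on both sides: emit SQL only when the values differ.
            if client[i][1] != backend[j][1]:
                sql.append(f"SET {client[i][0]} = '{client[i][1]}'")
            i += 1
            j += 1
    return sql

client_state = [("search_path", "app"), ("statement_timeout", "5000")]
backend_state = [("search_path", "app"), ("work_mem", "64MB")]
print(state_diff(client_state, backend_state))
# ["SET statement_timeout = '5000'", 'RESET work_mem']
```

Because both lists are sorted, the diff is a single linear pass, and an empty result corresponds to the zero-SQL fast path.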

5-Tier Borrow Strategy

The backend pool borrow logic uses a 5-tier strategy, attempting each tier in order until a suitable connection is found:

Exact hash match — state already identical, zero sync SQL
Subset match: backend state is a subset of the client state, so only additions are needed
Compatible match — small diff; generate minimal SET commands
DISCARD candidate — send DISCARD ALL to clean and reuse
New connection — establish fresh backend connection
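
The tier selection can be sketched as a classification of each idle backend against the client's state. Everything here is illustrative: `borrow_tier`, `pick_backend`, and the `max_diff` threshold that separates a "small diff" from a DISCARD candidate are assumptions, not documented Keel internals.

```python
from typing import Optional

def borrow_tier(client: dict, backend: Optional[dict], max_diff: int = 2) -> int:
    """Return the tier (1-5) a backend would be borrowed under for this client."""
    if backend is None:
        return 5                      # no reusable backend: open a new connection
    if backend == client:
        return 1                      # identical state, zero sync SQL
    if all(client.get(k) == v for k, v in backend.items()):
        return 2                      # everything already set matches; only additions needed
    differing = {k for k in client.keys() | backend.keys()
                 if client.get(k) != backend.get(k)}
    if len(differing) <= max_diff:
        return 3                      # small diff: a few SET/RESET commands
    return 4                          # too far apart: DISCARD ALL and reuse

def pick_backend(client: dict, idle: list) -> tuple:
    """Return (tier, index) of the cheapest idle backend, or (5, None) if none exist."""
    return min(((borrow_tier(client, b), i) for i, b in enumerate(idle)),
               default=(5, None))

client = {"search_path": "app", "work_mem": "64MB"}
idle = [{"search_path": "other"}, {"search_path": "app"}]
print(pick_backend(client, idle))   # (2, 1)
```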

Runtime Modes

Keel has four runtime tiers, each activating additional features. A disabled feature costs at most one cmp + jcc instruction pair on the hot path, which is effectively zero overhead.

Feature	proxy	pool	smart	full
Frame extraction + forward	✅	✅	✅	✅
Connection pooling	❌	✅	✅	✅
Prepared statement replay	❌	✅	✅	✅
R/W routing + sticky-primary	❌	❌	✅	✅
Query logging + SQL analysis	❌	❌	✅	✅
Backend state sync (diff)	❌	❌	✅	✅
Full statistics	❌	basic	✅	✅
Hook dispatch (×4 points)	❌	❌	❌	✅
Transaction tracking (XID)	❌	❌	❌	✅
WAL LSN / GTID capture	❌	❌	❌	✅

keel.ini — setting the mode

[worker_group.main]
mode = pool    # proxy | pool | smart | full  (default: pool)

ℹ️

Startup log

On startup, Keel emits: Runtime tier: pool. Enabled features: [pooling, ps_virtualize, basic_stats]

Configuration Reference

Keel uses INI-format configuration files. Sections map to component areas. Indented keys belong to the nearest section above them.

[keel] — Global Settings

Key	Type	Default	Description
`experimental_features`	bool	false	Gate for experimental features (sharding, scatter-merge, WAL probes)
`log_level`	enum	info	`debug` \| `info` \| `warn` \| `error`
`log_format`	enum	text	`text` \| `json` (NDJSON structured logs)
`pid_file`	path	—	Path to write PID file

[worker_group.NAME] — Worker Group

Key	Type	Default	Description
`protocol`	enum	postgresql	`postgresql` \| `mysql`
`bind_addr`	string	0.0.0.0	Listener bind address
`bind_port`	int	6432	Listener TCP port
`num_workers`	int	0	Worker threads; `0` = auto (one per CPU core)
`mode`	enum	pool	Runtime tier: `proxy` \| `pool` \| `smart` \| `full`
`prepared_statement`	enum	virtualize	`virtualize` \| `pinning` \| `tracking` \| `anonymous`
`min_pool_size`	int	5	Min idle connections per worker
`max_pool_size`	int	50	Max connections per worker
`idle_timeout`	duration	5m	Idle backend eviction time (e.g. `30s`, `5m`)
`client_connect_timeout`	duration	10s	Frontend connect timeout
`client_idle_timeout`	duration	5m	Client idle session eviction
`backend_connect_timeout`	duration	5s	Backend TCP connect + auth timeout
`transaction_tracking`	bool	false	XID probe + commit-in-doubt recovery (requires `mode = full`)
`fast_network_path`	bool	false	Enable MSG_PEEK + splice zero-copy DataRow bypass
`result_cache`	bool	false	Enable result cache framework (disables splice bypass)

[worker_group.NAME.servers] — Backends

Syntax

[worker_group.main.servers]
# name = host=HOST port=PORT dbname=DB user=USER password=PASS role=ROLE weight=N
primary  = host=db1.local port=5432 dbname=app user=app password=s role=RW weight=100
replica1 = host=db2.local port=5432 dbname=app user=app password=s role=RO weight=100
replica2 = host=db3.local port=5432 dbname=app user=app password=s role=RO weight=50

Role values: RW (primary, read+write), RO (replica, read-only), auto (detected via pg_is_in_recovery() or Patroni REST API).

[probe] — Health Checks

Key	Type	Default	Description
`probe`	enum	postgres	`postgres` \| `patroni` \| `mysql` \| `tcp` \| `exec`
`probe_interval`	duration	5s	Health check interval
`probe_timeout`	duration	3s	Per-probe timeout
`probe_retries`	int	3	Failures before marking backend down
`failover_delay`	duration	10s	Hold-off before re-routing after failover detected

Prepared Statements

Prepared statements are a challenge for connection pooling because they are bound to a specific backend connection. Keel offers four strategies:

Strategy	How It Works	Best For
`virtualize` (Stable)	Keel stores the prepared statement text in the client's session and re-issues `Parse` on whichever backend is borrowed. The client never sees a mismatch.	ORMs and drivers (Hibernate, GORM, SQLAlchemy, pgx, Prisma)
`pinning`	Session is pinned to one backend once a prepared statement is active. Released on `DEALLOCATE ALL`.	Applications that heavily use named PS and can tolerate pinning
`tracking`	Simple PS name tracking — maps client PS names to the same names on the backend. Requires stable backend assignment.	Simple extended-query workloads
`anonymous`	Rewrites all named PS to anonymous (unnamed) `''`. Lowest overhead, breaks applications that rely on PS names.	Bulk insert / simple ORM-free workloads

💡

Recommended

Use prepared_statement = virtualize for all ORM-heavy applications. It is the only strategy that works correctly with transaction pooling and arbitrary backend reassignment.

Query Rules

Declarative INI-based rules evaluated before normal routing. No hook code required.

keel.ini

# Block a specific query pattern
[query_rule.block_truncate]
match      = ^\s*TRUNCATE
block      = true
block_msg  = "TRUNCATE is not allowed through the proxy"

# Route analytics queries to replicas
[query_rule.analytics_replica]
match      = ^\s*SELECT.*FROM\s+(events|metrics|logs)
route_to   = replica

# Rewrite legacy table name
[query_rule.rewrite_legacy]
match      = FROM old_users(\b.*)
rewrite_to = FROM users\1

# Throttle heavy reports
[throttle.reports]
match          = SELECT.*GENERATE_SERIES
rate_per_second = 5
burst          = 10

Key	Description
`match`	POSIX ERE regex matched against the SQL text
`route_to`	`primary` \| `replica` \| `any`
`block`	`true` — return a synthetic error instead of executing
`rewrite_to`	SQL replacement string (capture group references supported)
`user`	Limit rule to a specific PostgreSQL username
`database`	Limit rule to a specific database name
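
The `rate_per_second` and `burst` keys in the throttle rule map naturally onto a token bucket. A minimal sketch of that technique follows; the class name and the explicit `now` argument (used so the example is deterministic) are assumptions and do not describe Keel's internal limiter.

```python
class TokenBucket:
    """Token-bucket limiter: refills at `rate_per_second`, holds at most `burst` tokens."""

    def __init__(self, rate_per_second: float, burst: int, now: float = 0.0):
        self.rate = rate_per_second
        self.capacity = burst
        self.tokens = float(burst)   # start full so an initial burst is allowed
        self.last = now

    def allow(self, now: float) -> bool:
        """Consume one token if available; `now` is a timestamp in seconds."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

bucket = TokenBucket(rate_per_second=5, burst=10)
burst_results = [bucket.allow(now=0.0) for _ in range(11)]    # 10 pass, the 11th is throttled
after_one_second = [bucket.allow(now=1.0) for _ in range(6)]  # 5 tokens refilled, 6th throttled
```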

Read/Write Splitting

Available in smart and full modes. Keel's SQL parser classifies every query and routes it to the appropriate backend.

Routing Rules

Writes → primary: INSERT, UPDATE, DELETE, DDL, CALL, MERGE
Reads → replica (by weight): SELECT (without FOR UPDATE)
Locking reads → primary: SELECT ... FOR UPDATE, SELECT ... FOR SHARE
Sticky-primary window: After a write, reads are temporarily pinned to the primary (100 ms default) before replica routing resumes
Open transactions → pinned to whichever backend received BEGIN
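
The first four rules can be approximated with a keyword check and a timestamp. This sketch is deliberately naive: Keel classifies queries with a SQL parser, whereas the code below only inspects the first word and a regex, leaves out transaction pinning, and sends unrecognised statements to the primary as a conservative assumption. All names are hypothetical.

```python
import re

WRITE_VERBS = {"INSERT", "UPDATE", "DELETE", "MERGE", "CALL",
               "CREATE", "ALTER", "DROP", "TRUNCATE"}
LOCKING_READ = re.compile(r"\bFOR\s+(UPDATE|SHARE)\b", re.IGNORECASE)

class ReadWriteRouter:
    def __init__(self, sticky_window_s: float = 0.1):
        self.sticky_window_s = sticky_window_s   # 100 ms default from the rules above
        self.last_write_at = None

    def route(self, sql: str, now: float) -> str:
        """Return 'primary' or 'replica' for one statement; `now` is in seconds."""
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb in WRITE_VERBS:
            self.last_write_at = now
            return "primary"
        if verb != "SELECT" or LOCKING_READ.search(sql):
            return "primary"                     # locking reads and unknown statements
        if (self.last_write_at is not None
                and now - self.last_write_at < self.sticky_window_s):
            return "primary"                     # still inside the sticky-primary window
        return "replica"

router = ReadWriteRouter()
decisions = [
    router.route("SELECT * FROM users", now=0.0),
    router.route("INSERT INTO users VALUES (1)", now=1.0),
    router.route("SELECT * FROM users", now=1.05),
    router.route("SELECT * FROM users", now=1.5),
    router.route("SELECT * FROM users FOR UPDATE", now=2.0),
]
print(decisions)  # ['replica', 'primary', 'primary', 'replica', 'primary']
```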

Cross-Service Read-After-Write

Application SQL

-- After writing in service A, propagate the WAL position:
SHOW keel.write_lsn;   -- returns e.g. '0/15E3B20'

-- In service B, enforce read-after-write consistency:
SET keel.read_after_lsn = '0/15E3B20';
-- Keel will route to primary if no replica is caught up to that LSN
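
The caught-up check comes down to comparing LSNs numerically. A PostgreSQL LSN is printed as two hexadecimal numbers separated by a slash, so comparing the strings directly gives wrong answers. The sketch below assumes each replica's replay position is already known (for example from `pg_last_wal_replay_lsn()`); the function names and the replica map are hypothetical.

```python
def parse_lsn(text: str) -> int:
    """Convert a PostgreSQL LSN such as '0/15E3B20' into a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)

def choose_backend(read_after_lsn: str, replica_lsns: dict) -> str:
    """Return the first replica that has replayed the target LSN, else 'primary'."""
    target = parse_lsn(read_after_lsn)
    for name, replayed in replica_lsns.items():
        if parse_lsn(replayed) >= target:
            return name
    return "primary"

print(choose_backend("0/15E3B20", {"replica1": "0/15E3A00", "replica2": "0/15E3C00"}))  # replica2
print(choose_backend("0/15E3B20", {"replica1": "0/15E3A00"}))                           # primary
```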

Sharding

Experimental

Keel supports transparent horizontal sharding via SQL AST shard-key extraction. Applications connect to a single Keel endpoint and Keel automatically routes queries to the correct shard.

Configuration

keel.ini

[keel]
experimental_features = true

[shard_rule.orders]
table        = orders
column       = user_id
shard_count  = 4
strategy     = hash    # hash | range | modulo

[shard_backend.0]
host = shard0.local
port = 5432
database = mydb
user = app

[shard_backend.1]
host = shard1.local
port = 5432
database = mydb
user = app

# ... shard_backend.2, shard_backend.3
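
To make the `hash` and `modulo` strategies concrete, here is a sketch of how a shard index could be derived from `user_id`. The hash function Keel actually uses is not documented here, so CRC-32 from the standard library stands in for it; `shard_index` is a hypothetical name, and the `range` strategy is omitted because it needs boundary configuration not shown above.

```python
import zlib

def shard_index(user_id: int, shard_count: int, strategy: str = "hash") -> int:
    """Map a shard-key value to a shard number in [0, shard_count)."""
    if strategy == "modulo":
        return user_id % shard_count
    if strategy == "hash":
        # CRC-32 is a placeholder hash, chosen only because it is deterministic.
        digest = zlib.crc32(str(user_id).encode("ascii"))
        return digest % shard_count
    raise ValueError(f"unsupported strategy: {strategy}")

print(shard_index(42, shard_count=4, strategy="modulo"))   # 2
hashed = shard_index(42, shard_count=4)                    # some value in 0..3
```

Whatever hash is used, it must be stable across restarts and across proxy nodes, otherwise the same key would be routed to different shards.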

Admin Commands

psql (admin port)

SHOW SHARD RULES;
EXPLAIN SHARD PLAN FOR 'SELECT * FROM orders WHERE user_id = 42';

Scatter-Merge Queries

Experimental

When sharding is enabled, Keel can fan out aggregation queries to all shards and merge the results transparently — no application changes needed.

Supported Aggregations

COUNT, SUM, AVG, MIN, MAX, COUNT(DISTINCT col)
GROUP BY, HAVING, ORDER BY
LIMIT / OFFSET with correctness fix (strips per-shard limits before merging)

Example

-- Application sends this single query to Keel:
SELECT status, COUNT(*) AS cnt, SUM(amount) AS total
FROM orders
GROUP BY status
ORDER BY total DESC
LIMIT 5;

-- Keel fans out to all 4 shards, merges groups, sorts, returns top-5.
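
The merge step for the query above can be sketched as follows: partial COUNT and SUM values are added per group, and the sort and LIMIT are applied only after merging. The function name and the sample rows are made up for illustration.

```python
from collections import defaultdict

def merge_shards(shard_rows: list, limit: int) -> list:
    """Merge per-shard (status, cnt, total) rows, then sort by total and apply LIMIT."""
    merged = defaultdict(lambda: [0, 0])
    for rows in shard_rows:
        for status, cnt, total in rows:
            merged[status][0] += cnt      # COUNT(*) partials add up
            merged[status][1] += total    # SUM(amount) partials add up
    ordered = sorted(((s, c, t) for s, (c, t) in merged.items()),
                     key=lambda row: row[2], reverse=True)
    return ordered[:limit]

shard0 = [("paid", 3, 300), ("open", 1, 10)]
shard1 = [("paid", 2, 150), ("refunded", 1, 500)]
print(merge_shards([shard0, shard1], limit=2))
# [('refunded', 1, 500), ('paid', 5, 450)]
```

If each shard had applied the LIMIT itself, shard0 could have dropped rows that belong in the merged top groups, which is why per-shard limits are stripped before the fan-out. AVG is not shown; it has to be rebuilt from per-shard SUM and COUNT rather than by averaging averages.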

Authentication Methods

Method	Protocol	Notes
`scram-sha-256`	PostgreSQL	Default. Full SCRAM-SHA-256 with OpenSSL
`md5`	PostgreSQL	Legacy password auth
`caching_sha2_password`	MySQL	MySQL 8+ default
`mysql_native_password`	MySQL	MySQL legacy
`trust`	Both	No password required
`pam`	PostgreSQL	System PAM authentication (runs in separate thread)
`ldap`	PostgreSQL	LDAP bind + search with session-level result caching
`cert`	PostgreSQL	mTLS certificate identity — extracts username from CN/SAN

TLS / mTLS / kTLS

keel.ini — TLS config

[worker_group.main]
# Frontend TLS (clients → Keel)
tls_mode        = require          # disable | allow | prefer | require
tls_cert_file   = /etc/keel/server.crt
tls_key_file    = /etc/keel/server.key
tls_ca_file     = /etc/keel/ca.crt  # required for mTLS
tls_min_version = 1.3              # 1.2 | 1.3
tls_ciphers     = TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256

# Backend TLS (Keel → PostgreSQL)
backend_tls     = require
backend_tls_ca  = /etc/keel/backend-ca.crt

# Kernel TLS offload (kTLS) — Linux 4.13+
ktls            = on

Certificate Hot-Reload

Send SIGHUP to reload certificates without restarting:

Terminal

sudo systemctl reload keel
# Or directly:
kill -HUP $(cat /var/run/keel/keel.pid)

# Or reload and verify via the admin console:
psql -h localhost -p 9187 -c "RELOAD CERTS"
psql -h localhost -p 9187 -c "SHOW CERTIFICATES"

Cloud IAM Authentication

Keel handles cloud-native authentication token generation and rotation automatically. No external scripts needed.

AWS RDS IAM

keel.ini

[worker_group.main.servers]
primary = host=mydb.cluster-xxx.us-east-1.rds.amazonaws.com port=5432 \
          dbname=app user=app role=RW \
          auth=aws_iam region=us-east-1

# IAM tokens cached for 14 minutes and auto-refreshed.

GCP Cloud SQL IAM

keel.ini

[worker_group.main.servers]
primary = host=/cloudsql/project:region:instance port=5432 \
          dbname=app user=app@project.iam role=RW auth=gcp_iam

Azure AD / Entra ID

keel.ini

[worker_group.main.servers]
primary = host=myserver.postgres.database.azure.com port=5432 \
          dbname=app user=app@myserver role=RW auth=azure_ad

Prometheus Metrics

Metrics are exposed at GET /metrics on the admin port (default: 9187).

Key metrics

keel_connections_active{role="client"}         # active client sessions
keel_connections_active{role="backend"}        # active backend connections
keel_pool_size{worker="0",state="idle"}        # idle pool entries
keel_pool_size{worker="0",state="active"}      # borrowed pool entries
keel_pool_borrows_total                        # total pool borrows
keel_pool_wait_queue_enqueued                  # clients waiting for a connection
keel_queries_total{type="read"}                # total read queries
keel_queries_total{type="write"}               # total write queries
keel_query_duration_seconds{quantile="0.99"}   # P99 query latency
keel_tls_handshakes_total                      # TLS handshakes
keel_migrations_sent_total                     # session migrations sent
keel_router_scatter_merge_duration_seconds     # scatter-merge latency histogram

Grafana Dashboard

A pre-built Grafana dashboard is available at etc/grafana/keel-dashboard.json. Import it directly into Grafana.

Admin Console

Connect with any PostgreSQL client to the admin port:

Admin commands

psql -h localhost -p 9187

SHOW POOL;          -- per-worker pool status
SHOW WORKERS;       -- worker statistics
SHOW REBALANCE;     -- rebalance metrics
SHOW STATS;         -- aggregate counters
SHOW SHARD RULES;   -- shard routing rules
SHOW CERTIFICATES;  -- loaded TLS certs
EXPLAIN SHARD PLAN FOR 'SELECT ...';  -- routing plan
RELOAD CERTS;       -- hot-reload TLS certs

OpenTelemetry Tracing

Keel injects the W3C traceparent value into each traced query as a SQL block comment and exports spans via OTLP/HTTP.

keel.ini

[worker_group.main]
tracing           = on
trace_sample_rate = 0.1              # 10% head-based sampling
otlp_endpoint     = http://otel-collector:4318/v1/traces

Each traced query will appear in the PostgreSQL log as:

/* traceparent=00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01 */
SELECT * FROM users WHERE id = $1
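
The comment format follows the W3C Trace Context layout: version, 32-hex-digit trace ID, 16-hex-digit parent span ID, and a flags byte whose lowest bit marks the trace as sampled. A small sketch of building such a comment is shown below; the helper names are hypothetical and this is not Keel's own code.

```python
import re
import secrets

TRACEPARENT_RE = re.compile(r"^[0-9a-f]{2}-[0-9a-f]{32}-[0-9a-f]{16}-[0-9a-f]{2}$")

def make_traceparent(trace_id: str, span_id: str, sampled: bool) -> str:
    """Assemble a version-00 traceparent value."""
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"

def annotate(sql: str, traceparent: str) -> str:
    """Prefix a query with the traceparent block comment."""
    return f"/* traceparent={traceparent} */\n{sql}"

# Fixed IDs reproduce the example above; fresh IDs would come from a random source.
fixed = make_traceparent("4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7", sampled=True)
fresh = make_traceparent(secrets.token_hex(16), secrets.token_hex(8), sampled=False)
print(annotate("SELECT * FROM users WHERE id = $1", fixed))
```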

Logging

Format	Config	Description
Text	`log_format = text`	Human-readable timestamped log lines
JSON (NDJSON)	`log_format = json`	Structured JSON with `trace_id`, `span_id` correlation fields

Query Logging

keel.ini

[worker_group.main]
query_log        = on
query_log_mode   = all    # all | read | write | none
query_log_file   = /var/log/keel/queries.log

Audit Logging

keel.ini

[audit]
enabled       = true
events        = auth,admin,query     # auth | admin | query | pool
output        = file
file          = /var/log/keel/audit.log
format        = json

Hook System

Available in full mode. Four hook points in the query pipeline:

Hook Point	When	Can modify
`after_query_read`	Raw query bytes received, before parse	SQL text, flags
`after_query_parse`	SQL parsed into AST	Query tree, routing hint
`before_route`	Before routing decision	Route override, session metadata
`before_send`	Before forwarding to backend	Final SQL text, abort

Hooks can return true (continue) or false (abort). Multiple hooks at the same point execute in priority order. Lua, Python, and native .so hooks can be mixed at any point.

Lua & Python Hooks

Lua Hook Example

hooks/route.lua

-- Route queries from 'analytics' user to replica
function before_route(ctx)
    if ctx.user == "analytics" then
        ctx.route = "replica"
    end
    return true  -- continue
end

keel.ini

[worker_group.main]
mode = full
hook.before_route = /etc/keel/hooks/route.lua

Python Hook Example

hooks/block.py

def after_query_parse(ctx):
    # Block DROP TABLE in production
    if ctx.query_type == "DDL" and "DROP TABLE" in ctx.sql.upper():
        ctx.abort_message = "DROP TABLE is not allowed"
        return False
    return True

Hot Reload

Send SIGHUP to reload configuration without restarting:

Terminal

sudo systemctl reload keel
# Or:
kill -HUP $(pgrep -x keel)

What Can Be Hot-Reloaded

Pool sizes (min_pool_size, max_pool_size)
Timeouts and probe settings
Server weights
TLS certificates (atomic SSL_CTX swap)
Log level and format
Shard rules
Query rules and throttle rules
Audit configuration
Rebalancing configuration

🚫

Cannot hot-reload

Bind address, port, protocol, number of workers, and runtime mode require a full restart.

Multi-Proxy HA Cluster

Two or three Keel instances can form a cluster with heartbeat-based peer monitoring and configuration gossip. Each node runs independently — the cluster layer propagates config changes and detects peer failures without affecting the query hot path.

keel.ini

[cluster]
enabled              = true
node_id              = keel-0
gossip_port          = 7000
peers                = keel-1.local:7000,keel-2.local:7000
heartbeat_interval   = 2s
compression          = zstd   # zlib | zstd | none (for WAN links)

Kubernetes

Helm Chart

Terminal

helm install keel ./helm/keel \
  --set image.tag=latest \
  --set config.bindPort=6432 \
  --set config.numWorkers=4 \
  --set config.maxPoolSize=50

Environment Variables

All INI settings can be overridden with KEEL_* environment variables:

Kubernetes Deployment env

env:
  - name: KEEL_BIND_PORT
    value: "6432"
  - name: KEEL_NUM_WORKERS
    value: "4"
  - name: KEEL_MAX_POOL_SIZE
    value: "50"
  - name: KEEL_MODE
    value: "pool"
  - name: KEEL_SERVER_PASSWORD
    valueFrom:
      secretKeyRef:
        name: db-credentials
        key: password

Health Endpoints

Kubernetes liveness / readiness probes

livenessProbe:
  httpGet:
    path: /healthz
    port: 9187
  initialDelaySeconds: 5
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /ready
    port: 9187

HPA Scaling

Scale on the keel_pool_wait_queue_enqueued metric: the number of clients waiting for a connection is the best signal that the proxy is under load.

Docker

docker-compose.yml

services:
  keel:
    image: ghcr.io/virtlabs-io/keel:latest
    ports:
      - "6432:6432"
      - "9187:9187"
    volumes:
      - ./keel.ini:/etc/keel/keel.ini:ro
    environment:
      KEEL_LOG_LEVEL: info
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9187/healthz"]
      interval: 10s
      timeout: 3s
      retries: 3

Multi-arch images are available for linux/amd64 and linux/arm64.

Health & Failover

Probe Types

Probe	Protocol	Description
`postgres`	PostgreSQL	Executes `pg_is_in_recovery()` to detect role
`patroni`	HTTP	Polls `GET /cluster` for member roles and health
`mysql`	MySQL	Checks replication lag and primary status
`tcp`	TCP	Simple port reachability check
`exec`	Shell	Runs a custom script; exit code 0 = healthy

Graceful Drain

Lifecycle: CREATED → ACTIVE → DRAINING → STOPPING → STOPPED. During DRAINING, Keel rejects new connections with PostgreSQL error 57P03 (cannot_connect_now) and waits for in-flight transactions to complete before shutting down backends.
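
The lifecycle reads naturally as a linear state machine. The sketch below is an illustration under assumptions: the class and method names are hypothetical, and treating every state other than ACTIVE as rejecting new connections is an assumption (the text above specifies the 57P03 rejection only for DRAINING).

```python
from enum import Enum
from typing import Optional

class State(Enum):
    CREATED = 1
    ACTIVE = 2
    DRAINING = 3
    STOPPING = 4
    STOPPED = 5

CANNOT_CONNECT_NOW = "57P03"   # PostgreSQL SQLSTATE cannot_connect_now

class Lifecycle:
    def __init__(self):
        self.state = State.CREATED
        self.in_flight = 0         # transactions currently executing

    def advance(self) -> State:
        """Move to the next state; draining only ends once nothing is in flight."""
        order = list(State)        # Enum iteration follows definition order
        index = order.index(self.state)
        if index == len(order) - 1:
            raise RuntimeError("already STOPPED")
        if self.state is State.DRAINING and self.in_flight > 0:
            raise RuntimeError("transactions still in flight")
        self.state = order[index + 1]
        return self.state

    def on_connect(self) -> Optional[str]:
        """Return the SQLSTATE to reject with, or None to accept the client."""
        return None if self.state is State.ACTIVE else CANNOT_CONNECT_NOW

lc = Lifecycle()
lc.advance()                       # CREATED -> ACTIVE
accepted = lc.on_connect()         # None: connection accepted
lc.in_flight = 1
lc.advance()                       # ACTIVE -> DRAINING
rejected = lc.on_connect()         # '57P03'
```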
