Realtime Operations Runbook
This runbook covers day-2 operations for realtime datasources (sse, websocket, mqtt).
Scope
- Gateway realtime channel (/gateway/realtime)
- Broker profile administration for MQTT
- Public/link authorization revocation behavior
- Deterministic local fixtures for validation
Prerequisites
- API and gateway are running with shared:
  - JWT_GATEWAY_SECRET
  - GATEWAY_SERVICE_TOKEN
- Gateway can reach API (GATEWAY_API_BASE_URL).
- Realtime is enabled (REALTIME_ENABLED=true).
- Secret setup/rotation process is managed via Secrets Operations Runbook.
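The prerequisites above can be checked before starting a validation run. The following is a minimal sketch, not a project script: the function name check_realtime_prereqs is hypothetical, and it only confirms that the variables are present in the current shell. It cannot confirm that API and gateway actually share the same secret values.

```bash
# Hypothetical preflight helper: verifies the prerequisite variables are set.
check_realtime_prereqs() {
  local missing=0 name value
  for name in JWT_GATEWAY_SECRET GATEWAY_SERVICE_TOKEN GATEWAY_API_BASE_URL; do
    # Read the variable whose name is stored in $name (names come from the fixed list above).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  if [ "${REALTIME_ENABLED:-}" != "true" ]; then
    echo "REALTIME_ENABLED is not set to true" >&2
    missing=1
  fi
  # Exit status 0 only when every prerequisite is present.
  [ "$missing" -eq 0 ]
}
```

Run it once in the API environment and once in the gateway environment; a non-zero exit status means at least one prerequisite is missing.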
Baseline Validation (Local)
- Start fixture stack:

```bash
npm run demo:realtime:up
```

- Run smoke checks:

```bash
npm run demo:realtime:smoke
```

- Stop fixtures when done:

```bash
npm run demo:realtime:down
```

One-shot integration loop:

```bash
npm run test:realtime:integration
```

MQTT Setup Workflow
- In Admin Console, create a credential profile (type: basic) if broker auth is required.
- Create a broker profile:
  - protocol: mqtt
  - broker URL: mqtt://... or mqtts://...
  - optional credential profile reference
  - optional topic allowlist
  - allowPublicUse only when needed for public/link dashboards
- In dashboard datasource dialog, create MQTT datasource referencing broker profile + topic.
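Before entering a broker URL in the Admin Console, it can help to confirm that it uses one of the two schemes listed above. This is a small sketch under that single assumption; is_mqtt_broker_url is a hypothetical helper, the host names in the usage lines are placeholders, and the Admin Console may apply additional validation that is not shown here.

```bash
# Hypothetical helper: accept only mqtt:// or mqtts:// URLs with something after the scheme.
is_mqtt_broker_url() {
  case "$1" in
    mqtt://?*|mqtts://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (placeholder hosts; 1883 and 8883 are the conventional MQTT and MQTT-over-TLS ports):
#   is_mqtt_broker_url "mqtts://broker.example.internal:8883" && echo "scheme ok"
#   is_mqtt_broker_url "http://broker.example.internal"       || echo "rejected"
```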
Revocation Behavior
Public/link realtime access is revoked by:
- share token rotation
- visibility changes affecting external access
- public-use policy changes on credential/broker profiles
Gateway enforcement path:
- Polls API revocation feed (/internal/gateway/revoked-tokens) on interval.
- Disconnects stale public/link subscriptions.
- Runs periodic full revalidation as fallback.
- If feed cursor expires, performs immediate full revalidation and resumes incremental polling.
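The enforcement path above can be read as a small decision per poll cycle. The sketch below only illustrates that ordering. The outcome names (ok, cursor_expired) and the printed action labels are hypothetical, since the real feed response format is not described in this runbook.

```bash
# Hypothetical planner for one poll cycle.
#   $1: feed outcome, "ok" or "cursor_expired" (assumed names)
#   $2: "yes" if the periodic full revalidation interval has elapsed, otherwise "no"
plan_revocation_cycle() {
  local feed_outcome="$1" full_revalidate_due="$2"
  case "$feed_outcome" in
    ok)
      echo "disconnect_revoked"        # drop stale public/link subscriptions
      if [ "$full_revalidate_due" = "yes" ]; then
        echo "full_revalidate"         # periodic fallback
      fi
      ;;
    cursor_expired)
      echo "full_revalidate"           # incremental state is unusable: revalidate now
      echo "resume_incremental"        # then continue polling from a fresh cursor
      ;;
    *)
      echo "unknown feed outcome: $feed_outcome" >&2
      return 1
      ;;
  esac
}
```

The point of the ordering is that an expired cursor never leaves a window in which revoked subscriptions stay connected: the full revalidation runs before incremental polling resumes.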
Key Realtime Policy Knobs
Connection and rate limits:
- REALTIME_MAX_CLIENT_CONNECTIONS_PER_IP
- REALTIME_MAX_CONNECTIONS_PER_DASHBOARD
- REALTIME_MAX_SUBSCRIPTIONS_PER_CONNECTION
- REALTIME_CONNECT_RATE_LIMIT_IP_PER_MIN
- REALTIME_PUBLIC_SUBSCRIBE_RATE_LIMIT_IP_PER_MIN
- REALTIME_PUBLIC_SUBSCRIBE_RATE_LIMIT_SHARE_TOKEN_PER_MIN
- REALTIME_TRUST_PROXY_HOPS
- API_TRUST_PROXY_HOPS
- REALTIME_LIMITER_FAILURE_MODE
- GATEWAY_LIMITER_TIMEOUT_MS
Revalidation:
- REALTIME_PUBLIC_REVALIDATE_INTERVAL_MS
- REALTIME_PUBLIC_FULL_REVALIDATE_INTERVAL_MS
- REALTIME_REVOKE_EVENT_RETENTION_SECONDS
Protocol toggles:
- REALTIME_SSE_ENABLED
- REALTIME_WS_ENABLED
- REALTIME_MQTT_ENABLED
MQTT policy:
- REALTIME_MQTT_ALLOWED_TOPICS
- REALTIME_MQTT_MAX_QOS
- REALTIME_MQTT_MAX_CONNECTIONS_PER_BROKER
- REALTIME_MQTT_IDLE_DISCONNECT_MS
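When reviewing these knobs during an incident, a quick listing of what is actually set can save time. The helper below is a sketch with a hypothetical name (show_realtime_knobs). It reports only exported variables in the shell where it runs, which may differ from the environment of an already running service.

```bash
# Hypothetical helper: list realtime-related knobs exported in the current environment.
show_realtime_knobs() {
  env \
    | grep -E '^(REALTIME_[A-Z0-9_]+|API_TRUST_PROXY_HOPS|GATEWAY_LIMITER_TIMEOUT_MS)=' \
    | sort
}
```

Knobs that do not appear in the output are unset in that shell, so the service would fall back to whatever defaults it defines.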
Production Hardening Checklist
- Set explicit EGRESS_ALLOWED_HOSTS and EGRESS_ALLOWED_PORTS.
- Keep EGRESS_ALLOW_PRIVATE_DESTINATIONS=false unless intentionally required.
- Keep EGRESS_ALLOW_INSECURE_TLS=false.
- Set MQTT allowlists (global and/or broker-level) before enabling MQTT broadly.
- Keep protocol toggles off for unused transports.
- Validate REALTIME_TRUST_PROXY_HOPS and API_TRUST_PROXY_HOPS for reverse-proxy deployments.
- Ensure edge reverse proxies overwrite X-Forwarded-For instead of appending untrusted inbound values.
- Keep SECURITY_LIMITER_BACKEND aligned with DB_BACKEND and SECURITY_LIMITER_FAILURE_MODE=fail-closed for non-dev runtime.
- Keep REALTIME_LIMITER_FAILURE_MODE=fail-closed unless a controlled degraded-mode runbook explicitly allows fail-open.
- Use Secrets Operations Runbook for token/key rotation windows.
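Several checklist items name an exact expected value, so they lend themselves to an automated audit. The following is a sketch only: audit_realtime_hardening is a hypothetical helper, and it treats unset variables as failures because this runbook does not state the defaults. An intentional exception (for example, private egress destinations that are genuinely required) will also be reported and needs a human decision.

```bash
# Hypothetical audit helper for the checklist items that have a fixed expected value.
audit_realtime_hardening() {
  local failures=0 pair name expected actual
  for pair in \
    EGRESS_ALLOW_PRIVATE_DESTINATIONS=false \
    EGRESS_ALLOW_INSECURE_TLS=false \
    SECURITY_LIMITER_FAILURE_MODE=fail-closed \
    REALTIME_LIMITER_FAILURE_MODE=fail-closed
  do
    name="${pair%%=*}"        # text before the first "="
    expected="${pair#*=}"     # text after the first "="
    eval "actual=\${$name:-}"
    if [ "$actual" != "$expected" ]; then
      echo "FAIL $name: expected '$expected', got '${actual:-<unset>}'" >&2
      failures=$((failures + 1))
    fi
  done
  # The egress allowlists only need to be explicit; their contents are deployment-specific.
  for name in EGRESS_ALLOWED_HOSTS EGRESS_ALLOWED_PORTS; do
    eval "actual=\${$name:-}"
    if [ -z "$actual" ]; then
      echo "FAIL $name: must be set explicitly" >&2
      failures=$((failures + 1))
    fi
  done
  [ "$failures" -eq 0 ]
}
```

The remaining items (MQTT allowlists, proxy hop counts, X-Forwarded-For handling, SECURITY_LIMITER_BACKEND alignment) depend on the deployment topology and are left to manual review.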
Limiter Outage Response Expectations
Default non-dev behavior is fail-closed.
- API limiter backend unavailable:
  - login can return temporary-unavailable failures.
  - datasource session mint and gateway internal limiter consume paths can return 503.
- Gateway realtime limiter unavailable:
  - connect and public subscribe limiter paths return temporary-unavailable behavior (503).
Temporary fail-open is an exception path only:
- require explicit change approval window and incident tracking.
- monitor backend_error and fail_open limiter metrics continuously.
- revert to fail-closed after backend recovery.
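The difference between the two failure modes can be summarized as a small decision table. The sketch below is illustrative only: the outcome names (allowed, limited, backend_error) and the printed labels are hypothetical, and the response used for an ordinary rate-limit rejection is not specified in this runbook.

```bash
# Hypothetical decision sketch.
#   $1: failure mode, "fail-closed" or "fail-open"
#   $2: limiter outcome, "allowed", "limited", or "backend_error" (assumed names)
limiter_decision() {
  local failure_mode="$1" limiter_outcome="$2"
  case "$limiter_outcome" in
    allowed) echo "allow" ;;
    limited) echo "reject rate-limited" ;;
    backend_error)
      if [ "$failure_mode" = "fail-open" ]; then
        echo "allow fail_open"     # exception path: should show up in the fail_open metric
      else
        echo "reject 503"          # default non-dev behavior
      fi
      ;;
    *)
      echo "unknown limiter outcome: $limiter_outcome" >&2
      return 1
      ;;
  esac
}
```

Only the backend_error row changes between modes, which is why a rising fail_open metric is the signal that requests are being admitted without limiter protection.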
For staged deployment, canary watchlist, and rollback order, use: