Production Limits and Capacity
Version 0.5 has four layers of admission and capacity controls. Each layer documents its own behavior on its own page; this page shows the precedence order and how the layers compose, which is what operators need when tuning production limits.
The four layers, in firing order
A request travels through these gates in order. Each layer is opt-in or has documented defaults; each can be configured independently.
| Order | Layer | Where | What it caps | When it fires |
|---|---|---|---|---|
| 1 | Handshake admission | websocket.upgradeAdmission | Concurrent connections, upgrade rate | HTTP -> WS handshake |
| 2 | Worker pressure | websocket.pressure + platform.onPressure | Memory, per-topic publish rate, subscriber counts | Any message dispatch |
| 3 | Message-tier admission | createAdmissionControl (extensions) | Per-RPC-class admission rules | Inside createMessage’s beforeExecute |
| 4 | Per-primitive capacity caps | MAX_* constants + plugin maxBuckets/maxTopics/etc. | Map / Set growth bounds | Inside individual operations |
A request that fails at any layer never reaches the next. A request must pass all four to execute the handler.
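As a mental model, the chain can be sketched as a list of gates consulted in order, where the first rejection short-circuits. This is a simplified sketch; every name in it is illustrative, not an adapter API:

```js
// Illustrative model of the four-gate chain. Each gate returns a
// rejection reason or null; the first rejection short-circuits.
const gates = [
  function handshakeAdmission(req) {
    // Layer 1: hard cap on concurrent connections
    return req.concurrent >= req.maxConcurrent ? 'HTTP_503' : null;
  },
  function workerPressure(req) {
    // Layer 2 is a signal, not a hard cap (see below); it never rejects
    return null;
  },
  function messageAdmission(req) {
    // Layer 3: shed low-priority RPC classes while under pressure
    return req.pressure !== 'NONE' && req.class === 'background'
      ? 'OVERLOADED'
      : null;
  },
  function primitiveCaps(req) {
    // Layer 4: bounded growth of internal Maps / Sets
    return req.mapSize >= req.maxEntries ? 'SATURATED' : null;
  }
];

function admit(req) {
  for (const gate of gates) {
    const rejection = gate(req);
    if (rejection) return { admitted: false, layer: gate.name, rejection };
  }
  return { admitted: true };
}
```

A request denied at layer 1 never shows up in layers 2-4, which is why the layers can keep independent bookkeeping.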
Layer 1: Handshake admission
Caps the connection count and paces upgrade floods. Fires before the WebSocket is established.
```js
// svelte.config.js
adapter({
  websocket: {
    upgradeAdmission: {
      maxConcurrent: 10_000,
      perTickBudget: 100
    },
    upgradeRateLimit: 10,
    upgradeRateLimitWindow: 10
  }
});
```

| Knob | Default | Effect on breach |
|---|---|---|
| upgradeRateLimit (per IP per window) | 10 per 10s | HTTP 429 |
| upgradeAdmission.maxConcurrent | unset (opt-in) | HTTP 503 fast-fail |
| upgradeAdmission.perTickBudget | unset (opt-in) | Paced via setImmediate |
When to set. A small app behind a load balancer with finite worker capacity. maxConcurrent is your hard cap; the LB sees 503 and routes elsewhere. perTickBudget smooths connection-burst storms (e.g. a deploy that disconnects 50k clients simultaneously).
This layer sets the floor for everything else: a user denied here never appears in any plugin’s bookkeeping.
Layer 2: Worker pressure
Runtime backpressure signal, computed on the worker. Precedence inside this layer: MEMORY > PUBLISH_RATE > SUBSCRIBERS > NONE.
```js
// svelte.config.js
adapter({
  websocket: {
    pressure: {
      memory: { thresholdMB: 1024 },
      publishRate: {
        topicPublishRatePerSec: 5000,
        topicPublishBytesPerSec: 10 * 1024 * 1024
      },
      subscribers: { perTopic: 50_000 }
    }
  }
});
```

Pressure is a signal, not a hard cap. Code that wants to shed under pressure consumes it:
```js
// In your message handler:
const { reason } = platform.pressure;
if (reason !== 'NONE') {
  // throttle, queue, or reject as appropriate
}

// Or subscribe to transitions:
platform.onPressure(({ reason }) => {
  metrics.gauge('ws_pressure', 1, { reason });
});
```

When to set. Always. Even if none of your code reads the signal, the framework’s own backoff and throttle plugins consume it. Disable individual thresholds by setting them to false.
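The precedence inside this layer can be modeled as an ordered threshold check. The following is a simplified sketch, not the adapter's implementation; the `cfg` shape mirrors the `pressure` config above, and the input field names are illustrative:

```js
// Simplified model of worker-pressure precedence: the first threshold
// breached, checked in priority order, determines the reported reason.
// A threshold set to false is skipped entirely.
function pressureReason({ heapMB, publishRate, maxSubscribers }, cfg) {
  if (cfg.memory && heapMB >= cfg.memory.thresholdMB) {
    return 'MEMORY';
  }
  if (cfg.publishRate && publishRate >= cfg.publishRate.topicPublishRatePerSec) {
    return 'PUBLISH_RATE';
  }
  if (cfg.subscribers && maxSubscribers >= cfg.subscribers.perTopic) {
    return 'SUBSCRIBERS';
  }
  return 'NONE';
}
```

Note the masking this implies: while memory pressure is active, a simultaneous publish-rate breach is invisible in the reported reason.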
The bus extensions (createPubSubBus) auto-emit degraded / recovered system events when a shared circuit breaker trips, which is a related but distinct signal (backend-availability, not worker-local pressure).
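For completeness, consuming those backend-availability events can sit alongside the worker-local pressure subscription. The event names come from the note above, but the `bus.on(event, cb)` subscription shape and payload are assumptions here, not a documented API:

```js
// Hypothetical sketch: forward bus degraded/recovered transitions into
// a metrics gauge, next to the worker-local ws_pressure gauge.
// The bus.on API and the { backend } payload field are assumptions.
function wireAvailability(bus, metrics) {
  bus.on('degraded', ({ backend }) => {
    metrics.gauge('bus_degraded', 1, { backend });
  });
  bus.on('recovered', ({ backend }) => {
    metrics.gauge('bus_degraded', 0, { backend });
  });
}
```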
Layer 3: Message-tier admission
Per-RPC-class rules consulted in beforeExecute. Lets you shed cosmetic operations under pressure while keeping critical ones running.
```js
import { createAdmissionControl } from 'svelte-adapter-uws-extensions/admission';
import { LiveError } from 'svelte-realtime/server';

const admission = createAdmissionControl({
  rules: {
    background: ['MEMORY', 'PUBLISH_RATE', 'SUBSCRIBERS'], // shed under any pressure
    critical: ['MEMORY'] // shed only on memory pressure
  }
});

// In createMessage:
export const message = createMessage({
  async beforeExecute(ws, rpcPath) {
    const cls = classifyRpc(rpcPath); // 'background' or 'critical'
    if (!(await admission.shouldAccept(cls, platform))) {
      throw new LiveError('OVERLOADED', 'Server is shedding load');
    }
  }
});
```

The demo’s `live.admission({ classes: { background, critical } })` shape is the same idea wired closer to the realtime layer. Either works; pick whichever fits your code more ergonomically.
When to set. Once your handler count grows past ~10 RPCs and some are clearly “ok to fail under pressure” (cursor moves, presence updates, notification badges) and others are “must not fail under pressure” (save mutations, payments, auth). Without this layer every RPC has equal priority, which is wrong for most apps.
Per-class metrics: admission_accepted_total{class}, admission_rejected_total{class, reason}.
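The `classifyRpc` helper used in the Layer 3 example is app code, not a framework export. A minimal prefix-map sketch is below; the RPC path prefixes are hypothetical:

```js
// Hypothetical prefix-based RPC classifier for the Layer 3 example.
// Prefixes and class names are illustrative.
const RPC_CLASSES = {
  'cursor.': 'background',
  'presence.': 'background',
  'notifications.': 'background'
};

function classifyRpc(rpcPath) {
  for (const [prefix, cls] of Object.entries(RPC_CLASSES)) {
    if (rpcPath.startsWith(prefix)) return cls;
  }
  // Unknown paths default to 'critical' so a newly added RPC is never
  // shed by accident before someone classifies it.
  return 'critical';
}
```

Defaulting unknown paths to the strictest class is the safe failure mode: the cost of mis-classifying a cosmetic RPC as critical is lost shedding capacity, not lost writes.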
Layer 4: Per-primitive capacity caps
Every internal Map / Set has an explicit upper bound. Per-plugin maxBuckets / maxTopics / maxConnections / maxKeys options override the defaults.
```js
// Each plugin has its own caps:
createPresence({ maxTopics: 100_000 });
createRateLimit({ maxBuckets: 50_000, points: 30, interval: 10_000 });
createCursor({ maxConnections: 10_000, maxTopics: 1000 });

// Framework-level caps are exported constants you can override in tests:
import { MAX_PRESENCE_REF, MAX_PUSH_REGISTRY, MAX_AGGREGATE_BUCKETS } from 'svelte-realtime/server';
```

| Constant | Default | Saturation behavior |
|---|---|---|
| MAX_PRESENCE_REF | 1,000,000 | FIFO-evict pending leaves, then drop new joins with one-shot warning |
| MAX_PUSH_REGISTRY | 1,000,000 | REJECT new userIds with one-shot warning |
| MAX_AGGREGATE_BUCKETS | 1,000 | REJECT at module load time (refuses to register) |
| MAX_OPTIMISTIC_QUEUE_DEPTH | 1,000,000 | WARN-then-skip |
| Adapter plugin defaults | 1,000,000 | REJECT or WARN, see Architecture |
| queue.maxSize | 1,000,000 | onDrop callback fires, oldest entries drop |
Saturation behavior is one of:
- REJECT - new entries refused, caller sees an error
- WARN-only - one-shot warning, growth continues (used for state the protocol depends on, where eviction would corrupt routing)
- FIFO-evict - oldest entry dropped to make room
- WARN-then-skip - one-shot warning, the operation is skipped
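As an illustration of the FIFO-evict flavor, a capped collection can lean on the fact that JavaScript Maps iterate in insertion order, so the first key is always the oldest entry. This is a model of the behavior, not the framework's implementation:

```js
// Model of a FIFO-evicting capped map: when full, inserting a new key
// drops the oldest entry. Updating an existing key never evicts.
class CappedMap extends Map {
  constructor(maxEntries) {
    super();
    this.maxEntries = maxEntries;
  }

  set(key, value) {
    if (!this.has(key) && this.size >= this.maxEntries) {
      // Maps iterate in insertion order, so the first key is the oldest.
      const oldest = this.keys().next().value;
      this.delete(oldest);
    }
    return super.set(key, value);
  }
}
```

The REJECT flavor would instead return an error from `set` when full, and WARN-only would log once and keep growing; the trade-off is whether losing the oldest entry is safe for that particular piece of state.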
When to set. Almost never. Defaults are deliberately generous (1M). Apps that approach 1M of any single resource should investigate the leak rather than raise the cap. The constants exist so tests can override them and so dashboards can know what “saturation” means without reading source.
The shortlinks in cap-saturation log messages go to per-cap docs pages so operators can click straight from their terminal.
Putting it together: a worked example
A board collaboration app, 500 concurrent users per worker, 10 workers in a cluster:
```js
// svelte.config.js
adapter({
  websocket: {
    upgradeAdmission: {
      maxConcurrent: 600, // per-worker, ~20% headroom over expected load
      perTickBudget: 50 // smooth deploy-storm reconnects
    },
    pressure: {
      memory: { thresholdMB: 1024 },
      publishRate: {
        topicPublishRatePerSec: 2000,
        topicPublishBytesPerSec: 4 * 1024 * 1024
      },
      subscribers: false // not a concern at this scale
    }
  }
});
```

```js
// hooks.ws.js
import { createAdmissionControl } from 'svelte-adapter-uws-extensions/admission';

const admission = createAdmissionControl({
  rules: {
    cursorMove: ['MEMORY', 'PUBLISH_RATE'], // shed cursor moves under pressure
    presenceUpdate: ['MEMORY'], // keep presence under publish pressure
    noteEdit: [] // never shed
  }
});
```

```js
// extensions
import { createPresence } from 'svelte-adapter-uws-extensions/redis/presence';

const presence = createPresence(redis, {
  // 500 users * 10 workers = 5000 active presence entries;
  // 100x headroom: 500_000
  maxTopics: 500_000
});
```

What happens to a cursor-move RPC during a memory-pressure event:
- Layer 1: handshake admission allowed the connection (it is not re-checked on every message).
- Layer 2: `platform.pressure.reason === 'MEMORY'`.
- Layer 3: `admission.shouldAccept('cursorMove', platform)` returns `false` because the `cursorMove` rules include `'MEMORY'`; `beforeExecute` throws `LiveError('OVERLOADED')` and the handler never runs.
- Layer 4: never fires, because the request did not reach a primitive.
A note-edit RPC during the same pressure event:
- Layer 1: same.
- Layer 2: same signal.
- Layer 3: `admission.shouldAccept('noteEdit', platform)` returns `true` because the `noteEdit` rules are empty. The handler runs.
- Layer 4: may still trigger inside the handler if, for example, `presence.join()` would push past `maxTopics`.
Monitoring
Each layer has its own metrics. A complete dashboard tracks:
| Layer | Metric | Type |
|---|---|---|
| 1 | upgrade_rate_limited_total | counter |
| 1 | upgrade_admission_rejected_total | counter |
| 2 | ws_pressure{reason} | gauge |
| 2 | ws_topic_publish_rate{topic} | gauge |
| 2 | ws_topic_publish_bytes{topic} | gauge |
| 3 | admission_accepted_total{class} | counter |
| 3 | admission_rejected_total{class, reason} | counter |
| 4 | Per-plugin saturation counters (see plugin docs) | counter |
| 4 | svelte_realtime_assertion_violations_total{category} | counter (framework bugs only) |
A healthy production graph has Layer 1 rejections during obvious attacks, occasional Layer 2 signal transitions, near-zero Layer 3 rejections except during incidents, and zero Layer 4 saturation. Assertion-violation counters should always be zero; non-zero means a framework bug (report it).
See also
- Architecture - Capacity model - the saturation-behavior taxonomy and the full list of `MAX_*` constants.
- Adapter Configuration - Admission and backpressure - the Layer 1 / Layer 2 option reference.
- `createAdmissionControl` - the Layer 3 implementation.
- Deployment - the production deploy checklist that wires this all together.