Production Limits and Capacity
Version 0.5 has four layers of admission and capacity controls. Each layer documents its own behavior on its own page; this page shows the precedence order and how the layers compose, which is what operators need when tuning production limits.
The four layers, in firing order
A request travels through these gates in order. Each layer is opt-in or has documented defaults; each can be configured independently.
| Order | Layer | Where | What it caps | When it fires |
|---|---|---|---|---|
| 1 | Handshake admission | websocket.upgradeAdmission | Concurrent connections, upgrade rate | HTTP -> WS handshake |
| 2 | Worker pressure | websocket.pressure + platform.onPressure | Memory, per-topic publish rate, subscriber counts | Any message dispatch |
| 3 | Message-tier admission | createAdmissionControl (extensions) | Per-RPC-class admission rules | Inside createMessage’s beforeExecute |
| 4 | Per-primitive capacity caps | MAX_* constants + plugin maxBuckets/maxTopics/etc. | Map / Set growth bounds | Inside individual operations |
A request that fails at any layer never reaches the next. A request must pass all four to execute the handler.
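As a mental model, the chain can be sketched as a list of gates consulted in order, where the first rejection short-circuits. This is a simplified sketch; every name in it is illustrative, not an adapter API:

```js
// Illustrative model of the four-gate chain. Each gate returns a
// rejection reason or null; the first rejection short-circuits.
const gates = [
  function handshakeAdmission(req) {
    // Layer 1: hard cap on concurrent connections
    return req.concurrent >= req.maxConcurrent ? 'HTTP_503' : null;
  },
  function workerPressure(req) {
    // Layer 2 is a signal, not a hard cap (see below); it never rejects
    return null;
  },
  function messageAdmission(req) {
    // Layer 3: shed low-priority RPC classes while under pressure
    return req.pressure !== 'NONE' && req.class === 'background'
      ? 'OVERLOADED'
      : null;
  },
  function primitiveCaps(req) {
    // Layer 4: bounded growth of internal Maps / Sets
    return req.mapSize >= req.maxEntries ? 'SATURATED' : null;
  }
];

function admit(req) {
  for (const gate of gates) {
    const rejection = gate(req);
    if (rejection) return { admitted: false, layer: gate.name, rejection };
  }
  return { admitted: true };
}
```

A request denied at layer 1 never shows up in layers 2-4, which is why the layers can keep independent bookkeeping.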
Layer 1: Handshake admission
Caps the connection count and paces upgrade floods. Fires before the WebSocket is established.
```js
// svelte.config.js
adapter({
  websocket: {
    upgradeAdmission: {
      maxConcurrent: 10_000,
      perTickBudget: 100
    },
    upgradeRateLimit: 10,
    upgradeRateLimitWindow: 10
  }
});
```

| Knob | Default | Effect on breach |
|---|---|---|
| upgradeRateLimit (per IP per window) | 10 per 10s | HTTP 429 |
| upgradeAdmission.maxConcurrent | unset (opt-in) | HTTP 503 fast-fail |
| upgradeAdmission.perTickBudget | unset (opt-in) | Paced via setImmediate |
When to set. A small app behind a load balancer with finite worker capacity. maxConcurrent is your hard cap; the LB sees 503 and routes elsewhere. perTickBudget smooths connection-burst storms (e.g. a deploy that disconnects 50k clients simultaneously).
This layer sets the floor for everything else: a user denied here never appears in any plugin’s bookkeeping.
Layer 2: Worker pressure
Runtime backpressure signal, computed on the worker. Precedence inside this layer: MEMORY > PUBLISH_RATE > SUBSCRIBERS > NONE.
```js
// svelte.config.js
adapter({
  websocket: {
    pressure: {
      memory: { thresholdMB: 1024 },
      publishRate: {
        topicPublishRatePerSec: 5000,
        topicPublishBytesPerSec: 10 * 1024 * 1024
      },
      subscribers: { perTopic: 50_000 }
    }
  }
});
```

Pressure is a signal, not a hard cap. Code that wants to shed under pressure consumes it:
```js
// In your message handler:
const { reason } = platform.pressure;
if (reason !== 'NONE') {
  // throttle, queue, or reject as appropriate
}

// Or subscribe to transitions:
platform.onPressure(({ reason }) => {
  metrics.gauge('ws_pressure', 1, { reason });
});
```

When to set. Always. Even if none of your code reads the signal, the framework’s own backoff and throttle plugins consume it. Disable individual thresholds by setting them to false.
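The precedence inside this layer can be modeled as an ordered threshold check. The following is a simplified sketch, not the adapter's implementation; the `cfg` shape mirrors the `pressure` config above, and the input field names are illustrative:

```js
// Simplified model of worker-pressure precedence: the first threshold
// breached, checked in priority order, determines the reported reason.
// A threshold set to false is skipped entirely.
function pressureReason({ heapMB, publishRate, maxSubscribers }, cfg) {
  if (cfg.memory && heapMB >= cfg.memory.thresholdMB) {
    return 'MEMORY';
  }
  if (cfg.publishRate && publishRate >= cfg.publishRate.topicPublishRatePerSec) {
    return 'PUBLISH_RATE';
  }
  if (cfg.subscribers && maxSubscribers >= cfg.subscribers.perTopic) {
    return 'SUBSCRIBERS';
  }
  return 'NONE';
}
```

Note the masking this implies: while memory pressure is active, a simultaneous publish-rate breach is invisible in the reported reason.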
The bus extensions (createPubSubBus) auto-emit degraded / recovered system events when a shared circuit breaker trips, which is a related but distinct signal (backend-availability, not worker-local pressure).
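For completeness, consuming those backend-availability events can sit alongside the worker-local pressure subscription. The event names come from the note above, but the `bus.on(event, cb)` subscription shape and payload are assumptions here, not a documented API:

```js
// Hypothetical sketch: forward bus degraded/recovered transitions into
// a metrics gauge, next to the worker-local ws_pressure gauge.
// The bus.on API and the { backend } payload field are assumptions.
function wireAvailability(bus, metrics) {
  bus.on('degraded', ({ backend }) => {
    metrics.gauge('bus_degraded', 1, { backend });
  });
  bus.on('recovered', ({ backend }) => {
    metrics.gauge('bus_degraded', 0, { backend });
  });
}
```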
Layer 3: Message-tier admission
Per-RPC-class rules consulted in beforeExecute. Lets you shed cosmetic operations under pressure while keeping critical ones running.
```js
import { createAdmissionControl } from 'svelte-adapter-uws-extensions/admission';
import { LiveError } from 'svelte-realtime/server';

const admission = createAdmissionControl({
  rules: {
    background: ['MEMORY', 'PUBLISH_RATE', 'SUBSCRIBERS'], // shed under any pressure
    critical: ['MEMORY'] // shed only on memory pressure
  }
});

// In createMessage:
export const message = createMessage({
  async beforeExecute(ws, rpcPath) {
    const cls = classifyRpc(rpcPath); // 'background' or 'critical'
    if (!(await admission.shouldAccept(cls, platform))) {
      throw new LiveError('OVERLOADED', 'Server is shedding load');
    }
  }
});
```

The demo’s `live.admission({ classes: { background, critical } })` shape is the same idea wired closer to the realtime layer. Either works; pick whichever fits your code more ergonomically.
When to set. Once your handler count grows past ~10 RPCs and some are clearly “ok to fail under pressure” (cursor moves, presence updates, notification badges) and others are “must not fail under pressure” (save mutations, payments, auth). Without this layer every RPC has equal priority, which is wrong for most apps.
Per-class metrics: admission_accepted_total{class}, admission_rejected_total{class, reason}.
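The `classifyRpc` helper used in the Layer 3 example is app code, not a framework export. A minimal prefix-map sketch is below; the RPC path prefixes are hypothetical:

```js
// Hypothetical prefix-based RPC classifier for the Layer 3 example.
// Prefixes and class names are illustrative.
const RPC_CLASSES = {
  'cursor.': 'background',
  'presence.': 'background',
  'notifications.': 'background'
};

function classifyRpc(rpcPath) {
  for (const [prefix, cls] of Object.entries(RPC_CLASSES)) {
    if (rpcPath.startsWith(prefix)) return cls;
  }
  // Unknown paths default to 'critical' so a newly added RPC is never
  // shed by accident before someone classifies it.
  return 'critical';
}
```

Defaulting unknown paths to the strictest class is the safe failure mode: the cost of mis-classifying a cosmetic RPC as critical is lost shedding capacity, not lost writes.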
Layer 4: Per-primitive capacity caps
Every internal Map / Set has an explicit upper bound. Per-plugin maxBuckets / maxTopics / maxConnections / maxKeys options override the defaults.
```js
// Each plugin has its own caps:
createPresence({ maxTopics: 100_000 });
createRateLimit({ maxBuckets: 50_000, points: 30, interval: 10_000 });
createCursor({ maxConnections: 10_000, maxTopics: 1000 });

// Framework-level caps are exported constants you can override in tests:
import { MAX_PRESENCE_REF, MAX_PUSH_REGISTRY, MAX_AGGREGATE_BUCKETS } from 'svelte-realtime/server';
```

| Constant | Default | Saturation behavior |
|---|---|---|
| MAX_PRESENCE_REF | 1,000,000 | FIFO-evict pending leaves, then drop new joins with one-shot warning |
| MAX_PUSH_REGISTRY | 1,000,000 | REJECT new userIds with one-shot warning |
| MAX_AGGREGATE_BUCKETS | 1,000 | REJECT at module load time (refuses to register) |
| MAX_OPTIMISTIC_QUEUE_DEPTH | 1,000,000 | WARN-then-skip |
| Adapter plugin defaults | 1,000,000 | REJECT or WARN, see Architecture |
| queue.maxSize | 1,000,000 | onDrop callback fires, oldest entries drop |
Saturation behavior is one of:
- REJECT - new entries refused, caller sees an error
- WARN-only - one-shot warning, growth continues (used for state the protocol depends on, where eviction would corrupt routing)
- FIFO-evict - oldest entry dropped to make room
- WARN-then-skip - one-shot warning, the operation is skipped
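As an illustration of the FIFO-evict flavor, a capped collection can lean on the fact that JavaScript Maps iterate in insertion order, so the first key is always the oldest entry. This is a model of the behavior, not the framework's implementation:

```js
// Model of a FIFO-evicting capped map: when full, inserting a new key
// drops the oldest entry. Updating an existing key never evicts.
class CappedMap extends Map {
  constructor(maxEntries) {
    super();
    this.maxEntries = maxEntries;
  }

  set(key, value) {
    if (!this.has(key) && this.size >= this.maxEntries) {
      // Maps iterate in insertion order, so the first key is the oldest.
      const oldest = this.keys().next().value;
      this.delete(oldest);
    }
    return super.set(key, value);
  }
}
```

The REJECT flavor would instead return an error from `set` when full, and WARN-only would log once and keep growing; the trade-off is whether losing the oldest entry is safe for that particular piece of state.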
When to set. Almost never. Defaults are deliberately generous (1M). Apps that approach 1M of any single resource should investigate the leak rather than raise the cap. The constants exist so tests can override them and so dashboards can know what “saturation” means without reading source.
The shortlinks in cap-saturation log messages go to per-cap docs pages so operators can click straight from their terminal.
Putting it together: a worked example
A board collaboration app, 500 concurrent users per worker, 10 workers in a cluster:
```js
// svelte.config.js
adapter({
  websocket: {
    upgradeAdmission: {
      maxConcurrent: 600, // per-worker, ~20% headroom over expected load
      perTickBudget: 50 // smooth deploy-storm reconnects
    },
    pressure: {
      memory: { thresholdMB: 1024 },
      publishRate: {
        topicPublishRatePerSec: 2000,
        topicPublishBytesPerSec: 4 * 1024 * 1024
      },
      subscribers: false // not a concern at this scale
    }
  }
});
```

```js
// hooks.ws.js
import { createAdmissionControl } from 'svelte-adapter-uws-extensions/admission';

const admission = createAdmissionControl({
  rules: {
    cursorMove: ['MEMORY', 'PUBLISH_RATE'], // shed cursor moves under pressure
    presenceUpdate: ['MEMORY'], // keep presence under publish pressure
    noteEdit: [] // never shed
  }
});
```

```js
// extensions
import { createPresence } from 'svelte-adapter-uws-extensions/redis/presence';

const presence = createPresence(redis, {
  // 500 users * 10 workers = 5000 active presence entries;
  // 100x headroom: 500_000
  maxTopics: 500_000
});
```

What happens to a cursor-move RPC during a memory-pressure event:
- Layer 1: handshake admission allowed the connection (it is not re-checked on every message).
- Layer 2: `platform.pressure.reason === 'MEMORY'`.
- Layer 3: `admission.shouldAccept('cursorMove', platform)` returns `false` because the `cursorMove` rules include `'MEMORY'`; `beforeExecute` throws `LiveError('OVERLOADED')` and the handler never runs.
- Layer 4: never fires, because the request did not reach a primitive.
A note-edit RPC during the same pressure event:
- Layer 1: same.
- Layer 2: same signal.
- Layer 3: `admission.shouldAccept('noteEdit', platform)` returns `true` because the `noteEdit` rules are empty. The handler runs.
- Layer 4: may still trigger inside the handler if, for example, `presence.join()` would push past `maxTopics`.
Monitoring
Each layer has its own metrics. A complete dashboard tracks:
| Layer | Metric | Type |
|---|---|---|
| 1 | upgrade_rate_limited_total | counter |
| 1 | upgrade_admission_rejected_total | counter |
| 2 | ws_pressure{reason} | gauge |
| 2 | ws_topic_publish_rate{topic} | gauge |
| 2 | ws_topic_publish_bytes{topic} | gauge |
| 3 | admission_accepted_total{class} | counter |
| 3 | admission_rejected_total{class, reason} | counter |
| 4 | Per-plugin saturation counters (see plugin docs) | counter |
| 4 | svelte_realtime_assertion_violations_total{category} | counter (framework bugs only) |
A healthy production graph has Layer 1 rejections during obvious attacks, occasional Layer 2 signal transitions, near-zero Layer 3 rejections except during incidents, and zero Layer 4 saturation. Assertion-violation counters should always be zero; non-zero means a framework bug (report it).
See also
- Architecture - Capacity model - the saturation-behavior taxonomy and the full list of `MAX_*` constants.
- Adapter Configuration - Admission and backpressure - the Layer 1 / Layer 2 option reference.
- `createAdmissionControl` - the Layer 3 implementation.
- Deployment - the production deploy checklist that wires this all together.