Production Limits and Capacity

Version 0.5 has four layers of admission and capacity controls. Each layer documents its own behavior individually; this page shows the precedence order and how the layers compose. It is aimed at operators tuning production limits.

The four layers, in firing order

A request travels through these gates in order. Each layer is opt-in or has documented defaults; each can be configured independently.

| Order | Layer | Where | What it caps | When it fires |
|---|---|---|---|---|
| 1 | Handshake admission | websocket.upgradeAdmission | Concurrent connections, upgrade rate | HTTP -> WS handshake |
| 2 | Worker pressure | websocket.pressure + platform.onPressure | Memory, per-topic publish rate, subscriber counts | Any message dispatch |
| 3 | Message-tier admission | createAdmissionControl (extensions) | Per-RPC-class admission rules | Inside createMessage’s beforeExecute |
| 4 | Per-primitive capacity caps | MAX_* constants + plugin maxBuckets/maxTopics/etc. | Map / Set growth bounds | Inside individual operations |

A request that fails at any layer never reaches the next; it must pass all four before the handler executes.
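As a mental model, the composition looks like the sketch below. Each gate here is a stand-in for the corresponding subsystem, not a real API.

// Mental model only; admitUpgrade, classifyRpc and handler are stand-ins.

// Per connection, at the HTTP -> WS handshake (Layer 1):
if (!admitUpgrade(req)) return;        // HTTP 429/503; no WebSocket is created

// Per message dispatch (Layers 2-4):
const { reason } = platform.pressure;  // Layer 2: signal, consumed by the rules below
if (!await admission.shouldAccept(classifyRpc(rpcPath), platform)) {
  throw new LiveError('OVERLOADED', 'Server is shedding load');  // Layer 3
}
await handler(msg);                    // Layer 4: Map/Set growth caps apply inside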


Layer 1: Handshake admission

Caps the concurrent connection count and paces upgrade floods. Fires before the WebSocket is established.

// svelte.config.js
adapter({
  websocket: {
    upgradeAdmission: {
      maxConcurrent: 10_000,
      perTickBudget: 100
    },
    upgradeRateLimit: 10,         // upgrades per IP per window
    upgradeRateLimitWindow: 10    // window length, in seconds
  }
});
| Knob | Default | Effect on breach |
|---|---|---|
| upgradeRateLimit (per IP, per window) | 10 per 10s | HTTP 429 |
| upgradeAdmission.maxConcurrent | unset (opt-in) | HTTP 503 fast-fail |
| upgradeAdmission.perTickBudget | unset (opt-in) | Paced via setImmediate |

When to set. For an app behind a load balancer with finite worker capacity. maxConcurrent is your hard cap; the LB sees the 503 and routes elsewhere. perTickBudget smooths connection-burst storms, e.g. a deploy that disconnects 50k clients simultaneously: at perTickBudget: 100, the reconnect wave is absorbed over roughly 500 event-loop turns instead of one.

Sets the floor for everything else. A user denied at this layer never appears in any plugin’s bookkeeping.


Layer 2: Worker pressure

Runtime backpressure signal, computed on the worker. Precedence inside this layer: MEMORY > PUBLISH_RATE > SUBSCRIBERS > NONE.

// svelte.config.js
adapter({
  websocket: {
    pressure: {
      memory: { thresholdMB: 1024 },
      publishRate: {
        topicPublishRatePerSec: 5000,
        topicPublishBytesPerSec: 10 * 1024 * 1024
      },
      subscribers: { perTopic: 50_000 }
    }
  }
});
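Inside the worker, precedence means the first tripped threshold wins. A minimal sketch of that resolution, assuming a hypothetical stats shape (heapUsedMB, maxTopicPublishRate, and the other fields are illustrative, not framework internals):

// Illustrative resolution of the precedence order; the stats shape is an
// assumption, not the framework's internal representation.
function resolvePressure(stats, cfg) {
  if (cfg.memory && stats.heapUsedMB > cfg.memory.thresholdMB) {
    return 'MEMORY';
  }
  if (cfg.publishRate && (
    stats.maxTopicPublishRate > cfg.publishRate.topicPublishRatePerSec ||
    stats.maxTopicPublishBytes > cfg.publishRate.topicPublishBytesPerSec
  )) {
    return 'PUBLISH_RATE';
  }
  if (cfg.subscribers && stats.maxTopicSubscribers > cfg.subscribers.perTopic) {
    return 'SUBSCRIBERS';
  }
  return 'NONE';
}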

Pressure is a signal, not a hard cap. Code that wants to shed under pressure consumes it:

// In your message handler:
const { reason } = platform.pressure;
if (reason !== 'NONE') {
  // throttle, queue, or reject as appropriate
}

// Or subscribe to transitions:
platform.onPressure(({ reason }) => {
  metrics.gauge('ws_pressure', 1, { reason });
});

When to set. Always. Even if no code reads the signal, the framework’s own backoff and throttle plugins consume it. Disable individual thresholds by setting them to false.
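For example, disabling the publish-rate check while keeping the other two:

// svelte.config.js (fragment)
pressure: {
  memory: { thresholdMB: 1024 },
  publishRate: false,              // publish-rate pressure disabled
  subscribers: { perTopic: 50_000 }
}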

The bus extensions (createPubSubBus) auto-emit degraded / recovered system events when a shared circuit breaker trips. That is a related but distinct signal: backend availability, not worker-local pressure.


Layer 3: Message-tier admission

Per-RPC-class rules consulted in beforeExecute. Lets you shed cosmetic operations under pressure while keeping critical ones running.

import { createAdmissionControl } from 'svelte-adapter-uws-extensions/admission';
import { live, LiveError } from 'svelte-realtime/server';

const admission = createAdmissionControl({
  rules: {
    background: ['MEMORY', 'PUBLISH_RATE', 'SUBSCRIBERS'],  // shed under any pressure
    critical:   ['MEMORY']                                  // shed only on memory pressure
  }
});

// In createMessage:
export const message = createMessage({
  async beforeExecute(ws, rpcPath) {
    const cls = classifyRpc(rpcPath);  // app-defined: returns 'background' or 'critical'
    if (!await admission.shouldAccept(cls, platform)) {
      throw new LiveError('OVERLOADED', 'Server is shedding load');
    }
  }
});

The demo’s live.admission({ classes: { background, critical } }) shape is the same idea wired closer to the realtime layer. Either works; pick whichever fits your stack better ergonomically.
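Expanded with the same rule arrays as the example above, that shape would look like this sketch:

// Sketch of the realtime-layer wiring; rule arrays match the example above.
live.admission({
  classes: {
    background: ['MEMORY', 'PUBLISH_RATE', 'SUBSCRIBERS'],
    critical:   ['MEMORY']
  }
});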

When to set. Once your handler count grows past ~10 RPCs and some are clearly “ok to fail under pressure” (cursor moves, presence updates, notification badges) while others are “must not fail under pressure” (save mutations, payments, auth). Without this layer, every RPC has equal priority, which is wrong for most apps.
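A classifier can be as simple as a prefix list; the prefixes below are illustrative, not a framework convention:

// Illustrative app-defined classifier for the admission example above.
const BACKGROUND_PREFIXES = ['cursor.', 'presence.', 'badge.'];

function classifyRpc(rpcPath) {
  return BACKGROUND_PREFIXES.some((p) => rpcPath.startsWith(p))
    ? 'background'
    : 'critical';
}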

Per-class metrics: admission_accepted_total{class}, admission_rejected_total{class, reason}.


Layer 4: Per-primitive capacity caps

Every internal Map / Set has an explicit upper bound. Per-plugin maxBuckets / maxTopics / maxConnections / maxKeys options override the defaults.

// Each plugin has its own caps:
createPresence({ maxTopics: 100_000 });
createRateLimit({ maxBuckets: 50_000, points: 30, interval: 10_000 });
createCursor({ maxConnections: 10_000, maxTopics: 1000 });

// Framework-level caps are exported constants you can override in tests:
import { MAX_PRESENCE_REF, MAX_PUSH_REGISTRY, MAX_AGGREGATE_BUCKETS } from 'svelte-realtime/server';
| Constant | Default | Saturation behavior |
|---|---|---|
| MAX_PRESENCE_REF | 1,000,000 | FIFO-evict pending leaves, then drop new joins with one-shot warning |
| MAX_PUSH_REGISTRY | 1,000,000 | REJECT new userIds with one-shot warning |
| MAX_AGGREGATE_BUCKETS | 1,000 | REJECT at module load time (refuses to register) |
| MAX_OPTIMISTIC_QUEUE_DEPTH | 1,000,000 | WARN-then-skip |
| Adapter plugin defaults | 1,000,000 | REJECT or WARN, see Architecture |
| queue.maxSize | 1,000,000 | onDrop callback fires, oldest entries drop |

Saturation behavior is one of:

  • REJECT - new entries refused, caller sees an error
  • WARN-only - one-shot warning, growth continues (used for state the protocol depends on, where eviction would corrupt routing)
  • FIFO-evict - oldest entry dropped to make room (sketched below)
  • WARN-then-skip - one-shot warning, the operation is skipped
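As an illustration of FIFO-evict only (not the framework's actual implementation), a bounded Map can drop its oldest key on insert:

// Illustrative FIFO-evict on a bounded Map. JS Maps iterate in insertion
// order, so the first key is the oldest entry.
function boundedSet(map, key, value, max) {
  if (!map.has(key) && map.size >= max) {
    map.delete(map.keys().next().value);  // evict the oldest entry
  }
  map.set(key, value);
}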

When to set. Almost never. Defaults are deliberately generous (1M). Apps that approach 1M of any single resource should investigate the leak rather than raise the cap. The constants exist so tests can override them and so dashboards can know what “saturation” means without reading source.

The shortlinks in cap-saturation log messages go to per-cap docs pages so operators can click straight from their terminal.


Putting it together: a worked example

A board collaboration app, 500 concurrent users per worker, 10 workers in a cluster:

// svelte.config.js
adapter({
  websocket: {
    upgradeAdmission: {
      maxConcurrent: 600,         // per-worker, ~20% headroom over expected
      perTickBudget: 50            // smooth deploy-storm reconnects
    },
    pressure: {
      memory: { thresholdMB: 1024 },
      publishRate: {
        topicPublishRatePerSec: 2000,
        topicPublishBytesPerSec: 4 * 1024 * 1024
      },
      subscribers: false           // not a concern at this scale
    }
  }
});

// hooks.ws.js
import { createAdmissionControl } from 'svelte-adapter-uws-extensions/admission';
const admission = createAdmissionControl({
  rules: {
    cursorMove:     ['MEMORY', 'PUBLISH_RATE'],  // shed on memory or publish pressure
    presenceUpdate: ['MEMORY'],                  // sheds only on memory; survives publish pressure
    noteEdit:       []                           // never shed
  }
});

// extensions
import { createPresence } from 'svelte-adapter-uws-extensions/redis/presence';
const presence = createPresence(redis, {
  // 500 users * 10 workers = 5000 active presence entries
  // 100x headroom: 500_000
  maxTopics: 500_000
});

What happens to a cursor-move RPC during a memory pressure event:

  1. Layer 1: handshake admission already passed when the connection was established; it is not re-evaluated per message.
  2. Layer 2: platform.pressure.reason === 'MEMORY'.
  3. Layer 3: admission.shouldAccept('cursorMove', platform) returns false because cursorMove rules include 'MEMORY'.
  4. beforeExecute throws LiveError('OVERLOADED'). The handler never runs.
  5. Layer 4 never fires because the request did not reach a primitive.

A note-edit RPC during the same pressure event:

  1. Layer 1: same.
  2. Layer 2: same signal.
  3. Layer 3: admission.shouldAccept('noteEdit', platform) returns true because noteEdit rules are empty.
  4. Handler runs.
  5. Layer 4 may still trigger inside the handler if, for example, presence.join() would push past maxTopics (see the sketch below).
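A handler that wants graceful shedding instead of an unhandled failure can guard the primitive call. Whether a plugin throws or returns an error on REJECT is plugin-specific; the catch below is a sketch, not the documented failure mode:

// Sketch: guarding a Layer 4 REJECT inside a handler. The exact error a
// saturated plugin raises is plugin-specific; check its docs.
try {
  await presence.join(topic, userId);
} catch (err) {
  throw new LiveError('OVERLOADED', 'Presence capacity reached');
}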

Monitoring

Each layer has its own metrics. A complete dashboard tracks:

| Layer | Metric | Type |
|---|---|---|
| 1 | upgrade_rate_limited_total | counter |
| 1 | upgrade_admission_rejected_total | counter |
| 2 | ws_pressure{reason} | gauge |
| 2 | ws_topic_publish_rate{topic} | gauge |
| 2 | ws_topic_publish_bytes{topic} | gauge |
| 3 | admission_accepted_total{class} | counter |
| 3 | admission_rejected_total{class, reason} | counter |
| 4 | Per-plugin saturation counters (see plugin docs) | counter |
| 4 | svelte_realtime_assertion_violations_total{category} | counter (framework bugs only) |

A healthy production graph shows Layer 1 rejections only during obvious attacks, occasional Layer 2 signal transitions, near-zero Layer 3 rejections outside incidents, and zero Layer 4 saturation. Assertion-violation counters should always be zero; non-zero means a framework bug (report it).

