Performance

Why uWebSockets.js?

uWebSockets.js is a C++ HTTP and WebSocket server compiled as a native V8 addon. In our benchmarks it consistently outperforms Node.js’s built-in http module, Express, Fastify, and every other JavaScript HTTP server we tested, by a significant margin.

We ran a comprehensive benchmark suite isolating every layer of overhead - from barebones uWS through the full adapter pipeline - and compared against @sveltejs/adapter-node (Node http + Polka + sirv) and the most popular WebSocket libraries (socket.io, ws). The benchmark code is in the bench/ directory so you can reproduce it yourself.


HTTP: adapter-uws vs adapter-node

Tested with a trivial SvelteKit handler (isolates adapter overhead from your app code):

                adapter-uws      adapter-node     Multiplier
Static files    165,700 req/s    24,500 req/s     6.8x faster
SSR             150,500 req/s    58,300 req/s     2.6x faster

(100 connections, 10 pipelining, 10s, 2 runs averaged. Node v24, Windows 11.)

The static file gap is the largest because adapter-node uses sirv which calls fs.createReadStream().pipe(res) per request, while we serve from an in-memory Map with a single res.cork() + res.end(). The SSR gap comes from uWS’s C++ HTTP parsing and batched writes vs Node’s async drain event cycle.
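The in-memory approach can be sketched as follows. Note that `assets`, `serveStatic`, and the stub `res` object are illustrative, not the adapter’s actual internals:

```javascript
// Hypothetical sketch: static files pre-read into a Map at startup, so
// the hot path is one Map lookup plus one end() with a Buffer - no
// per-request fs.createReadStream().
const assets = new Map([
  ['/app.js', { body: Buffer.from('console.log("hi")'), type: 'text/javascript' }],
]);

function serveStatic(res, path) {
  const asset = assets.get(path);
  if (!asset) return false; // fall through to SSR
  // With the real uWS response these writes would be wrapped in
  // res.cork(() => { ... }) so header and body go out as one batched write.
  res.writeHeader('Content-Type', asset.type);
  res.end(asset.body);
  return true;
}

// Stub response object that records writes, for demonstration only:
const writes = [];
const res = {
  writeHeader: (k, v) => writes.push(`${k}: ${v}`),
  end: (buf) => writes.push(buf.toString()),
};
console.log(serveStatic(res, '/app.js'), writes);
```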


WebSocket: uWS vs socket.io vs ws

50 connected clients, 10 senders, burst mode, 8 seconds:

Server                        Messages delivered/s    vs adapter-uws
uWS native (barebones)        3,583,000               baseline
adapter-uws (full handler)    3,583,000               1.0x
ws library                    232,200                 15.4x slower
socket.io                     226,700                 15.8x slower

uWS native pub/sub delivered 3.5M messages/s with exact 50x fan-out. The adapter matches it - the byte-prefix check and string template envelope add near-zero overhead to the hot path. socket.io and ws both collapsed under the same load, delivering less than 1x fan-out (massive message loss/queueing).


Where the overhead goes

HTTP (SSR path) - ~32% total overhead vs barebones uWS

  • res.cork() + status + headers (~12.6%) - writing a proper HTTP response; unavoidable
  • new Request() construction (~9%) - required by SvelteKit’s server.respond() contract
  • async/Promise scheduling (~3%) - getReader() + read() + event loop yield
  • Header collection, remoteAddress (~1%) - req.forEach + TextDecoder
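The new Request() step above can be sketched roughly like this. The stub mimics the uWS HttpRequest surface (getMethod(), getUrl(), forEach()) but is not the adapter’s actual code:

```javascript
// Sketch: building a standard fetch Request from a uWS-shaped request.
function toRequest(uwsReq, origin) {
  const headers = new Headers();
  // uWS exposes headers via a forEach callback rather than a plain object
  uwsReq.forEach((key, value) => headers.append(key, value));
  // (a real adapter would also append req.getQuery() to the URL)
  return new Request(origin + uwsReq.getUrl(), {
    method: uwsReq.getMethod().toUpperCase(), // uWS returns it lowercased
    headers,
  });
}

// Stub standing in for a live uWS HttpRequest:
const stub = {
  getMethod: () => 'get',
  getUrl: () => '/docs',
  forEach(cb) {
    cb('accept', 'text/html');
    cb('accept-language', 'en');
  },
};

const request = toRequest(stub, 'http://localhost:3000');
console.log(request.method, request.url, request.headers.get('accept'));
// GET http://localhost:3000/docs text/html
```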

WebSocket - at parity with barebones uWS pub/sub

  • Subscribe/unsubscribe check (~0%) - byte-prefix discriminator: byte[3] is y for {"ty (control) and o for {"to (user envelope). One comparison skips JSON.parse for all user messages (~0.001us per message).
  • Envelope wrapping (~0%) - string template + esc() char scan instead of JSON.stringify on a wrapper object. Only data is stringified (~0.085us per publish).
  • Connection tracking (~2%) - Set add/delete on open/close.
  • Origin validation, upgrade headers (~2%) - four req.getHeader calls on upgrade.
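The discriminator and envelope can be sketched like this. The esc() helper and the exact envelope shape here are illustrative assumptions, not the adapter’s real wire format:

```javascript
// One byte comparison decides control vs user message: '{"type":…' has
// 'y' at index 3, '{"to":…' has 'o'. No JSON.parse on the user hot path.
const Y = 0x79; // 'y'

function isControl(arrayBuffer) {
  return new Uint8Array(arrayBuffer)[3] === Y;
}

// Escape what would break the hand-built JSON string (a full
// implementation would also escape control characters).
function esc(s) {
  return s.replace(/[\\"]/g, (c) => '\\' + c);
}

// Envelope via template string: only `data` pays for JSON.stringify.
function envelope(from, data) {
  return `{"from":"${esc(from)}","data":${JSON.stringify(data)}}`;
}

const enc = new TextEncoder();
console.log(isControl(enc.encode('{"type":"subscribe","topic":"chat"}').buffer)); // true
console.log(isControl(enc.encode('{"to":"chat","data":1}').buffer));              // false
console.log(envelope('user"1', { x: 1 })); // valid JSON, parseable back
```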

What we don’t add

  • No middleware chain (no Polka, no Express)
  • No routing layer (uWS native routing + SvelteKit’s router)
  • No per-request stream allocation for static files (in-memory Buffer, not fs.createReadStream)
  • No Node.js http.IncomingMessage shim (we construct Request directly from uWS)

SSR request deduplication

When multiple concurrent requests arrive for the same anonymous (no cookie/auth) GET or HEAD URL, only one is dispatched to SvelteKit. The others wait for the result and reconstruct their own response from the shared buffer. This prevents redundant rendering work during traffic spikes - a common pattern when a post goes viral or a cron job hits a popular page at the same time as real users.

Dedup is automatically skipped for:

  • Any request with a Cookie or Authorization header (personalized responses must not be shared)
  • POST, PUT, PATCH, DELETE (mutations must always execute)
  • Responses with a Set-Cookie header (personalized)
  • Response bodies larger than 512 KB (too large to buffer and share)
  • Requests with an X-No-Dedup: 1 header (opt-out escape hatch)

No configuration is needed. The dedup map holds at most 500 in-flight keys simultaneously as a safety valve against memory pressure from unique URLs.
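In spirit, the dedup works like this (names, the key format, and details are illustrative, not the adapter’s actual implementation):

```javascript
// Concurrent anonymous GET/HEAD requests for the same key share one
// in-flight render promise instead of each invoking SvelteKit.
const inflight = new Map();
const MAX_INFLIGHT = 500; // safety valve against unbounded unique URLs

function dedupRender(key, render) {
  const pending = inflight.get(key);
  if (pending) return pending;                        // join the in-flight render
  if (inflight.size >= MAX_INFLIGHT) return render(); // map full: bypass dedup
  const p = render().finally(() => inflight.delete(key));
  inflight.set(key, p);
  return p;
}

// Demo: 5 concurrent requests trigger exactly one render.
let renders = 0;
const render = () =>
  new Promise((resolve) => setTimeout(() => resolve(`render#${++renders}`), 5));

const results = await Promise.all(
  Array.from({ length: 5 }, () => dedupRender('GET /', render))
);
console.log(renders, results); // 1, five copies of 'render#1'
```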

Vary and personalization contract

The adapter deduplicates by method + URL only. It cannot inspect every possible input that might affect your response (user-agent quirks, custom headers, etc.). The contract is:

  • If your route handler produces different output based on a request header or other input, emit a Vary header listing those headers. The adapter checks the Vary header after rendering and discards the dedup entry if Vary is present, preventing that response from being shared.
  • If you have a route that varies by something the adapter cannot detect (e.g. server-side A/B test state), add X-No-Dedup: 1 to opt out entirely.

Anonymous GET/HEAD routes that produce the same output for all users (landing pages, docs, prerendered pages) benefit most from dedup and require no action.
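A hypothetical version of the post-render shareability check - the authoritative rules are the lists above; this just restates the response-side ones as code:

```javascript
// Decide whether a rendered Response may be buffered and shared with
// other in-flight requests for the same method + URL.
const MAX_SHARED_BODY = 512 * 1024; // 512 KB buffer cap

function isShareable(response, bodyLength) {
  if (response.headers.has('vary')) return false;       // header-dependent output
  if (response.headers.has('set-cookie')) return false; // personalized
  if (bodyLength > MAX_SHARED_BODY) return false;       // too large to buffer
  return true;
}

const plain = new Response('hello', { headers: { 'content-type': 'text/html' } });
const varied = new Response('hello', { headers: { vary: 'accept-language' } });
console.log(isShareable(plain, 5), isShareable(varied, 5)); // true false
```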

Measured benefit: with 200 concurrent requests to the same anonymous URL and a 5ms render delay, the adapter makes 1 render call instead of 200 - a 200x reduction in rendering work and the associated CPU and memory pressure.


The bottom line

The adapter retains ~68% of raw uWS HTTP throughput and matches uWS native WebSocket throughput. The HTTP overhead is dominated by things SvelteKit requires (new Request(), proper HTTP headers). The WebSocket overhead is now almost entirely the JSON.stringify of your data payload - the adapter’s own machinery costs near zero. In a real app, your load functions and component rendering will dwarf all of this - the adapter’s job is to get out of the way, and it does.


Running the benchmarks yourself

npm install  # installs uWebSockets.js, autocannon, etc.
node bench/run.mjs          # adapter overhead breakdown
node bench/run-compare.mjs  # full comparison vs adapter-node + socket.io
node bench/run-dedup.mjs    # SSR dedup render-call reduction
