Real-time features often get postponed because WebSockets can feel “expensive”:
- You need servers that stay up.
- You need sticky sessions or stateful scaling.
- You need to handle reconnects, fanout, authorization, and rate limiting.
- And you still need a database so you don’t lose messages.
Azure Web PubSub is a pragmatic alternative: it offloads WebSocket connection management to a managed service, so your application can focus on domain logic and persistence.
This article walks through:
- How to create a chat application using Azure Web PubSub at very low operational cost.
- How the same approach can power real-time in-app notifications.
- A practical reference implementation pattern (client + API + PubSub).
- A reliable persistence architecture (e.g., SQL Server) so no messages are missed.
Why Azure Web PubSub is “low cost” in practice
“Low cost” here doesn’t just mean the invoice — it means low operational cost:
- No WebSocket servers to run: Web PubSub handles connections, scaling, and fanout.
- Serverless-friendly: you can keep most of your backend HTTP-based.
- Pay for what you use: you can control cost by limiting message size, message rate, and connection lifetime.
Cost control levers that work in real systems:
- Connect only when needed (e.g., only while a chat panel is open).
- Scope fanout with groups (one group per conversation; separate hub/group for notifications).
- Persist first, broadcast second (you can retry broadcast without losing the record).
- Reconnect with backoff (avoid reconnect storms after transient outages).
- Bound payloads (limit message size; consider separate storage for large attachments).
The core building block: “client access URL” issuance
Web PubSub clients do not connect with your service key. Instead, they connect using a short‑lived signed URL created by your backend (often via an Azure Function, API endpoint, or BFF service).
Generic flow
Browser/App
|
| POST /realtime/clientAccessUrl (authenticated)
| body: { userId, hubName, groupName }
v
Backend (Function/API)
|
| validates identity + authorization
| generates signed Web PubSub client URL (short TTL)
v
Browser/App receives: { url }
|
v
WebPubSubClient(url).start(); joinGroup(groupName)

Example (pseudocode)
```js
// client — fetchJson, registerHandlers, and accessToken are assumed
// to be provided by the surrounding application
async function connectToHub({ hubName, groupName, userId }) {
  const { url } = await fetchJson('/realtime/clientAccessUrl', {
    method: 'POST',
    headers: { Authorization: `Bearer ${accessToken}` },
    body: JSON.stringify({ hubName, groupName, userId })
  });
  const client = new WebPubSubClient(url);
  registerHandlers(client);
  await client.start();
  await client.joinGroup(groupName);
  return client;
}
```

That’s the “secret sauce” for both chat and notifications: the same token-issuing backend can serve multiple hubs/groups.
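The server side of this flow can be sketched as follows, assuming the official `@azure/web-pubsub` Node SDK (whose service client exposes `getClientAccessToken`). The Express-style `req`/`res` shape and the `isMemberOfConversation` check are hypothetical placeholders for your own framework and authorization logic.

```js
// Roles that let one client join and send to exactly one group.
function rolesForGroup(groupName) {
  return [
    `webpubsub.joinLeaveGroup.${groupName}`,
    `webpubsub.sendToGroup.${groupName}`
  ];
}

// `service` is assumed to be a WebPubSubServiceClient for the requested hub;
// `isMemberOfConversation` is your own membership/authorization check.
async function clientAccessUrlHandler(service, isMemberOfConversation, req, res) {
  const { groupName, userId } = req.body;
  if (!(await isMemberOfConversation(userId, groupName))) {
    return res.status(403).end(); // never sign URLs for groups the caller can't access
  }
  const token = await service.getClientAccessToken({
    userId,
    roles: rolesForGroup(groupName),
    expirationTimeInMinutes: 5 // short TTL limits replay risk
  });
  res.json({ url: token.url });
}
```

Scoping the roles to a single group keeps one leaked URL from granting access to every conversation.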
Part 1 — Chat architecture (real-time + durable)
Key idea: persist first, then broadcast
A reliable chat system must behave well even when:
- the sender loses connection mid-send
- the receiver is offline
- the browser tab is suspended
- Web PubSub has transient delivery issues
The most robust pattern is:
- Send the message to your HTTP API (so it can be stored durably).
- If the API succeeds, broadcast the stored message via Web PubSub to the conversation group.
- If broadcast fails, the message is still safe in the database; clients can fetch it on reconnect.
Example send path (pseudocode)
```js
// postJson and pubsubClient are assumed to be provided by the app
async function sendMessage({ conversationId, text }) {
  // 1) persist
  const saved = await postJson('/chat/reply', { conversationId, message: text });
  // saved includes messageId, timestamp, sender, etc.

  // 2) broadcast (best-effort)
  await pubsubClient.sendToGroup(
    saved.conversationContext,          // groupName
    { conversationId, message: saved }, // payload
    'json'
  );
}
```

Why this is so effective:
- Durability is decoupled from delivery.
- You can retry broadcast without corrupting the message timeline.
- Offline users catch up via history APIs.
Conversation group model
Use one group per conversation, for example:
- group name = `conversation:{conversationId}` (recommended), or
- group name = a stable, sanitized “conversation context” string
Recommendation (better than many ad-hoc implementations):
- Prefer a stable ID-based group name (`conversation:{id}`) rather than free-form text.
- Keep the human-readable context as metadata, not as the group identifier.
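A hypothetical helper that enforces this convention might look like the following: the group name is derived only from a validated conversation ID, never from free-form context text.

```js
// Build the group name from the conversation ID alone; reject anything
// that isn't a simple, URL-safe identifier.
function conversationGroup(conversationId) {
  if (!/^[A-Za-z0-9_-]+$/.test(conversationId)) {
    throw new Error(`invalid conversation id: ${conversationId}`);
  }
  return `conversation:${conversationId}`;
}
```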
Receiving messages
```js
client.on('group-message', (e) => {
  const msg = e.message.data;
  appendToUI(msg);
});
```

Reconnects that don’t melt your system
Best practice:
- reconnect with exponential backoff + jitter
- rebuild the signed client URL on retry (it may have expired)
- separate “manual close” vs “network drop” so you don’t reconnect after user closes
```js
async function reconnectWithBackoff(makeClient, attempts = 5) {
  for (let i = 1; i <= attempts; i++) {
    try {
      return await makeClient();
    } catch (e) {
      if (i === attempts) throw e;
      await sleep(backoff(i) + jitter());
    }
  }
}

// helpers: exponential backoff capped at 30s, plus up to 1s of random jitter
const sleep = (ms) => new Promise((r) => setTimeout(r, ms));
const backoff = (i) => Math.min(1000 * 2 ** i, 30000);
const jitter = () => Math.random() * 1000;
```

Part 2 — Real-time in-app notifications using the same Web PubSub approach
In-app notifications are the same pattern as chat:
- separate hub or group for notifications
- a client that connects and listens for group messages
- an API-backed store for history/read-state
Notification fanout patterns (choose one)
- Broadcast to all users (group = `notifications:global`). Use when notifications are not personalized.
- Broadcast to a user-specific group (group = `notifications:user:{userId}`). Best for personalization and privacy.
- Broadcast to a segment group (group = `notifications:team:{teamId}`). Good for teams/roles/regions.
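On the publishing side, all three patterns send the same way; only the group name changes. A sketch, where `service` stands in for a `WebPubSubServiceClient` on the notifications hub (the official `@azure/web-pubsub` SDK exposes `group(...).sendToAll(...)`) and the group-name builders are assumptions:

```js
// Map a fanout scope to its group name.
function notificationGroup(scope, id) {
  switch (scope) {
    case 'global': return 'notifications:global';
    case 'user':   return `notifications:user:${id}`;
    case 'team':   return `notifications:team:${id}`;
    default: throw new Error(`unknown scope: ${scope}`);
  }
}

// Deliver one notification to every connection joined to the target group.
async function publishNotification(service, scope, id, notification) {
  await service.group(notificationGroup(scope, id)).sendToAll(notification);
}
```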
Example (pseudocode)
```js
class NotificationService {
  listeners = [];

  async start({ userId }) {
    this.client = await connectToHub({
      hubName: 'notifications',
      groupName: `notifications:user:${userId}`,
      userId
    });
    this.client.on('group-message', (e) => {
      const notif = normalizeNotification(e.message.data);
      this.listeners.forEach(fn => fn(notif));
    });
  }
}
```

Hybrid model: API load + real-time updates (recommended)
Do both:
- initial load from API (history + unread state)
- live updates via Web PubSub
This ensures:
- fast first paint
- correctness even if a client missed events while offline
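The hybrid approach can be sketched as follows: render history from the API first, then apply live events, deduplicating by `messageId` so a message seen via both paths renders once. `fetchHistory`, `render`, and the event shape are assumptions.

```js
// Tracks which messages have been rendered, keyed by messageId.
class MessageTimeline {
  constructor() {
    this.seen = new Set();
    this.messages = [];
  }
  add(msg) {
    if (this.seen.has(msg.messageId)) return false; // duplicate: ignore
    this.seen.add(msg.messageId);
    this.messages.push(msg);
    return true;
  }
}

async function openConversation(client, fetchHistory, conversationId, render) {
  const timeline = new MessageTimeline();
  // 1) fast first paint from the durable store
  for (const msg of await fetchHistory(conversationId)) timeline.add(msg);
  render(timeline.messages);
  // 2) live updates; duplicates (e.g. replays after reconnect) are dropped
  client.on('group-message', (e) => {
    if (timeline.add(e.message.data)) render(timeline.messages);
  });
  return timeline;
}
```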
Part 3 — The missing piece: “no missed messages” with SQL Server
Web PubSub is excellent for live delivery, but it is not your system of record. If you want “no missed messages”, you need durable storage plus a safe publish pipeline.
Reliability goal
If a message is sent and:
- the broadcast fails, or
- a receiver is offline, or
- a browser reconnects late,
…the message must still appear when the conversation is reopened.
Best-practice backend: Store → Outbox → Publish
This is the safest approach for SQL-backed systems:
Client
|
| 1) POST /chat/reply
v
API Service
| 2) Begin DB transaction
| - insert message row (SQL Server)
| - insert outbox event row (same transaction)
| - commit
v
Outbox Processor (background worker)
| 3) Reads unprocessed outbox rows
| 4) Publishes to Web PubSub group
| 5) Marks outbox row processed (idempotent)
v
Web PubSub -> connected clients

Why this works:
- If Web PubSub is temporarily unavailable, messages are still stored.
- You can retry delivery without duplicating writes.
- Clients can always catch up by calling `/conversation/detail`.
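Steps 3–5 can be sketched as a minimal processor loop. `store` and `publish` are injected abstractions over SQL Server and Web PubSub; the method names are assumptions, not a specific library’s API.

```js
// Process one batch of outbox rows; failed publishes stay unprocessed
// and are picked up again on the next run.
async function processOutboxBatch(store, publish, batchSize = 50) {
  const events = await store.fetchUnprocessed(batchSize);   // step 3
  let published = 0;
  for (const event of events) {
    try {
      await publish(event);                                 // step 4
      await store.markProcessed(event.eventId);             // step 5 (idempotent)
      published++;
    } catch (err) {
      await store.recordRetry(event.eventId);               // retried later
    }
  }
  return published;
}
```

Run this on a timer or a queue trigger; because `markProcessed` is idempotent, a crash between publish and mark only produces a duplicate broadcast, which clients dedupe by message ID.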
Suggested SQL Server tables (minimum viable)
- `Conversations(ConversationId, Context, Status, CreatedBy, CreatedAt, …)`
- `ConversationUsers(ConversationId, UserId/UserEmail, Role, JoinedAt, …)`
- `Messages(MessageId, ConversationId, SenderId, Body, CreatedAt, …)`
- `OutboxEvents(EventId, Type, PayloadJson, CreatedAt, ProcessedAt, RetryCount, …)`
Idempotency and dedupe (do this even if you think you don’t need it)
- Make `MessageId` unique in the DB.
- Make `OutboxEvents.EventId` unique.
- Publish events with a stable identifier so clients can ignore duplicates.
Catch-up strategy (how clients avoid gaps)
Even with perfect broadcast, clients can miss real-time messages (tab sleeping, network drops). The safest UX is:
- On open/reconnect: call `/conversation/detail` and render from the DB.
- During an active session: append live `group-message` events for responsiveness.
Better ways / improvements over a basic implementation
1) Use ID-based group names (not free text)
Better: `conversation:{conversationId}`
Avoid:
- free-form context strings as identifiers (they can be long, unsafe, or collide).
2) Don’t broadcast from the request thread (use outbox + worker)
If you broadcast inside the API request handler, you risk:
- slow sends under load
- partial failures (DB commit succeeded, broadcast timed out, user sees “failed”)
Better: commit DB + outbox, then publish asynchronously.
3) Add delivery semantics explicitly
Web PubSub delivery is best-effort to connected clients. If you need stronger semantics:
- Use outbox processing + retry.
- Add “last seen messageId” per user and support incremental fetch.
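Incremental fetch on reconnect might look like this sketch, where the `/conversation/messages` query shape and the `fetchJson` helper are assumptions:

```js
// Fetch only messages after the last one this user has seen, apply them,
// and return the new high-water mark.
async function catchUp(fetchJson, conversationId, lastSeenMessageId, apply) {
  const missed = await fetchJson(
    `/conversation/messages?conversationId=${encodeURIComponent(conversationId)}` +
    `&after=${encodeURIComponent(lastSeenMessageId ?? '')}`
  );
  for (const msg of missed) apply(msg);
  return missed.length ? missed[missed.length - 1].messageId : lastSeenMessageId;
}
```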
4) Use push notifications for offline delivery
In-app real-time via Web PubSub is great when the app is open. For offline/locked screens:
- Web Push / Firebase Cloud Messaging (web/mobile)
- Azure Notification Hubs (mobile-heavy scenarios)
5) Handle attachments properly
Don’t stream large images/files through PubSub:
- upload to blob storage
- send a message containing a secure URL + metadata
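A sketch of that pattern, where `uploadToBlob` is a hypothetical wrapper over your blob-storage SDK (e.g. returning a time-limited SAS URL):

```js
// Upload the binary first, then broadcast a small message carrying only
// a secure URL plus metadata — never the file bytes themselves.
async function sendAttachment(pubsubClient, uploadToBlob, groupName, file, sender) {
  const { url, contentType, size } = await uploadToBlob(file);
  await pubsubClient.sendToGroup(groupName, {
    type: 'attachment',
    sender,
    attachment: { url, contentType, size, name: file.name }
  }, 'json');
}
```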
6) Security hardening
- Sign client URLs with short TTL.
- Validate group membership in the token-issuing backend.
- Use per-user or per-conversation authorization checks on every API call.
Closing notes
The key mindset shift:
Treat Web PubSub as a delivery channel, not your database.
When you combine:
- Web PubSub for managed fanout
- SQL Server for durability
- an outbox for reliable publishing
…you get real-time UX without sacrificing correctness or cost control.