WAHA Scaling - How To Handle 500+ Sessions

August 14, 2024 in Tips by devlikeapro - 6 minutes

Overview

This article is for people who want to scale their WhatsApp API for customers (CRM, SaaS, or other services) and need to handle a large number (>100) of 🖥️ Sessions (WhatsApp Accounts).

If you’re using WAHA for 1-10 sessions - just make sure to follow the 🔧 Install & Update guide. It handles all the necessary steps to make it work. 🚀

There are two ways to scale WAHA:

Vertical Scaling - adding more resources (CPU, RAM) to a single server to handle more sessions. That’s a good way to go if you need to handle up to 50 sessions (WEBJS) or 500 sessions (NOWEB).
Horizontal Scaling - adding more servers to handle more sessions. Requires a bit more work to set up, but it’s the best way to go if you need to handle more than 500 sessions.

Vertical Scaling

Vertical Scaling is the process of adding more resources (CPU, RAM) to a single server to handle more sessions.

Assuming you’ve followed the 🔧 Install & Update guide, you are running a single WAHA server that hosts all of your sessions.

How many sessions can you run by adding more resources (CPU and RAM) to a single WAHA server?

Here’s an approximate example of how many sessions you can run on a single server using the Vertical Scaling approach:

🏭 Engine	Sessions	CPU	Memory
WEBJS	10	270%	2.5GB
WEBJS	50	1500%	20GB
NOWEB	50	150%	4GB
NOWEB	500	300%	30GB

👉 The benchmark may differ from case to case, depending on usage patterns - how many messages you receive, how many you send, etc.

So if you need to run up to 50 sessions on the WEBJS engine or up to 500 sessions on NOWEB - you can just keep adding more resources (CPU and RAM) to the single server! Fast to scale, easy to manage. 🎉

If you want to run more sessions - you need to consider Horizontal Scaling. It’s not safe to run more than the above numbers on a single server!
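
The thresholds above can be turned into a quick check. The following is a minimal sketch, not part of WAHA: the function name `scaling_strategy` and the `MAX_SESSIONS_PER_SERVER` table are hypothetical, and the limits are simply the approximate numbers from this article.

```python
# Approximate per-server session limits taken from the benchmark above.
# These are the article's rough numbers, not hard limits enforced by WAHA.
MAX_SESSIONS_PER_SERVER = {"WEBJS": 50, "NOWEB": 500}


def scaling_strategy(engine: str, sessions: int) -> str:
    """Return "vertical" if one server should cope, otherwise "horizontal"."""
    limit = MAX_SESSIONS_PER_SERVER[engine.upper()]
    return "vertical" if sessions <= limit else "horizontal"


if __name__ == "__main__":
    print(scaling_strategy("WEBJS", 30))
    print(scaling_strategy("NOWEB", 1200))
```

Since the benchmark depends on usage patterns, you may want to lower these limits for message-heavy workloads.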

Horizontal Scaling - Sharding

Horizontal Scaling is the process of adding more servers to handle more sessions.

Right now, the only way to do it is to run multiple WAHA instances and distribute the sessions between them in Your Application logic. This technique is called Sharding.

Here are the key points for setting up Horizontal Scaling using the Sharding technique:

You run multiple WAHA instances listening on different hostnames (http://waha1.example.com, http://waha2.example.com, etc.) or ports (http://waha.example.com:3001, http://waha.example.com:3002, etc.).
You save the url, api_key, and capacity of each instance in Your Application Database (see Entities Schema).
When a new user asks to run a new session - you follow the Where to run a new session? logic to find a suitable WAHA instance and save the user <-> session <-> server association to Your Application Database.
When you need to send a request to WhatsApp API - you follow the Where to find the session? logic to find the WAHA instance to send the request.
All webhooks come to Your Application directly from each WAHA instance, so they need no extra routing.

We’ll guide you through the process of setting up Horizontal Scaling using the Sharding technique in the next sections.

👉 Please note that each WAHA Worker must either have its own database or have a unique WAHA_WORKER_ID=waha{N} environment variable set when workers share the same File Storage or MongoDB URL (the URL of the MongoDB server, not of a specific database).
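
As an illustration of the port-based layout and the worker id note above, here is a small sketch that generates one URL and one WAHA_WORKER_ID per worker. The helper `worker_specs` and the shape of its output are hypothetical. Only the `WAHA_WORKER_ID=waha{N}` pattern and the example host and ports come from this article.

```python
def worker_specs(host: str, first_port: int, count: int) -> list:
    """Describe `count` WAHA workers on one host, one port and worker id each."""
    specs = []
    for n in range(1, count + 1):
        specs.append({
            "url": f"http://{host}:{first_port + n - 1}",
            # Unique per worker, following the waha{N} pattern from the note above.
            "env": {"WAHA_WORKER_ID": f"waha{n}"},
        })
    return specs


if __name__ == "__main__":
    for spec in worker_specs("waha.example.com", 3001, 3):
        print(spec["url"], spec["env"])
```

How you pass these environment variables to each instance depends on how you deploy WAHA.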

Entities Schema

In order to save WAHA instances and session associations, you need to have the following entities in Your Application Database:

Worker

Worker represents a single WAHA instance that can handle sessions.

id - unique identifier
url - URL of the WAHA instance, http://waha1.example.com, http://waha2.example.com, etc.
api_key - API Key to authorize requests
capacity - how many sessions can be run on the WAHA instance (for simplicity, we’re using a single field, but it can be a new model AvailableSession or similar).

By setting capacity, you can manage the WAHA Worker usage and prevent overloading.

User

User is a user of Your Application that can run sessions.

👉 You can use either User or Tenant or Organization - it completely depends on your application logic and business model. We’ll use User for simplicity with a single field:

id - unique identifier

WAHASession

WAHASession represents a single session that is running on the WAHA Worker associated with the User.

id - unique identifier
name - WAHA session name
user_id - reference to the User
worker_id - reference to the Worker
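
The three entities above could be modelled, for example, like this. It is only a sketch using plain Python dataclasses. The field names follow the lists above, while the integer id types are an assumption. In a real application these would usually be tables or ORM models in Your Application Database.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    id: int
    url: str        # e.g. "http://waha1.example.com"
    api_key: str    # API key used to authorize requests to this instance
    capacity: int   # how many sessions this instance can run


@dataclass
class User:
    id: int


@dataclass
class WAHASession:
    id: int
    name: str       # WAHA session name
    user_id: int    # reference to User.id
    worker_id: int  # reference to Worker.id


if __name__ == "__main__":
    worker = Worker(id=1, url="http://waha1.example.com", api_key="secret", capacity=50)
    user = User(id=1)
    session = WAHASession(id=1, name="user-1", user_id=user.id, worker_id=worker.id)
    print(session)
```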

Where to run a new session?

When a new user asks to run a new session - you need to find a suitable WAHA instance to run it. You can simply get a list of Workers with capacity>0 and pick the one with the highest capacity.

This is just an example with simple logic. You can adjust it and distribute WhatsApp sessions based on country, proxy settings, customer level, etc.
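
The simple selection logic described above can be sketched as follows. The function `pick_worker` is hypothetical, and the sketch assumes that `capacity` holds the number of free session slots and is decreased each time a session is assigned.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    id: int
    url: str
    api_key: str
    capacity: int  # assumption: remaining free session slots


def pick_worker(workers: list) -> Worker:
    """Pick the worker with the most free capacity and reserve one slot."""
    available = [w for w in workers if w.capacity > 0]
    if not available:
        raise RuntimeError("no WAHA worker has free capacity")
    best = max(available, key=lambda w: w.capacity)
    best.capacity -= 1  # reserve the slot for the new session
    return best


if __name__ == "__main__":
    workers = [
        Worker(1, "http://waha1.example.com", "key1", 0),
        Worker(2, "http://waha2.example.com", "key2", 3),
        Worker(3, "http://waha3.example.com", "key3", 5),
    ]
    print(pick_worker(workers).url)
```

With a real database, the lookup and the capacity update would normally happen in one transaction, so that two simultaneous requests cannot reserve the same slot.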

Where to find the session?

When you need to send a request to WhatsApp API - you need to find the WAHA instance to send the request. You can simply get the worker_id from the WAHASession and send the request to the WAHA instance using the url and api_key.

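
The lookup described above can be sketched like this. The helper `build_request` and the in-memory dictionaries are hypothetical stand-ins for Your Application Database. The `X-Api-Key` header name and the `/api/sendText` path are assumptions here, so check the API reference of your WAHA version. The sketch only prepares the request and does not send it.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    id: int
    url: str
    api_key: str


def build_request(workers: dict, session_to_worker: dict,
                  session_name: str, path: str, payload: dict) -> dict:
    """Resolve the worker that runs `session_name` and prepare an HTTP request."""
    worker = workers[session_to_worker[session_name]]
    return {
        "url": worker.url.rstrip("/") + path,
        "headers": {"X-Api-Key": worker.api_key},  # assumed header name
        "json": {"session": session_name, **payload},
    }


if __name__ == "__main__":
    workers = {
        1: Worker(1, "http://waha1.example.com", "key1"),
        2: Worker(2, "http://waha2.example.com", "key2"),
    }
    session_to_worker = {"alice": 2}  # WAHASession.name -> WAHASession.worker_id
    req = build_request(workers, session_to_worker, "alice",
                        "/api/sendText", {"chatId": "123@c.us", "text": "Hi"})
    print(req["url"])
```

The returned dictionary can be passed to the HTTP client of your choice. The important part is that the URL and API key always come from the worker stored for that session.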

Why this way?

WAHA is not a stateless application. It has a runtime state (not a state in the database sense, but still a state): the connection to WhatsApp (either a browser or a websocket connection), which cannot be moved automatically. All HTTP requests MUST therefore be “sticky”, meaning they MUST go only to the specific “worker” that has the “running” session.

This is why we cannot simply run more containers using Kubernetes Deployment/AWS ECS (though you can use StatefulSets for k8s). We need to care about WHERE we run the session and WHERE we send the request.

And a few more reasons:

Simple - you don’t need to worry about the load balancer, k8s, docker, etc.
Independent - you can run WAHA instances on different servers, different cloud providers, bare-metal, etc.
No single point of failure - if one WAHA instance goes down - the others are still working.
Flexible - you can configure HOW you distribute sessions across different servers based on your business logic.
Frees our hands - we can focus on building the best WhatsApp API and adding new features, and you can focus on building the best application. 😊

Single Dashboard - Multiple Servers

If you’re running multiple servers, you can run a dedicated WAHA 📊 Dashboard to have a single place from which to manage all of them.

After that, you can connect all servers to that single dashboard.

Horizontal Scaling - Auto-Scaling

🚧🔨⏳ Auto-Scaling IS NOT AVAILABLE out-of-the-box in WAHA yet! ⏳🔨🚧

We’re working on it, but it’s not ready yet, so we’re just giving you a future vision of how it will work.

The idea is to build WAHA Hub that will handle all API requests and distribute them to the WAHA Workers based on information about where each session is running.

It’ll also control (using underlying k8s or docker infrastructure) the number of workers based on the load.

Kindly support the project on the PRO tier if you wish to use this feature in the future! 🙏

For now, Vertical Scaling and Horizontal Scaling - Sharding are the ways to go.
