WAHA Scaling - How To Handle 500+ Sessions

Posted in Tips on August 14, 2024 by devlikeapro ‐ 6 min read

Overview

This article is for people who want to scale their WhatsApp API for customers (CRM, SaaS, or other services) and who need to handle a lot (>100) of 🖥️ Sessions (WhatsApp Accounts).

If you’re using WAHA for 1-10 sessions, just make sure to follow the 🔧 Install & Update guide. It covers all the steps needed to make it work. 🚀

There are two ways to scale WAHA:

  1. Vertical Scaling - adding more resources (CPU, RAM) to the single server to handle more sessions. That’s a good way to go if you need to handle up to 50 sessions (WEBJS) or 500 sessions (NOWEB).
  2. Horizontal Scaling - adding more servers to handle more sessions. It requires a bit more work to set up, but it’s the best way to go if you need to handle more than 500 sessions.

Vertical Scaling

Vertical Scaling is the process of adding more resources (CPU, RAM) to the single server to handle more sessions.

Assuming you’ve followed the 🔧 Install & Update guide, you have a single WAHA server that handles all of your sessions.

How many sessions can you run by adding more resources (CPU and RAM) to a single WAHA server?

Here’s an approximate example of how many sessions you can run on a single server using the Vertical Scaling approach:

🏭 Engine   Sessions   CPU     Memory
WEBJS       10         270%    2.5GB
WEBJS       50         1500%   20GB
NOWEB       50         150%    4GB
NOWEB       500        300%    30GB

👉 The benchmark may differ from case to case. It depends on your usage pattern: how many messages you receive, how many you send, etc.

So if you need to run up to 50 sessions on the WEBJS engine or up to 500 sessions on NOWEB, you can just keep adding more resources (CPU and RAM) to the single server! Fast to scale, easy to manage. 🎉

If you want to run more sessions - you need to consider Horizontal Scaling. It’s not safe to run more than the above numbers on a single server!
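
As a rough sizing aid, the per-server limits above can be turned into a minimum server count. This is only a sketch: the function name `servers_needed` is hypothetical, and the limits are the approximate figures from this article (50 for WEBJS, 500 for NOWEB), which depend on your usage pattern.

```python
import math

# Approximate per-server session limits quoted in this article.
# They are workload-dependent, so treat them as upper bounds, not guarantees.
MAX_SESSIONS_PER_SERVER = {"WEBJS": 50, "NOWEB": 500}


def servers_needed(total_sessions: int, engine: str) -> int:
    """Return the minimum number of WAHA servers for the given session count."""
    if total_sessions < 0:
        raise ValueError("total_sessions must be non-negative")
    limit = MAX_SESSIONS_PER_SERVER[engine]
    # Round up: a partially filled server is still a whole server.
    return math.ceil(total_sessions / limit)
```

For example, `servers_needed(1200, "NOWEB")` rounds 1200 / 500 up to 3 servers. In practice you would likely leave some headroom per server instead of filling each one to its limit.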

Horizontal Scaling - Sharding

Horizontal Scaling is the process of adding more servers to handle more sessions.

Right now, the only way to do this is to run multiple WAHA instances and distribute the sessions between them in Your Application logic, using the Sharding technique.

Here are the key points for setting up Horizontal Scaling using the Sharding technique:

  1. You run multiple WAHA instances listening on different hostnames (http://waha1.example.com, http://waha2.example.com, etc) or ports (http://waha.example.com:3001, http://waha.example.com:3002, etc).
  2. You save the url, api_key, and capacity of each instance in Your Application Database - see Entities Schema.
  3. When a new user asks to run a new session, you follow the “Where to run a new session?” logic to find a suitable WAHA instance, then save the user <-> session <-> server association in Your Application Database.
  4. When you need to send a request to the WhatsApp API, you follow the “Where to find the session?” logic to find the WAHA instance to send the request to.
  5. All webhooks come to Your Application directly from the WAHA instances, so you don’t need to worry about them.

We’ll guide you through the process of setting up Horizontal Scaling using Sharding technique in the next sections.

👉 Please note that each WAHA Worker must have its own storage - either File Storage or a MongoDB URL (not a database). Otherwise, the WHATSAPP_RESTART_ALL_SESSIONS=True option will restart ALL sessions in ALL workers whenever a worker restarts (you’d need to disable it and run the session restart logic in your application).

Entities Schema

To store the associations between WAHA instances and sessions, you need the following entities in Your Application Database:

Worker

Worker represents a single WAHA instance that can handle sessions.

  • id - unique identifier
  • url - URL of the WAHA instance, http://waha1.example.com, http://waha2.example.com, etc
  • api_key - API Key to authorize requests
  • capacity - how many sessions can be run on the WAHA instance (for simplicity, we’re using a single field, but it can be a new model AvailableSession or similar).

By setting capacity you can manage the WAHA Worker usage and prevent overloading.

User

User is a user of Your Application that can run sessions.

  • id - unique identifier

👉 You can use User, Tenant, or Organization here; it depends entirely on your application logic and business model. We’ll use User with a single field for simplicity.

WAHASession

WAHASession represents a single session that is running on the WAHA Worker associated with the User.

  • id - unique identifier
  • name - WAHA session name
  • user_id - reference to the User
  • worker_id - reference to the Worker
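
The three entities above could be stored in a relational database along these lines. This is a minimal sketch using Python’s built-in `sqlite3`; the table names (`worker`, `app_user`, `waha_session`), the column types, and the uniqueness constraints are assumptions for illustration, not something WAHA prescribes.

```python
import sqlite3

# Hypothetical schema for the Worker, User, and WAHASession entities.
# "app_user" is used instead of "user" because "user" is reserved in some databases.
SCHEMA = """
CREATE TABLE worker (
    id       INTEGER PRIMARY KEY,
    url      TEXT    NOT NULL UNIQUE,
    api_key  TEXT    NOT NULL,
    capacity INTEGER NOT NULL CHECK (capacity >= 0)
);
CREATE TABLE app_user (
    id INTEGER PRIMARY KEY
);
CREATE TABLE waha_session (
    id        INTEGER PRIMARY KEY,
    name      TEXT    NOT NULL,
    user_id   INTEGER NOT NULL REFERENCES app_user(id),
    worker_id INTEGER NOT NULL REFERENCES worker(id),
    UNIQUE (worker_id, name)  -- assumed: session names are unique per worker
);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the three tables and turn on foreign key enforcement."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_schema(conn)
```

With foreign keys enabled, a `waha_session` row cannot point at a worker or user that does not exist, which keeps the user <-> session <-> server association consistent.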

Where to run a new session?

When a new user asks to run a new session, you need to find a suitable WAHA instance to run it on. You can simply get the list of workers with capacity>0 and pick the one with the highest capacity.

It’s just an example with simple logic. You can adjust it and distribute WhatsApp sessions based on country, proxy settings, customer level, etc.
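
The simple selection logic described above might look like the following in-memory sketch. The function names `pick_worker` and `assign_session` are hypothetical, and it assumes that `capacity` means the number of free session slots left on a worker, so it is decremented on every assignment. A real application would do this inside a database transaction.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Worker:
    id: int
    url: str
    api_key: str
    capacity: int  # assumed: free session slots remaining on this worker


@dataclass
class WAHASession:
    id: int
    name: str
    user_id: int
    worker_id: int


def pick_worker(workers: list[Worker]) -> Optional[Worker]:
    """Return the worker with the most free capacity, or None if all are full."""
    candidates = [w for w in workers if w.capacity > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda w: w.capacity)


def assign_session(
    workers: list[Worker], sessions: list[WAHASession], user_id: int, name: str
) -> WAHASession:
    """Pick a worker, reserve one slot on it, and record the association."""
    worker = pick_worker(workers)
    if worker is None:
        raise RuntimeError("no WAHA worker has free capacity")
    worker.capacity -= 1  # reserve the slot
    session = WAHASession(
        id=len(sessions) + 1, name=name, user_id=user_id, worker_id=worker.id
    )
    sessions.append(session)
    return session
```

To distribute by country, proxy settings, or customer level, you could filter `candidates` on extra worker attributes before taking the maximum.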

Where to find the session?

When you need to send a request to WhatsApp API - you need to find the WAHA instance to send the request. You can simply get the worker_id from the WAHASession and send the request to the WAHA instance using the url and api_key.
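
The lookup described above can be sketched as follows. The helper names `resolve_worker` and `build_request` are hypothetical. The `X-Api-Key` header and whatever endpoint path you pass in are assumptions here, so check them against the WAHA API documentation for your version. The code only builds the request and does not send it.

```python
import json
import urllib.request
from dataclasses import dataclass


@dataclass
class Worker:
    id: int
    url: str
    api_key: str


@dataclass
class WAHASession:
    id: int
    name: str
    user_id: int
    worker_id: int


def resolve_worker(session: WAHASession, workers: dict[int, Worker]) -> Worker:
    """Find the worker that runs the session, using the stored worker_id."""
    try:
        return workers[session.worker_id]
    except KeyError:
        raise LookupError(f"unknown worker {session.worker_id}") from None


def build_request(
    worker: Worker, path: str, payload: dict
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request aimed at the session's worker."""
    return urllib.request.Request(
        url=worker.url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Api-Key": worker.api_key,  # assumed header name, verify in WAHA docs
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it. Because the worker URL always comes from the stored association, every request for a session reaches the instance that actually runs it.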

Why this way?

WAHA is not a stateless application. It has a runtime state (not technically a state as in a database, but still a state): the connection to WhatsApp (either a browser or a websocket connection), which cannot be moved automatically. So all HTTP requests MUST be “sticky”, meaning each one MUST go only to a certain “worker” - the one with the “running” session.

This is why we can’t simply run more containers using a Kubernetes Deployment or AWS ECS (though you can use StatefulSets for k8s): we need to care about WHERE we run the session and WHERE we send the request.

A few more reasons:

  • Simple - you don’t need to worry about the load balancer, k8s, docker, etc.
  • Independent - you can run WAHA instances on different servers, different cloud providers, bare-metal, etc.
  • No single point of failure - if one WAHA instance goes down - the others are still working.
  • Flexible - you can configure HOW you distribute sessions across different servers based on your business logic.
  • Frees our hands - we can focus on building the best WhatsApp API and adding new features, and you can focus on building the best application. 😊

Single Dashboard - Multiple Servers

If you’re running multiple servers, you can run a dedicated WAHA 📊 Dashboard to have a single place from which you can manage all servers.

After that, you can connect all servers to the single dashboard.

Horizontal Scaling - Auto-Scaling

🚧🔨⏳ Auto-Scaling IS NOT AVAILABLE out-of-the-box in WAHA yet! ⏳🔨🚧

We’re working on it, but it’s not ready yet, so we’re just giving you a vision of how it will work in the future.

The idea is to build WAHA Hub that will handle all API requests and distribute them to the WAHA Workers based on information where each session is running.

It’ll also control (using underlying k8s or docker infrastructure) the number of workers based on the load.

Kindly support the project on the PRO tier if you wish to use this feature in the future! 🙏

For now, Vertical Scaling and Horizontal Scaling - Sharding are the ways to go.