Place WhatsApp voice calls from Node.js.
Wraps WhatsApp Web's official VoIP WASM stack and uses Baileys for authentication and signaling. Audio (MP3, WAV, or Float32Array) is encoded with Opus and sent over the live RTP session.
Author: ShellTear
Status
- ✅ Outbound 1:1 voice calls
- ✅ Stream audio from MP3/WAV files
- ✅ Receive remote audio as Float32Array
- ✅ Mute / unmute / hang up
- ❌ Group calls
- ❌ Video
- ❌ Inbound calls
Requirements
- Node.js ≥ 20
- ffmpeg on PATH (used to decode/resample audio sources)
- A linked WhatsApp account (you'll scan a QR on first run)
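A quick preflight check can confirm the first two requirements before you try to connect. The sketch below is not part of baileys-caller; `nodeMeetsMinimum` and `ffmpegOnPath` are hypothetical helper names, and the check assumes that `ffmpeg -version` exits with status 0 when the binary is installed.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical helper: true when a Node.js version string such as "v20.11.1"
// or "20.11.1" has a major version of at least `minMajor`.
function nodeMeetsMinimum(version: string, minMajor = 20): boolean {
  const [majorText = ""] = version.replace(/^v/, "").split(".");
  const major = Number.parseInt(majorText, 10);
  return Number.isInteger(major) && major >= minMajor;
}

// Hypothetical helper: true when `ffmpeg -version` can be spawned and exits 0.
function ffmpegOnPath(): boolean {
  const result = spawnSync("ffmpeg", ["-version"], { stdio: "ignore" });
  return result.error === undefined && result.status === 0;
}

if (!nodeMeetsMinimum(process.version)) {
  console.error(`Node.js >= 20 is required, found ${process.version}`);
}
if (!ffmpegOnPath()) {
  console.error("ffmpeg was not found on PATH");
}
```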
Install
This package isn't published on npm. Pull it in directly from git:
git clone https://github.com/SheIITear/baileys-caller
cd baileys-caller
npm install
npm run build
You can also depend on it from another project via a git URL in package.json:
{
  "dependencies": {
    "baileys-caller": "git+https://github.com/SheIITear/baileys-caller.git",
    "@whiskeysockets/baileys": "^7.0.0-rc11"
  }
}
@whiskeysockets/baileys is a peer dependency — install it in your project alongside this one.
Quick Start
import { VoipClient } from "baileys-caller";
const client = new VoipClient({ authDir: "./auth" });
await client.connect(); // first run prints a QR for WhatsApp > Linked Devices
const call = await client.call("12345678901", {
  audioSource: "./hello.mp3",
});
call.on("ringing", () => console.log("ringing"));
call.on("connected", () => console.log("connected"));
call.on("audio", (pcm) => { /* 16 kHz mono Float32Array from the peer */ });
call.on("ended", (reason) => console.log("ended:", reason));
await call.waitForEnd();
client.disconnect();
Run the bundled example from a clone:
npx tsx examples/call.mts ./auth 12345678901 ./hello.mp3
API
new VoipClient(options)
| Option | Type | Description |
|---|---|---|
| authDir | string | Baileys multi-file auth state directory |
client.connect(): Promise<void>
Connects to WhatsApp. On first run a QR code is printed; scan it from WhatsApp > Settings > Linked Devices. Subsequent runs reuse authDir.
client.call(phoneNumber, opts?): Promise<ActiveCall>
Places an outbound call. phoneNumber is digits only (e.g. "12345678901").
| Option | Type | Description |
|---|---|---|
| audioSource | string \| "silence" | Path to MP3/WAV, or "silence" for an empty stream |
| durationMs | number? | Auto-hangup after N ms |
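Because phoneNumber must be digits only, user-supplied input usually needs cleaning before it reaches client.call(). A minimal sketch follows; `toDigits` is a hypothetical helper, not part of baileys-caller, and it does not validate country codes or rewrite prefixes such as a leading "00".

```typescript
// Hypothetical helper: strip everything except ASCII digits from a phone
// number and reject input that contains no digits at all.
function toDigits(input: string): string {
  const digits = input.replace(/\D/g, "");
  if (digits.length === 0) {
    throw new Error(`No digits found in phone number: "${input}"`);
  }
  return digits;
}

console.log(toDigits("+1 (234) 567-8901")); // prints 12345678901
```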
client.disconnect(): void
Closes the WhatsApp socket and releases resources.
ActiveCall
Returned by client.call(). Extends EventEmitter.
Events
| Event | Payload | When |
|---|---|---|
| ringing | none | Remote device is ringing |
| connected | none | Call answered, media flowing |
| audio | Float32Array | 16 kHz mono PCM frame from the remote peer |
| ended | string | Call ended (hangup, timeout, rejected) |
| error | Error | Fatal error |
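The audio event delivers raw samples, so saving the remote side of a call means wrapping them in a container yourself. The sketch below is not part of baileys-caller: `encodeWav` is a hypothetical helper that concatenates Float32Array frames into a 16-bit PCM mono WAV buffer at 16 kHz. It assumes samples lie between -1 and 1, as is conventional for float PCM; this README does not state the range.

```typescript
// Hypothetical helper: pack Float32Array frames into a 16-bit PCM mono WAV buffer.
function encodeWav(frames: Float32Array[], sampleRate = 16000): Buffer {
  const totalSamples = frames.reduce((n, f) => n + f.length, 0);
  const dataBytes = totalSamples * 2; // 2 bytes per 16-bit sample
  const out = Buffer.alloc(44 + dataBytes);

  // 44-byte RIFF/WAVE header for uncompressed PCM.
  out.write("RIFF", 0, "ascii");
  out.writeUInt32LE(36 + dataBytes, 4);
  out.write("WAVE", 8, "ascii");
  out.write("fmt ", 12, "ascii");
  out.writeUInt32LE(16, 16);             // fmt chunk size
  out.writeUInt16LE(1, 20);              // format 1 = PCM
  out.writeUInt16LE(1, 22);              // channels = 1 (mono)
  out.writeUInt32LE(sampleRate, 24);
  out.writeUInt32LE(sampleRate * 2, 28); // byte rate = rate * channels * 2
  out.writeUInt16LE(2, 32);              // block align = channels * 2
  out.writeUInt16LE(16, 34);             // bits per sample
  out.write("data", 36, "ascii");
  out.writeUInt32LE(dataBytes, 40);

  let offset = 44;
  for (const frame of frames) {
    for (const sample of frame) {
      // Clamp, then scale to the signed 16-bit range.
      const clamped = Math.max(-1, Math.min(1, sample));
      const scaled = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
      out.writeInt16LE(Math.round(scaled), offset);
      offset += 2;
    }
  }
  return out;
}
```

With the call object from Quick Start, one way to use it is to push a copy of each frame (`pcm.slice()`, in case the underlying buffer is reused) into an array inside the audio handler, then write `encodeWav(frames)` to disk once the ended event fires.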
Methods
- call.end(): void. Hang up.
- call.mute(muted: boolean): void. Toggle outgoing mute.
- call.waitForEnd(): Promise<string>. Resolves with the end reason.
Properties
call.callId: string
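Because ActiveCall extends EventEmitter, the standard Node.js event utilities apply to it. The sketch below uses `once` from node:events with an AbortController to stop waiting when an event such as connected does not arrive in time. `waitForEvent` is a hypothetical helper, shown here on a plain EventEmitter standing in for a real call.

```typescript
import { EventEmitter, once } from "node:events";

// Hypothetical helper: resolve with the event's arguments, or reject with an
// AbortError if the event has not fired within `timeoutMs`.
async function waitForEvent(
  emitter: EventEmitter,
  event: string,
  timeoutMs: number,
): Promise<unknown[]> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await once(emitter, event, { signal: controller.signal });
  } finally {
    clearTimeout(timer); // do not keep the process alive after settling
  }
}

// Demo on a plain EventEmitter standing in for an ActiveCall.
const fakeCall = new EventEmitter();
setTimeout(() => fakeCall.emit("connected"), 10);
waitForEvent(fakeCall, "connected", 1000).then(() => {
  console.log("connected in time");
});
```

Note that `once` also rejects if the emitter fires an error event while waiting, which suits the error event listed above.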
How it works
- Baileys handles WhatsApp authentication, encryption, and signaling stanzas.
- The WhatsApp Web VoIP WASM stack runs in-process to negotiate the call, encode/decode Opus, and manage the RTP/SRTP session.
- A pthread pool of worker_threads mirrors the browser's Web Worker pool the WASM expects.
- Outbound audio is decoded with ffmpeg, resampled to 16 kHz mono, fed into the WASM, and delivered to the relay.
- Inbound audio is exposed as Float32Array chunks via the audio event.
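The outbound decode step can be approximated with a plain ffmpeg invocation. This sketches the general technique, not the library's actual command line, which this README does not show: it asks ffmpeg for raw 32-bit float little-endian samples at 16 kHz mono on stdout and converts the bytes to a Float32Array. `ffmpegDecodeArgs`, `bytesToFloat32`, and `decodeToPcm` are hypothetical names.

```typescript
import { spawn } from "node:child_process";

// Hypothetical helper: ffmpeg arguments that decode any supported input to
// raw 16 kHz mono float PCM on stdout.
function ffmpegDecodeArgs(inputPath: string): string[] {
  return [
    "-hide_banner", "-loglevel", "error",
    "-i", inputPath,
    "-f", "f32le",  // raw 32-bit float, little-endian
    "-ac", "1",     // mono
    "-ar", "16000", // 16 kHz
    "pipe:1",       // write to stdout
  ];
}

// Hypothetical helper: read little-endian floats out of a byte buffer,
// ignoring any trailing partial sample.
function bytesToFloat32(bytes: Buffer): Float32Array {
  const out = new Float32Array(Math.floor(bytes.length / 4));
  for (let i = 0; i < out.length; i++) {
    out[i] = bytes.readFloatLE(i * 4);
  }
  return out;
}

// Hypothetical helper: run ffmpeg and collect the decoded samples in memory.
function decodeToPcm(inputPath: string): Promise<Float32Array> {
  return new Promise((resolve, reject) => {
    const child = spawn("ffmpeg", ffmpegDecodeArgs(inputPath), {
      stdio: ["ignore", "pipe", "inherit"],
    });
    const chunks: Buffer[] = [];
    child.stdout.on("data", (chunk: Buffer) => chunks.push(chunk));
    child.on("error", reject); // for example, ffmpeg missing from PATH
    child.on("close", (code) => {
      if (code === 0) resolve(bytesToFloat32(Buffer.concat(chunks)));
      else reject(new Error(`ffmpeg exited with code ${code}`));
    });
  });
}
```

Buffering the whole file is fine for short clips; a long source would more likely be streamed in fixed-size frames.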
Auth state
authDir stores Baileys session keys after the first QR scan. Treat it like a credential — anyone with that directory can act as your linked device.
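On POSIX systems, one way to act on this is to restrict the directory to the current user before the first run. A minimal sketch follows; `ensurePrivateDir` is a hypothetical helper not provided by baileys-caller. Permission bits are largely ignored on Windows, and files already inside the directory keep their existing modes.

```typescript
import { chmodSync, mkdirSync, mkdtempSync, rmSync, statSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: create `dir` if needed and limit access to the owner.
function ensurePrivateDir(dir: string): void {
  mkdirSync(dir, { recursive: true, mode: 0o700 });
  // mkdir's mode is subject to umask and is not applied to a directory that
  // already exists, so set the permission bits explicitly as well.
  chmodSync(dir, 0o700);
}

// Demo in a throwaway temp directory; in practice pass your authDir instead.
const demoRoot = mkdtempSync(join(tmpdir(), "auth-demo-"));
const demoAuthDir = join(demoRoot, "auth");
ensurePrivateDir(demoAuthDir);
console.log("mode:", (statSync(demoAuthDir).mode & 0o777).toString(8));
rmSync(demoRoot, { recursive: true, force: true });
```

Keeping authDir out of version control (for example through .gitignore) follows the same reasoning.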
WASM resources
The WASM binary and its loader (whatsapp.wasm, loader.js, worker-modules.js) live under assets/wasm/. To refresh them from a current WhatsApp Web session:
npm run fetch-wasm