What is a PBX?
Think of a PBX (Private Branch Exchange) as a router for phone calls: just as your home WiFi router sends internet traffic to the right device, a PBX sends each call to the right phone line.
The client (our stakeholder) owns a Yeastar P550 PBX. It is a physical box in their office that manages all their phone lines. Businesses will get phone numbers from the client. When a customer dials one of those numbers, the Yeastar PBX forwards the call to our AI voice agent.
What is a SIP Trunk?
SIP (Session Initiation Protocol) is the language phones use to talk to each other. A SIP trunk is like an HTTP connection between two servers.
The PBX is configured with our server's IP address (72.62.70.61) and port (15060). When a call comes in, it sends an INVITE (like an HTTP POST request) to that address.
We currently have a "Peer Trunk" set up. This means: no username/password login needed. The PBX just sends calls directly to our IP address. Trust is based on IP, not on credentials.
What is the INVITE / Call-ID / SDP?
When a call starts, the PBX sends a SIP INVITE message. Think of it as the phone call equivalent of a new HTTP request.
INVITE = POST request saying "I want to start a call"
Call-ID = A unique request ID (like a UUID in your API). Each call gets its own Call-ID so the system knows which call is which, even when 10 calls happen at the same time.
SDP (Session Description Protocol) = The request body / payload. It contains: "I want to send audio using codec X, send it to my IP on port Y."
200 OK response = Our server's response saying: "Cool, I accept. Send YOUR audio to MY IP on port Z."
BYE = DELETE request. "This call is over, hang up."
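To make the mapping above concrete, here is a minimal sketch that pulls the Call-ID and the SDP audio details out of a raw INVITE. The sample message is hypothetical (the hostnames, tag, and the 203.0.113.10 address are placeholders), and real INVITEs from the PBX carry more headers than shown here.

```python
# Hypothetical sample INVITE. Payload types 0 and 8 are the G.711 codecs
# (PCMU and PCMA).
SAMPLE_INVITE = (
    "INVITE sip:+390551234567@72.62.70.61:15060 SIP/2.0\r\n"
    "Call-ID: a84b4c76e66710@pbx.example\r\n"
    "From: <sip:+393331112222@pbx.example>;tag=1928301774\r\n"
    "To: <sip:+390551234567@72.62.70.61>\r\n"
    "Content-Type: application/sdp\r\n"
    "\r\n"
    "v=0\r\n"
    "c=IN IP4 203.0.113.10\r\n"
    "m=audio 20014 RTP/AVP 0 8\r\n"
)


def parse_invite(raw: str) -> dict:
    """Split a SIP request into method, Call-ID, and basic SDP audio info."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, request_uri, _version = lines[0].split(" ", 2)

    # Header names are case-insensitive, so store them lowercased.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    # The SDP body says where the caller wants to receive audio.
    sdp = {"ip": None, "port": None, "payload_types": []}
    for line in body.splitlines():
        if line.startswith("c="):
            sdp["ip"] = line.split()[-1]
        elif line.startswith("m=audio"):
            parts = line.split()
            sdp["port"] = int(parts[1])
            sdp["payload_types"] = [int(p) for p in parts[3:]]

    return {
        "method": method,
        "request_uri": request_uri,
        "call_id": headers.get("call-id"),
        "sdp": sdp,
    }
```

In our setup LiveKit does this parsing for us, so this is only meant to show what the fields look like on the wire.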
What is RTP?
After the SIP "handshake" (INVITE + 200 OK), the actual voice audio flows over a different protocol called RTP (Real-time Transport Protocol).
Each active call needs its own RTP port. This is why the client's message mentions an "RTP port pool" (like 20000 to 20500). Each new call grabs a free port, so the audio streams never mix together.
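The port pool idea can be sketched in a few lines. This is an illustration only: the class name RtpPortPool and its methods are made up for this example, and it hands out one port per call to keep things simple.

```python
class RtpPortPool:
    """Hands out one free port per call and takes it back when the call ends."""

    def __init__(self, start: int, end: int) -> None:
        # Both ends of the range are inclusive, e.g. 20000 to 20500.
        self._free = set(range(start, end + 1))
        self._in_use = {}  # Call-ID -> port

    def acquire(self, call_id: str) -> int:
        # Asking twice for the same call returns the same port.
        if call_id in self._in_use:
            return self._in_use[call_id]
        if not self._free:
            raise RuntimeError("RTP port pool exhausted")
        port = min(self._free)
        self._free.remove(port)
        self._in_use[call_id] = port
        return port

    def release(self, call_id: str) -> None:
        # Releasing an unknown call is a no-op.
        port = self._in_use.pop(call_id, None)
        if port is not None:
            self._free.add(port)
```

Because each Call-ID maps to exactly one port, two audio streams can never land on the same socket.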
How Does This Map to Our System?
Here is the good news: LiveKit already handles most of the SIP complexity. The LiveKit SIP service acts as our SIP server. The flow:
1. LiveKit SIP listens for INVITE messages on port 15060
2. It automatically manages Call-IDs and RTP ports
3. It creates a LiveKit "Room" for each call
4. It converts phone audio (G.711) to WebRTC audio (Opus)
5. Our Python agent joins the Room and talks to the caller
What About Concurrent Calls?
The client says one business can receive multiple calls at the same time. This is handled at two levels:
Level 1 (PBX side, client handles): The Yeastar PBX sends each call as a separate INVITE with a unique Call-ID, so it does NOT confuse them. This is automatic.
Level 2 (Our side, we handle): LiveKit creates a separate Room for each call, and each Room gets its own agent instance. Currently, our session_gate.py limits us to ONE active session per org per GPU. This is the bottleneck we need to fix.
The Current "Hardcoded 1000" Problem
Right now, our LiveKit SIP is set up with a single rule: "any call to number 1000 goes to the agent." This was fine for testing, but it does not work for multi-tenant production:
Problem: If Business A's customer calls +39-055-1234567 and Business B's customer calls +39-055-7654321, both calls hit our server. How does our agent know which business's knowledge base and customer database to use?
Solution: We need to read the "called number" (the number the customer dialed) from the SIP INVITE, look it up in our sip_numbers table, find the matching organization_id, and load that org's config. This is the main routing problem we need to solve.
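A minimal sketch of the number-to-org lookup, assuming numbers are stored without separators. The dictionary stands in for the sip_numbers table, and the organization IDs and function names are made up for illustration.

```python
import re
from typing import Optional

# Stand-in for the sip_numbers table (phone_number -> organization_id).
# The organization IDs here are invented for the example.
SIP_NUMBERS = {
    "+390551234567": "org-business-a",
    "+390557654321": "org-business-b",
}


def normalize_number(raw: str) -> str:
    """Drop dashes, spaces, and similar separators; keep a leading '+'."""
    digits = re.sub(r"\D", "", raw)
    return "+" + digits if raw.strip().startswith("+") else digits


def find_org(called_number: str) -> Optional[str]:
    """Return the organization that owns the dialed number, or None."""
    return SIP_NUMBERS.get(normalize_number(called_number))
```

Normalizing first matters because the same number can arrive formatted differently (with dashes, spaces, or neither), and an exact string match would miss it.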
Diagram legend:
Blue arrows = LiveKit handles automatically.
Red arrows = NEW code we need to write (the number-to-org lookup).
Green arrows = Exists but needs updating for multi-tenant routing.
Purple = Active call (already working).
Where: LiveKit SIP configuration (likely via LiveKit API or config YAML). Check docs/Yeastar-setup.md for the current config.
How: Instead of matching called_number: "1000", configure a wildcard or regex match. The dispatch rule should put the called_number and caller_number into the Room metadata so the agent can read it.
Simple example: Think of it like changing a route from /api/agent/1000 to /api/agent/:phoneNumber. The number becomes a variable, not a hardcoded value.
Reference: LiveKit SIP dispatch rules docs: https://docs.livekit.io/agents/sip/
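The "number becomes a variable" idea can be sketched in plain Python. This is not LiveKit's dispatch rule syntax (check the docs linked above for that); the pattern and the function name are assumptions made for the example.

```python
import json
import re
from typing import Optional

# Assumption: called numbers are 3 to 15 digits with an optional leading '+'.
# The hardcoded rule was effectively the pattern "^1000$".
CALLED_NUMBER_PATTERN = re.compile(r"\+?\d{3,15}")


def build_room_metadata(called_number: str, caller_number: str) -> Optional[str]:
    """Return JSON metadata for the Room, or None if the number does not match."""
    if not CALLED_NUMBER_PATTERN.fullmatch(called_number):
        return None
    return json.dumps(
        {"called_number": called_number, "caller_number": caller_number}
    )
```

The old test extension 1000 still matches this pattern, so existing test calls would keep working alongside real numbers.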
The agent reads called_number from room metadata, queries the sip_numbers table, and gets the organization_id.
Where: services/supabase_client.py has lookup_org_by_sip_number() already. Check if it works properly and returns the right data, including any per-number config overrides.
Where (agent): agent.py and voice_agent.py. The agent's entrypoint function needs to: (1) check if it's a SIP call, (2) extract called_number, (3) call the lookup, (4) use that org_id for everything.
Simple example: Like a middleware in Express.js that reads the API key from the request header, looks up the tenant, and attaches req.tenant for all downstream handlers.
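The four entrypoint steps could look roughly like this. It assumes the Room metadata is a JSON string with a called_number key, and it takes the lookup as a parameter because the real signature of lookup_org_by_sip_number() is not shown here; resolve_org_id is a hypothetical name.

```python
import json
from typing import Callable, Optional


def resolve_org_id(
    room_metadata: str,
    lookup: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the org for a SIP call, or None for non-SIP or unknown calls."""
    try:
        meta = json.loads(room_metadata or "{}")
    except json.JSONDecodeError:
        return None
    if not isinstance(meta, dict):
        return None

    # Steps 1 and 2: a SIP call is assumed to carry called_number in metadata.
    called_number = meta.get("called_number")
    if not called_number:
        return None

    # Step 3: number -> organization. Step 4 is up to the caller, who uses
    # the returned org_id for all further config and database access.
    return lookup(called_number)
```

Returning None instead of raising lets the entrypoint decide what to do with calls that cannot be mapped to a tenant (for example, log and hang up).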
services/session_gate.py currently enforces ONE active session per organization per GPU. For SIP calls, we need to allow multiple concurrent calls per org (a restaurant might get 5 calls at once).
How: Change the gate from "1 session per org" to "N sessions per org" with a configurable limit. Or better: track the per-org concurrent count and reject new calls with a SIP 486 (Busy Here) when the limit is reached.
Consideration: GPU memory is limited. Each concurrent call needs its own STT + LLM context + TTS. For a single RTX 3090, maybe 3 to 5 concurrent sessions max. This should be configurable per org or per GPU.
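One possible shape for an "N sessions per org" gate is sketched below. The class and method names are hypothetical and do not reflect what session_gate.py contains today; it also keeps counts in memory, so it only works within a single process.

```python
import threading
from collections import defaultdict


class OrgSessionGate:
    """Counts active sessions per org and refuses new ones over the limit."""

    def __init__(self, default_limit: int = 1) -> None:
        self._default_limit = default_limit
        self._limits = {}                 # org_id -> custom limit
        self._active = defaultdict(int)   # org_id -> active session count
        self._lock = threading.Lock()

    def set_limit(self, org_id: str, limit: int) -> None:
        with self._lock:
            self._limits[org_id] = limit

    def try_acquire(self, org_id: str) -> bool:
        """Reserve a slot. False means the caller should answer 486 Busy Here."""
        with self._lock:
            limit = self._limits.get(org_id, self._default_limit)
            if self._active[org_id] >= limit:
                return False
            self._active[org_id] += 1
            return True

    def release(self, org_id: str) -> None:
        """Free a slot when the call ends. Never goes below zero."""
        with self._lock:
            if self._active[org_id] > 0:
                self._active[org_id] -= 1
```

The check and the increment happen under one lock so two calls arriving at the same moment cannot both take the last slot.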
(1) Load the org config from agent_instances, (2) check if the sip_numbers row has a config_override JSONB, (3) merge the override on top of the org config.
Where: services/tenant_config.py. The TenantConfig dataclass needs a merge method that accepts per-number overrides.
Why: A business might have number 1001 for orders (greeting: "Ready to take your order") and 1002 for support (greeting: "How can I help?"). Same org, different agent behavior.
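A minimal sketch of such a merge method. The three fields below are invented for the example; the real TenantConfig in services/tenant_config.py will have its own fields.

```python
from dataclasses import dataclass, fields, replace
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class TenantConfig:
    # Hypothetical fields, chosen only to illustrate the merge.
    greeting: str
    language: str
    system_prompt: str

    def merge(self, override: Optional[Mapping[str, Any]]) -> "TenantConfig":
        """Return a copy with the per-number override applied on top."""
        if not override:
            return self
        # Reject keys that are not config fields so typos fail loudly.
        known = {f.name for f in fields(self)}
        unknown = set(override) - known
        if unknown:
            raise ValueError(f"unknown config keys: {sorted(unknown)}")
        return replace(self, **override)
```

For the 1001/1002 case, both numbers would share one org config, and each sip_numbers row would carry only the keys that differ (here, the greeting).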
The sip_numbers table exists with basic columns (phone_number, organization_id, active).
New columns needed:
- label (text) = Human-friendly name like "Sales Line"
- config_override (jsonb) = Per-number agent config overrides
- assigned_by (uuid, nullable) = Who assigned this number (for audit)
- assigned_at (timestamptz) = When it was assigned
- max_concurrent_calls (int, default 1) = How many simultaneous calls this number can handle
Index: Add unique index on phone_number (one number can only belong to one org at a time).
RLS: Update Row Level Security policies so tenants can only see their own numbers.
Add an active_calls table (or use Redis) to track which calls are currently active per org and per number.
Columns:
- id (uuid)
- session_id (fk to sessions)
- organization_id (fk to organizations)
- sip_number_id (fk to sip_numbers)
- started_at (timestamptz)
- call_id_sip (text) = The SIP Call-ID for debugging
Purpose: Quick COUNT query to check concurrent calls before accepting a new call. Rows are deleted when calls end.
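The COUNT check can be sketched as follows. To keep the example runnable it uses an in-memory SQLite database with simplified column types; the production table would live in the project's real database, and the function names are hypothetical.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a simplified active_calls table (text columns instead of uuid)."""
    conn.execute(
        """
        CREATE TABLE active_calls (
            id TEXT PRIMARY KEY,
            session_id TEXT,
            organization_id TEXT NOT NULL,
            sip_number_id TEXT NOT NULL,
            started_at TEXT NOT NULL,
            call_id_sip TEXT
        )
        """
    )


def active_call_count(conn: sqlite3.Connection, sip_number_id: str) -> int:
    """Count the rows (one per live call) for a given number."""
    row = conn.execute(
        "SELECT COUNT(*) FROM active_calls WHERE sip_number_id = ?",
        (sip_number_id,),
    ).fetchone()
    return row[0]


def can_accept_call(
    conn: sqlite3.Connection, sip_number_id: str, max_concurrent_calls: int
) -> bool:
    """True while the number is still below its concurrent call limit."""
    return active_call_count(conn, sip_number_id) < max_concurrent_calls
```

Note that checking and then inserting are two separate steps, so two calls arriving together could both pass the check; a real implementation would need a transaction or an atomic counter to close that gap.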
Features:
- Table showing: phone number, label, status (active/inactive), concurrent call limit
- Click a number to edit: label, greeting override, language override, system prompt override
- Show live call count (how many active calls on this number right now)
- Toggle active/inactive
Note: The actual number ASSIGNMENT is done by the client (stakeholder) via an admin panel or API. The business tenant can only VIEW and CONFIGURE their assigned numbers, not create new ones.
Where: src/app/(dashboard)/settings/ or a new src/app/(dashboard)/phone-numbers/ route.
- Add new phone numbers to the system
- Assign numbers to organizations
- Reassign numbers between organizations
- See which numbers are in use across all tenants
Who uses this: Only the client/stakeholder, not the business tenants.
Consideration: Could be a separate admin route (e.g., /admin/numbers) or an API-only feature initially.
- Currently active calls (live count) per number
- Call history with caller number, duration, tickets created
- Simple analytics: calls per day, average duration, busiest hours
Where: Enhance existing src/app/(dashboard)/operations/page.tsx
How: Two approaches:
- Firewall level: iptables/ufw rule on the VPS to DROP any UDP to port 15060 that does not come from the Yeastar's public IP
- LiveKit level: Configure the SIP trunk's allowed_addresses to only accept the PBX IP
The client's message explicitly mentions this: "The bot's SIP listener should silently drop any UDP packets on port 15060 that do not originate from the Yeastar PBX's designated IP address."
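The allowlist logic itself is simple, whichever layer enforces it. The sketch below shows the check in Python; the address 203.0.113.10 is a documentation placeholder because the Yeastar's real public IP is not given here, and the function name is made up.

```python
import ipaddress

# Placeholder only: replace with the Yeastar PBX's real public IP.
ALLOWED_SOURCES = [ipaddress.ip_network("203.0.113.10/32")]


def is_allowed_source(source_ip: str, allowed=ALLOWED_SOURCES) -> bool:
    """True only if the packet's source address is on the allowlist."""
    try:
        addr = ipaddress.ip_address(source_ip)
    except ValueError:
        # Unparseable addresses are treated as not allowed.
        return False
    return any(addr in net for net in allowed)
```

"Silently drop" means a rejected packet gets no reply at all, so scanners cannot tell that a SIP listener exists on the port.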
How: Before the agent joins a Room, check concurrent call count. If over limit, do not join and let LiveKit handle the SIP rejection. Or configure LiveKit dispatch rules with a max participant limit.
Also needed: Ghost session cleanup. If a call drops without a SIP BYE (network failure), detect the silence after 10 to 15 seconds and force-close the session to free the slot.
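Ghost session cleanup could be sketched as a small watchdog that remembers when each call last delivered audio. The class name and the idea of passing the current time in as a number are assumptions made for this example; the 15 second default comes from the range mentioned above.

```python
from typing import Dict, List


class SilenceWatchdog:
    """Flags calls that have delivered no audio for longer than the timeout."""

    def __init__(self, timeout_s: float = 15.0) -> None:
        self.timeout_s = timeout_s
        self._last_audio: Dict[str, float] = {}  # Call-ID -> last audio time

    def on_audio(self, call_id: str, now: float) -> None:
        # Call this whenever audio arrives for the call.
        self._last_audio[call_id] = now

    def on_bye(self, call_id: str) -> None:
        # Normal hangup: stop tracking the call.
        self._last_audio.pop(call_id, None)

    def reap(self, now: float) -> List[str]:
        """Return and forget the calls that have been silent for too long."""
        stale = [
            call_id
            for call_id, last in self._last_audio.items()
            if now - last > self.timeout_s
        ]
        for call_id in stale:
            del self._last_audio[call_id]
        return stale
```

A periodic task would call reap() with the current time (for example from time.monotonic()) and force-close each returned session so its slot is freed.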
Current: Ports 10000 to 12000 are open for RTP. If we increase concurrent calls, verify this range is sufficient.
Rule of thumb: Budget 2 UDP ports per call (by convention, one for the RTP audio and the next one for RTCP control traffic). With 50 concurrent calls, we need at least 100 ports. 10000 to 12000 gives us about 2000 ports, so plenty for now.
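The arithmetic behind this rule of thumb, as a small helper for re-checking the range whenever the concurrency target changes. The function names are made up, and 2 ports per call is the rule of thumb from the text rather than a measured value.

```python
PORTS_PER_CALL = 2  # rule of thumb, not a measured value


def ports_needed(concurrent_calls: int) -> int:
    """Ports required for a given number of simultaneous calls."""
    return concurrent_calls * PORTS_PER_CALL


def max_calls(first_port: int, last_port: int) -> int:
    """How many simultaneous calls an inclusive port range can carry."""
    return (last_port - first_port + 1) // PORTS_PER_CALL
```

With these numbers, ports_needed(50) is 100 and max_calls(10000, 12000) is 1000, so the port range is far from being the limiting factor.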