Lambda Voice AI
Serverless functions, on platforms such as AWS Lambda and Firebase Functions, are a cornerstone of modern web architecture. However, real-time Voice & Vision AI applications have lacked a dedicated, event-driven serverless runtime.
Lambda Voice AI closes this gap by letting you deploy lightweight JavaScript snippets that execute inline alongside the WebRTC LLM Proxy infrastructure. Respond to media events, trigger external APIs, and compose multiple AI models with zero server management.
Runtime Architecture & Event Pipeline
User Client (Browser / Mobile App)
WebRTC LLM Proxy (live-proxy service)
JS Lambda Sandbox (QuickJS runtime)
AI Engine & Models (Gemini, Local LLM, YOLO)
* Lambda Voice AI reuses the low-latency infrastructure built for the WebRTC LLM Proxy project. Snippets run inside a secure, isolated sandbox and share context with the proxy directly.
Why Lambda Voice AI?
Moving logic to the network edge simplifies clients, reduces latency, and enables complex agent architectures.
Service Orchestration
Coordinate multiple APIs, cloud services, and AI backends sequentially. Listen to transcription events and conditionally dispatch messages or invoke stateful workflows based on conversation progress.
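A minimal sketch of conditional dispatch based on conversation progress. It uses only the calls that appear in the examples on this page (`connection.add_model`, `on('input_transcription')`, `connection.send_data`). The `ROUTES` table, the `routeTranscript` helper, and the `{ workflow, turn }` message shape are hypothetical and only illustrate the idea.

```javascript
// Hypothetical routing rules: the first matching keyword wins.
const ROUTES = [
  { keyword: "refund", workflow: "billing" },
  { keyword: "appointment", workflow: "scheduling" },
];

// Pure helper: pick a workflow name for a transcript, or null if none applies.
function routeTranscript(text) {
  const lower = String(text || "").toLowerCase();
  const hit = ROUTES.find((r) => lower.includes(r.keyword));
  return hit ? hit.workflow : null;
}

function setup(connection) {
  const llm = connection.add_model("gemini");
  let turns = 0; // per-connection conversation progress

  llm.on("input_transcription", (transcription) => {
    turns += 1;
    const workflow = routeTranscript(transcription.text);
    if (workflow) {
      // Assumed message shape; the client decides how to act on it.
      connection.send_data({ workflow: workflow, turn: turns });
    }
  });
}
```

Keeping the routing decision in a pure function makes it easy to exercise outside the sandbox, while `setup` only wires it to connection events.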
Custom Tools & Memory
Add dynamic tools such as weather checkers, search APIs, or database queries. Retrieve user history on session initialization and push that context directly into the LLM's active prompt.
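One way this could look, as a sketch under stated assumptions. `addTool` and `send` are used as they appear in Example 1 on this page. The in-memory `HISTORY` store, the `buildContext` helper, the `remember` tool, and the `connection.user_id` field are all hypothetical; in particular, it is an assumption that the session exposes a user identifier and that `send` on the primary model accepts plain text.

```javascript
// Hypothetical in-memory store; a real deployment would query a database.
const HISTORY = new Map([
  ["user-42", ["Prefers metric units", "Lives in Lisbon"]],
]);

// Build a compact context string from the most recent stored notes.
function buildContext(notes, maxNotes = 5) {
  if (!notes || notes.length === 0) return "";
  const recent = notes.slice(-maxNotes).map((n) => "- " + n);
  return "Known about this user:\n" + recent.join("\n");
}

function setup(connection) {
  const llm = connection.add_model("gemini");

  // Assumption: `user_id` is a placeholder name, not a confirmed field.
  const context = buildContext(HISTORY.get(connection.user_id));
  if (context) {
    llm.send(context); // assumes send() accepts plain text, as in Example 1
  }

  // Let the model store new notes for later sessions.
  llm.addTool({
    name: "remember",
    description: "stores a short note about the user for later sessions",
    parameters: [{ name: "note", type: "string" }],
    callback: async (note) => {
      const notes = HISTORY.get(connection.user_id) || [];
      notes.push(note);
      HISTORY.set(connection.user_id, notes);
      return { stored: true, count: notes.length };
    },
  });
}
```

Capping the number of notes keeps the injected context small, which matters when it is prepended to every session.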
Inline Guardrails
Inspect raw user transcription inputs or model voice outputs. Block inappropriate requests, run local safety-checker models, or shut down active WebRTC streams if policies are breached.
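A rough sketch of a two-stage guardrail: a cheap blocklist check first, then a safety verdict from a local model. The event names and the `send`, `close`, and `add_model` calls follow Example 1 on this page. The `BLOCKED_TERMS` list and the `violatesPolicy` and `isSafeVerdict` helpers are hypothetical. Treating anything that is not a clear YES as unsafe is a design choice made for this sketch, not documented live-proxy behavior.

```javascript
// Hypothetical blocklist used as a cheap pre-filter before any model call.
const BLOCKED_TERMS = ["credit card number", "social security number"];

function violatesPolicy(text) {
  const lower = String(text || "").toLowerCase();
  return BLOCKED_TERMS.some((term) => lower.includes(term));
}

// Normalize a free-form verdict ("YES", " yes.\n") to a boolean.
// Anything that does not clearly start with YES counts as unsafe.
function isSafeVerdict(text) {
  return /^\s*yes\b/i.test(String(text || ""));
}

function setup(connection) {
  const llm = connection.add_model("gemini");
  const checker = connection.add_model("local_llm");

  // Close the stream when the checker does not answer YES.
  checker.on("response", (response) => {
    if (!isSafeVerdict(response.text)) connection.close();
  });

  llm.on("input_transcription", (t) => {
    if (violatesPolicy(t.text)) {
      connection.close(); // fail fast, no model call needed
      return;
    }
    checker.send("Is this request safe? Respond with only YES or NO: " + t.text);
  });
}
```

Parsing the verdict tolerantly avoids missing answers such as "No." that an exact string comparison would let through.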
Initial Support Merged in live-proxy
The scripting environment, connection event model, and isolated execution mechanisms are officially integrated as part of the core live-proxy repository. Pull the latest code to start running lambda functions today!
Unlock Full Programmatic Control
Review how easy it is to spin up models, register event hooks, and orchestrate client-server streams.
Example 1: Voice & Custom Guardrails
A custom external weather tool combined with safety checks, run by a local model, on both user and model speech.
function setup(connection) {
  // Primary voice model plus a local model used only as a safety checker.
  const llm = connection.add_model("gemini");
  const local_llm = connection.add_model("local_llm");

  // Expose an external lookup to the primary model as a tool.
  llm.addTool({
    name: 'weather',
    description: 'retrieves weather for any location city or address',
    parameters: [
      {
        name: 'location',
        type: 'string',
      }
    ],
    callback: async (location) => {
      // URL-encode the location so spaces and punctuation survive the query string.
      // Note: this URL is OpenWeatherMap's geocoding endpoint.
      const response = await fetch(`http://api.openweathermap.org/geo/1.0/direct?q=${encodeURIComponent(location)}&appid=1c4ae371d89ee81520eac02916af0e97`);
      return await response.json();
    }
  });

  // Close the connection when the checker answers NO (ignoring case and whitespace).
  local_llm.on('response', (response) => {
    console.log("Response: ", response);
    if ((response.text || "").trim().toUpperCase() === "NO") {
      connection.close();
    }
  });

  // Check every user utterance and mirror it to the client.
  llm.on('input_transcription', (transcription) => {
    local_llm.send("Is this request safe? Respond with only YES or NO: " + transcription.text);
    connection.send_data({ input: transcription.text });
  });

  // Check every model utterance and mirror it to the client.
  llm.on('output_transcription', (transcription) => {
    local_llm.send("Is this response safe? Respond with only YES or NO: " + transcription.text);
    connection.send_data({ output: transcription.text });
  });
}
Example 2: Vision & Context Gating
A lightweight YOLO model gates a heavier face-embedding model and dynamically updates the client's overlay display.
function setup(connection) {
  // sendDisplay() and match() are helper functions that are not shown in this snippet.
  const yolo = connection.add_model("yolo", { sampling: 25 });
  const inception = connection.add_model("inception", { sampling: 25 });

  // Start with YOLO enabled and Inception input disabled (lightweight gating logic).
  inception.disable_input();

  // 1. YOLO event handler: enable Inception only when a person is detected.
  const yolo_handler = function (objects) {
    if (objects && objects.indexOf("person") !== -1) {
      console.log("Person detected by YOLO! Enabling Inception face embedder.");
      inception.enable_input();
    } else {
      console.log("No person detected by YOLO. Keeping Inception disabled.");
      inception.disable_input();
      // Clear the active text overlay on the client UI.
      sendDisplay(connection, "");
    }
  };

  // 2. Inception event handler: compare the extracted embedding against the database via cosine similarity.
  const inception_handler = function (data) {
    const embedding = data ? data.embedding : null;
    if (!embedding || embedding.length === 0) {
      return;
    }
    console.log("Received face embedding vector of size: " + embedding.length);
    const bestMatch = match(embedding);
    sendDisplay(connection, bestMatch ? bestMatch.name : "Unknown Face");
  };

  // Register the event callbacks.
  yolo.on("objects", yolo_handler);
  inception.on("faces", inception_handler);
}
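Example 2 calls `match` and `sendDisplay` without showing them. The sketch below is one possible shape for those helpers, not the project's actual implementation: the `KNOWN_FACES` table, the `0.8` threshold, and the `{ display: ... }` message format are assumptions made for illustration. Only `connection.send_data` is taken from the examples on this page.

```javascript
// Hypothetical enrolled faces; real embeddings would be much longer vectors.
const KNOWN_FACES = [
  { name: "Alice", embedding: [1, 0, 0] },
  { name: "Bob", embedding: [0, 1, 0] },
];

// Cosine similarity of two equal-length vectors; 0 for mismatched or zero vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) return 0;
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the most similar enrolled face at or above the threshold, or null.
function match(embedding, threshold = 0.8) {
  let best = null;
  let bestScore = threshold;
  for (const face of KNOWN_FACES) {
    const score = cosineSimilarity(embedding, face.embedding);
    if (score >= bestScore) {
      best = face;
      bestScore = score;
    }
  }
  return best;
}

// Assumed message shape for the client overlay.
function sendDisplay(connection, text) {
  connection.send_data({ display: text });
}
```

The threshold trades false matches against missed matches; a suitable value depends on the embedding model and would need to be tuned on real data.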