Serverless Voice Evolution

Lambda Voice AI

Serverless functions are a cornerstone of modern web architecture on platforms such as AWS Lambda and Firebase Functions. Until now, however, there has been no dedicated, event-driven serverless runtime for building real-time Voice & Vision AI applications.

Lambda Voice AI closes this gap by letting you deploy lightweight JavaScript snippets that run inline alongside the WebRTC LLM Proxy infrastructure. Respond to media events, trigger external APIs, and compose multiple AI models with zero server management.

SECURE WEBRTC & PIPELINING PROTOCOL

Runtime Architecture & Event Pipeline

[Diagram: User Client (Browser / Mobile App) → WebRTC Media Stream → WebRTC LLM Proxy (live-proxy Service, Media & Event Broker) → JS Lambda Sandbox (QuickJS Runtime, setup(connection)) → Dynamic Execution → AI Engine & Models (Gemini, Local LLM, YOLO)]

* Lambda Voice AI reuses the low-latency infrastructure built for the WebRTC LLM Proxy project. Snippets run inside a secure isolated sandbox with instantaneous context sharing.

Why Lambda Voice AI?

Moving logic to the network edge simplifies clients, reduces latency, and enables complex agent architectures.

Service Orchestration

Coordinate multiple APIs, cloud services, and AI backends sequentially. Listen to transcription events and conditionally dispatch messages or invoke stateful workflows based on conversation progress.
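As a rough illustration of event-driven orchestration, the sketch below tracks conversation progress across transcription events and dispatches a message at each step. It assumes the `add_model`, `on('input_transcription')`, and `send_data` calls used in the examples on this page; the `createStubConnection` helper, the stage names, and the `order_lookup` payload are hypothetical and exist only so the snippet runs outside live-proxy.

```javascript
// Hypothetical stand-in for the live-proxy connection object, used only so
// this sketch runs outside the sandbox. The real runtime supplies `connection`.
function createStubConnection() {
    const handlers = {};
    const sent = [];
    const model = {
        on(event, handler) { handlers[event] = handler; },
        // Stub-only helper: simulate an event arriving from the model.
        emit(event, payload) { if (handlers[event]) handlers[event](payload); },
    };
    return {
        sent,
        model,
        add_model() { return model; },
        send_data(data) { sent.push(data); },
    };
}

function setup(connection) {
    const llm = connection.add_model("gemini");

    // Conversation progress, kept across transcription events.
    let stage = "greeting";

    llm.on("input_transcription", (transcription) => {
        const text = transcription.text.toLowerCase();

        if (stage === "greeting" && text.includes("order")) {
            // Step 1: the user mentioned an order, so start the lookup workflow.
            stage = "awaiting_order_id";
            connection.send_data({ workflow: "order_lookup", step: stage });
        } else if (stage === "awaiting_order_id") {
            // Step 2: wait until the user says something containing a number.
            const idMatch = text.match(/\d+/);
            if (idMatch) {
                stage = "done";
                connection.send_data({ workflow: "order_lookup", step: stage, orderId: idMatch[0] });
            }
        }
    });
}

// Drive the sketch with two simulated utterances.
const connection = createStubConnection();
setup(connection);
connection.model.emit("input_transcription", { text: "I want to check my order" });
connection.model.emit("input_transcription", { text: "It is number 4821" });
console.log(JSON.stringify(connection.sent));
```

Keeping the stage in a closure variable means each connection gets its own workflow state, provided the runtime calls `setup` once per connection (an assumption here).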

Custom Tools & Memory

Add dynamic tools such as weather checkers, search APIs, or database queries. Retrieve user history when a session starts and push context vectors directly into the LLM's active prompt.
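One way to picture session memory is sketched below: load prior turns when the session starts, condense them into a text preamble, and hand that to the model. This is a sketch under stated assumptions, not the runtime's documented memory API. It assumes the model object accepts plain text through `send()`, as `local_llm.send()` does in Example 1; the `HISTORY` store, `loadHistory`, `buildContext`, `primeModel`, and the `userId` argument are all hypothetical, and it sends text rather than context vectors.

```javascript
// Hypothetical in-memory history store. A real deployment would query a
// database or an external API here.
const HISTORY = {
    "user-42": [
        { role: "user", text: "My name is Dana." },
        { role: "assistant", text: "Nice to meet you, Dana." },
    ],
};

// Hypothetical lookup; async so it can later be swapped for a network call.
async function loadHistory(userId) {
    return HISTORY[userId] || [];
}

// Condense prior turns into a text preamble, keeping only the most recent ones.
function buildContext(history, maxTurns = 10) {
    if (history.length === 0) return "";
    const lines = history.slice(-maxTurns).map((turn) => `${turn.role}: ${turn.text}`);
    return "Context from earlier sessions:\n" + lines.join("\n");
}

// Assumption: llm.send(text) delivers text to the model. userId is passed in
// explicitly because how the runtime identifies a user is not specified here.
async function primeModel(llm, userId) {
    const context = buildContext(await loadHistory(userId));
    if (context) llm.send(context);
    return context;
}

// Exercise the sketch with a stub model that records what it receives.
const received = [];
const stubModel = { send(text) { received.push(text); } };
primeModel(stubModel, "user-42").then((context) => console.log(context));
```

Capping the preamble at `maxTurns` keeps the injected context small, which matters when every extra token adds latency to a live voice session.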

Inline Guardrails

Inspect raw user transcription inputs or model voice outputs. Block inappropriate requests, run local safety-checker models, or shut down active WebRTC streams if policies are breached.
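A guardrail does not have to call a model. The sketch below applies a simple pattern-based policy to both input and output transcriptions and closes the connection on a violation. It assumes the `on('input_transcription')`, `on('output_transcription')`, and `connection.close()` calls shown in Example 1; `BLOCKED_PATTERNS`, `violatesPolicy`, `attachGuardrail`, and the stub objects are hypothetical.

```javascript
// Hypothetical policy: phrases that should never appear in a session.
// No `g` flag, so RegExp.prototype.test stays stateless between calls.
const BLOCKED_PATTERNS = [
    /\bcredit card number\b/i,
    /\bsocial security number\b/i,
];

// Returns true when the text matches any blocked pattern.
function violatesPolicy(text) {
    return BLOCKED_PATTERNS.some((pattern) => pattern.test(text));
}

// Apply the same check to what the user says and what the model says.
function attachGuardrail(connection, llm) {
    const check = (transcription) => {
        if (violatesPolicy(transcription.text)) {
            connection.close();
        }
    };
    llm.on("input_transcription", check);
    llm.on("output_transcription", check);
}

// Stub objects so the sketch runs outside live-proxy.
const handlers = {};
let closed = false;
const stubLlm = { on(event, handler) { handlers[event] = handler; } };
const stubConnection = { close() { closed = true; } };

attachGuardrail(stubConnection, stubLlm);
handlers["input_transcription"]({ text: "What is the weather in Oslo?" });
console.log("closed after safe input:", closed);
handlers["output_transcription"]({ text: "Please read me your credit card number." });
console.log("closed after unsafe output:", closed);
```

A pattern check like this is cheap enough to run on every transcription, so it can sit in front of a slower model-based safety check rather than replace it.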

Initial Support Merged in live-proxy

The scripting environment, connection event model, and isolated execution mechanisms are now part of the core live-proxy repository. Pull the latest code to start running lambda functions today!

Functional Code Examples

Unlock Full Programmatic Control

Review how easy it is to spin up models, register event hooks, and orchestrate client-server streams.

Example 1: Voice & Custom Guardrails

A custom external weather tool combined with local-model safety checks on user and model speech.

javascript / setup.js
function setup(connection) {
    const llm = connection.add_model("gemini");
    const local_llm = connection.add_model("local_llm");

    llm.addTool({
        name: 'weather',
        description: 'retrieves weather for any location city or address',
        parameters: [
            {
                name: 'location',
                type: 'string',
            }
        ],
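        callback: async (location) => {
            // Encode the user-supplied location so spaces and punctuation form a valid query string.
            const query = encodeURIComponent(location);
            // This OpenWeatherMap geocoding endpoint resolves the location name to coordinates.
            const response = await fetch(`https://api.openweathermap.org/geo/1.0/direct?q=${query}&appid=1c4ae371d89ee81520eac02916af0e97`);
            return await response.json();
        }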
    });

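    local_llm.on('response', (response) => {
        console.log("Response: ", response);
        // Tolerate case, whitespace, and trailing punctuation in the verdict (e.g. "No.").
        if (response.text.trim().toUpperCase().startsWith("NO")) {
            connection.close();
        }
    });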

    llm.on('input_transcription', (transcription) => {
        local_llm.send("Is this request safe? Respond with only YES or NO: " + transcription.text);

        connection.send_data({ input: transcription.text });
    });

    llm.on('output_transcription', (transcription) => {
        local_llm.send("Is this response safe? Respond with only YES or NO: " + transcription.text);

        connection.send_data({ output: transcription.text });
    });
}

Example 2: Vision & Context Gating

Lightweight YOLO models gating heavier face embeddings to dynamically update client overlay displays.

javascript / setup.js
function setup(connection) {
    const yolo = connection.add_model("yolo", { sampling: 25 });
    const inception = connection.add_model("inception", { sampling: 25 });

    // Start with YOLO enabled, inception input disabled by default (lightweight gating logic)
    inception.disable_input();

    // 1. YOLO event handler: enable inception only when a person is detected
    const yolo_handler = function (objects) {
        if (objects && objects.indexOf("person") !== -1) {
            console.log("Person detected by YOLO! Enabling Inception face embedder.");
            inception.enable_input();
        } else {
            console.log("No person detected by YOLO. Keeping Inception disabled.");
            inception.disable_input();
            // Clear active text display overlay on UI
            sendDisplay(connection, "");
        }
    };

    // 2. Inception event handler: compare extracted embeddings against the database via cosine similarity
    // (match() and sendDisplay() are helpers that are not shown in this snippet)
    const inception_handler = function (data) {
        const embedding = data ? data.embedding : null;
        if (!embedding || embedding.length === 0) {
            return;
        }

        console.log("Received face embedding vector of size: " + embedding.length);

        const bestMatch = match(embedding);
        sendDisplay(connection, bestMatch ? bestMatch.name : "Unknown Face");
    };

    // Register the event callbacks
    yolo.on("objects", yolo_handler);
    inception.on("faces", inception_handler);
}
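
Example 2 calls `match()` and `sendDisplay()` without defining them. The sketch below shows one plausible shape for those helpers, under stated assumptions: the `KNOWN_FACES` database, the `MATCH_THRESHOLD` value, and the `{ display: text }` payload are hypothetical, and `sendDisplay` simply wraps the `connection.send_data` call used in Example 1.

```javascript
// Hypothetical enrolled-face database. Real embeddings have far more dimensions.
const KNOWN_FACES = [
    { name: "Alice", embedding: [1, 0, 0] },
    { name: "Bob", embedding: [0, 1, 0] },
];

// Hypothetical cutoff: similarities below this count as "no match".
const MATCH_THRESHOLD = 0.8;

// Cosine similarity: dot(a, b) / (|a| * |b|), or 0 if either vector is all zeros.
function cosineSimilarity(a, b) {
    let dot = 0;
    let normA = 0;
    let normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    if (normA === 0 || normB === 0) return 0;
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the most similar enrolled face at or above the threshold, or null.
function match(embedding) {
    let best = null;
    let bestScore = MATCH_THRESHOLD;
    for (const face of KNOWN_FACES) {
        // Skip entries whose dimensionality does not match the query vector.
        if (face.embedding.length !== embedding.length) continue;
        const score = cosineSimilarity(embedding, face.embedding);
        if (score >= bestScore) {
            best = face;
            bestScore = score;
        }
    }
    return best;
}

// Assumption: the client renders whatever arrives under the `display` key.
function sendDisplay(connection, text) {
    connection.send_data({ display: text });
}

// Exercise the helpers with a stub connection that records outgoing data.
const sentData = [];
const stubConnection = { send_data(data) { sentData.push(data); } };
const bestMatch = match([0.9, 0.1, 0]);
sendDisplay(stubConnection, bestMatch ? bestMatch.name : "Unknown Face");
console.log(JSON.stringify(sentData));
```

Returning `null` below the threshold is what lets the handler in Example 2 fall back to "Unknown Face" instead of labeling every detected person with the nearest enrolled name.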