Smart Assistant // AI Research Project
AMADEUS
SYSTEM
A modular, open-source multimodal agent platform — featuring real-time speech processing, camera-based visual understanding, and multi-provider LLM integration. Fully customizable: define your agent's persona and prompts, swap Live2D models, and deploy via a unified management dashboard.
DIVERGENCE METER 1.048596%
SYSTEM IN ACTION
Multimodal conversation with the Amadeus agent — voice input, camera vision, LLM reasoning, and TTS output in a unified pipeline.
SYSTEM ARCHITECTURE
Four specialized modules engineered to work in concert, from raw audio to rendered avatar.
Core Python library powering the full multimodal pipeline — real-time Voice Activity Detection, streaming ASR transcription, neural TTS synthesis, camera snapshot capture for visual context, and a unified LLM API abstraction layer with support for multiple providers.
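One way a "unified LLM API abstraction layer" like the one described above is commonly structured is a provider interface plus a registry that routes calls by name. This is only a sketch of that pattern; the class names (`LLMProvider`, `LLMClient`, `EchoProvider`) are illustrative assumptions, not the library's actual API:

```python
from abc import ABC, abstractmethod


class LLMProvider(ABC):
    """Common interface every backend (cloud API, local model) implements."""

    @abstractmethod
    def chat(self, messages: list[dict]) -> str: ...


class EchoProvider(LLMProvider):
    """Stand-in backend so the sketch runs without API keys or network."""

    def chat(self, messages: list[dict]) -> str:
        return f"echo: {messages[-1]['content']}"


class LLMClient:
    """Unified entry point: register providers by name, dispatch calls to one."""

    def __init__(self) -> None:
        self._providers: dict[str, LLMProvider] = {}

    def register(self, name: str, provider: LLMProvider) -> None:
        self._providers[name] = provider

    def chat(self, provider: str, messages: list[dict]) -> str:
        return self._providers[provider].chat(messages)


client = LLMClient()
client.register("echo", EchoProvider())
reply = client.chat("echo", [{"role": "user", "content": "hello"}])
```

Swapping providers then becomes a configuration change rather than a code change, which is what lets the rest of the pipeline stay provider-agnostic.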
Central orchestration layer for the agent. Manages conversation state and context windows, routes intents through the skill pipeline, interfaces with MCP tools, triggers scheduled tasks, and maintains persistent cross-session memory.
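Intent routing through a skill pipeline, as described above, can be sketched as a priority-ordered keyword dispatcher with an LLM fallback. All names here (`SkillRouter`, the weather skill) are hypothetical examples, not the project's actual interfaces:

```python
class SkillRouter:
    """Routes an utterance to the first skill whose trigger keyword matches."""

    def __init__(self) -> None:
        # (keyword, handler) pairs, checked in registration (priority) order
        self._skills: list[tuple[str, callable]] = []

    def register(self, keyword: str, handler) -> None:
        self._skills.append((keyword, handler))

    def route(self, utterance: str) -> str:
        for keyword, handler in self._skills:
            if keyword in utterance.lower():
                return handler(utterance)
        # No skill matched: hand the utterance to the LLM unchanged
        return "fallback: forward to LLM"


router = SkillRouter()
router.register("weather", lambda u: "calling weather skill")
answer = router.route("What's the weather today?")
```

A real orchestrator would likely classify intents with the LLM itself rather than keywords, but the dispatch shape stays the same.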
Full-featured admin dashboard for deploying and configuring agents at runtime. Manage Live2D models, persona prompts, MCP integrations, skill modules, scheduled autonomous tasks, and multi-provider API credentials.
The user-facing frontend delivering an immersive multimodal interaction experience. Live2D avatar rendering, real-time lip-sync, and a clean conversational interface connected via WebSocket to the agent backend.
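A WebSocket link like the one connecting this frontend to the agent backend typically exchanges typed JSON envelopes, so text, audio chunks, and lip-sync parameters can share one channel. A minimal sketch of such an envelope, with field names that are assumptions rather than the project's real protocol:

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class AgentEvent:
    """Hypothetical message envelope sent over the WebSocket connection."""

    type: str      # e.g. "text", "audio_chunk", "lipsync"
    payload: dict  # type-specific body, e.g. mouth-open value for lip-sync

    def to_json(self) -> str:
        """Serialize for transmission to the browser client."""
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "AgentEvent":
        """Rebuild an event from a received frame."""
        return cls(**json.loads(raw))


event = AgentEvent(type="lipsync", payload={"mouth_open": 0.8})
restored = AgentEvent.from_json(event.to_json())
```

Tagging every frame with a `type` lets the client dispatch avatar animation, audio playback, and chat rendering from a single socket.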