Smart Assistant  //  AI Research Project

AMADEUS

SYSTEM

NEURAL NET ONLINE // MULTIMODAL I/O ACTIVE // MEMORY INITIALIZED // API GATEWAY READY

A modular, open-source multimodal agent platform — featuring real-time speech processing, camera-based visual understanding, and multi-provider LLM integration. Fully customizable: define your agent's persona and prompts, swap Live2D models, and deploy via a unified management dashboard.

DIVERGENCE METER  1.048596%

DEMONSTRATION

SYSTEM IN ACTION

Multimodal conversation with the Amadeus agent — voice input, camera vision, LLM reasoning, and TTS output in a unified pipeline.

[Video: amadeus-system :: live_demo.mp4]

CORE COMPONENTS

SYSTEM ARCHITECTURE

Four specialized modules engineered to work in concert, from raw audio to rendered avatar.

MODULE // 01
phonewave
Audio Processing Engine

Core Python library powering the full multimodal pipeline — real-time Voice Activity Detection, streaming ASR transcription, neural TTS synthesis, camera snapshot capture for visual context, and a unified LLM API abstraction layer with support for multiple providers.

VAD // ASR // TTS // Vision // Streaming // Python
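
To make the Voice Activity Detection stage concrete, here is a minimal sketch of an energy-threshold VAD with a short hangover. It is an illustration only: the source does not say which VAD technique phonewave uses, and the names `frame_rms`, `detect_speech`, `threshold`, and `hangover` are hypothetical, not part of phonewave's API.

```python
import math
from typing import Iterable, List, Sequence


def frame_rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one audio frame (samples in [-1.0, 1.0])."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_speech(
    frames: Iterable[Sequence[float]],
    threshold: float = 0.02,  # assumed value; tune per microphone
    hangover: int = 2,        # quiet frames still treated as speech
) -> List[bool]:
    """Label each frame as speech (True) or silence (False).

    A frame counts as speech when its RMS exceeds `threshold`. After a
    speech frame, up to `hangover` quiet frames are still labelled speech
    so that short pauses do not split one utterance in two.
    """
    flags: List[bool] = []
    quiet_left = 0
    for frame in frames:
        if frame_rms(frame) > threshold:
            quiet_left = hangover
            flags.append(True)
        elif quiet_left > 0:
            quiet_left -= 1
            flags.append(True)
        else:
            flags.append(False)
    return flags
```

In a streaming setup, the frames flagged `True` would be the ones forwarded to ASR. A production system would more likely use a trained VAD model than a fixed energy threshold.
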
MODULE // 02
amadeus-core
Agent Runtime Engine

Central orchestration layer for the agent. Manages conversation state and context windows, routes intents through the skill pipeline, interfaces with MCP tools, triggers scheduled tasks, and maintains persistent cross-session memory.

Agent // MCP // Skills // Memory // Context // Core
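
One way to picture context-window management is a function that keeps the system prompt and as many of the newest turns as fit in a token budget. The sketch below assumes a crude whitespace token count; `Message`, `count_tokens`, and `trim_context` are hypothetical names, and amadeus-core's actual strategy is not described in the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    role: str      # "system", "user", or "assistant"
    content: str


def count_tokens(text: str) -> int:
    """Crude estimate: one token per whitespace-separated word (assumption)."""
    return len(text.split())


def trim_context(messages: List[Message], budget: int) -> List[Message]:
    """Return system messages plus the newest other messages within `budget`.

    System messages are always kept and placed first. The remaining budget
    is filled from the newest turn backwards, stopping at the first turn
    that does not fit, so the kept history is always contiguous.
    """
    system = [m for m in messages if m.role == "system"]
    remaining = budget - sum(count_tokens(m.content) for m in system)
    kept: List[Message] = []
    for message in reversed([m for m in messages if m.role != "system"]):
        cost = count_tokens(message.content)
        if cost > remaining:
            break
        kept.append(message)
        remaining -= cost
    return system + kept[::-1]  # restore chronological order
```

A real runtime would count tokens with the target model's tokenizer and might summarize dropped turns into persistent memory rather than discard them.
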
MODULE // 03
sern
Management Dashboard

Full-featured admin dashboard for deploying and configuring agents at runtime. Manage Live2D models, persona prompts, MCP integrations, skill modules, scheduled autonomous tasks, and multi-provider API credentials.

Deploy // Live2D // Prompts // Scheduler // Providers // Admin
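
As a rough picture of what a per-agent configuration might contain, the sketch below models one as a dataclass that round-trips through JSON. Every field name here (`persona_prompt`, `live2d_model`, `provider`, `mcp_servers`, `skills`) is an assumption made for illustration; sern's real schema is not documented in the source. API credentials are left out on purpose, since secrets usually belong in a separate store.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class AgentConfig:
    """Hypothetical agent settings; not sern's actual schema."""
    name: str
    persona_prompt: str
    live2d_model: str
    provider: str
    mcp_servers: List[str] = field(default_factory=list)
    skills: List[str] = field(default_factory=list)

    def validate(self) -> None:
        """Reject configs that are missing a name or a persona prompt."""
        if not self.name.strip():
            raise ValueError("name must not be empty")
        if not self.persona_prompt.strip():
            raise ValueError("persona_prompt must not be empty")


def to_json(config: AgentConfig) -> str:
    """Validate, then serialize with stable key order."""
    config.validate()
    return json.dumps(asdict(config), indent=2, sort_keys=True)


def from_json(text: str) -> AgentConfig:
    """Parse and validate a config produced by `to_json`."""
    config = AgentConfig(**json.loads(text))
    config.validate()
    return config
```
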
MODULE // 04
Nixie-UI
Frontend Interface

The user-facing frontend delivering an immersive multimodal interaction experience. Live2D avatar rendering, real-time lip-sync, and a clean conversational interface connected via WebSocket to the agent backend.

Live2D // WebSocket // Lip-sync // Real-time // Frontend
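
A simple way to drive lip-sync from streamed TTS audio is to map each audio chunk's loudness to a mouth-open value and send it as a small JSON message. The sketch below shows that idea under stated assumptions: the message shape (`type`, `seq`, `mouth_open`), the `gain` value, and the function names are all hypothetical, and the source does not specify how Nixie-UI computes lip-sync or what its WebSocket protocol looks like.

```python
import json
import math
from typing import Sequence


def mouth_open(samples: Sequence[float], gain: float = 4.0) -> float:
    """Map audio samples in [-1.0, 1.0] to a mouth-open value in [0.0, 1.0].

    Uses the chunk's RMS loudness scaled by `gain` and clamped to 1.0.
    """
    if not samples:
        return 0.0
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return min(1.0, rms * gain)


def lipsync_message(samples: Sequence[float], seq: int) -> str:
    """Build a JSON text frame for one audio chunk (hypothetical format)."""
    return json.dumps(
        {
            "type": "lipsync",
            "seq": seq,  # lets the client drop out-of-order frames
            "mouth_open": round(mouth_open(samples), 3),
        }
    )
```

On the receiving side, a frontend could apply `mouth_open` to the avatar's mouth parameter on each frame, ideally with some smoothing to avoid jitter.
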

USER VOICE (Microphone) → VAD (phonewave) → ASR (phonewave) → AMADEUS CORE (Agent + Skills + MCP + ...) → TTS (phonewave) → NIXIE-UI (Live2D + Lip-sync)

Real-time Voice Activity Detection
Streaming ASR & Neural TTS
Camera-based Visual Understanding
Multi-Provider LLM Support
Live2D Avatar Integration
MCP Tool Ecosystem
Scheduled Autonomous Tasks
Persistent Agent Memory
WebSocket Streaming Architecture
Multi-Agent Deployment via sern
Configurable Persona & Prompts