Welcome to Clandestine Labs
Welcome to the official documentation for the Zima project.
Zima is a next-generation mobile application built with Flutter that allows users to engage in real-time, voice-driven conversations with AI-powered digital avatars. Leveraging the D-ID Streaming API, the application provides a seamless, interactive experience where users can talk to an AI agent and receive an immediate, dynamic video response.
The project is architected using Clean Architecture principles, ensuring a scalable, maintainable, and testable codebase. Key Technologies:
Framework: Flutter
State Management: Provider
Real-time Video: WebRTC via D-ID API
Real-time Transcription: Socket.IO
API Communication: Dio & HTTP
UI/Animations: flutter_animate, animated_text_kit
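The technology list above maps to a Flutter dependency manifest roughly like the following. This is an illustrative sketch, not the project's actual pubspec.yaml; the version constraints are assumptions chosen for the example.

```yaml
dependencies:
  flutter:
    sdk: flutter
  provider: ^6.1.1            # state management
  dio: ^5.4.0                 # API communication
  http: ^1.2.0                # API communication
  flutter_webrtc: ^0.9.48     # real-time video via the D-ID streaming API
  socket_io_client: ^2.0.3    # real-time transcription transport
  flutter_animate: ^4.5.0     # UI animations
  animated_text_kit: ^4.2.2   # animated text widgets
  google_mlkit_face_detection: ^0.10.0  # liveness head-motion detection
```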
The Core Flow: From Liveness to Digital Life
The user journey is designed around security and personal identity, broken down into four key steps:
Secure Sign-Up: The platform is designed around an OTP-based authentication system to ensure secure user access. (Note: the current repository uses a username/password flow as a temporary substitute for ease of development.)
Prove You're Real (Liveness Verification): To prevent spoofing and ensure every agent is tied to a real person, the user performs a liveness check. They record a short video or selfie and follow on-screen prompts to move their head. This verification is powered by a custom-built Flutter package that uses Google's ML Kit for real-time head motion detection.
Create Your Digital Clone: Once liveness is verified, the media is securely uploaded to our backend. The backend then integrates with the D-ID API to process the media and generate a unique, user-bound AI agent.
Interact with Yourself: With the digital clone created, the user can engage in a seamless, real-time conversation via a functional chat interface, using either text or voice.
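The head-motion check in step 2 can be sketched with Google's ML Kit face detector, which reports the head's yaw angle per frame. This is a minimal illustration of the idea, not the project's actual liveness package; the yaw threshold and the left/right turn criteria are assumptions for the example.

```dart
import 'package:google_mlkit_face_detection/google_mlkit_face_detection.dart';

/// Tracks whether the user has turned their head far enough to the left
/// and to the right, as a simple liveness signal.
class HeadMotionTracker {
  static const double _yawThreshold = 20.0; // degrees; tunable assumption

  final FaceDetector _detector = FaceDetector(
    options: FaceDetectorOptions(
      performanceMode: FaceDetectorMode.fast,
    ),
  );

  bool _turnedLeft = false;
  bool _turnedRight = false;

  bool get livenessConfirmed => _turnedLeft && _turnedRight;

  /// Feed each camera frame here; returns true once both turns are seen.
  Future<bool> processFrame(InputImage frame) async {
    final faces = await _detector.processImage(frame);
    if (faces.isEmpty) return livenessConfirmed;

    // headEulerAngleY is the yaw: positive when the head turns one way,
    // negative the other. It is null if the detector could not estimate it.
    final double? yaw = faces.first.headEulerAngleY;
    if (yaw != null) {
      if (yaw > _yawThreshold) _turnedLeft = true;
      if (yaw < -_yawThreshold) _turnedRight = true;
    }
    return livenessConfirmed;
  }

  void dispose() => _detector.close();
}
```

In a real flow, the on-screen prompts would be driven by which of the two flags is still false.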
Key Innovations & Technology
Custom Liveness Detection: A from-scratch Flutter package using Google's ML Kit to analyze real-time video and verify human presence.
Personal AI Agent Generation: Deep integration with the D-ID API to create personalized, user-specific agents from verified media.
Real-time Conversational Interface: A combination of WebRTC for video streaming and a Socket.IO-based service for low-latency voice transcription enables natural conversation.
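The transcription side of that pipeline can be sketched with the socket_io_client package. The server URL and the 'transcript'/'audio' event names below are illustrative assumptions, not the project's actual wire contract.

```dart
import 'package:socket_io_client/socket_io_client.dart' as io;

/// Minimal sketch of a Socket.IO transcription client: streams audio
/// chunks up and receives transcribed text back.
class TranscriptionService {
  late final io.Socket _socket;

  void connect(void Function(String text) onTranscript) {
    _socket = io.io(
      'https://transcription.example.com', // hypothetical endpoint
      io.OptionBuilder().setTransports(['websocket']).build(),
    );
    _socket.onConnect((_) => print('transcription socket connected'));
    // Assumed event name; the backend defines the real one.
    _socket.on('transcript', (data) => onTranscript(data as String));
  }

  /// Send a chunk of recorded audio for transcription.
  void sendAudioChunk(List<int> bytes) => _socket.emit('audio', bytes);

  void disconnect() => _socket.dispose();
}
```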
Current Project Status
Authentication System: The logic for login and user management is complete.
Liveness & Agent Creation: The liveness detection package and its integration with the backend for agent creation are currently in progress.
Repository Ready: The GitHub repository is structured and ready for collaboration.