# Distributed Runtime Flow Dynavera behaves like a streaming agentic system rather than a simple CRUD app. Runtime responsibility is split into three buckets. ## 1) MCP Surface (Django-side tool layer) This is the tool-facing layer that lets the model request structured actions such as retrieval and session updates. Typical tool intents: - `search_knowledge(query, role_uuid)` - `update_progress(session context)` - `get_role_context(role_uuid)` - `list_training_files(role_uuid)` Conceptually, this layer translates model tool calls into standard Django queries and vector lookups. ## 2) Orchestrator (Channels consumer + async control loop) The orchestrator lives in the WebSocket runtime and coordinates each user request lifecycle. Typical interaction path: 1. User sends message over WebSocket 2. Orchestrator builds/updates context 3. Orchestrator calls inference endpoint 4. Model requests tool calls when needed 5. Orchestrator executes tool calls and continues generation 6. Streamed/assembled response returns to user This is the central control plane for session continuity, tool usage, and response streaming. ## 3) GPU Inference Pipe (passive engine) The GPU service is designed as a passive inference engine: - Receives prompts/inference payloads - Produces chat/embedding outputs - Does not initiate calls back into the VPS Using OpenAI-style request/response patterns keeps integration predictable. ## Interface Summary | Component | Typical Path / Endpoint | Role | | :--- | :--- | :--- | | MCP Surface | Internal Django tool handlers (and/or MCP endpoint) | Data/tool translation | | Orchestrator | `apps/onboarding/consumers/` | Coordination + streaming | | GPU Inference | `gpu_server.py` HTTP endpoints | Generation + embeddings | ## Navigation - [Application Structure (Detailed)](application-structure.md) - [Deployment Topologies](deployment-topologies.md) - [Project README](../README.md)