diff --git a/README.md b/README.md
index 6307ff4..0642a99 100644
--- a/README.md
+++ b/README.md
@@ -56,7 +56,7 @@ High-level architecture diagram:
 
 Key backend runtime entry points:
 
-- `apps/onboarding/consumers.py` for orchestration loop and WebSocket flow
+- `apps/onboarding/consumers/` for orchestration loop and WebSocket flow
 - `apps/onboarding/mcp.py` for tool routing and backend tool execution
 - `apps/knowledge/tasks.py` for ingestion/chunking/embedding workflow
 - `gpu_server.py` for inference and embedding endpoints
diff --git a/docs/deployment-topologies.md b/docs/deployment-topologies.md
index f1455bf..c0fd213 100644
--- a/docs/deployment-topologies.md
+++ b/docs/deployment-topologies.md
@@ -26,7 +26,8 @@ Purpose: production-like separation of concerns.
 
 ## Operational Notes
 
-- Confirm inference host/port values in runtime container env
+- Confirm inference host/port/protocol values in runtime container env
+- Set `INFERENCE_USERNAME` and `INFERENCE_PASSWORD` — the GPU node requires HTTP Basic Auth on all endpoints
 - Confirm pgvector extension is enabled in target database
 - Keep role flow generation permissions constrained to trusted user types
 
diff --git a/docs/distributed-runtime-flow.md b/docs/distributed-runtime-flow.md
index 3b764e5..7afbd80 100644
--- a/docs/distributed-runtime-flow.md
+++ b/docs/distributed-runtime-flow.md
@@ -9,8 +9,9 @@ This is the tool-facing layer that lets the model request structured actions suc
 Typical tool intents:
 
 - `search_knowledge(query, role_uuid)`
-- `get_user_progress(user/session context)`
-- `update_session_state(session_uuid, patch)`
+- `update_progress(session context)`
+- `get_role_context(role_uuid)`
+- `list_training_files(role_uuid)`
 
 Conceptually, this layer translates model tool calls into standard Django queries and vector lookups.
 
@@ -44,7 +45,7 @@ Using OpenAI-style request/response patterns keeps integration predictable.
 | Component | Typical Path / Endpoint | Role |
 | :--- | :--- | :--- |
 | MCP Surface | Internal Django tool handlers (and/or MCP endpoint) | Data/tool translation |
-| Orchestrator | `apps.onboarding.consumers` | Coordination + streaming |
+| Orchestrator | `apps/onboarding/consumers/` | Coordination + streaming |
 | GPU Inference | `gpu_server.py` HTTP endpoints | Generation + embeddings |
 
 ## Navigation