diff --git a/README.md b/README.md
index 6307ff4..0642a99 100644
--- a/README.md
+++ b/README.md
@@ -56,7 +56,7 @@ High-level architecture diagram:
 
 Key backend runtime entry points:
 
-- `apps/onboarding/consumers.py` for orchestration loop and WebSocket flow
+- `apps/onboarding/consumers/` for orchestration loop and WebSocket flow
 - `apps/onboarding/mcp.py` for tool routing and backend tool execution
 - `apps/knowledge/tasks.py` for ingestion/chunking/embedding workflow
 - `gpu_server.py` for inference and embedding endpoints
diff --git a/docs/deployment-topologies.md b/docs/deployment-topologies.md
index f1455bf..c0fd213 100644
--- a/docs/deployment-topologies.md
+++ b/docs/deployment-topologies.md
@@ -26,7 +26,8 @@ Purpose: production-like separation of concerns.
 
 ## Operational Notes
 
-- Confirm inference host/port values in runtime container env
+- Confirm inference host/port/protocol values in runtime container env
+- Set `INFERENCE_USERNAME` and `INFERENCE_PASSWORD`; the GPU node requires HTTP Basic Auth on all endpoints
 - Confirm pgvector extension is enabled in target database
 - Keep role flow generation permissions constrained to trusted user types
 
diff --git a/docs/distributed-runtime-flow.md b/docs/distributed-runtime-flow.md
index 3b764e5..7afbd80 100644
--- a/docs/distributed-runtime-flow.md
+++ b/docs/distributed-runtime-flow.md
@@ -9,8 +9,9 @@ This is the tool-facing layer that lets the model request structured actions suc
 Typical tool intents:
 
 - `search_knowledge(query, role_uuid)`
-- `get_user_progress(user/session context)`
-- `update_session_state(session_uuid, patch)`
+- `update_progress(session context)`
+- `get_role_context(role_uuid)`
+- `list_training_files(role_uuid)`
 
 Conceptually, this layer translates model tool calls into standard Django queries and vector lookups.
 
@@ -44,7 +45,7 @@ Using OpenAI-style request/response patterns keeps integration predictable.
 | Component | Typical Path / Endpoint | Role |
 | :--- | :--- | :--- |
 | MCP Surface | Internal Django tool handlers (and/or MCP endpoint) | Data/tool translation |
-| Orchestrator | `apps.onboarding.consumers` | Coordination + streaming |
+| Orchestrator | `apps/onboarding/consumers/` | Coordination + streaming |
 | GPU Inference | `gpu_server.py` HTTP endpoints | Generation + embeddings |
 
 ## Navigation