Updated readme and added subdocs

2026-02-27 02:09:54 +00:00 · 2026-02-27 02:09:54 +00:00 · b5f25411f2
commit b5f25411f2
parent 4ac57a38d0
5 changed files with 358 additions and 66 deletions
--- a/README.md
+++ b/README.md
@ -1,66 +1,132 @@
 # Dynavera: Distributed Agentic Onboarding System
-Dynavera is a multi-agent AI platform designed to automate role-specific onboarding. The system utilizes a distributed architecture to separate application logic from high-latency LLM inference, employing the Model Context Protocol (MCP) for internal data retrieval and Retrieval-Augmented Generation (RAG).
+Dynavera is a multi-agent onboarding platform that combines role-specific training flows, retrieval from organization documents, and LLM-powered guidance. The system is intentionally distributed so that app orchestration and heavy inference can run independently.
 Repository: https://git.cs.bham.ac.uk/projects-2025-26/vxn217
 ---
 ## Table of Contents
 - [At a Glance](#at-a-glance)
 - [Inspector & Supervisor Notes](#inspector--supervisor-notes)
 - [Screenshots](#screenshots)
 - [System Architecture (High-Level)](#system-architecture-high-level)
 - [Project Goals](#project-goals)
 - [Tech Stack](#tech-stack)
 - [Repository Guide](#repository-guide)
 - [Evaluation Credentials](#evaluation-credentials)
 - [Recommended Evaluation Walkthrough](#recommended-evaluation-walkthrough)
 - [Local Setup (Cross-Platform)](#local-setup-cross-platform)
 - [Common Commands](#common-commands)
 - [Additional Documentation](#additional-documentation)
 ---
 ## At a Glance
 Dynavera focuses on one question: **how do we deliver onboarding that is role-aware, context-aware, and operationally practical?**
 The platform does this by combining:
 - A Django management layer for accounts, roles, sessions, and APIs
 - An agentic orchestration loop over WebSockets for responsive interactions
 - A retrieval layer using pgvector and organization-provided documents
 - A GPU inference service for chat completions, embeddings, and chunking support
 ---
 ## Inspector & Supervisor Notes
 Primary locations relevant to technical quality, architecture reasoning, and evaluation:
 - Setup, context, and high-level flow: this `README.md`
 - Architecture notes: `docs/`
 - Orchestration runtime: `apps/onboarding/consumers.py`
 - Retrieval bridge and tool routing: `apps/onboarding/mcp.py`
 - Ingestion and vectorization pipeline: `apps/knowledge/tasks.py`
 - Inference service entrypoint: `gpu_server.py`
 Evaluation-relevant themes represented in the codebase:
 - Role-scoped onboarding generation and progression
 - Retrieval grounding through uploaded training files
 - Separation of management services and inference services
 - End-to-end flow from upload to onboarding completion
 ---
 ## Screenshots
 Placeholder slots for final screenshots.
 ### Home Page
 ![Home Page Placeholder](docs/images/home-page-placeholder.png)
 ### Organization Page
 ![Organization Page Placeholder](docs/images/organization-page-placeholder.png)
 ### Onboarding Loading / Generation State
 ![Onboarding Loading Placeholder](docs/images/onboarding-loading-placeholder.png)
 ### Onboarding Content Flow
 ![Onboarding Flow Placeholder](docs/images/onboarding-flow-placeholder.png)
 ---
 ## System Architecture (High-Level)
 At a high level, Dynavera is split into a management side and an inference side. The orchestrator coordinates user interaction, tool calls, and model responses between the two.
 ![High Level System Architecture](docs/high-level-system-architecture.png)
 For the fuller architecture narrative (runtime flow and component placement), see:
 - [Distributed Runtime Flow](docs/distributed-runtime-flow.md)
 ---
 ## Project Goals
- [x] Distributed Orchestration: Implementation of a dual-node system (VPS/GPU) to manage real-time user interaction and heavy computational inference independently.
+- [x] Distributed orchestration across VPS and GPU nodes
-
+- [x] Context-aware onboarding with RAG (semantic chunking + vector search)
- [x] Context-Aware Training: Development of a RAG pipeline that utilizes semantic chunking and vector similarity search to provide role-specific guidance.
+- [x] Stateful agent workflow over WebSockets
-
+- [x] Automated ingestion from role training documents (PDF/TXT)
 - [x] Agentic Workflow: Utilizing an orchestrator to manage stateful conversations, tool calls, and user progress tracking via WebSockets.
 - [x] Automated Ingestion: Creating a pipeline for converting raw organizational documents (PDF/TXT) into searchable vector embeddings.
 ---
 ## System Architecture
 The application is split into two primary layers:
 ### Management Layer (VPS)
 * **Framework**: Django 5.x with Django Channels for WebSocket management.
 * **Database**: PostgreSQL with the pgvector extension for semantic storage.
 * **Task Queue**: Celery and Redis for asynchronous document processing and ingestion.
 * **Internal Routing**: `apps/onboarding/mcp.py` serves as the Model Context Protocol router, bridging the agent to the PostgreSQL vector store.
 ### Intelligence Layer (GPU Node)
 * **Inference Server**: `gpu_server.py` (FastAPI) located in the root, exposing endpoints for LLM chat completions and embeddings.
 * **Semantic Processor**: Custom logic within the inference server for smart chunking that detects topic shifts in text to optimize retrieval accuracy.
 ---
 ## Tech Stack
-* **Backend**: Django, Django REST Framework, Django Channels.
+- **Backend**: Django, Django REST Framework, Django Channels
-* **Frontend**: Vue 3, Vite, Pinia.
+- **Frontend**: Vue 3, Vite, Pinia
-* **Database**: PostgreSQL (pgvector).
+- **Database**: PostgreSQL with pgvector
-* **AI/ML**: FastAPI, OpenAI-compatible API structures, Sentence-Transformers.
+- **AI/ML**: FastAPI, Sentence Transformers, llama.cpp-compatible serving
-* **Infrastructure**: Docker, Redis, Celery.
+- **Infra**: Docker, Redis, Celery
 ---
-## Application Structure
+## Repository Guide
-* **apps.accounts**: Manages User, Organization, and Role models, including invite-based onboarding logic.
+Key areas in the repo:
-* **apps.knowledge**: Handles the RAG pipeline, including TrainingFile management and RoleRagDocument vector storage.
+
-* **apps.onboarding**: Contains the core logic for the onboarding experience:
+- `apps/accounts`: user model, organization/role ownership, membership flows
-    * `consumers.py`: The Agent Orchestrator managing WebSocket handshakes and session loops.
+- `apps/knowledge`: file ingestion, chunking pipeline, vector document persistence
-    * `mcp.py`: The internal router for Model Context Protocol tool execution.
+- `apps/onboarding`: role flows, sessions, websocket orchestration, MCP-style tool routing
-    * `models.py`: Stores AgentConfig (prompts/tools) and OnboardingSession state.
+- `config/`: settings, API/ASGI routing, environment wiring
-* **gpu_server.py**: The entry point for the Intelligence Layer, handling embedding generation and LLM inference.
+- `compose/`: development and production deployment manifests
 - `gpu_server.py`: inference and embedding service
 For a more detailed breakdown:
 - [Application Structure (Detailed)](docs/application-structure.md)
 ---
-## Instructions for Evaluation
+## Evaluation Credentials
 The system is currently pre-loaded with demonstration data from internal configuration files.
 ### Access Credentials
 | Role | Email | Password |
 | :--- | :--- | :--- |
@ -68,36 +134,107 @@ The system is currently pre-loaded with demonstration data from internal configu
 | **Manager** | haleisaac@example.com | password |
 | **User** | j.thompson@example.com | password |
-### Recommended Technical Walkthrough
+Manager registration code: `MANAGER2026`
 To verify the integration of the Knowledge Pipeline and the Agentic Orchestrator, follow these steps:
 1. **Environment Setup**: Navigate to https://fyp.viswamedha.com. *
 2. **Document Ingestion**: Log in as the **Manager** (haleisaac@example.com). Navigate to the **University of Birmingham** organization. Upload a PDF relevant to a specific role.
 3. **Vectorization**: Observe the ingestion status. The system will extract text, send it to the GPU node for semantic chunking, and store the resulting 1536-dimension vectors in PostgreSQL.
 4. **Agent Interaction**: Access the **Role Onboarding** interface. Initiate a session.
 5. **Retrieval Verification**: This will query the agent regarding specific details within the uploaded PDF. The agent in `consumers.py` will trigger a tool call via `mcp.py`, retrieve the relevant document chunks, and provide a contextualized response via onboarding pages.
 *Note: If the website that I hosted is not accessible, please set up the project locally by following the instructions in the Usage section below.
 ---
-## Usage
+## Recommended Evaluation Walkthrough
-1. Clone the repository.
+1. Open https://fyp.viswamedha.com
-2. Copy the `.env.example` file to `.env` or create a new `.env` file based on `.env.template`, and change the necessary environment variables. *
+2. Log in as **Manager** and open the target organization
-3. Deploy via Docker Compose: `docker compose -f compose/dev/docker-compose.yml --env-file .env up -d` in the root directory.
+3. Upload a role-relevant document (PDF recommended)
-4. Access the frontend at the configured port (usually `localhost:8000`).
+4. Wait for ingestion and embedding completion
 5. Start role onboarding and trigger generation
 6. Check if responses are grounded in uploaded material
 7. Optionally review progress details and logs
-* Note: If you use a different secret key, when the fyp-django-dev container starts, you will need to execute the following command to reset all accounts to default passwords of "admin" for admin users and "password" for manager and user accounts:
+If the hosted deployment is unavailable, local setup is documented below.
 ---
 ## Local Setup (Cross-Platform)
 ### Prerequisites
 - Docker Engine / Docker Desktop
 - NVIDIA drivers + NVIDIA Container Toolkit (for GPU inference)
 ### 1) Clone
 ```bash
 git clone https://git.cs.bham.ac.uk/projects-2025-26/vxn217
 cd vxn217
 ```
 ### 2) Create `.env`
 **PowerShell**
 ```powershell
 Copy-Item .env.template .env
 ```
 **CMD**
 ```cmd
 copy .env.template .env
 ```
 **macOS/Linux**
 ```bash
 cp .env.template .env
 ```
 Then update `.env` values for your environment.
 ### 3) Start services (development)
 ```bash
 docker compose -f compose/dev/docker-compose.yml --env-file .env up -d --build
 ```
 ### 4) Access endpoints
 - App: http://localhost:8000
 ### 5) Optional: reset seeded passwords
 ```bash
 docker exec -it fyp-django-dev python manage.py reset_passwords
 ```
-### Warnings
+Reset defaults:
-* The development compose is used here to allow HMR and easier debugging. Please only use this file.
+- Admin users: `admin`
-* Ensure that a GPU is available and CUDA drivers are properly installed for the inference server to function.
+- Manager and user accounts: `password`
-* I have tested this on an RTX 3060 with 12GB VRAM, so I am not sure if it will work on other GPUs. 
+
-* There is no guarantee that it will load on a CPU-only machine as the batch size and model parameters are configured for GPU inference.
+---
 ## Common Commands
 Stop services:
 ```bash
 docker compose -f compose/dev/docker-compose.yml --env-file .env down
 ```
 Tail logs:
 ```bash
 docker compose -f compose/dev/docker-compose.yml --env-file .env logs -f
 ```
 Run migrations:
 ```bash
 docker exec -it fyp-django-dev python manage.py migrate
 ```
 ---
 ## Additional Documentation
 - [Distributed Runtime Flow](docs/distributed-runtime-flow.md)
 - [Application Structure (Detailed)](docs/application-structure.md)
 - [Deployment Topologies](docs/deployment-topologies.md)
--- a/docs/application-structure.md
+++ b/docs/application-structure.md
@ -0,0 +1,64 @@
 # Application Structure (Detailed)
 This page expands on where responsibilities live in the codebase.
 ## Core Apps
 ### `apps.accounts`
 Handles identity and tenancy concerns:
 - User model and role flags
 - Organization ownership and membership
 - Role assignment and invite flows
 ### `apps.knowledge`
 Handles ingestion and retrieval data prep:
 - Upload and tracking of training files
 - Content extraction and chunking pipeline
 - Embedding persistence in role-scoped vector documents
 ### `apps.onboarding`
 Handles the agentic onboarding runtime:
 - Session and flow models
 - WebSocket consumer orchestrator
 - Tool routing (MCP-style handler)
 - Flow/session APIs for frontend integration
 ## Infrastructure Modules
 ### `config/*`
 Framework-level config and wiring:
 - Django settings
 - URL/API routing
 - ASGI/Channels entry points
 - Celery config
 ### `compose/*`
 Environment-specific deployment configuration:
 - Development compose stack
 - Production compose stack
 - Inference compose profile
 ### `gpu_server.py`
 Inference service entry point:
 - Chat completions endpoint
 - Embeddings endpoint
 - Semantic chunking endpoint
 - Health checks and model lifecycle
 ## Navigation
 - [Distributed Runtime Flow](distributed-runtime-flow.md)
 - [Deployment Topologies](deployment-topologies.md)
 - [Project README](../README.md)
--- a/docs/deployment-topologies.md
+++ b/docs/deployment-topologies.md
@ -0,0 +1,37 @@
 # Deployment Topologies
 This page compares local and distributed deployment shapes.
 ## Local Development Topology
 Purpose: fast iteration and debugging.
 - App services run via `compose/dev/docker-compose.yml`
 - Django, Celery, Redis, Postgres, Node, and inference can run together
 - Suitable for feature work and integration checks
 ## Distributed Topology (VPS + GPU Node)
 Purpose: production-like separation of concerns.
 - **VPS node**: web app, orchestration, API, websocket handling, task queue, database
 - **GPU node**: dedicated inference service (chat + embeddings + chunking helpers)
 - Request direction is primarily **VPS -> GPU** for model tasks
 ## Why Split Nodes?
 - Keeps model latency/VRAM pressure away from user/session services
 - Allows independent scaling of orchestration and inference
 - Improves operational clarity around failures and bottlenecks
 ## Operational Notes
 - Confirm inference host/port values in runtime container env
 - Confirm pgvector extension is enabled in target database
 - Keep role flow generation permissions constrained to trusted user types
 ## Navigation
 - [Distributed Runtime Flow](distributed-runtime-flow.md)
 - [Application Structure (Detailed)](application-structure.md)
 - [Project README](../README.md)
--- a/docs/distributed-runtime-flow.md
+++ b/docs/distributed-runtime-flow.md
@ -0,0 +1,54 @@
 # Distributed Runtime Flow
 Dynavera behaves like a streaming agentic system rather than a simple CRUD app. Runtime responsibility is split into three buckets.
 ## 1) MCP Surface (Django-side tool layer)
 This is the tool-facing layer that lets the model request structured actions such as retrieval and session updates.
 Typical tool intents:
 - `search_knowledge(query, role_uuid)`
 - `get_user_progress(user/session context)`
 - `update_session_state(session_uuid, patch)`
 Conceptually, this layer translates model tool calls into standard Django queries and vector lookups.
 ## 2) Orchestrator (Channels consumer + async control loop)
 The orchestrator lives in the WebSocket runtime and coordinates each user request lifecycle.
 Typical interaction path:
 1. User sends message over WebSocket
 2. Orchestrator builds/updates context
 3. Orchestrator calls inference endpoint
 4. Model requests tool calls when needed
 5. Orchestrator executes tool calls and continues generation
 6. Streamed/assembled response returns to user
 This is the central control plane for session continuity, tool usage, and response streaming.
 ## 3) GPU Inference Pipe (passive engine)
 The GPU service is designed as a passive inference engine:
 - Receives prompts/inference payloads
 - Produces chat/embedding outputs
 - Does not initiate calls back into the VPS
 Using OpenAI-style request/response patterns keeps integration predictable.
 ## Interface Summary
 | Component | Typical Path / Endpoint | Role |
 | :--- | :--- | :--- |
 | MCP Surface | Internal Django tool handlers (and/or MCP endpoint) | Data/tool translation |
 | Orchestrator | `apps.onboarding.consumers` | Coordination + streaming |
 | GPU Inference | `gpu_server.py` HTTP endpoints | Generation + embeddings |
 ## Navigation
 - [Application Structure (Detailed)](application-structure.md)
 - [Deployment Topologies](deployment-topologies.md)
 - [Project README](../README.md)
--- a/docs/high-level-system-architecture.png
+++ b/docs/high-level-system-architecture.png