Updated readme and added subdocs
This commit is contained in:
parent
4ac57a38d0
commit
b5f25411f2
5 changed files with 358 additions and 66 deletions
269
README.md
269
README.md
|
|
@ -1,66 +1,132 @@
|
||||||
# Dynavera: Distributed Agentic Onboarding System
|
# Dynavera: Distributed Agentic Onboarding System
|
||||||
|
|
||||||
Dynavera is a multi-agent AI platform designed to automate role-specific onboarding. The system utilizes a distributed architecture to separate application logic from high-latency LLM inference, employing the Model Context Protocol (MCP) for internal data retrieval and Retrieval-Augmented Generation (RAG).
|
Dynavera is a multi-agent onboarding platform that combines role-specific training flows, retrieval from organization documents, and LLM-powered guidance. The system is intentionally distributed so that app orchestration and heavy inference can run independently.
|
||||||
|
|
||||||
|
Repository: https://git.cs.bham.ac.uk/projects-2025-26/vxn217
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Table of Contents
|
||||||
|
|
||||||
|
- [At a Glance](#at-a-glance)
|
||||||
|
- [Inspector & Supervisor Notes](#inspector--supervisor-notes)
|
||||||
|
- [Screenshots](#screenshots)
|
||||||
|
- [System Architecture (High-Level)](#system-architecture-high-level)
|
||||||
|
- [Project Goals](#project-goals)
|
||||||
|
- [Tech Stack](#tech-stack)
|
||||||
|
- [Repository Guide](#repository-guide)
|
||||||
|
- [Evaluation Credentials](#evaluation-credentials)
|
||||||
|
- [Recommended Evaluation Walkthrough](#recommended-evaluation-walkthrough)
|
||||||
|
- [Local Setup (Cross-Platform)](#local-setup-cross-platform)
|
||||||
|
- [Common Commands](#common-commands)
|
||||||
|
- [Additional Documentation](#additional-documentation)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## At a Glance
|
||||||
|
|
||||||
|
Dynavera focuses on one question: **how do we deliver onboarding that is role-aware, context-aware, and operationally practical?**
|
||||||
|
|
||||||
|
The platform does this by combining:
|
||||||
|
|
||||||
|
- A Django management layer for accounts, roles, sessions, and APIs
|
||||||
|
- An agentic orchestration loop over WebSockets for responsive interactions
|
||||||
|
- A retrieval layer using pgvector and organization-provided documents
|
||||||
|
- A GPU inference service for chat completions, embeddings, and chunking support
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Inspector & Supervisor Notes
|
||||||
|
|
||||||
|
Primary locations relevant to technical quality, architecture reasoning, and evaluation:
|
||||||
|
|
||||||
|
- Setup, context, and high-level flow: this `README.md`
|
||||||
|
- Architecture notes: `docs/`
|
||||||
|
- Orchestration runtime: `apps/onboarding/consumers.py`
|
||||||
|
- Retrieval bridge and tool routing: `apps/onboarding/mcp.py`
|
||||||
|
- Ingestion and vectorization pipeline: `apps/knowledge/tasks.py`
|
||||||
|
- Inference service entrypoint: `gpu_server.py`
|
||||||
|
|
||||||
|
Evaluation-relevant themes represented in the codebase:
|
||||||
|
|
||||||
|
- Role-scoped onboarding generation and progression
|
||||||
|
- Retrieval grounding through uploaded training files
|
||||||
|
- Separation of management services and inference services
|
||||||
|
- End-to-end flow from upload to onboarding completion
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Screenshots
|
||||||
|
|
||||||
|
Placeholder slots for final screenshots.
|
||||||
|
|
||||||
|
### Home Page
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
### Organization Page
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
### Onboarding Loading / Generation State
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
### Onboarding Content Flow
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## System Architecture (High-Level)
|
||||||
|
|
||||||
|
At a high level, Dynavera is split into a management side and an inference side. The orchestrator coordinates user interaction, tool calls, and model responses between the two.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
For the fuller architecture narrative (runtime flow and component placement), see:
|
||||||
|
|
||||||
|
- [Distributed Runtime Flow](docs/distributed-runtime-flow.md)
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Project Goals
|
## Project Goals
|
||||||
|
|
||||||
- [x] Distributed Orchestration: Implementation of a dual-node system (VPS/GPU) to manage real-time user interaction and heavy computational inference independently.
|
- [x] Distributed orchestration across VPS and GPU nodes
|
||||||
|
- [x] Context-aware onboarding with RAG (semantic chunking + vector search)
|
||||||
- [x] Context-Aware Training: Development of a RAG pipeline that utilizes semantic chunking and vector similarity search to provide role-specific guidance.
|
- [x] Stateful agent workflow over WebSockets
|
||||||
|
- [x] Automated ingestion from role training documents (PDF/TXT)
|
||||||
- [x] Agentic Workflow: Utilizing an orchestrator to manage stateful conversations, tool calls, and user progress tracking via WebSockets.
|
|
||||||
|
|
||||||
- [x] Automated Ingestion: Creating a pipeline for converting raw organizational documents (PDF/TXT) into searchable vector embeddings.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## System Architecture
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
The application is split into two primary layers:
|
|
||||||
|
|
||||||
### Management Layer (VPS)
|
|
||||||
* **Framework**: Django 5.x with Django Channels for WebSocket management.
|
|
||||||
* **Database**: PostgreSQL with the pgvector extension for semantic storage.
|
|
||||||
* **Task Queue**: Celery and Redis for asynchronous document processing and ingestion.
|
|
||||||
* **Internal Routing**: `apps/onboarding/mcp.py` serves as the Model Context Protocol router, bridging the agent to the PostgreSQL vector store.
|
|
||||||
|
|
||||||
### Intelligence Layer (GPU Node)
|
|
||||||
* **Inference Server**: `gpu_server.py` (FastAPI) located in the root, exposing endpoints for LLM chat completions and embeddings.
|
|
||||||
* **Semantic Processor**: Custom logic within the inference server for smart chunking that detects topic shifts in text to optimize retrieval accuracy.
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Tech Stack
|
## Tech Stack
|
||||||
|
|
||||||
* **Backend**: Django, Django REST Framework, Django Channels.
|
- **Backend**: Django, Django REST Framework, Django Channels
|
||||||
* **Frontend**: Vue 3, Vite, Pinia.
|
- **Frontend**: Vue 3, Vite, Pinia
|
||||||
* **Database**: PostgreSQL (pgvector).
|
- **Database**: PostgreSQL with pgvector
|
||||||
* **AI/ML**: FastAPI, OpenAI-compatible API structures, Sentence-Transformers.
|
- **AI/ML**: FastAPI, Sentence Transformers, llama.cpp-compatible serving
|
||||||
* **Infrastructure**: Docker, Redis, Celery.
|
- **Infra**: Docker, Redis, Celery
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Application Structure
|
## Repository Guide
|
||||||
|
|
||||||
* **apps.accounts**: Manages User, Organization, and Role models, including invite-based onboarding logic.
|
Key areas in the repo:
|
||||||
* **apps.knowledge**: Handles the RAG pipeline, including TrainingFile management and RoleRagDocument vector storage.
|
|
||||||
* **apps.onboarding**: Contains the core logic for the onboarding experience:
|
- `apps/accounts`: user model, organization/role ownership, membership flows
|
||||||
* `consumers.py`: The Agent Orchestrator managing WebSocket handshakes and session loops.
|
- `apps/knowledge`: file ingestion, chunking pipeline, vector document persistence
|
||||||
* `mcp.py`: The internal router for Model Context Protocol tool execution.
|
- `apps/onboarding`: role flows, sessions, websocket orchestration, MCP-style tool routing
|
||||||
* `models.py`: Stores AgentConfig (prompts/tools) and OnboardingSession state.
|
- `config/`: settings, API/ASGI routing, environment wiring
|
||||||
* **gpu_server.py**: The entry point for the Intelligence Layer, handling embedding generation and LLM inference.
|
- `compose/`: development and production deployment manifests
|
||||||
|
- `gpu_server.py`: inference and embedding service
|
||||||
|
|
||||||
|
For a more detailed breakdown:
|
||||||
|
|
||||||
|
- [Application Structure (Detailed)](docs/application-structure.md)
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Instructions for Evaluation
|
## Evaluation Credentials
|
||||||
|
|
||||||
The system is currently pre-loaded with demonstration data from internal configuration files.
|
|
||||||
|
|
||||||
### Access Credentials
|
|
||||||
|
|
||||||
| Role | Email | Password |
|
| Role | Email | Password |
|
||||||
| :--- | :--- | :--- |
|
| :--- | :--- | :--- |
|
||||||
|
|
@ -68,36 +134,107 @@ The system is currently pre-loaded with demonstration data from internal configu
|
||||||
| **Manager** | haleisaac@example.com | password |
|
| **Manager** | haleisaac@example.com | password |
|
||||||
| **User** | j.thompson@example.com | password |
|
| **User** | j.thompson@example.com | password |
|
||||||
|
|
||||||
### Recommended Technical Walkthrough
|
Manager registration code: `MANAGER2026`
|
||||||
|
|
||||||
To verify the integration of the Knowledge Pipeline and the Agentic Orchestrator, follow these steps:
|
|
||||||
|
|
||||||
1. **Environment Setup**: Navigate to https://fyp.viswamedha.com. *
|
|
||||||
2. **Document Ingestion**: Log in as the **Manager** (haleisaac@example.com). Navigate to the **University of Birmingham** organization. Upload a PDF relevant to a specific role.
|
|
||||||
3. **Vectorization**: Observe the ingestion status. The system will extract text, send it to the GPU node for semantic chunking, and store the resulting 1536-dimension vectors in PostgreSQL.
|
|
||||||
4. **Agent Interaction**: Access the **Role Onboarding** interface. Initiate a session.
|
|
||||||
5. **Retrieval Verification**: This will query the agent regarding specific details within the uploaded PDF. The agent in `consumers.py` will trigger a tool call via `mcp.py`, retrieve the relevant document chunks, and provide a contextualized response via onboarding pages.
|
|
||||||
|
|
||||||
*Note: If the website that I hosted is not accessible, please set up the project locally by following the instructions in the Usage section below.
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Usage
|
## Recommended Evaluation Walkthrough
|
||||||
|
|
||||||
1. Clone the repository.
|
1. Open https://fyp.viswamedha.com
|
||||||
2. Copy the `.env.example` file to `.env` or create a new `.env` file based on `.env.template`, and change the necessary environment variables. *
|
2. Log in as **Manager** and open the target organization
|
||||||
3. Deploy via Docker Compose: `docker compose -f compose/dev/docker-compose.yml --env-file .env up -d` in the root directory.
|
3. Upload a role-relevant document (PDF recommended)
|
||||||
4. Access the frontend at the configured port (usually `localhost:8000`).
|
4. Wait for ingestion and embedding completion
|
||||||
|
5. Start role onboarding and trigger generation
|
||||||
|
6. Check if responses are grounded in uploaded material
|
||||||
|
7. Optionally review progress details and logs
|
||||||
|
|
||||||
* Note: If you use a different secret key, when the fyp-django-dev container starts, you will need to execute the following command to reset all accounts to default passwords of "admin" for admin users and "password" for manager and user accounts:
|
If the hosted deployment is unavailable, local setup is documented below.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Local Setup (Cross-Platform)
|
||||||
|
|
||||||
|
### Prerequisites
|
||||||
|
|
||||||
|
- Docker Engine / Docker Desktop
|
||||||
|
- NVIDIA drivers + NVIDIA Container Toolkit (for GPU inference)
|
||||||
|
|
||||||
|
### 1) Clone
|
||||||
|
|
||||||
|
```bash
|
||||||
|
git clone https://git.cs.bham.ac.uk/projects-2025-26/vxn217
|
||||||
|
cd vxn217
|
||||||
|
```
|
||||||
|
|
||||||
|
### 2) Create `.env`
|
||||||
|
|
||||||
|
**PowerShell**
|
||||||
|
|
||||||
|
```powershell
|
||||||
|
Copy-Item .env.template .env
|
||||||
|
```
|
||||||
|
|
||||||
|
**CMD**
|
||||||
|
|
||||||
|
```cmd
|
||||||
|
copy .env.template .env
|
||||||
|
```
|
||||||
|
|
||||||
|
**macOS/Linux**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
cp .env.template .env
|
||||||
|
```
|
||||||
|
|
||||||
|
Then update `.env` values for your environment.
|
||||||
|
|
||||||
|
### 3) Start services (development)
|
||||||
|
|
||||||
|
```bash
|
||||||
|
docker compose -f compose/dev/docker-compose.yml --env-file .env up -d --build
|
||||||
|
```
|
||||||
|
|
||||||
|
### 4) Access endpoints
|
||||||
|
|
||||||
|
- App: http://localhost:8000
|
||||||
|
|
||||||
|
### 5) Optional: reset seeded passwords
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
docker exec -it fyp-django-dev python manage.py reset_passwords
|
docker exec -it fyp-django-dev python manage.py reset_passwords
|
||||||
```
|
```
|
||||||
|
|
||||||
### Warnings
|
Reset defaults:
|
||||||
|
|
||||||
* The development compose is used here to allow HMR and easier debugging. Please only use this file.
|
- Admin users: `admin`
|
||||||
* Ensure that a GPU is available and CUDA drivers are properly installed for the inference server to function.
|
- Manager and user accounts: `password`
|
||||||
* I have tested this on an RTX 3060 with 12GB VRAM, so I am not sure if it will work on other GPUs.
|
|
||||||
* There is no guarantee that it will load on a CPU-only machine as the batch size and model parameters are configured for GPU inference.
|
---
|
||||||
|
|
||||||
|
## Common Commands
|
||||||
|
|
||||||
|
Stop services:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
docker compose -f compose/dev/docker-compose.yml --env-file .env down
|
||||||
|
```
|
||||||
|
|
||||||
|
Tail logs:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
docker compose -f compose/dev/docker-compose.yml --env-file .env logs -f
|
||||||
|
```
|
||||||
|
|
||||||
|
Run migrations:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
docker exec -it fyp-django-dev python manage.py migrate
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Additional Documentation
|
||||||
|
|
||||||
|
- [Distributed Runtime Flow](docs/distributed-runtime-flow.md)
|
||||||
|
- [Application Structure (Detailed)](docs/application-structure.md)
|
||||||
|
- [Deployment Topologies](docs/deployment-topologies.md)
|
||||||
|
|
|
||||||
64
docs/application-structure.md
Normal file
64
docs/application-structure.md
Normal file
|
|
@ -0,0 +1,64 @@
|
||||||
|
# Application Structure (Detailed)
|
||||||
|
|
||||||
|
This page expands on where responsibilities live in the codebase.
|
||||||
|
|
||||||
|
## Core Apps
|
||||||
|
|
||||||
|
### `apps.accounts`
|
||||||
|
|
||||||
|
Handles identity and tenancy concerns:
|
||||||
|
|
||||||
|
- User model and role flags
|
||||||
|
- Organization ownership and membership
|
||||||
|
- Role assignment and invite flows
|
||||||
|
|
||||||
|
### `apps.knowledge`
|
||||||
|
|
||||||
|
Handles ingestion and retrieval data prep:
|
||||||
|
|
||||||
|
- Upload and tracking of training files
|
||||||
|
- Content extraction and chunking pipeline
|
||||||
|
- Embedding persistence in role-scoped vector documents
|
||||||
|
|
||||||
|
### `apps.onboarding`
|
||||||
|
|
||||||
|
Handles the agentic onboarding runtime:
|
||||||
|
|
||||||
|
- Session and flow models
|
||||||
|
- WebSocket consumer orchestrator
|
||||||
|
- Tool routing (MCP-style handler)
|
||||||
|
- Flow/session APIs for frontend integration
|
||||||
|
|
||||||
|
## Infrastructure Modules
|
||||||
|
|
||||||
|
### `config/*`
|
||||||
|
|
||||||
|
Framework-level config and wiring:
|
||||||
|
|
||||||
|
- Django settings
|
||||||
|
- URL/API routing
|
||||||
|
- ASGI/Channels entry points
|
||||||
|
- Celery config
|
||||||
|
|
||||||
|
### `compose/*`
|
||||||
|
|
||||||
|
Environment-specific deployment configuration:
|
||||||
|
|
||||||
|
- Development compose stack
|
||||||
|
- Production compose stack
|
||||||
|
- Inference compose profile
|
||||||
|
|
||||||
|
### `gpu_server.py`
|
||||||
|
|
||||||
|
Inference service entry point:
|
||||||
|
|
||||||
|
- Chat completions endpoint
|
||||||
|
- Embeddings endpoint
|
||||||
|
- Semantic chunking endpoint
|
||||||
|
- Health checks and model lifecycle
|
||||||
|
|
||||||
|
## Navigation
|
||||||
|
|
||||||
|
- [Distributed Runtime Flow](distributed-runtime-flow.md)
|
||||||
|
- [Deployment Topologies](deployment-topologies.md)
|
||||||
|
- [Project README](../README.md)
|
||||||
37
docs/deployment-topologies.md
Normal file
37
docs/deployment-topologies.md
Normal file
|
|
@ -0,0 +1,37 @@
|
||||||
|
# Deployment Topologies
|
||||||
|
|
||||||
|
This page compares local and distributed deployment shapes.
|
||||||
|
|
||||||
|
## Local Development Topology
|
||||||
|
|
||||||
|
Purpose: fast iteration and debugging.
|
||||||
|
|
||||||
|
- App services run via `compose/dev/docker-compose.yml`
|
||||||
|
- Django, Celery, Redis, Postgres, Node, and inference can run together
|
||||||
|
- Suitable for feature work and integration checks
|
||||||
|
|
||||||
|
## Distributed Topology (VPS + GPU Node)
|
||||||
|
|
||||||
|
Purpose: production-like separation of concerns.
|
||||||
|
|
||||||
|
- **VPS node**: web app, orchestration, API, websocket handling, task queue, database
|
||||||
|
- **GPU node**: dedicated inference service (chat + embeddings + chunking helpers)
|
||||||
|
- Request direction is primarily **VPS -> GPU** for model tasks
|
||||||
|
|
||||||
|
## Why Split Nodes?
|
||||||
|
|
||||||
|
- Keeps model latency/VRAM pressure away from user/session services
|
||||||
|
- Allows independent scaling of orchestration and inference
|
||||||
|
- Improves operational clarity around failures and bottlenecks
|
||||||
|
|
||||||
|
## Operational Notes
|
||||||
|
|
||||||
|
- Confirm inference host/port values in runtime container env
|
||||||
|
- Confirm pgvector extension is enabled in target database
|
||||||
|
- Keep role flow generation permissions constrained to trusted user types
|
||||||
|
|
||||||
|
## Navigation
|
||||||
|
|
||||||
|
- [Distributed Runtime Flow](distributed-runtime-flow.md)
|
||||||
|
- [Application Structure (Detailed)](application-structure.md)
|
||||||
|
- [Project README](../README.md)
|
||||||
54
docs/distributed-runtime-flow.md
Normal file
54
docs/distributed-runtime-flow.md
Normal file
|
|
@ -0,0 +1,54 @@
|
||||||
|
# Distributed Runtime Flow
|
||||||
|
|
||||||
|
Dynavera behaves like a streaming agentic system rather than a simple CRUD app. Runtime responsibility is split into three buckets.
|
||||||
|
|
||||||
|
## 1) MCP Surface (Django-side tool layer)
|
||||||
|
|
||||||
|
This is the tool-facing layer that lets the model request structured actions such as retrieval and session updates.
|
||||||
|
|
||||||
|
Typical tool intents:
|
||||||
|
|
||||||
|
- `search_knowledge(query, role_uuid)`
|
||||||
|
- `get_user_progress(user/session context)`
|
||||||
|
- `update_session_state(session_uuid, patch)`
|
||||||
|
|
||||||
|
Conceptually, this layer translates model tool calls into standard Django queries and vector lookups.
|
||||||
|
|
||||||
|
## 2) Orchestrator (Channels consumer + async control loop)
|
||||||
|
|
||||||
|
The orchestrator lives in the WebSocket runtime and coordinates each user request lifecycle.
|
||||||
|
|
||||||
|
Typical interaction path:
|
||||||
|
|
||||||
|
1. User sends message over WebSocket
|
||||||
|
2. Orchestrator builds/updates context
|
||||||
|
3. Orchestrator calls inference endpoint
|
||||||
|
4. Model requests tool calls when needed
|
||||||
|
5. Orchestrator executes tool calls and continues generation
|
||||||
|
6. Streamed/assembled response returns to user
|
||||||
|
|
||||||
|
This is the central control plane for session continuity, tool usage, and response streaming.
|
||||||
|
|
||||||
|
## 3) GPU Inference Pipe (passive engine)
|
||||||
|
|
||||||
|
The GPU service is designed as a passive inference engine:
|
||||||
|
|
||||||
|
- Receives prompts/inference payloads
|
||||||
|
- Produces chat/embedding outputs
|
||||||
|
- Does not initiate calls back into the VPS
|
||||||
|
|
||||||
|
Using OpenAI-style request/response patterns keeps integration predictable.
|
||||||
|
|
||||||
|
## Interface Summary
|
||||||
|
|
||||||
|
| Component | Typical Path / Endpoint | Role |
|
||||||
|
| :--- | :--- | :--- |
|
||||||
|
| MCP Surface | Internal Django tool handlers (and/or MCP endpoint) | Data/tool translation |
|
||||||
|
| Orchestrator | `apps.onboarding.consumers` | Coordination + streaming |
|
||||||
|
| GPU Inference | `gpu_server.py` HTTP endpoints | Generation + embeddings |
|
||||||
|
|
||||||
|
## Navigation
|
||||||
|
|
||||||
|
- [Application Structure (Detailed)](application-structure.md)
|
||||||
|
- [Deployment Topologies](deployment-topologies.md)
|
||||||
|
- [Project README](../README.md)
|
||||||
BIN
docs/high-level-system-architecture.png
Normal file
BIN
docs/high-level-system-architecture.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 67 KiB |
Loading…
Reference in a new issue