CORTEX v2.0 — SPRINT PLAN

6 Sprints | 12 Weeks | Reset from Sprint 13

Created: 03/03/2026
Per sprint: 2 weeks
Total duration: 12 weeks (03/03 – 26/05/2026)

Timeline Overview

Week 1-2    Week 3-4    Week 5-6    Week 7-8    Week 9-10   Week 11-12
+--------+  +--------+  +--------+  +--------+  +--------+  +--------+
|Sprint13|  |Sprint14|  |Sprint15|  |Sprint16|  |Sprint17|  |Sprint18|
|Memory  |->|Skills  |->|RAG v2  |  |Self-   |  |Efficien|  |Agent   |
|Archit. |  |+MCP    |  |GraphRAG|  |Learning|  |cy Eng. |  |Mode    |
+--------+  +--------+  +--------+  +--------+  +--------+  +--------+
                |           |           |           |           |
                +-----------+-----------+-----------+-----------+
                    All depend on Sprint 14 (Skill Registry)

Sprint 13: Memory Architecture (Weeks 1–2)

Goal

Build the multi-tier memory system (Letta/MemGPT inspired) to replace the current nano-brain. This is the FOUNDATION for everything else — agents need memory to learn, remember, and improve.

Task Breakdown

#	Task	File(s)	Effort	Status
13.1	Design Memory interfaces and types	`electron/services/memory/types.ts`	2h	-
13.2	Core Memory service (system prompt, user prefs, project ctx)	`electron/services/memory/core-memory.ts`	1d	-
13.3	Archival Memory service (long-term, vector-searchable)	`electron/services/memory/archival-memory.ts`	2d	-
13.4	Recall Memory service (conversation history + search)	`electron/services/memory/recall-memory.ts`	1d	-
13.5	Memory Manager (orchestrate 3 tiers, load/save context)	`electron/services/memory/memory-manager.ts`	2d	-
13.6	SQLite schema for memory tables	`electron/services/memory/memory-db.ts`	4h	-
13.7	Migrate existing nano-brain data to new schema	`electron/services/memory/migration.ts`	1d	-
13.8	Memory Dashboard UI (display 3 tiers, search, stats)	`src/components/memory/MemoryDashboard.tsx`	1d	-
13.9	Memory Editor UI (manually edit core memory)	`src/components/memory/MemoryEditor.tsx`	4h	-
13.10	IPC handlers for memory operations	`electron/main.ts` (add handlers)	4h	-
13.11	Unit tests for memory services	`tests/unit/memory/*.test.ts`	1d	-
13.12	Integration test: full memory lifecycle	`tests/unit/memory-integration.test.ts`	4h	-

SQLite Schema

-- Core Memory (always in context, ~2000 tokens)
CREATE TABLE core_memory (
  id TEXT PRIMARY KEY,
  project_id TEXT NOT NULL,
  section TEXT NOT NULL, -- 'user_profile' | 'project_context' | 'preferences'
  content TEXT NOT NULL,
  updated_at INTEGER NOT NULL,
  UNIQUE(project_id, section)
);

-- Archival Memory (long-term, unlimited)
CREATE TABLE archival_memory (
  id TEXT PRIMARY KEY,
  project_id TEXT NOT NULL,
  content TEXT NOT NULL,
  embedding BLOB, -- vector embedding for search
  metadata TEXT, -- JSON: source, type, tags
  created_at INTEGER NOT NULL,
  accessed_at INTEGER NOT NULL,
  access_count INTEGER DEFAULT 0,
  relevance_score REAL DEFAULT 1.0 -- decays over time
);
CREATE INDEX idx_archival_project ON archival_memory(project_id);

-- Recall Memory (conversation history)
CREATE TABLE recall_memory (
  id TEXT PRIMARY KEY,
  project_id TEXT NOT NULL,
  conversation_id TEXT NOT NULL,
  role TEXT NOT NULL, -- 'user' | 'assistant'
  content TEXT NOT NULL,
  embedding BLOB,
  timestamp INTEGER NOT NULL
);
CREATE INDEX idx_recall_project ON recall_memory(project_id);
CREATE INDEX idx_recall_conv ON recall_memory(conversation_id);
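
The `embedding BLOB` columns above need a byte encoding for vectors, which the plan does not specify. A minimal sketch, assuming embeddings are stored as raw float32 bytes in platform byte order; the helper names `encodeEmbedding` and `decodeEmbedding` are hypothetical:

```typescript
// Assumption: embeddings are stored as raw float32 bytes in the BLOB column.
// The plan does not define an encoding; these helper names are hypothetical.

/** Serialize a vector into bytes suitable for binding to a BLOB parameter. */
function encodeEmbedding(vector: number[]): Uint8Array {
  const floats = Float32Array.from(vector);
  // View over the freshly allocated float buffer (4 bytes per dimension).
  return new Uint8Array(floats.buffer, floats.byteOffset, floats.byteLength);
}

/** Deserialize BLOB bytes back into a float32 vector. */
function decodeEmbedding(blob: Uint8Array): Float32Array {
  if (blob.byteLength % 4 !== 0) {
    throw new Error("embedding BLOB length must be a multiple of 4 bytes");
  }
  // Copy into a fresh buffer first: a view returned by a database driver
  // may start at an offset that is not 4-byte aligned.
  const copy = new Uint8Array(blob.byteLength);
  copy.set(blob);
  return new Float32Array(copy.buffer);
}
```

Float32 halves the storage of float64 at a precision most embedding models do not exceed anyway; if a different precision is chosen, only these two helpers would need to change.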

Acceptance Criteria

Core memory saves and loads user preferences, project context
Archival memory supports vector search, returning only results with similarity > 0.8
Recall memory indexes conversation history with semantic search
Memory Manager correctly routes read/write to the right tier
UI displays 3 tiers with search
Nano-brain data migrated successfully (0 data loss)
All tests passing (unit + integration)
Memory read latency < 50ms, search latency < 200ms
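
The 0.8 similarity threshold in the archival search criterion could be applied as follows. This is a sketch that assumes "similarity" means cosine similarity and does a brute-force scan in memory; `searchArchival` and the row shape are hypothetical:

```typescript
// Assumption: "similarity" in the acceptance criteria means cosine similarity.
// Brute-force scan for illustration; names and shapes are hypothetical.
interface ArchivalRow {
  id: string;
  embedding: ArrayLike<number>;
}

interface ArchivalHit {
  id: string;
  score: number;
}

function cosineSimilarity(a: ArrayLike<number>, b: ArrayLike<number>): number {
  if (a.length !== b.length) throw new Error("vector dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // A zero vector has no direction, so treat it as matching nothing.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

/** Return rows scoring strictly above minScore, best match first. */
function searchArchival(
  query: ArrayLike<number>,
  rows: ArchivalRow[],
  minScore = 0.8,
  limit = 10,
): ArchivalHit[] {
  return rows
    .map((row) => ({ id: row.id, score: cosineSimilarity(query, row.embedding) }))
    .filter((hit) => hit.score > minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

A linear scan like this is likely enough to validate the threshold on small projects; whether it meets the 200ms search target at larger sizes would need to be measured.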

Risks

Sprint 14: Skill Registry + MCP Integration (Weeks 3–4)

Goal

Build the Skill Registry — a plugin system that allows loading/unloading AI skills. Integrate the MCP protocol for connecting to external tools. Wrap existing services (agentic-rag, brain-engine, etc.) as skills.
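
To make the register/activate/deactivate/list lifecycle concrete, here is a minimal in-memory sketch. The plan names `CortexSkill` but does not define it, so the interface shape, the synchronous `execute`, and the status values (borrowed from the UI criterion: active/inactive/error) are assumptions:

```typescript
// Hypothetical shapes: the plan names "CortexSkill" but does not define it.
type SkillStatus = "active" | "inactive" | "error";

interface CortexSkill {
  name: string;
  activate(): void;
  deactivate(): void;
  execute(input: string): string;
}

class SkillRegistry {
  private readonly skills = new Map<string, { skill: CortexSkill; status: SkillStatus }>();

  register(skill: CortexSkill): void {
    if (this.skills.has(skill.name)) {
      throw new Error(`skill already registered: ${skill.name}`);
    }
    this.skills.set(skill.name, { skill, status: "inactive" });
  }

  activate(name: string): void {
    const entry = this.getEntry(name);
    try {
      entry.skill.activate();
      entry.status = "active";
    } catch {
      // A failing skill is marked rather than crashing the registry.
      entry.status = "error";
    }
  }

  deactivate(name: string): void {
    const entry = this.getEntry(name);
    entry.skill.deactivate();
    entry.status = "inactive";
  }

  list(): { name: string; status: SkillStatus }[] {
    return Array.from(this.skills.values(), (e) => ({ name: e.skill.name, status: e.status }));
  }

  execute(name: string, input: string): string {
    const entry = this.getEntry(name);
    if (entry.status !== "active") throw new Error(`skill not active: ${name}`);
    return entry.skill.execute(input);
  }

  private getEntry(name: string): { skill: CortexSkill; status: SkillStatus } {
    const entry = this.skills.get(name);
    if (!entry) throw new Error(`unknown skill: ${name}`);
    return entry;
  }
}
```

Real skills would most likely expose an async `execute`; the synchronous version keeps the lifecycle easy to follow.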

Task Breakdown

#	Task	File(s)	Effort
14.1	Skill interface + types	`electron/services/skills/types.ts`	2h
14.2	Skill Registry (register, load, activate, deactivate, list)	`electron/services/skills/skill-registry.ts`	2d
14.3	Skill Loader (dynamic import from directory)	`electron/services/skills/skill-loader.ts`	1d
14.4	Skill Router (classify intent, route to best skill)	`electron/services/skills/skill-router.ts`	2d
14.5	MCP Client implementation	`electron/services/skills/mcp/mcp-client.ts`	2d
14.6	MCP Skill Adapter (wrap MCP server as CortexSkill)	`electron/services/skills/mcp/mcp-adapter.ts`	1d
14.7	Builtin: RAG Skill (wrap agentic-rag.ts)	`electron/services/skills/builtin/rag-skill.ts`	4h
14.8	Builtin: Code Analysis Skill	`electron/services/skills/builtin/code-analysis-skill.ts`	4h
14.9	Builtin: Chat Skill (core conversation)	`electron/services/skills/builtin/chat-skill.ts`	4h
14.10	Builtin: Memory Skill (wrap memory manager)	`electron/services/skills/builtin/memory-skill.ts`	4h
14.11	Skill Manager UI	`src/components/skills/SkillManager.tsx`	1d
14.12	Skill Config UI (per-skill settings)	`src/components/skills/SkillConfig.tsx`	4h
14.13	skillStore.ts (Zustand)	`src/stores/skillStore.ts`	4h
14.14	IPC handlers for skill operations	`electron/main.ts` (extend)	4h
14.15	Tests	`tests/unit/skills/*.test.ts`	1d

Acceptance Criteria

>= 4 built-in skills loaded and working
Skill Router correctly routes queries to the right skill
MCP client connects to at least 1 external MCP server
UI displays skill list with status (active/inactive/error)
Skills can call each other (composition)
Hot-reload: add skill without restarting the app
All tests passing
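
One possible shape for the Skill Router criterion is score-based routing with a fallback. In this sketch each skill reports a confidence for a query and the router picks the highest one above a threshold; keyword matching only stands in for the real intent classifier, and all names and the 0.3 threshold are hypothetical:

```typescript
// Assumption: each skill reports a confidence in [0, 1] for a query.
// Keyword matching is only a stand-in for the real intent classifier.
interface RoutableSkill {
  name: string;
  confidence(query: string): number;
}

/** Build a toy skill whose confidence is the fraction of keywords present. */
function keywordSkill(name: string, keywords: string[]): RoutableSkill {
  return {
    name,
    confidence(query: string): number {
      if (keywords.length === 0) return 0;
      const text = query.toLowerCase();
      const hits = keywords.filter((keyword) => text.includes(keyword)).length;
      return hits / keywords.length;
    },
  };
}

/** Pick the most confident skill, or the fallback if none exceeds minConfidence. */
function routeToSkill(
  query: string,
  skills: RoutableSkill[],
  fallback: string,
  minConfidence = 0.3,
): string {
  let best = fallback;
  let bestScore = minConfidence;
  for (const skill of skills) {
    const score = skill.confidence(query);
    if (score > bestScore) {
      best = skill.name;
      bestScore = score;
    }
  }
  return best;
}
```

Routing to a fallback (for example the chat skill) when nothing is confident keeps low-signal queries from being forced into a specialized skill.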

Sprint 15: Advanced RAG Pipeline (Weeks 5–6)

Goal

Upgrade RAG from simple hybrid search to a multi-strategy pipeline. GraphRAG + RAG Fusion + Contextual Retrieval (3 P0 skills) are the priority.
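
The RRF merge step of RAG Fusion (task 15.4) can be sketched as follows. Reciprocal Rank Fusion scores each document as the sum of 1 / (k + rank) over every ranked list it appears in; k = 60 is the commonly used default, not a value taken from this plan, and the function name is hypothetical:

```typescript
// Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank of d).
// Assumption: k = 60 is the conventional default, not a value from this plan.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60,
): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score,
  );
}
```

Each inner array would be the result list for one generated query variant. Because only ranks are used, the lists can come from retrievers whose raw scores are not comparable.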

Task Breakdown

#	Task	File(s)	Effort
15.1	Knowledge Graph Builder (entity extraction from code)	`electron/services/skills/rag/graph-builder.ts`	3d
15.2	Graph storage (SQLite graph tables + indexes)	`electron/services/skills/rag/graph-db.ts`	1d
15.3	GraphRAG query engine (vector + graph traversal)	`electron/services/skills/rag/graphrag-skill.ts`	3d
15.4	RAG Fusion (multi-query generation + RRF merge)	`electron/services/skills/rag/rag-fusion-skill.ts`	2d
15.5	Contextual Chunking (add context to chunks before embed)	`electron/services/skills/rag/contextual-chunk.ts`	2d
15.6	RAG Strategy Router (classify query → select strategy)	`electron/services/skills/rag/rag-router.ts`	1d
15.7	Re-embed existing brains with contextual chunking	`electron/services/skills/rag/re-embed.ts`	1d
15.8	Upgrade agentic-rag.ts to compose all strategies	`electron/services/agentic-rag.ts` (refactor)	1d
15.9	Tests + evaluation (manual relevance scoring)	`tests/unit/rag/*.test.ts`	1d

Graph SQLite Schema

CREATE TABLE graph_nodes (
  id TEXT PRIMARY KEY,
  project_id TEXT NOT NULL,
  type TEXT NOT NULL, -- 'file' | 'function' | 'class' | 'module' | 'variable'
  name TEXT NOT NULL,
  file_path TEXT,
  start_line INTEGER,
  end_line INTEGER,
  content_hash TEXT,
  embedding BLOB,
  metadata TEXT -- JSON
);

CREATE TABLE graph_edges (
  id TEXT PRIMARY KEY,
  project_id TEXT NOT NULL,
  source_id TEXT NOT NULL REFERENCES graph_nodes(id),
  target_id TEXT NOT NULL REFERENCES graph_nodes(id),
  type TEXT NOT NULL, -- 'imports' | 'calls' | 'inherits' | 'implements' | 'uses'
  weight REAL DEFAULT 1.0,
  metadata TEXT
);
CREATE INDEX idx_edges_source ON graph_edges(source_id);
CREATE INDEX idx_edges_target ON graph_edges(target_id);
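
The two edge indexes above support lookups in both directions, which is what multi-hop questions need. A minimal in-memory sketch of those lookups, assuming edges have already been loaded from `graph_edges`; the camel-cased field names and function names are hypothetical:

```typescript
// Assumption: rows from graph_edges are already loaded into memory.
// Field names are camel-cased versions of the schema columns.
interface GraphEdge {
  sourceId: string;
  targetId: string;
  type: "imports" | "calls" | "inherits" | "implements" | "uses";
}

/** One hop in both directions: what nodeId calls, and what calls nodeId. */
function callsAndCallers(edges: GraphEdge[], nodeId: string): { calls: string[]; calledBy: string[] } {
  const callEdges = edges.filter((e) => e.type === "calls");
  return {
    calls: callEdges.filter((e) => e.sourceId === nodeId).map((e) => e.targetId),
    calledBy: callEdges.filter((e) => e.targetId === nodeId).map((e) => e.sourceId),
  };
}

/** Breadth-first search along 'calls' edges, up to maxHops away from startId. */
function reachableCallees(edges: GraphEdge[], startId: string, maxHops: number): string[] {
  const adjacency = new Map<string, string[]>();
  for (const edge of edges) {
    if (edge.type !== "calls") continue;
    const targets = adjacency.get(edge.sourceId) ?? [];
    targets.push(edge.targetId);
    adjacency.set(edge.sourceId, targets);
  }
  const visited = new Set<string>([startId]); // guards against call cycles
  const found: string[] = [];
  let frontier = [startId];
  for (let hop = 0; hop < maxHops && frontier.length > 0; hop++) {
    const next: string[] = [];
    for (const id of frontier) {
      for (const target of adjacency.get(id) ?? []) {
        if (visited.has(target)) continue;
        visited.add(target);
        found.push(target);
        next.push(target);
      }
    }
    frontier = next;
  }
  return found;
}
```

The same traversal could be pushed into SQLite with a recursive query; the in-memory form is shown only because it is easier to read.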

Acceptance Criteria

Knowledge graph built for at least 1 real project
GraphRAG answers multi-hop questions (e.g., ‘what does function A call and who calls it?’)
RAG Fusion improves relevance by more than 15% over single-query retrieval (measured by the manual relevance scoring in 15.9)
Contextual chunks have file path + function context in the embedding
RAG Router automatically selects the right strategy for the query type
Re-embed does not lose existing data
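
The contextual chunk criterion (file path + function context in the embedding) amounts to prepending a short header to each chunk before it is embedded. A sketch under that reading; the `CodeChunk` shape and the header format are assumptions, not the plan's specification:

```typescript
// Assumption: "contextual chunking" here means prepending location metadata
// to the chunk text before embedding. Shape and header format are hypothetical.
interface CodeChunk {
  filePath: string;
  symbol?: string; // enclosing function or class, if known
  startLine: number;
  endLine: number;
  text: string;
}

/** Build the text that is sent to the embedding model for a chunk. */
function buildContextualText(chunk: CodeChunk): string {
  const header = [
    `File: ${chunk.filePath}`,
    chunk.symbol ? `Symbol: ${chunk.symbol}` : null,
    `Lines: ${chunk.startLine}-${chunk.endLine}`,
  ]
    .filter((line): line is string => line !== null)
    .join("\n");
  // Blank line separates the context header from the original chunk text.
  return `${header}\n\n${chunk.text}`;
}
```

Only the embedded text changes; the original chunk text would still be stored unmodified so that retrieved results can be shown without the header.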

Sprint 16: Self-Learning Pipeline (Weeks 7–8)

Goal

Build the self-learning system: DSPy prompt optimization, behavioral analytics, and feedback loops. This is where Cortex starts TRULY learning from the user.

Task Breakdown

#	Task	File(s)	Effort
16.1	Behavioral Event Collector	`electron/services/skills/learning/event-collector.ts`	1d
16.2	Event storage schema + SQLite tables	`electron/services/skills/learning/learning-db.ts`	4h
16.3	Implicit feedback detection (accept/reject/edit)	`electron/services/skills/learning/feedback-detector.ts`	2d
16.4	DSPy integration (Python bridge or TS port)	`electron/services/skills/learning/dspy-bridge.ts`	3d
16.5	Prompt optimizer service	`electron/services/skills/learning/prompt-optimizer.ts`	2d
16.6	Feedback-driven reranker update	`electron/services/learned-reranker.ts` (upgrade)	1d
16.7	Self-Learning Dashboard UI	`src/components/learning/LearningDashboard.tsx`	1d
16.8	learningStore.ts (Zustand)	`src/stores/learningStore.ts`	4h
16.9	Tests + evaluation metrics	`tests/unit/learning/*.test.ts`	1d

Acceptance Criteria

Behavioral events captured: >= 20 events per session
DSPy optimization runs successfully (at least 1 prompt improved)
Dashboard displays learning progress (events, improvements)
Feedback detector accuracy > 80% (manual validation)
Reranker updates from feedback data
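
One way the implicit accept/reject/edit detection (task 16.3) might work is to compare what Cortex suggested with what the user actually kept. The sketch below uses token-set Jaccard overlap; the heuristic, the 0.3 threshold, and all names are assumptions that would need validating against the 80% accuracy criterion:

```typescript
// Assumption: feedback is inferred by comparing the suggested text with the
// text the user finally kept. Heuristic and threshold are hypothetical.
type FeedbackSignal = "accept" | "edit" | "reject";

function tokenize(text: string): Set<string> {
  return new Set(text.split(/\s+/).filter((token) => token.length > 0));
}

/** Jaccard overlap of two token sets: |intersection| / |union|. */
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1;
  let shared = 0;
  a.forEach((token) => {
    if (b.has(token)) shared++;
  });
  return shared / (a.size + b.size - shared);
}

function detectFeedback(
  suggested: string,
  finalText: string,
  rejectBelow = 0.3,
): FeedbackSignal {
  // Identical text (ignoring surrounding whitespace) counts as acceptance.
  if (suggested.trim() === finalText.trim()) return "accept";
  const overlap = jaccard(tokenize(suggested), tokenize(finalText));
  // Little shared content suggests the user discarded the suggestion.
  return overlap < rejectBelow ? "reject" : "edit";
}
```

Each detected signal would become one behavioral event, which is also what the reranker update in 16.6 would consume.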

Sprint 17: Efficiency Engine (Weeks 9–10)

Goal

Optimize token usage: LLMLingua + Semantic Cache + Model Routing + Cost Tracking. Target: reduce token cost by 40% compared to v1.0.
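
The Semantic Cache idea is to reuse a stored response when a new query's embedding is close enough to a cached one. A minimal sketch, assuming embeddings are L2-normalized (so the dot product equals cosine similarity) and a hypothetical 0.95 threshold; the class shape is illustrative only:

```typescript
// Assumptions: embeddings are L2-normalized, so dot product = cosine
// similarity; the 0.95 threshold and the class shape are hypothetical.
function dotProduct(a: number[], b: number[]): number {
  const n = Math.min(a.length, b.length);
  let sum = 0;
  for (let i = 0; i < n; i++) {
    sum += (a[i] ?? 0) * (b[i] ?? 0);
  }
  return sum;
}

class SemanticCache {
  private readonly entries: { embedding: number[]; response: string }[] = [];
  private readonly threshold: number;

  constructor(threshold = 0.95) {
    this.threshold = threshold;
  }

  set(embedding: number[], response: string): void {
    this.entries.push({ embedding, response });
  }

  /** Return the closest cached response at or above the threshold, if any. */
  get(queryEmbedding: number[]): string | undefined {
    let best: string | undefined;
    let bestScore = this.threshold;
    for (const entry of this.entries) {
      const score = dotProduct(queryEmbedding, entry.embedding);
      if (score >= bestScore) {
        best = entry.response;
        bestScore = score;
      }
    }
    return best;
  }
}
```

The threshold trades hit rate against the risk of returning an answer to a different question, so it would need tuning against the 20% hit-rate target.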

Task Breakdown

#	Task	File(s)	Effort
17.1	LLMLingua integration (Python child_process)	`electron/services/skills/efficiency/llmlingua.ts`	2d
17.2	Semantic Cache service	`electron/services/skills/efficiency/semantic-cache.ts`	2d
17.3	Cache key generation (embedding similarity)	`electron/services/skills/efficiency/cache-key.ts`	1d
17.4	Model Router (complexity classifier + model selection)	`electron/services/skills/efficiency/model-router.ts`	2d
17.5	Model Registry (define models with cost/quality scores)	`electron/services/skills/efficiency/model-registry.ts`	4h
17.6	Cost Tracker service	`electron/services/skills/efficiency/cost-tracker.ts`	1d
17.7	Cost Dashboard UI	`src/components/efficiency/CostDashboard.tsx`	1d
17.8	costStore.ts (Zustand)	`src/stores/costStore.ts`	4h
17.9	Integrate efficiency pipeline into main query flow	`electron/services/skills/skill-router.ts` (update)	1d
17.10	Tests + benchmarks	`tests/unit/efficiency/*.test.ts`	1d

Acceptance Criteria

LLMLingua compresses context >= 40% while maintaining quality
Semantic cache hit rate >= 20% after 1 week of use
Model router correctly classifies complexity (manual eval > 80%)
Cost tracker is accurate (error < 5% vs. actual API cost)
Dashboard displays: cost/query, total cost, cache hit rate, compression ratio
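
The dashboard metrics above (cost/query, total cost, cache hit rate) can be derived from per-query token counts. A sketch assuming prices are expressed in USD per million tokens and that cache hits cost nothing; the pricing shape and class are hypothetical, and any prices passed in are the caller's responsibility:

```typescript
// Assumptions: prices are USD per million tokens and a cache hit costs
// nothing. Shapes and names are hypothetical, not the plan's API.
interface ModelPricing {
  inputPerMTok: number;
  outputPerMTok: number;
}

interface QueryUsage {
  model: string;
  inputTokens: number;
  outputTokens: number;
  cacheHit: boolean;
}

class CostTracker {
  private readonly pricing: Map<string, ModelPricing>;
  private readonly records: { cost: number; cacheHit: boolean }[] = [];

  constructor(pricing: Map<string, ModelPricing>) {
    this.pricing = pricing;
  }

  /** Record one query and return its cost in USD. */
  record(usage: QueryUsage): number {
    const price = this.pricing.get(usage.model);
    if (!price) throw new Error(`no pricing for model: ${usage.model}`);
    const cost = usage.cacheHit
      ? 0
      : (usage.inputTokens * price.inputPerMTok +
          usage.outputTokens * price.outputPerMTok) / 1e6;
    this.records.push({ cost, cacheHit: usage.cacheHit });
    return cost;
  }

  summary(): { queries: number; totalCost: number; costPerQuery: number; cacheHitRate: number } {
    const queries = this.records.length;
    const totalCost = this.records.reduce((sum, r) => sum + r.cost, 0);
    const hits = this.records.filter((r) => r.cacheHit).length;
    return {
      queries,
      totalCost,
      costPerQuery: queries === 0 ? 0 : totalCost / queries,
      cacheHitRate: queries === 0 ? 0 : hits / queries,
    };
  }
}
```

Checking the < 5% error criterion would mean comparing `totalCost` with the provider's billed amount over the same period.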

Sprint 18: Agent Mode (Weeks 11–12)

Goal

Turn Cortex into a coding agent: code execution, browser automation, multi-step tasks. Cortex doesn’t just answer — Cortex ACTS.
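
The core of agent mode is a loop that alternates between choosing an action and observing its result (the ReAct loop in task 18.3). A minimal sketch with a synchronous `policy` function standing in for the model call; in practice that call and the tools would be async, and every name here is hypothetical:

```typescript
// Assumption: `policy` stands in for the LLM call that decides the next step.
// It is synchronous here only to keep the sketch short; all names are hypothetical.
type AgentStep =
  | { kind: "action"; tool: string; input: string }
  | { kind: "finish"; answer: string };

type Policy = (transcript: string[]) => AgentStep;
type Tool = (input: string) => string;

function runReactLoop(
  task: string,
  policy: Policy,
  tools: Map<string, Tool>,
  maxSteps = 8,
): { answer: string | null; transcript: string[] } {
  const transcript = [`Task: ${task}`];
  for (let step = 0; step < maxSteps; step++) {
    const next = policy(transcript);
    if (next.kind === "finish") {
      transcript.push(`Answer: ${next.answer}`);
      return { answer: next.answer, transcript };
    }
    // Run the chosen tool and feed its output back as an observation.
    const tool = tools.get(next.tool);
    const observation = tool ? tool(next.input) : `unknown tool: ${next.tool}`;
    transcript.push(`Action: ${next.tool}(${next.input})`);
    transcript.push(`Observation: ${observation}`);
  }
  // Step budget exhausted without a final answer.
  return { answer: null, transcript };
}
```

The `maxSteps` budget is what keeps a confused agent from looping forever; the transcript is also the natural data source for the Agent UI's step display.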

Task Breakdown

#	Task	File(s)	Effort
18.1	Code Execution Sandbox (Docker or safe eval)	`electron/services/skills/agent/code-executor.ts`	2d
18.2	Playwright MCP integration	`electron/services/skills/mcp/playwright-adapter.ts`	1d
18.3	ReAct loop implementation	`electron/services/skills/reasoning/react-skill.ts`	2d
18.4	Plan-and-Execute pattern	`electron/services/skills/reasoning/plan-execute-skill.ts`	2d
18.5	Reflexion module (self-correction)	`electron/services/skills/reasoning/reflexion-skill.ts`	1d
18.6	Agent UI (show plan, steps, results)	`src/components/agent/AgentPanel.tsx`	2d
18.7	Terminal integration (run commands safely)	`electron/services/skills/agent/terminal.ts`	1d
18.8	Git operations as agent actions	`electron/services/skills/agent/git-actions.ts`	1d
18.9	Tests	`tests/unit/agent/*.test.ts`	1d

Acceptance Criteria

Code execution sandbox runs code safely (no file system access outside sandbox)
Playwright can navigate, click, and scrape web pages
ReAct loop completes multi-step tasks (e.g., ‘find the bug in file X and fix it’)
Agent UI displays plan + execution steps + results
Git operations: commit, branch, diff work through the agent
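
The sandbox criterion (no file system access outside the sandbox) needs, at minimum, a path containment check before any file operation. A sketch using Node's `path` module; `resolveInsideSandbox` is a hypothetical helper, and this check alone does not handle symlinks or replace OS-level isolation such as Docker:

```typescript
import * as path from "node:path";

// Assumption: every agent file operation resolves its path through this
// helper first. It does NOT follow symlinks and is not a full sandbox.
function resolveInsideSandbox(sandboxRoot: string, requested: string): string {
  const root = path.resolve(sandboxRoot);
  const target = path.resolve(root, requested);
  const relative = path.relative(root, target);
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative); // e.g. a different drive on Windows
  if (escapes) {
    throw new Error(`path escapes sandbox: ${requested}`);
  }
  return target;
}
```

Comparing the relative path, rather than checking a string prefix on the absolute path, avoids treating a sibling directory such as `/tmp/sandbox-other` as being inside `/tmp/sandbox`.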

Summary

Sprint	Effort (days)	P0 Skills	Key Deliverables
13	10 days	Memory tiers	Memory Manager + Dashboard + Migration
14	12 days	Skill Registry, MCP	Skill system + 4 built-in skills + MCP client
15	10 days	GraphRAG, Fusion, Contextual	Advanced RAG pipeline + Knowledge graph
16	10 days	DSPy, Behavioral	Self-learning pipeline + Dashboard
17	10 days	LLMLingua, Cache, Router	Efficiency engine + Cost dashboard
18	10 days	ReAct, Code Exec	Agent mode + Terminal + Git actions
TOTAL	62 days	19 P0 skills	Cortex v2.0

Definition of Done for v2.0

19 P0 skills working and passing tests
Multi-tier memory system working (3 tiers)
Self-learning running (DSPy + behavioral analytics)
Token cost reduced >= 30% compared to v1.0 (minimum for release; the Sprint 17 target is 40%)
Agent mode completing multi-step tasks
0 type errors, all tests passing
Documentation updated