input information to LLMs and then (2) route LLM responses to various system actions, such as speak and move. The system provides a standardized interface for communicating with many different LLM endpoints from all the major providers, including Anthropic, Google, DeepSeek, OpenAI, and Meta (through OpenRouter). The plugins handle authentication, API communication, prompt formatting, response parsing, and conversation history management. LLM plugin examples are located in src/llm/plugins.
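As a rough illustration of the responsibilities listed above, the sketch below shows a minimal, hypothetical plugin that authenticates, formats a prompt, calls a provider, and tracks conversation history. The class name, method names, and request shape are assumptions for illustration only and do not reproduce the actual base class in src/llm/plugins.

```python
# Illustrative only: a hypothetical LLM plugin skeleton.
# The real plugins in src/llm/plugins define their own base class,
# method names, and configuration handling.
import requests


class ExampleLLMPlugin:
    def __init__(self, api_key: str, endpoint: str, model: str):
        self.api_key = api_key          # authentication
        self.endpoint = endpoint        # provider-specific URL
        self.model = model
        self.history: list[dict] = []   # conversation history management

    def ask(self, prompt: str) -> dict:
        """Format the prompt, call the provider, and parse the reply."""
        self.history.append({"role": "user", "content": prompt})
        response = requests.post(
            self.endpoint,
            headers={"Authorization": f"Bearer {self.api_key}"},
            json={"model": self.model, "messages": self.history},
            timeout=30,
        )
        response.raise_for_status()
        reply = response.json()         # response parsing
        self.history.append({"role": "assistant", "content": str(reply)})
        return reply
```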
Endpoint Overview
Single-Agent LLM Integration
For testing and introductory educational purposes, we integrate with multiple language models (LLMs) to provide chat completion via a POST /api/core/{provider}/chat/completions endpoint. Each LLM plugin takes fused input data (the prompt) and sends it to an LLM. The response is then parsed and provided to runtime/cortex.py for distribution to the system actions. The pydantic output model is defined in src/llm/output_model.py.
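As a rough sketch of what such an output model can look like (the actual definitions live in src/llm/output_model.py and may differ; the class and field names below are illustrative assumptions):

```python
# Illustrative sketch of a pydantic output model for parsed LLM responses.
# The real model is defined in src/llm/output_model.py and may differ.
from pydantic import BaseModel, Field


class Action(BaseModel):
    """A single system action proposed by the LLM, e.g. speak or move."""
    type: str = Field(..., description="Action name, such as 'speak' or 'move'")
    value: str = Field(..., description="Argument for the action")


class LLMOutput(BaseModel):
    """Parsed LLM response handed to runtime/cortex.py for distribution."""
    actions: list[Action] = Field(default_factory=list)
```

A response parsed into this kind of shape can then be fanned out to the corresponding action handlers.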
Single-Agent LLM Configuration
Multi-Agent LLM Integration
The Multi-Agent endpoint at/api/core/agent
utilizes a collaborative system of specialized agents to perform more complex robotics tasks. The multi-agent system:
- Processes navigation, perception, and RAG queries in parallel using
asyncio.gather()
- Sends results to the team agent for synthesis
- Returns comprehensive response with individual agent outputs
- Tracks usage and duration metrics for each agent
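A minimal sketch of this parallel fan-out and team synthesis; the agent functions and the synthesis step here are placeholders, not the actual implementation:

```python
# Illustrative sketch of the fan-out / fan-in pattern described above.
# The individual agent functions and the team synthesis are placeholders.
import asyncio


async def query_navigation(prompt: str) -> str:
    return f"navigation result for: {prompt}"     # placeholder


async def query_perception(prompt: str) -> str:
    return f"perception result for: {prompt}"     # placeholder


async def query_rag(prompt: str) -> str:
    return f"rag result for: {prompt}"            # placeholder


async def synthesize(results: list[str]) -> str:
    return " | ".join(results)                    # placeholder team agent


async def run_agents(prompt: str) -> str:
    # Navigation, perception, and RAG queries run concurrently.
    results = await asyncio.gather(
        query_navigation(prompt),
        query_perception(prompt),
        query_rag(prompt),
    )
    # The team agent synthesizes the individual outputs into one response.
    return await synthesize(list(results))


if __name__ == "__main__":
    print(asyncio.run(run_agents("move to the charging dock")))
```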
Agent Architecture
The system employs four primary agents that work together:
- Navigation Agent: Processes spatial and movement-related tasks
- Perception Agent: Handles sensory input analysis and environmental understanding
- RAG Agent: Provides retrieval-augmented generation (RAG) capabilities using the user’s knowledge base
- Team Agent: Synthesizes outputs from all agents into a unified response
Main API Endpoint
API Debug Response Structure
In addition to the response flowing to OM1, which contains the actions the robot should perform, there is an additional response you can use for debugging and to observe token usage (“usage”).
Supported Models
Memory Management
The system includes memory capabilities at /api/core/agent/memory:
- Session-based memory storage via API keys
- Graph memory integration using Zep
- Conversation history tracking
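A minimal sketch of how a client might exercise this endpoint with an OpenMind API key; the HTTP method, base URL, and payload field shown here are assumptions for illustration, not a documented contract:

```python
# Illustrative only: the request shape for /api/core/agent/memory is assumed.
import requests

API_KEY = "om_your_api_key"                # memory is scoped per API key
BASE_URL = "https://api.openmind.org"      # assumed base URL

resp = requests.post(
    f"{BASE_URL}/api/core/agent/memory",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"query": "What did we talk about earlier?"},  # assumed payload field
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```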
RAG Integration
The RAG agent connects to the knowledge base system (/api/core/rag) to provide retrieval-augmented generation capabilities. To use RAG with your documents:
- Upload Documents: Visit https://portal.openmind.org/machines to upload your documents and files to your knowledge base
- Ask Questions: Once uploaded, you can ask questions about your documents through the multi-agent system at /api/core/agent (see the sketch below)
During agent processing, the multi-agent system will:
- Retrieve relevant documents
- Provide context-aware responses based on your uploaded content
- Access and search through your user-uploaded documents and files
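A minimal sketch of asking a question about uploaded documents through the multi-agent endpoint, assuming an OpenMind API key; the base URL and the request and response fields are assumptions for illustration:

```python
# Illustrative only: payload and response fields for /api/core/agent are assumed.
import requests

API_KEY = "om_your_api_key"
BASE_URL = "https://api.openmind.org"      # assumed base URL

resp = requests.post(
    f"{BASE_URL}/api/core/agent",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"prompt": "Summarize the maintenance manual I uploaded."},  # assumed field
    timeout=60,
)
resp.raise_for_status()
result = resp.json()
print(result)   # expect individual agent outputs plus "usage" metrics, as described above
```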
Getting Started
To try out the multi-agent system, send requests to the /api/core/agent endpoint described above.
Examples
A Smart Dog
Imagine you would like to program a smart dog. Describe the desired capabilities and behaviors of the dog in system_prompt_base. For example:
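Purely as an illustration (this is not the project's actual example prompt), a system_prompt_base could read:

```text
You are a friendly, intelligent robot dog. You can see, hear, and speak.
You love exploring and interacting with people. When you meet someone new,
greet them and wag your tail. If you hear a command such as "sit" or "come",
acknowledge it out loud and perform the matching movement. Be playful, but
always move carefully around people and obstacles.
```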
Medical Robot
To convert the robotic dog (example above) into a four-legged medical doctor, you can change the prompt and route traffic to a specialized healthcare-optimized endpoint (/api/core/agent/medical). This endpoint emphasizes the careful, responsible delivery of general health-related, non-diagnostic responses. A suitable prompt might look like the illustrative sketch at the end of this section. The medical endpoint also includes additional specialized agents:
- Verifier Agent: Validates medical information
- Questioner Agent: Manages medical consultation flow
In other respects, it behaves like the standard /agent endpoint.
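Purely as an illustration (not the project's actual example prompt), such a prompt could read:

```text
You are a careful, four-legged assistant that provides general, non-diagnostic
health information. Never diagnose conditions or prescribe treatments. Encourage
people to consult a licensed clinician for anything specific, and clearly state
the limits of your advice. Speak calmly, ask clarifying questions when a request
is ambiguous, and move gently around patients.
```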