input information to LLMs and then (2) route LLM responses to various system actions, such as speak and move. The OM1 system integrates various concrete implementations of Large Language Models (LLMs) and specialized agents, each designed to address different requirements and interaction patterns. These implementations manage API communication, conversation history, and the processing of structured responses, particularly for function calls that trigger agent actions. The framework ensures a consistent interface, allowing the system to interchangeably utilize diverse LLM backends.
The plugins handle authentication, API communication, prompt formatting, response parsing, and conversation history management. LLM plugin examples are located in src/llm/plugins.
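As a rough illustration of that shared interface, a plugin might be organized along the lines of the sketch below. The class and method names here are assumptions for illustration only; the plugins in src/llm/plugins define the actual contract.

```python
# Illustrative sketch of an LLM plugin's responsibilities; names are
# assumptions, not the actual interface defined in src/llm/plugins.
from dataclasses import dataclass, field


@dataclass
class ConversationHistory:
    """Rolling message history kept by the plugin."""
    messages: list[dict] = field(default_factory=list)

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})


class ExampleLLMPlugin:
    """Handles authentication, prompt formatting, API calls, and parsing."""

    def __init__(self, api_key: str, model: str):
        self.api_key = api_key                 # authentication
        self.model = model
        self.history = ConversationHistory()   # conversation history management

    async def ask(self, prompt: str) -> dict:
        """Send the fused prompt to the backend and parse the reply."""
        self.history.add("user", prompt)
        # A real plugin would call its provider's API here with
        # self.model, self.api_key, and self.history.messages.
        raw_reply = '{"actions": []}'          # placeholder structured response
        self.history.add("assistant", raw_reply)
        return {"content": raw_reply}
```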
Endpoint Overview
Single-Agent LLM Integration
For testing and introductory educational purposes, we integrate with multiple language models (LLMs) to provide chat completions via a POST /api/core/{provider}/chat/completions endpoint. Each LLM plugin takes fused input data (the prompt) and sends it to an LLM. The response is then parsed and provided to runtime/cortex.py for distribution to the system actions.
The Pydantic output model for these parsed responses is defined in src/llm/output_model.py.
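For orientation, the structured output distributed by cortex can be pictured with a small Pydantic sketch like the one below. The field names are assumptions; src/llm/output_model.py is authoritative.

```python
# Illustrative sketch of a structured output schema; the real models
# live in src/llm/output_model.py and may use different field names.
from pydantic import BaseModel


class Action(BaseModel):
    """One action for the robot to perform, e.g. speak or move."""
    type: str   # action name, such as "speak" or "move"
    value: str  # argument for the action, such as the text to speak


class LLMOutput(BaseModel):
    """Parsed LLM response passed to runtime/cortex.py for dispatch."""
    actions: list[Action]


# Parsing a structured (function-call style) response:
parsed = LLMOutput.model_validate_json(
    '{"actions": [{"type": "speak", "value": "Hello!"},'
    ' {"type": "move", "value": "wag tail"}]}'
)
print(parsed.actions[0].type)  # -> "speak"
```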
Single-Agent LLM Configuration
Multi-Agent LLM Integration
The multi-agent endpoint at /api/core/agent utilizes a collaborative system of specialized agents to perform more complex robotics tasks. The multi-agent system:
- Processes navigation, perception, and RAG queries in parallel using asyncio.gather() (see the sketch below)
- Sends results to the team agent for synthesis
- Returns a comprehensive response with individual agent outputs
- Tracks usage and duration metrics for each agent
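The fan-out and synthesis steps in the list above can be pictured with a short asyncio sketch. The agent functions and the team synthesis call below are placeholders, not the service's actual implementation.

```python
# Sketch of the fan-out/synthesis pattern described above; agent
# functions are placeholders, not the service's implementation.
import asyncio


async def run_navigation(query: str) -> str:
    return f"navigation result for: {query}"   # placeholder agent


async def run_perception(query: str) -> str:
    return f"perception result for: {query}"   # placeholder agent


async def run_rag(query: str) -> str:
    return f"rag result for: {query}"          # placeholder agent


async def team_synthesize(results: list[str]) -> str:
    # The team agent would combine individual outputs into one response.
    return " | ".join(results)


async def handle_request(query: str) -> str:
    # Navigation, perception, and RAG queries run concurrently.
    nav, per, rag = await asyncio.gather(
        run_navigation(query), run_perception(query), run_rag(query)
    )
    return await team_synthesize([nav, per, rag])


print(asyncio.run(handle_request("find the charging dock")))
```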
Local LLMs
The system supports on-device inference using the Qwen3-30B local LLM. This enables low-latency responses and allows certain workloads to run entirely on the device without relying on cloud connectivity.
Ollama Integration
Ollama provides an easy way to run open-source models locally. OM1 supports Ollama through the OllamaLLM plugin.
Prerequisites:
- Install Ollama: https://ollama.ai
- Pull a model: ollama pull llama3.2
- Ensure Ollama is running: ollama serve (a quick connectivity check is sketched below)
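Before wiring Ollama into the OllamaLLM plugin, it can help to confirm the local server responds. The snippet below calls Ollama's HTTP chat API on its default port, assuming the llama3.2 model pulled above.

```python
# Quick check that the local Ollama server responds (default port 11434).
import requests

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama3.2",
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
        "stream": False,  # return a single JSON object instead of a stream
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["message"]["content"])
```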
Dual LLM Support
OM1 implements a dual-LLM response mechanism that combines both local and cloud-based models to optimize response quality and latency.
- Local model: Qwen3-30B (on-device)
- Cloud model: GPT-4.1
How It Works
- For each request, OM1 sends the prompt to both the local and cloud LLMs in parallel.
- The system waits up to 3.2 seconds for responses.
- If both models return a response within the threshold:
  - The two responses are evaluated by the local LLM.
  - The local LLM selects the better response as the final output.
- If only one model responds within the threshold, that response is used directly as the final output (see the sketch below).
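The race-and-judge behaviour can be sketched as follows. The 3.2-second threshold comes from the description above, while the model-calling and judging functions are placeholder assumptions.

```python
# Sketch of the dual-LLM race described above; model calls are placeholders.
import asyncio

TIMEOUT_S = 3.2  # response threshold described above


async def ask_local(prompt: str) -> str:
    # Placeholder for the on-device Qwen3-30B call.
    await asyncio.sleep(0.05)
    return f"[local] {prompt}"


async def ask_cloud(prompt: str) -> str:
    # Placeholder for the cloud GPT-4.1 call.
    await asyncio.sleep(0.10)
    return f"[cloud] {prompt}"


async def judge_with_local(prompt: str, a: str, b: str) -> str:
    # Placeholder: ask the local LLM which candidate is better.
    return max(a, b, key=len)


async def dual_llm_response(prompt: str) -> str:
    tasks = {asyncio.create_task(ask_local(prompt)),
             asyncio.create_task(ask_cloud(prompt))}
    done, pending = await asyncio.wait(tasks, timeout=TIMEOUT_S)
    for task in pending:        # drop whichever model missed the deadline
        task.cancel()

    answers = [t.result() for t in done if t.exception() is None]
    if len(answers) == 2:
        # Both models made the deadline: the local LLM picks the better reply.
        return await judge_with_local(prompt, answers[0], answers[1])
    if len(answers) == 1:
        return answers[0]       # only one model responded in time
    raise TimeoutError("neither model responded within the threshold")


print(asyncio.run(dual_llm_response("How far is the charging dock?")))
```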
Agent Architecture
The system employs four primary agents that work together:
- Navigation Agent: Processes spatial and movement-related tasks
- Perception Agent: Handles sensory input analysis and environmental understanding
- RAG Agent: Provides retrieval-augmented generation (RAG) capabilities using the user’s knowledge base
- Team Agent: Synthesizes outputs from all agents into a unified response
Main API Endpoint
API Debug Response Structure
In addition to the response flowing to OM1, which contains the actions the robot should perform, the API returns a second response you can use for debugging and to observe token usage (“usage”).
Supported Models
Memory Management
The system includes memory capabilities at /api/core/agent/memory (a request sketch follows the list below):
- Session-based memory storage via API keys
- Graph memory integration using Zep
- Conversation history tracking
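A call against the memory endpoint might look roughly like the sketch below. The base URL, authorization header, and payload field are hypothetical and should be checked against the API reference.

```python
# Hypothetical sketch of querying session memory; the auth header and
# payload field names are illustrative, not the documented contract.
import requests

OM_API_KEY = "your-openmind-api-key"  # API keys scope the session memory

resp = requests.post(
    "https://api.openmind.org/api/core/agent/memory",  # assumed base URL
    headers={"Authorization": f"Bearer {OM_API_KEY}"},
    json={"query": "What did the user ask about earlier today?"},  # hypothetical field
    timeout=30,
)
print(resp.status_code, resp.text)
```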
RAG Integration (Currently Disabled)
The RAG agent connects to the knowledge base system (/api/core/rag) to provide retrieval-augmented generation capabilities. To use RAG with your documents:
- Upload Documents: Visit https://portal.openmind.org/machines to upload your documents and files to your knowledge base
- Ask Questions: Once uploaded, you can ask questions about your documents through the multi-agent system at /api/core/agent (see the example request below)
The RAG agent will then:
- Retrieve relevant documents during agent processing
- Provide context-aware responses based on your uploaded content
- Access and search through your user-uploaded documents and files
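Once the RAG integration is enabled and documents are uploaded, a question about them goes through the same multi-agent endpoint. The sketch below assumes the api.openmind.org base URL and an OpenAI-style message payload; both are assumptions to confirm against the API reference.

```python
# Hypothetical sketch of asking about uploaded documents through the
# multi-agent endpoint; base URL, auth header, and payload shape are assumptions.
import requests

resp = requests.post(
    "https://api.openmind.org/api/core/agent",  # assumed base URL
    headers={"Authorization": "Bearer your-openmind-api-key"},
    json={"messages": [
        {"role": "user", "content": "Summarize the maintenance manual I uploaded."}
    ]},
    timeout=60,
)
print(resp.json())
```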
Examples
A Smart Dog
Imagine you would like to program a smart dog. Describe the desired capabilities and behaviors of the dog in system_prompt_base. For example:
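(The wording below is purely illustrative, not taken from a shipped configuration.)

```text
You are a friendly, curious robot dog. You love to explore, greet people,
and respond to simple voice commands. When you hear your name, wag your
tail and turn toward the speaker. Keep your spoken replies short and
playful, and never move faster than a walking pace indoors.
```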
Medical Robot
To convert the robotic dog (example above) into a four-legged medical doctor, you can change the prompt and route traffic to a specialized, healthcare-optimized endpoint (/api/core/agent/medical). This endpoint emphasizes the careful, responsible delivery of general, non-diagnostic health-related responses. A suitable prompt might be:
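(Again, the wording below is illustrative only.)

```text
You are a calm, careful four-legged medical assistant. You provide general,
non-diagnostic health information, encourage people to consult a licensed
clinician for anything specific, and never prescribe treatments. Speak
slowly and clearly, and confirm that you have understood each question
before answering.
```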
In addition to the four primary agents used by the /agent endpoint, the medical endpoint adds two specialized agents:
- Verifier Agent: Validates medical information
- Questioner Agent: Manages medical consultation flow