LLM Integration
OM1’s LLM integration is intended to make it easy to (1) send input information to LLMs and then (2) route LLM responses to various system actions, such as speak and move. The system provides a standardized interface for communicating with many different LLM endpoints from all the major providers, including Anthropic, Google, DeepSeek, and OpenAI. The plugins handle authentication, API communication, prompt formatting, response parsing, and conversation history management.
LLM Plugin Examples
LLM plugins are located in src/llm/plugins.
How it Works - Single-Agent LLM Integration
For testing and introductory educational purposes, we integrate with multiple language models (LLMs) to provide chat completion via a post /{provider}/chat/completions endpoint. Each LLM plugin takes fused input data (the prompt) and sends it to an LLM (or a system of LLMs), then waits for the response. The response is then parsed and handed to runtime/cortex.py for distribution to the system actions.
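As a rough illustration of that flow (not the actual plugin code: the endpoint path follows the pattern above, while the class name, method name, and JSON shapes are assumptions), a plugin might look like:

```python
import requests


class ExampleLLMPlugin:
    """Hedged sketch of a single-agent LLM plugin."""

    def __init__(self, provider: str, api_key: str, base_url: str):
        # e.g. post /{provider}/chat/completions on the chosen base URL
        self.url = f"{base_url}/{provider}/chat/completions"
        self.headers = {"Authorization": f"Bearer {api_key}"}

    def ask(self, prompt: str) -> str:
        # Send the fused input data (the prompt) and wait for the response.
        resp = requests.post(
            self.url,
            headers=self.headers,
            json={"messages": [{"role": "user", "content": prompt}]},
        )
        resp.raise_for_status()
        # Parse the reply (an OpenAI-style response shape is assumed here);
        # the parsed result is then handed to runtime/cortex.py for routing.
        return resp.json()["choices"][0]["message"]["content"]
```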
The standard pydantic output model is defined in src/llm/output_model.py.
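The exact schema lives in that file; as a hedged sketch of the general shape (the class and field names below are illustrative, not the actual definitions), it pairs each response with a list of dispatchable actions:

```python
from pydantic import BaseModel, Field


class Action(BaseModel):
    """One action for the runtime to dispatch, e.g. speak or move."""

    type: str = Field(..., description="Action name, e.g. 'speak'")
    value: str = Field(..., description="Argument, e.g. the text to say")


class LLMOutput(BaseModel):
    """Top-level parsed LLM response handed to runtime/cortex.py."""

    actions: list[Action]
```

Because every response is validated against one fixed schema, runtime/cortex.py can route actions uniformly without provider-specific parsing.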
LLM Configuration
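An agent selects and parameterizes its LLM plugin through its configuration. As a minimal illustrative sketch (the key names and plugin name here are assumptions, not OM1’s exact schema), a single-agent setup might look like:

```python
# Illustrative only: the key names and plugin name are assumptions,
# not OM1's exact configuration schema.
cortex_llm_config = {
    "type": "OpenAILLM",  # which plugin in src/llm/plugins to load
    "config": {
        "api_key": "YOUR_API_KEY",
        "model": "gpt-4o",  # hypothetical model choice
    },
}
```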
Multi-Agent LLM Integration (post /agent)
The Multi-Agent robotics endpoint (accessed via post /agent) accepts fused data and routes it to a collaborative multi-agent system, providing a powerful way to solve more complex robotics tasks.
Key Features
- Collaborative Intelligence: Multiple specialized agents work together to solve complex problems
- Dynamic Timescales: Depending on the robot’s immediate situation, the LLMs will respond quickly (within 300 ms for emergency actions) or more slowly (within 2 seconds)
- Compatible Interface: Uses the same pattern as standard single-agent LLM implementations
Getting Started
To try out the multi-agent system, configure your agent to use the MultiLLM plugin instead of a single-agent plugin.
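A hedged sketch of that change, reusing the assumed config shape from the LLM Configuration section (MultiLLM is the module named under Technical Details below; the key names remain illustrative):

```python
# Same assumed config shape as above, switched to the multi-agent plugin.
cortex_llm_config = {
    "type": "MultiLLM",  # routes requests to the post /agent endpoint
    "config": {
        "api_key": "YOUR_API_KEY",
    },
}
```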
Technical Details
The MultiLLM module sends requests to the endpoint at https://api.openmind.org/api/core/agent. As with the single-agent LLM integration, the request includes:
- System prompt context
- User inputs
- Available actions for the agents
This information then flows to three interacting LLMs, which work together to decide on response priority and specific response actions.
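As a rough sketch of such a request (only the URL is taken from the module above; the JSON field names and the auth header are assumptions):

```python
import requests

payload = {
    # System prompt context
    "system_prompt": "You are a helpful robot...",
    # Fused user inputs (e.g. vision, sound, battery state)
    "inputs": ["A person is waving at you."],
    # Actions available to the agents
    "available_actions": ["speak", "move"],
}

resp = requests.post(
    "https://api.openmind.org/api/core/agent",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json=payload,
)
resp.raise_for_status()
print(resp.json())
```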
Use Cases
- Robotics: Interface with and control physical systems through natural language.
- Complex Decision Making: Combine multiple specialized agents for better reasoning.