Installation Guide
Learn how to install, set up and configure OM1.
System Requirements
Operating System
- Linux (Ubuntu 20.04+)
- macOS 12.0+
Hardware
- Sufficient memory to run vision and other models
- Reliable WiFi or other networking
- Sensors such as cameras, microphones, LIDAR units, and IMUs
- Actuators and outputs such as speakers, visual displays, and movement platforms (legs, arms, hands)
- Hardware connected to the “central” computer via Zenoh, CycloneDDS, serial, USB, or custom APIs/libraries
Software
Ensure you have the following installed on your machine:
- Python >= 3.10
- uv >= 0.6.2 as the Python package manager and virtual environment
- portaudio for audio input and output
- ffmpeg for video processing
- Openmind API key
UV (A Rust and Python package manager)
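If you do not already have uv, a minimal sketch of the install, assuming you use the official standalone installer or Homebrew (see the uv documentation for other options):

```bash
# Official standalone installer (Linux and macOS)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Or via Homebrew on macOS
brew install uv
```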
PortAudio Library
For audio functionality, install portaudio:
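For example, assuming Homebrew on macOS or apt on Ubuntu:

```bash
# macOS (Homebrew)
brew install portaudio

# Ubuntu / Debian
sudo apt-get install portaudio19-dev
```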
ffmpeg
For video functionality, install FFmpeg:
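Again assuming Homebrew on macOS or apt on Ubuntu:

```bash
# macOS (Homebrew)
brew install ffmpeg

# Ubuntu / Debian
sudo apt-get install ffmpeg
```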
Installation and Setup
- Clone the repository
Run the following commands to clone the repository and set up the environment:
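A sketch of the typical steps; the repository URL, submodule step, and uv virtual-environment step below are assumptions, so adjust them to match the repository you are working from:

```bash
git clone https://github.com/OpenmindAGI/OM1.git
cd OM1
git submodule update --init   # pull in any submodules the project depends on
uv venv                       # create the virtual environment with uv
```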
- Set the configuration variables
Locate the config folder and add your OpenMind API key to /config/spot.json (for example). If you do not already have one, you can obtain a free access key at https://portal.openmind.org/.
Note: Using the placeholder key openmind-free will generate errors.
Or, create a .env file in the project directory and add the following:
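For example (the variable name OM_API_KEY is an assumption; check the config files for the exact key name):

```bash
# .env — replace the value with your key from https://portal.openmind.org/
OM_API_KEY=your_actual_api_key
```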
- Run the Spot Agent
Run the following command to start the Spot Agent:
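A minimal sketch, assuming the agent is launched through a src/run.py entry point that takes the config name as an argument:

```bash
uv run src/run.py spot
```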
The first time you run this command, some required packages will be installed; this may take a little while, so please be patient. Once setup completes, you will see the system come to life.
WebSim to check input and output
Go to http://localhost:8000 to see real-time logs along with the input and output in the terminal. For easier debugging, add the --debug flag to see additional logging information.
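For example, under the same entry-point assumption as above:

```bash
uv run src/run.py spot --debug
```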
Understanding the Log Data
The log data provides insight into how the spot agent makes sense of its environment and decides on its next actions.
- First, it detects a person using vision.
- It communicates with an external AI API for response generation.
- The LLM(s) decide on a set of actions (dancing and speaking).
- The simulated robot expresses emotions via a front-facing display.
- Logs latency and processing times to monitor system performance.
More Examples
There are more pre-configured agents in the /config folder. They can be run with the following command:
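A sketch, assuming each agent is selected by the name of its JSON file in /config and launched through the same entry point as above:

```bash
uv run src/run.py <config_name>
```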
For example, to run the conversation agent:
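Under the same assumptions:

```bash
uv run src/run.py conversation
```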