System Requirements
Operating System
- Linux (Ubuntu 20, 22, 24)
- macOS 12.0+
Hardware
- Sufficient memory to run vision and other models
- Reliable WiFi or other networking
- Sensors such as cameras, microphones, LIDAR units, IMUs
- Actuators and outputs such as speakers, visual displays, and movement platforms (legs, arms, hands)
- Hardware connected to the "central" computer via Zenoh, CycloneDDS, serial, USB, or custom APIs/libraries
Software
Ensure you have the following installed on your machine:
- Python >= 3.10
- `uv` >= 0.6.2 as the Python package manager and virtual environment
- `portaudio` for audio input and output
- `ffmpeg` for video processing
- An OpenMind API key
uv (a Python package manager written in Rust)
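If you do not already have `uv`, one common way to install it is via its standalone installer script (see the uv documentation for alternatives such as Homebrew or pip):

```bash
# Install uv via its standalone installer script
curl -LsSf https://astral.sh/uv/install.sh | sh
```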
PortAudio Library
For audio functionality, install `portaudio`:
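A typical installation looks like this (package names may vary by distribution):

```bash
# macOS (Homebrew)
brew install portaudio

# Ubuntu / Debian
sudo apt-get install -y portaudio19-dev
```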
ffmpeg
For video functionality, install FFmpeg:
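For example (package names may vary by distribution):

```bash
# macOS (Homebrew)
brew install ffmpeg

# Ubuntu / Debian
sudo apt-get install -y ffmpeg
```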
Installation and Setup
- Clone the repository
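A typical clone step looks like the following; the repository URL and the submodule/virtual-environment steps shown here are assumptions, so use the commands from the project's repository page if they differ:

```bash
# Clone the repository (URL assumed) and enter the project directory
git clone https://github.com/OpenmindAGI/OM1.git
cd OM1

# Pull any submodules and create the virtual environment with uv (assumed steps)
git submodule update --init
uv venv
```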
- Set the configuration variables
  Locate the `config` folder and add your OpenMind API key to `/config/spot.json` (for example). If you do not already have one, you can obtain a free access key at https://portal.openmind.org/. Alternatively, create a `.env` file in the project directory and add the following:
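For example, from the project directory (the `OM_API_KEY` variable name is an assumption; use whichever key name your configuration files reference):

```bash
# Write the API key to a .env file (variable name assumed)
echo "OM_API_KEY=your_openmind_api_key" >> .env
```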
- Run the Spot Agent
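Assuming the standard `src/run.py` entry point, the agent can be started with:

```bash
# Start the pre-configured Spot agent (entry point assumed)
uv run src/run.py spot
```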
WebSim to check input and output
Go to http://localhost:8000 to see real-time logs along with the input and output in the terminal. For easy debugging, add `--debug` to see additional logging information.
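For example, assuming the run command shown above:

```bash
# Run the Spot agent with verbose logging
uv run src/run.py spot --debug
```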
Understanding the Log Data
The log data provide insight into how the `spot` agent makes sense of its environment and decides on its next actions.
- First, the agent detects a person using vision.
- It then communicates with an external AI API for response generation.
- The LLM(s) decide on a set of actions (dancing and speaking).
- The simulated robot expresses emotions via a front-facing display.
- The system logs latency and processing times to monitor performance.
More Examples
There are more pre-configured agents in the `/config` folder. They can be run with the following command:
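Assuming the same `src/run.py` entry point used above:

```bash
# Run any pre-configured agent by its configuration name
uv run src/run.py <agent_name>
```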
For example, to run the `conversation` agent:
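With the same assumed entry point:

```bash
# Run the conversation agent
uv run src/run.py conversation
```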
Replace `<agent_name>` with the name of your agent's configuration and run the command shown above.