We provide a simple `om1_vlm/VideoStream` wrapper to streamline integration with the VILA VLM API.

## Usage

The VLM endpoint uses WebSockets for efficient, low-latency communication:

```
wss://api-vila.openmind.org?api_key=<YOUR_API_KEY>
```
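As a quick connectivity check, the endpoint can also be reached with any standard WebSocket client. Below is a minimal sketch using the third-party `websockets` package; it is not part of `om1_vlm`, and frame capture and upload are handled by `VideoStream`, so this sketch only opens the connection and listens for replies:

```python
import asyncio
import json

import websockets  # third-party package: pip install websockets


async def check_connection(api_key: str) -> None:
    # Authenticate via the api_key query parameter, as shown in the endpoint URL above.
    url = f"wss://api-vila.openmind.org?api_key={api_key}"
    async with websockets.connect(url) as conn:
        print("Connected to the VILA VLM endpoint")
        # Replies arrive as JSON text, e.g. {"vlm_reply": "..."}.
        async for message in conn:
            print(json.loads(message).get("vlm_reply"))


if __name__ == "__main__":
    asyncio.run(check_connection("<YOUR_API_KEY>"))
```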
The following snippet demonstrates how to stream video frames to the VILA VLM API and print its replies:
```python
import time

from om1_utils import ws  # assumed source of the ws.Client WebSocket helper
from om1_vlm import VideoStream

# Connect to the VILA VLM endpoint and wrap the video stream around it
ws_client = ws.Client(url="wss://api-vila.openmind.org?api_key=<YOUR_API_KEY>")
vlm = VideoStream(ws_client.send_message, fps=30)

# Start the WebSocket client and the video stream
ws_client.start()
vlm.start()

# Print each VILA VLM API response as it arrives
ws_client.register_message_callback(lambda msg: print(msg))

# Keep the main thread alive while the stream runs in the background
while True:
    time.sleep(1)
```
The expected response from the VILA VLM API will be in the following format:
```json
{
  "vlm_reply": "The most interesting aspect in this series of images is the man's constant motion of speaking and looking in different directions while sitting in front of a laptop."
}
```
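Because each message is JSON text, a typical callback parses it and extracts `vlm_reply` rather than printing the raw payload. A minimal sketch, assuming the callback receives the raw JSON string and reusing the `ws_client` from the snippet above:

```python
import json


def handle_reply(msg: str) -> None:
    # Assumes the callback is handed the raw JSON string shown above.
    try:
        reply = json.loads(msg).get("vlm_reply", "")
    except (json.JSONDecodeError, AttributeError):
        return
    if reply:
        print(f"VILA: {reply}")


# Register in place of the lambda used in the snippet above.
ws_client.register_message_callback(handle_reply)
```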