# Cua | Artificialus

# Cua

Open-source infrastructure for computer-use agents — sandboxes, SDKs, benchmarks, and background desktop drivers.

Cua AI, Inc. (YC X25)

Open source

Since 2025

Cua is an open-source (MIT) infrastructure platform for building, benchmarking, and deploying computer-use agents. It provides background desktop control via Cua Driver, ephemeral multi-OS sandboxes via Cua Sandbox, standardized benchmarks via Cua Bench, and macOS virtualization via Lume — all accessible through a unified Python/TypeScript SDK, MCP server, and CLI.

## Pros
- First comprehensive open-source infrastructure layer for computer-use agents — fills the gap between vision models and real desktop control
- Background desktop control on macOS and Windows without stealing cursor or focus — agents click, type, and verify invisibly
- MCP server enables drop-in integration with existing coding agents (Claude Code, Codex, Cursor, Hermes, OpenCode) in minutes
- Unified multi-OS sandbox API — one SDK to boot Linux, Windows, macOS, and Android environments locally or in the cloud
- Cua Bench provides standardized benchmarks (OSWorld, ScreenSpot, Windows Arena) and RL environments for reproducible evaluation
- Lume delivers near-native macOS VM performance on Apple Silicon using Apple's Virtualization.Framework
- 18,000+ GitHub stars, 3,500+ commits, 521 releases — exceptionally active development with rapid iteration
- MIT licensed — fully open-source core with self-hosted and on-prem deployment options
- Verified data pipeline delivers human-reviewed trajectory datasets for training computer-use models

## Cons
- Cloud pricing is opaque — no public self-serve tiers, only 'request access' for hosted fleets and dedicated infrastructure
- Linux Cua Driver is pre-release only — not yet stable for production Linux desktop automation
- Early-stage company (founded March 2025) — rapidly evolving APIs and product surface may shift unexpectedly
- Complex product surface (Driver, Sandbox, Run, Bench, Lume, Verified Data) can be confusing to navigate for new users
- Cua Driver requires macOS or Windows host — no self-hosted driver option on Linux yet
- Some components (Cua Run, Verified Data) are cloud-only with no self-hosted alternative
- Heavy dependency on Apple ecosystem for macOS sandboxing — Lume requires Apple Silicon hardware

## Pricing

### Open Source

$0

MIT-licensed core — Cua Driver, Cua Sandbox SDK, Cua Bench, and Lume. Self-hosted on your own infrastructure.

### Cloud

Request

Hosted sandbox fleets on cua.ai with warm pools, multi-OS support (Linux, Windows, macOS, Android), and snapshot-based parallelism. BYOC and on-prem available.

### Verified Data

Request

Curated, human-reviewed trajectory datasets generated from verified rollouts across all four OS families for training computer-use models.

## Introduction

Cua is the first comprehensive open-source infrastructure layer purpose-built for computer-use agents: AI systems that see a screen, reason about it, and interact with desktop applications the way a human would. Developed by Cua AI, Inc. (Y Combinator X25) and released under the MIT license, Cua has grown to over 18,000 GitHub stars, 3,500+ commits, and 521 releases since its founding in March 2025.

The platform fills a critical gap in the AI agent stack. While vision-language models have become increasingly capable of understanding screenshots and generating actions, the infrastructure to actually run those actions on real desktops — in the background, across operating systems, at scale — was fragmented or proprietary. Cua provides the missing layer: background desktop drivers, ephemeral sandboxes, standardized benchmarks, and macOS virtualization, all accessible through a unified Python and TypeScript SDK, an MCP server, and a CLI.

Cua is used by over 50,000 engineers at companies including Google, Meta, Apple, and NVIDIA for training, evaluation, and deployment of computer-use agents.

## Key Components

Cua is not a single tool but a platform of interoperable components. You enter through one of four products:

### Cua Driver — Background Desktop Control

Cua Driver is the entry point for giving an existing coding agent the ability to control a desktop computer. It runs as a background daemon on macOS (stable) and Windows (stable), with Linux in pre-release.

> Agents click, type, scroll, inspect accessibility trees, and capture window state — all without stealing the cursor or focus from the user.

The driver exposes a single MCP stdio server and a CLI that call the same driver handlers. This means any MCP-capable client — Claude Code, Codex, Cursor, Hermes, OpenClaw, OpenCode — can wire it up in minutes:

```
# macOS / Linux
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/cua-driver/scripts/install.sh)"

# Then connect to Claude Code
claude mcp add --transport stdio cua-driver -- cua-driver mcp
```

Cua Driver is built on native platform APIs — Accessibility (AX) on macOS, UI Automation on Windows — and uses SkyLight internals on macOS for multi-cursor support. Each agent session gets its own synthetic cursor, and the driver orchestrates per-window element indexes that refresh on every snapshot, so pre-action and post-action window state are part of the contract.
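The snapshot contract described above can be illustrated with a toy model. This is a minimal sketch and not Cua Driver's implementation: `FakeWindow`, `Snapshot`, and `StaleSnapshotError` are hypothetical names. The only point is that an element index is meaningful relative to the snapshot that produced it, so an agent should re-snapshot after every action.

```python
from dataclasses import dataclass, field


class StaleSnapshotError(Exception):
    """Raised when an action uses an index from an outdated snapshot."""


@dataclass(frozen=True)
class Snapshot:
    generation: int   # UI generation this snapshot was taken at
    elements: tuple   # element labels; position in the tuple = element index


@dataclass
class FakeWindow:
    """Hypothetical stand-in for a window whose UI changes after actions."""
    labels: list = field(default_factory=lambda: ["7", "8", "9", "="])
    generation: int = 0

    def get_window_state(self) -> Snapshot:
        # Each snapshot indexes the elements as they exist right now.
        return Snapshot(self.generation, tuple(self.labels))

    def click(self, snapshot: Snapshot, element_index: int) -> str:
        if snapshot.generation != self.generation:
            raise StaleSnapshotError("take a new snapshot before acting")
        label = snapshot.elements[element_index]
        # Simulate the UI changing as a result of the click.
        self.labels.insert(0, f"display:{label}")
        self.generation += 1
        return label


window = FakeWindow()
before = window.get_window_state()
clicked = window.click(before, 1)    # index 1 is "8" in this snapshot
after = window.get_window_state()    # every index has shifted by one
```

Reusing `before` for a second click raises `StaleSnapshotError` in this model, which is the behavior the pre-action and post-action snapshots are meant to make explicit.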

Key capabilities:
- `launch_app` — Start an application by bundle ID or path
- `click` / `double_click` / `right_click` — Click by element index or coordinates
- `type_text` — Send keystrokes to a focused element
- `scroll` — Scroll in any direction
- `get_window_state` — Returns accessibility tree + screenshot of the current window
- `drag` / `held_button` — Mouse drag and held-button interactions (Linux pre-release)
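These tools are invoked from the CLI as `cua-driver call <tool> '<json>'`, the form used in the Getting Started commands on this page. A small helper can assemble that command line from Python. This is a sketch: `build_driver_call` is a hypothetical name, and the argument names are copied from the page's own `click` example rather than from an API reference.

```python
import json
import shlex


def build_driver_call(tool: str, **arguments) -> list[str]:
    """Return the argv list for `cua-driver call <tool> '<json>'`."""
    # Compact separators keep the JSON payload free of spaces.
    payload = json.dumps(arguments, separators=(",", ":"))
    return ["cua-driver", "call", tool, payload]


argv = build_driver_call("click", pid=844, window_id=10725, element_index=14)

# shlex.join renders the argv list as a copy-pasteable shell command.
print(shlex.join(argv))
```

Passing the list to `subprocess.run(argv, check=True)` would execute it without going through a shell, which avoids quoting problems in the JSON argument.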

### Cua Sandbox — Agent-Ready Environments

Cua Sandbox provides ephemeral desktop environments for agents to run in. One Python or TypeScript API boots Linux, Windows, macOS, or Android environments — locally via QEMU or in the cloud via Cua Cloud:

```
import asyncio

from cua import Sandbox, Image


async def main():
    # Same API regardless of OS or runtime
    async with Sandbox.ephemeral(Image.linux()) as sb:
        result = await sb.shell.run("echo hello")
        screenshot = await sb.screenshot()
        await sb.mouse.click(100, 200)
        await sb.keyboard.type("Hello from Cua!")
        # Touch gesture between two points (mobile environments)
        await sb.mobile.gesture((100, 500), (100, 200))


asyncio.run(main())
```

Sandboxes support snapshot-native rollouts — fork known machine states over copy-on-write storage, run parallel episodes, and reproduce failures without rebuilding from scratch. Cua Run extends this to elastic infrastructure: claim pre-booted machines from formally verified warm pools in milliseconds, scaling to zero when idle.
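The fork-and-run pattern can be sketched with standard-library pieces. This is an illustration of copy-on-write semantics only, under the assumption that a snapshot can be modeled as a dictionary. Real copy-on-write storage operates on disk blocks, and `BASE_SNAPSHOT`, `fork`, and `run_episode` are hypothetical names.

```python
import asyncio
from collections import ChainMap

# Hypothetical "machine state" captured in a snapshot.
BASE_SNAPSHOT = {"/etc/hostname": "golden", "/home/agent/notes.txt": ""}


def fork(snapshot: dict) -> ChainMap:
    """Copy-on-write view: reads fall through to the base, writes stay local."""
    return ChainMap({}, snapshot)


async def run_episode(episode_id: int) -> ChainMap:
    state = fork(BASE_SNAPSHOT)
    await asyncio.sleep(0)  # stand-in for real agent work
    state["/home/agent/notes.txt"] = f"episode {episode_id}"
    return state


async def main() -> list:
    # Run three episodes concurrently, each against its own fork.
    return await asyncio.gather(*(run_episode(i) for i in range(3)))


forks = asyncio.run(main())
```

Each fork stores only what it changed (`forks[1].maps[0]` holds a single key), and the base snapshot is never modified, so a failing episode can be reproduced by forking the same base again.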

| Runtime | Linux | Windows | macOS | Android | BYOI |
| --- | --- | --- | --- | --- | --- |
| Cloud (cua.ai) | ✅ | ✅ | ✅ | ✅ | 🔜 |
| Local (QEMU) | ✅ | ✅ | ✅ | ✅ | ✅ |

### Cua Bench — Benchmarks & RL Environments

Cua Bench is a standardized evaluation framework for computer-use agents. It supports running agents against established benchmarks — OSWorld, ScreenSpot, Windows Arena — and custom task definitions:

```
# Clone, install, and create base image
git clone https://github.com/trycua/cua && cd cua/cua-bench
uv tool install -e . && cb image create linux-docker

# Run benchmark with agent
cb run dataset datasets/cua-bench-basic --agent cua-agent --max-parallel 4
```

Cua Bench also provides RL environments for training computer-use models, with trajectory export for downstream training pipelines. The cuabench.ai registry hosts community-contributed tasks and leaderboards.
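To make the idea of a bounded-parallel benchmark run concrete, here is a minimal sketch of what a flag like `--max-parallel 4` implies: no more than four tasks in flight at once, with a success rate aggregated at the end. It does not use Cua Bench. `toy_agent`, `run_dataset`, and the task names are hypothetical.

```python
import asyncio

running = 0   # tasks currently executing
peak = 0      # highest concurrency observed


async def toy_agent(task: str) -> bool:
    """Hypothetical agent: fails any task whose name ends in '-hard'."""
    global running, peak
    running += 1
    peak = max(peak, running)
    await asyncio.sleep(0.01)  # stand-in for real agent work
    running -= 1
    return not task.endswith("-hard")


async def run_dataset(tasks, agent, max_parallel: int = 4) -> dict:
    limit = asyncio.Semaphore(max_parallel)

    async def guarded(task: str):
        async with limit:  # at most `max_parallel` tasks run at once
            return task, await agent(task)

    results = dict(await asyncio.gather(*(guarded(t) for t in tasks)))
    passed = sum(results.values())
    return {"results": results, "success_rate": passed / max(len(results), 1)}


TASKS = ["open-app", "fill-form", "export-csv", "drag-file-hard",
         "rename-file", "log-in", "search", "settings-hard"]
report = asyncio.run(run_dataset(TASKS, toy_agent, max_parallel=4))
print(f"success rate: {report['success_rate']:.0%}, peak concurrency: {peak}")
```

The semaphore is the whole mechanism: all tasks are scheduled immediately, but each waits for a slot before it starts work.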

### Lume — macOS Virtualization

Lume is a CLI tool for creating and managing macOS and Linux VMs on Apple Silicon with near-native performance, built on Apple's `Virtualization.Framework`. It predates Apple's Containerization announcement and was Cua's original open-source project:

```
# Install Lume
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh)"

# Pull & start a macOS VM
lume run macos-sequoia-vanilla:latest
```

Lume supports restoring from IPSW files and resizable raw disk images. It also provides Lumier, a Docker-compatible interface for orchestrating multiple VMs.

### Verified Data

Cua offers a Verified Data pipeline: human-reviewed trajectory datasets generated from verified agent rollouts across all four OS families. Every trajectory is scored against an evaluator, accepted runs pass human QA, and golden trajectories carry step-level annotations for training computer-use models.
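The three gates in that pipeline (evaluator score, human QA, step-level annotation) can be expressed as a simple filter. This is a sketch under assumed data shapes: the `Step` and `Trajectory` classes, their field names, and the `min_score` threshold are all hypothetical and not Cua's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    action: str
    annotation: str = ""       # step-level note added during review


@dataclass
class Trajectory:
    task: str
    steps: list = field(default_factory=list)
    score: float = 0.0         # assigned by an automated evaluator
    human_approved: bool = False


def select_golden(trajectories, min_score: float = 1.0) -> list:
    """Keep runs that pass the evaluator, pass human QA, and are annotated."""
    return [
        t for t in trajectories
        if t.score >= min_score
        and t.human_approved
        and t.steps
        and all(step.annotation for step in t.steps)
    ]


runs = [
    Trajectory("rename file", [Step("click", "open context menu"),
                               Step("type_text", "enter new name")], 1.0, True),
    Trajectory("rename file", [Step("click")], 1.0, True),                # not annotated
    Trajectory("export csv", [Step("click", "open menu")], 0.4, True),    # low score
    Trajectory("log in", [Step("type_text", "enter user")], 1.0, False),  # no human QA
]
golden = select_golden(runs)
```

Only the first run survives all three gates in this example.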

## Architecture

Cua's architecture is modular by design — each component can be used independently or composed together:

```
┌─────────────────────────────────────────────────────────────┐
│                        Coding Agents                        │
│   Claude Code · Codex · Cursor · Hermes · OpenClaw · ...    │
└────────────────────────┬────────────────────────────────────┘
                         │ MCP / HTTP
┌────────────────────────▼────────────────────────────────────┐
│                         Cua Driver                          │
│    Background desktop control (macOS / Windows / Linux)     │
│          Accessibility · SkyLight · UI Automation           │
└────────────────────────┬────────────────────────────────────┘
                         │
┌────────────────────────▼────────────────────────────────────┐
│                    Cua Sandbox + Cua Run                    │
│ Ephemeral VMs/containers: Linux · Windows · macOS · Android │
│       Local (QEMU) · Cloud (cua.ai) · BYOC · On-prem        │
│           Warm pools · Snapshots · Copy-on-write            │
└────────────────────────┬────────────────────────────────────┘
                         │
┌────────────────────────▼────────────────────────────────────┐
│       Cua Bench        │                Lume                │
│    Benchmarks + RL     │     macOS VMs on Apple Silicon     │
│  OSWorld · ScreenSpot  │      Virtualization.Framework      │
│     Windows Arena      │     Lumier (Docker-compatible)     │
└────────────────────────┴────────────────────────────────────┘
```

The stack is unified by the Python SDK (`pip install cua`) and the TypeScript SDK (`@trycua/cua`), with MCP as the universal protocol layer connecting coding agents to the driver.
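For readers unfamiliar with what travels over that protocol layer: MCP messages are JSON-RPC 2.0, and over the stdio transport each message is a single newline-delimited line. The sketch below builds a `tools/call` request. It assumes the driver's MCP tool accepts the same argument names as the CLI example on this page (`pid`, `window_id`, `element_index`), and `tools_call` is a hypothetical helper name.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids must be unique within a session


def tools_call(name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as one newline-terminated line."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    # json.dumps emits no raw newlines, so the line framing stays intact.
    return json.dumps(message) + "\n"


line = tools_call("click", {"pid": 844, "window_id": 10725, "element_index": 14})
print(line, end="")
```

A real session starts with an `initialize` handshake before any tool call. MCP clients such as Claude Code handle that exchange, so this framing is normally invisible to the user.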

## Integration with Coding Agents

Cua Driver's MCP server turns any MCP-capable coding agent into a computer-use agent with a single command.

### Claude Code

```
# Standard MCP: agent sees accessibility tree + tools
claude mcp add --transport stdio cua-driver -- cua-driver mcp

# Screenshot-grounded mode: Claude Code computer-use compatible
claude mcp add --transport stdio cua-computer-use -- cua-driver mcp --claude-code-computer-use-compat
```

### Codex

```
codex mcp add cua-driver -- cua-driver mcp
```

### Cursor / OpenCode / Custom MCP Clients

```
# Generate client-specific config
cua-driver mcp-config --client cursor
cua-driver mcp-config --client opencode

# Generic MCP server shape
cua-driver mcp-config
```
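The output of `cua-driver mcp-config` is not reproduced on this page. As a rough guide to what a stdio server entry typically looks like, many MCP clients accept a JSON object keyed by `mcpServers`, with a `command` and `args` per server. The sketch below generates that common shape. It is an assumption about the general format, not a copy of what the Cua CLI prints, and `stdio_server_entry` is a hypothetical helper.

```python
import json


def stdio_server_entry(name: str, command: str, args: list) -> dict:
    """Build a client config entry for an MCP server launched over stdio."""
    return {"mcpServers": {name: {"command": command, "args": list(args)}}}


# Mirrors the `cua-driver mcp` invocation used in the commands above.
config = stdio_server_entry("cua-driver", "cua-driver", ["mcp"])
print(json.dumps(config, indent=2))
```

Prefer the generated output of `cua-driver mcp-config --client <name>` where it is available, since individual clients differ in file location and key names.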

### Hermes

```
cua-driver mcp-config --client hermes
# paste into ~/.hermes/config.yaml
```

Any coding agent workflow extends to desktop automation — filling forms in native apps, exporting data from enterprise software, testing GUI applications, and automating multi-step workflows — all from the same chat interface the developer already uses.

## Use Cases

### Training Computer-Use Models

Cua provides the full pipeline: Cua Sandbox for environment orchestration at scale, Cua Bench for standardized evaluation, and Verified Data for human-reviewed training trajectories. Research teams at Google, Meta, and NVIDIA use Cua to generate training data for their computer-use vision-language models.

### Automated GUI Testing

QA teams can deploy Cua Driver in CI/CD pipelines to run GUI tests against native desktop applications. The driver's background execution means tests run without interfering with other work on the same machine, and snapshots provide deterministic replay of failures.

### Enterprise Desktop Automation

Cua Driver automates repetitive desktop workflows — data entry across legacy applications, software installation and configuration, file management across network drives — with the reliability of accessibility-tree-grounded interaction rather than fragile pixel coordinates.
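The difference between tree-grounded and coordinate-based interaction can be sketched as a lookup: the agent finds a control by role and title, then clicks by element index, so the action survives layout changes that would break a fixed `(x, y)` click. The tree below is a hypothetical shape for illustration. The structure returned by `get_window_state` may differ.

```python
from typing import Iterator, Optional

# Hypothetical accessibility tree for a legacy data-entry window.
TREE = {
    "role": "window", "title": "Invoice Entry", "children": [
        {"role": "textfield", "title": "Customer", "element_index": 3},
        {"role": "group", "title": "Actions", "children": [
            {"role": "button", "title": "Save", "element_index": 14},
            {"role": "button", "title": "Cancel", "element_index": 15},
        ]},
    ],
}


def walk(node: dict) -> Iterator[dict]:
    """Yield every node in the tree, depth first."""
    yield node
    for child in node.get("children", []):
        yield from walk(child)


def find_index(tree: dict, role: str, title: str) -> Optional[int]:
    """Return the element index of the first node matching role and title."""
    for node in walk(tree):
        if node.get("role") == role and node.get("title") == title:
            return node.get("element_index")
    return None


save_index = find_index(TREE, "button", "Save")
missing = find_index(TREE, "button", "Delete")
```

A `None` result is itself useful: the agent learns that the control is absent before acting, instead of clicking blindly at stale coordinates.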

### Multi-Agent Parallel Workflows

Cua Run's warm pools and session-identity system enable multiple agents to work in parallel on the same or different machines. Each agent gets its own cursor and window session, enabling coordinated multi-agent desktop automation.
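A warm pool is easiest to understand as a queue of pre-booted machines with a cold-boot fallback. The following toy model shows claim-by-session and scale-to-zero. It is a sketch, not Cua Run's scheduler: `WarmPool`, `boot_machine`, and the machine dictionaries are hypothetical.

```python
import itertools
from collections import deque


class WarmPool:
    """Toy warm pool: hands out pre-booted machines, boots cold if empty."""

    def __init__(self, boot, target_size: int):
        self._boot = boot
        self._idle = deque(boot() for _ in range(target_size))

    @property
    def idle_count(self) -> int:
        return len(self._idle)

    def claim(self, session_id: str) -> dict:
        # Warm path: take a pre-booted machine. Cold path: boot on demand.
        machine = self._idle.popleft() if self._idle else self._boot()
        machine["session"] = session_id  # each agent session owns its machine
        return machine

    def scale_to_zero(self) -> None:
        self._idle.clear()               # drop idle machines when unused


_ids = itertools.count(1)


def boot_machine() -> dict:
    """Hypothetical boot function; a real one would start a VM."""
    return {"id": next(_ids), "session": None}


pool = WarmPool(boot_machine, target_size=2)
a = pool.claim("agent-a")   # warm
b = pool.claim("agent-b")   # warm
c = pool.claim("agent-c")   # pool is empty, so this one boots cold
```

Tagging each claimed machine with a session identifier is what keeps parallel agents from acting on each other's desktops in this model.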

### macOS CI/CD for Apple Ecosystem

Lume provides near-native macOS VMs for CI/CD pipelines, so teams can build and test macOS, iOS, and visionOS applications in disposable VMs instead of on hand-maintained build machines. Lume itself still requires an Apple Silicon host. Combined with Cua's sandbox API, teams can spin up macOS environments on demand for testing.

## Pricing

| Tier | Price | Description |
| --- | --- | --- |
| Open Source | $0 | MIT-licensed core: Cua Driver, Cua Sandbox SDK, Cua Bench, and Lume. Self-hosted on your infrastructure. |
| Cloud | By request | Hosted sandbox fleets with warm pools, multi-OS support (Linux, Windows, macOS, Android), and snapshot-based parallelism. BYOC and on-prem available. |
| Verified Data | By request | Curated, human-reviewed trajectory datasets from verified rollouts across all four OS families. |

The entire core platform is MIT-licensed and available on GitHub. Cloud infrastructure and verified data are available by contacting the Cua team, with options for SOC 2-ready deployment, BYOC, and on-premises hosting.

## Getting Started

### Try Cua Driver (quickest start)

```
# Install

# macOS / Linux
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/cua-driver/scripts/install.sh)"

# Windows (PowerShell as admin)
irm https://raw.githubusercontent.com/trycua/cua/main/libs/cua-driver/scripts/install.ps1 | iex

# Start the daemon (`open` is the macOS launcher)
open -n -g -a CuaDriver --args serve
cua-driver status

# Wire into Claude Code
claude mcp add --transport stdio cua-driver -- cua-driver mcp

# Launch an app and interact
# (pid, window_id, and element_index are example values; use the ones from your own session)
cua-driver call launch_app '{"bundle_id":"com.apple.calculator"}'
cua-driver call get_window_state '{"pid":844,"window_id":10725}'
cua-driver call click '{"pid":844,"window_id":10725,"element_index":14}'
```

### Try Cua Sandbox (Python)

```
pip install cua
```

```
import asyncio

from cua import Sandbox, Image


async def main():
    async with Sandbox.ephemeral(Image.linux()) as sb:
        print(await sb.shell.run("uname -a"))
        await sb.mouse.click(500, 400)
        await sb.keyboard.type("Hello from Cua!")
        screenshot = await sb.screenshot()
        # screenshot is a PIL Image: save, display, or send to a VLM


asyncio.run(main())
```

### Try Cua Bench

```
git clone https://github.com/trycua/cua && cd cua/cua-bench
uv tool install -e .
cb image create linux-docker
cb run dataset datasets/cua-bench-basic --agent cua-agent
```

### Try Lume

```
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh)"

lume run macos-sequoia-vanilla:latest
```

## Further Reading
- Cua Documentation: Full guides, examples, and API reference
- Cua Driver Quickstart: Install and drive your first app
- Cua Sandbox Guide: Agent-ready sandboxes for any OS
- Cua Bench Documentation: Benchmarks and RL environments
- Lume Documentation: macOS VM management on Apple Silicon
- GitHub Repository: Source code, issues, and contributions
- Cua Blog: Tutorials, updates, and research
- ClawCon Multiplayer Announcement: Multi-player computer-use for coding agents
- Inside macOS Window Internals: How SkyLight enables multi-cursor background agents
- Inside Windows Computer-Use: Synthetic cursors for background agents on Windows

## Version History
- cua-computer-server v0.3.41 (Jun 15, 2026): Latest. Maintenance release with dependency updates.
- cua-computer-server v0.3.40 (Jun 12, 2026): Fix MCP routing at both /mcp and /mcp/ paths; route single printable keypress through type_text.
- cua-driver-rs v0.5.3 (Jun 12, 2026): Rust driver release. Linux background drag and held-button tools; show cursor and type in background terminals.
- lume v0.3.10 (Jun 8, 2026): macOS VM management. Version bump with stability improvements.
- cua-driver-rs v0.5.0 (Jun 1, 2026): Caller-declared session identity and Streamable-HTTP transport for multi-agent parallelism.

Best for: AI engineers and researchers who need infrastructure to build, evaluate, and deploy agents that control full desktops across Linux, Windows, macOS, and Android

Capability: Background desktop control without stealing cursor/focus · MCP server for drop-in agent integration (Claude Code, Codex, Cursor) · Multi-OS sandboxes via single API (Linux, Windows, macOS, Android) · Cua Bench standardized benchmarks (OSWorld, ScreenSpot, Windows Arena) · Lume macOS VMs on Apple Silicon with near-native performance

Runs on: macOS · Windows · Linux · CLI · Python SDK · TypeScript SDK · MCP

Signature Snippet

```
Install Cua Driver on macOS and connect it to Claude Code via MCP: 'claude mcp add --transport stdio cua-driver -- cua-driver mcp'. Then inside Claude Code, type: 'Open the Calculator app, compute 15% of 340, and paste the result into a new TextEdit document.' Cua Driver drives the desktop apps in the background (clicking, typing, and verifying via accessibility trees and screenshots) without stealing cursor focus or interrupting your workflow.
```
