Chapter 22: Browser Automation
OpenClaw's browser automation skill gives your agent the ability to control a real web browser: navigating to URLs, clicking buttons, filling forms, extracting data, and taking screenshots. This chapter covers how to set up browser automation and use it safely and effectively.
What Can Browser Automation Do?
With the browser skill enabled, your agent can:
- Visit any website and read its content
- Fill in and submit forms
- Log into web services (with stored credentials)
- Click through multi-step workflows
- Extract structured data from dynamic web pages (JavaScript-heavy sites that web-fetch can't handle)
- Take screenshots of pages or specific elements
- Download files from websites
How It Works
OpenClaw uses Playwright under the hood, the same browser automation library used in professional software testing. Playwright controls a real Chromium, Firefox, or WebKit browser in headless mode (no visible window).
User message → Agent decides to browse → Playwright launches browser
→ Browser navigates to URL → Agent reads page content
→ Agent takes actions (click, type, submit) → Returns results
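The flow above can be sketched against Playwright's Python sync API. This is a minimal illustration under stated assumptions, not OpenClaw's actual implementation: the `Step` type and the `run_steps` and `browse` helpers are hypothetical names introduced here, while `goto`, `click`, `fill`, `inner_text`, and `set_default_timeout` are existing Playwright `Page` methods.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Step:
    """One browser action (hypothetical shape, for illustration only)."""
    action: str      # "goto", "click", "fill", or "read"
    target: str      # URL for "goto", CSS selector otherwise
    value: str = ""  # text to type, used only by "fill"


def run_steps(page: Any, steps: List[Step]) -> List[str]:
    """Run steps in order on a Playwright-style page; return text from "read" steps."""
    results = []
    for step in steps:
        if step.action == "goto":
            page.goto(step.target)
        elif step.action == "click":
            page.click(step.target)
        elif step.action == "fill":
            page.fill(step.target, step.value)
        elif step.action == "read":
            results.append(page.inner_text(step.target))
        else:
            raise ValueError(f"unknown action: {step.action}")
    return results


def browse(steps: List[Step], timeout_ms: int = 30000) -> List[str]:
    """Launch headless Chromium, run the steps, and always close the browser."""
    # Imported lazily so run_steps() can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.set_default_timeout(timeout_ms)
            return run_steps(page, steps)
        finally:
            browser.close()
```

For example, `browse([Step("goto", "https://news.ycombinator.com"), Step("read", "body")])` would return the rendered page text as a one-element list. Keeping `run_steps` separate from the browser launch makes the action loop easy to exercise with a stand-in page object.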
Installation
The browser skill requires Playwright, which is not bundled by default:
# Install the browser skill
openclaw hub install @openclaw/browser
# Install browser binaries (run once)
openclaw browser install
Or install Playwright manually:
npm install -g playwright
npx playwright install chromium
Configuration
{
"skills": {
"browser": {
"enabled": true,
"browserType": "chromium",
"headless": true,
"timeout": 30000,
"userAgent": "Mozilla/5.0 (compatible; OpenClaw/1.0)",
"viewport": { "width": 1280, "height": 800 },
"allowedDomains": [],
"blockedDomains": ["localhost", "192.168.", "10.", "172.16."],
"maxConcurrentBrowsers": 2,
"screenshotOnError": true,
"downloadPath": "~/.openclaw/downloads"
}
}
}
| Field | Description |
|---|---|
| browserType | chromium, firefox, or webkit |
| headless | Set to false to see the browser window (debug only) |
| timeout | Max milliseconds per navigation or action |
| allowedDomains | Allowlist of domains (empty = all allowed) |
| blockedDomains | Domains the agent can never visit |
| maxConcurrentBrowsers | Limit parallel browser instances |
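The chapter does not spell out how entries in allowedDomains and blockedDomains are matched, so the following sketch rests on assumptions: entries ending in a dot (such as "192.168.") are treated as host prefixes, other entries match the host itself or any subdomain, and a blocked match always wins. If OpenClaw does match by string prefix, note that "172.16." covers only part of the private 172.16.0.0/12 range and "10." would also match a public hostname such as 10.example.com; the sketch therefore adds a stricter check using Python's standard `ipaddress` module. All function names are hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def host_matches(host: str, pattern: str) -> bool:
    """Assumed rule: 'a.b.' is a prefix; 'example.com' matches itself and subdomains."""
    if pattern.endswith("."):
        return host.startswith(pattern)
    return host == pattern or host.endswith("." + pattern)


def is_private_ip(host: str) -> bool:
    """True for loopback, link-local, and private-range IP literals."""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # not an IP literal, e.g. a hostname
    return ip.is_private or ip.is_loopback or ip.is_link_local


def is_url_allowed(url: str, allowed: list, blocked: list) -> bool:
    """Apply the blocklist first, then the allowlist (empty allowlist = allow all)."""
    host = (urlsplit(url).hostname or "").lower()
    if not host:
        return False  # no parseable host, e.g. a URL without a scheme
    if is_private_ip(host) or any(host_matches(host, b) for b in blocked):
        return False
    return not allowed or any(host_matches(host, a) for a in allowed)
```

With the default configuration shown earlier, `is_url_allowed("http://172.20.0.1/", [], blocked)` is rejected by the `ipaddress` check even though no blocklist prefix covers it.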
What the Agent Can Do
Navigate and Read
User: Go to https://news.ycombinator.com and tell me the top 5 stories
The agent visits the page, reads the content, and summarizes it, even if the page requires JavaScript to render.
Form Filling
User: Go to our internal ticketing system at tickets.internal.com, log in with the ops credentials, and create a ticket titled "Database backup failed" assigned to the infrastructure team
The agent navigates, logs in, fills the form, and confirms submission.
Data Extraction
User: Go to our competitor's pricing page and extract all plan names and prices into a table
The agent reads the page and returns structured data.
Screenshots
User: Take a screenshot of the current state of our staging dashboard
The agent returns an image attachment with the screenshot.
Credential Management
Store credentials securely so the agent can log into websites:
{
"skills": {
"browser": {
"credentials": {
"tickets.internal.com": {
"username": "${TICKET_SYSTEM_USER}",
"password": "${TICKET_SYSTEM_PASS}"
}
}
}
}
}
Credentials are never sent to the AI model; they are filled in directly by the Playwright automation layer before the agent reads the page.
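The `${TICKET_SYSTEM_USER}` style placeholders suggest that values are read from environment variables when the configuration is loaded. Assuming that is the case, the substitution could look like the sketch below. The `resolve_placeholders` helper is a hypothetical name, and failing loudly on an unset variable is a design choice made here, not documented OpenClaw behavior.

```python
import os
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_placeholders(value, env=None):
    """Recursively replace ${NAME} in strings; raise KeyError if NAME is unset."""
    env = os.environ if env is None else env
    if isinstance(value, dict):
        return {k: resolve_placeholders(v, env) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_placeholders(v, env) for v in value]
    if isinstance(value, str):
        return _PLACEHOLDER.sub(lambda m: env[m.group(1)], value)
    return value  # numbers, booleans, and None pass through unchanged
```

Raising on a missing variable avoids silently attempting a login with an empty or literal `${...}` password.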
Session Cookies
The browser can maintain authenticated sessions across multiple interactions:
{
"skills": {
"browser": {
"persistSessions": true,
"sessionPath": "~/.openclaw/browser-sessions"
}
}
}
The first time the agent logs into a site, the session cookie is saved. On subsequent visits to the same domain, the agent is already authenticated.
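Playwright itself supports this pattern through `BrowserContext.storage_state(path=...)`, which writes cookies and local storage to a file, and the `storage_state` argument of `Browser.new_context()`, which loads them again. The sketch below shows one way per-domain persistence might be layered on top; the one-file-per-host naming scheme and the helper names are assumptions, not OpenClaw's documented layout.

```python
from pathlib import Path
from urllib.parse import urlsplit


def session_file(url: str, session_dir: str) -> Path:
    """Map a URL to a per-domain state file (hypothetical naming scheme)."""
    host = (urlsplit(url).hostname or "unknown").lower()
    return Path(session_dir).expanduser() / f"{host}.json"


def open_context(browser, url: str, session_dir: str):
    """Create a browser context, reusing saved state for this domain if present."""
    path = session_file(url, session_dir)
    if path.exists():
        return browser.new_context(storage_state=str(path))
    return browser.new_context()


def save_context(context, url: str, session_dir: str) -> Path:
    """Write the context's cookies and local storage to the per-domain file."""
    path = session_file(url, session_dir)
    path.parent.mkdir(parents=True, exist_ok=True)
    context.storage_state(path=str(path))
    return path
```

Saved state files contain live session cookies, so the session directory deserves the same file permissions as any other credential store.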
Proxy Support
Route browser traffic through a proxy for privacy or geo-location:
{
"skills": {
"browser": {
"proxy": {
"server": "http://proxy.example.com:8080",
"username": "${PROXY_USER}",
"password": "${PROXY_PASS}"
}
}
}
}
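The fields in this block line up with the `proxy` option that Playwright's `launch()` accepts (`server`, `username`, `password`). As a rough sketch of how the configuration might be translated, assuming the placeholders have already been resolved to real values, a hypothetical helper could build the launch arguments like this:

```python
def proxy_launch_options(proxy_cfg: dict) -> dict:
    """Build keyword arguments for Playwright's launch() from a proxy config block."""
    if not proxy_cfg or not proxy_cfg.get("server"):
        return {}  # no proxy configured: launch with defaults
    proxy = {"server": proxy_cfg["server"]}
    # Credentials are optional; include them only when both are present.
    if proxy_cfg.get("username") and proxy_cfg.get("password"):
        proxy["username"] = proxy_cfg["username"]
        proxy["password"] = proxy_cfg["password"]
    return {"proxy": proxy}
```

The result can be passed straight through, for example `p.chromium.launch(headless=True, **proxy_launch_options(cfg))`.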
Safety Limits
Browser automation is powerful, so apply strict controls for shared gateways:
{
"skills": {
"browser": {
"allowedDomains": ["yourcompany.com", "internal.yourcompany.com"],
"blockedActions": ["download", "form-submit"],
"requireConfirmation": ["form-submit", "click-purchase"]
}
}
}
The requireConfirmation field causes the agent to pause and describe what it is about to do before executing potentially irreversible actions like form submissions.
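How blockedActions and requireConfirmation interact is not stated here. The sketch below assumes that a blocked action is refused outright and that confirmation is only requested for actions that are not blocked. Under that assumption, listing "form-submit" in both fields, as the example above does, means form submissions are always refused and the confirmation entry never takes effect for them. The `gate_action` helper and its `confirm` callback are hypothetical.

```python
from typing import Callable


def gate_action(action: str, description: str, cfg: dict,
                confirm: Callable[[str], bool]) -> bool:
    """Return True if the action may run. Blocked actions never reach confirm()."""
    if action in cfg.get("blockedActions", []):
        return False
    if action in cfg.get("requireConfirmation", []):
        # Pause and describe the pending action before doing anything irreversible.
        return confirm(f"About to {description}. Proceed?")
    return True
```

The `confirm` callback is where a gateway would relay the question to the user and wait for a reply.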
Workspace-Level Browser Access
Limit browser access to specific workspaces:
{
"workspaces": [
{
"id": "admins",
"agent": "expert",
"skills": ["browser", "bash", "files"]
},
{
"id": "team",
"agent": "balanced",
"skills": ["web-search", "web-fetch"]
}
]
}
The team workspace gets lightweight web access (search and fetch), while admins gets full browser automation.
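The access rule this configuration implies can be written as a simple lookup. The sketch assumes that a workspace may use only the skills it lists and that an unknown workspace gets no skills at all; `workspace_allows` is a hypothetical helper, not part of OpenClaw.

```python
def workspace_allows(config: dict, workspace_id: str, skill: str) -> bool:
    """True if the named workspace lists the skill; unknown workspaces get nothing."""
    for ws in config.get("workspaces", []):
        if ws.get("id") == workspace_id:
            return skill in ws.get("skills", [])
    return False
```

Against the configuration above, `workspace_allows(config, "team", "browser")` is false while `workspace_allows(config, "admins", "browser")` is true.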
Next: Chapter 23, Cron & Scheduled Tasks: how to schedule your agent to run tasks automatically on a time-based schedule.