Chapter 22: Browser Automation

OpenClaw's browser automation skill lets your agent control a real web browser: navigating to URLs, clicking buttons, filling forms, extracting data, and taking screenshots. This chapter covers how to set up browser automation and use it safely and effectively.


What Can Browser Automation Do?

With the browser skill enabled, your agent can:

  • Visit any website and read its content
  • Fill in and submit forms
  • Log into web services (with stored credentials)
  • Click through multi-step workflows
  • Extract structured data from dynamic web pages (JavaScript-heavy sites that web-fetch can't handle)
  • Take screenshots of pages or specific elements
  • Download files from websites

How It Works

OpenClaw uses Playwright under the hood, the same browser automation library used in professional software testing. Playwright controls a real Chromium, Firefox, or WebKit browser, by default in headless mode (no visible window).

User message → Agent decides to browse → Playwright launches browser
    → Browser navigates to URL → Agent reads page content
    → Agent takes actions (click, type, submit) → Returns results

Installation

The browser skill requires Playwright, which is not bundled by default:

# Install the browser skill
openclaw hub install @openclaw/browser

# Install browser binaries (run once)
openclaw browser install

Or install Playwright manually:

npm install -g playwright
npx playwright install chromium

Configuration

{
  "skills": {
    "browser": {
      "enabled": true,
      "browserType": "chromium",
      "headless": true,
      "timeout": 30000,
      "userAgent": "Mozilla/5.0 (compatible; OpenClaw/1.0)",
      "viewport": { "width": 1280, "height": 800 },
      "allowedDomains": [],
      "blockedDomains": ["localhost", "192.168.", "10.", "172.16."],
      "maxConcurrentBrowsers": 2,
      "screenshotOnError": true,
      "downloadPath": "~/.openclaw/downloads"
    }
  }
}

Field                  Description
browserType            chromium, firefox, or webkit
headless               Set to false to see the browser window (debug only)
timeout                Maximum milliseconds per navigation or action
allowedDomains         Allowlist of domains (empty = all allowed)
blockedDomains         Domains the agent can never visit
maxConcurrentBrowsers  Limit on parallel browser instances
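
To make the interplay of allowedDomains and blockedDomains concrete, here is a minimal Python sketch of how such a check could work. The function name is_url_allowed and the matching rules are assumptions for illustration (entries ending in a dot are treated as host prefixes, and an allowed domain is assumed to cover its subdomains); OpenClaw's actual matching logic may differ.

```python
from urllib.parse import urlparse


def is_url_allowed(url: str, allowed: list[str], blocked: list[str]) -> bool:
    """Return True if the URL's host passes the block list and the allow list."""
    host = (urlparse(url).hostname or "").lower()
    if not host:
        return False
    # Assumption: blocked entries are exact host names, or host prefixes
    # when they end with a dot (for example "192.168.").
    for entry in blocked:
        if host == entry or (entry.endswith(".") and host.startswith(entry)):
            return False
    # An empty allow list means every non-blocked host is permitted.
    if not allowed:
        return True
    # Assumption: an allowed domain also covers its subdomains.
    return any(host == d or host.endswith("." + d) for d in allowed)
```

Under these assumptions the block list is checked first, so a blocked host stays unreachable even if it also appears in the allow list. Note that plain prefix matching is coarse: an entry such as "10." would also match a public host name that happens to begin with "10.".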

What the Agent Can Do

Navigate and Read

User: Go to https://news.ycombinator.com and tell me the top 5 stories

The agent visits the page, reads the content, and summarizes it, even if the page requires JavaScript to render.

Form Filling

User: Go to our internal ticketing system at tickets.internal.com, log in with the ops credentials, and create a ticket titled "Database backup failed" assigned to the infrastructure team

The agent navigates, logs in, fills the form, and confirms submission.

Data Extraction

User: Go to our competitor's pricing page and extract all plan names and prices into a table

The agent reads the page and returns structured data.

Screenshots

User: Take a screenshot of the current state of our staging dashboard

The agent returns an image attachment with the screenshot.


Credential Management

Store credentials as environment-variable references, so the agent can log into websites without secrets appearing in the config file:

{
  "skills": {
    "browser": {
      "credentials": {
        "tickets.internal.com": {
          "username": "${TICKET_SYSTEM_USER}",
          "password": "${TICKET_SYSTEM_PASS}"
        }
      }
    }
  }
}

Credentials are never sent to the AI model. The Playwright automation layer fills them in directly before the agent reads the page.
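
The ${...} placeholders in the config above suggest that values are read from environment variables at load time. The following Python sketch shows one way such substitution could be done; resolve_placeholders is a hypothetical helper, and raising an error on a missing variable is an assumption, not documented OpenClaw behavior.

```python
import os
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_placeholders(value, env=None):
    """Recursively replace ${NAME} in strings with values from the environment."""
    env = os.environ if env is None else env
    if isinstance(value, dict):
        return {k: resolve_placeholders(v, env) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_placeholders(v, env) for v in value]
    if isinstance(value, str):
        def substitute(match):
            name = match.group(1)
            # Assumption: fail loudly rather than send an empty credential.
            if name not in env:
                raise KeyError(f"environment variable {name} is not set")
            return env[name]
        return _PLACEHOLDER.sub(substitute, value)
    return value
```

Resolving placeholders in the automation layer, rather than in the prompt, fits the guarantee stated above: the model only ever sees the placeholder-free page content, never the secret itself.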


Session Cookies

The browser can maintain authenticated sessions across multiple interactions:

{
  "skills": {
    "browser": {
      "persistSessions": true,
      "sessionPath": "~/.openclaw/browser-sessions"
    }
  }
}

The first time the agent logs into a site, the session cookie is saved. On subsequent visits to the same domain, the agent is already authenticated.
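
As a rough illustration of the save-then-reuse behavior described above, the Python sketch below keeps one JSON file per domain under the session directory. The helper names and the file layout are hypothetical; OpenClaw's real on-disk session format is not documented here and may look quite different.

```python
import json
from pathlib import Path


def session_file(session_path: str, domain: str) -> Path:
    """Map a domain to one JSON file inside the session directory (assumed layout)."""
    return Path(session_path).expanduser() / f"{domain}.json"


def save_session(session_path: str, domain: str, cookies: list[dict]) -> None:
    """Persist the cookies captured after a successful login."""
    path = session_file(session_path, domain)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"cookies": cookies}))


def load_session(session_path: str, domain: str) -> list[dict]:
    """Return saved cookies, or an empty list on the first visit to a domain."""
    path = session_file(session_path, domain)
    if not path.exists():
        return []
    return json.loads(path.read_text())["cookies"]
```

Because saved session files grant access without a password, the session directory deserves the same file permissions and backup care as any other secret store.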


Proxy Support

Route browser traffic through a proxy for privacy or geo-location:

{
  "skills": {
    "browser": {
      "proxy": {
        "server": "http://proxy.example.com:8080",
        "username": "${PROXY_USER}",
        "password": "${PROXY_PASS}"
      }
    }
  }
}

Safety Limits

Browser automation is powerful, so apply strict controls on shared gateways:

{
  "skills": {
    "browser": {
      "allowedDomains": ["yourcompany.com", "internal.yourcompany.com"],
      "blockedActions": ["download", "form-submit"],
      "requireConfirmation": ["form-submit", "click-purchase"]
    }
  }
}

The requireConfirmation field causes the agent to pause and describe what it is about to do before executing potentially irreversible actions like form submissions.
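
One way to picture how blockedActions and requireConfirmation combine is a small decision function. This Python sketch is illustrative only: the function name decide, the three outcome strings, and the rule that a block takes precedence over a confirmation requirement are all assumptions rather than documented OpenClaw behavior.

```python
def decide(action: str, blocked_actions: list[str], require_confirmation: list[str]) -> str:
    """Return "deny", "confirm", or "allow" for a named browser action."""
    # Assumption: a block always wins over a confirmation requirement.
    if action in blocked_actions:
        return "deny"
    # Listed actions pause so the agent can describe what it is about to do.
    if action in require_confirmation:
        return "confirm"
    return "allow"
```

With this precedence, an action that appears in both lists is denied outright and never reaches the confirmation step, so it is worth checking that each action you want confirmed is not also blocked.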


Workspace-Level Browser Access

Limit browser access to specific workspaces:

{
  "workspaces": [
    {
      "id": "admins",
      "agent": "expert",
      "skills": ["browser", "bash", "files"]
    },
    {
      "id": "team",
      "agent": "balanced",
      "skills": ["web-search", "web-fetch"]
    }
  ]
}

The team workspace gets lightweight web access (search and fetch), while admins gets full browser automation.
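
The gating described above amounts to a membership check against each workspace's skills list. A minimal Python sketch follows, assuming the config has been loaded into a dictionary; workspace_has_skill is a hypothetical helper, and denying unknown workspaces is an assumption.

```python
def workspace_has_skill(config: dict, workspace_id: str, skill: str) -> bool:
    """Return True if the named workspace lists the skill."""
    for workspace in config.get("workspaces", []):
        if workspace.get("id") == workspace_id:
            return skill in workspace.get("skills", [])
    # Assumption: unknown workspaces get no skills at all.
    return False


# Mirrors the workspace configuration shown above.
CONFIG = {
    "workspaces": [
        {"id": "admins", "agent": "expert", "skills": ["browser", "bash", "files"]},
        {"id": "team", "agent": "balanced", "skills": ["web-search", "web-fetch"]},
    ]
}
```

For example, workspace_has_skill(CONFIG, "team", "browser") is False, so a request from the team workspace would fall back to web-search or web-fetch instead of launching a browser.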


Next: Chapter 23, Cron & Scheduled Tasks: how to run agent tasks automatically on a time-based schedule.