Browser Use: The Open Source Tool That Lets AI Agents Control Any Website
Browser Use is an open source Python library that lets AI agents browse the web the way a person would. It has 85,800 stars on GitHub, raised $17M in seed funding, went through Y Combinator W25, and recently shipped a feature built specifically for autonomous agents: self-registration. Your agent can sign up for an account, get an API key, and start browsing without any human involvement. This kind of agent-first design is exactly what standards like ai-agent.json are trying to formalize across the web.
That last part matters more than you think.
What Browser Use Actually Does
The core idea is simple. You give an AI agent a task in natural language. Browser Use opens a browser, navigates to the right pages, clicks buttons, fills forms, reads content, and returns results. No XPath selectors. No CSS queries. No brittle scripts that break when a website changes its layout.
The agent sees the page the way a human would and figures out what to do. Pair it with a local model like Google Gemma 4 and you have fully autonomous browsing that never leaves your machine.
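```python
import asyncio
from browser_use import Agent
from langchain_openai import ChatOpenAI

async def main():
    # Describe the task in plain language; the agent decides which pages to visit.
    agent = Agent(
        task="Find the cheapest flight from NYC to London next Friday",
        llm=ChatOpenAI(model="gpt-4o"),
    )
    result = await agent.run()
    print(result)

asyncio.run(main())
```
That is a complete working example, provided the packages are installed and an OpenAI API key is configured. The agent will open a browser, navigate to flight search sites, enter the query, compare prices, and return the result. No Selenium. No Playwright scripts. No manual element targeting.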
The Five Products
Web Agents
The core product. AI agents that can extract data, automate workflows, fill forms, run tests, and monitor websites. They work with any LLM: Claude, GPT-4, Gemini, Llama, or custom models. The agent interprets the page visually and decides how to interact with it, which means it handles dynamic content, SPAs, and JavaScript-heavy sites without special configuration.
Stealth Browsers
Anti-detection fingerprinting is enabled by default. Browser Use includes CAPTCHA solving, residential proxies across 195+ countries, and Cloudflare bypass. Zero configuration required. This is the feature that makes it practical for real-world use. Most websites actively block automated browsers, and Browser Use is designed to make the agent hard to distinguish from a real user.
Custom Models
LLMs purpose-built for browser automation. The recommended model is Claude Sonnet 4.6 at $3.60/$18 per million tokens. You can also use Gemini 3 Flash ($0.60/$3.60 per million) for cost-sensitive tasks or Claude Opus 4.6 ($6/$30 per million) for complex multi-step workflows.
Skill APIs
This is the most interesting product for developers. You define a task and a schema (Pydantic or Zod). Browser Use executes the task, caches the workflow, and exposes it as a reliable API endpoint. First execution uses the LLM. Subsequent executions replay the cached script at nearly zero cost. When a website changes its layout, the system auto-heals the script for roughly $0.05 to $1.00.
Any website becomes an API. Create once, call forever.
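The record-once, replay-cheaply, heal-on-failure idea can be sketched in plain Python. This is an illustration of the pattern only, not Browser Use's implementation: `SkillCache`, `ReplayError`, and the `record`/`replay` callables are all hypothetical names, and a list of strings stands in for a recorded browser script.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Script = List[str]  # hypothetical stand-in for a recorded sequence of browser steps


class ReplayError(Exception):
    """Raised by a replay function when a cached script no longer fits the page."""


@dataclass
class SkillCache:
    record: Callable[[str], Script]   # expensive path: an LLM explores and records steps
    replay: Callable[[Script], dict]  # cheap path: deterministic replay of recorded steps
    scripts: Dict[str, Script] = field(default_factory=dict)
    llm_runs: int = 0                 # counts how often the expensive path was taken

    def _record(self, task: str) -> None:
        self.scripts[task] = self.record(task)
        self.llm_runs += 1

    def call(self, task: str) -> dict:
        # First call for a task: record a script with the LLM.
        if task not in self.scripts:
            self._record(task)
        try:
            # Later calls: replay the cached script without the LLM.
            return self.replay(self.scripts[task])
        except ReplayError:
            # The page changed: re-record once ("heal") and retry the replay.
            self._record(task)
            return self.replay(self.scripts[task])
```

The design choice worth noting is that the LLM sits only on the slow path. As long as the page stays stable, every call after the first is a dictionary lookup plus a replay.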
Proxies
Residential proxies across 195+ countries. Integrated into the stealth browser stack so there is nothing to configure. $5 per GB across all plans.
AI Agent Self-Registration
This is the feature that got the AI agent community paying attention. Browser Use lets agents register for an account autonomously through a challenge-response flow. No human needed. No email verification. No CAPTCHA (ironic for a CAPTCHA-solving tool).
The flow works like this:
- POST /cloud/signup to request a challenge. Email and name are optional.
- The API returns an obfuscated math problem. Numbers are encoded in Chinese characters, Esperanto, or other languages. The text is intentionally garbled with random punctuation. The problem requires an LLM to parse. Deterministic code will not work.
- Solve the problem and submit the answer as a string with two decimal places.
- POST /cloud/signup/verify with the challenge ID and answer. The API returns a bu_... API key.
That API key gives the agent access to browsers and proxies on the free tier, which allows 3 concurrent sessions. The agent can start browsing immediately.
To let a human claim the account later, the agent calls POST /cloud/signup/claim which returns a one-time URL valid for one hour.
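The signup and verify steps can be sketched as a small client. Treat this as a rough outline under stated assumptions: the article gives the two paths but not the base URL or the JSON field names, so `base_url` is supplied by the caller and `challenge_id`, `challenge_text`, `answer`, and `api_key` are guessed names to check against the Cloud docs. The `solve` callable is where an LLM call would go. The claim step is left out because the article does not describe how it is authenticated.

```python
import json
from typing import Callable
from urllib import request

# A transport takes (path, payload) and returns the decoded JSON reply.
Transport = Callable[[str, dict], dict]


def make_http_transport(base_url: str) -> Transport:
    """Build a transport that POSTs JSON to base_url + path."""
    def post(path: str, payload: dict) -> dict:
        req = request.Request(
            base_url.rstrip("/") + path,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with request.urlopen(req) as resp:
            return json.load(resp)
    return post


def format_answer(value: float) -> str:
    """Answers are submitted as strings with two decimal places."""
    return f"{value:.2f}"


def self_register(post: Transport, solve: Callable[[str], float]) -> str:
    # Step 1: request a challenge (email and name are optional, so none are sent).
    challenge = post("/cloud/signup", {})
    # Step 2: hand the obfuscated problem to an LLM-backed solver.
    value = solve(challenge["challenge_text"])      # field name is an assumption
    # Step 3: submit the challenge ID and the formatted answer.
    result = post("/cloud/signup/verify", {
        "challenge_id": challenge["challenge_id"],  # field name is an assumption
        "answer": format_answer(value),             # field name is an assumption
    })
    return result["api_key"]                        # field name is an assumption
```

Passing the transport in as a function keeps the flow testable without network access: a fake `post` can return canned replies while the real one is built with `make_http_transport`.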
The security model is clever. The obfuscated math problems are too messy for regex or simple parsing but trivial for any LLM. This gates registration to actual AI agents while blocking automated scripts and bots. An intelligence test as an API gate.
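To see why deterministic parsing struggles, here is a naive regex solver run against two inputs. Both strings are invented for illustration and are not real challenges from the API. The second mimics the described style, with a Chinese numeral, an Esperanto number word, and stray punctuation.

```python
import operator
import re
from typing import Optional

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

# Matches "<number> <operator> <number>" written with ASCII digits and symbols.
PATTERN = re.compile(r"(-?\d+(?:\.\d+)?)\s*([+\-*])\s*(-?\d+(?:\.\d+)?)")


def naive_parse(problem: str) -> Optional[str]:
    """Solve 'A op B' with a regex; return None when the pattern is absent."""
    match = PATTERN.search(problem)
    if match is None:
        return None
    left, op, right = match.groups()
    return f"{OPS[op](float(left), float(right)):.2f}"


clean = "What is 12 + 30?"
garbled = "Wh,at i;s 十二 pl.us tri/dek?"  # invented example: 12 "plus" 30

print(naive_parse(clean))    # 42.00
print(naive_parse(garbled))  # None
```

A determined developer could keep adding rules for each language and noise pattern, but every new encoding on the server side breaks the parser again, while an LLM reads the garbled version without special handling.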
Pricing
| Plan | Price | Sessions | Key Features |
|---|---|---|---|
| Free | $0/mo | 3 concurrent | Stealth mode, community support |
| Starter | $40/mo | 50 concurrent | Premium proxies, priority support |
| Enterprise | Custom | Unlimited | Zero data retention, dedicated SLA |
Pay-per-use costs: $0.002 per LLM step, $0.01 per task initialization, $0.06/hour per browser session (billed per minute), $2.00 per skill creation, $0.02 per skill API call. Proxy data at $5/GB.
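A quick back-of-the-envelope calculator makes these rates concrete. The rates come from the list above. The function names are made up for this sketch, and it assumes a skill call costs only its flat fee and ignores occasional auto-heal charges.

```python
import math
from typing import Optional

LLM_STEP = 0.002      # dollars per LLM step
TASK_INIT = 0.01      # dollars per task initialization
BROWSER_HOUR = 0.06   # dollars per browser-session hour, billed per minute
SKILL_CREATE = 2.00   # dollars per skill creation
SKILL_CALL = 0.02     # dollars per skill API call


def agent_task_cost(steps: int, minutes: float) -> float:
    """Cost of one LLM-driven task; partial minutes are rounded up."""
    billed_minutes = math.ceil(minutes)
    return TASK_INIT + steps * LLM_STEP + billed_minutes / 60 * BROWSER_HOUR


def skill_total_cost(calls: int) -> float:
    """One-time creation fee plus the flat fee for each call."""
    return SKILL_CREATE + calls * SKILL_CALL


def break_even_calls(per_task: float) -> Optional[int]:
    """Smallest number of runs at which a skill is cheaper than repeating the task."""
    if per_task <= SKILL_CALL:
        return None  # the skill never pays for itself
    return math.floor(SKILL_CREATE / (per_task - SKILL_CALL)) + 1


per_task = agent_task_cost(steps=20, minutes=5)
print(f"{per_task:.3f}")           # 0.055
print(break_even_calls(per_task))  # 58
```

Under these assumptions, a 20-step, 5-minute task costs about 5.5 cents per run, so turning it into a skill pays off after roughly 58 runs.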
The free tier is genuinely usable. Three concurrent agents with stealth mode and CAPTCHA solving included. No credit card required.
How It Compares to Traditional Tools
Browser Use is not a replacement for Playwright or Selenium. It is a different category entirely.
| | Browser Use | Playwright | Selenium |
|---|---|---|---|
| Approach | Natural language | Code selectors | Code selectors |
| Handles layout changes | Yes (auto) | Scripts break | Scripts break |
| Anti-detection | Built-in | Manual config | Manual config |
| CAPTCHA solving | Included | No | No |
| Cost per task | $0.01+ | Free | Free |
| Setup complexity | 3 lines of code | Medium | High |
| Best for | AI agent tasks | Testing, scraping | Testing, scraping |
If you are writing deterministic test scripts, Playwright is still the right tool. If you are building an AI agent that needs to interact with arbitrary websites in unpredictable ways, Browser Use is purpose-built for that.
The Team
Browser Use was built by Magnus Muller and Gregor Zunic, who met at a hacker house at ETH Zurich. They went through Y Combinator W25 and raised $17M in seed funding from Felicis and others. The team is currently 7 to 16 people based in San Francisco.
The open source library is MIT licensed and the most-starred browser agent project on GitHub.
Why This Matters
Most of the internet is not accessible to AI agents. There are no APIs for browsing Amazon, checking flight prices, filling out government forms, or navigating internal tools. Browser Use makes all of that accessible through natural language.
The self-registration feature is particularly significant. Until now, every tool that required an account was gated by human setup. Browser Use is one of the first infrastructure products designed to be consumed by agents directly. The agent discovers the service, solves a challenge to prove it is an LLM, gets an API key, and starts working. The entire flow is autonomous.
That is what agent-native infrastructure looks like. Not "AI-powered" marketing on top of a human product. An actual product designed for AI agents as first-class users. We track this shift in our AI agent tools cheat sheet, which covers the full stack of tools autonomous agents need today.