How Does Web Automation Work?
First, let’s understand what web automation is. It is the automated process of browsing the web, using web applications, and accessing online content.
To automate interactions with websites, we need:
- code that automates the entire process,
- a “web driver” that allows programmatic control over browser capabilities.
This is shown in the diagram below:
What is WebDriver?
WebDriver is a bridge between your automation code and the browser. It allows your script to control a browser just as a human would, but programmatically.
WebDriver's role covers:
- Browser control
  - WebDriver launches and controls a real browser instance (e.g., Chrome, Firefox).
  - It can open URLs, click buttons, fill forms, scroll, take screenshots, and so on: anything you can do manually.
- Interpreting automation commands
  - Your code sends instructions, which are translated into commands for the browser.
- Ensuring cross-browser compatibility
  - Different browsers have different drivers (e.g., chromedriver, geckodriver), and WebDriver abstracts these differences, so your automation code can run on multiple browsers with minimal changes.
- Waits and synchronization
  - It can wait for elements to appear, ensuring actions are performed at the right time (important for JavaScript-heavy pages).
- Fetching data from web pages
  - WebDriver can retrieve page content and extract text, attributes, and HTML structure, which is useful for web scraping or testing.
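To make the "interprets automation commands" role more concrete, here is a minimal sketch of how a few high-level actions map onto HTTP requests defined by the W3C WebDriver protocol. It only builds request descriptions and sends nothing. The `DRIVER_URL` value (chromedriver's usual local port), the `HttpRequest` class, and the session and element ids are assumptions made for illustration; a real client library such as Selenium does this work for you.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Assumption: a driver such as chromedriver listening on its usual local port.
DRIVER_URL = "http://localhost:9515"


@dataclass
class HttpRequest:
    """Plain description of one HTTP call to the driver (nothing is sent)."""
    method: str
    url: str
    body: Optional[str] = None


def navigate(session_id: str, url: str) -> HttpRequest:
    # W3C WebDriver "Navigate To": POST /session/{session id}/url
    return HttpRequest(
        "POST",
        f"{DRIVER_URL}/session/{session_id}/url",
        json.dumps({"url": url}),
    )


def find_element(session_id: str, css: str) -> HttpRequest:
    # W3C WebDriver "Find Element": POST /session/{session id}/element
    return HttpRequest(
        "POST",
        f"{DRIVER_URL}/session/{session_id}/element",
        json.dumps({"using": "css selector", "value": css}),
    )


def click(session_id: str, element_id: str) -> HttpRequest:
    # W3C WebDriver "Element Click": POST .../element/{element id}/click
    return HttpRequest(
        "POST",
        f"{DRIVER_URL}/session/{session_id}/element/{element_id}/click",
        "{}",
    )


# Hypothetical ids; real ones are returned by the driver at runtime.
for request in (
    navigate("session-1", "https://example.com"),
    find_element("session-1", "button#submit"),
    click("session-1", "element-1"),
):
    print(request.method, request.url, request.body)
```

Each call in your script becomes one such request to the driver, which then drives the browser on your behalf.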
“No Driver” vs “Web Driver”
Traditional web automation uses WebDriver; Selenium with chromedriver is the typical setup.
There is now an alternative approach that is faster, easier to customize, and requires fewer resources.
“No Driver” (or “No WebDriver”) automation skips WebDriver entirely and uses a low-level protocol such as the Chrome DevTools Protocol (CDP) to control the browser directly.
What is CDP? The Chrome DevTools Protocol (CDP) is a set of low-level browser APIs that let you control and inspect Chromium-based browsers such as Chrome and Edge.
- It’s what Chrome DevTools (F12 tools) use internally.
- CDP exposes more powerful, faster, and lower-level control than WebDriver.
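As a rough illustration of what "low-level" means here: CDP messages are JSON objects carrying an `id`, a `method`, and `params`, and the browser replies with the same `id` or pushes events that have no `id`. The sketch below builds such messages and pairs replies with commands. The transport is left out on purpose, the `CdpSession` class is a hypothetical helper, and the reply strings are hand-written samples rather than captured browser output.

```python
import itertools
import json


class CdpSession:
    """Builds CDP command messages and matches replies to them by id.

    In a real client, each string returned by `command` would be written
    to the browser's WebSocket and each incoming frame passed to `handle`.
    """

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}  # message id -> method name

    def command(self, method, **params):
        msg_id = next(self._ids)
        self._pending[msg_id] = method
        return json.dumps({"id": msg_id, "method": method, "params": params})

    def handle(self, raw):
        msg = json.loads(raw)
        if "id" in msg:  # reply to a command we sent earlier
            method = self._pending.pop(msg["id"])
            if "error" in msg:
                return f"{method} failed: {msg['error']['message']}"
            return f"{method} ok: {msg['result']}"
        # No id: an event pushed by the browser.
        return f"event {msg['method']}"


session = CdpSession()
print(session.command("Page.navigate", url="https://example.com"))
print(session.command("Runtime.evaluate", expression="document.title"))

# Hand-written sample frames, for illustration only.
print(session.handle('{"id": 1, "result": {"frameId": "F1"}}'))
print(session.handle('{"method": "Page.loadEventFired", "params": {}}'))
```

`Page.navigate` and `Runtime.evaluate` are real CDP methods; libraries such as Puppeteer or chromedp wrap this message exchange behind a friendlier API.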
Why Use CDP Instead of WebDriver?
Feature | WebDriver | CDP |
---|---|---|
Speed | Moderate | Faster (direct connection) |
Stealth (anti-bot detection) | Easily detected | Harder to detect |
Control over network, cache | Limited | Full control (block, mock, throttle) |
Access to browser internals | Limited | Full access (DOM, JS, console, etc.) |
Requires separate driver | Yes | No |
Speed: WebDriver operates through an intermediary driver (like chromedriver), which introduces some network and protocol overhead. In contrast, CDP connects directly to the browser’s debugging interface, making it faster and more responsive.
Stealth (Bot Detection): WebDriver-based automation is often easy for websites to detect, because it leaves telltale signatures (for example, the `navigator.webdriver` property is set to true) and uses standard automation interfaces. CDP-based automation avoids the common WebDriver fingerprints, which makes it harder to detect and better suited to sites with bot detection, although it does not make the automation invisible.
Network and Cache Control: WebDriver offers limited control over browser internals like network traffic. CDP provides full access to intercept, block, modify, or mock network requests and responses, giving fine-grained control ideal for scraping, testing, or simulating network conditions.
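One way to picture this kind of network control is through CDP's Fetch domain: after `Fetch.enable`, the browser pauses matching requests and emits `Fetch.requestPaused`, and the client answers each one with `Fetch.failRequest`, `Fetch.fulfillRequest`, or `Fetch.continueRequest`. The sketch below only decides which reply to send; it does not talk to a browser. The blocked hosts, the mocked URL, and the function names are hypothetical and chosen for illustration.

```python
import base64
import json
from urllib.parse import urlparse

# Hypothetical policy: hosts to block and one URL to answer with a mock.
BLOCKED_HOSTS = {"ads.example.com", "tracker.example.com"}
MOCKED = {"https://api.example.com/status": b'{"status": "ok"}'}


def enable_interception():
    # Fetch.enable asks the browser to pause requests matching the patterns.
    return {"method": "Fetch.enable",
            "params": {"patterns": [{"urlPattern": "*"}]}}


def decide(event_params):
    """Map the params of a Fetch.requestPaused event to the reply command."""
    request_id = event_params["requestId"]
    url = event_params["request"]["url"]

    if urlparse(url).hostname in BLOCKED_HOSTS:
        # Block: the page sees a failed request.
        return {"method": "Fetch.failRequest",
                "params": {"requestId": request_id,
                           "errorReason": "BlockedByClient"}}

    if url in MOCKED:
        # Mock: Fetch.fulfillRequest expects the body base64-encoded.
        body = base64.b64encode(MOCKED[url]).decode("ascii")
        return {"method": "Fetch.fulfillRequest",
                "params": {"requestId": request_id,
                           "responseCode": 200,
                           "body": body}}

    # Everything else goes to the network unchanged.
    return {"method": "Fetch.continueRequest",
            "params": {"requestId": request_id}}


# Hand-written sample event params, for illustration only.
sample = {"requestId": "r1",
          "request": {"url": "https://ads.example.com/banner.js"}}
print(json.dumps(decide(sample)))
```

In a real client each returned command would also get an `id` and be sent over the WebSocket connection to the browser.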
Browser Internals Access: WebDriver focuses on user-level interactions (clicks, typing, etc.) and does not expose much of the browser’s internal mechanisms. CDP allows direct access to the DOM, JavaScript execution, console logs, and more, offering much deeper automation and debugging capabilities.
Setup Requirements: WebDriver requires a separate driver binary (like chromedriver or geckodriver) to act as a bridge between your script and the browser. CDP requires no external driver; it talks to the browser directly through a WebSocket endpoint.
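As a sketch of how a client finds that WebSocket endpoint: a Chromium-based browser started with a remote debugging port serves a small JSON document at `/json/version` that includes a `webSocketDebuggerUrl` field. The port 9222, the function names, and the sample payload below are assumptions for illustration; `discover` is defined but not called, since it needs a running browser.

```python
import json
import urllib.request


def websocket_url_from(version_json):
    """Extract the browser-level WebSocket URL from a /json/version payload."""
    return json.loads(version_json)["webSocketDebuggerUrl"]


def discover(host="127.0.0.1", port=9222):
    """Ask a running browser for its endpoint.

    Assumes the browser was started with --remote-debugging-port=<port>.
    """
    url = f"http://{host}:{port}/json/version"
    with urllib.request.urlopen(url, timeout=5) as response:
        return websocket_url_from(response.read().decode("utf-8"))


# Hand-written sample payload; real values differ per browser and launch.
SAMPLE = json.dumps({
    "Browser": "Chrome/120.0.0.0",
    "Protocol-Version": "1.3",
    "webSocketDebuggerUrl":
        "ws://127.0.0.1:9222/devtools/browser/"
        "00000000-0000-0000-0000-000000000000",
})

print(websocket_url_from(SAMPLE))
```

A CDP client connects to the returned `ws://` URL and exchanges JSON messages over it, with no driver binary in between.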
Support for other browsers
As the name suggests, the Chrome DevTools Protocol works only with Chromium-based browsers. That means it works with:
- Chrome
- Brave (built on Chromium)
- Microsoft Edge (new versions are Chromium-based)
So what if you need to use Firefox or Safari?
For Firefox there is an equivalent: the Remote Debugging Protocol (RDP).
For Safari there is no tool with similar capabilities. There is only the WebKit Remote Debug Protocol, which is limited compared to the others.
Tools using low-level protocols
Several tools use low-level protocols for browser automation:
- Puppeteer - Headless-Chrome Node.js API from the Chrome DevTools team, offering high-level, easy automation of Chromium browsers.
- Playwright - Cross-browser (Chromium, Firefox, WebKit) automation library with CDP-style low-level control and a built-in test runner.
- playwright-python - Official Python bindings for Playwright, bringing the same cross-browser automation to Python (CDP is used for Chromium).
- nodriver-python - Async automation library focused on web scraping, using CDP.
- chromedp - Go library for driving Chrome via the DevTools Protocol; it lets Go programs drive headless or full-browser instances.
- rod - Another Go toolkit for CDP-based browser automation, with a concise, chainable API and built-in element waiting.