seleniumboot-mcp

June 27, 2026

A Python Model Context Protocol (MCP) server for Selenium WebDriver automation. It lets Claude or GitHub Copilot control a real browser: navigate pages, interact with elements, run assertions, and generate ready-to-run Java TestNG, JUnit 5, Cucumber, or pytest test code from recorded sessions. 84 tools. No ChromeDriver setup. The browser auto-starts on first use.


Demo

Watch the demo on YouTube


Installation

pip install seleniumboot-mcp

Requires Python 3.10+ and Chrome. No separate ChromeDriver needed — Selenium Manager handles it automatically.


Setup

VS Code — Marketplace Extension (easiest)

Install Selenium Boot MCP from the VS Code Marketplace.

The extension automatically:

  • Registers the MCP server with GitHub Copilot (no config file needed)
  • Creates a .mcp.json in your project so Claude Code detects it on next open
  • Prompts to pip install seleniumboot-mcp if the Python package is missing

When Claude Code asks "Allow MCP server seleniumboot?" — click Allow.

JetBrains IDEs — Marketplace Plugin

Install Selenium Boot MCP from Settings → Plugins → Marketplace (search "Selenium Boot MCP"), or from the plugin page.

The plugin registers the MCP server with the JetBrains AI Assistant and prompts to pip install seleniumboot-mcp if the Python package is missing. Use Tools → Selenium Boot MCP to install/upgrade, register, or check status.

VS Code — Manual

Add .vscode/mcp.json to your project root (for GitHub Copilot):

{
  "servers": {
    "selenium": {
      "type": "stdio",
      "command": "seleniumboot-mcp"
    }
  }
}

For Claude Code, add .mcp.json to your project root:

{
  "mcpServers": {
    "seleniumboot": {
      "command": "seleniumboot-mcp",
      "args": []
    }
  }
}

Open the project in VS Code → Claude Code will prompt to approve the server → done.
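
If the project already has a .mcp.json that registers other servers, the entry can be merged in rather than overwriting the file. The sketch below is one way to do that; the helper name register_server is hypothetical and is not part of seleniumboot-mcp. Only the file name, the mcpServers key, and the seleniumboot entry come from the configuration shown above.

```python
import json
from pathlib import Path


def register_server(project_root: str) -> dict:
    """Add the seleniumboot entry to <project_root>/.mcp.json, keeping other servers.

    Hypothetical helper for illustration; it is not shipped with seleniumboot-mcp.
    """
    config_path = Path(project_root) / ".mcp.json"

    # Start from the existing file if there is one, otherwise from an empty config.
    config = json.loads(config_path.read_text()) if config_path.exists() else {}

    # setdefault keeps any servers that are already registered.
    servers = config.setdefault("mcpServers", {})
    servers["seleniumboot"] = {"command": "seleniumboot-mcp", "args": []}

    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling register_server(".") from the project root would produce the same file as the manual step when no .mcp.json exists yet.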

Claude Desktop

Edit your Claude Desktop config:

  • Windows: %APPDATA%\Claude\claude_desktop_config.json
  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "selenium": {
      "command": "seleniumboot-mcp"
    }
  }
}

Restart Claude Desktop.


How to use

Once the server is running, talk to Claude naturally — no start_browser call needed, Chrome launches automatically on first use:

Go to https://myapp.com and fill the login form with admin/password, then click Login
Assert the dashboard heading is visible
Generate a Java TestNG test class for everything we just did
Generate a Gherkin feature file and step definitions for the login flow

Claude controls the real browser, records every action, and on request generates complete test code ready to paste into your Maven or Gradle project.


Tools (84 total)

Browser

| Tool | Description |
| --- | --- |
| start_browser | Optional; Chrome auto-starts on first use. Use this to pick Firefox, enable headless mode, or set the window size |
| navigate | Go to a URL |
| take_screenshot | Capture page as an inline image |
| get_page_title | Return page title |
| get_current_url | Return current URL |
| get_page_source | Return full HTML source |
| execute_script | Run JavaScript |
| go_back / go_forward | Browser history |
| refresh | Reload page |
| switch_to_window | Switch between tabs by index |
| open_new_tab | Open a new browser tab, optionally at a URL |
| close_current_tab | Close the active tab and switch to the previous one |
| list_windows | List all open tabs with index, title, and URL |
| close_browser | Quit the browser |
| scroll_to_top | Scroll page to the top |
| scroll_to_bottom | Scroll page to the bottom |
| scroll_by | Scroll page by x/y pixels |
| emulate_device | Emulate a mobile device (iPhone, iPad, Pixel, Galaxy) via CDP |
| get_console_logs | Get browser console errors/warnings/info (Chrome) |
| get_cookies / set_cookie | Read or write a cookie |
| delete_cookie / delete_all_cookies | Remove cookies |
| get_local_storage / set_local_storage | Read or write localStorage |
| get_session_storage / set_session_storage | Read or write sessionStorage |
| wait_for_network_idle | Wait until XHR/fetch traffic is quiet; essential for SPAs |
| inspect_page | Discover all inputs, buttons, selects, links with best-fit CSS selectors |
| get_network_logs | Captured XHR/fetch requests: method, URL, status, timing |
| mock_response | Stub fetch/XHR by URL pattern with a canned response |
| clear_mock_responses | Remove all active mock rules |
| compare_screenshot | Pixel diff against a saved baseline (visual regression) |
| check_accessibility | Built-in WCAG audit: alt text, labels, headings, keyboard access |
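
To show what "wait until traffic is quiet" can mean in practice, here is a minimal polling sketch. It is not the implementation behind wait_for_network_idle; the function wait_for_idle, its parameters, and the pending_requests callable (something that reports how many XHR/fetch requests are in flight) are all assumptions made for illustration.

```python
import time
from typing import Callable


def wait_for_idle(
    pending_requests: Callable[[], int],
    quiet_seconds: float = 0.5,
    timeout_seconds: float = 10.0,
    poll_seconds: float = 0.05,
) -> bool:
    """Return True once no requests have been in flight for quiet_seconds."""
    deadline = time.monotonic() + timeout_seconds
    quiet_since = None  # time at which the current quiet period started

    while time.monotonic() < deadline:
        if pending_requests() == 0:
            now = time.monotonic()
            if quiet_since is None:
                quiet_since = now
            elif now - quiet_since >= quiet_seconds:
                return True
        else:
            quiet_since = None  # traffic resumed, so restart the quiet window
        time.sleep(poll_seconds)
    return False  # timed out before the page went quiet
```

Requiring a continuous quiet window, rather than a single zero reading, avoids returning in the short gap between two chained requests, which is the usual failure mode on single-page apps.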

Elements

| Tool | Description |
| --- | --- |
| find_element | Find element, return tag/text/state |
| find_elements | Find all matching elements |
| click | Click with explicit wait |
| type_text | Clear + type into input |
| get_text | Get visible text |
| get_attribute | Get any attribute value |
| select_option | Select from <select> by text, value, or index |
| hover | Mouse hover |
| double_click | Double click |
| right_click | Context menu click |
| drag_and_drop | Drag source → target |
| is_displayed | Check visibility |
| is_enabled | Check enabled state |
| wait_for_element | Wait: visible / clickable / present / invisible |
| scroll_to_element | Scroll element into view |
| clear_field | Clear input field |
| send_keys | Send special keys (Tab, Enter, Escape, Ctrl+A, F5, …) |
| upload_file | Upload a file via <input type="file"> |
| accept_alert / dismiss_alert | Handle JS alert/confirm dialogs |
| get_alert_text | Read the message from an alert |
| type_in_alert | Type into a JS prompt and accept |
| switch_to_frame | Focus into an iframe by index, name, or selector |
| switch_to_default_content | Return to the main page from a frame |
| find_shadow_element | Find element inside a shadow DOM |
| get_table_data | Extract an HTML table as a formatted text grid |
| fill_form | Fill multiple fields at once; auto-detects input/select/checkbox/radio |
| get_healed_locators | View all self-healed selector mappings for the session |
| clear_healed_locators | Reset the self-healing cache |
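
As an illustration of what a "formatted text grid" might look like, the sketch below aligns rows of cell text into columns. The function format_grid and the pipe-separated layout are assumptions; the actual output format of get_table_data is not specified here.

```python
from typing import List


def format_grid(rows: List[List[str]]) -> str:
    """Render rows of cell text as an aligned, pipe-separated text grid."""
    if not rows:
        return ""
    width = max(len(row) for row in rows)
    # Pad short rows so every row has the same number of columns.
    padded = [row + [""] * (width - len(row)) for row in rows]
    # Each column is as wide as its longest cell.
    col_widths = [max(len(row[i]) for row in padded) for i in range(width)]
    return "\n".join(
        " | ".join(cell.ljust(col_widths[i]) for i, cell in enumerate(row)).rstrip()
        for row in padded
    )
```

A plain-text grid like this is easy for a model to read back and assert against, which is presumably why a text format is used instead of raw HTML.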

Assertions

| Tool | Description |
| --- | --- |
| assert_title | Page title equals/contains |
| assert_url | URL equals/contains |
| assert_text | Element text equals/contains |
| assert_element_visible | Element is visible |
| assert_element_not_visible | Element is hidden or absent |
| assert_attribute | Element attribute has expected value |
| assert_page_contains | Page body contains a string |
| assert_element_count | Count of matching elements equals expected |

Codegen

| Tool | Description |
| --- | --- |
| generate_java_testng | Java TestNG test class from session |
| generate_java_junit5 | Java JUnit 5 test class from session |
| generate_java_page_object | Java Page Object class + test class from session |
| generate_gherkin | Gherkin .feature file + Java step definitions from session |
| generate_python_test | pytest class from session |
| generate_csharp_nunit | C# NUnit + Selenium test class from session |
| generate_github_actions | GitHub Actions CI workflow YAML (Maven / Gradle / pytest) |
| generate_jenkins_pipeline | Declarative Jenkinsfile (Maven / Gradle / pytest) |
| generate_gitlab_ci | GitLab CI .gitlab-ci.yml pipeline (Maven / Gradle / pytest) |
| generate_playwright_hints | Equivalent Playwright TypeScript code from session |
| get_session_log | View recorded actions |
| clear_session_log | Reset the session recording |

Codegen Examples

Java TestNG

Go to https://myapp.com/login, enter admin/password, click Login
→ Generate a Java TestNG test

public class LoginTest {
    @BeforeMethod public void setUp() { driver = new ChromeDriver(); ... }
    @AfterMethod  public void tearDown() { driver.quit(); }

    @Test
    public void recordedFlowTest() {
        driver.get("https://myapp.com/login");
        WebElement f = wait.until(visibilityOfElementLocated(By.cssSelector("#username")));
        f.clear(); f.sendKeys("admin");
        wait.until(elementToBeClickable(By.cssSelector("button[type='submit']"))).click();
    }
}

Page Object Model

Generate a Java Page Object for the login page

// LoginPage.java
public class LoginPage {
    private final By usernameField = By.cssSelector("#username");
    private final By submitButton  = By.cssSelector("button[type='submit']");

    public LoginPage enterUsernameField(String text) { ... return this; }
    public LoginPage clickSubmitButton()              { ... return this; }
}

// LoginTest.java
public class LoginTest {
    @Test public void recordedFlowTest() {
        driver.get("https://myapp.com/login");
        page.enterUsernameField("admin").clickSubmitButton();
    }
}

Cucumber / Gherkin

Generate a Gherkin feature file and step definitions for the login flow

Feature: Login

  Scenario: User logs in with valid credentials
    Given I navigate to "https://myapp.com/login"
    And I enter "admin" in the username field
    And I enter "password" in the password field
    And I click the submit button

// LoginSteps.java
public class LoginSteps {
    @Given("I navigate to {string}")
    public void iNavigateTo(String url) { driver.get(url); }

    @And("I enter {string} in the username field")
    public void iEnterInUsernameField(String text) { ... el.sendKeys(text); }

    @And("I click the submit button")
    public void iClickTheSubmitButton() { wait.until(...).click(); }
}

Self-Healing Locators

When a selector fails to find an element, seleniumboot-mcp automatically tries alternative strategies before giving up:

| Primary selector | Alternatives tried |
| --- | --- |
| #my-id (CSS) | by=id "my-id", [id='my-id'] |
| .my-class (CSS) | by=class "my-class", [class*='my-class'] |
| input[type='email'] (CSS) | //input[@type='email'] (XPath) |
| //button[@id='ok'] (XPath) | button[id='ok'] (CSS), by=id "ok" |
| "A, B" comma list | tries A first, then B |

Successful fallbacks are cached so the healed selector is reused automatically. Use get_healed_locators to inspect the cache and update your test code, and clear_healed_locators to start fresh.
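
The fallback-and-cache behaviour described above can be sketched as follows. This is an illustration, not the package's code: the names alternatives and find_with_healing, the (strategy, value) tuples, and the find callable standing in for a real driver lookup are all assumptions. The fallback rules mirror the table only for its simple selector shapes, and the comma split is deliberately naive.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

Locator = Tuple[str, str]  # (strategy, value), e.g. ("css", "#my-id")


def alternatives(selector: str) -> List[Locator]:
    """Derive fallback locators for the simple selector shapes in the table."""
    if re.fullmatch(r"#[\w-]+", selector):  # #my-id
        name = selector[1:]
        return [("id", name), ("css", f"[id='{name}']")]
    if re.fullmatch(r"\.[\w-]+", selector):  # .my-class
        name = selector[1:]
        return [("class", name), ("css", f"[class*='{name}']")]
    m = re.fullmatch(r"([\w-]+)\[([\w-]+)='([^']*)'\]", selector)  # tag[attr='v']
    if m:
        tag, attr, value = m.groups()
        return [("xpath", f"//{tag}[@{attr}='{value}']")]
    m = re.fullmatch(r"//([\w-]+)\[@([\w-]+)='([^']*)'\]", selector)  # //tag[@attr='v']
    if m:
        tag, attr, value = m.groups()
        found = [("css", f"{tag}[{attr}='{value}']")]
        if attr == "id":
            found.append(("id", value))
        return found
    return []


def find_with_healing(
    selector: str,
    find: Callable[[str, str], Optional[object]],
    cache: Dict[str, Locator],
) -> Optional[object]:
    """Try a cached healed locator, then the primary selector, then fallbacks."""
    is_xpath = selector.startswith("//")
    primary: Locator = ("xpath" if is_xpath else "css", selector)

    candidates: List[Locator] = []
    if selector in cache:
        candidates.append(cache[selector])  # reuse a previously healed locator
    candidates.append(primary)
    candidates.extend(alternatives(selector))
    if not is_xpath:
        # "A, B": try each part on its own, in order (naive split on commas).
        parts = [p.strip() for p in selector.split(",") if p.strip()]
        if len(parts) > 1:
            candidates.extend(("css", part) for part in parts)

    for locator in candidates:
        element = find(*locator)
        if element is not None:
            if locator != primary:
                cache[selector] = locator  # remember what worked
            return element
    return None
```

In this sketch the cache maps the original selector to the locator that eventually matched, which is the kind of mapping get_healed_locators would need to expose so that test code can be updated.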



Roadmap

  • Java TestNG / JUnit 5 / Python pytest code generation
  • Screenshot returned as ImageContent (renders inline in Claude)
  • Full session recording — hover, double_click, right_click, scroll, select_option
  • Codegen for hover, drag-and-drop, select, scroll in Java and Python templates
  • Auto-start browser on first use (no explicit start_browser needed)
  • Page Object Model generation (generate_java_page_object)
  • Cucumber / Gherkin step generation (generate_gherkin)
  • Self-healing locators — automatic fallback when a selector breaks
  • Alert/dialog handling, iframe switching, shadow DOM, table extraction
  • Cookie, localStorage, sessionStorage management
  • Mobile device emulation via Chrome DevTools Protocol
  • Special key sending (Tab, Enter, F-keys, Ctrl+A, …)
  • File upload via <input type="file">
  • Browser console log capture
  • Multi-tab management (open, close, list)
  • Page scroll (top, bottom, by pixels)
  • C# NUnit + Selenium codegen
  • GitHub Actions CI workflow generator (Maven / Gradle / pytest)
  • Playwright TypeScript migration hints
  • CI/CD config for Jenkins / GitLab CI

License

MIT