Goal: Ship browser-grid as a working Playwright plugin
March 12, 2026
A Playwright plugin that tiles headful browser windows in a grid so you can watch parallel tests run.
No tool like this exists. Zalenium (archived) validated the concept for Selenium. This is the Playwright-native version.
Fitness Function
./scripts/check-criteria.sh # human-readable
./scripts/check-criteria.sh --json # machine-readable
check-criteria.sh runs through each criterion below and reports pass/fail. The score is the count of passing criteria out of 10.
Metric Definition
score = passing_criteria / 10
| # | Criterion | How to verify |
|---|---|---|
| 1 | Zero-config tiling: import { gridTest } from 'browser-grid' and parallel Playwright workers auto-tile based on TEST_PARALLEL_INDEX | Run npx playwright test --workers=4 --headed with the fixture and all 4 windows tile without overlap |
| 2 | Slot overlay: Each browser window shows a small, non-intrusive label (test name + slot number + status) in the corner | Visual inspection: overlay is visible, doesn't interfere with page content, and either auto-hides after 3s or stays visible, depending on configuration |
| 3 | Dynamic re-tiling: When a test finishes and a new one starts in the same worker, the window smoothly inherits the slot. When total workers change, grid recalculates. | Run 8 tests with 4 workers — as tests cycle, windows stay in their slots. No drift, no overlap. |
| 4 | CDP-powered positioning: Use Browser.setWindowBounds via CDP session for precise, runtime window control. Fall back to --window-position launch args. | Windows snap to exact grid positions. getSlot() coordinates match actual window bounds (verify via CDP getWindowBounds). |
| 5 | Screen auto-detection: Detect macOS logical resolution and dock position. No hardcoded screen size. | Works on a 1440p laptop and a 4K external monitor without config changes. |
| 6 | Clean public API: Exported functions: gridTest (Playwright fixture), getSlot(), getAllSlots(), createGrid(), presets. TypeScript, fully typed. | npm pack produces a working package. Types resolve. No Playwright peer dep version lock-in. |
| 7 | README with GIF: A README showing the grid in action (4+ browsers tiled, tests running). | README exists with install, usage, API docs, and a demo GIF/screenshot. |
| 8 | Tests pass: Unit tests for grid math. Integration test that launches 4 browsers and verifies positions via CDP. | npm test green. |
| 9 | Reserve zones: User can reserve screen regions (e.g., right 700px for terminal). Grid tiles in remaining space. | Configure reserve: { side: "right", size: 700 }, verify browsers don't overlap reserved zone. |
| 10 | npm publishable: package.json, LICENSE, .npmignore, builds cleanly, no local path deps. | npm publish --dry-run succeeds. |
Metric Mutability
- Locked — The 10 criteria are the spec. The agent ships them, it doesn't redefine them.
Operating Mode
- Converge — Stop when all 10 criteria pass.
Stopping Conditions
Stop and report when ANY of:
- All 10 criteria pass
- 3 consecutive criteria yield no progress (blocked on something — stop and report what)
- 20 iterations completed
Bootstrap
- mkdir browser-grid && cd browser-grid && npm init -y
- npm install -D typescript @playwright/test
- npx playwright install chromium
- Create the directory structure under src/ and test/
- Verify npx playwright test --workers=1 --headed launches a browser
Improvement Loop
repeat:
1. ./scripts/check-criteria.sh --json > /tmp/before.json
2. Read the results — find the lowest-numbered failing criterion
3. Pick the highest-impact action from the Action Catalog
4. Implement it
5. Verify it (run tests, visual check, etc.)
6. ./scripts/check-criteria.sh --json > /tmp/after.json
7. Compare: if a new criterion passes and no previously passing criteria broke, commit
8. If unchanged, adjust approach and retry once
9. If still stuck, move to the next criterion and note the blocker
Commit messages: [C:3/10→4/10] criterion 4: CDP-powered positioning via setWindowBounds
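Step 7 and the commit message format can be sketched as a small comparison helper. The JSON shape below (an array of id/pass entries) is an assumption, since the output format of check-criteria.sh --json is not specified here; the names CriterionResult and decide are hypothetical.

```typescript
// Assumed shape of one entry in the --json report; the real script may differ.
interface CriterionResult {
  id: number; // criterion number, 1..10
  pass: boolean;
}

interface Decision {
  commit: boolean;
  newlyPassing: number[];
  regressions: number[];
  message?: string;
}

const TOTAL_CRITERIA = 10;

function passingIds(results: CriterionResult[]): number[] {
  return results.filter((r) => r.pass).map((r) => r.id).sort((a, b) => a - b);
}

// Commit only if at least one criterion newly passes and none regressed.
function decide(before: CriterionResult[], after: CriterionResult[], summary: string): Decision {
  const was = passingIds(before);
  const now = passingIds(after);
  const newlyPassing = now.filter((id) => was.indexOf(id) === -1);
  const regressions = was.filter((id) => now.indexOf(id) === -1);
  const commit = newlyPassing.length > 0 && regressions.length === 0;
  const message = commit
    ? `[C:${was.length}/${TOTAL_CRITERIA}→${now.length}/${TOTAL_CRITERIA}] criterion ${newlyPassing[0]}: ${summary}`
    : undefined;
  return { commit, newlyPassing, regressions, message };
}
```

When the decision is not to commit, the regressions list gives the blocker to note before moving on.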
Action Catalog
Criterion 1 — Zero-config tiling
| Action | Impact | How |
|---|---|---|
| Implement gridTest fixture | Criterion 1 | Extend Playwright's test with a gridPage fixture. Read TEST_PARALLEL_INDEX, compute grid slot, launch with --window-position args. This is the foundation everything else builds on. |
| Implement grid math (getSlot, getAllSlots) | Criterion 1 | Pure functions: given screen dimensions, worker count, and slot index, return { left, top, width, height }. Start with a simple cols * rows grid that auto-picks layout from worker count. |
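A minimal sketch of the grid math described in the table above. The exact signatures, the gap parameter, and the near-square layout rule (cols = ceil(sqrt(n))) are assumptions for illustration, not the final API.

```typescript
interface Rect {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Simple layout choice: a near-square grid that is at least as wide as it is tall.
function pickLayout(workerCount: number): { cols: number; rows: number } {
  const n = Math.max(1, Math.floor(workerCount));
  const cols = Math.ceil(Math.sqrt(n));
  return { cols, rows: Math.ceil(n / cols) };
}

// Returns the rectangle for one slot; indices wrap so they always land in the grid.
function getSlot(screen: Rect, workerCount: number, index: number, gap = 0): Rect {
  const { cols, rows } = pickLayout(workerCount);
  const total = cols * rows;
  const cell = ((Math.floor(index) % total) + total) % total;
  const width = Math.floor((screen.width - gap * (cols - 1)) / cols);
  const height = Math.floor((screen.height - gap * (rows - 1)) / rows);
  return {
    left: screen.left + (cell % cols) * (width + gap),
    top: screen.top + Math.floor(cell / cols) * (height + gap),
    width,
    height,
  };
}

// One rectangle per worker, in slot order.
function getAllSlots(screen: Rect, workerCount: number, gap = 0): Rect[] {
  const n = Math.max(1, Math.floor(workerCount));
  return Array.from({ length: n }, (_, i) => getSlot(screen, workerCount, i, gap));
}
```

Flooring the cell size keeps every slot on whole pixels, at the cost of a few unused pixels at the right and bottom edges.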
Criterion 2 — Slot overlay
| Action | Impact | How |
|---|---|---|
| Inject overlay via page.addInitScript() | Criterion 2 | Small <div> in top-left corner: slot number, test file name, pass/fail status. pointer-events: none, semi-transparent, small font. Add overlayDuration config for auto-fade (default 3s, 0 = always show). |
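One way to build the overlay is to generate a script string and hand it to page.addInitScript(), which accepts a string of JavaScript source. The sketch below assumes that approach; the function name, the element id, and the exact styling are hypothetical.

```typescript
interface OverlayOptions {
  slot: number;
  testName: string;
  status: 'running' | 'passed' | 'failed';
  overlayDuration: number; // ms before fading out; 0 = always show
}

// Returns JavaScript source intended for page.addInitScript(script).
function buildOverlayScript(opts: OverlayOptions): string {
  // JSON.stringify yields a safely quoted JS string literal for the label.
  const label = JSON.stringify(`#${opts.slot} ${opts.testName} [${opts.status}]`);
  const duration = Math.max(0, Math.floor(opts.overlayDuration));
  return `(() => {
  const show = () => {
    const el = document.createElement('div');
    el.id = 'browser-grid-overlay';
    el.textContent = ${label};
    el.style.cssText = 'position: fixed; top: 4px; left: 4px; z-index: 2147483647; ' +
      'pointer-events: none; opacity: 0.75; font: 11px monospace; padding: 2px 6px; ' +
      'border-radius: 3px; background: #000; color: #fff; transition: opacity 0.3s;';
    document.documentElement.appendChild(el);
    if (${duration} > 0) setTimeout(() => { el.style.opacity = '0'; }, ${duration});
  };
  // Init scripts run before the page's own scripts, so wait for the DOM if needed.
  if (document.readyState === 'loading') {
    document.addEventListener('DOMContentLoaded', show, { once: true });
  } else {
    show();
  }
})();`;
}
```

A fixture would then call something like await page.addInitScript(buildOverlayScript({ ... })) before the first navigation. Updating the status after a test finishes would need a separate page.evaluate call, which this sketch leaves out.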
Criterion 3 — Dynamic re-tiling
| Action | Impact | How |
|---|---|---|
| Tie slot to worker, not test | Criterion 3 | Playwright Test reuses workers. The slot index stays the same (tied to TEST_PARALLEL_INDEX). When a new test starts in the same worker, update the overlay text but keep the window position. No jitter. |
Criterion 4 — CDP-powered positioning
| Action | Impact | How |
|---|---|---|
| Add CDP session for Browser.setWindowBounds | Criterion 4 | Launch args set initial position but can't re-tile. After launch, use page.context().newCDPSession(page) to call Browser.setWindowBounds for precise placement. Also use Browser.getWindowForTarget + getWindowBounds to verify actual position matches expected. Fall back to launch args if CDP session fails. |
Criterion 5 — Screen auto-detection
| Action | Impact | How |
|---|---|---|
| Detect screen geometry on macOS | Criterion 5 | Use system_profiler SPDisplaysDataType or osascript to get logical resolution (points, not retina pixels), menu bar height, dock position and size. No hardcoded screen size. Should work on a 1440p laptop and a 4K external without config changes. |
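A rough sketch of the osascript route, under the assumption that asking Finder for the bounds of the desktop window prints four comma-separated numbers (left, top, right, bottom) in logical points, for example "0, 0, 1440, 900". With several displays attached this likely reports the combined desktop, and it does not cover menu bar height or dock geometry, which would need additional queries. The command runner is injected so the parsing stays testable; in Node it could wrap execFileSync from node:child_process.

```typescript
interface Rect {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Hypothetical runner type: executes a program and returns its stdout as text.
type RunCommand = (file: string, args: string[]) => string;

// Parses "left, top, right, bottom" into a rectangle.
function parseDesktopBounds(output: string): Rect {
  const parts = output.trim().split(',').map((s) => s.trim());
  if (parts.length !== 4 || parts.some((p) => !/^-?\d+$/.test(p))) {
    throw new Error(`Unexpected desktop bounds output: ${JSON.stringify(output)}`);
  }
  const [left, top, right, bottom] = parts.map(Number);
  return { left, top, width: right - left, height: bottom - top };
}

// Asks Finder for the desktop bounds (macOS only; assumes Finder is scriptable).
function detectDesktop(run: RunCommand): Rect {
  const out = run('osascript', [
    '-e',
    'tell application "Finder" to get bounds of window of desktop',
  ]);
  return parseDesktopBounds(out);
}
```

Keeping the parser separate from the command call also leaves room for a system_profiler-based backend that returns the same Rect.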
Criteria 6-10 — API, docs, tests, reserve zones, publishing
| Action | Impact | How |
|---|---|---|
| Clean public API exports | Criterion 6 | Export gridTest, getSlot(), getAllSlots(), createGrid(), presets from src/index.ts. Full TypeScript types. npm pack must produce a working package with no Playwright peer dep version lock-in. |
| Write README with GIF | Criterion 7 | Install, usage, API docs, and a demo GIF/screenshot showing 4+ browsers tiled with tests running. |
| Write unit + integration tests | Criterion 8 | Unit tests for grid math in test/grid.test.ts. Integration test in test/integration.test.ts that launches 4 browsers and verifies positions via CDP. npm test must be green. |
| Implement reserve zones | Criterion 9 | Config option: reserve: { side: "right", size: 700 }. Grid math subtracts reserved region before computing slots. Browsers must not overlap reserved zone. |
| Prepare for npm publish | Criterion 10 | package.json with correct fields, LICENSE (MIT), .npmignore, no local path deps. npm publish --dry-run must succeed. |
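For the reserve zones row above, the subtraction step can be sketched as a pure function applied before the grid math. Only side: "right" appears in this document; the other sides and the clamping behavior are assumptions.

```typescript
interface Rect {
  left: number;
  top: number;
  width: number;
  height: number;
}

interface Reserve {
  side: 'left' | 'right' | 'top' | 'bottom';
  size: number; // pixels to keep free on that side
}

// Shrinks the usable area by the reserved strip; the grid then tiles what is left.
function applyReserve(screen: Rect, reserve?: Reserve): Rect {
  if (!reserve) return { ...screen };
  const horizontal = reserve.side === 'left' || reserve.side === 'right';
  const limit = horizontal ? screen.width : screen.height;
  // Clamp so an oversized reserve cannot produce a negative width or height.
  const size = Math.min(Math.max(0, reserve.size), limit);
  switch (reserve.side) {
    case 'left':
      return { ...screen, left: screen.left + size, width: screen.width - size };
    case 'right':
      return { ...screen, width: screen.width - size };
    case 'top':
      return { ...screen, top: screen.top + size, height: screen.height - size };
    case 'bottom':
      return { ...screen, height: screen.height - size };
  }
}
```

With reserve: { side: "right", size: 700 } on a 1920-wide screen, the grid would tile the left 1220 pixels only.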
Architecture
browser-grid/
├── src/
│ ├── index.ts # Public API exports
│ ├── grid.ts # Grid math (getSlot, getAllSlots, presets)
│ ├── cdp.ts # CDP window positioning (setWindowBounds, getWindowBounds)
│ ├── screen.ts # macOS screen detection (resolution, dock, menu bar)
│ ├── overlay.ts # Inject slot label overlay into pages
│ └── fixture.ts # Playwright Test fixture (gridTest)
├── test/
│ ├── grid.test.ts # Unit tests for grid math
│ └── integration.test.ts # Launch browsers, verify positions
├── demo.ts # Visual demo script
├── GOAL.md # This file
├── README.md # Usage docs
├── package.json
└── tsconfig.json
Key Design Decisions
Playwright Test Fixture (gridTest)
The primary API. Extends Playwright's test with a gridPage fixture that auto-positions based on TEST_PARALLEL_INDEX.
import { gridTest as test } from 'browser-grid';
test('my test', async ({ gridPage }) => {
  await gridPage.goto('https://myapp.com');
  // browser is already tiled in the grid
});
Under the hood:
- Reads TEST_PARALLEL_INDEX (set by Playwright Test for each worker)
- Computes grid slot from index
- Launches with --window-position args
- After launch, uses CDP Browser.setWindowBounds for precise placement
- Injects overlay showing test name + slot
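The first three steps can be sketched without touching Playwright itself. The helper names are hypothetical, and passing --window-size alongside --window-position is an added assumption (only --window-position is named above).

```typescript
interface Rect {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Reads the worker's parallel index; falls back to slot 0 when unset or malformed.
function parallelIndexFromEnv(env: Record<string, string | undefined>): number {
  const raw = env.TEST_PARALLEL_INDEX;
  if (raw === undefined || !/^\d+$/.test(raw)) return 0;
  return Number(raw);
}

// Chromium launch flags that place the window at the slot's initial position.
function launchArgsForSlot(slot: Rect): string[] {
  return [
    `--window-position=${slot.left},${slot.top}`,
    `--window-size=${slot.width},${slot.height}`,
  ];
}
```

Inside the fixture, the result would presumably be passed as args to chromium.launch({ headless: false, args }), with process.env supplied as the env argument.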
CDP for Positioning (not just launch args)
Launch args set initial position but can't re-tile. CDP Browser.setWindowBounds allows:
- Precise positioning after launch
- Re-tiling when grid config changes
- Verifying actual position matches expected
const session = await page.context().newCDPSession(page);
const { windowId } = await session.send('Browser.getWindowForTarget');
await session.send('Browser.setWindowBounds', {
  windowId,
  bounds: { left: x, top: y, width: w, height: h, windowState: 'normal' }
});
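The verification step (criterion 4) could follow the same session: set the bounds, read them back with Browser.getWindowBounds, and compare. In this sketch CdpSessionLike is a hypothetical structural stand-in for Playwright's CDPSession (a cast may be needed when passing the real one), and the tolerance parameter is an assumption for window managers that adjust sizes slightly.

```typescript
interface Rect {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Minimal stand-in for a CDP session: anything with a compatible send().
interface CdpSessionLike {
  send(method: string, params?: Record<string, unknown>): Promise<any>;
}

// Moves the window to `expected`, then checks what the browser actually reports.
async function placeAndVerify(
  session: CdpSessionLike,
  expected: Rect,
  tolerance = 0,
): Promise<boolean> {
  const { windowId } = await session.send('Browser.getWindowForTarget');
  await session.send('Browser.setWindowBounds', {
    windowId,
    bounds: { ...expected, windowState: 'normal' },
  });
  const { bounds } = await session.send('Browser.getWindowBounds', { windowId });
  const keys: Array<keyof Rect> = ['left', 'top', 'width', 'height'];
  return keys.every((k) => Math.abs(bounds[k] - expected[k]) <= tolerance);
}
```

If any send() call rejects, the caller can catch the error and rely on the launch-arg position instead, which matches the fallback described for criterion 4.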
Presets
export const presets = {
  duo: { cols: 2, rows: 1 }, // 2 side-by-side
  quad: { cols: 2, rows: 2 }, // 2×2
  six: { cols: 3, rows: 2 }, // 3×2
  eight: { cols: 4, rows: 2 }, // 4×2
  nine: { cols: 3, rows: 3 }, // 3×3
  auto: 'auto', // pick based on worker count
};
auto mode: detect worker count from TEST_PARALLEL_INDEX range and pick the tightest grid.
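"Tightest grid" is not defined further here, so the following is one possible heuristic, not the specified algorithm: start from a near-square grid, then allow wider grids (up to twice as wide as tall) when they leave fewer empty cells. It takes the worker count as an input and happens to reproduce the duo, quad, six, eight, and nine presets for 2, 4, 6, 8, and 9 workers.

```typescript
interface Layout {
  cols: number;
  rows: number;
}

// Picks a grid for n workers: fewest empty cells, preferring the squarer grid on ties.
function autoLayout(workerCount: number): Layout {
  const n = Math.max(1, Math.floor(workerCount));
  const start = Math.ceil(Math.sqrt(n));
  let best: Layout = { cols: start, rows: Math.ceil(n / start) };
  let bestEmpty = best.cols * best.rows - n;
  for (let cols = start + 1; cols <= n; cols++) {
    const rows = Math.ceil(n / cols);
    // Stop once the grid gets more than twice as wide as tall; rows never grows again.
    if (cols > 2 * rows) break;
    const empty = cols * rows - n;
    if (empty < bestEmpty) {
      best = { cols, rows };
      bestEmpty = empty;
    }
  }
  return best;
}
```

For worker counts without an exact fit, such as 5 or 7, the result leaves one slot empty (3×2 and 4×2 respectively).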
Configuration
// playwright.config.ts
import { defineConfig } from '@playwright/test';
import { gridConfig } from 'browser-grid';
export default defineConfig({
  use: {
    ...gridConfig({
      preset: 'auto', // or { cols: 4, rows: 2 }
      gap: 4, // pixels between windows
      reserve: { side: 'right', size: 700 }, // keep terminal visible
      overlay: true, // show slot labels
      overlayDuration: 3000, // ms before auto-hide (0 = always show)
    }),
  },
});
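How gridConfig resolves its options is not spelled out here. A minimal sketch of the merging and validation it might perform follows; the GridOptions type, the resolveGridOptions name, and all default values other than the 3s overlay duration are assumptions taken from the example above.

```typescript
interface Layout {
  cols: number;
  rows: number;
}

interface Reserve {
  side: 'left' | 'right' | 'top' | 'bottom';
  size: number;
}

interface GridOptions {
  preset: 'auto' | Layout;
  gap: number; // pixels between windows
  reserve?: Reserve;
  overlay: boolean;
  overlayDuration: number; // ms before auto-hide; 0 = always show
}

// Assumed defaults, mirroring the values shown in the configuration example.
const defaultGridOptions: GridOptions = {
  preset: 'auto',
  gap: 4,
  overlay: true,
  overlayDuration: 3000,
};

// Merges user options over the defaults and rejects values the grid math cannot use.
function resolveGridOptions(user: Partial<GridOptions> = {}): GridOptions {
  const merged: GridOptions = { ...defaultGridOptions, ...user };
  if (!Number.isFinite(merged.gap) || merged.gap < 0) {
    throw new RangeError('gap must be a non-negative number');
  }
  if (!Number.isFinite(merged.overlayDuration) || merged.overlayDuration < 0) {
    throw new RangeError('overlayDuration must be a non-negative number');
  }
  if (merged.preset !== 'auto' && (merged.preset.cols < 1 || merged.preset.rows < 1)) {
    throw new RangeError('preset needs at least one column and one row');
  }
  return merged;
}
```

Validating once at config time keeps the grid math free of defensive checks.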
Constraints
- Zero runtime dependencies: Peer dep on @playwright/test >= 1.40 only. No lodash, no sharp, nothing. The grid math is simple enough to write by hand.
- No Playwright version lock-in: Must work with any @playwright/test >= 1.40. Don't use private APIs or unstable CDP domains.
- No hardcoded screen sizes: Detect everything at runtime. A user shouldn't have to edit config when switching monitors.
- Don't interfere with tests: The overlay must use pointer-events: none. Grid positioning must not affect page layout or viewport size. A test that passes without browser-grid must also pass with it.
- macOS first, but don't burn bridges: Screen detection can be macOS-only for now, but keep the interface abstract enough that Linux/Windows backends can be added later.
File Map
| File | Role | Editable? |
|---|---|---|
| src/index.ts | Public API exports | Yes |
| src/grid.ts | Grid math | Yes |
| src/cdp.ts | CDP window positioning | Yes |
| src/screen.ts | Screen detection | Yes |
| src/overlay.ts | Slot label overlay | Yes |
| src/fixture.ts | Playwright Test fixture | Yes |
| test/grid.test.ts | Unit tests for grid math | Yes |
| test/integration.test.ts | Integration tests | Yes |
| scripts/check-criteria.sh | Fitness function | No |
| package.json | Package config | Yes |
When to Stop
Starting score: 0/10 criteria passing
Ending score: NN/10 criteria passing
Iterations: N
Criteria met: (list of passing criteria)
Remaining: (list of failing criteria with blockers)
Next actions: (what to do next)