[PR #6747] [PM-34487] llm: Add Android device interaction MCP server with ADB tooling #12983

Open
opened 2026-04-11 03:55:21 -05:00 by GiteaMirror · 0 comments

Original Pull Request: https://github.com/bitwarden/android/pull/6747

State: open
Merged: No


🎟️ Tracking

https://bitwarden.atlassian.net/browse/PM-34487

📔 Objective

Add a TypeScript MCP server and companion skill for structured Android device interaction via ADB. This replaces the initial shell-script approach with proper XML parsing, structured `dumpsys window` parsing, and native geometry for obstruction detection.

🏗️ Architecture

Claude Code ──stdio──> MCP Server (TypeScript)
                         ├── tools/        (6 tool handlers)
                         ├── adb/          (execFile wrapper, device discovery)
                         ├── parsers/      (UIAutomator XML, dumpsys window)
                         └── geometry/     (bounds, obstruction, visible-region)
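
The `adb/` module in the diagram is described as covering device discovery. As a rough illustration only, the sketch below parses the text printed by `adb devices` into a typed list. The names `AdbDevice`, `parseAdbDevices`, and `pickReadyDevice` are hypothetical and are not taken from the PR.

```typescript
// Hypothetical shape for one discovered device; not taken from the PR.
interface AdbDevice {
  serial: string;
  state: string; // e.g. "device", "offline", "unauthorized"
}

// Parse the text printed by `adb devices` into a list of devices.
function parseAdbDevices(output: string): AdbDevice[] {
  const devices: AdbDevice[] = [];
  for (const rawLine of output.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines, the header line, and adb daemon notices ("* daemon ...").
    if (line === "" || line.startsWith("List of devices") || line.startsWith("*")) {
      continue;
    }
    const [serial, state] = line.split(/\s+/);
    if (serial && state) {
      devices.push({ serial, state });
    }
  }
  return devices;
}

// Return the single ready device, or fail so the caller can ask for a serial.
function pickReadyDevice(devices: AdbDevice[]): AdbDevice {
  const ready = devices.filter((d) => d.state === "device");
  if (ready.length !== 1) {
    throw new Error(`Expected exactly one ready device, found ${ready.length}`);
  }
  return ready[0];
}
```

Keeping the parser as a pure function over a string is what makes fixture-based tests possible without a connected device.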

🔧 MCP Tools

| Tool | Description |
|------|-------------|
| `capture` | Dump UI hierarchy XML and/or screenshot |
| `find_element` | Find element by text/content-desc with two-layer obstruction detection |
| `tap_element` | Find + tap + capture (auto-adjusts for obstructions) |
| `tap_at` | Tap specific coordinates + capture |
| `navigate` | Home, back, app-drawer navigation + capture |
| `input_text` | Type text into focused field with shell-safe escaping |
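
The PR does not show how `input_text` escapes its argument, so the following is only one plausible approach, sketched under two assumptions: that the text ends up in `adb shell input text <arg>`, where the device shell interprets the argument, and that `input text` reads `%s` as a space (as commonly documented). The function name `escapeForInputText` is hypothetical.

```typescript
// Hypothetical helper: prepare user text for `adb shell input text <arg>`.
// ASSUMPTION: `input text` converts "%s" into a space on the device.
function escapeForInputText(text: string): string {
  // Encode spaces so the device shell does not split the argument.
  const withSpaces = text.replace(/ /g, "%s");
  // Close the quote, emit an escaped quote, reopen: ' becomes '\''
  const quoted = withSpaces.replace(/'/g, `'\\''`);
  // Single quotes stop the device shell from expanding $, `, ;, & and similar.
  return `'${quoted}'`;
}
```

A literal `%s` already present in the input would also be read as a space under that assumption; this sketch does not handle that case.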

🛡️ Obstruction Detection

Two-layer system for detecting when UI elements are blocked:

  • Layer 1 (system overlays): Parses `dumpsys window windows` output for TalkBack, PiP, and accessibility overlays, using each window's `touchable region` bounds
  • Layer 2 (in-app elements): Finds the topmost clickable element at the tap point via XML tree traversal (catches FABs, dialogs, and bottom sheets)
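
A minimal sketch of the Layer 2 idea, not the PR's code. The types `Bounds` and `UiNode` and both function names are hypothetical, and the traversal assumes that nodes appearing later in the UIAutomator dump are drawn on top of earlier ones, which the PR does not confirm.

```typescript
interface Bounds {
  left: number;
  top: number;
  right: number;
  bottom: number;
}

// Hypothetical typed node; a real parser would carry more attributes.
interface UiNode {
  text: string;
  clickable: boolean;
  bounds: Bounds;
  children: UiNode[];
}

// UIAutomator writes bounds as "[left,top][right,bottom]".
function parseBounds(raw: string): Bounds | null {
  const m = /^\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]$/.exec(raw);
  if (!m) return null;
  return {
    left: Number(m[1]),
    top: Number(m[2]),
    right: Number(m[3]),
    bottom: Number(m[4]),
  };
}

function containsPoint(b: Bounds, x: number, y: number): boolean {
  return x >= b.left && x < b.right && y >= b.top && y < b.bottom;
}

// Pre-order walk in which later matches replace earlier ones.
// ASSUMPTION: later nodes in the dump are drawn above earlier ones.
function topmostClickableAt(node: UiNode, x: number, y: number): UiNode | null {
  let hit: UiNode | null =
    node.clickable && containsPoint(node.bounds, x, y) ? node : null;
  for (const child of node.children) {
    const childHit = topmostClickableAt(child, x, y);
    if (childHit) hit = childHit;
  }
  return hit;
}
```

If the node returned for a target's tap point is not the target itself, something else would receive the tap, which is the obstruction case described above.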

When obstructed, computes the largest visible strip (top/bottom/left/right of obstructor) and returns adjusted tap coordinates.
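
The strip computation could be sketched as follows. This is an illustrative reading of the sentence above, not the PR's implementation; `Rect`, `largestVisibleStrip`, and `centerOf` are hypothetical names, and rectangles are assumed to be axis-aligned pixel bounds.

```typescript
interface Rect {
  left: number;
  top: number;
  right: number;
  bottom: number;
}

function area(r: Rect): number {
  return Math.max(0, r.right - r.left) * Math.max(0, r.bottom - r.top);
}

// Return the largest part of `target` left uncovered by `obstructor`,
// or null when the target is fully covered.
function largestVisibleStrip(target: Rect, obstructor: Rect): Rect | null {
  // Clip the obstructor to the target so strips stay inside the target.
  const o: Rect = {
    left: Math.max(target.left, obstructor.left),
    top: Math.max(target.top, obstructor.top),
    right: Math.min(target.right, obstructor.right),
    bottom: Math.min(target.bottom, obstructor.bottom),
  };
  if (area(o) === 0) return target; // no overlap, whole target is visible

  const strips: Rect[] = [
    { ...target, bottom: o.top }, // above the obstructor
    { ...target, top: o.bottom }, // below the obstructor
    { ...target, right: o.left }, // left of the obstructor
    { ...target, left: o.right }, // right of the obstructor
  ];
  let best: Rect | null = null;
  for (const s of strips) {
    if (area(s) > 0 && (best === null || area(s) > area(best))) best = s;
  }
  return best;
}

// Adjusted tap point: the centre of the chosen strip.
function centerOf(r: Rect): { x: number; y: number } {
  return {
    x: Math.floor((r.left + r.right) / 2),
    y: Math.floor((r.top + r.bottom) / 2),
  };
}
```

For example, a 200×100 target whose right 80 pixels are covered yields the left strip, and the tap moves to that strip's centre.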

📁 Key Files

  • .mcp.json — MCP server config (stdio, auto-build)
  • .claude/mcp/android-device-server/ — TypeScript MCP server (self-contained for future plugin extraction)
  • .claude/skills/interacting-with-android-device/SKILL.md — Companion skill with tool docs and allowed-tools

✅ Testing

  • 57 unit tests across geometry, parsers, and ADB discovery (fixture-based, no device required)
  • `fast-xml-parser` for UIAutomator XML → typed tree
  • Zod validation on all tool inputs
  • `execFile` (not `exec`) prevents host-side command injection
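
To illustrate the `execFile` point, here is a minimal sketch of an ADB wrapper. It is not the PR's wrapper: `buildAdbArgs` and `runAdb` are hypothetical names, and the timeout and buffer values are arbitrary placeholders.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Build the argument vector; `-s <serial>` is adb's flag for picking a device.
function buildAdbArgs(serial: string | undefined, args: string[]): string[] {
  return serial ? ["-s", serial, ...args] : [...args];
}

// Run adb without a host shell: each array element reaches adb as one argv
// entry, so host-side metacharacters such as `;` or `$(...)` are not expanded.
async function runAdb(args: string[], serial?: string): Promise<string> {
  const { stdout } = await execFileAsync("adb", buildAdbArgs(serial, args), {
    timeout: 15_000, // placeholder value, not from the PR
    maxBuffer: 16 * 1024 * 1024, // placeholder; UI dumps can be large
  });
  return stdout;
}
```

This only removes the host shell from the picture. Anything passed after `shell` is still interpreted by the device's shell, which is why `input_text` needs its own escaping.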
GiteaMirror added the pull-request label 2026-04-11 03:55:21 -05:00

Reference: github-starred/android#12983