[PR #6747] [PM-34487] llm: Add Android device interaction MCP server with ADB tooling #43930

Open
opened 2026-04-23 22:36:10 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/bitwarden/android/pull/6747
Author: @SaintPatrck
Created: 3/31/2026
Status: 🔄 Open

Base: main ← Head: chore/improve-android-ui-verification-skill


📝 Commits (10+)

  • 483c8e7 [PM-34487] llm: Add interacting-with-android-device skill with ADB scripts
  • f487a28 Replace python3 XML parsing with grep + awk in adb-find-element
  • 9ee824b Address code review feedback for ADB scripts
  • 3771fb1 Fix allowed-tools format and add content-desc search fallback
  • 5302c0d Add obstruction detection to adb-find-element.sh
  • 17a9d81 chore: Convert ADB shell scripts to TypeScript MCP server
  • 3d607ed Fix .mcp.json location — move to project root
  • 64cf237 Extract shared find-element pipeline to reduce duplication
  • 12f19e8 Add input_text tool and update SKILL.md allowed-tools
  • 7f3e6d5 Fix shell escaping in input_text and stale path in SKILL.md

📊 Changes

26 files changed (+2670 additions, -0 deletions)

View changed files

.claude/mcp/android-device-server/.gitignore (+37 -0)
.claude/mcp/android-device-server/package.json (+34 -0)
.claude/mcp/android-device-server/src/adb/adb.spec.ts (+60 -0)
.claude/mcp/android-device-server/src/adb/adb.ts (+141 -0)
.claude/mcp/android-device-server/src/geometry/bounds.ts (+54 -0)
.claude/mcp/android-device-server/src/geometry/geometry.spec.ts (+334 -0)
.claude/mcp/android-device-server/src/geometry/obstruction.ts (+127 -0)
.claude/mcp/android-device-server/src/geometry/visible-region.ts (+91 -0)
.claude/mcp/android-device-server/src/index.ts (+64 -0)
.claude/mcp/android-device-server/src/parsers/__fixtures__/dumpsys-windows.txt (+630 -0)
.claude/mcp/android-device-server/src/parsers/__fixtures__/view.xml (+1 -0)
.claude/mcp/android-device-server/src/parsers/dumpsys.spec.ts (+122 -0)
.claude/mcp/android-device-server/src/parsers/dumpsys.ts (+105 -0)
.claude/mcp/android-device-server/src/parsers/xml.spec.ts (+120 -0)
.claude/mcp/android-device-server/src/parsers/xml.ts (+121 -0)
.claude/mcp/android-device-server/src/tools/capture.ts (+52 -0)
.claude/mcp/android-device-server/src/tools/find-element-pipeline.ts (+65 -0)
.claude/mcp/android-device-server/src/tools/find-element.ts (+77 -0)
.claude/mcp/android-device-server/src/tools/input-text.ts (+70 -0)
.claude/mcp/android-device-server/src/tools/navigate.ts (+62 -0)

...and 6 more files

📄 Description

🎟️ Tracking

https://bitwarden.atlassian.net/browse/PM-34487

📔 Objective

Add a TypeScript MCP server and companion skill for structured Android device interaction via ADB. Replaces the initial shell script approach with proper XML parsing, structured dumpsys window parsing, and native geometry for obstruction detection.

🏗️ Architecture

Claude Code ──stdio──> MCP Server (TypeScript)
                         ├── tools/        (6 tool handlers)
                         ├── adb/          (execFile wrapper, device discovery)
                         ├── parsers/      (UIAutomator XML, dumpsys window)
                         └── geometry/     (bounds, obstruction, visible-region)
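
As an illustration of the "device discovery" piece of the `adb/` layer, here is a minimal sketch of parsing `adb devices` output into typed records. The `AdbDevice` type and `parseDeviceList` function are hypothetical names chosen for this example; the PR's actual `adb.ts` may be structured differently.

```typescript
// Hypothetical shape for one entry of `adb devices` output.
interface AdbDevice {
  serial: string;
  state: string; // e.g. "device", "offline", "unauthorized"
}

// Parse the text printed by `adb devices`.
// Skips the "List of devices attached" header, blank lines,
// and daemon status lines that start with "*".
function parseDeviceList(output: string): AdbDevice[] {
  const devices: AdbDevice[] = [];
  for (const rawLine of output.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("List of devices") || line.startsWith("*")) {
      continue;
    }
    // Each device line is "<serial><whitespace><state>".
    const [serial, state] = line.split(/\s+/);
    if (serial && state) {
      devices.push({ serial, state });
    }
  }
  return devices;
}
```

Keeping the parser a pure function of the command output is what allows fixture-based tests to run without a device attached.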

🔧 MCP Tools

Tool          Description
capture       Dump UI hierarchy XML and/or screenshot
find_element  Find element by text/content-desc with two-layer obstruction detection
tap_element   Find + tap + capture (auto-adjusts for obstructions)
tap_at        Tap specific coordinates + capture
navigate      Home, back, app-drawer navigation + capture
input_text    Type text into focused field with shell-safe escaping
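
The "shell-safe escaping" mentioned for `input_text` matters because `adb shell` hands its arguments to a shell on the device, so metacharacters in user text could otherwise be interpreted there. A minimal sketch of one common approach, POSIX single-quote wrapping, follows. The helper names are hypothetical and this is not necessarily the escaping scheme the PR implements.

```typescript
// Hypothetical helper: quote a string for a POSIX shell.
// Inside single quotes nothing is special, so the only character that
// needs handling is the single quote itself: close the quote, emit an
// escaped quote, and reopen ('\'').
function quoteForDeviceShell(text: string): string {
  return "'" + text.replace(/'/g, "'\\''") + "'";
}

// Hypothetical helper: build the argv for `adb [-s serial] shell input text ...`.
// The array is meant to be passed to execFile, so the host shell is never involved.
function buildInputTextArgs(text: string, serial?: string): string[] {
  const deviceArgs = serial ? ["-s", serial] : [];
  return [...deviceArgs, "shell", "input", "text", quoteForDeviceShell(text)];
}
```

With this sketch, text such as `$(reboot)` reaches the device shell as an inert quoted literal rather than a command substitution.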

🛡️ Obstruction Detection

Two-layer system for detecting when UI elements are blocked:

  • Layer 1 — System overlays: Parses dumpsys window windows for TalkBack, PiP, accessibility overlays using touchable region bounds
  • Layer 2 — In-app elements: Finds topmost clickable element at tap point via XML tree traversal (catches FABs, dialogs, bottom sheets)
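
To make Layer 2 concrete, here is a minimal sketch of a hit test over a parsed UI hierarchy. UIAutomator dumps encode bounds as `[left,top][right,bottom]`; the `UiNode` shape, the function names, and the "later nodes in document order are drawn on top" heuristic are assumptions made for this example, not details confirmed by the PR.

```typescript
interface Bounds { left: number; top: number; right: number; bottom: number; }

// Hypothetical typed node, as might be produced from UIAutomator XML.
interface UiNode {
  text: string;
  clickable: boolean;
  bounds: Bounds;
  children: UiNode[];
}

// Parse a UIAutomator bounds attribute such as "[0,100][1080,300]".
function parseBounds(raw: string): Bounds | null {
  const m = /^\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]$/.exec(raw.trim());
  if (!m) return null;
  return { left: Number(m[1]), top: Number(m[2]), right: Number(m[3]), bottom: Number(m[4]) };
}

function containsPoint(b: Bounds, x: number, y: number): boolean {
  return x >= b.left && x < b.right && y >= b.top && y < b.bottom;
}

// Depth-first, pre-order walk. Assumption: nodes visited later are drawn
// on top, so the last clickable node containing the point wins.
function topmostClickableAt(root: UiNode, x: number, y: number): UiNode | null {
  let hit: UiNode | null = null;
  const visit = (node: UiNode): void => {
    if (node.clickable && containsPoint(node.bounds, x, y)) hit = node;
    node.children.forEach(visit);
  };
  visit(root);
  return hit;
}
```

If the node returned for the intended tap point is not the target element, the target is treated as obstructed at that point.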

When obstructed, computes the largest visible strip (top/bottom/left/right of obstructor) and returns adjusted tap coordinates.
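
The strip computation described above can be sketched as follows. The `Rect` type and function names are hypothetical, and this is one straightforward reading of "largest visible strip", not necessarily the logic in `visible-region.ts`.

```typescript
interface Rect { left: number; top: number; right: number; bottom: number; }

function rectArea(r: Rect): number {
  const w = r.right - r.left;
  const h = r.bottom - r.top;
  return w > 0 && h > 0 ? w * h : 0; // degenerate rectangles count as empty
}

// Return the largest part of `target` lying entirely above, below,
// left of, or right of `obstructor`, or null if nothing remains visible.
function largestVisibleStrip(target: Rect, obstructor: Rect): Rect | null {
  const strips: Rect[] = [
    { ...target, bottom: Math.min(target.bottom, obstructor.top) },  // above
    { ...target, top: Math.max(target.top, obstructor.bottom) },     // below
    { ...target, right: Math.min(target.right, obstructor.left) },   // left
    { ...target, left: Math.max(target.left, obstructor.right) },    // right
  ];
  let best: Rect | null = null;
  for (const strip of strips) {
    if (rectArea(strip) > (best ? rectArea(best) : 0)) best = strip;
  }
  return best;
}

// Adjusted tap point: the center of the chosen strip.
function centerOf(r: Rect): { x: number; y: number } {
  return { x: Math.floor((r.left + r.right) / 2), y: Math.floor((r.top + r.bottom) / 2) };
}
```

For example, a wide button whose right edge is covered by a floating action button yields the left strip, and the tap lands at that strip's center instead of the element's original center.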

📁 Key Files

  • .mcp.json — MCP server config (stdio, auto-build)
  • .claude/mcp/android-device-server/ — TypeScript MCP server (self-contained for future plugin extraction)
  • .claude/skills/interacting-with-android-device/SKILL.md — Companion skill with tool docs and allowed-tools

✅ Testing

  • 57 unit tests across geometry, parsers, ADB discovery (fixture-based, no device required)
  • fast-xml-parser for UIAutomator XML → typed tree
  • Zod validation on all tool inputs
  • execFile (not exec) prevents host-side command injection
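
The `execFile` point can be illustrated with a minimal wrapper sketch. `execFile` passes arguments to the `adb` binary as an array without spawning a host shell, so host-side metacharacters in an argument are never interpreted. The function names, timeout, and buffer size below are assumptions for illustration, not values taken from the PR.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Hypothetical helper: prepend "-s <serial>" when a specific device is targeted.
function buildAdbArgs(serial: string | undefined, command: string[]): string[] {
  return serial ? ["-s", serial, ...command] : [...command];
}

// Hypothetical wrapper: run adb with an argument array (no host shell).
// Rejects if adb exits non-zero or the timeout elapses.
async function runAdb(command: string[], serial?: string): Promise<string> {
  const { stdout } = await execFileAsync("adb", buildAdbArgs(serial, command), {
    timeout: 15_000,             // assumed limit, in milliseconds
    maxBuffer: 16 * 1024 * 1024, // assumed headroom for large UI dumps
  });
  return stdout;
}
```

A call such as `runAdb(["shell", "input", "tap", "540", "1200"], "emulator-5554")` would tap a point on a named emulator. Text forwarded to `adb shell` still needs device-side quoting, since that shell runs on the device.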

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-23 22:36:10 -05:00
Reference: github-starred/android#43930