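CDP Automation in Practice: Controlling Electron Desktop Apps with the Chrome DevTools Protocol

Why Electron App Automation Is So Hard

Anyone who has automated a desktop application knows the field has more pitfalls than you can count.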

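Traditional desktop automation tools (PyAutoGUI, AutoIt, WinAppDriver) mostly work from outside the app: they simulate mouse clicks and keyboard input, usually against screen coordinates or native window controls, and have no view of the DOM inside the window. As soon as the window moves, the resolution changes, or a dialog pops up in the way, a coordinate-based script stops working.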

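Web automation tools such as Selenium and Playwright are pleasant to use, but out of the box they drive browsers, not Electron apps. Electron runs on Chromium underneath, yet it does not expose a remote debugging port by default, so a stock Selenium setup has nothing to connect to.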

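CDP (Chrome DevTools Protocol) offers a way through: have the Electron app expose a debugging port, then control it the same way you would control a browser. You can locate DOM elements precisely, execute JavaScript, and intercept network requests, just as you would by hand in Chrome DevTools.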

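What CDP Is

Chrome DevTools Protocol is the debugging protocol that Chrome exposes. The DevTools panel you open with F12 talks to the browser over this same protocol. The operations it supports include:

Capability	Description
DOM operations	Query elements, modify attributes, observe changes
JavaScript execution	Run arbitrary JS in the page context
Network interception	Intercept, modify, or mock HTTP requests
Page navigation	Control navigation and reloads
Screenshots	Capture the full page or a single element
Performance analysis	CPU profiles, heap snapshots

Electron's core is Chromium, so it supports CDP as well. It is simply not enabled by default.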
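
To make the protocol less abstract, here is a minimal sketch of what travels over the wire. CDP messages are JSON: a command carries an id, a method, and optional params; the reply echoes the id with either a result or an error; events arrive with a method but no id. The helper names build_command and classify are made up for this illustration and are not part of any library.

```python
import itertools
import json

# Every command needs a unique integer id so the reply can be matched to it.
_ids = itertools.count(1)


def build_command(method, params=None):
    """Serialize one CDP command as a JSON string."""
    message = {"id": next(_ids), "method": method}
    if params:
        message["params"] = params
    return json.dumps(message)


def classify(raw):
    """Return 'response', 'error', or 'event' for an incoming CDP message."""
    message = json.loads(raw)
    if "id" in message:
        return "error" if "error" in message else "response"
    return "event"  # events carry a method and params, but no id


print(build_command("Runtime.evaluate", {"expression": "document.title"}))
print(classify('{"id": 1, "result": {"result": {"type": "string", "value": "Your App Window"}}}'))
print(classify('{"method": "Page.loadEventFired", "params": {"timestamp": 1.0}}'))
```

Any client, hand-written or from a library, ends up doing some version of this bookkeeping.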

Step 1: Enable Remote Debugging in the Electron App

Method 1: Launch flags

Add the remote debugging flag when launching the Electron app:

# macOS
/Applications/YourApp.app/Contents/MacOS/YourApp \
  --remote-debugging-port=9222

# Windows
YourApp.exe --remote-debugging-port=9222

# Linux
./your-app --remote-debugging-port=9222
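
If the automation script should also start the app, the same flag can be passed from Python. The sketch below uses only the standard library; the function names and the 15-second default timeout are assumptions for illustration, and the executable path is whatever your app actually uses.

```python
import socket
import subprocess
import time


def build_launch_command(executable, port=9222):
    """Command line that starts the app with remote debugging enabled."""
    return [executable, f"--remote-debugging-port={port}"]


def wait_for_port(port, host="127.0.0.1", timeout=15.0):
    """Poll until something accepts TCP connections on the port, or time out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.2)  # not listening yet, try again shortly
    return False


def launch(executable, port=9222):
    """Start the app and return the process once the debugging port is open."""
    process = subprocess.Popen(build_launch_command(executable, port))
    if not wait_for_port(port):
        process.terminate()
        raise RuntimeError(f"debugging port {port} did not open in time")
    return process
```

Calling launch("./your-app") would then return a Popen handle that the script can terminate when it is done.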

Method 2: Modify the Electron main process code

If you have the app's source code, you can configure this in the main process:

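const { app } = require('electron');

// Switches must be appended before the app's ready event fires
app.commandLine.appendSwitch('remote-debugging-port', '9222');

// Optional: allow remote connections (only localhost is allowed by default).
// Anyone who can reach the port gets full control of the app,
// so keep this off untrusted networks.
app.commandLine.appendSwitch('remote-debugging-address', '0.0.0.0');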

Verifying the connection

Once debugging is enabled, visiting http://localhost:9222/json should return the list of debuggable pages:

[
  {
    "description": "",
    "devtoolsFrontendUrl": "/devtools/inspector.html?ws=localhost:9222/...",
    "id": "ABC123",
    "title": "Your App Window",
    "type": "page",
    "url": "file:///app/index.html",
    "webSocketDebuggerUrl": "ws://localhost:9222/devtools/page/ABC123"
  }
]

Step 2: Connect and Control with Python

Basic connection

import asyncio
import json

import aiohttp
import websockets


class CDPClient:
    """Minimal Chrome DevTools Protocol client."""

    def __init__(self, ws_url):
        self.ws_url = ws_url
        self.ws = None
        self.message_id = 0
        self.responses = {}  # pending command id -> Future
        self._receiver = None

    async def connect(self):
        # max_size=None: payloads such as screenshots can exceed the 1 MiB default
        self.ws = await websockets.connect(self.ws_url, max_size=None)
        # Start the receive loop; keep a reference so the task is not garbage-collected
        self._receiver = asyncio.create_task(self._receive_loop())

    async def _receive_loop(self):
        async for message in self.ws:
            data = json.loads(message)
            # Responses echo the command id; events have no id and are ignored here
            future = self.responses.pop(data.get("id"), None)
            if future is not None and not future.done():
                future.set_result(data)

    async def send(self, method, params=None):
        """Send a CDP command and wait for its response."""
        self.message_id += 1
        msg_id = self.message_id
        msg = {"id": msg_id, "method": method}
        if params:
            msg["params"] = params

        future = asyncio.get_running_loop().create_future()
        self.responses[msg_id] = future

        await self.ws.send(json.dumps(msg))
        try:
            result = await asyncio.wait_for(future, timeout=10)
        finally:
            # Drop the pending entry even on timeout so the dict does not grow
            self.responses.pop(msg_id, None)

        if "error" in result:
            raise RuntimeError(f"CDP Error: {result['error']}")
        return result.get("result", {})

    async def close(self):
        if self._receiver:
            self._receiver.cancel()
        if self.ws:
            await self.ws.close()


async def main():
    # 1. Fetch the list of debuggable pages
    async with aiohttp.ClientSession() as session:
        async with session.get("http://localhost:9222/json") as resp:
            pages = await resp.json()

    page = pages[0]  # take the first page (see the multi-window pitfall later on)
    ws_url = page["webSocketDebuggerUrl"]

    # 2. Connect
    client = CDPClient(ws_url)
    await client.connect()

    # 3. Execute JavaScript
    result = await client.send("Runtime.evaluate", {
        "expression": "document.title"
    })
    print("Page title:", result["result"]["value"])

    await client.close()

asyncio.run(main())

Recommended: use pychrome or pyppeteer

Writing your own CDP client is very low-level work. An existing library is usually the better choice:

pip install pyppeteer  # Python port of Puppeteer
# or
pip install pychrome    # a lighter-weight CDP wrapper

Connecting to Electron with pyppeteer:

import asyncio
from pyppeteer import connect

async def control_electron():
    # Connect to an Electron app that already has its debugging port open.
    # defaultViewport emulates a fixed viewport; pass None to keep the app's own window size.
    browser = await connect(
        browserURL="http://localhost:9222",
        defaultViewport={"width": 1280, "height": 800}
    )

    # Get the page
    pages = await browser.pages()
    page = pages[0]

    # Interact with the DOM
    await page.click('#login-button')
    await page.type('#username', 'admin')
    await page.type('#password', 'password123')
    await page.click('#submit')

    # Wait for the dashboard to appear after submitting
    await page.waitForSelector('.dashboard', timeout=5000)

    # Take a screenshot
    await page.screenshot({'path': 'dashboard.png'})

    # Execute JS
    title = await page.evaluate('document.title')
    print(f"Page title: {title}")

    # Read the text of each matching element
    elements = await page.querySelectorAll('.menu-item')
    for el in elements:
        text = await page.evaluate('(el) => el.textContent', el)
        print(f"Menu item: {text}")

asyncio.run(control_electron())

Step 3: Common Automation Scenarios

Scenario 1: Automatic login

async def auto_login(page, username, password):
    """Log in automatically."""
    # Wait for the login page to load
    await page.waitForSelector('#login-form')

    # Clear the field, then type the username
    await page.click('#username', clickCount=3)  # triple-click selects the existing text
    await page.type('#username', username)

    # Clear the field, then type the password
    await page.click('#password', clickCount=3)
    await page.type('#password', password)

    # Click the login button
    await page.click('#login-button')

    # Wait for login to finish; the signal used here is a post-login element appearing
    await page.waitForSelector('.main-content', timeout=10000)
    print("Login succeeded")

Scenario 2: Data extraction

async def extract_table_data(page):
    """Extract the data from a table on the page."""
    # Wait for the table to render
    await page.waitForSelector('table.data-table')

    # Extract the table contents directly with JS
    data = await page.evaluate('''
        () => {
            const table = document.querySelector('table.data-table');
            const headers = Array.from(
                table.querySelectorAll('thead th')
            ).map(th => th.textContent.trim());

            const rows = Array.from(
                table.querySelectorAll('tbody tr')
            ).map(tr =>
                Array.from(tr.querySelectorAll('td'))
                    .map(td => td.textContent.trim())
            );

            return { headers, rows };
        }
    ''')

    print(f"Headers: {data['headers']}")
    print(f"Row count: {len(data['rows'])}")
    return data
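
The returned structure is a dict with a headers list and a rows list of lists. As a hedged follow-up, the sketch below shows one way to turn that shape into per-row dicts or CSV text using only the standard library. The function names and the sample data are invented for illustration.

```python
import csv
import io


def rows_to_records(data):
    """Pair each row with the headers, giving one dict per table row."""
    headers = data["headers"]
    return [dict(zip(headers, row)) for row in data["rows"]]


def table_to_csv(data):
    """Render the extracted table as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(data["headers"])
    writer.writerows(data["rows"])
    return buffer.getvalue()


# Sample data in the same shape that extract_table_data returns
sample = {
    "headers": ["Name", "Status"],
    "rows": [["alpha", "ok"], ["beta", "failed"]],
}
print(rows_to_records(sample))
print(table_to_csv(sample), end="")
```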

Scenario 3: Intercepting network requests

import asyncio
import json

async def intercept_requests(page):
    """Intercept and log network requests."""
    # Turn on request interception
    await page.setRequestInterception(True)

    async def handle_request(request):
        url = request.url

        # Log every API request
        if '/api/' in url:
            print(f"API request: {request.method} {url}")

        # Mock a specific endpoint
        if '/api/config' in url:
            await request.respond({
                'status': 200,
                'contentType': 'application/json',
                'body': json.dumps({
                    'feature_flag': True,
                    'max_upload_size': 100
                })
            })
            return

        # Let every other request through unchanged
        await request.continue_()

    # pyppeteer calls handlers synchronously, so schedule the coroutine ourselves
    page.on('request', lambda req: asyncio.ensure_future(handle_request(req)))
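
As the number of mocked endpoints grows, it can help to separate the decision (mock or pass through) from the pyppeteer calls so the rules can be checked without a running app. The following is a sketch under that assumption; MOCKS and plan_request are hypothetical names, and the matching rule is the same substring check used above.

```python
import json

# URL fragment -> JSON payload to serve instead of the real response
MOCKS = {
    "/api/config": {"feature_flag": True, "max_upload_size": 100},
}


def plan_request(url, mocks=MOCKS):
    """Decide what to do with a request.

    Returns ("respond", response_dict) for mocked URLs and
    ("continue", None) for everything else.
    """
    for fragment, payload in mocks.items():
        if fragment in url:
            response = {
                "status": 200,
                "contentType": "application/json",
                "body": json.dumps(payload),
            }
            return "respond", response
    return "continue", None


print(plan_request("http://localhost:3000/api/config"))
print(plan_request("http://localhost:3000/api/users"))
```

Inside handle_request, a "respond" result would be passed to request.respond and a "continue" result would lead to request.continue_().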

Scenario 4: Filling forms in bulk

async def batch_fill_forms(page, records):
    """Fill in and submit one form per record."""
    for i, record in enumerate(records):
        print(f"Filling record {i + 1}/{len(records)}")

        # Navigate to the new-form page
        await page.goto('http://localhost:3000/forms/new')
        await page.waitForSelector('#form')

        # Fill in each field; dict keys are expected to match the inputs' name attributes
        for field_name, value in record.items():
            selector = f'input[name="{field_name}"]'
            await page.waitForSelector(selector)
            await page.click(selector, clickCount=3)
            await page.type(selector, str(value))

        # Submit
        await page.click('#submit-button')

        # Wait for the success message
        await page.waitForSelector('.success-message', timeout=5000)
        print(f"  ✅ Record {i + 1} submitted")

        # Pause between records so submissions are not fired too quickly
        await asyncio.sleep(1)
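
The records argument above is a list of dicts keyed by field name. Two small helpers often go with it, sketched here with hypothetical names: one loads such records from CSV, and one builds the attribute selector while escaping quotes and backslashes, which the plain f-string above does not do.

```python
import csv
import io


def load_records(csv_file):
    """Read a CSV file object into a list of {field_name: value} dicts."""
    return [dict(row) for row in csv.DictReader(csv_file)]


def name_selector(field_name, tag="input"):
    """Build tag[name="..."] with backslashes and double quotes escaped."""
    escaped = field_name.replace("\\", "\\\\").replace('"', '\\"')
    return f'{tag}[name="{escaped}"]'


# In-memory CSV standing in for a real file opened with open(path, newline="")
sample = io.StringIO("name,email\nAda,ada@example.com\nLin,lin@example.com\n")
records = load_records(sample)
print(records)
print(name_selector("email"))
```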

Electron-Specific Pitfalls

Pitfall 1: Multiple windows and WebViews

Electron apps often have several windows or WebViews. The /json endpoint lists every debuggable page:

async with aiohttp.ClientSession() as session:
    async with session.get("http://localhost:9222/json") as resp:
        pages = await resp.json()
        for p in pages:
            print(f"[{p['type']}] {p['title']} - {p['url']}")

Pick the target page by its title or url rather than blindly taking the first entry.
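
A minimal sketch of that selection step, working on the parsed /json list. The function name find_target and the sample entries are made up for illustration; only the field names (type, title, url, webSocketDebuggerUrl) come from the endpoint's output shown earlier.

```python
def find_target(targets, title_contains=None, url_contains=None):
    """Return the first page target matching the given filters, or None."""
    for target in targets:
        if target.get("type") != "page":
            continue  # skip workers, background pages, and similar entries
        if title_contains and title_contains not in target.get("title", ""):
            continue
        if url_contains and url_contains not in target.get("url", ""):
            continue
        return target
    return None


# Sample entries shaped like the /json response
sample_targets = [
    {"type": "background_page", "title": "Extension", "url": "chrome-extension://abc/bg.html"},
    {"type": "page", "title": "Splash", "url": "file:///app/splash.html"},
    {
        "type": "page",
        "title": "Your App Window",
        "url": "file:///app/index.html",
        "webSocketDebuggerUrl": "ws://localhost:9222/devtools/page/ABC123",
    },
]

target = find_target(sample_targets, url_contains="index.html")
print(target["webSocketDebuggerUrl"])
```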

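Pitfall 2: Electron's Node.js context

An Electron app runs JavaScript in two kinds of processes: renderer processes (which behave like a browser page) and the main process (Node.js). The remote debugging port connects to renderers, so unless the app has enabled Node integration for that window, code you evaluate there cannot reach Node.js APIs such as require('fs').

If you need to touch the file system, either go through an IPC interface exposed by the main process, or do the work at the operating-system level.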
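
To illustrate the IPC route: many apps expose a small API object to the renderer from a preload script, and a CDP client can call it with Runtime.evaluate. The sketch below only builds the JavaScript expression. The bridge object window.electronAPI and its readTextFile method are purely hypothetical; substitute whatever the target app actually exposes.

```python
import json


def ipc_call_expression(method, *args, bridge="window.electronAPI"):
    """Build a JS expression that calls bridge.method(...args).

    Arguments are serialized with json.dumps so strings are quoted and
    escaped instead of being pasted raw into the JavaScript source.
    """
    js_args = ", ".join(json.dumps(arg) for arg in args)
    return f"{bridge}.{method}({js_args})"


expression = ipc_call_expression("readTextFile", "/tmp/report.txt")
print(expression)
```

The resulting string can be sent as the expression parameter of Runtime.evaluate. If the bridge method returns a Promise, setting awaitPromise and returnByValue to true in the same call makes CDP wait for it and return the resolved value.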

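Pitfall 3: Nested frames

Electron apps often embed other pages with <webview> or <iframe> tags. A CDP session is attached to a single target, and <webview> guests and out-of-process iframes usually show up as separate targets, so to operate inside them you need to attach to the matching target: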

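# List every target the browser knows about
targets = await client.send("Target.getTargets")
for target in targets["targetInfos"]:
    # <webview> guests are typically reported with type "webview"
    if target["type"] in ("iframe", "webview"):
        print(f"{target['type']}: {target['title']} - {target['url']}")

# Attach to a specific target; this opens a new session rather than switching the current one
attached = await client.send("Target.attachToTarget", {
    "targetId": "ID of the target frame",
    "flatten": True
})
# With flatten=True, later commands for that target must carry this sessionId
session_id = attached["sessionId"]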
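
In flattened mode, commands aimed at the attached target go over the same WebSocket with a top-level sessionId field, and the replies carry it back. The basic client shown earlier does not add that field, so here is a rough sketch of the framing it would need. The helper names and the session id value are placeholders for illustration.

```python
import json


def build_session_command(msg_id, method, session_id, params=None):
    """Serialize a CDP command addressed to an attached session."""
    message = {"id": msg_id, "method": method, "sessionId": session_id}
    if params:
        message["params"] = params
    return json.dumps(message)


def session_of(raw_message):
    """Return the sessionId of an incoming message, or None for the root session."""
    return json.loads(raw_message).get("sessionId")


command = build_session_command(
    7, "Runtime.evaluate", "SESSION-1", {"expression": "document.title"}
)
print(command)
print(session_of(command))
print(session_of('{"id": 3, "result": {}}'))
```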

Pitfall 4: Anti-automation detection

Some Electron apps detect automated environments, much as websites detect Selenium. Common checks include:

// Properties the app might check
navigator.webdriver  // true indicates an automated environment
window.chrome  // may be missing

A workaround:

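# Inject a script that runs before the page's own scripts to hide automation traits.
# It only applies to documents loaded after this call, so reload if the page is already open.
await page.evaluateOnNewDocument('''
    () => {
        Object.defineProperty(navigator, 'webdriver', { get: () => false });
    }
''')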

A Complete Automation Framework Wrapper

import asyncio
from pyppeteer import connect


class ElectronAutomation:
    """Automation wrapper for an Electron app."""

    def __init__(self, debug_port=9222):
        self.debug_port = debug_port
        self.browser = None
        self.page = None

    async def start(self):
        """Connect to the Electron app."""
        self.browser = await connect(
            browserURL=f"http://localhost:{self.debug_port}"
        )
        pages = await self.browser.pages()
        # First page for simplicity; multi-window apps should select by title or url
        self.page = pages[0]
        print(f"Connected: {await self.page.title()}")

    async def wait_for(self, selector, timeout=5000):
        """Wait for an element to appear."""
        return await self.page.waitForSelector(selector, timeout=timeout)

    async def click(self, selector):
        """Click an element."""
        await self.wait_for(selector)
        await self.page.click(selector)

    async def type_text(self, selector, text, clear=True):
        """Type text into an element."""
        await self.wait_for(selector)
        if clear:
            await self.page.click(selector, clickCount=3)
        await self.page.type(selector, text)

    async def get_text(self, selector):
        """Get an element's text content."""
        await self.wait_for(selector)
        # Pass the selector as an argument so quotes inside it cannot break the JS source
        return await self.page.evaluate(
            '(sel) => document.querySelector(sel).textContent', selector
        )

    async def screenshot(self, path):
        """Take a screenshot."""
        await self.page.screenshot({'path': path})

    async def execute_js(self, expression):
        """Execute JavaScript."""
        return await self.page.evaluate(expression)

    async def close(self):
        """Disconnect (without closing the app)."""
        if self.browser:
            await self.browser.disconnect()


# Usage
async def main():
    app = ElectronAutomation(debug_port=9222)
    await app.start()

    await app.click('#login-btn')
    await app.type_text('#username', 'admin')
    await app.type_text('#password', 'pass123')
    await app.click('#submit')

    await app.wait_for('.dashboard', timeout=10000)
    title = await app.get_text('.page-title')
    print(f"Entered: {title}")

    await app.screenshot('result.png')
    await app.close()

asyncio.run(main())

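Final Thoughts

CDP opens the door to automating Electron apps. Compared with traditional coordinate-clicking approaches, its advantages are substantial: element-level targeting, JavaScript execution, network interception, and scripts that no longer depend on window position or screen resolution.

Key takeaways:

Have the app open a debugging port: --remote-debugging-port=9222
Connect with pyppeteer: far more convenient than a hand-written WebSocket client
Watch for Electron-specific pitfalls: multiple windows, the Node context, nested WebViews
Wrap it in a framework: put common operations in a class so it is as convenient to use as Selenium

An Electron app may be a desktop program, but at heart it is built on web technology. Controlling it with the web's own debugging protocol is the most natural approach, and a powerful one.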