page.route in Playwright (Python): concepts, API, common usage, caveats, and reusable sync/async examples

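Overview (in one sentence)

page.route(pattern, handler) intercepts network requests issued by the page (XHR, fetch, images, scripts, and so on) and hands each one to your handler(route, request), which decides whether to send it on unchanged, send it with modifications, answer it with fake data, or abort it. Once a matching route is enabled, the request stays pending until the handler resolves it with one of route.continue_(), route.fulfill(), route.abort(), or route.fallback(). ([playwright.dev][1])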

Core concepts and signatures

Registering a route (sync API):

page.route("**/api/**", handler)   # handler(route, request)

Registering a route (async API). Here page.route is a coroutine, so await it, as the official examples do:

await page.route("**/api/**", handler)   # async def handler(route, request)

The handler takes two arguments: route (a Route object, used to act on the request) and request (a Request object, used to read the original request, for example request.url, request.method, and request.post_data). ([playwright.dev][1])
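To make the two handler arguments concrete, here is a minimal sketch of a handler that only reads from request and then resolves route. The helper name describe_request and the target URL are assumptions made for illustration; the Playwright import is deferred into main() so the helpers can be exercised without a browser.

```python
def describe_request(request):
    """Collect the fields a handler typically reads from a Request."""
    return {
        "url": request.url,
        "method": request.method,
        "resource_type": request.resource_type,
        "post_data": request.post_data,
    }


def logging_handler(route, request):
    """Log the request, then let it go to the network unchanged."""
    print(describe_request(request))
    route.continue_()  # the route must be resolved, or the request stays pending


def main():
    # Imported here so the helpers above do not require Playwright to be installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as pw:
        browser = pw.chromium.launch()
        page = browser.new_page()
        page.route("**/*", logging_handler)
        page.goto("https://example.com")  # placeholder URL
        browser.close()
```

Call main() to try it against a real page; a broad "**/*" pattern is used here only because the handler is a pure pass-through.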

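Common Route methods (quick reference)

route.continue_(**overrides): sends the request to the network, optionally overriding headers, method, post_data, or url. (The Python name is continue_ because continue is a reserved keyword.) The request is sent immediately, and no other matching handler is invoked afterwards. ([playwright.dev][2])
route.fallback(**overrides): like continue_, but hands the request to the next matching handler instead of sending it, so handlers can be chained or layered (added in v1.23). ([playwright.dev][2])
route.fulfill(status=..., body=..., content_type=..., headers=..., path=..., response=..., json=...): completes the request with a response you specify (a mocked response). The response can be built from body, from a file path, or from a response obtained through route.fetch(), with individual fields overridden. ([playwright.dev][3])
route.abort(error_code=None): aborts the request. The default error code is failed; a specific code can be passed instead. ([playwright.dev][2])
route.fetch(): performs the original request against the network and returns an APIResponse object, which makes it easy to modify the real response before passing it to fulfill. ([playwright.dev][4])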
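The methods above can be combined in a single handler. The following is a sketch only: the blocked resource types, the /api/user endpoint, and the x-test-run header are hypothetical choices, not anything prescribed by Playwright.

```python
import json

# Hypothetical policy for this sketch: drop heavy static assets.
BLOCKED_TYPES = {"image", "font"}


def handler(route, request):
    if request.resource_type in BLOCKED_TYPES:
        # Abort with the default error code ("failed").
        route.abort()
    elif request.url.endswith("/api/user"):  # hypothetical endpoint
        # Answer with mocked JSON; the real server is never contacted.
        route.fulfill(
            status=200,
            content_type="application/json",
            body=json.dumps({"name": "test-user"}),
        )
    elif "/api/" in request.url:
        # Send the request on, adding one header to the original ones.
        headers = {**request.headers, "x-test-run": "1"}
        route.continue_(headers=headers)
    else:
        # Everything else goes out unchanged.
        route.continue_()
```

Registering it is the usual page.route("**/*", handler); every branch ends in exactly one resolving call.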

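Matching rules, precedence, and removing routes

The url argument can be a glob (**/*), a regular expression, or a predicate callback that receives the URL and returns a boolean. ([playwright.dev][5])
Precedence: page.route (which applies to that page only) takes priority over browser_context.route, so when a request matches both, the page handler wins. To intercept requests from every page, including popups, consider context.route. ([playwright.dev][5])
Removing a route: page.unroute(url[, handler]). If handler is omitted, every route registered for that url is removed. ([playwright.dev][1])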
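As a sketch of the three matcher kinds and of unroute, the helpers below register one glob, one regular expression, and one predicate. The patterns, the is_tracking_url heuristic, and the function names are assumptions made for illustration; page is expected to be a Playwright Page from the sync API.

```python
import re

API_GLOB = "**/api/**"
IMAGE_RE = re.compile(r"\.(png|jpe?g|gif)(\?.*)?$")


def is_tracking_url(url: str) -> bool:
    """Predicate matcher: receives the URL string and returns a bool."""
    return "analytics" in url or "tracker" in url


def block(route):
    route.abort()


def passthrough(route):
    route.continue_()


def register_routes(page):
    page.route(API_GLOB, passthrough)   # glob pattern
    page.route(IMAGE_RE, block)         # compiled regular expression
    page.route(is_tracking_url, block)  # predicate callback


def unregister_routes(page):
    page.unroute(IMAGE_RE, block)  # remove one specific handler
    page.unroute(API_GLOB)         # remove every handler for this pattern
```

The handlers here take only route; Playwright also accepts the two-argument (route, request) form used elsewhere in this document.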

Practical examples (ready to copy and adapt)

1) Sync API: intercept and mock an API

from playwright.sync_api import sync_playwright
import json

def run():
    with sync_playwright() as pw:
        browser = pw.chromium.launch()
        page = browser.new_page()

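        def handler(route, request):
            if "api.example.com/data" in request.url:
                body = json.dumps({"mocked": True, "data": []})
                route.fulfill(status=200, content_type="application/json", body=body)
            else:
                route.continue_()   # always resolve the route, or the request hangs

        # Match the API host: a "**/api/**" glob needs an /api/ path segment,
        # so it would never match https://api.example.com/data
        page.route("https://api.example.com/**", handler)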
        page.goto("https://example.com")
        browser.close()

if __name__ == "__main__":
    run()

2) Async API: modify the real response before returning it (fetch → patch JSON → fulfill)

import asyncio
from playwright.async_api import async_playwright

async def run():
    async with async_playwright() as pw:
        browser = await pw.chromium.launch()
        context = await browser.new_context()
        page = await context.new_page()

        async def handler(route, request):
            if "/xhr_endpoint" in request.url:
                # Fetch the real response, then patch its content
                response = await route.fetch()
                data = await response.json()
                data["injected_by_test"] = True
                await route.fulfill(response=response, json=data)
            else:
                await route.continue_()

        await page.route("**/xhr_endpoint", handler)
        await page.goto("https://example.com")
        await browser.close()

asyncio.run(run())

3) Sync API: modify the real response before returning it (fetch → patch JSON → fulfill)

from playwright.sync_api import sync_playwright

def run():
    with sync_playwright() as pw:
        browser = pw.chromium.launch()
        page = browser.new_page()

        def handler(route, request):
            if "/xhr_endpoint" in request.url:
                response = route.fetch()
                data = response.json()
                data["patched_by_test"] = True
                route.fulfill(response=response, json=data)
            else:
                route.continue_()

        page.route("**/xhr_endpoint", handler)
        page.goto("https://example.com")
        browser.close()

if __name__ == "__main__":
    run()

(The route.fetch() + route.fulfill(response=..., json=...) combination above is the pattern shown in the official docs for returning a real response with small changes.) ([playwright.dev][4])

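Common scenarios and tips

Intercept only the URL range you care about (for example **/api/**) rather than everything with **/*; this is easier on both performance and debugging.
Simulating a slow network or a timeout: sleep with time.sleep (sync) or await asyncio.sleep (async) before calling fulfill() to add latency, or call abort("timedout").
Modifying request headers or body: pass overrides to route.continue_(headers=..., method=..., post_data=..., url=...). Note that headers also apply to any redirects the request triggers, while url, method, and post_data apply only to the original request and are not carried over to redirected requests. ([playwright.dev][3])
Layered routes (fallback): route.fallback() lets several handlers run in sequence, with the last one in the chain deciding the final outcome, whereas continue_() sends the request immediately and skips the remaining handlers. ([playwright.dev][2])
Resolve the route in every branch: if a handler has a branch that calls none of continue_(), fulfill(), abort(), or fallback(), the request stays pending and page actions stall (a very common trap). ([playwright.dev][1])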
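The latency and timeout tips above could look roughly like this with the sync API. The factory name make_slow_handler, the delay value, and the payload are hypothetical; an async version would use await asyncio.sleep and await the route calls instead.

```python
import json
import time


def make_slow_handler(delay_seconds, payload):
    """Build a handler that waits, then answers with mocked JSON."""

    def handler(route, request):
        time.sleep(delay_seconds)  # simulated network latency
        route.fulfill(
            status=200,
            content_type="application/json",
            body=json.dumps(payload),
        )

    return handler


def timeout_handler(route, request):
    """Fail the request as if the network had timed out."""
    route.abort("timedout")
```

A possible registration is page.route("**/api/slow", make_slow_handler(2.0, {"ok": True})), where the endpoint is again only a placeholder.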

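Caveats and common pitfalls

Enabling routing disables the HTTP cache (the docs state this explicitly), so tests may behave differently once a route is active. ([playwright.dev][5])
continue_() cannot override the Cookie header, because the browser loads cookies from its cookie store and a supplied value is ignored. To set custom cookies, use context.add_cookies() or another mechanism. ([playwright-ruby-client.vercel.app][6])
When several route patterns match, handlers run in reverse registration order (the most recently registered runs first). This makes it easy to override a default route, but keep the order in mind. ([Cuketest][7])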
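The reverse-order rule matters most when handlers are chained with route.fallback(). In this sketch the handler registered last runs first, adds a header, and passes the request on to the handler registered earlier. The token value, the /api/ping endpoint, and the function names are hypothetical.

```python
def add_auth_header(route, request):
    """Runs first (registered last): adjust headers, then defer."""
    headers = {**request.headers, "authorization": "Bearer test-token"}
    route.fallback(headers=headers)  # hand over to the next matching handler


def mock_or_continue(route, request):
    """Runs last (registered first): decide the final outcome."""
    if request.url.endswith("/api/ping"):  # hypothetical endpoint
        route.fulfill(status=200, body="pong")
    else:
        route.continue_()


def register(page):
    page.route("**/api/**", mock_or_continue)  # registered first, runs last
    page.route("**/api/**", add_auth_header)   # registered last, runs first
```

Had add_auth_header called continue_() instead of fallback(), the request would be sent immediately and mock_or_continue would never run.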

Mini cheatsheet (call forms)

Register: page.route(url, handler) or context.route(url, handler). ([playwright.dev][1])
Continue: route.continue_(headers=..., url=..., method=..., post_data=...). ([playwright.dev][3])
Mock: route.fulfill(status=200, content_type='application/json', body=json.dumps(...)) or route.fulfill(response=response, json=...). ([playwright.dev][3])
Abort: route.abort(). ([playwright.dev][2])
Fetch the original: response = route.fetch() (sync) or response = await route.fetch() (async). ([playwright.dev][4])
Remove a route: page.unroute(url[, handler]). ([playwright.dev][1])

4) Sync API: a fuller practical example (replacing images, capturing JSON, rewriting a script)

from playwright.sync_api import Request, Route, sync_playwright

# List for storing all JSON responses
json_responses = []

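def handle_route(route: Route, request: Request):
    if request.resource_type == "image":
        # Point image requests at a replacement image.
        # The new URL must use the same protocol as the original request.
        route.continue_(url="https://wxhzhwxhzh.github.io/sao/fav2.png")
    else:
        # Let every other request through unchanged
        route.continue_()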

       
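def handle_route2(route):
    # Capture JSON payloads returned by the backend
    response = route.fetch()
    # Check whether the response is JSON
    content_type = response.headers.get("content-type", "")
    if "/json" in content_type:
        try:
            # Parse the JSON body and keep a copy
            json_data = response.json()
            json_responses.append(json_data)
            # Print what was captured
            print(f"\nCaptured JSON response: {response.url} (status: {response.status})")
            print("JSON data:")
            print(json_data)
        except Exception as e:
            print(f"Failed to parse JSON {response.url}: {e}")

    # Return the response that was already fetched;
    # calling continue_() here would send the request a second time
    route.fulfill(response=response)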


def handle_route3(route: Route, request: Request):
    # Intercept one of the page's own JS files and tamper with its content
    if "head.js" in request.url:
        response = route.fetch()  # the real script, as an APIResponse
        old_str = "window.Douban = "
        new_str = "console.log('骚神真帅！！！！！！');"  # statement to inject
        # Prepend the injected statement to the original assignment
        data = response.text().replace(old_str, new_str + old_str)
        print(data)
        # Reuse status and headers from the real response, overriding only the body
        route.fulfill(response=response, body=data)
    else:
        route.continue_()

def main():
    with sync_playwright() as p:
        # Launch the browser (Chromium here; p.firefox and p.webkit also work)
        browser = p.chromium.launch(headless=False)  # headless=False shows the browser window
        page = browser.new_page()
        page.route("**/*", handle_route3)  # intercept every request

        # Open the Douban home page
        page.goto("https://www.douban.com")

        page.wait_for_timeout(3000)  # wait 3 seconds

        # Print the page title
        print("Page title:", page.title())

        # Close the browser
        browser.close()

if __name__ == "__main__":
    main()