Laisky's Notes

Record and share interesting information.

contact: [email protected]

02:59 · January 14, 2026 · Wednesday

https://laisky.notion.site/Agent-Skills-Comprehensive-Guide-2e4ba4011a868055b2b0e2e128da1538?source=copy_link

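I took a look at Anthropic's Agent Skills design. By defining a set of file structures on the filesystem, it lets an agent load the relevant skill descriptions and scripts on demand (Progressive Disclosure).

A skill defines a domain capability as metadata + instructions + scripts. At startup the agent loads the metadata of every skill, then loads instructions on demand, and the instructions in turn direct the agent to invoke scripts. The contents of the scripts are never loaded; only their output enters the context.

This three-layer lazy-loading design amounts to RAG + tools: it reduces context usage and also gives the agent the ability to call tools.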
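
To make the three layers concrete, here is a minimal TypeScript sketch. The `Skill` shape and the `SkillContext` class are hypothetical names of mine, not Anthropic's actual format; real skills are files on disk, which I model in memory here for brevity.

```typescript
// Hypothetical in-memory model of a skill (real skills are files on disk).
interface Skill {
  name: string;
  description: string;            // layer 1: metadata, always loaded
  instructions: string;           // layer 2: loaded only when the skill is needed
  run: (input: string) => string; // layer 3: script body, never loaded into context
}

class SkillContext {
  // Everything the model would "see".
  readonly context: string[] = [];
  private readonly skills: Skill[];

  constructor(skills: Skill[]) {
    this.skills = skills;
    // Layer 1: load the metadata of every skill at startup.
    for (const s of skills) {
      this.context.push(`[meta] ${s.name}: ${s.description}`);
    }
  }

  // Layer 2: pull one skill's instructions into the context on demand.
  activate(name: string): void {
    this.context.push(`[instructions] ${this.find(name).instructions}`);
  }

  // Layer 3: run the script; only its output enters the context.
  execute(name: string, input: string): void {
    this.context.push(`[output] ${this.find(name).run(input)}`);
  }

  private find(name: string): Skill {
    const skill = this.skills.find((s) => s.name === name);
    if (!skill) throw new Error(`unknown skill: ${name}`);
    return skill;
  }
}
```

With two skills registered, the context holds two metadata lines after startup and grows only when a skill is activated or executed.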

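That said, I think it is a pity that Anthropic chose to build a new wheel instead of improving MCP, which adds more confusion to the agent world.

What Skills does could be achieved with minor tweaks to MCP, simply by adding two tools:

- find_skills(query): look up the list of skill metadata relevant to the query
- describe_skill(skill_id): fetch the instructions for the given skill_id. The instructions tell the agent how to complete the task by calling other tools.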
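
A rough sketch of what the two tools might look like, written as plain TypeScript functions. The `SkillEntry` shape, the sample registry, and the substring matching are my own assumptions for illustration; registering the functions with an actual MCP server is left out.

```typescript
interface SkillEntry {
  id: string;
  description: string;  // metadata returned by find_skills
  instructions: string; // returned only by describe_skill
}

// Hypothetical registry contents, for illustration only.
const registry: SkillEntry[] = [
  {
    id: "pdf-extract",
    description: "Extract text and tables from PDF files",
    instructions: "Call the pdf_to_text tool, then summarize the result.",
  },
  {
    id: "csv-report",
    description: "Build summary reports from CSV data",
    instructions: "Call the csv_stats tool for each column.",
  },
];

// find_skills(query): return metadata only, never the instructions.
function findSkills(query: string): Array<{ id: string; description: string }> {
  const q = query.toLowerCase();
  return registry
    .filter((s) => s.id.includes(q) || s.description.toLowerCase().includes(q))
    .map(({ id, description }) => ({ id, description }));
}

// describe_skill(skill_id): return the full instructions for one skill.
function describeSkill(skillId: string): string {
  const skill = registry.find((s) => s.id === skillId);
  if (!skill) throw new Error(`unknown skill_id: ${skillId}`);
  return skill.instructions;
}
```

The agent would call `findSkills` first, pick an id from the result, and only then pay the context cost of `describeSkill`.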

In fact, plenty of people have already published skills-mcp tools along these lines; searching GitHub for skills-mcp turns up many of them.

00:42 · December 25, 2025 · Thursday

https://laisky.notion.site/msanft-CVE-2025-55182-Explanation-and-full-RCE-PoC-for-CVE-2025-55182-2c2ba4011a8681b89b60cf02827a6276?source=copy_link

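The critical server-side RCE vulnerability in next/react that I mentioned earlier, CVE-2025-55182.

The attacker sends two chunks. Chunk 0 is the attack payload and chunk 1 is a normal chunk. Then:

1. A custom then attaches chunk.prototype.then to chunk 0, which turns chunk 0 into a thenable, i.e. a Promise-like object that can be executed
2. In chunk 1, $@0 makes chunk 0 the return value of decodeReplyFromBusboy, where it gets awaited (any thenable is treated as a Promise and executed when awaited)
3. In chunk 0, a custom React state-machine status triggers the read of _response inside initializeModelChunk
4. In chunk 0, $B makes React call response._formData.get(). But chunk 0 supplies its own _formData and _get inside _response, so the operation that was supposed to read the _response body becomes code execution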
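
The thenable behavior that steps 1 and 2 rely on is plain JavaScript semantics and can be shown in isolation. This is only a sketch of the language mechanism, not of React's code; `thenable`, `calls`, and `consume` are names I made up.

```typescript
// Records what happens, so the effect of `await` is observable.
const calls: string[] = [];

// A plain object with a `then` method is a thenable.
const thenable = {
  then(resolve: (value: string) => void): void {
    calls.push("then invoked by await");
    resolve("value chosen by the thenable");
  },
};

async function consume(input: unknown): Promise<unknown> {
  // The caller only means to wait for a value, but awaiting any object
  // that has a callable `then` runs that method with resolve/reject.
  return await input;
}
```

Awaiting `consume(thenable)` yields the string picked inside `then`, and `calls` shows that the attacker-shaped method ran. Whoever controls the shape of an awaited object controls code that runs during the await.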

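As I see it, the serious risk lies in two places:

1. Users are allowed to reach the Function constructor through the prototype. This can be curbed by only permitting access to own properties.
2. $@0 returns the chunk directly, and React treats that chunk as a trusted internal object whose state is controlled by status and whose data is controlled by _response and _get. In other words, untrusted external input directly controls both state and data. I think this design problem is far more serious than the unauthorized prototype access.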
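
The own-property restriction from point 1 can be sketched as follows. The colon-separated path format and both function names are simplifications I am assuming for illustration; this is not React's actual resolver.

```typescript
// Naive resolver: follows every key, including inherited ones.
function resolveUnsafe(root: unknown, path: string): unknown {
  let cur: any = root;
  for (const key of path.split(":")) {
    cur = cur[key];
  }
  return cur;
}

// Hardened resolver: refuses any key that is not an own property.
function resolveOwnOnly(root: unknown, path: string): unknown {
  let cur: unknown = root;
  for (const key of path.split(":")) {
    const isObjectLike =
      (typeof cur === "object" && cur !== null) || typeof cur === "function";
    if (!isObjectLike || !Object.prototype.hasOwnProperty.call(cur, key)) {
      throw new Error(`refusing non-own property: ${key}`);
    }
    cur = (cur as Record<string, unknown>)[key];
  }
  return cur;
}
```

With the naive version, `resolveUnsafe({}, "constructor:constructor")` walks from the object to `Object` and then to `Function`, which is enough to build arbitrary code from a string. `resolveOwnOnly` throws on the first hop, while ordinary data paths such as `"a:b"` on `{ a: { b: 1 } }` still resolve.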

00:51 · December 23, 2025 · Tuesday

https://www.notion.so/laisky/Cloudflare-outage-on-November-18-2025-2bcba4011a8681e79964ff8933d3c5b2

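Cloudflare's report on the 2025-11-18 outage. The edge service enforces strict memory limits, and within it the bot module caps the number of rule rows it can handle. When it received a rule file that exceeded the cap, it panicked. In the new FL2 system, a panic in the bot module did not degrade only the bot service; the whole request path panicked, producing 5xx errors across all sites. The older FL system handled this well: it degraded only the bot score service and did not interrupt customer traffic.

My personal takeaways:

1. A panic in the bot module is not the problem in itself. It helps surface errors early, but it must be isolated.
2. After a bot module panic, FL2 should have raised an alert and degraded the service instead of panicking as a whole (Blast Radius)
3. Chaos engineering should test every kind of interaction, such as oversized files and malformed formats. No critical service should trust external input, even when that input comes from a friendly party.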
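
Takeaways 1 and 2 can be sketched in a few lines: the module fails loudly, and the caller contains the failure. Every name and number here (`MAX_RULES`, `scoreBot`, `handleRequest`, the placeholder score) is hypothetical and has nothing to do with Cloudflare's actual code.

```typescript
const MAX_RULES = 200; // hypothetical preallocated limit

// The module fails fast on input it cannot handle.
function scoreBot(rules: string[]): number {
  if (rules.length > MAX_RULES) {
    throw new Error(`rule file has ${rules.length} rows, limit is ${MAX_RULES}`);
  }
  return 30; // placeholder score
}

interface Response {
  status: number;
  botScore: number | null; // null means the bot service is degraded
}

// The caller isolates the failure: alert, degrade, keep serving.
function handleRequest(rules: string[], alert: (msg: string) => void): Response {
  let botScore: number | null = null;
  try {
    botScore = scoreBot(rules);
  } catch (err) {
    alert(`bot module failed: ${(err as Error).message}`);
  }
  return { status: 200, botScore };
}
```

An oversized rule file then costs the request its bot score and triggers one alert, but the request itself is still served.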
