Bypassing DOMPurify with good old XML

June 19, 2024, 22:00:13

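Introduction

Hello, I'm RyotaK (@ryotkak), a security engineer at Flatt Security Inc.

Recently, @slonser_ discovered a bypass in DOMPurify that occurs when it sanitizes XML documents. After reviewing the patch, I found two more bypasses based on XML/HTML confusion, so I'm documenting them here.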

HTML != XML

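As @slonser_ wrote in his post, the parsing rules of HTML and XML differ slightly.

For example, an XML parser parses the following text as a single node, whereas an HTML parser recognizes the h1 tag.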

<?xml-stylesheet ><h1>Hello</h1> ?>

This is because XML defines the structure of a processing instruction as follows:

https://www.w3.org/TR/xml/#sec-pi

'<?' PITarget (S (Char* - (Char* '?>' Char*)))? '?>'

However, when the HTML parser encounters <?, it enters the bogus comment state:

https://html.spec.whatwg.org/#tag-open-state

U+003F QUESTION MARK (?)    This is an unexpected-question-mark-instead-of-tag-name parse error. Create a comment token whose data is the empty string. Reconsume in the bogus comment state.

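Because the bogus comment state uses > rather than ?> as its terminator, the HTML parser and the XML parser disagree on where a processing instruction ends.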

https://html.spec.whatwg.org/#bogus-comment-state

U+003E GREATER-THAN SIGN (>)   Switch to the data state. Emit the current comment token.
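The two termination rules can be modelled as a simple string search. The sketch below is a deliberately simplified illustration, not a real parser: the helper names `xmlPiEnd` and `htmlBogusCommentEnd` are made up for this example, and each one only locates where a construct that starts with `<?` would end under the rule quoted above.

```javascript
// Simplified model of the two termination rules quoted above.
// Neither function is a real parser; both helper names are hypothetical.

// XML: a processing instruction runs from "<?" to the first "?>".
function xmlPiEnd(input) {
  const close = input.indexOf("?>", 2);
  return close === -1 ? -1 : close + 2;
}

// HTML: "<?" opens a bogus comment, which ends at the first ">".
function htmlBogusCommentEnd(input) {
  const close = input.indexOf(">", 2);
  return close === -1 ? -1 : close + 1;
}

const payload = "<?xml-stylesheet ><h1>Hello</h1> ?>";

// Whatever follows the construct goes back to the regular tokenizer.
const xmlRest = payload.slice(xmlPiEnd(payload));
const htmlRest = payload.slice(htmlBogusCommentEnd(payload));

console.log(JSON.stringify(xmlRest));  // ""
console.log(JSON.stringify(htmlRest)); // "<h1>Hello</h1> ?>"
```

Under the XML rule the whole payload is consumed as one processing instruction, while under the HTML rule the h1 markup is left over and would be tokenized as a real element.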

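Because of this difference, if a sanitized XML document is later used inside an HTML document, an injected processing instruction can bypass the sanitizer.

Since DOMPurify did not scan processing instructions, @slonser_ managed to bypass the sanitizer by inserting the following payload: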

<?xml-stylesheet > <img src=x onerror="alert('DOMPurify bypassed!!!')"> ?>

Overview of the patch

To handle processing instructions correctly, DOMPurify applied the following patch:

diff --git a/src/purify.js b/src/purify.js
index 4594ba09..5b7bc2aa 100644
--- a/src/purify.js
+++ b/src/purify.js
@@ -909,7 +909,10 @@ function createDOMPurify(window = getGlobal()) {
       root.ownerDocument || root,
       root,
       // eslint-disable-next-line no-bitwise
-      NodeFilter.SHOW_ELEMENT | NodeFilter.SHOW_COMMENT | NodeFilter.SHOW_TEXT,
+      NodeFilter.SHOW_ELEMENT |
+        NodeFilter.SHOW_COMMENT |
+        NodeFilter.SHOW_TEXT |
+        NodeFilter.SHOW_PROCESSING_INSTRUCTION,
       null
     );
   };
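What the added flag changes can be modelled without a browser. The sketch below hard-codes the `NodeFilter` constants and the node type number from the DOM Standard so that it runs as plain JavaScript; `isVisited` is a hypothetical helper that mirrors the standard's rule that a traverser skips a node unless bit `nodeType - 1` is set in `whatToShow`. It illustrates the bitmask only and is not DOMPurify's code.

```javascript
// NodeFilter whatToShow constants, as defined in the DOM Standard.
const SHOW_ELEMENT = 0x1;
const SHOW_TEXT = 0x4;
const SHOW_PROCESSING_INSTRUCTION = 0x40;
const SHOW_COMMENT = 0x80;

// Node.PROCESSING_INSTRUCTION_NODE
const PROCESSING_INSTRUCTION_NODE = 7;

// Hypothetical helper: a traverser visits a node only if
// bit (nodeType - 1) is set in whatToShow.
function isVisited(whatToShow, nodeType) {
  return (whatToShow & (1 << (nodeType - 1))) !== 0;
}

const beforePatch = SHOW_ELEMENT | SHOW_COMMENT | SHOW_TEXT;
const afterPatch = beforePatch | SHOW_PROCESSING_INSTRUCTION;

console.log(isVisited(beforePatch, PROCESSING_INSTRUCTION_NODE)); // false
console.log(isVisited(afterPatch, PROCESSING_INSTRUCTION_NODE));  // true
```

With the old mask a processing instruction is never handed to the sanitizing callback at all, which is why it survived untouched.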

Because NodeFilter.SHOW_PROCESSING_INSTRUCTION is now specified, DOMPurify scans processing instructions and removes them if they are not allowed. So, what could be wrong with this patch?

Confusing nodeName

It turns out that a processing instruction returns the value specified in <?tag as its nodeName.

https://dom.spec.whatwg.org/#dom-node-nodename

The nodeName getter steps are to return the first matching statement, switching on the interface this implements:
[...]
ProcessingInstruction
    Its target.

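For example, accessing the nodeName property of a processing instruction written as <?tag ?> returns tag.

Because DOMPurify relies heavily on nodeName to decide whether a node is allowed, this causes confusion when nodes are sanitized:

src/purify.js, lines 992-1013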

    /* Now let's check the element's type and name */
    const tagName = transformCaseFunc(currentNode.nodeName);

    [...]

    /* Remove element if anything forbids its presence */
    if (!ALLOWED_TAGS[tagName] || FORBID_TAGS[tagName]) {
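The confusion can be reproduced with plain objects standing in for DOM nodes. In this sketch the allowlist contents, the mock nodes, and the `passesNameCheck` helper are all assumptions made for illustration; the only idea taken from the excerpt above is that the decision is made by looking up `nodeName` in an allowlist.

```javascript
// Mock nodes shaped like what an XML parser would produce.
// Only the two properties relevant here are modelled.
const ELEMENT_NODE = 1;
const PROCESSING_INSTRUCTION_NODE = 7;

const imgElement = { nodeType: ELEMENT_NODE, nodeName: "img" };
// Stands in for <?img ... ?>: nodeName is the PI target.
const imgPi = { nodeType: PROCESSING_INSTRUCTION_NODE, nodeName: "img" };
// Stands in for <?xml-stylesheet ... ?>.
const stylesheetPi = { nodeType: PROCESSING_INSTRUCTION_NODE, nodeName: "xml-stylesheet" };

// Hypothetical, tiny allowlist standing in for ALLOWED_TAGS.
const allowedTags = { img: true, h1: true };

// Name-only check in the spirit of the excerpt above (illustrative).
function passesNameCheck(node) {
  return Boolean(allowedTags[node.nodeName]);
}

console.log(passesNameCheck(imgElement));   // true
console.log(passesNameCheck(imgPi));        // true
console.log(passesNameCheck(stylesheetPi)); // false
```

A check that looks only at the name cannot tell an img element from a processing instruction whose target happens to be img.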

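Bypassing DOMPurify again with processing instructions

Since a processing instruction can carry an arbitrary node name, all we need to do is create a processing instruction whose target is an allowed tag name.

For example, the following processing instruction gets past DOMPurify when it is sanitized as an XML document: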

<?img a ?>

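As we saw earlier, HTML and XML parse processing instructions inconsistently.

Therefore, with the following XML we can bypass DOMPurify and execute alert(1) when the output is later used in an HTML document: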

<?img ><img src onerror=alert(1)>?>

You can confirm this with the following script and DOMPurify 3.0.10:

document.documentElement.innerHTML = DOMPurify.sanitize("<?img ><img src onerror=alert(1)>?>", {PARSER_MEDIA_TYPE: "application/xhtml+xml"})

Finding another bypass

To prevent the issue above, the following patch was applied, which removes all processing instructions.

diff --git a/src/purify.js b/src/purify.js
index 061ba1a8..1d984685 100644
--- a/src/purify.js
+++ b/src/purify.js
@@ -1009,6 +1009,12 @@ function createDOMPurify(window = getGlobal()) {
       return true;
     }
 
+    /* Remove any ocurrence of processing instructions */
+    if (currentNode.nodeType === 7) {
+      _forceRemove(currentNode);
+      return true;
+    }
+
     /* Remove element if anything forbids its presence */
     if (!ALLOWED_TAGS[tagName] || FORBID_TAGS[tagName]) {
       /* Check if we have a custom element to handle */

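Because this patch removes processing instructions entirely, the parser inconsistency around processing instructions can no longer be exploited.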
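A minimal sketch of the patched decision order, using mock objects rather than real DOM nodes. The `decide` function and the small allowlist are hypothetical; the point is only that the node type test runs before the name lookup, so the target of a processing instruction no longer matters.

```javascript
const PROCESSING_INSTRUCTION_NODE = 7; // Node.PROCESSING_INSTRUCTION_NODE

// Hypothetical allowlist standing in for ALLOWED_TAGS.
const allowedTags = { img: true, h1: true };

// Returns "remove" or "keep" for a mock node (illustrative only).
function decide(node) {
  // Patched behaviour: processing instructions are always removed.
  if (node.nodeType === PROCESSING_INSTRUCTION_NODE) {
    return "remove";
  }
  // Name-based allowlist check for everything else.
  return allowedTags[node.nodeName] ? "keep" : "remove";
}

console.log(decide({ nodeType: 7, nodeName: "img" })); // remove
console.log(decide({ nodeType: 1, nodeName: "img" })); // keep
```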

But are there other parsing inconsistencies?

While reading the XML specification, I noticed an interesting section:

https://www.w3.org/TR/xml/#sec-cdata-sect

CDATA sections may occur anywhere character data may occur; they are used to escape blocks of text containing characters that would otherwise be recognized as markup. CDATA sections begin with the string " <![CDATA[ " and end with the string " ]]> "

Luckily for me, CDATA sections have a separate NodeFilter option, and it was not enabled in DOMPurify.

https://dom.spec.whatwg.org/#callbackdef-nodefilter

const unsigned long SHOW_CDATA_SECTION = 0x8;

So all I had to do was find an inconsistency between the XML and HTML parsers.

At first glance, the HTML parser appears to parse CDATA sections in an XML-compatible way:

https://html.spec.whatwg.org/#cdata-sections

CDATA sections must consist of the following components, in this order:
   1. The string "<![CDATA[".
   2. Optionally, text, with the additional restriction that the text must not contain the string "]]>".
   3. The string "]]>".

However, further investigation showed that HTML supports CDATA sections only inside the SVG and MathML namespaces, not in the HTML namespace.

https://html.spec.whatwg.org/#markup-declaration-open-state

The string "[CDATA[" (the five uppercase letters "CDATA" with a U+005B LEFT SQUARE BRACKET character before and after)    Consume those characters. If there is an adjusted current node and it is not an element in the HTML namespace, then switch to the CDATA section state. Otherwise, this is a cdata-in-html-content parse error. Create a comment token whose data is the "[CDATA[" string. Switch to the bogus comment state.

If a CDATA section appears in the HTML namespace, the parser switches to the bogus comment state, which uses > rather than ]]> as its terminator.

https://html.spec.whatwg.org/#bogus-comment-state

U+003E GREATER-THAN SIGN (>)   Switch to the data state. Emit the current comment token.
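The CDATA rules quoted above can be modelled as a string search. This is a simplified sketch, not a tokenizer: `xmlCdataEnd` and `htmlCdataEnd` are hypothetical helpers, and the `inForeignContent` flag stands in for the spec's check that the adjusted current node is not an element in the HTML namespace.

```javascript
// Simplified model; both function names are hypothetical.
const CDATA_OPEN = "<![CDATA[";

// XML: a CDATA section runs from "<![CDATA[" to the first "]]>".
function xmlCdataEnd(input) {
  const close = input.indexOf("]]>", CDATA_OPEN.length);
  return close === -1 ? -1 : close + 3;
}

// HTML: the outcome depends on the namespace of the adjusted current node.
function htmlCdataEnd(input, inForeignContent) {
  if (inForeignContent) {
    // SVG or MathML: CDATA section state, which ends at "]]>".
    return xmlCdataEnd(input);
  }
  // HTML namespace: bogus comment, which ends at the first ">".
  const close = input.indexOf(">", CDATA_OPEN.length);
  return close === -1 ? -1 : close + 1;
}

const payload = "<![CDATA[ ><h1>Hello</h1> ]]>";

const xmlRest = payload.slice(xmlCdataEnd(payload));
const htmlRest = payload.slice(htmlCdataEnd(payload, false));
const svgRest = payload.slice(htmlCdataEnd(payload, true));

console.log(JSON.stringify(xmlRest));  // ""
console.log(JSON.stringify(htmlRest)); // "<h1>Hello</h1> ]]>"
console.log(JSON.stringify(svgRest));  // ""
```

Only the HTML-namespace case leaves markup behind, which matches the requirement that the sanitized XML be reused in ordinary HTML content for the mismatch to matter.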

So, as with processing instructions, the following XML creates an h1 tag when it is parsed with an HTML parser:

<![CDATA[ ><h1>Hello</h1> ]]>

As with processing instructions, this inconsistency allows DOMPurify to be bypassed with the following payload:

<![CDATA[ ><img src onerror=alert(1)> ]]>

You can confirm this with the following script and DOMPurify 3.0.11:

document.documentElement.innerHTML = DOMPurify.sanitize("<![CDATA[ ><img src onerror=alert(1)> ]]>", {PARSER_MEDIA_TYPE: "application/xhtml+xml"})

To fix this inconsistency, DOMPurify applied the following patch:

diff --git a/src/purify.js b/src/purify.js
index 1d984685..72c925a0 100644
--- a/src/purify.js
+++ b/src/purify.js
@@ -913,7 +913,8 @@ function createDOMPurify(window = getGlobal()) {
       NodeFilter.SHOW_ELEMENT |
         NodeFilter.SHOW_COMMENT |
         NodeFilter.SHOW_TEXT |
-        NodeFilter.SHOW_PROCESSING_INSTRUCTION,
+        NodeFilter.SHOW_PROCESSING_INSTRUCTION |
+        NodeFilter.SHOW_CDATA_SECTION,
       null
     );
   };
diff --git a/src/purify.js b/src/purify.js

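Because a CDATA section always has #cdata-section as its nodeName, this patch cannot be bypassed the way I bypassed the processing instruction patch, unless #cdata-section is explicitly allowed.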

https://dom.spec.whatwg.org/#dom-node-nodename

The nodeName getter steps are to return the first matching statement, switching on the interface this implements:
[...]
CDATASection
    "#cdata-section".
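The two pieces fit together for CDATA sections as follows: the new flag makes the traverser visit them, and their fixed `nodeName` then fails an allowlist lookup. The sketch hard-codes the DOM Standard constants so that it runs outside a browser; `isVisited`, `allowedTags`, and the mock node are hypothetical stand-ins, not DOMPurify internals.

```javascript
// NodeFilter whatToShow constants, as defined in the DOM Standard.
const SHOW_ELEMENT = 0x1;
const SHOW_TEXT = 0x4;
const SHOW_CDATA_SECTION = 0x8;
const SHOW_PROCESSING_INSTRUCTION = 0x40;
const SHOW_COMMENT = 0x80;

// Hypothetical helper: a traverser visits a node only if
// bit (nodeType - 1) is set in whatToShow.
function isVisited(whatToShow, nodeType) {
  return (whatToShow & (1 << (nodeType - 1))) !== 0;
}

// Mock CDATA section node; nodeType 4 is Node.CDATA_SECTION_NODE.
const cdataNode = { nodeType: 4, nodeName: "#cdata-section" };

const oldMask =
  SHOW_ELEMENT | SHOW_COMMENT | SHOW_TEXT | SHOW_PROCESSING_INSTRUCTION;
const newMask = oldMask | SHOW_CDATA_SECTION;

// Hypothetical allowlist standing in for ALLOWED_TAGS.
const allowedTags = { img: true, h1: true };

console.log(isVisited(oldMask, cdataNode.nodeType));   // false
console.log(isVisited(newMask, cdataNode.nodeType));   // true
console.log(Boolean(allowedTags[cdataNode.nodeName])); // false
```

Unlike a processing instruction target, the name here is not attacker-controlled, so no payload can make it match an allowed tag.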

Bypassing DOMPurify with good old XML: https://flatt.tech/research/posts/bypassing-dompurify-with-good-old-xml/

Originally published on the WeChat official account Ots安全: Bypassing DOMPurify with good old XML
