Android FART Unpacker: A Workflow Analysis

Posted by admin on 2020-10-21 18:00:50


1. Introduction

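On Android, the Java code a developer writes is ultimately compiled to bytecode that runs on the Android virtual machine. Ever since Android became mainstream, decompilers such as apktool and jadx have kept appearing and growing more capable. Bytecode compiled from Java offers almost no resistance to them; it is like a person walking naked through a crowd, with everything exposed to everyone. Every technique tends to produce its counterpart. Decompilers led to anti-decompilation techniques, which are generally called packing or app hardening. Packing an app puts clothes on that person: the "clothes" protect the app to some extent and make it harder to decompile. Packing in turn led to techniques that defeat it, namely the unpacking techniques this article analyzes.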

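Android has changed a great deal across its versions, both in appearance and internally. Early versions used the Dalvik virtual machine. ART was added in Android 4.4 but was not enabled by default, and from Android 5.0 onward ART replaced Dalvik as the default runtime. Because Dalvik and ART work differently, unpacking works differently inside each. This article looks at an unpacking scheme for ART: FART. Its overall idea is to unpack by actively invoking methods. The project is at https://github.com/hanbinglengyue/FART. FART is implemented by modifying a small number of Android source files. The modified source is built into a system image and flashed to a phone; once that phone boots, it serves as a dedicated unpacking device.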

2. Workflow analysis

FART's entry point is the performLaunchActivity function in frameworks/base/core/java/android/app/ActivityThread.java, so fartthread runs whenever an Activity of the app is launched:

private Activity performLaunchActivity(ActivityClientRecord r, Intent customIntent) {
    Log.e("ActivityThread", "go into performLaunchActivity");
    ActivityInfo aInfo = r.activityInfo;
    if (r.packageInfo == null) {
        r.packageInfo = getPackageInfo(aInfo.applicationInfo, r.compatInfo,
                Context.CONTEXT_INCLUDE_CODE);
    }
    ......
    // start the FART thread
    fartthread();
    ......
}

fartthread starts a thread that sleeps for one minute and then calls fart:

public static void fartthread() {
    new Thread(new Runnable() {
        @Override
        public void run() {
            try {
                Log.e("ActivityThread", "start sleep,wait for fartthread start......");
                Thread.sleep(1 * 60 * 1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            Log.e("ActivityThread", "sleep over and start fartthread");
            fart();
            Log.e("ActivityThread", "fart run over");
        }
    }).start();
}

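fart obtains the app's ClassLoader and looks up several classes and members through reflection. It reads the dexElements field of dalvik.system.DexPathList to get an array of dalvik.system.DexPathList$Element objects; each Element holds information such as the path of a dex file. It then iterates over dexElements, takes the DexFile object from each Element, reads that object's mCookie field, and passes the value to String[] getClassNameList(Object cookie) in DexFile to get the names of all classes in the dex file. Finally, it iterates over those class names and passes each one to loadClassAndInvoke.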

public static void fart() {
    ClassLoader appClassloader = getClassloader();
    List<Object> dexFilesArray = new ArrayList<Object>();
    Field pathList_Field = (Field) getClassField(appClassloader, "dalvik.system.BaseDexClassLoader", "pathList");
    Object pathList_object = getFieldOjbect("dalvik.system.BaseDexClassLoader", appClassloader, "pathList");
    Object[] ElementsArray = (Object[]) getFieldOjbect("dalvik.system.DexPathList", pathList_object, "dexElements");
    Field dexFile_fileField = null;
    try {
        dexFile_fileField = (Field) getClassField(appClassloader, "dalvik.system.DexPathList$Element", "dexFile");
    } catch (Exception e) {
        e.printStackTrace();
    }
    Class DexFileClazz = null;
    try {
        DexFileClazz = appClassloader.loadClass("dalvik.system.DexFile");
    } catch (Exception e) {
        e.printStackTrace();
    }
    Method getClassNameList_method = null;
    Method defineClass_method = null;
    Method dumpDexFile_method = null;
    Method dumpMethodCode_method = null;

    for (Method field : DexFileClazz.getDeclaredMethods()) {
        if (field.getName().equals("getClassNameList")) {
            getClassNameList_method = field;
            getClassNameList_method.setAccessible(true);
        }
        if (field.getName().equals("defineClassNative")) {
            defineClass_method = field;
            defineClass_method.setAccessible(true);
        }
        if (field.getName().equals("dumpMethodCode")) {
            dumpMethodCode_method = field;
            dumpMethodCode_method.setAccessible(true);
        }
    }
    Field mCookiefield = getClassField(appClassloader, "dalvik.system.DexFile", "mCookie");
    for (int j = 0; j < ElementsArray.length; j++) {
        Object element = ElementsArray[j];
        Object dexfile = null;
        try {
            dexfile = (Object) dexFile_fileField.get(element);
        } catch (Exception e) {
            e.printStackTrace();
        }
        if (dexfile == null) {
            continue;
        }
        if (dexfile != null) {
            dexFilesArray.add(dexfile);
            Object mcookie = getClassFieldObject(appClassloader, "dalvik.system.DexFile", dexfile, "mCookie");
            if (mcookie == null) {
                continue;
            }
            String[] classnames = null;
            try {
                classnames = (String[]) getClassNameList_method.invoke(dexfile, mcookie);
            } catch (Exception e) {
                e.printStackTrace();
                continue;
            } catch (Error e) {
                e.printStackTrace();
                continue;
            }
            if (classnames != null) {
                for (String eachclassname : classnames) {
                    loadClassAndInvoke(appClassloader, eachclassname, dumpMethodCode_method);
                }
            }
        }
    }
    return;
}
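The helpers getClassField and getFieldOjbect used by fart are not shown in this article. As a rough illustration only, the sketch below shows one way such a reflective field read can work on a plain JVM. FakeAppLoader, FakePathList and FakeElement are hypothetical stand-ins that mirror the pathList -> dexElements -> dexFile chain; they are not Android classes, and findField, readField and dexFilesOf are names made up for this sketch.

```java
import java.lang.reflect.Field;
import java.util.Arrays;

final class ReflectiveFieldRead {
    // Hypothetical stand-ins for BaseDexClassLoader / DexPathList / Element.
    static final class FakeElement {
        private final String dexFile;
        FakeElement(String dexFile) { this.dexFile = dexFile; }
    }

    static final class FakePathList {
        private final FakeElement[] dexElements;
        FakePathList(String... paths) {
            dexElements = new FakeElement[paths.length];
            for (int i = 0; i < paths.length; i++) {
                dexElements[i] = new FakeElement(paths[i]);
            }
        }
    }

    static class FakeBaseLoader {
        private final FakePathList pathList;
        FakeBaseLoader(String... paths) { pathList = new FakePathList(paths); }
    }

    static final class FakeAppLoader extends FakeBaseLoader {
        FakeAppLoader(String... paths) { super(paths); }
    }

    /** Finds a declared (possibly private) field by walking up the class hierarchy. */
    static Field findField(Class<?> type, String name) throws NoSuchFieldException {
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            try {
                Field f = c.getDeclaredField(name);
                f.setAccessible(true);
                return f;
            } catch (NoSuchFieldException ignored) {
                // not declared on this class, try the superclass
            }
        }
        throw new NoSuchFieldException(name);
    }

    /** Reads the named field from target, wrapping reflection failures. */
    static Object readField(Object target, String name) {
        try {
            return findField(target.getClass(), name).get(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Follows loader.pathList.dexElements[i].dexFile, as fart does with the real classes. */
    static String[] dexFilesOf(Object loader) {
        Object pathList = readField(loader, "pathList");
        Object[] elements = (Object[]) readField(pathList, "dexElements");
        String[] out = new String[elements.length];
        for (int i = 0; i < elements.length; i++) {
            out[i] = (String) readField(elements[i], "dexFile");
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(dexFilesOf(new FakeAppLoader("classes.dex", "classes2.dex"))));
    }
}
```

Walking up the superclass chain matters because pathList is declared on the base loader class rather than on the concrete loader the app actually uses.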

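Besides the class name, loadClassAndInvoke receives the ClassLoader and the Method object for dumpMethodCode. As the code above shows, dumpMethodCode is looked up on DexFile. The stock DexFile class has no such function; FART adds it. We will look at what dumpMethodCode does shortly; first, let's finish loadClassAndInvoke. Its job is simple: load the class by name, collect all of its declared constructors and methods, and call dumpMethodCode with each Constructor or Method object.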

public static void loadClassAndInvoke(ClassLoader appClassloader, String eachclassname, Method dumpMethodCode_method) {
    Log.i("ActivityThread", "go into loadClassAndInvoke->" + "classname:" + eachclassname);
    Class resultclass = null;
    try {
        resultclass = appClassloader.loadClass(eachclassname);
    } catch (Exception e) {
        e.printStackTrace();
        return;
    } catch (Error e) {
        e.printStackTrace();
        return;
    }
    if (resultclass != null) {
        try {
            Constructor<?> cons[] = resultclass.getDeclaredConstructors();
            for (Constructor<?> constructor : cons) {
                if (dumpMethodCode_method != null) {
                    try {
                        dumpMethodCode_method.invoke(null, constructor);
                    } catch (Exception e) {
                        e.printStackTrace();
                        continue;
                    } catch (Error e) {
                        e.printStackTrace();
                        continue;
                    }
                } else {
                    Log.e("ActivityThread", "dumpMethodCode_method is null ");
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        } catch (Error e) {
            e.printStackTrace();
        }
        try {
            Method[] methods = resultclass.getDeclaredMethods();
            if (methods != null) {
                for (Method m : methods) {
                    if (dumpMethodCode_method != null) {
                        try {
                            dumpMethodCode_method.invoke(null, m);
                        } catch (Exception e) {
                            e.printStackTrace();
                            continue;
                        } catch (Error e) {
                            e.printStackTrace();
                            continue;
                        }
                    } else {
                        Log.e("ActivityThread", "dumpMethodCode_method is null ");
                    }
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        } catch (Error e) {
            e.printStackTrace();
        }
    }
}
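The enumeration step can be tried on an ordinary JVM without any of the FART plumbing. The following is a minimal sketch, not FART code: a Consumer stands in for the dumpMethodCode call, and MemberWalker, Sample and forEachExecutable are hypothetical names used only here.

```java
import java.lang.reflect.Executable;
import java.util.function.Consumer;

final class MemberWalker {
    // Hypothetical sample class with two constructors and two methods.
    static final class Sample {
        Sample() { }
        Sample(int x) { }
        void a() { }
        private static int b(String s) { return s.length(); }
    }

    /**
     * Loads className through loader and hands every declared constructor
     * and method to sink. Returns how many members were visited, or 0 when
     * the class cannot be loaded.
     */
    static int forEachExecutable(ClassLoader loader, String className, Consumer<Executable> sink) {
        Class<?> cls;
        try {
            cls = loader.loadClass(className);
        } catch (ClassNotFoundException | LinkageError e) {
            return 0; // like FART, give up on this class and move on
        }
        int visited = 0;
        for (Executable c : cls.getDeclaredConstructors()) {
            sink.accept(c);
            visited++;
        }
        for (Executable m : cls.getDeclaredMethods()) {
            sink.accept(m);
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        forEachExecutable(MemberWalker.class.getClassLoader(),
                Sample.class.getName(),
                e -> System.out.println(e));
    }
}
```

getDeclaredConstructors and getDeclaredMethods return private members as well but not inherited ones, which fits FART's goal of visiting every method body defined in the dex.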

As mentioned above, dumpMethodCode lives in the DexFile class, whose full path is libcore/dalvik/src/main/java/dalvik/system/DexFile.java. It is declared as follows:

private static native void dumpMethodCode(Object m);

It is a native method. Its implementation is in art/runtime/native/dalvik_system_DexFile.cc:

static void DexFile_dumpMethodCode(JNIEnv* env, jclass, jobject method) {
  ScopedFastNativeObjectAccess soa(env);
  if (method != nullptr) {
    ArtMethod* artmethod = ArtMethod::FromReflectedMethod(soa, method);
    myfartInvoke(artmethod);
  }
  return;
}

In DexFile_dumpMethodCode, method is the java.lang.reflect.Method (or Constructor) object passed in from loadClassAndInvoke. This Java-level object is passed to FromReflectedMethod to obtain an ArtMethod pointer, which is then passed to myfartInvoke.

myfartInvoke is implemented in art/runtime/art_method.cc:

extern "C" void myfartInvoke(ArtMethod * artmethod) SHARED_LOCKS_REQUIRED(Locks::mutator_lock_) {
    JValue *result = nullptr;
    Thread *self = nullptr;
    uint32_t temp = 6;
    uint32_t *args = &temp;
    uint32_t args_size = 6;
    artmethod->Invoke(self, args, args_size, result, "fart");
}

The notable detail in myfartInvoke is that self is set to a null pointer and passed to ArtMethod's Invoke function.

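Invoke is also in art/runtime/art_method.cc. At the start of Invoke, a check on the self parameter has been added: if self is null, the call came from FART; otherwise it is a normal call made by the system. When self is null, Invoke calls dumpArtMethod and returns immediately: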

void ArtMethod::Invoke(Thread * self, uint32_t * args,
                       uint32_t args_size, JValue * result,
                       const char *shorty) {
    if (self == nullptr) {
        dumpArtMethod(this);
        return;
    }
    ......
}

dumpArtMethod is where the dex is actually dumped:

extern "C" void dumpArtMethod(ArtMethod * artmethod) SHARED_LOCKS_REQUIRED(Locks::mutator_lock_) {
    char *dexfilepath = (char *) malloc(sizeof(char) * 2000);
    if (dexfilepath == nullptr) {
        LOG(INFO) << "ArtMethod::dumpArtMethodinvoked,methodname:"
                  << PrettyMethod(artmethod).c_str()
                  << "malloc 2000 byte failed";
        return;
    }
    int fcmdline = -1;
    char szCmdline[64] = { 0 };
    char szProcName[256] = { 0 };
    int procid = getpid();
    sprintf(szCmdline, "/proc/%d/cmdline", procid);
    fcmdline = open(szCmdline, O_RDONLY, 0644);
    if (fcmdline > 0) {
        read(fcmdline, szProcName, 256);
        close(fcmdline);
    }

    if (szProcName[0]) {
        const DexFile *dex_file = artmethod->GetDexFile();
        const char *methodname = PrettyMethod(artmethod).c_str();
        const uint8_t *begin_ = dex_file->Begin();
        size_t size_ = dex_file->Size();

        memset(dexfilepath, 0, 2000);
        int size_int_ = (int) size_;

        memset(dexfilepath, 0, 2000);
        sprintf(dexfilepath, "%s", "/sdcard/fart");
        mkdir(dexfilepath, 0777);

        memset(dexfilepath, 0, 2000);
        sprintf(dexfilepath, "/sdcard/fart/%s", szProcName);
        mkdir(dexfilepath, 0777);

        memset(dexfilepath, 0, 2000);
        sprintf(dexfilepath, "/sdcard/fart/%s/%d_dexfile.dex", szProcName, size_int_);
        int dexfilefp = open(dexfilepath, O_RDONLY, 0666);
        if (dexfilefp > 0) {
            close(dexfilefp);
            dexfilefp = 0;
        } else {
            dexfilefp = open(dexfilepath, O_CREAT | O_RDWR, 0666);
            if (dexfilefp > 0) {
                write(dexfilefp, (void *) begin_, size_);
                fsync(dexfilefp);
                close(dexfilefp);
            }
        }
        // second half starts here
        const DexFile::CodeItem * code_item = artmethod->GetCodeItem(); // (1)
        if (LIKELY(code_item != nullptr)) {
            int code_item_len = 0;
            uint8_t *item = (uint8_t *) code_item;
            if (code_item->tries_size_ > 0) { // (2)
                const uint8_t *handler_data = (const uint8_t *) (DexFile::GetTryItems(*code_item, code_item->tries_size_));
                uint8_t *tail = codeitem_end(&handler_data);
                code_item_len = (int)(tail - item);
            } else {
                code_item_len = 16 + code_item->insns_size_in_code_units_ * 2;
            }
            memset(dexfilepath, 0, 2000);
            int size_int = (int) dex_file->Size();    // Length of data
            uint32_t method_idx = artmethod->get_method_idx();
            sprintf(dexfilepath, "/sdcard/fart/%s/%d_%ld.bin", szProcName, size_int, gettidv1());
            int fp2 = open(dexfilepath, O_CREAT | O_APPEND | O_RDWR, 0666);
            if (fp2 > 0) {
                lseek(fp2, 0, SEEK_END);
                memset(dexfilepath, 0, 2000);
                int offset = (int) (item - begin_);
                sprintf(dexfilepath,
                        "{name:%s,method_idx:%d,offset:%d,code_item_len:%d,ins:",
                        methodname, method_idx, offset, code_item_len);
                int contentlength = 0;
                while (dexfilepath[contentlength] != 0)
                    contentlength++;
                write(fp2, (void *) dexfilepath, contentlength);
                long outlen = 0;
                char *base64result = base64_encode((char *) item, (long) code_item_len, &outlen);
                write(fp2, base64result, outlen);
                write(fp2, "};", 2);
                fsync(fp2);
                close(fp2);
                if (base64result != nullptr) {
                    free(base64result);
                    base64result = nullptr;
                }
            }
        }
    }
    if (dexfilepath != nullptr) {
        free(dexfilepath);
        dexfilepath = nullptr;
    }
}

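dumpArtMethod first reads the virtual file /proc/<pid>/cmdline to get the process name for the current pid, and creates the directory /sdcard/fart/<process name> for its output. This is why the app must be granted permission to write to external storage before unpacking. Next it calls ArtMethod's GetDexFile to get a DexFile pointer, that is, the dex that contains this ArtMethod, and uses DexFile's Begin and Size functions to get the start address and size of the dex in memory. It then uses write to save the in-memory dex to a file whose name ends in _dexfile.dex.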
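/proc/<pid>/cmdline holds the process arguments separated by NUL bytes, so printing the zero-initialised buffer with %s yields only the first entry, which for an Android app is normally its process name. A small sketch of that step and of the resulting dump path, written in Java for illustration; CmdlineName, processName and dumpPath are hypothetical names, and the sample bytes are made up.

```java
import java.nio.charset.StandardCharsets;

final class CmdlineName {
    /** Returns the text before the first NUL byte, i.e. what printf("%s") would print. */
    static String processName(byte[] cmdline) {
        int end = 0;
        while (end < cmdline.length && cmdline[end] != 0) {
            end++;
        }
        return new String(cmdline, 0, end, StandardCharsets.UTF_8);
    }

    /** Mirrors the "/sdcard/fart/%s/%d_dexfile.dex" format string from dumpArtMethod. */
    static String dumpPath(String procName, int dexSize) {
        return "/sdcard/fart/" + procName + "/" + dexSize + "_dexfile.dex";
    }

    public static void main(String[] args) {
        // Made-up cmdline content: two NUL-separated entries.
        byte[] raw = "com.example.app\0--flag\0".getBytes(StandardCharsets.UTF_8);
        System.out.println(dumpPath(processName(raw), 4096));
    }
}
```

On a Linux or Android device the bytes could come from Files.readAllBytes(Paths.get("/proc/self/cmdline")). Because the file name is derived from the dex size, a second dex of exactly the same size in the same process maps to the same file name and is skipped by the existence check in dumpArtMethod.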

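The function is not done yet: its second half dumps the method's CodeItem. Some readers may wonder why the CodeItem still needs to be dumped when the first half has already dumped the dex. For some packers the first half is enough to dump the whole dex. For packers that extract method bodies, however, the method bodies in the dumped dex are still filled with nop instructions, i.e. they are empty. FART dumps each method's CodeItem as well so that the user can manually repair those empty method bodies in the dumped dex.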

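Now for the second half of dumpArtMethod. It touches on the structure of dex files; if you are unfamiliar with it, read along with the dex format documentation. At comment (1), a CodeItem is obtained from the ArtMethod. At comment (2), the size of the CodeItem is computed based on its tries_size_, the number of try_item entries: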

(1) If tries_size_ is not 0, the CodeItem has try_item entries, so the end address of the CodeItem has to be computed:

const uint8_t *handler_data = (const uint8_t *) (DexFile::GetTryItems(*code_item, code_item->tries_size_));
uint8_t *tail = codeitem_end(&handler_data);
code_item_len = (int)(tail - item);

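How does codeitem_end compute the end address of the CodeItem?

Passing tries_size_ as the second argument to GetTryItems skips all try_item entries and yields the address of encoded_catch_handler_list, which is then passed to codeitem_end: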

uint8_t *codeitem_end(const uint8_t ** pData) {
    uint32_t num_of_list = DecodeUnsignedLeb128(pData);
    for (; num_of_list > 0; num_of_list--) {
        int32_t num_of_handlers = DecodeSignedLeb128(pData);
        int num = num_of_handlers;
        if (num_of_handlers <= 0) {
            num = -num_of_handlers;
        }
        for (; num > 0; num--) {
            DecodeUnsignedLeb128(pData);
            DecodeUnsignedLeb128(pData);
        }
        if (num_of_handlers <= 0) {
            DecodeUnsignedLeb128(pData);
        }
    }
    return (uint8_t *) (*pData);
}

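codeitem_end first reads how many encoded_catch_handler structures the encoded_catch_handler_list contains. If the count is not 0, it walks every encoded_catch_handler, reads how many encoded_type_addr_pair structures it contains, and skips them all. When a handler's size is non-positive, its pairs are followed by a catch_all_addr, which is skipped too. That skips the whole encoded_catch_handler_list, so the pData value the function returns is the end address of the CodeItem.

With the end address in hand, subtracting the CodeItem's start address from it gives the CodeItem's actual size.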

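(2) If tries_size_ is 0, there are no try_item entries and the size of the CodeItem can be computed directly as the 16-byte fixed header plus 2 bytes per instruction code unit: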

code_item_len = 16 + code_item->insns_size_in_code_units_ * 2;
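Both cases can be reproduced offline against a byte array, which is handy when checking a dumped CodeItem by hand. The following is a sketch under the field layout given in the dex format documentation (tries_size at offset 6, insns_size at offset 12, 8-byte try_item entries, 2 bytes of padding when there are tries and insns_size is odd). It assumes the CodeItem starts on a 4-byte boundary. CodeItemLength and its methods are hypothetical names, not part of FART or ART.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class CodeItemLength {
    private static final int HEADER_SIZE = 16;   // fixed fields before insns[]
    private static final int TRY_ITEM_SIZE = 8;  // start_addr(4) + insn_count(2) + handler_off(2)

    /** Returns the byte length of the code_item that starts at offset start in dex. */
    static int codeItemLength(byte[] dex, int start) {
        ByteBuffer buf = ByteBuffer.wrap(dex).order(ByteOrder.LITTLE_ENDIAN);
        int triesSize = buf.getShort(start + 6) & 0xFFFF;
        int insnsSize = buf.getInt(start + 12);          // counted in 16-bit code units
        int pos = start + HEADER_SIZE + insnsSize * 2;
        if (triesSize == 0) {
            return pos - start;                          // case (2): 16 + insns_size * 2
        }
        if ((insnsSize & 1) != 0) {
            pos += 2;                                    // padding so tries[] is 4-byte aligned
        }
        pos += triesSize * TRY_ITEM_SIZE;                // skip every try_item
        int[] cursor = {pos};                            // now at encoded_catch_handler_list
        int handlerCount = readUleb128(dex, cursor);
        for (int i = 0; i < handlerCount; i++) {
            int size = readSleb128(dex, cursor);
            for (int j = Math.abs(size); j > 0; j--) {
                readUleb128(dex, cursor);                // type_idx
                readUleb128(dex, cursor);                // addr
            }
            if (size <= 0) {
                readUleb128(dex, cursor);                // catch_all_addr
            }
        }
        return cursor[0] - start;
    }

    static int readUleb128(byte[] data, int[] cursor) {
        int result = 0, shift = 0, b;
        do {
            b = data[cursor[0]++] & 0xFF;
            result |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return result;
    }

    static int readSleb128(byte[] data, int[] cursor) {
        int result = 0, shift = 0, b;
        do {
            b = data[cursor[0]++] & 0xFF;
            result |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        if (shift < 32 && (b & 0x40) != 0) {
            result |= -1 << shift;                       // sign-extend
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] item = new byte[22];
        item[12] = 3;                                    // insns_size = 3, tries_size = 0
        System.out.println(codeItemLength(item, 0));
    }
}
```

The padding step corresponds to the alignment that GetTryItems applies before returning the try_item address, which is why the C++ code does not need to handle it explicitly.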

Once the size of the CodeItem is known, several variables are formatted into dexfilepath:

sprintf(dexfilepath,
    "{name:%s,method_idx:%d,offset:%d,code_item_len:%d,ins:",
    methodname,
    method_idx,
    offset,
    code_item_len);

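methodname: the name of the method

method_idx: comes from a function added by FART, uint32_t get_method_idx(){ return dex_method_index_; }. It returns dex_method_index_, which is the method's index in method_ids

offset: the offset of the method's CodeItem from the start of the dex file

code_item_len: the length of the CodeItem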

Once assembled, the data is written to a file with the .bin suffix:

write(fp2, (void *) dexfilepath, contentlength);
long outlen = 0;
char *base64result = base64_encode((char *) item, (long) code_item_len, &outlen);
write(fp2, base64result, outlen);
write(fp2, "};", 2);

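The contents of dexfilepath above are plain text and can be written as-is. The bytecode in the CodeItem is binary, which does not read well when written directly, so FART base64-encodes it before writing.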
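When repairing a dump, the .bin records have to be read back. The sketch below parses the record format shown above. It assumes that base64_encode emits the standard Base64 alphabet with padding and no line breaks, which the article does not confirm. FartBinParser and Record are hypothetical names. Because the name field is the PrettyMethod string and can itself contain commas and spaces, the parser matches up to the literal ",method_idx:" rather than splitting on commas.

```java
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FartBinParser {
    static final class Record {
        final String name;
        final int methodIdx;
        final int offset;
        final byte[] codeItem;

        Record(String name, int methodIdx, int offset, byte[] codeItem) {
            this.name = name;
            this.methodIdx = methodIdx;
            this.offset = offset;
            this.codeItem = codeItem;
        }
    }

    // offset is printed with %d; a sign is allowed in case it comes out negative (assumption).
    private static final Pattern RECORD = Pattern.compile(
            "\\{name:(.*?),method_idx:(\\d+),offset:(-?\\d+),code_item_len:(\\d+),ins:([A-Za-z0-9+/=]*)\\};");

    /** Parses every well-formed record in text; malformed or inconsistent ones are skipped. */
    static List<Record> parse(String text) {
        List<Record> out = new ArrayList<>();
        Matcher m = RECORD.matcher(text);
        while (m.find()) {
            byte[] code;
            try {
                code = Base64.getDecoder().decode(m.group(5));
            } catch (IllegalArgumentException e) {
                continue; // not valid standard Base64
            }
            if (code.length != Integer.parseInt(m.group(4))) {
                continue; // decoded size disagrees with code_item_len
            }
            out.add(new Record(m.group(1),
                    Integer.parseInt(m.group(2)),
                    Integer.parseInt(m.group(3)),
                    code));
        }
        return out;
    }

    public static void main(String[] args) {
        String ins = Base64.getEncoder().encodeToString(new byte[] {1, 2, 3, 4});
        String text = "{name:void a.B.c(int, long),method_idx:7,offset:1234,code_item_len:4,ins:" + ins + "};";
        for (Record r : parse(text)) {
            System.out.println(r.name + " @" + r.offset + " (" + r.codeItem.length + " bytes)");
        }
    }
}
```

Checking the decoded length against code_item_len is a cheap sanity test for records that were truncated when the process died mid-write.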

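At this point the analysis may look complete: active invocation, the whole-dex dump, and the per-method CodeItem dump have all been covered. But there is one more piece of FART logic left. If you have used FART to unpack an app, you will have noticed that the dumped files also include dex files whose names end in _execute.dex. How are these produced?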

This code is also in art/runtime/art_method.cc:

extern "C" void dumpDexFileByExecute(ArtMethod * artmethod) SHARED_LOCKS_REQUIRED(Locks::mutator_lock_) {
    char *dexfilepath = (char *) malloc(sizeof(char) * 2000);
    if (dexfilepath == nullptr) {
        LOG(INFO) << "ArtMethod::dumpDexFileByExecute,methodname:"
                  << PrettyMethod(artmethod).c_str()
                  << "malloc 2000 byte failed";
        return;
    }
    int fcmdline = -1;
    char szCmdline[64] = { 0 };
    char szProcName[256] = { 0 };
    int procid = getpid();
    sprintf(szCmdline, "/proc/%d/cmdline", procid);
    fcmdline = open(szCmdline, O_RDONLY, 0644);
    if (fcmdline > 0) {
        read(fcmdline, szProcName, 256);
        close(fcmdline);
    }

    if (szProcName[0]) {
        const DexFile *dex_file = artmethod->GetDexFile();
        const uint8_t *begin_ = dex_file->Begin();    // Start of data.
        size_t size_ = dex_file->Size();    // Length of data.

        memset(dexfilepath, 0, 2000);
        int size_int_ = (int) size_;

        memset(dexfilepath, 0, 2000);
        sprintf(dexfilepath, "%s", "/sdcard/fart");
        mkdir(dexfilepath, 0777);

        memset(dexfilepath, 0, 2000);
        sprintf(dexfilepath, "/sdcard/fart/%s", szProcName);
        mkdir(dexfilepath, 0777);

        memset(dexfilepath, 0, 2000);
        sprintf(dexfilepath, "/sdcard/fart/%s/%d_dexfile_execute.dex", szProcName, size_int_);
        int dexfilefp = open(dexfilepath, O_RDONLY, 0666);
        if (dexfilefp > 0) {
            close(dexfilefp);
            dexfilefp = 0;
        } else {
            dexfilefp = open(dexfilepath, O_CREAT | O_RDWR, 0666);
            if (dexfilefp > 0) {
                write(dexfilefp, (void *) begin_, size_);
                fsync(dexfilefp);
                close(dexfilefp);
            }
        }
    }
    if (dexfilepath != nullptr) {
        free(dexfilepath);
        dexfilepath = nullptr;
    }
}

dumpDexFileByExecute resembles the first half of dumpArtMethod: it dumps the dex file as a whole. So where is dumpDexFileByExecute called?

A search shows that at the top of art/runtime/interpreter/interpreter.cc, FART declares dumpDexFileByExecute inside the art namespace:

namespace art {
extern "C" void dumpDexFileByExecute(ArtMethod* artmethod);
namespace interpreter {
    ......
}
}

The same file also contains the call to dumpDexFileByExecute:

static inline JValue Execute(Thread* self, const DexFile::CodeItem* code_item,
                             ShadowFrame& shadow_frame, JValue result_register) {
  if (strstr(PrettyMethod(shadow_frame.GetMethod()).c_str(), "<clinit>") != nullptr) {
    dumpDexFileByExecute(shadow_frame.GetMethod());
  }
  ......
}

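Execute checks whether the method name contains <clinit>, i.e. whether the method is a static initializer, to decide whether to call dumpDexFileByExecute. If it does, dumpDexFileByExecute is called with the ArtMethod pointer.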
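For readers less familiar with <clinit>: it is the name the compiler gives to a class's static initializer. The sketch below runs on a standard JVM, not on ART, and only illustrates ordinary Java semantics: ClassLoader.loadClass alone does not run the static initializer, while the first active use of the class does. ClinitDemo, Lazy and EVENTS are hypothetical names, and nothing here reflects how a particular packer behaves.

```java
import java.util.ArrayList;
import java.util.List;

final class ClinitDemo {
    /** Records when the static initializer below actually runs. */
    static final List<String> EVENTS = new ArrayList<>();

    static final class Lazy {
        static {
            // The compiler turns this block into the method named <clinit>.
            EVENTS.add("Lazy.<clinit>");
        }

        static void touch() { }
    }

    /** Returns {event count after loadClass, event count after the first static call}. */
    static int[] run() {
        String name = Lazy.class.getName();   // a class literal does not initialize the class
        try {
            ClinitDemo.class.getClassLoader().loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        int afterLoad = EVENTS.size();
        Lazy.touch();                          // first active use runs <clinit>
        int afterUse = EVENTS.size();
        return new int[] {afterLoad, afterUse};
    }

    public static void main(String[] args) {
        int[] counts = run();
        System.out.println("after loadClass: " + counts[0] + ", after first use: " + counts[1]);
    }
}
```

This suggests why the Execute hook fires as the app runs and initializes its own classes, independently of the one-minute timer that drives the fart thread.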

dumpDexFileByExecute dumps the dex as a whole and can be seen as a complement to dumpArtMethod: when dumpArtMethod does not yield the dex you want, dumpDexFileByExecute may pleasantly surprise you.

3. Conclusion

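Many thanks to the author of FART for open-sourcing it; it gives people a solid approach to fighting app packers under ART. In theory a FART unpacking device can handle most packers, but there are exceptions that you will need to work through yourself.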

4. References

https://bbs.pediy.com/thread-252630.htm

https://source.android.google.cn/devices/tech/dalvik/dex-format

This article was first published on the WeChat official account 安全客 (Anquanke): Android FART脱壳机流程分析
