How to extract all assets in a cocos2d Android app using Frida

Note: all the code for this project is in my repository on GitHub.
 

Introduction


Cocos2D is a popular game engine for making mobile games. You can find more information on its official website, https://www.cocos.com/en. It is open source, and its source code is easy to find on GitHub. Frankly, open source has a dark side here: people can draw many hints and much inspiration from the source code when reverse engineering a built binary.

Frida is a powerful dynamic instrumentation toolkit; all of its code is here.

Now I will try to extract all the assets in a Cocos2D Android app using Frida. I chose a card game developed by a Chinese company, fknsg (放开那三国). You can download it here. I chose this game because it uses the Cocos2D engine and ships both ARMv7 (32-bit) and ARMv8 (64-bit) builds of its native binary, so I can test on more than one architecture; many Cocos2D Android games ship only an ARMv7 binary. The game's art is also very beautiful.

How to do it?

Inspect the APK

First, I downloaded the APK file and unzipped it. In the lib folder I found libgame.so, the compiled binary of the Cocos2D engine. The APK ships two builds of it: armeabi-v7a for 32-bit ARM and arm64-v8a for 64-bit ARM. The assets folder contains a great many files, and many of them appear to be encrypted, at least many of the .png and .lua files. I also noticed that many of the encrypted files start with the same 4 bytes, each with the value 0xfe. The following screenshot shows the content of one encrypted file.
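The four-byte marker described above can be checked across an unpacked APK with a short script. This is a minimal sketch, not part of the project's repository: the function names `has_magic` and `find_marked_files` are made up for illustration, and it assumes that a leading run of four 0xfe bytes is a reliable sign of an encrypted file in this particular game.

```python
from pathlib import Path
from typing import List

# Marker observed at the start of the encrypted files in this game.
MAGIC = b"\xfe\xfe\xfe\xfe"


def has_magic(path: Path) -> bool:
    """Return True if the file starts with the four 0xfe bytes."""
    with path.open("rb") as f:
        return f.read(len(MAGIC)) == MAGIC


def find_marked_files(root: Path) -> List[Path]:
    """Walk root recursively and collect the files that start with MAGIC."""
    return sorted(p for p in root.rglob("*") if p.is_file() and has_magic(p))
```

Calling `find_marked_files(Path("unpacked_apk/assets"))`, where the directory name is a hypothetical example, returns the paths that carry the marker, which gives a quick estimate of how many assets are encrypted.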


When I do reverse engineering work like this, I feel that 90% of the work is finding the useful functions and instructions, and 10% is coding. The binary I need to analyze is usually very large, and I don't need to understand every byte, so I need to find the useful bytes quickly. Often this is like looking for a needle in a haystack. First of all, I had to find a function that decrypts these files.
I used Hopper, a great disassembler, to disassemble libgame.so. libgame.so is a standard ELF file and, luckily, it has not been obfuscated. Most importantly, Hopper shows that it exports a great many functions, many of them with names prefixed by "cocos2d::". I found an interesting function, 'bt_decrypt', and from its name I suspected that it decrypts data. The following is its pseudocode:
But I did not want to use this function directly. I could not find it in the Cocos2D source code, so it is a custom function used only in this game. I wanted a Cocos2D function that returns decrypted data, so that the approach stays generic. Finally, I found two candidates:
  • cocos2d::CCFileUtilsAndroid::getFileData
  • cocos2d::ZipFile::getFileData
I inspected both with frida-trace. I found that cocos2d::CCFileUtilsAndroid::getFileData returns decrypted data, but cocos2d::ZipFile::getFileData does not. Note that you need a good internet connection when you debug this game; it seems the game has to contact its server to load itself. When I started the app without a network, it stayed on a black screen and, most importantly, it did not load libgame.so.


My own CModule

Frida provides CModule. This JavaScript API lets users write functions in C and call them from JavaScript. The C functions can call functions in the current process. Users can also write JavaScript functions and wrap them as NativeCallback objects so that C functions can call them. I tried the CModule API, but I did not find it very convenient, so I decided to write my own CModule.

First, I wrote a shared library with the NDK. I use NDK r15c; you can set the NDKPATH variable in jni/Makefile to point to your own NDK install path. Don't try to use the latest NDK version, as it causes many issues; maybe the game developers used an older NDK version. We need to set the LOCAL_ALLOW_UNDEFINED_SYMBOLS variable to true in the jni/Android.mk file. This lets the build succeed even when some symbols are unresolved. We will resolve these symbols when we load the shared library with Frida.

Second, I wrote a Python script, utils/so2tsmodule.py, that converts a shared library to a TypeScript module. The script parses the input shared library with LIEF to collect the information needed to load it with Frida, then writes this information to a TypeScript module file, which I generate with Jinja. The bytes that need to be loaded are stored as an array in the generated TypeScript module. This is no problem when your shared library is small. If your shared library is large, the generated TypeScript module becomes large too, which increases the compile time of the final TypeScript. In that case, you can push the shared library to your Android device and load the binary from that file instead.
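To make the bytes-to-array step concrete, here is a minimal sketch of how raw bytes can be rendered as a TypeScript array literal. It is not the actual utils/so2tsmodule.py: it uses only the Python standard library instead of LIEF and Jinja, it skips all ELF parsing, and the names `bytes_to_ts_array`, `render_ts_module`, and `soBytes` are assumptions chosen for illustration.

```python
def bytes_to_ts_array(data: bytes, per_line: int = 16) -> str:
    """Render bytes as a TypeScript array literal, per_line values per row."""
    rows = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        rows.append("    " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    return "[\n" + "\n".join(rows) + "\n]"


def render_ts_module(name: str, data: bytes) -> str:
    """Return TypeScript source that exports data as a number[] constant."""
    return (
        "// Generated file, do not edit.\n"
        f"export const {name}: number[] = {bytes_to_ts_array(data)};\n"
    )
```

Every input byte turns into several characters of source text, so the generated module is several times larger than the binary it embeds. That is why a large library is better pushed to the device and loaded from a file.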

Finally, I wrote a TypeScript module, `tsmodules/soutils.ts`, to help me load the shared library. The loadSo function in this module does the main work.

So far, my own CModule supports only the ARMv7 (32-bit) and ARMv8 (64-bit) architectures.

I will go into more detail on each step below.

JNI program to extract assets 

We need to get a list of the asset files, call cocos2d::CCFileUtilsAndroid::getFileData on every file, and write the returned content to storage on your Android device. Cocos2D does not use the AAsset_read function to read asset files. Instead, it accesses the APK file directly. An APK file is nothing but a zip file, so we can extract asset files from it with just a zip library. Of course, every Android process has permission to read its own APK file. The Cocos2D developers may consider this more efficient than the AAsset_* functions that Android provides.
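Because an APK is an ordinary zip archive, the asset list can also be previewed on a desktop machine before touching the device. The sketch below uses Python's standard zipfile module; the function names `list_asset_entries` and `read_entry_head` are hypothetical. It returns the stored, still encrypted bytes and does not decrypt anything.

```python
import zipfile
from typing import List


def list_asset_entries(apk_path: str, prefix: str = "assets/") -> List[str]:
    """Return the names of all file entries under prefix inside the APK."""
    with zipfile.ZipFile(apk_path) as apk:
        return sorted(
            info.filename
            for info in apk.infolist()
            if info.filename.startswith(prefix) and not info.is_dir()
        )


def read_entry_head(apk_path: str, name: str, size: int = 4) -> bytes:
    """Read the first size bytes of one entry, decompressed but not decrypted."""
    with zipfile.ZipFile(apk_path) as apk:
        with apk.open(name) as entry:
            return entry.read(size)
```

Comparing `read_entry_head(...)` against `b"\xfe\xfe\xfe\xfe"` for each listed entry shows which assets still need to go through the game's own decrypting function at run time.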
The main extraction function is in the jni/cocos2dExtractAssets.cpp file. The following is the code snippet:

This function takes two arguments. The first, baseaddress, is the base address of libgame.so in the current process. The second is a string that says where to put the extracted files.

Lines 33-41 get the pointer to a cocos2d::ZipFile object. This object is a singleton that Cocos2D does not export, so we need to compute its actual address as base + offset. The offset differs between the ARMv7 and ARMv8 (64-bit) builds.

The global ZipFile object holds a pointer to a std::map that contains the file info for every file in the APK. I need this pointer, and lines 46-56 get it. Why such an odd operation? I took the hint from the assembly of libgame.so. The following is the assembly code of the function cocos2d::ZipFile::fileExists (ARMv7 version):


At 0x1fe9f8, register r0 holds the pointer to the cocos2d::ZipFile object; at 0x1fea02, r0 should hold the pointer to the std::map object.

I only care about the encrypted files, so I defined a std::vector variable to store the names of the encrypted files. In lines 57-85, I iterate over the std::map to get the file names and call cocos2d::ZipFile::getFileData to get each file's content. This function does not decrypt data. I check the returned data: if the first 4 bytes are '\xfe\xfe\xfe\xfe', the file is encrypted, and I add its name to the vector. We need to free the returned buffer, as line 78 does. Because this game has so many asset files, I defined a macro, TEST_VERSION; if it is set to 1, the whole function processes only the first encrypted file. On line 97, _frida_hexdump is a function defined in TypeScript that simply prints a hexdump.

In lines 87-118, I iterate over the vector of encrypted file names and call cocos2d::CCFileUtilsAndroid::getFileData to get each file's content. This function decrypts the data, and I write all the plaintext to the output folder.

Frida script to call my JNI function

I wrote a TypeScript file named index.ts to call my JNI function. The following is the main function:
 
Line 18 loads the shared library into the current process's memory space, using the loadSo function from the soutils module I wrote. The first argument, "info", contains all the information about the shared library. The second argument passes functions written in TypeScript for the C++ code to call. The third argument is an array of libraries in which loadSo will look for unresolved symbols.

On line 27, the returned loadm.syms.test is the pointer to our loaded C++ function. We can call this function from the Frida script, passing it the base address of the loaded libgame.so and the output path. This game does not have permission to write data to external storage, so we need to put the output under its own data directory.


Further thoughts

So far, we have a way to call, from C++, functions that exist in the loaded modules. We only need to work out each function's prototype. Even now, I don't know the decryption algorithm; I only need to know how to call the relevant function correctly. We can do many things this way, at least on the Android platform. The only limit is your imagination. OK, that's it, enjoy.

