## Side-channel attack in pyc program reversing

### A simple example

Recently I was reversing a heavily obfuscated pyc program. It was full of junk code, and recovering the logic of every function would have been very time-consuming and laborious. While reviewing the function calls, though, I had an interesting idea. Here is a simple example that illustrates it.

    import random
    import base64

    def checkstr(_str,encrypt_str):
        random.seed(233)
        encodestr = ""
        for i in _str:
            tmp = chr(ord(i) ^ random.randint(0,255))
            encodestr += tmp
        try:
            if(encrypt_str == base64.b64encode(str(encodestr.encode('hex'))).encode('hex')):
                return 1
        except:
            return 0

    def checklen(_str):
        if len(_str) <=0 :
            return 0
        else:
            return 1

    def test():
        encrypt_str = "5a47466a4e575931596a673d" #test
        _str = raw_input()
        if checklen(_str):
            if checkstr(_str,encrypt_str):
                print 'ok'
                return 1
            else:
                print 'error'
                return 0
        else:
            print 'len error'

    test()


Next, I apply some code obfuscation and compile this code into a pyc file that is difficult to decompile.

First of all, we need to know in advance roughly what the program does. The example above, for instance, is an authentication program.

So we know that the program must eventually execute an instruction that checks whether two strings are equal. (In other situations the comparison may happen elsewhere, for example in middleware or in a database, but those cases are easy to find in logs.)

    if func1(your_input) == func2(encrypted_password)


Now we compile a patched Python interpreter that prints the left and right operands of each string comparison to stdout.

The code for the string comparison function is in Objects/stringobject.c.

We change the code like this:

    string_richcompare(PyStringObject *a, PyStringObject *b, int op)
    {
        ...

        if (op == Py_EQ) {
            /* Supporting Py_NE here as well does not save
               much time, since Py_NE is rarely used.  */
            /* Added: dump both operands of the == comparison. */
            printf("left string : %s\n",a->ob_sval);
            printf("right string: %s\n",b->ob_sval);
            if (Py_SIZE(a) == Py_SIZE(b)
                && (a->ob_sval[0] == b->ob_sval[0]
                && memcmp(a->ob_sval, b->ob_sval, Py_SIZE(a)) == 0)) {
                result = Py_True;
            } else {
                result = Py_False;
            }
            goto out;
        }
        ...
    }



After compiling it, we get a Python interpreter that prints the strings being compared.

We then use this interpreter to run the obfuscated example file.
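The patched interpreter prints every string equality check, including the ones it performs internally, so it helps to collect the output with a small script. The following is a minimal Python 3 sketch, not the tooling used for this article. It assumes the two `printf` formats from the patch above; `run_and_capture`, `parse_leaks`, `interesting`, the interpreter path and the `min_len` threshold are all hypothetical names and values chosen for illustration.

```python
import re
import subprocess

# Matches the two printf formats added to string_richcompare in the patch.
LEAK_RE = re.compile(r"^(left|right) string\s*: (.*)$")


def parse_leaks(output):
    """Pair up consecutive 'left string' / 'right string' lines."""
    pairs, left = [], None
    for line in output.splitlines():
        m = LEAK_RE.match(line)
        if not m:
            continue  # ordinary program output, not a leak line
        side, value = m.groups()
        if side == "left":
            left = value
        elif left is not None:
            pairs.append((left, value))
            left = None
    return pairs


def run_and_capture(interpreter, pyc_path, user_input):
    """Run the target under the patched interpreter and return leaked pairs.

    `interpreter` is the path to the patched build (hypothetical, for
    example "./python"); `user_input` is fed to the program's stdin.
    """
    proc = subprocess.run(
        [interpreter, pyc_path],
        input=user_input.encode() + b"\n",
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    return parse_leaks(proc.stdout.decode("latin-1"))


def interesting(pairs, min_len=16):
    """Keep only comparisons where one side is long (assumed threshold)."""
    return [(l, r) for l, r in pairs if len(l) >= min_len or len(r) >= min_len]
```

Filtering by length is only a heuristic: it assumes the encoded password is longer than the short identifiers the interpreter compares on its own.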

Now we can see a static string in the output: 5a47466a4e575931596a673d.

    if func1(your_input) == func2(encrypted_password)


But we don't know what func1 and func2 are.

We can use marshal to take a quick look at all the names the code references, including the functions it calls.

    >>> code.co_names
    ('sys', 'zlib', 'base64', 'marshal', '_getframe', 'f_code', 'yield finally', 'co_code', 'continue as', 'len', '^ + dict', 'from --', 'elif &&', 'as as assert', 'range', '/ with', 'chr', 'ord', 'loads', 'decompress', 'b64decode', '&& isdecoded with', 'True')
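For reference, here is a rough sketch of how a `code` object like the one above can be obtained and walked. It is written for Python 3 and builds a fake pyc in memory so that it runs on its own; `load_code` and `all_names` are hypothetical helper names. Two assumptions to keep in mind: the header is 8 bytes for Python 2.7 (magic plus mtime) and 16 bytes for Python 3.7+, and marshal data has to be loaded by the same Python version that wrote it.

```python
import marshal


def load_code(pyc_bytes, header_size=8):
    """Skip the pyc header and unmarshal the top-level code object.

    header_size: 8 for Python 2.7, 16 for Python 3.7+ (assumption: the
    caller knows which interpreter produced the file).
    """
    return marshal.loads(pyc_bytes[header_size:])


def all_names(code):
    """Collect co_names from a code object and every nested code object."""
    names = list(code.co_names)
    for const in code.co_consts:
        if hasattr(const, "co_names"):  # nested function or class body
            names.extend(all_names(const))
    return names


# Self-contained demo: build a fake "pyc" in memory instead of reading a file.
source = "import base64\ndef f(s):\n    return base64.b64encode(s)\n"
fake_pyc = b"\0" * 8 + marshal.dumps(compile(source, "<demo>", "exec"))

names = all_names(load_code(fake_pyc))
print(names)
```

Walking `co_consts` matters for obfuscated files, because the interesting names often live in nested code objects rather than at module level.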


We can also use strace to see which library modules are loaded after the user input is read.

    strace python aaa.pyc 2>&1 | grep "python2.7.*pyc" | grep -v "No such"


The last module loaded is hex_codec.py, so we inject some code into it to print useful information.

We can see that hex_encode is called twice.

A base64 string also shows up in the output, so we can instrument the base64 functions in the same way.
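The same kind of instrumentation can be sketched without editing library files. Note that this is a swapped-in technique: the article edits hex_codec.py inside the interpreter's library directory, while the Python 3 sketch below wraps functions at runtime. `traced` and `call_log` are hypothetical names, and the two calls at the end only imitate what the target program does.

```python
import base64
import binascii
import functools

call_log = []  # (function name, positional args, return value), in call order


def traced(func):
    """Wrap func so every call is recorded and printed before returning."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        call_log.append((func.__name__, args, result))
        print("%s%r -> %r" % (func.__name__, args, result))
        return result
    return wrapper


# Patch the module attributes so later lookups such as
# base64.b64encode(...) go through the wrapper.
base64.b64encode = traced(base64.b64encode)
binascii.hexlify = traced(binascii.hexlify)

# Imitate the target: hex-encode some bytes, then base64-encode the result.
inner = binascii.hexlify(b"\xda\xc5\xf5\xb8")
outer = base64.b64encode(inner)
```

Runtime wrapping only catches callers that look the function up through the module attribute after the patch is installed. Editing the library file, as done here for hex_codec.py, avoids that limitation for a target you cannot modify.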

Now we can easily recover the password.

Because the program is simple, we can guess its logic by trying a few different inputs.

We know the final encrypted string is 5a47466a4e575931596a673d

after hex_decode => ZGFjNWY1Yjg=

after b64decode => dac5f5b8

after hex_decode => \xda\xc5\xf5\xb8
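The three decoding steps above can be reproduced in a few lines. This is a Python 3 sketch using `binascii` and `base64` in place of the Python 2 `'hex'` codec; the variable names are only illustrative.

```python
import base64
import binascii

leaked = "5a47466a4e575931596a673d"  # static string seen in the comparison

step1 = binascii.unhexlify(leaked)  # undo the outer hex encoding
step2 = base64.b64decode(step1)     # undo the base64 layer
step3 = binascii.unhexlify(step2)   # undo the inner hex encoding

print(step1, step2, step3)
```

The result of the last step is 4 bytes long, which suggests a 4-character password if the transformation preserves length.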

We don't need to understand the code logic at all; we can just brute-force the input byte by byte.
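A byte-by-byte search might look like the sketch below. It assumes that output byte i depends only on input byte i, which is something you would confirm by feeding a few different inputs and watching the leaked strings. The `oracle` function is a stand-in for running the obfuscated file under the patched interpreter and reading back the input-dependent side of the comparison. It is written for Python 3, whose `random.randint` does not produce the same sequence as Python 2, so the toy `secret` is recovered here, not the article's real password. All function names are hypothetical.

```python
import base64
import binascii
import random


def oracle(candidate):
    """Stand-in for the target: the string it would compare for this input."""
    random.seed(233)
    xored = bytes(b ^ random.randint(0, 255) for b in candidate)
    return binascii.hexlify(base64.b64encode(binascii.hexlify(xored))).decode()


def unwrap(leak):
    """Undo the hex -> base64 -> hex wrapping to get the raw bytes."""
    return binascii.unhexlify(base64.b64decode(binascii.unhexlify(leak)))


def brute_force(target_leak, alphabet=range(32, 127)):
    """Recover the input one byte at a time (at most 95 tries per byte)."""
    target = unwrap(target_leak)
    found = bytearray()
    for pos in range(len(target)):
        for guess in alphabet:
            trial = bytes(found) + bytes([guess])
            if unwrap(oracle(trial))[pos] == target[pos]:
                found.append(guess)
                break
        else:
            raise ValueError("no byte in alphabet matches position %d" % pos)
    return bytes(found)


# Toy target so the sketch checks itself.
secret = b"s3cr3t"
recovered = brute_force(oracle(secret))
print(recovered)
```

Because each position is solved independently, the cost grows linearly with the password length instead of exponentially.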

### Others

I think this kind of side-channel attack can be used in a black-box setting against obfuscated bytecode programs produced by many interpreted languages. It is well suited to programs that are inconvenient to debug or that disable debugging.
