I recently participated in the MRMCD CTF. My favourite challenge was called “Friendly Machine”.
It consisted of a Python script which reads code from a Base64-encoded JSON-encoded array. The array itself looks something like this:
[
{
"ZeiteesohpiefeeyuHah" : "start",
"Jeicheidahmeichetaik" : "ZeiteesohpiefeeyuHah"
},
{
"sebeeluoCaedohlaehoh" : "ZERO",
"IeCilahWaishaibiemoo" : 0,
"Jeicheidahmeichetaik" : "ayahshecieleeYeingis"
},
{
"IeCilahWaishaibiemoo" : 0,
"sebeeluoCaedohlaehoh" : "RES",
"Jeicheidahmeichetaik" : "ayahshecieleeYeingis"
},
{
"ZeiteesohpiefeeyuHah" : "lencheck_start",
"Jeicheidahmeichetaik" : "ZeiteesohpiefeeyuHah"
},
[...]
Hmmm, that kinda looks like variable assignments, labels, etc.? And sure enough, the main friendly machine code has a dictionary for variables and based on the entry in the current position in the code (yes, there are jumps, so it’s not necessarily linear) assignments or operations happen. Our goal is to end up in a position where we return 0, since then the flag is correct.
First I set out to see if I can add some debug output to the execution, but that turned out to be rather confusing than helpful. Static analysis it is, then. I wrote a script to output the code in a more readable form:
i = 0
for i in range(len(code)):
if code[i]["Jeicheidahmeichetaik"] == "bohxudohMeiteipiVaeZ":
print(code[i]["yuGhoxeebaivaiteifai"] + "=pwbyte")
elif code[i]["Jeicheidahmeichetaik"] == "chahghoaThoariaCowoh":
print("ret " + str(code[i]["shumeesaiXoohigheari"]))
elif code[i]["Jeicheidahmeichetaik"] == "ayahshecieleeYeingis":
print(code[i]["sebeeluoCaedohlaehoh"] + "=" + str(code[i]["IeCilahWaishaibiemoo"]))
elif code[i]["Jeicheidahmeichetaik"] == "DaweeyeiZaiceemeitah":
print(code[i]["iesheiQuiphaipohquei"] + "=" + code[i]["Koobaicahxaexeicohno"] + "+" + code[i]["OhNgaesiequievaijaca"])
elif code[i]["Jeicheidahmeichetaik"] == "geethahshiuxiyeitooH":
print(code[i]["SheixienaigeeSaeHahC"] + "=" + code[i]["looheThedohsouquoogo"] + "-" + code[i]["UaYaDaeciekeemeehein"])
elif code[i]["Jeicheidahmeichetaik"] == "uDohngaephaethahngah":
print(code[i]["ietaiviexuaniequeZie"] +"="+ code[i]["saichuqueiShieRaeYie"] + "^" + code[i]["RahThiefudeimahhohch"])
elif code[i]["Jeicheidahmeichetaik"] == "AhkiexaZeishieKohqui":
print(code[i]["eepuozeeviexoopieMoi"] + "=" + code[i]["aageenuxeLaeBaidoaru"] + "|" + code[i]["PeGoawoowiuthoobaaTh"])
elif code[i]["Jeicheidahmeichetaik"] == "riatheihoxooziitahGo":
print(code[i]["eishaBeiwiYahSiexaem"] + "=" + code[i]["IsichaikuaNeiHahRaiH"] + "&" + code[i]["thuyaecenaethiPochie"])
elif code[i]["Jeicheidahmeichetaik"] == "ieZieyiechooTeilaexe":
for equahSohNeohoonohphu in code:
if equahSohNeohoonohphu.has_key("ZeiteesohpiefeeyuHah"):
if equahSohNeohoonohphu["ZeiteesohpiefeeyuHah"] == code[i]["aNaeNeeyooCeezaiGeeb"]:
print("jmp " + str(code.index(equahSohNeohoonohphu)+1) + ' if ' + code[i]["ozeeleephuiGaechaiSh"] + '==0')
break
else:
print("nop")
i += 1
This leads to the following “code” (first few lines):
1 nop
2 ZERO=0
3 RES=0
4 nop
5 ONE=1
6 COUNT=0
7 nop
8 x=pwbyte
9 x=x+ONE
10 jmp 13 if x==0
11 COUNT=COUNT+ONE
12 jmp 7 if ZERO==0
13 nop
14 x=28
15 x=COUNT-x
16 jmp 21 if x==0
17 jmp 18 if ZERO==0
18 nop
19 o=-1
20 ret o
21 nop
22 ohhayeexongoakaeVuph=0
23 jmp 24 if ZERO==0
24 nop
Note that “pwbyte” represents “read a byte from the input and return -1 if we
read beyond the string length”. So we read byte by byte and increase COUNT
by
ONE
. Line 14 to 16 show us that our flag needs to be 28 characters long,
since otherwise we would return -1.
Let’s continue:
25 A=pwbyte
26 B=77
27 C=A-B
28 RES=RES|C
29 A=pwbyte
30 B=82
31 C=A-B
32 RES=RES|C
33 A=pwbyte
34 B=77
35 C=A-B
36 RES=RES|C
Oh, 77, 82, 77, or M
, R
, M
again. This looks good! And from the equations
we can see that our input needs to be exactly these values in order to keep
RES
(which will be returned at the very end) nicely at 0.
The code continues similarly, but gets a bit more complex:
[...]
49 X=pwbyte
50 t=7
51 Y=X-t
52 t=90
53 C=Y^t
54 RES=RES|C
55 X=pwbyte
56 t=15
57 Y=X-t
58 t=80
59 C=Y^t
[...]
79 X=pwbyte
80 t=999
81 Y=t-X
82 t=900
83 C=Y^t
84 RES=RES|C
85 X=pwbyte
86 t=1
87 Y=t+X
88 t=102
89 C=Y-t
90 RES=RES|C
[...]
One could solve all these things algebraically, but luckily for me my
“decompiler” outputs syntactically valid Python, so I was lazy and brute-forced
each character by looping over possible pwbyte
values and checking when RES
ended up being 0.
During the CTF I did this manually with a bit of copy-and-paste and running python, but for the sake of “AUTOMATE ALL THE THINGS!!111ELF”, here’s a script that does the same:
#!/usr/bin/env python3
import sys
START_CODE = "for pwbyte in range(128):\n\tRES = 0\n"
code = open('code', 'r').readlines()
current_block = START_CODE
# start at line 25, after the length check
for i in range(24, len(code)):
current_block += "\t" + code[i]
if 'RES=' in code[i]: # RES gets assigned, we want this to be 0
current_block += "\tif RES == 0:\n"
current_block += "\t\tprint(chr(pwbyte), end='')\n"
sys.stderr.write("Current code block:\n" + current_block)
exec(current_block) # don't run on untrusted input ;-)
current_block = START_CODE
print()
Running it gives us the flag:
$ ./bruteforce.py 2>/dev/null
MRMCD{a_processor_in_python}