Reversing a game implemented in a rom-to-eng translator in HackTM 2020
TL;DR
Solving a video game wrapped inside an obfuscated translation engine.Challenge Description:
“The binary is still in development. The final version would have let you play boggle”
All the files for this chall, including my solve, can be found here.
Introduction
After the first 24 hours of HackTM-20 playing with pwndevils, I was finished working on all the “Bear” reversing challenges with @fish. I decided it was time to confront my fears and finally reverse a Rust binary.
Overall Understanding
Within the first few seconds of getting this binary we do a simple file on the two files provided:
Luckily it’s an ELF. Next we do a simple strace
on the binary to understand
how this binary is interacting with the provided “plaintext” file.
We get overwhelmed with output since it seems that the binary is reading in the ascii file line by line (and it has a lot of lines):
Playing with the binary we found out we have a translation program that converts
English to Romanian by doing a table lookup in the ascii file. It’s easy to see
this is true since they put a delimiter <BR>
between each English and Romanian
translation – also, the entire file was loaded into memory earlier. Time to
jump into this binaries assembly code!
Discovering important functions
After about 0.01 seconds of looking at this binary in any decompiler, IDA Pro in my case, we see that this binary was compiled from the language Rust. That is unfortunate, seeing that rust includes all it’s library’s function code in the binary – making our binary bloated as hell.
Yeah we have around 2599 functions in this binary, so we want to figure out
where to start static analysis by seeing where this binary does the important
things in dynamic analysis. This is also possible by following calls to main
.
Causing a keyboard interrupt in the middle of translation allows us to see some
basic information about how the binary is getting it’s translations. We can see
from the backtrace that the main package/class in this binary is the dex
class. We can see the important functions dex::translate()
, dex::main()
, and
dex::get_input()
. Let’s take a look into dex::main
function.
Taking a look at offset 0xEDD5
:
We see a data dereference loaded into rax
and then compared against 9. If it
is 9 then we enter the elusive extra
function. Without 100% knowing where that
data references, we just try a simple test:
Yup, we found the secret mode of the binary. Recall the hint said we are dealing
with a program that was not finished being developed, so we can assume that this
area was not meant to be accessed yet. Our new target is the hex::extra
function. Taking a look at it’s internals we see offset 0xC6C4
:
This causes us to leave the function. Assuming that we will never be able to
satisfy this with normal user input, we can just hex patch this section out by
modifying the jz
to a jnz
. Now we can get to the pro mode:
How do we get the flag?
Once we enter the PRO Mode, we can vaguely see that this is the right path to a
flag since we see "HackDex!\n\n===> GG!"
at offset 0xE453
– of course since
this is rust, strings are not null terminated. From here on, it may be helpful to
see the decompilation, so I will assume you have IDA Pro. If you don’t, Ghidra
will have similar results for these sections. We continue!
Executing the binary in the pro-mode, we see a rejection for input:
At offset 0xC813
we see the use of dex::get_input()
:
Along with the get_input()
, we see that a get_valid_words
is called with
each instance of the get_inputs
(for almost all of them). If this is the
validation, we can assume we need to get valid words to get the flag.
We see the output of the valid words used in the contains_key
map function for
a series of condition checks:
Using a similar method to patching the pro-edition check we can make this check pass each time. After patching we try the test again:
As you can see, the check is run 6 times. We can confirm this by looking at the cross references to the function. So we need to input 6 “correct” words to get the flag. Now we need to understand what makes a word “valid.”
Valid Words
At this point, we need to explore the dex::extra
function and find things that
look interesting in mangling our input. We find two such functions at 0xE312
and 0xE46A
:
Yup, a sha256
sum, and a chacha
cipher. Let’s work from the latter first.
With some google searching and some back tracking in the code, we discover that
something we input is used as the key to the cipher. You can guess what the
information is in the cipher text (the flag).
But notice we only make it to the chacha
if we make it passed a check:
I’ve annotated some of the variables in this, but it still looks terrible. Like
actual trash. Before fully understand what this if
statement is doing, let’s
go back to the former mentioned sha256
digest.
Looking at the call to the sha256 Digest
we can see an input buffer filled
with “dynamic” looking strings before digesting:
From looking at this line, I realised that out input was SHA256ed then saved to
the output buffer, then used in the former if
statement:
This if
statement references the output of the SHA256
sum. If you’re picking
up what I’m putting down, you are guessing the input to the SHA256 is our
inputs, and you would be right! From the gruesome IDA code above, we can see the
output of the sha
compared to some hex string. Following the above psuedo
operations by IDA, we use the below strings to get the expected sum:
Following the operations, we get the expected sum as:
0xF550BAA8068D9C17669E140626A9D7BF13EF0A66662AEB5910FC406BE196A287
.
To recap, we have six strings that are inputted from us, then are sha256 hashed.
If the output of the hash is:
0xF550BAA8068D9C17669E140626A9D7BF13EF0A66662AEB5910FC406BE196A287
then we have inputted the correct strings and we decrypt the stored flag using
the chacha
cipher.
Playing some Boggle
After figuring all of that out, I was stumped for little since the input space
of 6 strings was a little too large to bruteforce for an expected hash. Out of
nowhere @fish, the reversing god himself, bestowed upon me that the call of
every get_input
looked like it was setting up an array in memory. That’s when
he put together that these tables must be the boggle tables reference in the
hint.
So I took a look at the initialization of the arrays, and he was damn right!
This code may look a little confusing, but you can see that v9
is being used
as the base of this array, and setup as a series of characters… as is v8
and
v6
. We have start breaking apart the memsets to discover a table being
created:
That’s a boggle table! SOOOOO, we must be playing boggle as inputs to our hashing algorithm.
If you follow every call to get_input
you will notice that something similar
is happening before each call. Below are the tables from each call:
We finally approach the endgame! We need to find out what strings are possible
for boggle table, then try every combination to the SHA256
sum. After trying
this initially, and failing, I speculated that we needed to use one more piece
of information to finish this boi… the hackdex dictionary file!
Getting that Flag
Here is the full recap of the rundown to solving for the flag:
- We discovered the pro version
- We needed 6 strings that sha to
0xF550BAA8068D9C17669E140626A9D7BF13EF0A66662AEB5910FC406BE196A287
- We discovered the boggle tables used to generate the strings
- The strings can only be strings located in the
hacktm.hdex
file.
After doing this, we get the final message:
As we all know, CTFs are not only about winning. They are also about learning,
team work, overcoming yourself, passion and making friends. Have fun!
HackTM{wh4t_4_cur10us_pow3r_w0rds_h4v3}
The main portion of the solve is below:
The full solve script can be found here
Thanks where thanks is due
Big thanks to @fish for the help in this solve!