Automate Solving Anti-Debug Binary With Angr

cothan published on
8 min, 1482 words

Tags: angr

AngrControlFlow - Naive binary

You can grab the challenges here.

A small note on reversing large binaries: when you look at any graph, read it from the bottom up. Starting at the end and working back to where it starts gives you a sense of where the flow goes.

When a crypto guy looks at this graph, two things come to mind:

  • Non-constant time comparison, vulnerable to timing attack.
  • Simple comparison. Easy to reverse.

Let me remind you once again: anything that can be solved with Z3 can be solved with Angr, because Angr uses Z3 as its internal symbolic solver. In practice we rarely touch Z3 directly in Angr; we use Claripy as an abstraction layer over it.

The more I use Angr, the more I realize that Claripy is better tailored to reverse engineering with bit vectors, while Z3 leans toward the mathematical side. By this I mean there is no performance gain, but the implementation is more comfortable. (I have used Z3 long enough that both work for me.)

About the binaries:

  • No anti-debug (if there were, I would patch it; later in this post I will show how to hook anti-debug functions instead of patching)
  • No SIMD instruction

So here is my plan to solve the challenge:

  • Decompile the code with IDA and solve it with Z3 (the last resort)
  • Apply automated binary analysis with Angr
  • Use Intel Pin (Pintool) to count instructions: each time we pass a comparison check, the number of executed instructions increases, so we can recover the input one check at a time. However, because the comparison walks the array in non-consecutive order, we would have to customize the Pintool.
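
The instruction-counting idea in the last bullet can be sketched without Pin at all. The toy below is only an illustration under stated assumptions: `leaky_compare` and `recover` are hypothetical names, the "instruction count" is simulated by a step counter, and the comparison walks the input in order (the real binary compares out of order, which is exactly why a stock Pintool would need customizing).

```python
import string

def leaky_compare(guess, secret):
    """Early-exit comparison. Returns (equal, steps); `steps` stands in
    for the instruction count an instrumentation tool would report."""
    steps = 0
    for g, s in zip(guess, secret):
        steps += 1
        if g != s:
            return False, steps
    return len(guess) == len(secret), steps

def recover(oracle, length, alphabet=string.printable):
    """Recover the secret one character at a time from the step count."""
    known = ""
    for pos in range(length):
        padding = alphabet[0] * (length - pos - 1)

        def score(ch):
            equal, steps = oracle(known + ch + padding)
            # More steps means one more check passed; `equal` breaks the
            # tie on the last character, where every guess runs full length.
            return (steps, equal)

        known += max(alphabet, key=score)
    return known
```

For example, `recover(lambda g: leaky_compare(g, "flag{pin}"), 9)` rebuilds the string from the counts alone, which is the same signal the Pintool approach relies on.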

Let’s go with Angr first. If we can’t make it, we will be back to Z3.

As usual, we load the binary, let it run at main(), define a good point and bad point, and pray Angr gets to the winning branch.

import angr

p = angr.Project('angrycontrolflow')

In [5]: p = angr.Project('angrycontrolflow')
WARNING | 2019-03-03 02:39:04,834 | cle.loader | The main binary is a position-independent executable. It is being loaded with a base address of 0x400000.

Oops, this time the binary is a position-independent executable (PIE), unlike what we have dealt with before. However, the binary is loaded with a base address of 0x400000. We note that and move on.

This time, we write the script the way we used to: find a good point and a bad point, then tell the simulation manager (simgr) to reach the good point and avoid the bad one.

main = 0x000007e4

state = p.factory.entry_state(addr=main)

simgr = p.factory.simulation_manager(state)

good = 0xa66
bad = 0xa88

simgr.explore(find=good, avoid=bad)

Oh man, we get errors.

This happens because God doesn’t like you, or you don’t treat people well so the drama goes back to you, or it could be because you use a Mac.

The real reason is that we pinned main to a raw offset. That number is suspiciously low, which is the hint that something is off: 0x7e4 is the offset of main inside the file, but Angr loaded the binary at 0x400000, so nothing is mapped at 0x7e4. As a result, when simgr runs, it starts from nowhere and ends nowhere, which is why it doesn't work.

As hinted when we loaded the binary, the base address is 0x400000. Now let's modify the code and add the base address to each of the addresses we proposed.

base = 0x400000
main = base + 0x000007e4

state = p.factory.entry_state(addr=main)

simgr = p.factory.simulation_manager(state)

good = base + 0xa66
bad = base + 0xa88

simgr.explore(find=good, avoid=bad)

Wow, we found the result.

Let's dump the result out.

result = simgr.found[0]

# Always print this

for i in range(3):
    print(result.posix.dumps(i))

Yay we have the flag.

The whole script here:

import angr

p = angr.Project('angrycontrolflow')

base = 0x400000
main = base + 0x000007e4

state = p.factory.entry_state(addr=main)

simgr = p.factory.simulation_manager(state)

good = base + 0xa66
bad = base + 0xa88

simgr.explore(find=good, avoid=bad)

result = simgr.found[0]

# Always print this

for i in range(3):
    print(result.posix.dumps(i))
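
Instead of hardcoding 0x400000, the base can be read back from the loader. The sketch below is one possible way to do it, not the script used above: `rebase` and `explore` are hypothetical helper names, and it assumes the offsets from the disassembler are relative to a link-time base of 0, which is the usual case for a position-independent executable.

```python
def rebase(offset, base=0x400000):
    """Turn an offset shown by the disassembler into a loaded address."""
    return base + offset

def explore(path, main_off, good_off, bad_off):
    """Run the same find/avoid search, taking the base from the loader."""
    # Imported lazily so rebase() stays usable where angr is not installed.
    import angr

    p = angr.Project(path)
    # mapped_base is the address the main binary was actually loaded at,
    # i.e. the value reported in the cle.loader warning.
    base = p.loader.main_object.mapped_base
    state = p.factory.entry_state(addr=rebase(main_off, base))
    simgr = p.factory.simulation_manager(state)
    simgr.explore(find=rebase(good_off, base), avoid=rebase(bad_off, base))
    return simgr.found
```

With this, a call such as `explore('angrycontrolflow', 0x7e4, 0xa66, 0xa88)` would keep working even if a different Angr version picked another default base.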

ReMe - Antidebug

Not long ago I wrote a challenge, Introduction to SMT, for Efiens Round 2 CTF. You can see the writeup here. The download link is also on midas’s GitHub.

I developed the code myself, so if you take a look at the comparison graph, you will see it is not vulnerable to a timing attack.

Let’s go with Angr. midas’s solution uses Z3, and as I said, whatever can be done with Z3 can be done with Angr.

Check the binary:

  • There are anti-debug functions. We will hook the address of each anti-debug call and replace it with a function that does nothing; for the calls to fork, we set the eax register so execution does not jump into a black hole
  • The code is position-independent (so we need to add the base)
  • Patch the obfuscated code at the beginning, which takes forever to complete (using radare2). At address 0x00000bb6 we patch jmp 0xc01 to jmp 0xc0a

As usual, let’s start loading the binary and read the base given to us.

import angr

p = angr.Project('REme-a28b7273a8e6671e6f0a361addbba87e')

## WARNING | 2019-03-03 03:43:33,089 | cle.loader | The main binary is a position-independent executable. It is being loaded with a base address of 0x400000.

As always, we start at main, start the simulation manager, and pick the good and bad points. Remember, this binary has base 0x400000, so we must add the base to each address to make Angr work.

base = 0x400000
main = base + 0x00000b2f

state = p.factory.entry_state(addr=main)

simgr = p.factory.simulation_manager(state)

good = base + 0x38ec
bad = base + 0x38ff

In the previous blog post about Angr, I wrote a hook function in this style:

@p.hook(0x123)
def bla(state):
    pass

But we could also write it without the decorator, like this; both work the same way:

def bla(state):
    pass

# Pass both arguments positionally; a positional argument after addr=... is a syntax error
p.hook(0x123, bla)

The input to the hook function is state. You may ask why: when Angr reaches the address we hooked, it passes the current state to the hook function and asks, “What do you want to do with this state? You asked for a hook.” That is why the hook must accept a state parameter; otherwise we get errors and Angr exits.

Next step: instead of patching a bunch of places, I decided to hook ptrace to do nothing and hook the fork calls to set eax=1.
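
To make that hand-over concrete, here is a toy model in plain Python. Everything in it (`ToyEngine`, `State`, the one-instruction `PROGRAM`) is hypothetical and is not how Angr is implemented. It only mirrors two things Angr's documentation describes for user hooks: the hook receives the current state, and the `length` argument decides how many bytes of original code the hook replaces.

```python
from dataclasses import dataclass, field

@dataclass
class State:
    pc: int
    regs: dict = field(default_factory=dict)

class ToyEngine:
    """Toy stand-in for an executor with user hooks; not Angr's code."""

    def __init__(self, program):
        self.program = program   # addr -> (size in bytes, effect(state))
        self.hooks = {}          # addr -> (hook function, length)

    def hook(self, addr, func, length=0):
        self.hooks[addr] = (func, length)

    def run(self, state, stop):
        while state.pc != stop:
            addr = state.pc
            func, length = self.hooks.get(addr, (None, 0))
            if func is not None:
                func(state)               # the engine hands over its state
            if length:
                state.pc = addr + length  # hook replaced `length` bytes
            else:
                size, effect = self.program[addr]
                effect(state)             # original instruction still runs
                state.pc = addr + size
        return state

def call_fork(state):
    state.regs["eax"] = 0    # pretend the call returned 0

def set_eax(state):
    state.regs["eax"] = 1

PROGRAM = {0x10: (5, call_fork)}   # a single hypothetical 5-byte call

def eax_after(length):
    engine = ToyEngine(PROGRAM)
    engine.hook(0x10, set_eax, length=length)
    return engine.run(State(pc=0x10), stop=0x15).regs["eax"]
```

In this model `eax_after(5)` keeps the value written by the hook, while `eax_after(0)` lets the original call run afterwards and overwrite it. In Angr the matching call would look like `p.hook(addr, set_eax, length=5)`, where 5 is an assumption about the size of the hooked instruction; with `length` omitted, execution resumes at the hooked address once the hook returns.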

def nothing(state):
    pass

def set_eax(state):
    state.regs.eax = 1

Alright, back to our script. ptrace has an XREF at address 0x341f, so let’s hook it with nothing(), and hook the fork calls with set_eax().

ptrace = base + 0x341f

p.hook(ptrace, nothing)

# Offsets to hook with set_eax, rebased to the loaded addresses
gdb_debug_list = [0x00000b4e, 0x00001340, 0x00001805, 0x00001dd3, 0x000026ff, 0x0000302e]
gdb_debug_list = [base + x for x in gdb_debug_list]

for addr in gdb_debug_list:
    p.hook(addr, set_eax)

Alright, let the simulation manager find its way to the good point.

simgr.explore(find=good,avoid=bad)

Nice, there are 64 solutions that reach the good point. Let’s grab them.
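
One way to grab them all is to loop over `simgr.found` and keep the distinct stdin contents. This is a minimal sketch rather than part of the script below: `dump_inputs` is a hypothetical helper, and it relies only on `posix.dumps(0)`, the same call used elsewhere in this post to print stdin.

```python
def dump_inputs(found_states):
    """Collect the distinct stdin contents of the found states, in order."""
    seen = []
    for s in found_states:
        data = s.posix.dumps(0)   # file descriptor 0 is stdin
        if data not in seen:
            seen.append(data)
    return seen
```

After `simgr.explore(...)`, calling `dump_inputs(simgr.found)` returns a list of byte strings, one per distinct input that reached the good point.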

Here is the full script for you to grab and play

import angr

def nothing(state):
    pass

def set_eax(state):
    state.regs.eax = 1

p = angr.Project('REme-a28b7273a8e6671e6f0a361addbba87e')

base = 0x400000
main = base + 0x00000b2f

state = p.factory.entry_state(addr=main)

simgr = p.factory.simulation_manager(state)

good = base + 0x38ec
bad = base + 0x38ff

ptrace = base + 0x341f

p.hook(ptrace, nothing)

# Offsets to hook with set_eax, rebased to the loaded addresses
gdb_debug_list = [0x00000b4e, 0x00001340, 0x00001805, 0x00001dd3, 0x000026ff, 0x0000302e]
gdb_debug_list = [base + x for x in gdb_debug_list]

for addr in gdb_debug_list:
    p.hook(addr, set_eax)

simgr.explore(find=good,avoid=bad)

result = simgr.found[0]

# Always print this

for i in range(3):
    print(result.posix.dumps(i))

Now that you have reached the end of this post, you may wonder why we bothered with Angr instead of Z3 to solve ReMe. Honestly, Z3 is the more natural fit for the ReMe challenge: I wrote it in a way meant to hinder Angr and added a lot of anti-debug.

The AngrControlFlow challenge has no anti-debug, so it is faster to solve with Angr.

I solved AngrControlFlow with Angr within 10 minutes, whereas solving my own ReMe took 25 minutes.

Given how fast I can parse decompiled code from IDA, solving ReMe with Z3 would in some ways be faster. However, that solution would only work for the ReMe challenge, whereas an Angr solution can be reused for other challenges.

So far, we have learned how to use Angr, deal with anti-debug functions, add constraints to the symbolic solver, and add the base address when working with PIC binaries.