disassembler: linear sweep, shingled, recursive traversal, anti-disassembly

relative disassembler performance

capstone - disassembler, the converse of keystone (an assembler). Others: zydis, xed, distorm, iced, bddisasm, yaxpeax

McSema - older Trail of Bits lifter. Uses LLVM as its IR. remill - lifts to LLVM bitcode. anvill - processing on top of remill. rellic - makes C-like code

BAP ANGR - VEX which is valgrind’s ir

gtirb ddisasm grammatech “GTIRB explicitly does NOT represent instructions or instruction semantics but does provide symbolic operand information and access to the bytes. “

Speculative disassembly: decode every offset, then refine blocks. Spedi, an open source speculative disassembler. Nucleus paper: Compiler-Agnostic Function Detection in Binaries

superset disassembler kenneth Cifuentes thesis

probabilistic disassembly using probabilistic datalog? bap mc + datalog?

Formally Verified Lifting of C-Compiled x86-64 Binaries Scalable validation of binary lifters

  • Linear
  • Recursive
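The difference between the two strategies shows up as soon as data sits between instructions. The sketch below uses a made-up four-opcode ISA (the opcodes, lengths, and byte string are all hypothetical, chosen only for illustration) and compares what each strategy decodes.

```python
# Toy ISA (hypothetical): opcode byte -> (mnemonic, total length in bytes)
ISA = {0x01: ("nop", 1), 0x02: ("jmp", 2), 0x03: ("ret", 1), 0x04: ("push", 2)}

def decode(code, off):
    """Decode one instruction at off; return (mnemonic, length) or None."""
    if off < 0 or off >= len(code):
        return None
    entry = ISA.get(code[off])
    if entry is None or off + entry[1] > len(code):
        return None
    return entry

def linear_sweep(code):
    """Decode back to back from offset 0, skipping one byte on failure."""
    off, out = 0, []
    while off < len(code):
        ins = decode(code, off)
        if ins is None:
            out.append((off, "(bad)"))
            off += 1
        else:
            out.append((off, ins[0]))
            off += ins[1]
    return out

def recursive_traversal(code, entry=0):
    """Follow control flow from entry; only reachable offsets are decoded."""
    seen, work = {}, [entry]
    while work:
        off = work.pop()
        if off in seen:
            continue
        ins = decode(code, off)
        if ins is None:
            continue
        name, length = ins
        seen[off] = name
        if name == "jmp":
            # unconditional jump, relative to the next instruction
            work.append(off + length + code[off + 1])
        elif name != "ret":
            work.append(off + length)
    return sorted(seen.items())

# jmp +1 ; one data byte (0x04 looks like the start of a push) ; nop ; ret
code = bytes([0x02, 0x01, 0x04, 0x01, 0x03])
print(linear_sweep(code))         # decodes the data byte as push and swallows the nop
print(recursive_traversal(code))  # skips the data byte and finds the nop
```

Recursive traversal gets this case right but would miss code that is only reached through indirect jumps, which is where the speculative and superset approaches come in.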



Delay slots are an annoyance. Some architectures (MIPS, SPARC) allow an instruction to sit in the shadow of a jump instruction: it comes after the jump in memory but logically executes before the jump takes effect. This makes sense from a microarchitectural perspective, but it is bizarre to disassemble.

paper osr - offset shift range. I've seen this called value set analysis? thesis


Analyzing Memory Accesses in x86 Executables - Balakrishnan and Reps

byteweight - bap neural network thing identifies function starts

Ghidra repackaging: lifting bits sleigh pypcode

An Empirical Study on ARM Disassembly Tools

ben’s ll2l


Interesting. They are actually trying to get source that compiles to exactly the original assembly ("matching" decompilation). They might have the original toolchain. mips and powerpc decompiler. Ocarina of Time decompilation.

paper maro bincat retypd

Ahoy SAILR! There is No Need to DREAM of C: A Compiler-Aware Structuring Algorithm for Binary Decompilation. Graph schema matching? Smart methodology: take a codebase, decompile it, and compare the number of gotos in the original vs the decompiled functions. Find hotspots. Binary search for the passes responsible.

DREAM Phoenix

FoxDec - Formally verified x86-64 decompilation

rellic - no more gotos

a comb for decompiled C

Cifuentes thesis kind of a ludicrous amount of info “Decompiler compiler” references BB91 BB92 bbl91 bbl93 bow91 bow93 A Compendium of Formal Techniques for Software Maintenance the redo project final report

Decompilation: The Enumeration of Types and Grammars From programs to object code and back again using logic programming: Compilation and decompilation The art of computer un-programming: Reverse engineering in Prolog Generating Decompilers

Polymorphic Type Inference for Machine Code

Static Single Assignment for Decompilation - Emmerik

Decompilers and beyond Ilfak Guilfanov, Hex-Rays SA, 2008

Type-Based Decompilation? (or Program Reconstruction via Type Reconstruction)- Alan Mycroft

Loop recovery

Getting structured control flow from CFG Stackify

Havlak and Tarjan: testing reducibility with union find

A New Algorithm for Identifying Loops in Decompilation

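DFS. Label nodes with the timestamp of the first visit and the timestamp of the last visit. Intervals of labels are subsets of nodes: a descendant's interval nests inside its ancestor's. Edges are then tree edges, back edges, forward edges, or cross edges.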
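A minimal sketch of the timestamp idea, assuming the CFG is given as a plain dict from node to successor list (the graph and node names are made up). Back edges are the ones that point at a node whose interval is still open, and those are the loop candidates.

```python
def classify_edges(graph, root):
    """graph: dict node -> list of successors.
    Returns (kinds, pre, post) where kinds maps (u, v) to an edge kind."""
    pre, post, kinds = {}, {}, {}
    clock = 0

    def visit(u):
        nonlocal clock
        pre[u] = clock          # timestamp of first visit
        clock += 1
        for v in graph.get(u, []):
            if v not in pre:
                kinds[(u, v)] = "tree"
                visit(v)
            elif v not in post:
                kinds[(u, v)] = "back"     # v is still open, so it is an ancestor of u
            elif pre[u] < pre[v]:
                kinds[(u, v)] = "forward"  # v is an already finished descendant of u
            else:
                kinds[(u, v)] = "cross"
        post[u] = clock         # timestamp of last visit
        clock += 1

    visit(root)
    return kinds, pre, post

# A tiny while-loop shaped CFG
cfg = {"entry": ["head"], "head": ["body", "exit"], "body": ["head"], "exit": []}
kinds, pre, post = classify_edges(cfg, "entry")
for edge, kind in kinds.items():
    print(edge, kind)
```

The recursion is fine for small graphs; a real CFG would want an explicit stack to avoid Python's recursion limit.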

Also… egraphs

webassembly is at the core of both of these Allllll things come around I wonder if webassembly would be a good universal disassembler ir someone should do that


  • IDA
  • Ghidra
  • Binary Ninja
  • Cutter
  • angr management

decompiler explorer Hmm. too bad it’s not a web service

Binary Ninja

SCC - shellcode compiler. Why is this a top level thing?

Binary View. Making a plugin - has remote debug to VS Code

current_function bv.functions

f.callers f.callees

Types `bv.parse_type_string("uint64_t")` slow but convenient

Type.bool Type.char

#!/usr/bin/env python3
import binaryninja
with binaryninja.open_view("/bin/ls") as bv:
    print(f"Opening {bv.file.filename} which has {len(list(bv.functions))} functions")

IR Tower

Lifted IL




See ghidra notes



echo "
int foo(int x) {return x*x + 1;}
" > /tmp/foo.c
gcc /tmp/foo.c -c -o /tmp/foo.o


import angr #, monkeyhex
proj = angr.Project('/bin/true')
state = proj.factory.entry_state()

code = '''
int fact(int x){
  int acc = 1;
  while(x > 0){
    acc *= x;
    x--;  // decrement so the loop terminates
  }
  return acc;
}
'''
import tempfile
import subprocess
import angr #, monkeyhex
import os
with tempfile.NamedTemporaryFile(suffix=".c") as fp:
  with tempfile.TemporaryDirectory() as mydir:
    # write the C source into the temp file so gcc can read it
    fp.write(code.encode())
    fp.flush()
    outfile = mydir + "/fact"
    print(subprocess.run(["gcc",  "-g",  "-c","-O1", "-o",  outfile, fp.name], check=True))
    print(subprocess.run(["objdump", "-d", outfile], check=True))

    proj = angr.Project(outfile)
    block = proj.factory.block(proj.entry)
    print(block.instructions) # number of instructions
    state = proj.factory.entry_state()

echo "
int fact(int x){
   int acc = 1;
   for (int i = 1; i <= x; i++){
      acc *= i;
   }
   return acc;
}
" > /tmp/fact.c
gcc /tmp/fact.c -c -o /tmp/fact
echo "
int max(int x){
  return x > 0 ? x : -x;
}
" > /tmp/max.c
gcc /tmp/max.c -c -o /tmp/max
import angr
p = angr.Project('/tmp/fact')
cfg = p.analyses.CFGFast()
print("This is the graph:", cfg.graph)
entry_func = cfg.kb.functions[p.entry]
blocks = list(entry_func.blocks)
irsb = blocks[0].vex
import angr
p = angr.Project('/tmp/max')
cfg = p.analyses.CFGFast()
print("This is the graph:", cfg.graph)
entry_func = cfg.kb.functions[p.entry]
blocks = list(entry_func.blocks)
irsb = blocks[0].vex
from pyvex.stmt import *
from pyvex.expr import *

jmp_table = []  # lines of an indirect-jump dispatch table (not emitted yet)

#for addr in addrs:
#  jmp_table.append(f"case 0x{addr:x}: goto {label(addr)};")
# assert(false); // unexpected indirect jump
def proc_binop(expr):
    # TODO: perhaps I need casts here to signed unsigned?
    if expr.op == "Iop_Sub64":
      return f"({proc_expr(expr.args[0])} - {proc_expr(expr.args[1])})"
    elif expr.op == "Iop_Sub32":
      return f"({proc_expr(expr.args[0])} - {proc_expr(expr.args[1])})"
    elif expr.op == "Iop_Add64":
      return f"({proc_expr(expr.args[0])} + {proc_expr(expr.args[1])})"
    elif expr.op == "Iop_Add32":
      return f"({proc_expr(expr.args[0])} + {proc_expr(expr.args[1])})"
    elif expr.op == "Iop_Shr64":
      return f"({proc_expr(expr.args[0])} >> {proc_expr(expr.args[1])})"
    elif expr.op == "Iop_And64":
      return f"({proc_expr(expr.args[0])} & {proc_expr(expr.args[1])})"
    elif expr.op == "Iop_Xor64":
      # xor must emit ^, not &
      return f"({proc_expr(expr.args[0])} ^ {proc_expr(expr.args[1])})"
    elif expr.op == "Iop_CmpLT32S":
      return f"({proc_expr(expr.args[0])} < {proc_expr(expr.args[1])})"
    elif expr.op == "Iop_Mul32":
      return f"({proc_expr(expr.args[0])} * {proc_expr(expr.args[1])})"
    else:
      assert False, f"Unimplemented binop {expr.op}"
def proc_unop(expr):
  if expr.op == "Iop_64to32":
    return f"((int32_t) {proc_expr(expr.args[0])})"
  elif expr.op == "Iop_32Uto64":
    # zero extension: go through uint32_t so the sign bit is not propagated
    return f"((int64_t)(uint32_t) {proc_expr(expr.args[0])})"
  elif expr.op == "Iop_1Uto64":
    return f"((int64_t) {proc_expr(expr.args[0])})"
  elif expr.op == "Iop_64to1":
    return f"((bool) {proc_expr(expr.args[0])})"
  else:
    assert False, f"Unimplemented unop {expr.op}"

def proc_expr(expr : IRExpr):
    if isinstance(expr, Binder):
        assert False
    elif isinstance(expr, VECRET):
        assert False
    elif isinstance(expr, GSPTR):
        assert False
    elif isinstance(expr, RdTmp):
        return str(expr)
    elif isinstance(expr, Get):
        assert expr.ty == "Ity_I64"
        return f"state->gr[{expr.offset}]"
    elif isinstance(expr, Qop):
        assert False
    elif isinstance(expr, Triop):
        assert False
    elif isinstance(expr, Binop):
        return proc_binop(expr)
    elif isinstance(expr, Unop):
        return proc_unop(expr)
    elif isinstance(expr, Load):
        #assert expr.ty == "Ity_I32"  # TODO. We need to deal with these types and endianness. Yuck
        return f"state->mem[{proc_expr(expr.addr)}]"
    elif isinstance(expr, Const):
        return str(expr)
    elif isinstance(expr, ITE):
        return f"({proc_expr(expr.cond)} ? {proc_expr(expr.iftrue)} : {proc_expr(expr.iffalse)})"
    elif isinstance(expr, CCall):
        assert False
    else:
        assert False, f"unrecognized IRExpr {expr}"

def label(addr):
  return f"L_0x{addr:x}" 
def proc_type(ty):
  if ty == "Ity_I32":
    return "int32_t"
  elif ty == "Ity_I64":
    return "int64_t"
  else:
    assert False, f"Unimplemented type {ty}"

def proc_stmt(stmt : IRStmt):
    if isinstance(stmt, NoOp):
         return "// NoOp"
    elif isinstance(stmt, IMark):
        # assert PC = expected pc here?
        # f"assert(state->gr[{}] == stmt.addr);"
        #return f"break; case 0x{stmt.addr:x}: //{label(stmt.addr)}:" # also print original assembly in comment
        return f"//{label(stmt.addr)}:"
    elif isinstance(stmt, AbiHint):
        return f"return; // {stmt}" # TODO: add return parameter? Or state is good. Is abihint always at returns?
    elif isinstance(stmt, Put):
        return f"state->gr[{stmt.offset}] = {proc_expr(stmt.data)};"
    elif isinstance(stmt, PutI):
        assert False
    elif isinstance(stmt, WrTmp):
        return f"t{stmt.tmp} = {proc_expr(stmt.data)};"
    elif isinstance(stmt, Store):
        return f"state->mem[{proc_expr(stmt.addr)}] = {proc_expr(stmt.data)};"
    elif isinstance(stmt, CAS):
        assert False
    elif isinstance(stmt, LLSC):
        assert False
    elif isinstance(stmt, MBE):
        assert False
    elif isinstance(stmt, Dirty):
        assert False
    elif isinstance(stmt, Exit):
        assert stmt.jk == "Ijk_Boring"
        return f"if({proc_expr(stmt.guard)}){{ state->gr[{stmt.offsIP}] = 0x{stmt.dst.value:x}; break; }}"
        # TODO deal with other jumpkind
        # What represents an indirect jump? goto jump_table;
        # elif stmt.jk == "Ijk_Ret" f"return;"
        # elif stmt.jk == Ijk_Call   hmmm. f"{funname}()" maybe?
        # elif stmt.jk == "Ijk_Exit"
    elif isinstance(stmt, LoadG):
        assert False
    else:
        print("unrecognized IRStmt")
        assert False

output = []
tmps = set()
for block in blocks:
  output.append(f"case 0x{block.addr:x}:")
  for stmt in block.vex.statements:
    if isinstance(stmt,WrTmp):
      tmps.add(stmt.tmp)  # remember temporaries so they can be declared up front
    output.append(proc_stmt(stmt))
  # output.append("goto jump_table;")
output.append("default: assert(0); // Unexpected PC value. Something has gone awry.")


header = """
#include <stdint.h>
#include <assert.h>
#include <stdbool.h>
#define PUT(reg) reg
#define PC 184

typedef struct state_t {
int64_t *mem;
int64_t *gr;
} state_t;
"""

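with open("/tmp/decomp.c", "w") as file:
    file.write(header)
    file.write(f"void {entry_func.name}_decomp(state_t *state){{\n") # use the function's name
    file.write("int " + ",".join([f"t{tmp}" for tmp in tmps]) + ";\n") # declare temps
    file.write(f"state->gr[PC] = 0x{entry_func.addr:x};\n") # initialize PC to entry point
    file.write("while(1){\n") # interpreter loop invariant on PC? Could make gas?
    file.write("switch(state->gr[PC]){\n") # dispatch on the current PC
    file.writelines([x + "\n" for x in output])
    file.write("}\n}\n}\n") # close the switch, the while loop, and the function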
clang-format -i  --style=google /tmp/decomp.c
cat /tmp/decomp.c
gcc /tmp/decomp.c -O2 -c -o /tmp/decomp.o -Wall -Wextra -Wcast-align -Wcast-qual -Wmissing-declarations
esbmc /tmp/decomp.c --function max_decomp #--goto-functions-only

Control flow encoding. Do I do separate jump table, go to jump table every time? while(true){switch[pc]{ case: case: case: case default: } } This is fairly conservative. Big block encoding trusts fall through behavior. Byte address and then cast mem pointers.

import angr
p = angr.Project('/tmp/fact')
cfg = p.analyses.CFGFast()
print("This is the graph:", cfg.graph)
entry_func = cfg.kb.functions[p.entry]
blocks = list(entry_func.blocks)
irsb = blocks[0].vex
import pyvex

stmts = []
counter = 0
def fresh():
  global counter
  counter += 1
  return f"%v{counter}"

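def print_expr(dst, expr):
  # Flatten expr into three-address statements appended to stmts; the result lands in dst
  if isinstance(expr, pyvex.expr.Const):
    stmts.append({ "op": "const", "type": "int", "dest": dst, "value": expr.con.value })
  elif isinstance(expr, pyvex.expr.Binop):
    args = [fresh() for _ in range(2)]
    for a, e in zip(args, expr.args):
      print_expr(a, e)
    stmts.append({ "op": expr.op, "type": "int", "dest": dst, "args": args })
  elif isinstance(expr, pyvex.expr.RdTmp):
    stmts.append({ "op": "id", "type": "int", "dest": dst, "args": [f"%t{expr.tmp}"] })
  elif isinstance(expr, pyvex.expr.Get):
    # registers are named by their offset into the guest state
    stmts.append({ "op": "id", "type": "int", "dest": dst, "args": [f"%{expr.offset}"] })
  else:
    assert False, f"unrecognized expr {expr}"

for stmt in irsb.statements:
    if isinstance(stmt, pyvex.stmt.Store):
      assert stmt.end == "Iend_LE"
      addr, val = fresh(), fresh()
      print_expr(addr, stmt.addr)
      print_expr(val, stmt.data)
      stmts.append({ "op": "store", "args": [addr, val] })
    elif isinstance(stmt, pyvex.stmt.Put):
      print_expr(f"%{stmt.offset}", stmt.data)
    elif isinstance(stmt, pyvex.stmt.WrTmp):
      print_expr(f"%t{stmt.tmp}", stmt.data)
    elif isinstance(stmt, pyvex.stmt.IMark):
      pass # instruction boundary marker, nothing to emit
    else:
      assert False, f"unrecognized stmt {stmt}"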


See notes on patching


see note on patching

Binary reversing: hilbert curves for binary visualization, benford’s law

binwalk ofrak fra cwechecker

entropy visualization Decomperson: How Humans Decompile and What We Can Learn From It
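Entropy visualization boils down to computing Shannon entropy over a sliding or blocked window of bytes and plotting it. A minimal sketch of the computation, assuming fixed-size non-overlapping blocks (the 256-byte block size is an arbitrary choice):

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not data:
        return 0.0
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())

def block_entropies(data: bytes, block_size: int = 256):
    """Entropy of each consecutive block; plot this to spot packed or encrypted regions."""
    return [shannon_entropy(data[i:i + block_size])
            for i in range(0, len(data), block_size)]

sample = bytes(256) + bytes(range(256))   # a block of zeros, then every byte value once
print(block_entropies(sample))
```

Compressed or encrypted regions sit near 8 bits per byte, while code and text tend to be noticeably lower.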



See note on debuggers

Code Search isn’t semgrep basically a treesitter grepper?

Joern codeql semgrep

Dynamic Binary Instrumentation

dynamorio. frida - injects a quickjs, huh. pin


CWE - common weakness enumeration

integer overflow

null pointer dereference

sophos - comprehensive exploit prevention 2018

Current State of Exploit Dev 2020

Web App

Burp suite


microsoft exploits and exploit kits


Control Flow integrity is a broad term for many of these CONFIRM: Evaluating Compatibility and Relevance of Control-flow Integrity Protections for Modern Software

DEP - data execution prevention executable space protection This says DEP is Windows terminology? NX bit

shadow stack

control flow guard - windows reverse flow guard extreme flow guard kernel data protection

stack canary -fstack-protector. Guard variable put on stack SSP stack smashing protection. Stackguard, Propolice. Buffer overflow protection

ASLR - Address Space Layout Randomization. Libraries are linked in at a different location on each run. This makes code reuse in an exploit more difficult.

Fat pointers

endbr intel control flow enforcement technology (CET). Valid locations for indirect jumps.

ASLR - Addresses are randomized. cat /proc/self/maps to look at what actually got loaded. Also ldd shows where libraries get loaded in memory. Stack canaries - set once per binary run, so with forking you can brute force them or maybe leak them?

checksec tells you about which things are enabled. which also has a rundown of the different things and how you could check them manually. Can output into xml, json, csv

gcc options: -no-pie, -pie, -fpie, -fno-stack-protector, -fstack-protector-all. -z execstack makes the stack executable

RELRO - relocation read only. GOT table becomes read only. Prevents relocation attacks

binary diversification - compile differently every time, making many versions of the binary so that code reuse attacks become way harder. disunison


Buffer Overflows a game version of buffer overflows. cool.

buffer overflow When a buffer overflow occurs you are writing to memory that possibly had a different purpose. Maybe other stack variables, maybe return address pointers, maybe over heap metadata.

Sanitization of user input Off by one errors String termination


AWP Arbitrary write primitive CWE-123: Write-what-where Condition ARP arbitrary read primitive

Return Oriented Programming (ROP)

rp ropper [pwntool]

ropgadget ropium supports semantics queries. hmm in cpp. V impressive.

return to libc libc is very common and you can weave together libc calls. “Solar Designer” solar designer 1997

ropc-llvm ropc

smashing the stack for fun and profit - stacks are no longer executable. The Geometry of Innocent Flesh on the Bone - ROP

rop emporium

rop ftw

pop_gadget ; value ; nextgadget loads from stack into register
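The "pop_gadget ; value ; next gadget" layout is just a sequence of little-endian 8-byte words written over the saved return address. A minimal sketch with `struct`, where every address and the padding length are hypothetical placeholders, not values from any real binary:

```python
import struct

def p64(value: int) -> bytes:
    """Pack a 64-bit little-endian word, the way it will sit on the stack."""
    return struct.pack("<Q", value)

# All of these are made-up addresses for illustration only.
POP_RDI_RET = 0x401234   # hypothetical gadget: pop rdi ; ret
BIN_SH_ADDR = 0x404050   # hypothetical address of a "/bin/sh" string
SYSTEM_ADDR = 0x401050   # hypothetical address of system()

def build_chain(padding_len: int) -> bytes:
    """padding up to the return address, then gadget, its operand, next target."""
    return (b"A" * padding_len
            + p64(POP_RDI_RET)   # ret lands here; pop rdi takes the next word
            + p64(BIN_SH_ADDR)   # value loaded from the stack into rdi
            + p64(SYSTEM_ADDR))  # the gadget's ret continues here

chain = build_chain(264)  # 264 is an assumed offset to the saved return address
print(chain[264:].hex())
```

pwntools provides the same packing helper under the name `p64`, which is why the sketch borrows it.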

pure buffer overflow from command line:

#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[])
{
    char buf[256];
    memcpy(buf, argv[1],strlen(argv[1])); /* deliberately unchecked: overflows buf */
    return 0;
}



rop prevention by binary rewriting ropguard

stack pivoting moving over to a different stack.

ret2libc ret2dlresolve ret2csu ret2plt


jop rocket. blackhat talk

Data Oriented Programming (DOP)

Block Oriented Programming

Heap house of rust

If you can overwrite a struct that contains a pointer, you can use this to obtain reads or writes when that pointer is read or written. If the struct contains a code pointer, you get control flow execution.

Heap layout problem Heap layout manipulation


Double free Use after free advanced doug lea malloc hacking valgrind massif perf-mem, valgrind massif, and heaptrack

advanced doug lea malloc - phreak post

glibc. ldd /bin/ls - the libc path is a symbolic link, probably to glibc 2.27. It is actually PIE. You can run it? /lib/x86_64-linux-gnu/

malloc chunks of memory new/delete make_unique

heap history viewer

pwndbg vis lets you see allocated chunks. How does it do it? vmmap also shows memory regions classified. top chunk - I think the top chunk is resized to hand out new memory. Metadata - size field, prev_in_use flag. The allocator hands out memory in discrete sizes, not arbitrarily flexible ones.

Playing for K(H)eaps: Understanding and Improving Linux Kernel Exploit Reliability defragmentation


libheap examine glibc heap in gdb. seems like there is a python model of heap in here.


chunks - have prev size, size, and info bits. Free chunks have pointers in their content.
top chunk - large piece of memory that new chunks are carved out of.
last remainder chunk
tcache
fastbin - last in first out, a stacklike structure
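The info bits live in the low three bits of the size field, which is possible because chunk sizes are always multiples of 16 on 64-bit. A small sketch of decoding a size field and of how a request is rounded up to a chunk size, assuming 64-bit glibc constants (8-byte size field, 16-byte alignment, 32-byte minimum chunk):

```python
PREV_INUSE = 0x1      # previous adjacent chunk is in use
IS_MMAPPED = 0x2      # chunk was obtained with mmap
NON_MAIN_ARENA = 0x4  # chunk belongs to a non-main arena

def decode_size_field(raw: int) -> dict:
    """Split a raw size field into the chunk size and its three flag bits."""
    return {
        "size": raw & ~0x7,
        "prev_inuse": bool(raw & PREV_INUSE),
        "is_mmapped": bool(raw & IS_MMAPPED),
        "non_main_arena": bool(raw & NON_MAIN_ARENA),
    }

def request2size(req: int, size_sz: int = 8) -> int:
    """Chunk size used for malloc(req): add the size header, round up to alignment."""
    align = 2 * size_sz          # 16 on 64-bit
    min_size = 4 * size_sz       # 32 on 64-bit
    size = (req + size_sz + align - 1) & ~(align - 1)
    return max(size, min_size)

print(decode_size_field(0x91))              # a 0x90 chunk whose predecessor is in use
print([hex(request2size(n)) for n in (0, 24, 25, 100)])
```

This is why the allocator hands out discrete sizes: `malloc(24)` and `malloc(1)` both come from 0x20-sized chunks.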

large bins - varying size. chunks put in order

printf(“%p”) is your friend

free lists - where memory goes when we call free. fastbins hold free chunks of a specific size.
pwndbg: set context-sections code, b main, vis to visualize the heap, fastbins command. Fastbin sizes 0x20 - 0x80.
arenas: the main arena lives in the glibc data section. pwndbg arena

fd forward pointer

House of Force

fastbin dup

arbitrary write double free fastbin will return twice. frame command to context code

immediate double free is protected


find_fake_fast one_gadget - constraints


n vis

unsorted bin - doubly linked, circular. Free unsorted chunks have forward and backward pointers. A newly freed chunk is added at the head; malloc takes from the tail.

adjacent free chunks get consolidated (prev in use flags). unlinking - chunks being removed from the doubly linked list
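A toy model of the circular doubly linked bin and the unlink step, written as plain Python objects rather than real memory. The class and function names are made up for the sketch; the integrity check is modeled loosely on glibc's "corrupted double-linked list" test.

```python
class Chunk:
    """A bin head or a free chunk; fd/bk start out pointing at itself."""
    def __init__(self, name):
        self.name = name
        self.fd = self   # forward pointer
        self.bk = self   # backward pointer

def insert_head(bin_head, chunk):
    """Newly freed chunks go in at the head of the bin."""
    chunk.fd = bin_head.fd
    chunk.bk = bin_head
    bin_head.fd.bk = chunk
    bin_head.fd = chunk

def unlink(chunk):
    """Remove chunk from its list after checking that its neighbours point back at it."""
    if chunk.fd.bk is not chunk or chunk.bk.fd is not chunk:
        raise RuntimeError("corrupted double-linked list")
    chunk.fd.bk = chunk.bk
    chunk.bk.fd = chunk.fd

def take_from_tail(bin_head):
    """malloc side: take the oldest chunk, or None if the bin is empty."""
    victim = bin_head.bk
    if victim is bin_head:
        return None
    unlink(victim)
    return victim

unsorted = Chunk("bin")
for name in ("a", "b", "c"):
    insert_head(unsorted, Chunk(name))
print([take_from_tail(unsorted).name for _ in range(3)])  # oldest first
```

Classic unlink attacks relied on the two pointer writes in `unlink` happening without the neighbour check.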

2000 solar designer voodoo malloc tricks

Type Confusion


double free ps4 Joy of exploitation the kernel

Browser Attacking JavaScript Engines: A case study of JavaScriptCore and CVE-2016-4622

How I started chasing speculative type confusion bugs in the kernel and ended up with ‘real’ ones

Return to sender Detecting kernel exploits with eBPF

Automated Exploit Generation (AEG)

sean heelan thesis

usenix security. heaphopper - angr symbolic analysis for heap exploits? archeap. maze - toward heap feng shui. backward search from heaphopper. teerex - discovery of memory corruption vulns. [symcc]


Elf stuff

See linker elfmaster elf reverse

example interesting elf files: overlaying headers, smallest that doesn’t violate the spec

binary golf workshop size coding dead bytes libgolf Elf Binary Mangling Pt. 4: Limit Break

Virus Preloading the linker for fun and profit ~ elfmaster vxunderground VXHeaven - mirror tmp.out second to hell interview roy g biv hh86 herm1t ezines

Language infection project Valhalla 0-4 zine Metamorphism, Formal grammars and Undecidable Code Mutatio “Polymorphism and Grammars” - qozah

see herm1t’s metamorphic Linux virus Linux.Lacrimae in EOF #2 and his article “Recompiling the Metamorphism”

From the design of a generic metamorphic engine to a black-box classification of antivirus detection techniques - tau obfuscation UNIX ELF Parasites and virus - Silvio Cesare

More virus ezines ~2000, mostly interesting little articles. Also has zines in papers

antivirus AV

metamorphic / polymorphic - mutate themselves to avoid simple detection Relationship to quines?

(De)Obfuscation like obfuscation passes for llvm? llvm archived java bytecode android python

anti hooking - maybe more generally these are anti-RE techniques. debugger detection. VM detection. control flow breaking opaque constants

virtual machine VM obfuscation jit anti alias analysis self modification - C source to source list of transformations

Mixed Boolean Arithmetic (MBA) Efficient Deobfuscation of Linear Mixed Boolean-Arithmetic Expressions ∗ HITB2022SIN #LAB Advanced Code Obfuscation With MBA Expressions - Arnau Gàmez Montolio QSynth - A Program Synthesis based Approach for Binary Code Deobfuscation

mixed Boolean arithmetic mbasolver simb gamba goomba msynth Boosting SMT Solver Performance on Mixed-Bitwise-Arithmetic Expressions
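A quick way to get a feel for MBA is to check a couple of the classic linear identities numerically. The sketch below rewrites `x + y` in two standard ways and tests them on random 64-bit values; the function names are made up for the example.

```python
import random

MASK = (1 << 64) - 1  # model 64-bit wraparound arithmetic

def add_plain(x, y):
    return (x + y) & MASK

def add_mba_xor(x, y):
    # x + y == (x ^ y) + 2 * (x & y): xor is the carry-less sum, and gives the carries
    return ((x ^ y) + 2 * (x & y)) & MASK

def add_mba_or(x, y):
    # x + y == (x | y) + (x & y)
    return ((x | y) + (x & y)) & MASK

rng = random.Random(0)
for _ in range(1000):
    x, y = rng.getrandbits(64), rng.getrandbits(64)
    assert add_plain(x, y) == add_mba_xor(x, y) == add_mba_or(x, y)
print("identities hold on 1000 random samples")
```

Obfuscators chain many such rewrites; random testing like this only gives evidence of equivalence, which is why the deobfuscation tools above lean on algebra, synthesis, or SMT.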

Viruses try to obfuscate themselves to avoid detection


SUID GTFOBins is a curated list of Unix binaries that can be used to bypass local security restrictions in misconfigured systems.


commando vm

yolo space hacker. steam game of ctf stuff exploit_me very vulnerable example applications pwntools



What is Binary Analysis

Dynamic - It feels like you’re running the binary in some sense. Maybe on an emulated environment.
Static - It feels like you’re not running the binary.

Fuzzing is definitely dynamic. Dataflow analysis on a CFG is static. There are grey areas. Symbolic execution starts to feel like a grey area. I would consider it to be largely dynamic, but you are executing in a rather odd way.

Trying to understand a binary Why?

  • Finding vulnerabilities for defense or offense
    • buffer overflows
    • double frees
    • use after frees
    • memory leaks - just bad performance
    • info leaks - bad security
  • Verification - Did your compiler produce a thing that does what your code said?
  • Reversing/Cracking closed source software.
  • Patching and Code injection. Finding Bugs for use in speed runs. Game Genie.
  • Auditing
  • Aids for manual RE work. RE is useful because things may be undocumented intentionally or otherwise. You want to reuse a chip, or turn on secret functionality, or reverse a protocol.
  • Discovery of patent violation or GPL violations
  • Comparing programs. Discovering what has been patched.

I don’t want my information stolen or held ransom. I don’t want people peeping in on my conversations. I don’t want my computer wrecked. These are all malicious actors. We also don’t want our planes and rockets crashing. This does not require maliciousness on anyone’s part per se.

  • Symbol recovery
  • Disassembly
  • CFG recovery
  • Taint tracking
  • symbolic execution A list of tools CSE597: Binary-level program analysis Spring 19 Gang Tan

Program Analysis

What’s the difference? Binaries are less structured than what you’ll typically find talked about in program analysis literature.

Binaries are tough because the coupling between high level intent and constructs and what is actually there has been tossed away or has become very loose.

How are binaries made

C preprocessor -> Maintains file number information isn’t that interesting

C compiler -> assembly. You can ask for this assembly with -S. Or, more cut up: C -> IR, IR -> MIR (what does gcc do? RTL, right?), MIR -> Asm

Misc kernel exploitation, browser exploitation blog posts

  • Arbitrary code guard
  • Code integrity guard
  • hypervisor protected code integrity (HVCI) “the acg of kernel mode”
  • virtualization based security VBS. credential guard
  • local privilege escalation (LPE)


  • elfmaster ryan oneill

cfi directives - call frame information automatic bug insertion using joern phaser slither

Hiding instructions in instructions

Thomas stars

SGX enclaves

obfuscation: snapchat, ollvm, vmprotect. opaque predicates - one branch always taken
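An opaque predicate is a condition whose value the obfuscator knows but a static analysis has to work to prove. A minimal sketch using the textbook example that the product of two consecutive integers is always even; the surrounding function and the junk branch are invented for illustration.

```python
def opaque_true(x: int) -> bool:
    """Always True: x * (x + 1) is even, and 32-bit wraparound preserves parity."""
    return ((x * (x + 1)) & 0xFFFFFFFF) % 2 == 0

def obfuscated_abs(x: int, junk: int) -> int:
    """abs(x) hidden behind a branch on an unrelated value."""
    if opaque_true(junk):
        return x if x >= 0 else -x
    else:
        # never executed, but a disassembler still has to follow it
        return x ^ 0xDEADBEEF

print(obfuscated_abs(-5, 12345))
print(all(opaque_true(v) for v in range(-1000, 1000)))
```

The dead branch bloats the CFG and can point into data or into the middle of another instruction, which ties this back to anti-disassembly.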

chris domas tom 7

firmadyne emulating and analyzing firmware


burp suite idor - autorize


shellcode encoding and decoding - sometimes you need to avoid things like \0 termination. Shellcode generators. What do they do? shellcode database
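The simplest encoder of this kind XORs every byte with a single key chosen so that no forbidden byte appears in the output. A sketch of that key search, with an arbitrary byte string standing in for a payload (it is not real shellcode):

```python
def xor_encode(payload: bytes, bad: bytes = b"\x00"):
    """Find a one-byte key so that neither the key nor the encoded bytes hit bad."""
    for key in range(1, 256):
        encoded = bytes(b ^ key for b in payload)
        if key not in bad and not any(b in bad for b in encoded):
            return key, encoded
    raise ValueError("no single-byte key avoids the bad bytes")

def xor_decode(key: int, encoded: bytes) -> bytes:
    """What the decoder stub does at runtime before jumping to the payload."""
    return bytes(b ^ key for b in encoded)

payload = b"\x48\x31\x00\x00\xff\x0a"          # placeholder bytes containing NULs
key, encoded = xor_encode(payload, bad=b"\x00\x0a")
print(hex(key), encoded.hex())
```

A real encoder also has to emit the decoder stub itself in machine code, and that stub must avoid the same bad bytes.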

google dorking Like using google with special commands? Why “dork”? shodan


-A -T4. OS detection. nmap nse - nmap scripting engine. There is a folder of scripts

p0f - passive sniffing. fingerprinting

malware reversing class live overflow youtube exploit education rop emporium linux exploitation course yara - patterns to recognize malware. Byte level patterns? Sigma snort

SIEM. IDS - intrusion detection systems

shellcode encoder/decoder/generator synesthesia

FLIRT exploit examples

Gray Hat Hacking The Shellcoder’s handbook Attacking network Protocols Implementing Effective Code Review

Hacking: sergey weird machine paper

blackhat defcon bluehat ccc bsides ctf project zero kaspersky blog spectre/meltdown

return oriented programming sounds like my backwards pass. Huh.

Digital forensics

radare2, a binary analysis thingo. rax2 is useful for conversion of hex

binary ninja






Maybe we should get a docker of all sorts of tools. Kali Linux?

klee, afl, other fuzzers? valgrind



ROP - solve substitution cyphers john the ripper. Brute force password cracker


Best CTFs. I probably don’t want the most prestigious ones? They’ll be too high level? I want the simple stuff - check out the heap exploitation github thing


metasploit, pacu - aws, cobalt strike

and the pwn category of ctf

ROP JOP SROP BOP - block oriented

return 2 libc - a subset of rop?

ryan chapman syscall

privilege escalation - getuid effective id.. Inherit user and group from parent process. switching to user resets the setuid bit. sticky bits id command

shellcode - binary code that launches a shell via the system call execve("/bin/sh", NULL, NULL) - args and env params

intel vs at&t syntax Load up addresses constantrs in binary with .string gcc -static -nostdlib objcopy –section .text=outfile exiting cleanly is smart. Helps know what is screwing up ldd

Trying out shellcode mmap. mprotect? read() deref function pointer

gdb x for eXamine $rsi x/5i $rip gives assembly? x/gx break *0xx040404 n next s step ni si

strace is useful first debugging


system calls set rax to syscall number. call syscall instruction man yada strace

  • fork
  • execve
  • read
  • write
  • wait
  • brk - program brk. change size of data segment. sbrk by increments. sbrk(0) returns current address of break

stack: rbp, rsp. The stack grows down (decreasing addresses). rsp + 0x8 is on the stack, rbp - 8 is on the stack. Most systems are little endian.
calling conventions: arguments in rdi rsi rdx rcx r8 r9, return value in rax. rbx rbp r12 r13 r14 r15 are callee saved, guaranteed not smashed.
opcode listing - assembly repl

binary files

file - tells info about file
elf - interpreter, 
 - sections - text; plt/got to resolve and dispatch library calls; data for preinitialized data; rodata for global read-only data; bss for uninitialized data. sections are not required to run a binary
 - symbols - 
- segments - where to load

readelf, objdump, nm - reads symbols, patchelf, objcopy, strip, kaitai struct

process loading: what to load. Look for #! or the ELF magic. /proc/sys/fs/binfmt_misc can match a string there. Hand off to the ELF-defined interpreter if the binary is dynamically linked.
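The first step of that decision can be mimicked by looking at the first few bytes of the file. A small sketch that classifies a header as a `#!` script or an ELF file and reads the class and endianness bytes from `e_ident`; the function name and return strings are made up for the example.

```python
def classify_executable(header: bytes) -> str:
    """Classify a file by its leading bytes, roughly as the kernel's first check does."""
    if header.startswith(b"#!"):
        # the interpreter line runs up to the first newline
        interp = header[2:].split(b"\n", 1)[0].strip().decode(errors="replace")
        return f"script, interpreter: {interp}"
    if header.startswith(b"\x7fELF"):
        ei_class = {b"\x01": "32-bit", b"\x02": "64-bit"}.get(header[4:5], "unknown class")
        ei_data = {b"\x01": "little-endian", b"\x02": "big-endian"}.get(header[5:6], "unknown endianness")
        return f"ELF, {ei_class}, {ei_data}"
    return "unknown"

print(classify_executable(b"#!/bin/sh\necho hi\n"))
print(classify_executable(b"\x7fELF\x02\x01\x01" + bytes(9)))
```

To try it on a real file, pass in `open(path, "rb").read(64)`.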

Then it’s onto ld probably. Library search: LD_PRELOAD, LD_LIBRARY_PATH, DT_RUNPATH in the binary file, system wide /etc/, /lib and /usr/lib. Relocations get updated. /proc/self/maps. libc is almost always linked: printf, scanf, socket, atoi, malloc, free





attaching to gdb and/or a process is really useful. cyclic bytes can let you localize what ends up where in a buffer overflow for example cyclic_find
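The cyclic pattern trick works because the pattern is a de Bruijn sequence: every 4-byte window occurs exactly once, so the bytes that land in a crashed register identify their own offset. A pure-Python sketch of the idea, assuming a lowercase alphabet and 4-byte windows; it mirrors what pwntools' `cyclic` and `cyclic_find` do but is not their implementation.

```python
import string
from functools import lru_cache

@lru_cache(maxsize=None)
def de_bruijn(alphabet: bytes, n: int) -> bytes:
    """Standard Lyndon-word construction of a de Bruijn sequence B(k, n)."""
    k = len(alphabet)
    a = [0] * (k * n)
    seq = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return bytes(alphabet[i] for i in seq)

ALPHABET = string.ascii_lowercase.encode()

def cyclic(length: int, n: int = 4) -> bytes:
    """First length bytes of the pattern."""
    return de_bruijn(ALPHABET, n)[:length]

def cyclic_find(window: bytes, n: int = 4) -> int:
    """Offset of an n-byte window inside the pattern, or -1 if it is not there."""
    return de_bruijn(ALPHABET, n).find(window[:n])

pattern = cyclic(300)
print(pattern[:16])
print(cyclic_find(pattern[264:268]))  # recovers the offset of those 4 bytes
```

In practice the window comes from a register dump in gdb, and on a little-endian machine the register value has to be converted back to memory byte order before the lookup.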

Examples from