Verified Assembly 2: Memory, RISC-V, Cuts for Invariants, and Ghost Code
I’ve been hitting a good stride moving forward on this, which is good because I’ve been pretty mentally constipated on doing anything that I can even pretend is useful.
The idea is to make an as unopinionated as possible assembly verification system. By adding annotations inline to an assembly files, which remains GAS assemblable https://www.philipzucker.com/assembly_verify/ we can get something like a Dafny or Frama-C experience.
Rather than going all Hoare or weakest precondition or whatever, verification is done via symbolic execution with a twist. I hope that this directness makes it more palatable and understandable to an audience that is only mildly interested in formal. Each piece of annotation is there because I literally quite directly need it. I’ve tried to stay unopinionated (HA!) and have not invented any new syntax, reusing SMTLIB to describe properties. I’ve avoided making syntactic niceties like inferring labels, keeping it as clunky and spartan as assembly itself.
I hope this format can be a target for both manual assembly and for compiler writers to output more info. The info should go in the binary metadata. https://www.philipzucker.com/dwarf-patching/
This week,
- I turned on being able to talk about memory.
- I completely rearrange the symbolic executor so that is emits verification conditions rather than solves them on the spot.
kd_cut
gives us invariants.kd_prelude
let’s you inject smtlibdefine-fun
,define-const
,declare-const
anddefine-fun-rec
annotations which may help you write specs.- Did light rearrangement to support other archs like riscv.
Memory
Not too much to say here. The only thing I needed to do is inject a special variable called ram
into the substitute
function that turns user facing names into their internal pcode equivalents at that program state. The pcode execution model uniformly represents machine state as byte arrays, including the registers.
%%file /tmp/stack.s
.include "/tmp/knuckle.s"
.global _start
kd_entry _start " true"
movq $42, (%rsp)
kd_exit _start_end "(= (select ram RSP) (_ bv42 8))"
ret
Overwriting /tmp/stack.s
import kdrag.contrib.pcode as pcode
from kdrag.contrib.pcode.asmspec import assemble_and_check
assemble_and_check("/tmp/stack.s").successes
stmts []
Executing 0x400000/8: MOV qword ptr [RSP],0x2a at (4194304, 0) PCODE IMARK ram[400000:8]
Executing 0x400000/8: MOV qword ptr [RSP],0x2a at (4194304, 1) PCODE unique[6e80:8] = 0x2a
Executing 0x400000/8: MOV qword ptr [RSP],0x2a at (4194304, 2) PCODE *[ram]RSP = unique[6e80:8]
stmts [Exit(label='_start_end', addr=4194312, expr=ram[RSP] == 42)]
finish Exit(label='_start_end', addr=4194312, expr=ram[RSP] == 42) 0x400008
["[✅] VC(Entry(label='_start', addr=4194304, expr=True), ['0x400000'], Exit(label='_start_end', addr=4194312, expr=ram[RSP] == 42), {})"]
https://github.com/philzook58/knuckledragger/blob/0a039c4539e6dba2b03c79e1e995d14d9e081977/kdrag/contrib/pcode/__init__.py#L544 This is the substitution routine. Only substituting for constants that actually appear in the expressions was a very large optimization in terms of construction time of properties
def substitute(self, memstate: MemState, expr: smt.ExprRef) -> smt.ExprRef:
"""
Substitute the values in memstate into the expression `expr`.
`expr` uses the short register names and `ram`
"""
# Much faster typically to pull only those constants that are used in the expression
consts = {t.decl().name(): t for t in kd.utils.consts(expr)}
substs = [
(t, self.get_reg(memstate, regname))
for regname, t in consts.items()
if regname in self.ctx.registers
]
if "ram" in consts:
substs.append((smt.Array("ram", BV64, BV8), memstate.mem.ram))
if "ram64" in consts:
addr = smt.BitVec("addr", 64)
substs.append(
(
smt.Array("ram64", BV64, BV64),
smt.Lambda([addr], memstate.getvalue_ram(addr, 8)),
)
)
return smt.substitute(expr, *substs)
Pcode uniformly represents everything as byte addressed memory. It is very useful to be able to pull out 64 and 32 bits at a time. This was a serious ergonomics problem in cbat https://github.com/draperlaboratory/cbat_tools/blob/master/wp/resources/sample_binaries/loop_invariant/break/loop_invariant.smt where we often manually concated bitvectors in the smtlib. I fixed it I think by adding special constants ram64
which becomes a lambda which pulls out 8 bytes at once and concats them. I’ll also add ram16
and ram32
.
%%file /tmp/mem.s
.include "/tmp/knuckle.s"
.global _start
kd_entry _start "true"
movq $12345678, (%rsp)
kd_exit _start_end "(= (select ram64 RSP) (_ bv12345678 64))"
ret
Writing /tmp/mem.s
from kdrag.contrib.pcode.asmspec import assemble_and_check
assemble_and_check("/tmp/mem.s").successes
stmts []
Executing 0x400000/8: MOV qword ptr [RSP],0xbc614e at (4194304, 0) PCODE IMARK ram[400000:8]
Executing 0x400000/8: MOV qword ptr [RSP],0xbc614e at (4194304, 1) PCODE unique[6e80:8] = 0xbc614e
Executing 0x400000/8: MOV qword ptr [RSP],0xbc614e at (4194304, 2) PCODE *[ram]RSP = unique[6e80:8]
stmts [Exit(label='_start_end', addr=4194312, expr=ram64[RSP] == 12345678)]
finish Exit(label='_start_end', addr=4194312, expr=ram64[RSP] == 12345678) 0x400008
["[✅] VC(Entry(label='_start', addr=4194304, expr=True), ['0x400000'], Exit(label='_start_end', addr=4194312, expr=ram64[RSP] == 12345678), {})"]
Rearranging the SymExec
I had sort of let myself build my symexec step by step, following my nose. I realized that I wanted a couple things to change. First off, I wanted to kind of “array of structure” the thing into lists of spec statements indexed by address. This is because I want them ordered, knowing the order with which to execute assumes, assigns, asserts, etc. This is more natural than having their order undefined.
The other stuff I did was start to record all the information about what happened during execution for easier debugging later.
Another important change was to switch from eagerly discharging verification conditions to emitting a verification condition structure. This enables interactive verification using the general knuckledragger
The core routine now looks like this https://github.com/philzook58/knuckledragger/blob/0a039c4539e6dba2b03c79e1e995d14d9e081977/kdrag/contrib/pcode/asmspec.py#L301. It alternates between executing spec statements and actual assembly instructions. The spec statements are more or less attached to the assembly statement they precede (akin to how labels are attached). Initializing the trace starts is a bit awkward because of that, since i only want to run spec statements that are after the entry or cut that starts the trace. I dunno. I couldn’t figure out how to make it cleaner than this.
def run_all_paths(
ctx: pcode.BinaryContext, spec: AsmSpec, mem=None, verbose=True
) -> list[VerificationCondition]:
if mem is None:
mem = pcode.MemState.Const("mem")
todo = []
vcs = []
# Initialize executions out of entry points and cuts
for addr, specstmts in spec.addrmap.items():
for n, stmt in enumerate(specstmts):
if isinstance(stmt, Cut) or isinstance(stmt, Entry):
precond = ctx.substitute(mem, stmt.expr)
assert isinstance(precond, smt.BoolRef)
tracestate = TraceState(
start=stmt,
trace=[],
state=pcode.SimState(mem, (addr, 0), [precond]),
)
tracestate, new_vcs = execute_spec_stmts(
specstmts[n + 1 :], tracestate, ctx
)
vcs.extend(new_vcs)
if tracestate is not None:
todo.extend(execute_insn(tracestate, ctx, verbose=verbose))
# Execute pre specstmts and instructions
while todo:
tracestate = todo.pop()
addr = tracestate.state.pc[0]
specstmts = spec.addrmap.get(addr, [])
tracestate, new_vcs = execute_spec_stmts(specstmts, tracestate, ctx)
vcs.extend(new_vcs)
if tracestate is not None:
todo.extend(execute_insn(tracestate, ctx, verbose=verbose))
return vcs
Ghost State
A thing that verifiers often need is ghost state https://hal.science/hal-01396864v1 , stuff that isn’t real code that needs to execute, but is necessary for verification. It may need to track extra information that is not being retained in the real state (such as function arguments in rdi/rsi before they get clobbered)
I did this by adding a kd_assign
statement and a ghost_env
into my symbolic executor. The ghost code is obviously is not real code (having actual runtime meaning) because it’s not assembly, it’s in smtlib. I do not ever translate SMTLIB to assembly, although the idea is intriguing.
kd_prelude
is a feature to be able to add declarations. Constants need to be declared before z3 can parse them.
%%file /tmp/assign.s
.include "/tmp/knuckle.s"
.globl myfunc
kd_prelude "(declare-const mytemp (_ BitVec 64))"
.text
kd_entry myfunc "true"
movq $42, %rax
kd_assign mylabel mytemp "(bvadd RAX (_ bv1 64))"
kd_exit func_end "(= mytemp (_ bv43 64))"
ret
Writing /tmp/assign.s
from kdrag.contrib.pcode.asmspec import assemble_and_check
assemble_and_check("/tmp/assign.s").successes
stmts []
Executing 0x400000/7: MOV RAX,0x2a at (4194304, 0) PCODE IMARK ram[400000:7]
Executing 0x400000/7: MOV RAX,0x2a at (4194304, 1) PCODE RAX = 0x2a
stmts [Assign(label='mylabel', addr=4194311, name='mytemp', expr=RAX + 1), Exit(label='func_end', addr=4194311, expr=mytemp == 43)]
assign addr 0x400007
{}
0x400007
RAX + 1
{'mytemp': 42 + 1}
finish Exit(label='func_end', addr=4194311, expr=mytemp == 43) 0x400007
["[✅] VC(Entry(label='myfunc', addr=4194304, expr=True), ['0x400000'], Exit(label='func_end', addr=4194311, expr=mytemp == 43), {'mytemp': 42 + 1})"]
Cuts
Maybe the most interesting but simple idea that I don’t recall seeing before is using “cuts” as a way of thinking about invariants. It is obvious, so it’s probably out there.
Cuts are annotations that cut the control flow graph. They both start a symbolic execution with an assumption and stop them with an assertion with the same expression. If your CFG is cut to be acyclic, there will only be finite number of trace fragments. The cut annotations in this way can be proven to be weakly self consistent. The weakness is that I do not currently prove anything about termination, which is necessary for true functional correctness.
%%file /tmp/cutloop.s
.include "/tmp/knuckle.s"
.global _start
kd_entry _start "true"
movq $42, %rdi
kd_cut mycut "(= RDI (_ bv42 64))"
jmp mycut
Writing /tmp/cutloop.s
assemble_and_check("/tmp/cutloop.s").successes
stmts []
Executing 0x400000/7: MOV RDI,0x2a at (4194304, 0) PCODE IMARK ram[400000:7]
Executing 0x400000/7: MOV RDI,0x2a at (4194304, 1) PCODE RDI = 0x2a
stmts []
Executing 0x400007/2: JMP 0x400007 at (4194311, 0) PCODE IMARK ram[400007:2]
Executing 0x400007/2: JMP 0x400007 at (4194311, 1) PCODE goto ram[400007:8]
stmts [Cut(label='mycut', addr=4194311, expr=RDI == 42)]
finish Cut(label='mycut', addr=4194311, expr=RDI == 42) 0x400007
stmts [Cut(label='mycut', addr=4194311, expr=RDI == 42)]
finish Cut(label='mycut', addr=4194311, expr=RDI == 42) 0x400007
["[✅] VC(Cut(label='mycut', addr=4194311, expr=RDI == 42), ['0x400007'], Cut(label='mycut', addr=4194311, expr=RDI == 42), {})",
"[✅] VC(Entry(label='_start', addr=4194304, expr=True), ['0x400000'], Cut(label='mycut', addr=4194311, expr=RDI == 42), {})"]
Risc-V
A huge advantage of using pcode is that other architectures are easy. It’s all lifted to a uniform representation. Currently I have to give it the Pcode langid and the appropriate assembler for the architecture, but maybe I could infer this in the future.
%%file /tmp/mov42.s
.include "/tmp/knuckle.s"
.text
.globl myfunc
kd_entry myfunc "true"
li a0, 42
kd_exit myfunc_end "(= a0 (_ bv42 64))"
ret
Writing /tmp/mov42.s
assemble_and_check("/tmp/mov42.s", langid="RISCV:LE:64:default", as_bin="riscv64-linux-gnu-as").successes
stmts []
Executing 0x400000/4: li a0,0x2a at (4194304, 0) PCODE IMARK ram[400000:4]
Executing 0x400000/4: li a0,0x2a at (4194304, 1) PCODE unique[780:8] = 0x2a
Executing 0x400000/4: li a0,0x2a at (4194304, 2) PCODE a0 = unique[780:8]
stmts [Exit(label='myfunc_end', addr=4194308, expr=a0 == 42)]
finish Exit(label='myfunc_end', addr=4194308, expr=a0 == 42) 0x400004
["[✅] VC(Entry(label='myfunc', addr=4194304, expr=True), ['0x400000'], Exit(label='myfunc_end', addr=4194308, expr=a0 == 42), {})"]
Pypcode ships with a bunch. avr8 for arduino stuff, ebpf, 6502 for NES, etc could make for fun posts. I think there are even more out there like a wasm https://github.com/nneonneo/ghidra-wasm-plugin . Not so sure how I’d load these into pypcode though.
! python3 -m pypcode --list
6502:LE:16:default - 6502 Microcontroller Family
65C02:LE:16:default - 65C02 Microcontroller Family
68000:BE:32:Coldfire - Motorola 32-bit Coldfire
68000:BE:32:MC68020 - Motorola 32-bit 68020
68000:BE:32:MC68030 - Motorola 32-bit 68030
68000:BE:32:default - Motorola 32-bit 68040
6805:BE:16:default - 6805 Microcontroller Family
6809:BE:16:default - 6809 Microprocessor
80251:BE:24:default - 80251 Microcontroller Family
80390:BE:24:default - 80390 in flat mode
8048:LE:16:default - 8048 Microcontroller Family
8051:BE:16:default - 8051 Microcontroller Family
8051:BE:24:mx51 - NXP/Phillips MX51
8085:LE:16:default - Intel 8085
AARCH64:BE:32:ilp32 - Generic ARM64 v8.5-A LE instructions, BE data, ilp32
AARCH64:BE:64:v8A - Generic ARM64 v8.5-A LE instructions, BE data, missing some 8.5 vector
AARCH64:LE:32:ilp32 - Generic ARM64 v8.5-A LE instructions, LE data, ilp32
AARCH64:LE:64:AppleSilicon - AppleSilicon ARM v8.5-A LE instructions, LE data, AMX extensions
AARCH64:LE:64:v8A - Generic ARM64 v8.5-A LE instructions, LE data, missing some 8.5 vector
ARM:BE:32:Cortex - ARM Cortex / Thumb big endian
ARM:BE:32:v4 - Generic ARM v4 big endian
ARM:BE:32:v4t - Generic ARM/Thumb v4 big endian (T-variant)
ARM:BE:32:v5 - Generic ARM v5 big endian
ARM:BE:32:v5t - Generic ARM/Thumb v5 big endian (T-variant)
ARM:BE:32:v6 - Generic ARM/Thumb v6 big endian
ARM:BE:32:v7 - Generic ARM/Thumb v7 big endian
ARM:BE:32:v8 - Generic ARM/Thumb v8 big endian
ARM:BE:32:v8-m - ARM Cortex v8-m big endian
ARM:BE:32:v8T - Generic ARM/Thumb v8 big endian (Thumb is default)
ARM:LE:32:Cortex - ARM Cortex / Thumb little endian
ARM:LE:32:v4 - Generic ARM v4 little endian
ARM:LE:32:v4t - Generic ARM/Thumb v4 little endian (T-variant)
ARM:LE:32:v5 - Generic ARM v5 little endian
ARM:LE:32:v5t - Generic ARM/Thumb v5 little endian (T-variant)
ARM:LE:32:v6 - Generic ARM/Thumb v6 little endian
ARM:LE:32:v7 - Generic ARM/Thumb v7 little endian
ARM:LE:32:v8 - Generic ARM/Thumb v8 little endian
ARM:LE:32:v8-m - ARM Cortex v8-m little endian
ARM:LE:32:v8T - Generic ARM/Thumb v8 little endian (Thumb is default)
ARM:LEBE:32:v7LEInstruction - Generic ARM/Thumb v7 little endian instructions and big endian data
ARM:LEBE:32:v8LEInstruction - Generic ARM/Thumb v8 little endian instructions and big endian data
BPF:LE:32:default - BPF processor 32-bit little-endian
CP1600:BE:16:default - General Instruments CP1600
CR16C:LE:16:default - National Semiconductor's CompactRISC CR16C little endian
DATA:BE:64:default - Raw Data File (Big Endian)
DATA:LE:64:default - Raw Data File (Little Endian)
Dalvik:LE:32:DEX_Android10 - Dalvik DEX Android10
Dalvik:LE:32:DEX_Android11 - Dalvik DEX Android11
Dalvik:LE:32:DEX_Android12 - Dalvik DEX Android12
Dalvik:LE:32:DEX_Android13 - Dalvik DEX Android13
Dalvik:LE:32:DEX_Base - Dalvik Base
Dalvik:LE:32:DEX_KitKat - Dalvik DEX KitKat
Dalvik:LE:32:DEX_Lollipop - Dalvik DEX Lollipop
Dalvik:LE:32:DEX_Nougat - Dalvik DEX Nougat
Dalvik:LE:32:DEX_Oreo - Dalvik DEX Oreo
Dalvik:LE:32:DEX_Pie - Dalvik DEX Pie
Dalvik:LE:32:Marshmallow - Dalvik DEX Marshmallow
Dalvik:LE:32:ODEX_KitKat - Dalvik ODEX KitKat
Dalvik:LE:32:default - Dalvik Base
H6309:BE:16:default - Hitachi 6309 Microprocessor, extension of 6809, 6309 addressing modes, missing many instructions
HC-12:BE:16:default - HC12 Microcontroller Family
HC05:BE:16:M68HC05TB - HC05 (6805) Microcontroller Family - M68HC05TB
HC05:BE:16:default - HC05 (6805) Microcontroller Family
HC08:BE:16:MC68HC908QY4 - HC08 Microcontroller Family - MC68HC908QY4
HC08:BE:16:default - HC08 Microcontroller Family
HCS-12:BE:24:default - HCS12 Microcontroller Family
HCS-12X:BE:24:default - HCS12X Microcontroller Family
HCS08:BE:16:MC9S08GB60 - HCS08 Microcontroller Family - MC9S08GB60
HCS08:BE:16:default - HCS08 Microcontroller Family
HCS12:BE:24:default - HCS12X Microcontroller Family
JVM:BE:32:default - Generic JVM
Loongarch:LE:32:ilp32d - Loongson 3 32-bit with 64-bit FP
Loongarch:LE:32:ilp32f - Loongson 3 32-bit with 32-bit FP
Loongarch:LE:64:lp64d - Loongson 3 64-bit with 64-bit FP
Loongarch:LE:64:lp64f - Loongson 3 64-bit with 32-bit FP
M16C/60:LE:16:default - Renesas M16C/60 16-Bit MicroComputer
M16C/80:LE:16:default - Renesas M16C/80 16-Bit MicroComputer
M8C:BE:16:default - Cypress M8C Microcontroller Family
MCS96:LE:16:default - Intel MCS-96 Microcontroller Family
MIPS:BE:32:R6 - MIPS32 Release-6 32-bit addresses, big endian, with microMIPS
MIPS:BE:32:default - MIPS32 32-bit addresses, big endian, with mips16e
MIPS:BE:32:micro - MIPS32 32-bit addresses, big endian, with microMIPS
MIPS:BE:64:64-32R6addr - MIPS64 Release-6 big endian with 32 bit addressing and microMIPS
MIPS:BE:64:64-32addr - MIPS64 32-bit addresses, big endian, with mips16e
MIPS:BE:64:R6 - MIPS64 Release-6 64-bit addresses, big endian, with microMIPS
MIPS:BE:64:default - MIPS64 64-bit addresses, big endian, with mips16e
MIPS:BE:64:micro - MIPS64 64-bit addresses, big endian, with microMIPS
MIPS:BE:64:micro64-32addr - MIPS64 32-bit addresses, big endian, with microMIPS
MIPS:LE:32:R6 - MIPS32 Release-6 32-bit addresses, little endian, with microMIPS
MIPS:LE:32:default - MIPS32 32-bit addresses, little endian, with mips16e
MIPS:LE:32:micro - MIPS32 32-bit addresses, little endian, with microMIPS
MIPS:LE:64:64-32R6addr - MIPS64 Release-6 with 32-bit addresses, little endian, with microMIPS
MIPS:LE:64:64-32addr - MIPS64 32-bit addresses, little endian, with mips16e
MIPS:LE:64:R6 - MIPS64 Release-6 64-bit addresses, little endian, with microMIPS
MIPS:LE:64:default - MIPS64 64-bit addreses, little endian, with mips16e
MIPS:LE:64:micro - MIPS64 64-bit addresses, little endian, with microMIPS
MIPS:LE:64:micro64-32addr - MIPS64 32-bit addresses, little endian, with microMIPS
PIC-12:LE:16:PIC-12C5xx - PIC-12C5xx
PIC-16:LE:16:PIC-16 - PIC-16(C,CR)XXX
PIC-16:LE:16:PIC-16C5x - PIC-16C5x
PIC-16:LE:16:PIC-16F - PIC-16F(L)XXX
PIC-17:LE:16:PIC-17C7xx - PIC-17C7xx
PIC-18:LE:24:PIC-18 - PIC-18
PIC-24E:LE:24:default - PIC-24E
PIC-24F:LE:24:default - PIC-24F
PIC-24H:LE:24:default - PIC-24H
PowerPC:BE:32:4xx - PowerPC 4xx 32-bit big endian embedded core
PowerPC:BE:32:MPC8270 - Freescale MPC8280 32-bit big endian family (PowerQUICC-III)
PowerPC:BE:32:QUICC - PowerQUICC-III 32-bit big endian family
PowerPC:BE:32:default - PowerPC 32-bit big endian w/Altivec, G2
PowerPC:BE:32:e200 - Power ISA e200 32-bit big-endian embedded core w/VLE
PowerPC:BE:32:e500 - PowerQUICC-III e500 32-bit big-endian family
PowerPC:BE:32:e500mc - PowerQUICC-III e500mc 32-bit big-endian family
PowerPC:BE:64:64-32addr - PowerPC 64-bit big endian w/Altivec and 32 bit addressing, G2
PowerPC:BE:64:A2-32addr - Power ISA 3.0 Big Endian w/EVX and 32-bit Addressing
PowerPC:BE:64:A2ALT - Power ISA 3.0 Big Endian w/Altivec
PowerPC:BE:64:A2ALT-32addr - Power ISA 3.0 Big Endian w/Altivec and 32-bit Addressing
PowerPC:BE:64:VLE-32addr - Power ISA 3.0 Big Endian w/VLE, EVX and 32-bit Addressing
PowerPC:BE:64:VLEALT-32addr - Power ISA 3.0 Big Endian w/VLE, Altivec and 32-bit Addressing
PowerPC:BE:64:default - PowerPC 64-bit big endian w/Altivec, G2
PowerPC:LE:32:4xx - PowerPC 4xx 32-bit little endian embedded core
PowerPC:LE:32:QUICC - PowerQUICC-III 32-bit little endian family
PowerPC:LE:32:default - PowerPC 32-bit little endian w/Altivec, G2
PowerPC:LE:32:e500 - PowerQUICC-III e500 32-bit little-endian family
PowerPC:LE:32:e500mc - PowerQUICC-III e500mc 32-bit little-endian family
PowerPC:LE:64:64-32addr - PowerPC 64-bit little endian w/Altivec and 32 bit addressing, G2
PowerPC:LE:64:A2-32addr - Power ISA 3.0 Little Endian w/EVX and 32-bit Addressing
PowerPC:LE:64:A2ALT - Power ISA 3.0 Little Endian w/Altivec
PowerPC:LE:64:A2ALT-32addr - Power ISA 3.0 Little Endian w/Altivec and 32-bit Addressing
PowerPC:LE:64:default - PowerPC 64-bit little endian w/Altivec, G2
RISCV:LE:32:RV32G - RISC-V 32 little general purpose
RISCV:LE:32:RV32GC - RISC-V 32 little general purpose compressed
RISCV:LE:32:RV32I - RISC-V 32 little base
RISCV:LE:32:RV32IC - RISC-V 32 little base compressed
RISCV:LE:32:RV32IMC - RISC-V 32 little base compressed
RISCV:LE:32:default - RISC-V 32 little default
RISCV:LE:64:RV64G - RISC-V 64 little general purpose
RISCV:LE:64:RV64GC - RISC-V 64 little general purpose compressed
RISCV:LE:64:RV64I - RISC-V 64 little base
RISCV:LE:64:RV64IC - RISC-V 64 little base compressed
RISCV:LE:64:default - RISC-V 32 little default
SuperH4:BE:32:default - SuperH-4(a) (SH4) big endian
SuperH4:LE:32:default - SuperH-4(a) (SH4) little endian
SuperH:BE:32:SH-1 - SuperH SH-1 processor 32-bit big-endian
SuperH:BE:32:SH-2 - SuperH SH-2 processor 32-bit big-endian
SuperH:BE:32:SH-2A - SuperH SH-2A processor 32-bit big-endian
TI_MSP430:LE:16:default - TI MSP430 16-Bit MicroController
TI_MSP430X:LE:32:default - TI MSP430X 20-Bit MicroController
Toy:BE:32:builder - Toy (test-builder) processor 32-bit big-endian
Toy:BE:32:builder.align2 - Toy (test-builder) processor 32-bit big-endian word-aligned
Toy:BE:32:default - Toy (test) processor 32-bit big-endian
Toy:BE:32:posStack - Toy (test) processor 32-bit big-endian
Toy:BE:32:wordSize2 - Toy (test) processor 32-bit big-endian (wordsize=2)
Toy:BE:64:default - Toy (test) processor 64-bit big-endian
Toy:BE:64:harvard - Toy (test) processor 64-bit big-endian Harvard
Toy:LE:32:builder - Toy (test-builder) processor 32-bit little-endian
Toy:LE:32:builder.align2 - Toy (test-builder) processor 32-bit little-endian word-aligned
Toy:LE:32:default - Toy (test) processor 32-bit little-endian
Toy:LE:32:wordSize2 - Toy (test) processor 32-bit little-endian (wordsize=2)
Toy:LE:64:default - Toy (test) processor 64-bit little-endian
V850:LE:32:default - Renesas V850 family
Xtensa:BE:32:default - Tensilica Xtensa 32-bit big-endian
Xtensa:LE:32:default - Tensilica Xtensa 32-bit little-endian
avr32:BE:32:default - Generic AVR32-A big-endian
avr8:LE:16:atmega256 - AVR8 for an Atmega 256
avr8:LE:16:default - AVR8 with 16-bit word addressable code space
avr8:LE:16:extended - AVR8 with 22-bit word addressable with EIND code space
avr8:LE:24:xmega - AVR8 for an Xmega
dsPIC30F:LE:24:default - dsPIC30F
dsPIC33C:LE:24:default - dsPIC33C
dsPIC33E:LE:24:default - dsPIC33E
dsPIC33F:LE:24:default - dsPIC33F
eBPF:LE:64:default - eBPF processor 64-bit little-endian
pa-risc:BE:32:default - Generic PA-RISC 32-bit big endian
sparc:BE:32:default - Sparc V9 32-bit
sparc:BE:64:default - Sparc V9 64-bit
tricore:LE:32:default - Siemens Tricore Embedded Processor
tricore:LE:32:tc172x - Siemens Tricore Embedded Processor TC1724/TC1728
tricore:LE:32:tc176x - Siemens Tricore Embedded Processor TC1762/TC1766
tricore:LE:32:tc29x - Siemens Tricore Embedded Processor TC29x
x86:LE:16:Protected Mode - Intel/AMD 16-bit x86 Protected Mode
x86:LE:16:Real Mode - Intel/AMD 16-bit x86 Real Mode
x86:LE:32:System Management Mode - Intel/AMD 32-bit x86 System Management Mode
x86:LE:32:default - Intel/AMD 32-bit x86
x86:LE:64:compat32 - Intel/AMD 64-bit x86 in 32-bit compatibility mode (long mode off)
x86:LE:64:default - Intel/AMD 64-bit x86
z180:LE:16:default - Zilog Z180
z182:LE:16:default - Zilog Z182
z80:LE:16:default - Zilog Z80
z8401x:LE:16:default - Zilog Z8401x (IPC) microcontroller
Bits and Bobbles
TODO:
- better countermodel presentation
- 32 bit arch
- kd_load for constants that need to be loaded into ram
- Make CLI.
- objcopy into biary data
- kd_always
- Checking memory reads and writes aren’t really state functions. Hmm.
- equivalence checking
- bigger more interesting examples. simd?
- Still unclear if hitting Entry should be something like hitting a sealed precondition (whether entry is also a trace killer). I’d need to track which entry can reach which exits to make this work.
It would be interesting to make a rust version that uses cbmc stuff as solver backend. A lot of work for questionable gains.
- Ghost state is now available via
kd_assign
. This let’s you store information that may be clobbered in the actual state.
getvalue RAX and simplify. before subsitution. Oh I already dio that… Hmm.
Maybe if I set init_RAX Take the assert out of the smtlib statements. It’s weird to have it in an assume Versions that don’t need conditions?
exit and entry are kind of subsumed. No but they aren’t if I want
If I ever jump to something marked as entry, don’t do it. Just use entry condition, chaos, and receive exit condition. Or just run, but also receive exit condition?
Faster checks for true / false. DOn’t bother doing a whole thing… right?
Mark out the pieces of memory that need to be loaded into memory for this to work.
kd_load foo: .ascii “hello world” kd_load
should memstate have a pointer to a context? Maybe.
Also initialize all symbols that point to data to at least… eh.
mem —> Lambda([x], memyada)
- fixup to work on riscv and 32 bit.
- nicer error output. Negative examples
- Retain initial ghost variables. kd_ghost? kd_let ?
- Countermodels interpret to be more readable. Give path through program, state at beginning and end.
- Use abstractions so the intermidate symbolic states aren’t quite so unreadable. Or initialize mem to RAX0. Too slow though. Hmm.
- Hoarize. I’ve had so many bugs this principled approach might be pretty useful.
- prearranged loaded data. Mem does not currently reflect the actual contents of the loading. kd_load ?
- pre and post conditions on calls? if jump into address already.
- objcopy assertions into section
- Pro / Con of WP?
- kd_always for constantly checked invariants
- A whitelist of jump points. Only allow labelled positions?
- Regular execution or GDB sessions. All the cbat features.
- failsafe. If I grep kd_ in the line, but don’t recognize it, I should complain. Multiline smtlib would be nice too.
- The stdlib of helpers.
- speed
- Tracking seen addresses. We may want to track if we never hit a label that has an annotation on it, as this is suspicious.
- it could make sense to slot in AFL or angr or Symqemu or whatever instead of my system, checking the same properties. I have designed my semantics to be moderately sound
- ram8; ram32 = Lambda([addr], selectconcat32(ram8, addr)) ram32 ram64
- comparative checking between two assembly programs
done:
- memory. Gotta inject a
mem
variable. - Gotta test that stuff I put in the library
- cuts. Invariants are cuts of cfg into a dag. Distinguishing backedges from forward edges
- We should probably do asserts, assumes, assigns in program text order if multiple at single address.
- store history just to make debugging easier.
cbmc / ebmc options to mimic? seelct entry points to test. select asssertions to test bounded mode’
flag for eager VC or timeout param.
Grab just variables in highlevel assertion to present countermodel.
Hmm it would actually be easy to add a kd_label and let it persist through the parsing loop… Or just parse labels. There has to have been a label in the contiguous previous lines
call tool asmc
, the “language asmspec
%%file /tmp/knuckle.s
#precondition
.macro kd_entry label smt_expr
\label :
.endm
.macro kd_assert label smt_expr
\label :
.endm
.macro kd_assume label smt_expr
\label :
.endm
#postcondition
.macro kd_exit label smt_expr
\label :
.endm
#invariant
.macro kd_cut label smt_expr
\label :
.endm
.macro kd_always smt_expr
.endm
Overwriting /tmp/knuckle.s
%%file /tmp/stack.s
.include "/tmp/knuckle.s"
.global _start
kd_entry _start " true"
movq $42, (%rsp)
kd_exit _start_end "(= (select ram RSP) (_ bv42 8))"
ret
Overwriting /tmp/stack.s
from kdrag.all import *
import kdrag.contrib.pcode as pcode
from kdrag.contrib.pcode.asmspec import assemble_and_check
assemble_and_check("/tmp/stack.s")
---------------------------------------------------------------------------
Z3Exception Traceback (most recent call last)
Cell In[1], line 4
2 import kdrag.contrib.pcode as pcode
3 from kdrag.contrib.pcode.asmspec import assemble_and_check
----> 4 assemble_and_check("/tmp/stack.s")
File ~/Documents/python/knuckledragger/kdrag/contrib/pcode/asmspec.py:252, in assemble_and_check(filename, langid, as_bin)
250 subprocess.run([as_bin, filename, "-o", "/tmp/kdrag_temp.o"], check=True)
251 ctx = pcode.BinaryContext("/tmp/kdrag_temp.o", langid=langid)
--> 252 spec = AsmSpec.of_file(filename, ctx)
253 print(spec)
254 return run_all_paths(ctx, spec)
File ~/Documents/python/knuckledragger/kdrag/contrib/pcode/asmspec.py:100, in AsmSpec.of_file(cls, filename, ctx)
98 addr = sym.rebased_addr
99 smt_string = "\n".join(preludes + ["(assert ", expr, ")"])
--> 100 expr = smt.parse_smt2_string(smt_string, decls=decls)[0]
101 if cmd == "kd_entry":
102 spec.entries[addr].append((label, expr))
File ~/philzook58.github.io/.venv/lib/python3.12/site-packages/z3/z3.py:9498, in parse_smt2_string(s, sorts, decls, ctx)
9496 ssz, snames, ssorts = _dict2sarray(sorts, ctx)
9497 dsz, dnames, ddecls = _dict2darray(decls, ctx)
-> 9498 return AstVector(Z3_parse_smtlib2_string(ctx.ref(), s, ssz, snames, ssorts, dsz, dnames, ddecls), ctx)
File ~/philzook58.github.io/.venv/lib/python3.12/site-packages/z3/z3core.py:3635, in Z3_parse_smtlib2_string(a0, a1, a2, a3, a4, a5, a6, a7, _elems)
3633 def Z3_parse_smtlib2_string(a0, a1, a2, a3, a4, a5, a6, a7, _elems=Elementaries(_lib.Z3_parse_smtlib2_string)):
3634 r = _elems.f(a0, _str_to_bytes(a1), a2, a3, a4, a5, a6, a7)
-> 3635 _elems.Check(a0)
3636 return r
File ~/philzook58.github.io/.venv/lib/python3.12/site-packages/z3/z3core.py:1579, in Elementaries.Check(self, ctx)
1577 err = self.get_error_code(ctx)
1578 if err != self.OK:
-> 1579 raise self.Exception(self.get_error_message(ctx, err))
Z3Exception: b'(error "line 2 column 12: unknown constant assert (Bool) ")\n'
cut
cut, we maybe want post and pre program point pre_insn1 exec_insn1 post_insn1 pre_insn2 …
examples: VST book https://softwarefoundations.cis.upenn.edu/vc-current/toc.html dafny book frama-c examples / book https://link.springer.com/book/10.1007/978-3-031-55608-1
sum-array hmm. I’m worried that we’ll need auxiliary predicates. I could define them in assembly?
kd_prelude “(define-fun foo ( ) )” kd_declare “(declare-const )” # kdprelude “(declare-const mytemp (BitVec 64))” kd_assign “mytemp” expr kd_declare
“(define-const flub (RAX))” Hmm. This might auto expand. I think so. So you can have a running summary in thisa way But kd_assign gives you
Maybe i should be worried more about the embedded python experience. Getting lemmas in there. lemmas = dict[label, list[kd.Proof]] Offer needed extra bits.
Or have the thing output a pile of verification conditions in Results.
This is getting verbose. I’m just delaying figuring out what I want to do. But it is also flexibility.
class Trace() events : list[Event] def VCs(self):
class TraceForest():
class Jump(): TraceForest
class Event(): pass class Execute(Event): … class Assert(Event): … class Assume(Event): class Entry(): class
T
%%file /tmp/cut.s
.include "/tmp/knuckle.s"
.global _start
kd_entry _start "true"
movq $42, %rdi
kd_cut mycut "(= RDI (_ bv42 64))"
movq %rdi, %rax
kd_exit _start_end "(= RAX (_ bv42 64))"
ret
Overwriting /tmp/cut.s
from kdrag.all import *
import kdrag.contrib.pcode as pcode
from kdrag.contrib.pcode.asmspec import assemble_and_check
assemble_and_check("/tmp/cut.s")
{
"entries": {
"4194304": [
[
"_start",
"true"
]
]
},
"asserts": {},
"assumes": {},
"exits": {
"4194314": [
[
"_start_end",
"(= RAX #x000000000000002a)"
]
]
},
"cuts": {
"4194311": [
[
"mycut",
"(= RDI #x000000000000002a)"
]
]
}
}
entry _start at 4194304 with precond True
cut mycut at 4194311 with precond RDI == 42
Executing 0x400007/3: MOV RAX,RDI at (4194311, 0) PCODE IMARK ram[400007:3]
Executing 0x400007/3: MOV RAX,RDI at (4194311, 1) PCODE RAX = RDI
[✅] exit _start_end: RAX == 42
Executing 0x400000/7: MOV RDI,0x2a at (4194304, 0) PCODE IMARK ram[400000:7]
Executing 0x400000/7: MOV RDI,0x2a at (4194304, 1) PCODE RDI = 0x2a
[✅] cut mycut: RDI == 42
Results(successes=['[✅] exit _start_end: RAX == 42', '[✅] cut mycut: RDI == 42'], failures=[], traces=[[(4194311, 0), '[✅] exit _start_end: RAX == 42'], [(4194304, 0), '[✅] cut mycut: RDI == 42']])
%%file /tmp/cutloop.s
.include "/tmp/knuckle.s"
.global _start
kd_entry _start "true"
movq $42, %rdi
kd_cut mycut "(= RDI (_ bv42 64))"
jmp mycut
Writing /tmp/cutloop.s
from kdrag.all import *
import kdrag.contrib.pcode as pcode
from kdrag.contrib.pcode.asmspec import assemble_and_check
assemble_and_check("/tmp/cutloop.s")
{
"entries": {
"4194304": [
[
"_start",
"true"
]
]
},
"asserts": {},
"assumes": {},
"exits": {},
"cuts": {
"4194311": [
[
"mycut",
"(= RDI #x000000000000002a)"
]
]
}
}
entry _start at 4194304 with precond True
cut mycut at 4194311 with invariant RDI == 42
Executing 0x400007/2: JMP 0x400007 at (4194311, 0) PCODE IMARK ram[400007:2]
Executing 0x400007/2: JMP 0x400007 at (4194311, 1) PCODE goto ram[400007:8]
[✅] cut mycut: RDI == 42
Executing 0x400000/7: MOV RDI,0x2a at (4194304, 0) PCODE IMARK ram[400000:7]
Executing 0x400000/7: MOV RDI,0x2a at (4194304, 1) PCODE RDI = 0x2a
[✅] cut mycut: RDI == 42
Results(successes=['[✅] cut mycut: RDI == 42', '[✅] cut mycut: RDI == 42'], failures=[], traces=[[(4194311, 0), '[✅] cut mycut: RDI == 42'], [(4194304, 0), '[✅] cut mycut: RDI == 42']])
%%file /tmp/cutloop.s
.include "/tmp/knuckle.s"
.global _start
kd_entry _start "true"
movq $42, %rdi
lea 1(%rdi), %rax
kd_cut mycut "(bvule RDI (_ bv42 64))"
add $1, %rax
jmp mycut
riscv
%%file /tmp/mop42.s
.include "/tmp/knuckle.s"
.text
.globl myfunc
kd_entry myfunc "(assert true)"
li a0, 42
kd_exit myfunc_end "(assert (= a0 (_ bv42 64)))"
ret
Overwriting /tmp/mop42.s
! riscv64-linux-gnu-gcc -c -o /tmp/mop42.o /tmp/mop42.s
%%file /tmp/test.c
#include <stdio.h>
#include <stdint.h>
uint64_t myfunc();
int main() {
printf("Result is %lu\n", myfunc());
return 0;
}
Overwriting /tmp/test.c
! riscv64-linux-gnu-gcc /tmp/test.c /tmp/mop42.o -o /tmp/test && qemu-riscv64 -L /usr/riscv64-linux-gnu /tmp/test
Result is 42
%load_ext jupyterflame
#%%prun -s tottime
import subprocess
import kdrag.contrib.pcode as pcode
from kdrag.contrib.pcode.asmspec import AsmSpec, run_all_paths
def assemble_and_check_riscv64(filename: str):
subprocess.run(["riscv64-linux-gnu-as", filename, "-o", "/tmp/kdrag_temp.o"], check=True)
ctx = pcode.BinaryContext("/tmp/kdrag_temp.o", langid="RISCV:LE:64:default")
spec = AsmSpec.of_file(filename, ctx)
print(spec)
return run_all_paths(ctx, spec)
assemble_and_check_riscv64("/tmp/mop42.s")
{
"entries": {
"4194304": [
[
"myfunc",
"true"
]
]
},
"asserts": {},
"assumes": {},
"exits": {
"4194308": [
[
"myfunc_end",
"(= a0 #x000000000000002a)"
]
]
},
"cuts": {}
}
entry myfunc at 4194304 with precond True
Executing 0x400000/4: li a0,0x2a at (4194304, 0) PCODE IMARK ram[400000:4]
Executing 0x400000/4: li a0,0x2a at (4194304, 1) PCODE unique[780:8] = 0x2a
Executing 0x400000/4: li a0,0x2a at (4194304, 2) PCODE a0 = unique[780:8]
[✅] exit myfunc_end: a0 == 42
Results(successes=['[✅] exit myfunc_end: a0 == 42'], failures=[], traces=[[(4194304, 0), '[✅] exit myfunc_end: a0 == 42']])
32 bit
little endian and big endian
Bits and Bobbles
https://github.com/WestfW/structured_gas
reorg
checks: defaultdict[int, list[HasCheck]] = dataclasses.field(
default_factory=lambda: defaultdict(list)
)
# %%prun -s tottime
from kdrag.contrib.pcode.asmspec import AsmSpec, run_all_paths, assemble_and_check_str
ex = """
.include "/tmp/knuckle.s"
.globl _start
kd_entry _start "true"
movq %rdi, %rax
cmp %rdi, %rsi
cmovb %rsi, %rax
kd_exit _start_end "(= RAX (ite (bvult RDI RSI) RDI RSI))"
#kd_exit _start_end "(or (= RAX RDI) (= RAX RSI))"
ret
"""
assemble_and_check_str(ex)
{
"addrmap": {
"4194304": [
{
"label": "_start",
"addr": 4194304,
"expr": "true"
}
],
"4194314": [
{
"label": "_start_end",
"addr": 4194314,
"expr": "(= RAX (ite (bvult RDI RSI) RDI RSI))"
}
]
},
"entries": {
"4194304": [
[
"_start",
"true"
]
]
},
"asserts": {},
"assumes": {},
"exits": {
"4194314": [
[
"_start_end",
"(= RAX (ite (bvult RDI RSI) RDI RSI))"
]
]
},
"cuts": {}
}
Executing 0x400000/3: MOV RAX,RDI at (4194304, 0) PCODE IMARK ram[400000:3]
Executing 0x400000/3: MOV RAX,RDI at (4194304, 1) PCODE RAX = RDI
Executing 0x400003/3: CMP RSI,RDI at (4194307, 0) PCODE IMARK ram[400003:3]
Executing 0x400003/3: CMP RSI,RDI at (4194307, 1) PCODE unique[3af00:8] = RSI
Executing 0x400003/3: CMP RSI,RDI at (4194307, 2) PCODE CF = unique[3af00:8] < RDI
Executing 0x400003/3: CMP RSI,RDI at (4194307, 3) PCODE OF = sborrow(unique[3af00:8], RDI)
Executing 0x400003/3: CMP RSI,RDI at (4194307, 4) PCODE unique[3b000:8] = unique[3af00:8] - RDI
Executing 0x400003/3: CMP RSI,RDI at (4194307, 5) PCODE SF = unique[3b000:8] s< 0x0
Executing 0x400003/3: CMP RSI,RDI at (4194307, 6) PCODE ZF = unique[3b000:8] == 0x0
Executing 0x400003/3: CMP RSI,RDI at (4194307, 7) PCODE unique[28080:8] = unique[3b000:8] & 0xff
Executing 0x400003/3: CMP RSI,RDI at (4194307, 8) PCODE unique[28100:1] = popcount(unique[28080:8])
Executing 0x400003/3: CMP RSI,RDI at (4194307, 9) PCODE unique[28180:1] = unique[28100:1] & 0x1
Executing 0x400003/3: CMP RSI,RDI at (4194307, 10) PCODE PF = unique[28180:1] == 0x0
Executing 0x400006/4: CMOVC RAX,RSI at (4194310, 0) PCODE IMARK ram[400006:4]
Executing 0x400006/4: CMOVC RAX,RSI at (4194310, 1) PCODE unique[39b00:8] = RSI
Executing 0x400006/4: CMOVC RAX,RSI at (4194310, 2) PCODE unique[39b80:1] = !CF
Executing 0x400006/4: CMOVC RAX,RSI at (4194310, 3) PCODE if (unique[39b80:1]) goto ram[40000a:8]
Executing 0x400006/4: CMOVC RAX,RSI at (4194310, 4) PCODE RAX = unique[39b00:8]
Results(successes=["[✅] VC(Entry(label='_start', addr=4194304, expr=True), ['0x400000', '0x400003', '0x400006'], Exit(label='_start_end', addr=4194314, expr=RAX == If(ULT(RDI, RSI), RDI, RSI)))", "[✅] VC(Entry(label='_start', addr=4194304, expr=True), ['0x400000', '0x400003', '0x400006'], Exit(label='_start_end', addr=4194314, expr=RAX == If(ULT(RDI, RSI), RDI, RSI)))"], failures=[], traces=[])
import pypcode
ctx = pypcode.Context("RISCV:LE:64:default")
#ctx.registers
# Trim mem_init to just the good stuff somehow?
# Maybe in archinfo
{name: vnode for name, vnode in ctx.registers.items() if (vnode.size == 8 or vnode.size == 4) and len(name) <= 3}
{'pc': <pypcode.pypcode_native.Varnode at 0x76b63422a550>,
'ra': <pypcode.pypcode_native.Varnode at 0x76b60ba68360>,
'sp': <pypcode.pypcode_native.Varnode at 0x76b60ba68810>,
'gp': <pypcode.pypcode_native.Varnode at 0x76b60ba685a0>,
'tp': <pypcode.pypcode_native.Varnode at 0x76b60ba68480>,
't0': <pypcode.pypcode_native.Varnode at 0x76b60ba68c30>,
't1': <pypcode.pypcode_native.Varnode at 0x76b60ba68e40>,
't2': <pypcode.pypcode_native.Varnode at 0x76b60ba68c00>,
's0': <pypcode.pypcode_native.Varnode at 0x76b60ba68cc0>,
's1': <pypcode.pypcode_native.Varnode at 0x76b60ba68d50>,
'a0': <pypcode.pypcode_native.Varnode at 0x76b60ba68d20>,
'a1': <pypcode.pypcode_native.Varnode at 0x76b60ba68d80>,
'a2': <pypcode.pypcode_native.Varnode at 0x76b60ba68ed0>,
'a3': <pypcode.pypcode_native.Varnode at 0x76b60ba69200>,
'a4': <pypcode.pypcode_native.Varnode at 0x76b60ba68c90>,
'a5': <pypcode.pypcode_native.Varnode at 0x76b60ba68f90>,
'a6': <pypcode.pypcode_native.Varnode at 0x76b60ba68f00>,
'a7': <pypcode.pypcode_native.Varnode at 0x76b60ba68fc0>,
's2': <pypcode.pypcode_native.Varnode at 0x76b60ba69020>,
's3': <pypcode.pypcode_native.Varnode at 0x76b60ba690b0>,
's4': <pypcode.pypcode_native.Varnode at 0x76b60ba690e0>,
's5': <pypcode.pypcode_native.Varnode at 0x76b60ba69140>,
's6': <pypcode.pypcode_native.Varnode at 0x76b60ba69290>,
's7': <pypcode.pypcode_native.Varnode at 0x76b60ba691d0>,
's8': <pypcode.pypcode_native.Varnode at 0x76b60ba69260>,
's9': <pypcode.pypcode_native.Varnode at 0x76b60ba692c0>,
's10': <pypcode.pypcode_native.Varnode at 0x76b60ba69320>,
's11': <pypcode.pypcode_native.Varnode at 0x76b60ba694d0>,
't3': <pypcode.pypcode_native.Varnode at 0x76b60ba69470>,
't4': <pypcode.pypcode_native.Varnode at 0x76b60ba69440>,
't5': <pypcode.pypcode_native.Varnode at 0x76b60ba694a0>,
't6': <pypcode.pypcode_native.Varnode at 0x76b60ba693b0>,
'ft0': <pypcode.pypcode_native.Varnode at 0x76b60ba698c0>,
'ft1': <pypcode.pypcode_native.Varnode at 0x76b60ba69950>,
'ft2': <pypcode.pypcode_native.Varnode at 0x76b60ba69980>,
'ft3': <pypcode.pypcode_native.Varnode at 0x76b60ba69ce0>,
'ft4': <pypcode.pypcode_native.Varnode at 0x76b60ba69aa0>,
'ft5': <pypcode.pypcode_native.Varnode at 0x76b60ba6b720>,
'ft6': <pypcode.pypcode_native.Varnode at 0x76b60ba6b8d0>,
'ft7': <pypcode.pypcode_native.Varnode at 0x76b60ba6b900>,
'fs0': <pypcode.pypcode_native.Varnode at 0x76b60ba69a40>,
'fs1': <pypcode.pypcode_native.Varnode at 0x76b60ba6bd20>,
'fa0': <pypcode.pypcode_native.Varnode at 0x76b60ba6bea0>,
'fa1': <pypcode.pypcode_native.Varnode at 0x76b60ba6b420>,
'fa2': <pypcode.pypcode_native.Varnode at 0x76b60ba6b690>,
'fa3': <pypcode.pypcode_native.Varnode at 0x76b60ba6b390>,
'fa4': <pypcode.pypcode_native.Varnode at 0x76b60ba6b4b0>,
'fa5': <pypcode.pypcode_native.Varnode at 0x76b60ba6b450>,
'fa6': <pypcode.pypcode_native.Varnode at 0x76b60ba6b990>,
'fa7': <pypcode.pypcode_native.Varnode at 0x76b60ba6aaf0>,
'fs2': <pypcode.pypcode_native.Varnode at 0x76b60ba6bdb0>,
'fs3': <pypcode.pypcode_native.Varnode at 0x76b60ba6a580>,
'fs4': <pypcode.pypcode_native.Varnode at 0x76b60ba6a670>,
'fs5': <pypcode.pypcode_native.Varnode at 0x76b60ba6a100>,
'fs6': <pypcode.pypcode_native.Varnode at 0x76b60ba6aeb0>,
'fs7': <pypcode.pypcode_native.Varnode at 0x76b60ba6bb10>,
'fs8': <pypcode.pypcode_native.Varnode at 0x76b60ba6ad30>,
'fs9': <pypcode.pypcode_native.Varnode at 0x76b60ba69e90>,
'ft8': <pypcode.pypcode_native.Varnode at 0x76b60ba6b7e0>,
'ft9': <pypcode.pypcode_native.Varnode at 0x76b60ba6bed0>,
'frm': <pypcode.pypcode_native.Varnode at 0x76b60ba68030>,
'uie': <pypcode.pypcode_native.Varnode at 0x76b60ba6a8e0>,
'uip': <pypcode.pypcode_native.Varnode at 0x76b60b63d1d0>,
'sie': <pypcode.pypcode_native.Varnode at 0x76b60ba757d0>,
'sip': <pypcode.pypcode_native.Varnode at 0x76b60ba76f70>,
'mie': <pypcode.pypcode_native.Varnode at 0x76b60ba7e610>,
'mip': <pypcode.pypcode_native.Varnode at 0x76b60ba7d3e0>,
'hie': <pypcode.pypcode_native.Varnode at 0x76b60b6d0270>,
'hip': <pypcode.pypcode_native.Varnode at 0x76b60b6d19e0>,
'dpc': <pypcode.pypcode_native.Varnode at 0x76b60ba91d40>,
'vl': <pypcode.pypcode_native.Varnode at 0x76b60ba4fd50>}
from kdrag.all import *
import kdrag.contrib.pcode as pcode
ctx = pcode.BinaryContext()
memstate = ctx.init_mem()
ctx.get_reg(memstate, "RAX")
RAX!1506
! sudo apt install gcc-riscv64-linux-gnu g++-riscv64-linux-gnu binutils-riscv64-linux-gnu
import cle
loader = cle.loader.Loader("/tmp/myfunc.o")
dir(loader.all_elf_objects[0].arch)
['__annotations__',
'__class__',
'__delattr__',
'__dict__',
'__dir__',
'__doc__',
'__eq__',
'__format__',
'__ge__',
'__getattribute__',
'__getstate__',
'__gt__',
'__hash__',
'__init__',
'__init_subclass__',
'__le__',
'__lt__',
'__module__',
'__ne__',
'__new__',
'__reduce__',
'__reduce_ex__',
'__repr__',
'__setattr__',
'__setstate__',
'__sizeof__',
'__str__',
'__subclasshook__',
'__weakref__',
'_configure_capstone',
'_configure_keystone',
'_cs',
'_cs_x86_syntax',
'_get_register_dict',
'_ks',
'_ks_x86_syntax',
'address_types',
'argument_register_positions',
'argument_registers',
'artificial_registers',
'artificial_registers_offsets',
'asm',
'bits',
'bp_offset',
'branch_delay_slot',
'byte_width',
'bytes',
'cache_irsb',
'call_pushes_ret',
'call_sp_fix',
'capstone',
'capstone_support',
'capstone_x86_syntax',
'concretize_unique_registers',
'copy',
'cpu_flag_register_offsets_and_bitmasks_map',
'cs_arch',
'cs_mode',
'default_endness',
'default_register_values',
'default_symbolic_registers',
'disasm',
'dwarf_registers',
'dynamic_tag_translation',
'elf_tls',
'entry_register_values',
'fp_ret_offset',
'function_address_types',
'function_epilogs',
'function_prologs',
'get_base_register',
'get_default_reg_value',
'get_register_by_name',
'get_register_offset',
'got_section_name',
'ida_processor',
'initial_sp',
'instruction_alignment',
'instruction_endness',
'ip_offset',
'is_artificial_register',
'is_thumb',
'keystone',
'keystone_support',
'keystone_x86_syntax',
'ks_arch',
'ks_mode',
'ld_linux_name',
'lib_paths',
'library_search_path',
'linux_name',
'lr_offset',
'm_addr',
'max_inst_bytes',
'memory_endness',
'name',
'nop_instruction',
'persistent_regs',
'qemu_name',
'reg_blacklist',
'reg_blacklist_offsets',
'register_endness',
'register_list',
'register_names',
'register_size_names',
'registers',
'reloc_b_a',
'reloc_copy',
'reloc_s',
'reloc_s_a',
'reloc_tls_doffset',
'reloc_tls_mod_id',
'reloc_tls_offset',
'ret_instruction',
'ret_offset',
'sizeof',
'sp_offset',
'stack_change',
'stack_size',
'struct_fmt',
'subregister_map',
'symbol_type_translation',
'syscall_num_offset',
'translate_dynamic_tag',
'translate_register_name',
'translate_symbol_type',
'triplet',
'uc_arch',
'uc_const',
'uc_mode',
'uc_prefix',
'uc_regs',
'unicorn',
'unicorn_support',
'vex_arch',
'vex_archinfo',
'vex_cc_regs',
'vex_conditional_helpers',
'vex_endness',
'vex_support',
'vex_to_unicorn_map',
'x_addr']