Title: Paper notes - Block Oriented Programming: Automating Data-Only Attacks
Date: 2018-12-01 10:15

- Complete title: Block Oriented Programming: Automating Data-Only Attacks
- PDF: [69dcf39e7a25cd84a5f53479a48a80a86d2ce3c4]({static}/files/papers/69dcf39e7a25cd84a5f53479a48a80a86d2ce3c4_Block_Oriented_Programming:_Automating_Data-Only_Attacks.pdf)

The paper describes a system (called BOPC) to automatically construct payloads
to bypass [CFI](https://en.wikipedia.org/wiki/Control-flow_integrity), provided
only with an arbitrary read/write, an entrypoint and the payload itself,
written in a pseudo-C language called SPL (SPloit Language).

Mapping between SPL statements and basic blocks is done with constrains, via
symbolic execution of blocks: the constraints of the basic block must be a
superset of the operations required to implement the SPL statement. The paper's
implementation only maps a single basic block to each SPL statement, it doesn't
combine them.

Once *functional* blocks are found, the next step is to search for paths to
connect them blocks together, via chains of *dispatcher* blocks.  Since this is
a hard-problem, heuristics are used, mainly the proximity of the *functional*
blocks (the more instructions between them, the more chance of clobbering the
payload state). This results in a graph, with *functional* blocks as nodes, and
chains of *dispatchers* as edges.

BOPC then checks if it's possible to to stitch the *functional* blocks without
clobbering the SPL state, via
[concolic](https://en.wikipedia.org/wiki/Concolic_testing) execution,
concretizing the path, *functional* block by *functional* block. The execution
trace is then encoded as a series of writes.

There is currently no code released along with the paper.

I really liked the following excerpt of the paper, providing a digest and clear
yet exact description of a [weird machine](https://en.wikipedia.org/wiki/Weird_machine):

> The environment for SPL differs from that of conventional languages. Instead
of running code directly on a CPU, our compiler encodes the payload as a
mapping of instructions to functional blocks. That is, the underlying runtime
environment is the target binary and its program state, where payloads are
executed as side effects of the underlying binary.
