Introduction to Formal Compiler Security

This blog-post tries to give a formal introduction to compiler security. If you are already familiar with the term robust preservation, this blog-post is likely not for you. The post is accompanied by a Coq mechanization, i.e., all proofs presented in this blog-post are machine-checked. Find the link for it at the bottom of the post.

Preliminaries

Why formal methods?

Starting out in computer science, I’ve been very much into practical stuff. Writing “cool” algorithms, optimizing the hell out of the classic raytracer that everyone has to write at some point, and, well, designing a compiler. My first ever compiler was for a subset of the C programming language and it was inherently incorrect: It failed to faithfully translate the recursive version of Fibonacci: The compiler just wrote to some static memory location and forgot about the contents of the variables by overwriting the old location. In other words, my compiler did not make use of the stack at all, neither did it simulate it somehow, ending up with compiled programs that don’t behave the way I expected them to behave.

This expectation for program behavior should be written down somewhere, e.g., the C programming language dictates in its standard ( Citation: WG14/N1256 2007 September WG14/N1256. 2007 September . Programming languages — C ) the way a C program should run. In there, one can see a bunch of Backus-Naur-Forms defining the concrete syntax. Moreover, prose describes the semantics (what should happen under certain consequences, e.g., calling a function pointer is the same as dereferencing and then calling it). These rules give a more or less precise understanding how the programming language works and what programmers expect. Even though there have been studies that suggest differently ( Citation: Xu et.al. 2023 Jianhao Xu and Kangjie Lu and Zhengjie Du and Zhu Ding and Linke Li and Qiushi Wu and Mathias Payer and Bing Mao. 2023 . Silent Bugs Matter: A Study of Compiler-Introduced Security Bugs ) , the point I’m trying to make here is that every C programmer has some kind of mental model of how their program behaves. It turns out that specifying the semantics of programming languages is itself a tricky problem. As an example, for C, in certain contexts, the evaluation order is left implementation-defined, so there can be differences in the result of the expression f() + g() for the following program when compiled with different C compilers:

int x = 0;
int f() { x = x - 1; return x; }
int g() { x = x + 1; return x; }

If f() is evaluated first, the value of f() + g() is -1. If g() is evaluated first, the value of f() + g() is 1.

A formal, mathematical, specification gives a precise understanding on how something works. While it do can be ambiguous by design (by introducing non-determinism as a “feature” to the language), it cannot be ambiguous “by language”. It is also the most convincing argument one can do: Starting with stuff that is irrefutably obvious, inducing other facts from these by means of rules given by the logic we do reasoning in, there is no chance to say “your argument doesn’t make sense” without questioning the logic to begin with. It is the very core reason why we do math/logic.

Utilizing formal methods to prove that programs adhere to a specification is also much more powerful than unit- or property-based testing. Consider the following program:

void add(int x, int y) {
  if(x == 17290001 && y == 29171373) {
    return x - y;
  }
  return x + y;
}

For almost all inputs x and y, it is correct. A :shit:ton of tests are still very unlikely to find the bug where the function does not fulfill the intended specification “add number x and y”.

What are properties?

So, we want to verify programs. One way of looking at programs is to do so abstractly: Programs may perform some events that may have an effect to the “outside world”, like reading from or writing to memory. These events are also known as atomic propositions, commonly referred to as AP\operatorname{AP}. This post will only use the terminology “event”. The formal description of an event can be done with a Backus-Naur-Form without any metavariable/non-terminal symbol:

(Pre-Event) ap:=StartEnd v(Event) a:=εap\begin{aligned} \mi{(\text{Pre-Event})}\ a_p &:= \text{Start} \mid \text{End}\ v \\ \mi{(\text{Event})}\ a &:= \varepsilon \mid a_p \end{aligned}

This is an example of a formal definition for concrete events: the empty event (ε\varepsilon), the event signaling that a program starts (Start\text{Start}), and the event signaling program termination (End v\text{End}\ v). The empty event is usually included to refer to unimportant/unobservable/internal actions (you may see this being referred to as λ\lambda instead of ε\varepsilon in the literature, some even use τ\tau). Note that End\text{End} carries a parameter vv, which I have not defined precisely, but which is supposed to be “just some value”, like true, false, or 1729.

A program may emit multiple events (written a\overline{a}) and, depending on what properties interest us, the concrete events wanted from program execution may need to change. In the verification domain, programs yield (potentially) infinitely sized lists of events, which are known as traces.

Even in this very simple setting where the only “interesting” events are Start\text{Start} and End v\text{End}\ v, it is possible to define interesting properties for programs, which will be done soon. First up, it is necessary to make sense of the term property from the formal perspective: A trace property (π\pi) is simply a set of traces (e.g., Trace -> Prop in Coq). For example, the property “termination” may be formulated as “any trace a\overline{a} which at some point contains an End v\text{End}\ v event”. More formally:

πterm={av,aEnd v}\pi_{\text{term}} = \{ \overline{a} \mid \exists v, \overline{a} \vDash \diamond \text{End}\ v \}

Hereby, the \vDash means “satisfies” and the \diamond is a unary operator reading “eventually”. πterm\pi_{\text{term}} is an example for a liveness property, which intuitively concerns that “something good happens”. Even if the good thing did not happen yet, it could still happen at some point!

Having seen a liveness property, let us see an example for another class of properties, the safety properties. For this, let’s try to formulate “non-termination”: “any trace a\overline{a} which never contains an End v\text{End}\ v event”. Formally:

πno-term={av,a¬End v}\pi_{\text{no-term}} = \{ \overline{a} \mid \forall v, \overline{a} \vDash \square \neg\text{End}\ v \}

The \square reads “always” and the ¬\neg is simply “not”.

Safety and liveness are the two big properties, but there also exist properties which are neither safety nor liveness (because they are both!). But anyhow, the crucial bit to keep in mind for the rest of the blog post is that properties are sets of traces.

(Robust) Satisfaction

Suppose γ\gamma is a compiler that translates from language S\src{S} to language T\trg{T}.

Def. Property Satisfaction

A program ww satisfies a property π\pi (wπw\vdash\pi) if its execution yields a trace a\overline{a} such that aπ\overline{a}\in\pi.

It is easy for a compiler to make sure target programs never go wrong: Just always emit a program that corresponds to the constant 42\trg{42}. But, this is very much not satisfactory. The intuition of programmers is that compiled T\trg{T}-level programs behave similar to the original S\src{S} program! This “behaves similar” is precisely the intuition for preservation, given some compiler γ\gamma:

Def. Preservation of Property Satisfaction

For any property π\pi and S\src{S}-level program w\src{w}, if wπ\src{w}\vdash\pi, then γ(w)π\gamma(\src{w})\vdash\pi.

Notice how wπ\src{w}\vdash\pi has a hidden assumption: program execution of w\src{w} can actually yield a trace. The issue is that this is not how a programmer writes a program. To illustrate why, consider this C function:

void foo() {
  printf("hello");
}

A very basic C function that makes crucial use of printf, a function that is not defined in the above excerpt! In fact, one would have to compile this snippet and then link the result with an implementation of the C standard library, which provides a definition for printf. While this is only one of many possible scenarios, in reality, people use libraries daily, no questions asked. To accurately model preservation of properties with this in mind, the blog-post refers to programs written by the programmer as components and to any library providing definitions for, e.g., undefined functions, as contexts. Property satisfaction is only defined on whole programs, so it does not work for components, because they are partial programs where not every symbol has an accompanying definition. A natural extension to make is to generalize this to arbitrary contexts. To this end, revisit property satisfaction and take arbitrary contexts into account:

Def. Robust Property Satisfaction

A component pp robustly satisfies a property π\pi (pRπp\vdash_R\pi) if there is a whole program ww that is the result of linking an arbitrary context CC with the component pp such that the execution of ww yields a trace a\overline{a} that fulfills aπ\overline{a}\in\pi.

Consider this C function and the compiled ARM64 output:

extern int arr[5];

int foo(int idx) {
  return arr[idx];
}
foo(int):
  adrp x1, arr
  add x1, x1, arr
  ldr w0, [x1, w0, sxtw 2]
  ret

Looking at the code, the compiler translates this as expected: perform an array access to some pre-defined constant arr. However, it is easy to see that there is a potential memory safety issue: A context that somewhere invokes foo(1729) will lead to an out-of-bounds memory access, which is a spatial memory safety violation. This means fooRsmssafe\texttt{foo}\vdash_R\text{smssafe} does not hold, since foo\texttt{foo} does not satisfy the property smssafe\text{smssafe} (no out-of-bounds accesses) for all possible contexts.

A potential fix is to make it robust by inserting a bounds-check:

extern int arr[5];

int foo(int idx) {
  if(0 <= idx && idx < 5) {
    return arr[idx];
  }
  abort();
}

Now, fooRsmssafe\texttt{foo}\vdash_R\text{smssafe} holds.

Secure Compilation

The lifting of preservation of a source behavior in the target simply uses robust satisfaction :

Def. Robust Trace Property Preservation

For any property π\pi and S\src{S}-level program p\src{p}, if pRπ\src{p}\vdash_R\pi, then γ(p)Rπ\gamma(\src{p})\vdash_R\pi.

Notably, in the assumption pRπ\src{p}\vdash_R\pi , the context is at S\src{S}-level, while for the goal γ(p)Rπ\gamma(\src{p})\vdash_R\pi the context is at T\trg{T}-level. Now, if you think of T\trg{T} as something like ARM64, program contexts easily appear much more powerful than for S\src{S}, which could be something like Rust. In Rust, one is not able to just guess some memory address (without unsafe), while this works without issue in ARM64, assuming one guesses correctly.

Lets see how one could try to prove robust preservation . On the left there are the assumptions that we are allowed to work with, on the right there is the goal we need to prove. The diagram is supposed to be read from top to bottom.

Have Comment Want
π\pi
 : 
Properties\text{Properties}
p\src{p}
 : 
S-level component\src{S}\text{-level component}
H1H_1
 : 
pRπ\src{p}\vdash_R\pi
pST\comp{\src{S}}{\trg{T}}{\src{p}}
H1H_1'
 : 
aTrace,CS-level program,if link C pa,then aπ\begin{array}{l}\forall\trace\in\text{Trace},\\\forall\src{C}\in\src{S}\text{-level program},\\\text{if }\src{link\ C\ p}\to\trace,\\\text{then }\trace\in\pi\end{array}
(unfold R)(\text{unfold $\vdash_R$})
aTrace,CT-level program,if link C pSTa,then aπ\begin{array}{l}\forall\trace\in\text{Trace},\\\forall\trg{C}\in\trg{T}\text{-level program},\\\text{if }\trg{link\ C\ }\comp{\src{S}}{\trg{T}}{\src{p}}\to\trace,\\\text{then }\trace\in\pi\end{array}
a\trace
 : 
Trace\text{Trace}
assume from goal\text{assume from goal}
C\trg{C}
 : 
T-level program\trg{T}\text{-level program}
HT-execH_{\trg{T}\text{-exec}}
 : 
link C pSTa\trg{link\ C\ }\comp{\src{S}}{\trg{T}}{\src{p}}\to\trace
aπ\trace\in\pi

Looking at the proof context, i.e., the assumptions that we have, we have to show that this trace a\overline{a} is part of whatever property π\pi, where the only thing we know about a\overline{a} is that it is the result of the execution link C pSTa\trg{link\ C\ }\llbracket\src{p}\rrbracket^{\src{S}\to\trg{T}}\to\overline{a}. Since π\pi is arbitrary, there is little hope to show that a\overline{a} is an element of it by decomposing this T\trg{T}-level execution. But, note that there is also assumption H1H_1' . We certainly can specialize it with a\overline{a}, but then we need to provide an S\src{S}-level context as well as an execution link C pa\src{link\ C\ p}\to\overline{a}, both of which we do not have. This is the crucial part in these kinds of robust preservation proofs: somehow coming up with an S\src{S}-level context C\src{C} whose runtime behavior when linked with p\src{p} is similar to that of link C pST\trg{link\ C\ }\llbracket\src{p}\rrbracket^{\src{S}\to\trg{T}}.

In the rest of this post, we’ll see how to do this on the basis of backtranslating the trace a\overline{a}.

S\src{S} Definitions

S\src{S} Syntax

(Value) v:=truefalsen          , where n is a number(Expression) e:=val vvar xe1e2if e1 then e2 then e3abort(Type) τ:=NBττ\begin{aligned} \mi{(\text{Value})}\ \src{v} &:= \src{true} \mid \src{false} \mid \src{n} \ \ \ \ \ \ \ \ \ \ \text{, where } \src{n} \text{ is a number}\\ \mi{(\text{Expression})}\ \src{e} &:= \src{val\ v} \mid \src{var\ x} \mid \src{e_1 \oplus e_2} \mid \src{if\ e_1\ then\ e_2\ then\ e_3} \mid \src{abort} \\ \mi{(\text{Type})}\ \src{\tau} &:= \src{\mathbb{N}} \mid \src{\mathbb{B}} \mid \src{\tau\to\tau} \end{aligned}
(** Possible binary operations. *)
Variant binopsymb : Type :=
| Badd : binopsymb
| Bsub : binopsymb
| Bmul : binopsymb
| Bdiv : binopsymb
| Bless : binopsymb
.
(** Value *)
Inductive value : Type :=
| Vnat : nat -> value
| Vbool : bool -> value
.
(** Variables are just strings. *)
Definition vart := string.

(** Expression *)
Inductive expr : Type :=
| Xval (v : value) : expr
| Xvar (x : vart) : expr
| Xbinop (symb : binopsymb) (lhs rhs : expr) : expr
| Xif (c e0 e1 : expr) : expr
| Xabort : expr
.
(** Type *)
Inductive ty : Type :=
| Tnat : ty
| Tbool : ty
| Tfun : ty -> ty -> ty
.

Hereby, x\src{x} is some kind of string and \src{\oplus} is one of the following binary operations: <\src{<}, +\src{+}, ×\src{\times}, /\src{/}, and \src{-}. We do not encode precedence or anything into this formal description, because it is more meant to be a generator for abstract syntax trees instead of a context free grammar. That being said, the precedences of the binary operators are “as usual”.

Another rather syntactic comment I have to make is the fact that some metavariables appear “free” in the above definition, e.g., e1\src{e_1} in the binary operation rule. I sketch the “basename” of the metavariable e\src{e} and add subscripts as a means to signal that both branches in an expression like e1e2\src{e_1\oplus e_2} can be different. If I’d have written it as ee\src{e\oplus e}, there could be confusion that you have to insert the same syntactic block into each branch, i.e., you can only have expressions like val 1+val 1\src{val\ 1+val\ 1}, but not val 1+val 2\src{val\ 1+val\ 2}. Admittedly, the rule for expressions should quantify over all used metavariables, i.e., the full, formal definition of the Backus-Naur-Form rule would look like Exprse1,e2,e3::=\src{Exprs}\ni\src{e_1,e_2,e_3}::=\dots, where "\dots" as above. Exprs\src{Exprs} hereby describes a set that contains all words adhering to the syntax on the righthandside of the ::= operator. Anyhow, I think this notation is still somewhat understandable while lessening the burden of parsing what I’ve written.

While we have expressions now, we still lack contexts and whole programs. The following definitions make this precise:

(Components) p:=(x;τ1;τ2;e)         (Contexts) C:=(y;epre;epost)(Whole Programs) W:=prog C p\begin{aligned} \mi{(\text{Components})}\ \src{p} &:= (\src{x};\src{\tau_1};\src{\tau_2};\src{e})\ \ \ \ \ \ \ \ \ \mi{(\text{Contexts})}\ \src{C} := (\src{y};\src{e_{pre}};\src{e_{post}})\\ &\mi{(\text{Whole Programs})}\ \src{W} := \src{prog\ C\ p} \end{aligned}
(** Symbols look like `fn foo x : τ := e` *)
Definition symbol : Type := vart * ty * ty * expr.

(** Viable contexts that can be linked against a symbol. *)
Inductive LinkageContext : Type :=
(* basically models `let y = call foo e__pre in e__post` *)
| LContext (y : vart) (e__pre e__post : expr) : LinkageContext
.

(** Top-level programs *)
Inductive prog : Type := Cprog (C : LinkageContext) (foo : symbol) : prog.

Again, this is meant to be abstract syntax. The intuitive understanding for compontents is simply a function, in imaginative syntax, fn(x:τ):τ:=e\src{fn(x : \tau) : \tau := e}. Similarily, contexts are simply a let-expression that “calls” into this function, again in imaginative syntax, let y = call p epre in epost\src{let\ y\ =\ call\ p\ e_{pre}\ in\ e_{post}}. We’ll see the dynamic semantics of this soon, but first, let’s take a look at the static semantics, i.e., what it means for a program to be considered valid/safe.

S\src{S} Static Semantics

 
vtrue:B\vdash_v\src{true}:\src{\mathbb{B}}
(S-t-true)\text{\footnotesize($\src{S}$-t-true)}
 
vfalse:B\vdash_v\src{false}:\src{\mathbb{B}}
(S-t-false)\text{\footnotesize($\src{S}$-t-false)}
n is a number\src{n}\text{ is a number}
vn:N\vdash_v\src{n}:\src{\mathbb{N}}
(S-t-num)\text{\footnotesize($\src{S}$-t-num)}
vv:τ\vdash_v\src{v}:\src{\tau}
Γeval v:τ\src{\Gamma}\vdash_e\src{val\ v}:\src{\tau}
(S-t-val)\text{\footnotesize($\src{S}$-t-val)}
(x,τ)Γ(\src{x},\src{\tau})\in\src{\Gamma}
Γevar x:τ\src{\Gamma}\vdash_e\src{var\ x}:\src{\tau}
(S-t-var)\text{\footnotesize($\src{S}$-t-var)}
 
Γeabort:τ\src{\Gamma}\vdash_e\src{abort}:\src{\tau}
(S-t-abort)\text{\footnotesize($\src{S}$-t-abort)}
Γee1:NΓee2:N is not <\begin{array}{ rcl }\src{\Gamma}\vdash_e\src{e_1}:\src{\mathbb{N}} && \src{\Gamma}\vdash_e\src{e_2}:\src{\mathbb{N}} \\ \src{\oplus}&\text{ is not }&\src{<}\end{array}
Γee1e2:N\src{\Gamma}\vdash_e\src{e_1\oplus e_2}:\src{\mathbb{N}}
(S-t-binop)\text{\footnotesize($\src{S}$-t-binop)}
Γee1:NΓee2:N\begin{array}{ lr }\src{\Gamma}\vdash_e\src{e_1}:\src{\mathbb{N}} & \src{\Gamma}\vdash_e\src{e_2}:\src{\mathbb{N}}\end{array}
Γee1<e2:B\src{\Gamma}\vdash_e\src{e_1 < e_2}:\src{\mathbb{B}}
(S-t-less)\text{\footnotesize($\src{S}$-t-less)}
Γee1:BΓee2:τΓee3:τ\begin{array}{ ccc }\src{\Gamma}\vdash_e \src{e_1}:\src{\mathbb{B}} & \src{\Gamma}\vdash_e \src{e_2}:\src{\tau} & \src{\Gamma}\vdash_e \src{e_3}:\src{\tau}\end{array}
Γeif e1 then e2 else e3:τ\src{\Gamma}\vdash_e\src{if\ e_1\ then\ e_2\ else\ e_3}:\src{\tau}
(S-t-if)\text{\footnotesize($\src{S}$-t-if)}
(** Type system spec *)
Reserved Notation "'[' Γ '|-' e ':' τ  ']'" (at level 81, Γ at next level, e at next level, τ at next level).
Inductive check : Gamma -> expr -> ty -> Prop :=
| T_var : forall: Gamma) (x : vart) (τ : ty),
  nodupinv Γ ->
  Min Γ x τ ->
|- Xvar x : τ]
| T_abort : forall: Gamma) (τ : ty),
|- Xabort : τ]
| T_vnat : forall: Gamma) (n : nat),
|- Xval(Vnat n) : Tℕ]
| T_vbool : forall: Gamma) (b : bool),
|- Xval(Vbool b) : Tbool]
| T_binop : forall: Gamma) (b : binopsymb) (e1 e2 : expr),
  b <> Bless ->
|- e1 : Tℕ] ->
|- e2 : Tℕ] ->
|- Xbinop b e1 e2 : Tℕ]
| T_binop_less : forall: Gamma) (e1 e2 : expr),
|- e1 : Tℕ] ->
|- e2 : Tℕ] ->
|- Xbinop Bless e1 e2 : Tbool]
| T_if : forall: Gamma) (e0 e1 e2 : expr) (τ : ty),
|- e0 : Tbool] ->
|- e1 : τ] ->
|- e2 : τ] ->
|- Xif e0 e1 e2 : τ]
where "'[' Γ '|-' e ':' τ ']'" := (check Γ e τ)
.

The typing rules for expressions are rather straightforward. Typechecking binary operations is split into S-t-binop\src{S}\text{-t-binop} and S-t-less\src{S}\text{-t-less} , since comparisons should return a boolean value and anything else a number.

Typechecking contexts, components, and whole programs closely follows the standard “let”-rule when typechecking more complicated programming languages.

(x;τ1),[]ee:τ2(\src{x};\src{\tau_1}),\src{[\cdot]}\vdash_e \src{e}:\src{\tau_2}
p(x;τ1;τ2;e):τ1τ2\vdash_p(\src{x};\src{\tau_1};\src{\tau_2};\src{e}):\src{\tau_1\to\tau_2}
(S-t-component)\text{\footnotesize($\src{S}$-t-component)}
[]eepre:N(y;N),[]eepost:τ\begin{array}{ cc }\src{[\cdot]}\vdash_e\src{e_{pre}}:\src{\mathbb{N}} & (\src{y};\src{\mathbb{N}}),\src{[\cdot]}\vdash_e \src{e_{post}}:\src{\tau}\end{array}
c(y;epre;epost):τ\vdash_c(\src{y};\src{e_{pre}};\src{e_{post}}):\src{\tau}
(S-t-context)\text{\footnotesize($\src{S}$-t-context)}
cC:τpp:NN\begin{array}{ cc }\vdash_c\src{C}:\src{\tau} & \vdash_p\src{p}:\src{\mathbb{N}\to\mathbb{N}}\end{array}
prog C p:τ\vdash\src{prog\ C\ p}:\src{\tau}
(S-t-whole)\text{\footnotesize($\src{S}$-t-whole)}
(** Checking of symbols. A symbol may use the identifier x, because a symbol
  in this language is basically fn(x) ↦ e. *)
Inductive check_symb : symbol -> Prop :=
| check_symb_c : forall (e : expr) (x : vart),
  check (x ↦ Tℕ ◘ [⋅]) e Tℕ ->
  check_symb (x, Tℕ, Tℕ, e)
.
(** Similarily, contexts are like `let y = call foo e__pre in e__post` *)
Inductive check_ctx : LinkageContext -> Prop :=
| check_ctx_c : forall (y : vart) (e__pre e__post : expr),
  [[⋅] |- e__pre : Tℕ] ->
  [y ↦ Tℕ ◘ [⋅] |- e__post : Tℕ] ->
  check_ctx (LContext y e__pre e__post)
.
(** Checking a top-level program just means checking both context and component *)
Definition prog_check (p : prog) : Prop :=
let '(Cprog C foo) := p in
check_symb foo /\ check_ctx C
.

S-t-whole\src{S}\text{-t-whole} restricts components to be of type NN\src{\mathbb{N}\to\mathbb{N}}.

S\src{S} Dynamic Semantics

Since the language is so simple, the dynamic semantics are also simple, like the typing rules. The semantics is done small-step style with evaluation contexts, thereby sidestepping the need for congruence rules. Evaluation does not happen on expressions themselves, but rather on “runtime expressions”. Moreover, execution may give visible events (something like syscalls), which warrants a trace-model:

(Runtime Expressions) r:= e:zap:(Events) a:=ε:zap:Call vRet vStartEnd v(Traces) a:=[]aa\begin{aligned} \mi{(\text{Runtime Expressions})}\ \src{r} &:= \src{\triangleright\ e} \mid \src{\rtexpr{}{:zap:}} \\ \mi{(\text{Events})}\ \src{a} &:= \src{\varepsilon} \mid \text{:zap:} \mid \src{Call\ v} \mid \src{Ret\ v} \mid \src{Start} \mid \src{End\ v} \\ \mi{(\text{Traces})}\ \src{\trace} &:= \src{[\cdot]} \mid \src{\event\cdot\trace} \end{aligned}
(** A runtime program is just an expression or a crashed state. *)
Inductive rtexpr : Type :=
| RTerm (e : expr)
| RCrash
.
#[global]
Notation "'▷' e" := ((RTerm e) : rtexpr) (at level 81, e at next level).
#[global]
Notation "'▷↯'" := (RCrash).

With this, primitive reductions are defined as follows:

 
abort prim \src{\rtexpr{}{abort}\psteparrow{⚡}\rtexpr{}{\text{⚡}}}
(S-e-crash)\text{\footnotesize($\src{S}$-e-crash)}
 
if val true then e1 else e2 prim e1\src{\rtexpr{}{if\ val\ true\ then\ e_1\ else\ e_2\psteparrow{} e_1}}
(S-e-if-true)\text{\footnotesize($\src{S}$-e-if-true)}
 
if val false then e1 else e2 prim e1\src{\rtexpr{}{if\ val\ false\ then\ e_1\ else\ e_2\psteparrow{} e_1}}
(S-e-if-false)\text{\footnotesize($\src{S}$-e-if-false)}
eval_binop(;n1;n2)=v\operatorname{eval\_binop}(\src{\oplus};\src{n_1};\src{n_2})=\src{v}
val n1val n2 prim val v\src{\rtexpr{}{val\ n_1\oplus val\ n_2\psteparrow{} val\ v}}
(S-e-binop)\text{\footnotesize($\src{S}$-e-binop)}
(** Evaluation of binary expressions. Note that 0 means `true` in S, so `5 < 42` evals to `0`. *)
Definition eval_binop (b : binopsymb) (n0 n1 : nat) : option value :=
Some((match b with
     | Bless => (if Nat.ltb n0 n1 then Vbool true else Vbool false)
     | Badd => Vnat(n0 + n1)
     | Bdiv => Vnat(Nat.div n0 n1)
     | Bsub => Vnat(n0 - n1)
     | Bmul => Vnat(n0 * n1)
     end))
.
(** Primitive evaluation steps. *)
Inductive pstep : (PrimStep rtexpr) :=
| e_binop : forall (n1 n2 : nat) v (b : binopsymb),
  Some(v) = eval_binop b n1 n2 ->
  ▷ Xbinop b n1 n2 --[]--> ▷ v
| e_if_true : forall (e1 e2 : expr),
<( if true then (e1) else (e2) )> --[]--> ▷ e1
| e_if_false : forall (e1 e2 : expr),
<( if false then (e1) else (e2) )> --[]--> ▷ e2
| e_abort :
<( abort )> --[ Scrash ]--> ▷↯
.

All steps without an event above the arrow implicitly throw the empty event (ε\src{\varepsilon}). While these definitions allow us to evaluate a runtime expression like val 1+val 2\src{\rtexpr{}{val\ 1+val\ 2}}, it fails to apply for stuff like (val 1+val 2)+val 3\src{\rtexpr{}{(val\ 1+val\ 2)+val\ 3}}. In order to be able to reduce these more complicated expressions, we decompose them into evaluation contexts and primitive expressions. The idea of evaluation contexts is that of a continuation, i.e., “the remaining continuation”. When evaluating something like (val 1+val 2)+val 3\src{\rtexpr{}{(val\ 1+val\ 2)+val\ 3}}, there is an outermost part that stays the same when evaluating, in this case +val 3\src{\square+val\ 3}. So, let’s define evaluation contexts:

(Evaluation Contexts) E:=EenEif E then e1 else e2\begin{aligned} \mi{(\text{Evaluation Contexts})}\ \src{E} &:= \src{\square} \mid \src{E \oplus e} \mid \src{n \oplus E} \mid \src{if\ E\ then\ e_1\ else\ e_2} \\ \end{aligned}
(** We proceed to define the dynamic semantics via evaluation contexts/environments. *)
Inductive evalctx : Type :=
| Khole : evalctx
| KbinopL (b : binopsymb) (K : evalctx) (e : expr) : evalctx
| KbinopR (b : binopsymb) (n : nat) (K : evalctx) : evalctx
| Kif (K : evalctx) (e0 e1 : expr) : evalctx
.

Filling the hole (\src{\square}) of an evaluation context with some expression e\src{e} is written E[e]\src{E[e]}. With that, define context-based reduction as follows:

e aprim e\src{\rtexpr{}{e}\psteparrow{\event}\rtexpr{}{e'}}
E[e] aectx E[e]\src{\rtexpr{}{E[e]}\ectxsteparrow{\event}\rtexpr{}{E[e']}}
(S-e-ectx)\text{\footnotesize($\src{S}$-e-ectx)}
e prim \src{\rtexpr{}{e}\psteparrow{\text{⚡}}\rtexpr{}{\text{⚡}}}
E[e] ectx \src{\rtexpr{}{E[e]}\ectxsteparrow{\text{⚡}}\rtexpr{}{\text{⚡}}}
(S-e-ectx-crash)\text{\footnotesize($\src{S}$-e-ectx-crash)}
(* Contextual steps may either have a subexpression that can do a primstep or
 have a subexpression that crashes. *)
Inductive estep : EctxStep rtexpr :=
| E_ectx : forall (e e' e0 e0' : expr) (o : option event) (K : evalctx),
  e0 = insert K e ->
  e0' = insert K e' ->
  ▷ e --[, o ]--> ▷ e' ->
  ▷ e0 ==[, o ]==> ▷ e0'
| E_stuck : forall (e e0 : expr) (K : evalctx),
  e0 = insert K e ->
  ▷ e --[ Scrash ]--> ▷↯ ->
  ▷ e0 ==[ Scrash ]==> ▷↯
.

To evaluate multiple steps, simply take the reflexive transitive closure  aectx \src{\stepsarrow{\trace}}. To evaluate multiple steps in nn, simply count down nn for each transitive step and only if it is 00 allow reflexive steps. When taking multiple steps, the trace a\src{\trace} only collects events that are non-empty - and, the way we have defined the relations, it’s either a singleton list containing the crash event (:zap:), or it is the empty list. The other events will become relevant for the top-level execution, which we define now.

prog (y;epre;epost) (x;τ1;τ2;efoo):Nepre nectx val vpreefoo[subst x for val vpre] nectx val vfooepost[subst y for val vfoo] nectx val vpost\begin{array}{ c }\vdash\src{prog\ (y;e_{pre};e_{post})\ (x;\tau_1;\tau_2;e_{foo})}:\src{\nat} \\ \src{\rtexpr{}{e_{pre}} \nstepsarrow{}{n}\rtexpr{}{val\ v_{pre}} } \\ \src{ \triangleright e_{foo}}[\text{subst }\src{x}\text{ for }\src{val\ v_{pre}}]\src{ \nstepsarrow{}{n'} \rtexpr{}{val\ v_{foo}} } \\ \src{ \triangleright e_{post} }[\text{subst }\src{y}\text{ for }\src{val\ v_{foo}}]\src{\nstepsarrow{}{n''}\rtexpr{}{val\ v_{post}}}\end{array}
prog (y;epre;epost) (x;τ1;τ2;efoo) StartCall vpreRet vfooEnd vpost val vpost\src{prog\ (y;e_{pre};e_{post})\ (x;\tau_1;\tau_2;e_{foo}) \progarrow{Start\cdot Call\ v_{pre}\cdot Ret\ v_{foo}\cdot End\ v_{post}}\rtexpr{}{val\ v_{post}} }
(S-eprog-success)\text{\footnotesize($\src{S}$-eprog-success)}
epre nectx \begin{array}{ c }\src{\rtexpr{}{e_{pre}} \nstepsarrow{⚡}{n}\rtexpr{}{⚡} }\end{array}
prog (y;epre;epost) (x;τ1;τ2;efoo) Start \src{prog\ (y;e_{pre};e_{post})\ (x;\tau_1;\tau_2;e_{foo}) \progarrow{Start\cdot ⚡}\rtexpr{}{⚡} }
(S-eprog-fail0)\text{\footnotesize($\src{S}$-eprog-fail0)}
prog (y;epre;epost) (x;τ1;τ2;efoo):Nepre nectx val vpreefoo[subst x for val vpre] nectx \begin{array}{ c }\vdash\src{prog\ (y;e_{pre};e_{post})\ (x;\tau_1;\tau_2;e_{foo})}:\src{\nat} \\ \src{\rtexpr{}{e_{pre}} \nstepsarrow{}{n}\rtexpr{}{val\ v_{pre}} } \\ \src{ \triangleright e_{foo}}[\text{subst }\src{x}\text{ for }\src{val\ v_{pre}}]\src{ \nstepsarrow{⚡}{n'} \rtexpr{}{⚡} }\end{array}
prog (y;epre;epost) (x;τ1;τ2;efoo) StartCall vpre \src{prog\ (y;e_{pre};e_{post})\ (x;\tau_1;\tau_2;e_{foo}) \progarrow{Start\cdot Call\ v_{pre}\cdot ⚡}\rtexpr{}{⚡} }
(S-eprog-fail1)\text{\footnotesize($\src{S}$-eprog-fail1)}
prog (y;epre;epost) (x;τ1;τ2;efoo):Nepre nectx val vpreefoo[subst x for val vpre] nectx val vfooepost[subst y for val vfoo] nectx \begin{array}{ c }\vdash\src{prog\ (y;e_{pre};e_{post})\ (x;\tau_1;\tau_2;e_{foo})}:\src{\nat} \\ \src{\rtexpr{}{e_{pre}} \nstepsarrow{}{n}\rtexpr{}{val\ v_{pre}} } & \src{ \triangleright e_{foo}}[\text{subst }\src{x}\text{ for }\src{val\ v_{pre}}]\src{ \nstepsarrow{}{n'} \rtexpr{}{val\ v_{foo}} } \\ \src{ \triangleright e_{post} }[\text{subst }\src{y}\text{ for }\src{val\ v_{foo}}]\src{\nstepsarrow{⚡}{n''}\rtexpr{}{⚡}}\end{array}
prog (y;epre;epost) (x;τ1;τ2;efoo) StartCall vpreRet vfoo \src{prog\ (y;e_{pre};e_{post})\ (x;\tau_1;\tau_2;e_{foo}) \progarrow{Start\cdot Call\ v_{pre}\cdot Ret\ v_{foo}\cdot ⚡}\rtexpr{}{⚡} }
(S-eprog-fail2)\text{\footnotesize($\src{S}$-eprog-fail2)}
¬(prog (y;epre;epost) (x;τ1;τ2;efoo):N)epre nectx val vpre\begin{array}{ c }\neg\left( \vdash\src{prog\ (y;e_{pre};e_{post})\ (x;\tau_1;\tau_2;e_{foo})}:\src{\nat} \right) \\ \src{\rtexpr{}{e_{pre}} \nstepsarrow{}{n}\rtexpr{}{val\ v_{pre}} }\end{array}
prog (y;epre;epost) (x;τ1;τ2;efoo) StartCall vpre \src{prog\ (y;e_{pre};e_{post})\ (x;\tau_1;\tau_2;e_{foo}) \progarrow{Start\cdot Call\ v_{pre}\cdot⚡}\rtexpr{}{⚡} }
(S-eprog-fail3)\text{\footnotesize($\src{S}$-eprog-fail3)}
Inductive progstep : prog -> tracepref -> rtexpr -> Prop :=
| prog_step_success : forall (x__foo y : vart) (e__pre e__post e__foo : expr) (foo : symbol) (v v__pre v__foo : value)
                      (n__pre n__foo n__post : nat),
  check_symb foo ->
  (* with traces, we cheat a little. All steps of the language are internal (i.e., they do not produce an action)
     To actually emit traces, we just hardcode them at the top-level... *)
  x__foo = vart_of_symbol foo ->
  e__foo = expr_of_symbol foo ->
  (* setup the argument for the function call *)
  ▷ e__pre =( n__pre )=[ nil ]==> ▷ Xval v__pre ->
  (* do the function call *)
  ▷ (subst x__foo e__foo (Xval v__pre)) =( n__foo )=[ nil ]==> ▷ Xval v__foo ->
  (* something may happen after the function call *)
subst y e__post (Xval v__foo) =( n__post )=[ nil ]==> ▷ Xval v ->
  progstep (Cprog (LContext y e__pre e__post) foo)
           (Sstart :: Scall Qctxtocomp (sv v__pre) :: Sret Qcomptoctx (sv v__foo) :: Send (sv v) :: nil)
           (RTerm (Xval v))
| prog_step_failure0 : forall (y : vart) (e__pre e__post : expr) (foo : symbol) (n__pre : nat),
  (* wether component is well-typed or not does not matter if the context yields a crash *)
  (* with traces, we cheat a little. All steps of the language are internal (i.e., they do not produce an action)
     To actually emit traces, we just hardcode them at the top-level... *)
  (* setup the argument for the function call ; fails *)
  ▷ e__pre =( n__pre )=[ Scrash :: nil ]==> (▷↯) ->
  progstep (Cprog (LContext y e__pre e__post) foo)
           (Sstart :: Scrash :: nil)
           (▷↯)
| prog_step_failure1 : forall (x__foo y : vart) (e__pre e__post e__foo : expr) (foo : symbol) (v__pre : value)
                       (n__pre n__foo : nat),
  check_symb foo ->
  (* with traces, we cheat a little. All steps of the language are internal (i.e., they do not produce an action)
     To actually emit traces, we just hardcode them at the top-level... *)
  x__foo = vart_of_symbol foo ->
  e__foo = expr_of_symbol foo ->
  (* setup the argument for the function call *)
  ▷ e__pre =( n__pre )=[ nil ]==> ▷ Xval v__pre ->
  (* do the function call ; fails *)
  ▷ (subst x__foo e__foo (Xval v__pre)) =( n__foo )=[ Scrash :: nil ]==> (▷↯) ->
  progstep (Cprog (LContext y e__pre e__post) foo)
           (Sstart :: Scall Qctxtocomp (sv v__pre) :: Scrash :: nil)
           (▷↯)
| prog_step_failure2 : forall (x__foo y : vart) (e__pre e__post e__foo : expr) (foo : symbol) (v__pre v__foo : value)
                       (n__pre n__foo n__post : nat),
  check_symb foo ->
  (* with traces, we cheat a little. All steps of the language are internal (i.e., they do not produce an action)
     To actually emit traces, we just hardcode them at the top-level... *)
  x__foo = vart_of_symbol foo ->
  e__foo = expr_of_symbol foo ->
  (* setup the argument for the function call *)
  ▷ e__pre =( n__pre )=[ nil ]==> ▷ Xval v__pre ->
  (* do the function call *)
  ▷ (subst x__foo e__foo (Xval v__pre)) =( n__foo )=[ nil ]==> ▷ Xval v__foo ->
  (* something may happen after the function call; fails *)
subst y e__post (Xval v__foo) =( n__post )=[ Scrash :: nil ]==> (▷↯) ->
  progstep (Cprog (LContext y e__pre e__post) foo)
           (Sstart :: Scall Qctxtocomp (sv v__pre) :: Sret Qcomptoctx (sv v__foo) :: Scrash :: nil)
           (▷↯)
| prog_step_failure3 : forall (x__foo y : vart) (e__pre e__post e__foo : expr) (foo : symbol) (v__pre : value)
                       (n__pre : nat),
  (* this is a dirty hack to sidestep more formalization work. we want ill-typed components to "crash", but it's a hassle
     to extend small step and bigstep accordingly. So, instead, we do it at the top level execution relation of programs *)
  ~(check_symb foo) ->
  (* with traces, we cheat a little. All steps of the language are internal (i.e., they do not produce an action)
     To actually emit traces, we just hardcode them at the top-level... *)
  x__foo = vart_of_symbol foo ->
  e__foo = expr_of_symbol foo ->
  (* setup the argument for the function call *)
  ▷ e__pre =( n__pre )=[ nil ]==> ▷ Xval v__pre ->
  (* since the component is ill-typed, just crash immediately *)
  progstep (Cprog (LContext y e__pre e__post) foo)
           (Sstart :: Scall Qctxtocomp (sv v__pre) :: Scrash :: nil)
           (▷↯)
.

Since the language only has one call, it can be hardcoded in the semantics by simulating the reduction of a nested let-expression, as done in S-eprog-success\src{S}\text{-eprog-success} .

The rules S-eprog-fail0\src{S}\text{-eprog-fail0} , S-eprog-fail1\src{S}\text{-eprog-fail1} , and S-eprog-fail2\src{S}\text{-eprog-fail2} encode all possibilities where either the context or the component crashes: the component could fail or the context can fail prior to calling in epre\src{e_{pre}} or it can fail in epost\src{e_{post}} after the component returned. Lastly, S-eprog-fail3\src{S}\text{-eprog-fail3} is a rather nasty/lazy rule that exists for technical reasons. The rule itself is “wrong” in the sense that not all ill-typed programs are able to reduce the first part of the context epre\src{e_{pre}}. So, pedantically, there should be yet another rule for ill-typed programs where epre\src{e_{pre}} may fail. Anyhow, this situation could/should also be a lemma that is proven somewhere, i.e. the component should fail if the whole program is ill-typed, and this is something that is preferrable, but not done here, out of lazyness. The main secure compilation proof is not really affected by this. :sweat_smile:

Robust satisfaction , for S\src{S}-level programs is now defined as:

Def. $\src{S}$-level Robust Property Satisfaction

p robustly satisfies π iff C,if prog C p a r, then aπ \src{p}\text{ robustly satisfies }\pi \text{ iff } \forall \src{C},\text{if }\src{prog\ C\ p \progarrow{\trace} r}\text{, then }\trace\in\pi

T\trg{T} Definitions

T\trg{T} Syntax

To make compilation at least somewhat interesting, let’s make T\trg{T} a stack-machine. It is worth noting that, while we do have control here over the (formal) definition of the target language of the soon-to-be-defined compiler, in practice, it’s only S\src{S} that you have control over, if at all. More often than not for practically relevant languages, even S\src{S} is specified by, e.g., some ISO standard ( Citation: WG14/N1256 2007 September WG14/N1256. 2007 September . Programming languages — C ) . Anyways, the commands of T\trg{T} are as follows:

(Values) v:=truefalsen          , where n is a number(Types) τ:=NB(Commands) c:=push vbop aborthas τquote [c]  unquoteifget xnegb(Components) p:=(x;c)      (Contexts) C:=(y;c1;c2)(Whole Programs) W:=prog C p\begin{aligned} \mi{(\text{Values})}\ \trg{v} &:= \trg{true} \mid \trg{false} \mid \trg{n} \ \ \ \ \ \ \ \ \ \ \text{, where } \trg{n} \text{ is a number} \\ \mi{(\text{Types})}\ \trg{\tau} &:= \trg{\nat} \mid \trg{\bool} \\ \mi{(\text{Commands})}\ \trg{c} &:= \trg{push\ v} \mid \trg{bop\ \oplus} \mid \trg{abort} \mid \trg{has\ \tau} \mid \trg{quote\ [\overline{c}]} \\ &\ \ \mid \trg{unquote} \mid \trg{if} \mid \trg{get\ x} \mid \trg{negb} \\ \mi{(\text{Components})}\ \trg{p} &:= (\trg{x};\trg{\overline{c}})\ \ \ \ \ \ \mi{(\text{Contexts})}\ \trg{C} := (\trg{y};\trg{\overline{c_1}};\trg{\overline{c_2}}) \\ \mi{(\text{Whole Programs})}\ \trg{W} &:= \trg{prog\ C\ p} \end{aligned}
(** Values are numbers or booleans. *)
Inductive value : Type :=
| Vnat : nat -> value
| Vbool : bool -> value
.
(** _Dynamic_ Types of T *)
Inductive ty : Type :=
| Tnat : ty
| Tbool : ty
.
(** List of commands available in T. The abstract syntax is just a list *)
Inductive cmd : Type :=
| Cpush : value -> cmd
| Cbop : binopsymb -> cmd
| Cabort : cmd
| Chas : ty -> cmd
| Cquote : list cmd -> cmd
| Cunquote : cmd
| Cif : cmd
| Cget : vart -> cmd
| Cnegb : cmd
.
(** Components *)
Definition symbol : Type := vart * list cmd.

(** Viable contexts that can be linked against a symbol. *)
Inductive LinkageContext : Type :=
(* basically models `let y = call foo e__pre in e__post` *)
| LContext (y : vart) (cmds__pre cmds__post : list cmd) : LinkageContext
.
(** Top-level programs *)
Inductive prog : Type := Cprog (C : LinkageContext) (foo : symbol) : prog.

The notation c\trg{\overline{c}} means “list of c\trg{c}”.

T\trg{T} Dynamic Semantics

As before, we first define runtime programs. Contrary to S\src{S}, runtime programs in T\trg{T} depend on a stack and a heap:

(Stack Element) s:=vc(Stack) Ψ:=[]s,Ψ(Heap) Δ:=xv(Runtime Programs) r:=Ψ;Δc:zap:\begin{aligned} \mi{(\text{Stack\ Element})}\ \trg{s} &:= \trg{v} \mid \trg{\overline{c}} & \mi{(\text{Stack})}\ \trg{\Psi} &:= \trg{[\cdot]} \mid \trg{s,\Psi} \\ \mi{(\text{Heap})}\ \trg{\Delta} &:= \trg{\emptyset} \mid \trg{x\mapsto v} & \mi{(\text{Runtime Programs})}\ \trg{r} &:= \trg{\rtexpr{\Psi;\Delta}{\overline{c}}} \mid \trg{\rtexpr{}{:zap:}}& \end{aligned}
(** Stack elements are either values or code. *)
Inductive stack_el : Type :=
| Sval : value -> stack_el
| Scode : list cmd -> stack_el
.
Definition stack : Type := list stack_el.
Definition store : Type := option (vart * value).

(** A runtime program is a stack and heap with a list of commands or a crashed state. *)
Inductive rtprog : Type :=
| RPTerm (Ψ : stack) (Δ : store) (cmds : list cmd)
| RPCrash
.

The “heap” is a singleton, it can store at most one variable. Stacks may either contain a list of commands as one element or a value.

Since the trace-models of S\src{S} and T\trg{T} are exactly the same, they are not listed in the above definition. Just color everything related to the trace model in the previous definition , red.

In terms of evaluation, the semantics does not need contexts, since it’s just popping off commands off of the list of commands in a given runtime program. The context is simply the remaining list of commands, but this is being taken care of by the reflexive-transitive closure as well as the step-indexed version. Most commands are straightforward:

 
true,Ψ;Δnegb,c ectx false,Ψ;Δc\trg{\rtexpr{true,\Psi;\Delta}{negb,\overline{c}}\ectxsteparrow{}\rtexpr{false,\Psi;\Delta}{\overline{c}}}
(T-e-negb-true)\text{\footnotesize($\trg{T}$-e-negb-true)}
 
false,Ψ;Δnegb,c ectx true,Ψ;Δc\trg{\rtexpr{false,\Psi;\Delta}{negb,\overline{c}}\ectxsteparrow{}\rtexpr{true,\Psi;\Delta}{\overline{c}}}
(T-e-negb-false)\text{\footnotesize($\trg{T}$-e-negb-false)}
 
Ψ;xvget x,c ectx v,Ψ;xvc\trg{\rtexpr{\Psi;x\mapsto v}{get\ x,\overline{c}}\ectxsteparrow{}\rtexpr{v,\Psi;x\mapsto v}{\overline{c}}}
(T-e-get)\text{\footnotesize($\trg{T}$-e-get)}
 
true,cbdy,Ψ;Δif,c ectx Ψ;Δcbdy,c\trg{\rtexpr{true,\overline{c_{bdy}},\Psi;\Delta}{if,\overline{c}}\ectxsteparrow{}\rtexpr{\Psi;\Delta}{\overline{c_{bdy}},\overline{c}}}
(T-e-if-true)\text{\footnotesize($\trg{T}$-e-if-true)}
 
false,cbdy,Ψ;Δif,c ectx Ψ;Δc\trg{\rtexpr{false,\overline{c_{bdy}},\Psi;\Delta}{if,\overline{c}}\ectxsteparrow{}\rtexpr{\Psi;\Delta}{\overline{c}}}
(T-e-if-false)\text{\footnotesize($\trg{T}$-e-if-false)}
 
c1,Ψ;Δunquote,c2 ectx Ψ;Δc1,c2\trg{\rtexpr{\overline{c_{1}},\Psi;\Delta}{unquote,\overline{c_2}}\ectxsteparrow{}\rtexpr{\Psi;\Delta}{\overline{c_1},\overline{c_2}}}
(T-e-unquote)\text{\footnotesize($\trg{T}$-e-unquote)}
 
Ψ;Δquote [c1],c2 ectx c1,Ψ;Δc2\trg{\rtexpr{\Psi;\Delta}{quote\ [\overline{c_1}],\overline{c_2}}\ectxsteparrow{}\rtexpr{\overline{c_1},\Psi;\Delta}{\overline{c_2}}}
(T-e-quote)\text{\footnotesize($\trg{T}$-e-quote)}
 
Ψ;Δpush v,c ectx v,Ψ;Δc\trg{\rtexpr{\Psi;\Delta}{push\ v,\overline{c}}\ectxsteparrow{}\rtexpr{v,\Psi;\Delta}{\overline{c}}}
(T-e-push)\text{\footnotesize($\trg{T}$-e-push)}
eval_binop(;n1;n2)=v\text{eval\_binop}(\trg{\oplus};\trg{n_1};\trg{n_2})=\trg{v}
n1,n2,Ψ;Δbop ,c ectx v,Ψ;Δc\trg{\rtexpr{n_1,n_2,\Psi;\Delta}{bop\ \oplus,\overline{c}}\ectxsteparrow{}\rtexpr{v,\Psi;\Delta}{\overline{c}}}
(T-e-bop)\text{\footnotesize($\trg{T}$-e-bop)}
 
Ψ;Δabort,c ectx \trg{\rtexpr{\Psi;\Delta}{abort,\overline{c}}\ectxsteparrow{}\rtexpr{}{⚡}}
(T-e-abort)\text{\footnotesize($\trg{T}$-e-abort)}
 
n,Ψ;Δhas N,c ectx true,Ψ;Δc\trg{\rtexpr{n,\Psi;\Delta}{has\ \nat,\overline{c}}\ectxsteparrow{}\rtexpr{true,\Psi;\Delta}{\overline{c}}}
(T-e-hasN-true)\text{\footnotesize($\trg{T}$-e-has$\nat$-true)}
sn\trg{s}\not=\trg{n}
s,Ψ;Δhas N,c ectx false,Ψ;Δc\trg{\rtexpr{s,\Psi;\Delta}{has\ \nat,\overline{c}}\ectxsteparrow{}\rtexpr{false,\Psi;\Delta}{\overline{c}}}
(T-e-hasN-false)\text{\footnotesize($\trg{T}$-e-has$\nat$-false)}
strue and sfalse\begin{array}{ rcl }\trg{s}\not=\trg{true} & \text{ and } & \trg{s}\not=\trg{false}\end{array}
s,Ψ;Δhas B,c ectx false,Ψ;Δc\trg{\rtexpr{s,\Psi;\Delta}{has\ \bool,\overline{c}}\ectxsteparrow{}\rtexpr{false,\Psi;\Delta}{\overline{c}}}
(T-e-hasB-false)\text{\footnotesize($\trg{T}$-e-has$\bool$-false)}
s=true or s=false\begin{array}{ rcl }\trg{s}=\trg{true} & \text{ or } & \trg{s}=\trg{false}\end{array}
s,Ψ;Δhas B,c ectx true,Ψ;Δc\trg{\rtexpr{s,\Psi;\Delta}{has\ \bool,\overline{c}}\ectxsteparrow{}\rtexpr{true,\Psi;\Delta}{\overline{c}}}
(T-e-hasB-true)\text{\footnotesize($\trg{T}$-e-has$\bool$-true)}
(* Contextual steps just eat off the top of the list, i.e., the context is the rest of the list. *)
Inductive estep : EctxStep rtprog :=
| e_negb : forall: stack) (Δ : store) (cmds : list cmd) (b : bool),
  s(Sval b :: Ψ ; Δ ▷ Cnegb :: cmds) ==[]==> s(Sval (negb b) :: Ψ ; Δ ▷ cmds)
| e_get : forall: stack) (Δ : store) (cmds : list cmd) (x : vart) (v : value),
  Some (x, v) = Δ ->
  s(Ψ ; Δ ▷ Cget x :: cmds) ==[]==> s(Sval v :: Ψ ; Δ ▷ cmds)
| e_if_true : forall: stack) (Δ : store) (cmds thens : list cmd),
  s(Sval true :: Scode thens :: Ψ ; Δ ▷ Cif :: cmds) ==[]==> s(Ψ ; Δ ▷ thens ++ cmds)
| e_if_false : forall: stack) (Δ : store) (cmds thens : list cmd),
  s(Sval false :: Scode thens :: Ψ ; Δ ▷ Cif :: cmds) ==[]==> s(Ψ ; Δ ▷ cmds)
| e_unquote : forall: stack) (Δ : store) (cmds what : list cmd),
  s(Scode what :: Ψ ; Δ ▷ Cunquote :: cmds) ==[]==> s(Ψ ; Δ ▷ what ++ cmds)
| e_quote : forall: stack) (Δ : store) (cmds what : list cmd),
  s(Ψ ; Δ ▷ Cquote what :: cmds) ==[]==> s(Scode what :: Ψ ; Δ ▷ cmds)
| e_hasnat_true : forall: stack) (Δ : store) (cmds : list cmd) (s : stack_el) (n : nat),
  s = Sval n ->
  s(s :: Ψ ; Δ ▷ Chas Tℕ :: cmds) ==[]==> s(Sval true :: Ψ ; Δ ▷ cmds)
| e_hasnat_false : forall: stack) (Δ : store) (cmds : list cmd) (s : stack_el),
  (forall (n : nat), s <> Sval n) ->
  s(s :: Ψ ; Δ ▷ Chas Tℕ :: cmds) ==[]==> s(Sval false :: Ψ ; Δ ▷ cmds)
| e_hasbool_true : forall: stack) (Δ : store) (cmds : list cmd) (s : stack_el) (b : bool),
  s = Sval b ->
  s(s :: Ψ ; Δ ▷ Chas T𝔹 :: cmds) ==[]==> s(Sval true :: Ψ ; Δ ▷ cmds)
| e_hasbool_false : forall: stack) (Δ : store) (cmds : list cmd) (s : stack_el),
  (forall (b : bool), s <> Sval b) ->
  s(s :: Ψ ; Δ ▷ Chas T𝔹 :: cmds) ==[]==> s(Sval false :: Ψ ; Δ ▷ cmds)
| e_abort : forall: stack) (Δ : store) (cmds : list cmd),
  s(Ψ ; Δ ▷ Cabort :: cmds) ==[ Scrash ]==> ▷↯
| e_bop : forall: stack) (Δ : store) (cmds : list cmd) (n1 n2 : nat) (b : binopsymb) (v : value),
  Some v = eval_binop b n1 n2 ->
  s(Sval n2 :: Sval n1 :: Ψ ; Δ ▷ Cbop b :: cmds) ==[]==> s(Sval v :: Ψ ; Δ ▷ cmds)
| e_push : forall: stack) (Δ : store) (cmds : list cmd) (v : value),
  s(Ψ ; Δ ▷ Cpush v :: cmds) ==[]==> s(Sval v :: Ψ ; Δ ▷ cmds)
. 

Quotation can push lists of commands onto the stack and is a necessity for non-trivial control-flow via the if\trg{if} command. if\trg{if} always expects at least two elements on the stack, where the first is a bool and the second gives the consequence that should be run next if that bool is true. Typechecking can only happen dynamically by inspecting the shape of the top-element on the stack, which ultimately gets replaced by the typecheck with a boolean that indicates wether the stack element has said type or not. Lists of commands, that are on the stack, do not have any type and yield for both has N\trg{has\ \nat} and has B\trg{has\ \bool} the value false\trg{false}.

Lastly, we need to define execution of whole programs, which is entirely similar to that of S\src{S}, albeit without typechecks. Instead of substituting the variable for the value when simulating the single function call, the heap is populated with the value.

[];cpre nectx vpre,Ψ;[]Ψ;xvprecfoo nectx vfoo,Ψ;xvpre[]vfoo,Ψ;cpost nectx vpost,[];[]\begin{array}{ c }\trg{\rtexpr{[\cdot];\emptyset}{\overline{c_{pre}}} \nstepsarrow{}{n}\rtexpr{v_{pre},\Psi;\emptyset}{[\cdot]} } \\ \trg{ \rtexpr{\Psi;x\mapsto v_{pre}}{\overline{c_{foo}}} \nstepsarrow{}{n'} \rtexpr{v_{foo},\Psi;x\mapsto v_{pre}}{[\cdot]} } \\ \trg{ \rtexpr{v_{foo},\Psi;\emptyset}{\overline{c_{post}}} \nstepsarrow{}{n''}\rtexpr{v_{post},[\cdot];\emptyset}{[\cdot]}}\end{array}
prog (y;cpre;cpost) (x;cfoo) StartCall vpreRet vfooEnd vpost vpost,[];[]\trg{prog\ (y;\overline{c_{pre}};\overline{c_{post}})\ (x;\overline{c_{foo}}) \progarrow{Start\cdot Call\ v_{pre}\cdot Ret\ v_{foo}\cdot End\ v_{post}}\rtexpr{v_{post},[\cdot];\emptyset}{[\cdot]} }
(T-eprog-success)\text{\footnotesize($\trg{T}$-eprog-success)}
[];cpre nectx \begin{array}{ c }\trg{\rtexpr{[\cdot];\emptyset}{\overline{c_{pre}}} \nstepsarrow{⚡}{n}\rtexpr{}{⚡} }\end{array}
prog (y;cpre;cpost) (x;cfoo) Start \trg{prog\ (y;\overline{c_{pre}};\overline{c_{post}})\ (x;\overline{c_{foo}}) \progarrow{Start\cdot ⚡}\rtexpr{}{⚡} }
(T-eprog-fail0)\text{\footnotesize($\trg{T}$-eprog-fail0)}
[];cpre nectx vpre,Ψ;[]Ψ;xvprecfoo nectx \begin{array}{ c }\trg{\rtexpr{[\cdot];\emptyset}{\overline{c_{pre}}} \nstepsarrow{}{n}\rtexpr{v_{pre},\Psi;\emptyset}{[\cdot]} } \\ \trg{ \rtexpr{\Psi;x\mapsto v_{pre}}{\overline{c_{foo}}} \nstepsarrow{⚡}{n'} \rtexpr{}{⚡} }\end{array}
prog (y;cpre;cpost) (x;cfoo) StartCall vpre \trg{prog\ (y;\overline{c_{pre}};\overline{c_{post}})\ (x;\overline{c_{foo}}) \progarrow{Start\cdot Call\ v_{pre}\cdot ⚡}\rtexpr{}{⚡} }
(T-eprog-fail1)\text{\footnotesize($\trg{T}$-eprog-fail1)}
[];cpre nectx vpre,Ψ;[]Ψ;xvprecfoo nectx vfoo,Ψ;xvpre[]vfoo,Ψ;cpost nectx \begin{array}{ c }\trg{\rtexpr{[\cdot];\emptyset}{\overline{c_{pre}}} \nstepsarrow{}{n}\rtexpr{v_{pre},\Psi;\emptyset}{[\cdot]} } \\ \trg{ \rtexpr{\Psi;x\mapsto v_{pre}}{\overline{c_{foo}}} \nstepsarrow{}{n'} \rtexpr{v_{foo},\Psi;x\mapsto v_{pre}}{[\cdot]} } \\ \trg{ \rtexpr{v_{foo},\Psi;\emptyset}{\overline{c_{post}}} \nstepsarrow{⚡}{n''}\rtexpr{}{⚡}}\end{array}
prog (y;cpre;cpost) (x;cfoo) StartCall vpreRet vfoo \trg{prog\ (y;\overline{c_{pre}};\overline{c_{post}})\ (x;\overline{c_{foo}}) \progarrow{Start\cdot Call\ v_{pre}\cdot Ret\ v_{foo}\cdot ⚡}\rtexpr{}{⚡} }
(T-eprog-fail2)\text{\footnotesize($\trg{T}$-eprog-fail2)}
(** Defines top-level program execution. *)
Inductive progstep : prog -> tracepref -> rtprog -> Prop :=
| prog_step_success : forall (x__foo y : vart) (cmds__pre cmds__post cmds__foo : list cmd) (foo : symbol) (v v__pre v__foo : value)
: stack) (n__pre n__foo n__post : nat),
  (* with traces, we cheat a little. All steps of the language are internal (i.e., they do not produce an action)
     To actually emit traces, we just hardcode them at the top-level... *)
  x__foo = vart_of_symbol foo ->
  cmds__foo = body_of_symbol foo ->
  (* setup the argument for the function call *)
  s(nil ; None ▷ cmds__pre) =( n__pre )=[ nil ]==> (s(Sval v__pre :: Ψ ; Nonenil)) ->
  (* do the function call *)
  s(Ψ ; Some (x__foo, v__pre) ▷ cmds__foo) =( n__foo )=[ nil ]==> s(Sval v__foo :: Ψ ; Some (x__foo, v__pre) ▷ nil) ->
  (* something may happen after the function call *)
  s(Sval v__foo :: Ψ ; None ▷ cmds__post) =( n__post )=[ nil ]==> s(Sval v :: nil ; Nonenil) ->
  progstep (Cprog (LContext y cmds__pre cmds__post) foo)
           (Sstart :: Scall Qctxtocomp (sv v__pre) :: Sret Qcomptoctx (sv v__foo) :: Send (sv v) :: nil)
           (RPTerm (Sval v :: nil) None nil)
| prog_step_failure0 : forall (y : vart) (cmds__pre cmds__post : list cmd) (foo : symbol) (n__pre : nat),
  (* with traces, we cheat a little. All steps of the language are internal (i.e., they do not produce an action)
     To actually emit traces, we just hardcode them at the top-level... *)
  (* setup the argument for the function call ; fails *)
  s(nil ; None ▷ cmds__pre) =( n__pre )=[ Scrash :: nil ]==> (▷↯) ->
  progstep (Cprog (LContext y cmds__pre cmds__post) foo)
           (Sstart :: Scrash :: nil)
           (RPCrash)
| prog_step_failure1 : forall (x__foo y : vart) (cmds__pre cmds__post cmds__foo : list cmd) (foo : symbol) (v__pre : value)
                          (Ψ__pre : stack) (n__pre n__foo : nat),
  (* with traces, we cheat a little. All steps of the language are internal (i.e., they do not produce an action)
     To actually emit traces, we just hardcode them at the top-level... *)
  x__foo = vart_of_symbol foo ->
  cmds__foo = body_of_symbol foo ->
  (* setup the argument for the function call *)
  s(nil ; None ▷ cmds__pre) =( n__pre )=[ nil ]==> (s(Sval v__pre :: Ψ__pre ; Nonenil)) ->
  (* do the function call ; fails *)
  s(Ψ__pre ; Some(x__foo, v__pre) ▷ cmds__foo) =( n__foo )=[ Scrash :: nil ]==> (▷↯) ->
  progstep (Cprog (LContext y cmds__pre cmds__post) foo)
           (Sstart :: Scall Qctxtocomp (sv v__pre) :: Scrash :: nil)
           (RPCrash)
| prog_step_failure2 : forall (x__foo y : vart) (cmds__pre cmds__post cmds__foo : list cmd) (foo : symbol) (v__pre v__foo : value)
                          (Ψ__pre : stack) (n__pre n__foo n__post : nat),
  (* with traces, we cheat a little. All steps of the language are internal (i.e., they do not produce an action)
     To actually emit traces, we just hardcode them at the top-level... *)
  x__foo = vart_of_symbol foo ->
  cmds__foo = body_of_symbol foo ->
  (* setup the argument for the function call *)
  s(nil ; None ▷ cmds__pre) =( n__pre )=[ nil ]==> (s(Sval v__pre :: Ψ__pre ; Nonenil)) ->
  (* do the function call *)
  s(Ψ__pre ; Some(x__foo, v__pre) ▷ cmds__foo) =( n__foo )=[ nil ]==> s(Sval v__foo :: Ψ__pre ; Some(x__foo, v__pre) ▷ nil) ->
  (* something may happen after the function call; fails *)
  s(Sval v__foo :: Ψ__pre ; None ▷ cmds__post) =( n__post )=[ Scrash :: nil ]==> (▷↯) ->
  progstep (Cprog (LContext y cmds__pre cmds__post) foo)
           (Sstart :: Scall Qctxtocomp (sv v__pre) :: Sret Qcomptoctx (sv v__foo) :: Scrash :: nil)
           (RPCrash)
.

Compiler from S\src{S} to T\trg{T}

The compilation of values is a trivial recoloring, although one could map true\src{true} to false\trg{false} and false\src{false} to true\trg{true}, or some number, to complicate matters (not done here, because why??).

(v Compiler) truevST= truefalsevST= falsenvST= n\begin{aligned} \mi{(\src{v}\text{ Compiler})}\ & \comp{\src{S}}{\trg{T}}{\src{true}}_v =\ \trg{true} \\ & \comp{\src{S}}{\trg{T}}{\src{false}}_v =\ \trg{false} \\ & \comp{\src{S}}{\trg{T}}{\src{n}}_v =\ \trg{n} \end{aligned}
(** Compiling values is just a "recoloring", nothing interesting happening here *)
Definition compile_v (v : S.value) : T.value :=
match v with
| S.Vnat n => T.Vnat n
| S.Vbool b => T.Vbool b
end
.

Now, for expressions, most parts are simple, since they do not involve any structurual recursion with regards to the expression syntax. By that metric, the binary operations as well as the conditionals are the interesting cases.

(e Compiler)val veST= push vvST,[]var xeST= x,[]aborteST= abort,[]e1e2eST= e2eST,e1eST,bop ,[]if e1 then e2 else e3eST= quote e2eST,e1eST,if, quote e3eST,negb,if,[]\begin{array}{rrcl} &\mi{(\src{e}\text{ Compiler})}&&\\[1em] & \comp{\src{S}}{\trg{T}}{\src{val\ v}}_e &=&\ \trg{push}\ \comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{,[\cdot]} \\ & \comp{\src{S}}{\trg{T}}{\src{var\ x}}_e &=&\ \trg{x,[\cdot]} \\ & \comp{\src{S}}{\trg{T}}{\src{abort}}_e &=&\ \trg{abort,[\cdot]} \\ & \comp{\src{S}}{\trg{T}}{\src{e_1\oplus e_2}}_e &=&\ \comp{\src{S}}{\trg{T}}{\src{e_2}}_e,\comp{\src{S}}{\trg{T}}{\src{e_1}}_e,\trg{bop\ \oplus},\trg{[\cdot]} \\ & \comp{\src{S}}{\trg{T}}{\src{if\ e_1\ then\ e_2\ else\ e_3}}_e &=&\ \trg{quote\ }\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,if,}\\ & && \ \trg{quote\ }\comp{\src{S}}{\trg{T}}{\src{e_3}}_e\trg{,negb,}\trg{if,[\cdot]} \\ \end{array}
(** Compiling expressions the more or less standard way when compiling to stack-based languages *)
Fixpoint compile_e (e : S.expr) : list T.cmd :=
match e with
| S.Xval v => T.Cpush (compile_v v) :: nil
| S.Xvar x => T.Cget x :: nil
| S.Xbinop symb e1 e2 => compile_e e1 ++ compile_e e2 ++ T.Cbop symb :: nil
| S.Xif e1 e2 e3 => T.Cquote (compile_e e2) :: compile_e e1 ++ T.Cif
                  :: T.Cquote (compile_e e3) :: compile_e e1 ++ T.Cnegb :: T.Cif :: nil
| S.Xabort => T.Cabort :: nil
end
.

The compilation of binary expressions converts from infix-style to somewhat reverse polish. Importantly (and contrary to reverse polish), the second subexpression needs to be pushed on the stack prior to the first subexpression. Consider the compilation of val 1+val 2\src{val\ 1+val\ 2}, which will result in the list of commands push 2,push 1,bop +\trg{push\ 2,push\ 1,bop\ +}. Starting with the empty stack and empty heap, by transitivity of  aectx \trg{\stepsarrow{\trace}} and applying T\trg{T}-e-push , twice and concluding with T\trg{T}-e-bop , we have:

[];push 2,push 1,bop +,[] ectx 2,[];push 1,bop +,[] ectx 1,2,[];bop +,[] ectx 3,[];[]\begin{array}{lc} \trg{\rtexpr{[\cdot];\emptyset}{push\ 2,push\ 1,bop\ +,[\cdot]}}&\trg{\stepsarrow{}}\\ \trg{\rtexpr{2,[\cdot];\emptyset}{push\ 1,bop\ +,[\cdot]}}&\trg{\stepsarrow{}}\\ \trg{\rtexpr{1,2,[\cdot];\emptyset}{bop\ +,[\cdot]}}&\trg{\stepsarrow{}}\\ \trg{\rtexpr{3,[\cdot];\emptyset}{[\cdot]}}& \end{array}

Similarily for conditionals. In contrast to S\src{S}, where conditionals always have both consequence and alternative cases, the conditional in T\trg{T} only has a consequence. Because of this, a single S\src{S} conditional compiles into two T\trg{T} conditionals, where the second one uses the negated version of the compilation of the conditional.

Finally, compilation of components relies on the S\src{S} components to be well-typed, i.e., there is some function check\operatorname{check} which satisfies the equivalence check(p)=τ1τ2\operatorname{check}(\src{p})=\src{\tau_1\to\tau_2} iff pp:τ1τ2\vdash_p\src{p}:\src{\tau_1\to\tau_2}.

(Wrapper)wrap(x)=quote [abort,[]],get xST,has N,negb,if,[](p Compiler)(x;τ1;τ2;ebdy)vST= {wrap(x),e_bdyST,[]if check((x;τ1;τ2;e_bdy))=τ1τ2abort,[]otherwise\begin{aligned} \mi{(\text{Wrapper})}&\\ &\operatorname{wrap}(\src{x})=\trg{quote\ [abort,[\cdot]],get\ }\comp{\src{S}}{\trg{T}}{\src{x}}\trg{, has\ \nat, negb, if, [\cdot]}\\[1em] \mi{(\src{p}\text{ Compiler})}&\\[-2ex] &\comp{\src{S}}{\trg{T}}{(\src{x};\src{\tau_1};\src{\tau_2};\src{e_{bdy}})}_v = \ \begin{cases} \operatorname{wrap}(\src{x})\trg{,}\comp{\src{S}}{\trg{T}}{\src{e\_{bdy}}}\trg{,[\cdot]} & \text{if }\operatorname{check}((\src{x};\src{\tau_1};\src{\tau_2};\src{e\_{bdy}}))=\src{\tau_1\to\tau_2} \\ \trg{abort,[\cdot]} & \text{otherwise} \end{cases} \\ \end{aligned}
(** Assume there exists an implementation of a typechecker for S. *)
(** This is a nice exercise to try and extend S/static.v with a `Fixpoint`
  defining the `check` axiom as well as a lemma proving `check_equiv_Scheck`. *)
Axiom check : S.symbol -> bool.
Axiom check_equiv_Scheck : forall (foo : S.symbol), check foo = true <-> S.check_symb foo.

(** To compile symbols, the compiler has to ensure that the target context cannot pass ill-typed arguments. *)
(** Furthermore, it requires that the component is well-typed. *)
Definition compile_symb' (foo : S.symbol) : T.symbol :=
let x__foo := S.vart_of_symbol foo in
let e__foo := S.expr_of_symbol foo in
let wrapper := T.Cquote (T.Cabort :: nil) :: T.Cget x__foo
  :: T.Chas T.Tnat :: T.Cnegb :: T.Cif :: nil in
(x__foo, wrapper ++ compile_e e__foo)
.
Definition compile_symb (foo : S.symbol) : T.symbol :=
if check foo then
  compile_symb' foo
else
  (S.vart_of_symbol foo, T.Cabort :: nil)
.

Crucially, the compiler inserts wrapper code to ensure that the component is robust with respect to well-typing. Consider an insecure compiler (that is nonetheless a correct compiler) which does not insert the wrapper code. Suppose it compiles the S\src{S}-level component (x;N;N;val 1var x)(\src{x};\src{\nat};\src{\nat};\src{val\ 1 - var\ x}). The compiled component of this is (x;get x,push 1,bop ,[])(\trg{x};\trg{get\ x,push\ 1,bop\ -,[\cdot]}). Now, note that there is no possible S\src{S}-level context C\src{C} that when linked to this component renders the resulting whole program ill-typed (see S\src{S}-t-whole and S\src{S}-t-context ). However, consider the T\trg{T}-level context (y;true;y)(\trg{y};\trg{true};\trg{y}) and observe the component-level execution (see T\trg{T}-eprog-success ) of the whole program that results from linking that context with the above compiled component:

Ψ;xtrueget x,push 1,bop ,[] ectx true,Ψ;xtruepush 1,bop ,[] ectx 1,true,Ψ;xtruebop ,[]stuck!\begin{array}{lc} \trg{\rtexpr{\Psi;x\mapsto true}{get\ x,push\ 1,bop\ -,[\cdot]} } & \trg{\stepsarrow{}} \\ \trg{\rtexpr{true,\Psi;x\mapsto true}{push\ 1,bop\ -,[\cdot]} } & \trg{\stepsarrow{}} \\ \trg{\rtexpr{1,true,\Psi;x\mapsto true}{bop\ -,[\cdot]} } & \text{stuck!} \end{array}

The rule T\trg{T}-e-bop is the only reasonable candidate to reduce this further, but it does not apply, since binary operations with mixed arguments (one a number, the other a boolean) are left undefined. Put differently, the values mismatch, i.e., there is a (dynamic) typing error! So, with this, it is easy to see that the compiler without the wrapper code is not enough to preserve type safety: While type safety does hold robustly in S\src{S}, it does not in T\trg{T}. So, the compiler has to do something to ensure the robust preservation of type safety.

Let’s see what happens with the execution of the component when the wrapper code is prepended to the component, the bad context stays the same:

Ψ;xtruequote [abort,[]],get x,has N,negb,if,get x,push 1,bop ,[] ectx [abort,[]],Ψ;xtrueget x,has N,negb,if,get x,push 1,bop ,[] ectx true,[abort,[]],Ψ;xtruehas N,negb,if,get x,push 1,bop ,[] ectx false,[abort,[]],Ψ;xtruenegb,if,get x,push 1,bop ,[] ectx true,[abort,[]],Ψ;xtrueif,get x,push 1,bop ,[] ectx Ψ;xtrueabort,get x,push 1,bop ,[] ectx \begin{array}{lc} \trg{\rtexpr{\Psi;x\mapsto true}{quote\ [abort,[\cdot]],get\ x,has\ \nat,negb,if,get\ x,push\ 1,bop\ -,[\cdot]} } & \trg{\stepsarrow{}} \\ \trg{\rtexpr{[abort,[\cdot]],\Psi;x\mapsto true}{get\ x,has\ \nat,negb,if,get\ x,push\ 1,bop\ -,[\cdot]} } & \trg{\stepsarrow{}} \\ \trg{\rtexpr{true,[abort,[\cdot]],\Psi;x\mapsto true}{has\ \nat,negb,if,get\ x,push\ 1,bop\ -,[\cdot]} } & \trg{\stepsarrow{}} \\ \trg{\rtexpr{false,[abort,[\cdot]],\Psi;x\mapsto true}{negb,if,get\ x,push\ 1,bop\ -,[\cdot]} } & \trg{\stepsarrow{}} \\ \trg{\rtexpr{true,[abort,[\cdot]],\Psi;x\mapsto true}{if,get\ x,push\ 1,bop\ -,[\cdot]} } & \trg{\stepsarrow{}} \\ \trg{\rtexpr{\Psi;x\mapsto true}{abort,get\ x,push\ 1,bop\ -,[\cdot]} } & \trg{\stepsarrow{⚡}} \\ \trg{\rtexpr{}{⚡} } & \\ \end{array}

With the wrapper code, the compiler prevents the execution of any component-level code past this wrapper if the precondition that the argument has type “number” is not met.

The rest of this post now attempts to sketch a proof that this compiler can indeed be considered “secure”, because it robustly preserves our property of interest: well-typing.

Simulations

An incorrect, but secure, compiler is a questionable goal. S\src{S} components could always compile to abort\trg{abort} and - heyoho - security! But it is immediate that such a compiler is undesirable. That’s why we also take a small detour to compiler correctness, which is actually a spectrum ( Citation: Patterson and Ahmed 2019 jul Daniel Patterson and Amal Ahmed. 2019 jul . The next 700 Compiler Correctness Theorems (Functional Pearl) ) and not a binary property (like, it’s not just “yeah it is correct”, it’s also things like “it’s kinda correct”).

Intuitively, what is wanted from compiler correctness, is that “anything the source does, the compiled version also does”. This correspondence cannot easily happen in lockstep: either source or target language have different features and the execution may differ for a bit. For example, consider a relation between runtime expressions and runtime programs (\approx) which, in whatever sense, shall mean “these two are the quote-on-quote same”. Then, the following figure depicts a very simple situation in languages S\src{S} and T\trg{T} where the S\src{S} program takes less steps than the accompanying T\trg{T} program. Each downward arrow is a  εectx \ectxsteparrow{\varepsilon}-step in either S\src{S} or T\trg{T}.

Because of these kinds of discrepancies, there are several different flavors of simulation diagrams in literature. For our purposes, we are only interested in a straightforward forward simulation (:sweat_smile:) as well as a backward simulation. The forward simulation intuitively captures “the source does steps, so does the compiled version” and the backward simulation is dual, i.e., “the compiled version does steps, so does the source”. We’ll see their lemma statements and a proofsketch for both of these in the following subsections.

Forward Simulation

Suppose we have an execution e nectx v\src{\rtexpr{}{e}\nstepsarrow{}{n}\rtexpr{}{v}} that reaches the final runtime expression v\src{\rtexpr{}{v}} in nn steps. Morally, the result we want is that there is some n\trg{n'} such that [];  eeST nectx [];  veST\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e}}_e\trg{\nstepsarrow{}{n'}}\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{v}}_e. But this does not work out even in one of the most simple cases where e=val 1\src{e}=\src{val\ 1}:

 val 1 0ectx [];  val 1eST= val 1[];push 1 1ectx 1,[];[] 0ectx 1,[];[]\begin{array}{lc|lc} \src{\triangleright\ val\ 1} & \src{\nstepsarrow{}{0}} & \trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{val\ 1}}_e & = \\ \src{\triangleright\ val\ 1} & & \trg{\rtexpr{[\cdot];\emptyset}{push\ 1}} & \trg{\nstepsarrow{}{1}} \\ &&\trg{\rtexpr{1,[\cdot];\emptyset}{[\cdot]}} & \trg{\nstepsarrow{}{0}} \\ &&\trg{\rtexpr{1,[\cdot];\emptyset}{[\cdot]}} & \\ \end{array}

While the S\src{S}-level execution is reflexive on 1\src{\rtexpr{}{1}}, the corresponding, compiled version does not have a reflexive T\trg{T}-level execution. There, the final state is actually 1,[][]\trg{\rtexpr{1,[\cdot]}{[\cdot]}}. This is easy to fix: what we actually want is that there exists an n\trg{n} such that [];  eeST nectx vvST,[];[]\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e}}_e\trg{\nstepsarrow{}{n}}\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}.

Let’s try to prove:

Lem. Forward Simulation (1st try)

If e nectx v,then n,[];  eeST nectx  vvST,[];[]\begin{array}{l}\text{If }\src{\rtexpr{}{e}\nstepsarrow{}{n}\rtexpr{}{v}},\\\text{then }\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e}}_e\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}\end{array}

Proof.

Induction on e\src{e}. This section only lists a few cases.

Case e=val v0\src{e}=\src{val\ v_0}:

Have Comment Want
n\src{n}
 : 
Number\text{Number}
v,v0\src{v,v_0}
 : 
S value\src{S}\text{ value}
HexecH_{\src{exec}}
 : 
val v0 nectx val v\src{\rtexpr{}{val\ v_0}\nstepsarrow{}{n}\rtexpr{}{val\ v}}
n,[];  val v0eST nectx  vvST,[];[]\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{val\ v_0}}_e\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}
HexecH_{\src{exec}}'
 : 
v0=v\src{v_0}=\src{v}
’invert’  Hexec\text{'invert' $\ H_{\src{exec}}$}
HexecH_{\src{exec}}''
 : 
n=0\src{n}=\src{0}
(unfold  eST)(\text{unfold $\ \comp{\src{S}}{\trg{T}}{\src{\bullet}}_e$})
n,[];  push vvST nectx  vvST,[];[]\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ push\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}
let  n=1\text{let $\ \trg{n'}=\trg{1}$}
[];  push vvST ectx  vvST,[];[]\trg{[\cdot];\emptyset\ \triangleright\ push\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\ectxsteparrow{}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}
(by  T-e-push)(\text{by $\ \trg{T}\text{-e-push}$})

Case e=e1e2\src{e}=\src{e_1\oplus e_2}:

Have Comment Want
IH1\text{IH}_1
 : 
nIH vIH,if e1 nIHectx val vIH, then nIH,[];e1eST nIHectx vIHvST,[];[]\forall \src{n_{IH}}\ \src{v_{IH}}, \text{if }\src{\rtexpr{}{e_1}\nstepsarrow{}{n_{IH}}\rtexpr{}{val\ v_{IH}}}\text{, then }\exists \trg{n_{IH}},\trg{\rtexpr{[\cdot];\emptyset}{}}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{\nstepsarrow{}{n_{IH}}}\comp{\src{S}}{\trg{T}}{\src{v_{IH}}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}
IH2\text{IH}_2
 : 
nIH vIH,if e2 nIHectx val vIH, then nIH,[];e2eST nIHectx vIHvST,[];[]\forall \src{n_{IH}}\ \src{v_{IH}}, \text{if }\src{\rtexpr{}{e_2}\nstepsarrow{}{n_{IH}}\rtexpr{}{val\ v_{IH}}}\text{, then }\exists \trg{n_{IH}},\trg{\rtexpr{[\cdot];\emptyset}{}}\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{\nstepsarrow{}{n_{IH}}}\comp{\src{S}}{\trg{T}}{\src{v_{IH}}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}
n\src{n}
 : 
Number\text{Number}
e1,e2\src{e_1,e_2}
 : 
S exprs\src{S}\text{ exprs}
v\src{v}
 : 
S value\src{S}\text{ value}
HexecH_{\src{exec}}
 : 
e1e2 nectx val v\src{\rtexpr{}{e_1\oplus e_2}\nstepsarrow{}{n}\rtexpr{}{val\ v}}
n,[];  e1e2eST nectx  vvST,[];[]\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_1\oplus e_2}}_e\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}
m1,m2\src{m_1,m_2}
 : 
S number/value\src{S}\text{ number/value}
’invert’  Hexec\text{'invert' $\ H_{\src{exec}}$}
n1,n2\src{n_1,n_2}
 : 
Number\text{Number}
Hexec1H_{\src{exec_1}}
 : 
e1 n1ectx val m1\src{\rtexpr{}{e_1}\nstepsarrow{}{n_1}\rtexpr{}{val\ m_1}}
Hexec2H_{\src{exec_2}}
 : 
e2 n2ectx val m2\src{\rtexpr{}{e_2}\nstepsarrow{}{n_2}\rtexpr{}{val\ m_2}}
HexecH_{\src{exec}}'
 : 
n=1+n1+n2\src{n}=1+\src{n_1}+\src{n_2}
HexecH_{\src{exec}}''
 : 
v=eval_binop(;m1;m2)\src{v}=\text{eval\_binop}(\src{\oplus};\src{m_1};\src{m_2})
(unfold eST)(\text{unfold$\ \comp{\src{S}}{\trg{T}}{\src{\bullet}}_e$})
n,[];  e2eST,e1eST,bop ,[] nectx  vvST,[];[]\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,[\cdot]}\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}
 

Now, as in the base-case, we would like to partially execute the runtime program e2eST,e1eST,bop ,[]\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{, bop\ \oplus,[\cdot]}. But, given we don’t know much about e2\src{e_2}, the only hope is to make use of IH2\text{IH}_2 . For this, we specialize IH2\text{IH}_2 with n2\src{n_2} , m2\src{m_2} , and Hexec2H_{\src{exec_2}} to obtain IH2\text{IH}_2':

 
n2\trg{n_2'}
 : 
Number\text{Number}
IH2\text{IH}_2'
 : 
[];e2eST n2ectx m2vST,[];[]\trg{\rtexpr{[\cdot];\emptyset}{}}\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{\nstepsarrow{}{n_{2}'}}\comp{\src{S}}{\trg{T}}{\src{m_{2}}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}
n,[];  e2eST,e1eST,bop ,[] nectx  vvST,[];[]\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,[\cdot]}\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}
 

Let’s try and let n=n2+42\trg{n'}=\trg{n_2'}+\trg{42}. By “transitivity” of  ectx n0+m0\trg{\stepsarrow{}{n_0+m_0}} (for any n0,m0\trg{n_0,m_0}), the goal can be split into two parts (given a suitable stack Ψ\trg{\Psi}):

  1. [];  e2eST,e1eST,bop ,[] n2ectx  Ψ;  e1eST,bop ,[]\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,[\cdot]}\trg{\nstepsarrow{}{n_2'}\ }\trg{\Psi;\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,[\cdot]}

  2. Ψ;  e1eST,bop ,[] 42ectx  vvST,[];[]\trg{\Psi;\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,[\cdot]}\trg{\nstepsarrow{}{42}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}

The interesting failure happens in the first goal . Notice how IH2\text{IH}_2' only tells us that a runtime program that has a list of commands e2eST,[]\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,[\cdot]} goes to []\trg{[\cdot]}. Unfortunately, in the case of the goal , however, we want to show that a runtime program with e2eST\comp{\src{S}}{\trg{T}}{\src{e_2}}_e and some remaining list of commands, namely e1eST,bop ,[]\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,[\cdot]}, goes to another runtime program with that same remaining list of commands. The proof is stuck.

The proof attempt goes by induction and suffers from the fact that the statement of the lemma is too specific! (lists of commands need to have exactly this shape…) A way to fix this is to just generalize the lemma.

Lem. Forward Simulation (2nd try)

If e nectx v,then n,[];  eeST,c nectx  vvST,[];c\begin{array}{l}\text{If }\src{\rtexpr{}{e}\nstepsarrow{}{n}\rtexpr{}{v}},\\\text{then }\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e}}_e\trg{,\overline{c}}\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{\overline{c}}}\end{array}

Proof.

Induction on e\src{e}. This section continues where the attempted proof earlier failed.

Case e=e1e2\src{e}=\src{e_1\oplus e_2}:

Have Comment Want
IH1\text{IH}_1
 : 
cIH nIH vIH,if e1 nIHectx val vIH, then nIH,[];e1eST,cIH nIHectx vIHvST,[];cIH,[]\forall \trg{\overline{c_{IH}}}\ \src{n_{IH}}\ \src{v_{IH}}, \text{if }\src{\rtexpr{}{e_1}\nstepsarrow{}{n_{IH}}\rtexpr{}{val\ v_{IH}}}\text{, then }\exists \trg{n_{IH}},\trg{\rtexpr{[\cdot];\emptyset}{}}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,\overline{c_{IH}}}\trg{\nstepsarrow{}{n_{IH}}}\comp{\src{S}}{\trg{T}}{\src{v_{IH}}}_v\trg{\rtexpr{,[\cdot];\emptyset}{\overline{c_{IH}},[\cdot]}}
IH2\text{IH}_2
 : 
cIH nIH vIH,if e2 nIHectx val vIH, then nIH,[];e2eST,cIH nIHectx vIHvST,[];cIH,[]\forall \trg{\overline{c_{IH}}}\ \src{n_{IH}}\ \src{v_{IH}}, \text{if }\src{\rtexpr{}{e_2}\nstepsarrow{}{n_{IH}}\rtexpr{}{val\ v_{IH}}}\text{, then }\exists \trg{n_{IH}},\trg{\rtexpr{[\cdot];\emptyset}{}}\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,\overline{c_{IH}}}\trg{\nstepsarrow{}{n_{IH}}}\comp{\src{S}}{\trg{T}}{\src{v_{IH}}}_v\trg{\rtexpr{,[\cdot];\emptyset}{\overline{c_{IH}},[\cdot]}}
 

Both IH1\text{IH}_1 and IH2\text{IH}_2 are more general this time.

 
n\src{n}
 : 
Number\text{Number}
e1,e2\src{e_1,e_2}
 : 
S exprs\src{S}\text{ exprs}
v\src{v}
 : 
S value\src{S}\text{ value}
HexecH_{\src{exec}}
 : 
e1e2 nectx val v\src{\rtexpr{}{e_1\oplus e_2}\nstepsarrow{}{n}\rtexpr{}{val\ v}}
n,[];  e1e2eST,c,[] nectx  vvST,[];c,[]\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_1\oplus e_2}}_e\trg{,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{\overline{c},[\cdot]}}
m1,m2\src{m_1,m_2}
 : 
S number/value\src{S}\text{ number/value}
’invert’  Hexec\text{'invert' $\ H_{\src{exec}}$}
n1,n2\src{n_1,n_2}
 : 
Number\text{Number}
Hexec1H_{\src{exec_1}}
 : 
e1 n1ectx val m1\src{\rtexpr{}{e_1}\nstepsarrow{}{n_1}\rtexpr{}{val\ m_1}}
Hexec2H_{\src{exec_2}}
 : 
e2 n2ectx val m2\src{\rtexpr{}{e_2}\nstepsarrow{}{n_2}\rtexpr{}{val\ m_2}}
HexecH_{\src{exec}}'
 : 
n=1+n1+n2\src{n}=1+\src{n_1}+\src{n_2}
HexecH_{\src{exec}}''
 : 
v=eval_binop(;m1;m2)\src{v}=\text{eval\_binop}(\src{\oplus};\src{m_1};\src{m_2})
(unfold eST)(\text{unfold$\ \comp{\src{S}}{\trg{T}}{\src{\bullet}}_e$})
n,[];  e2eST,e1eST,bop ,c,[] nectx  vvST,[];c,[]\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{\overline{c},[\cdot]}}
 

As earlier, we specialize IH2\text{IH}_2 , but this time with e1eST,bop ,c,[]\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{, bop\ \oplus,\overline{c},[\cdot]}, n2\src{n_2} , m2\src{m_2} , and Hexec2H_{\src{exec_2}} to obtain IH2\text{IH}_2'. Specialize IH1\text{IH}_1' in similar fashion but with list of commands bop ,[]\trg{bop\ \oplus,[\cdot]} and obtain:

 
n1,n2\trg{n_1',n_2'}
 : 
Number\text{Number}
IH2\text{IH}_2'
 : 
[];e2eST,e1eST,bop ,c,[] n2ectx m2vST,[];,e1eST,bop ,c,[]\trg{\rtexpr{[\cdot];\emptyset}{}}\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{, bop\ \oplus,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n_{2}'}}\comp{\src{S}}{\trg{T}}{\src{m_{2}}}_v\trg{\rtexpr{,[\cdot];\emptyset}{,}}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{, bop\ \oplus,\overline{c},[\cdot]}
IH1\text{IH}_1'
 : 
[];e1eST,bop ,c,[] n1ectx m1vST,[];bop ,c,[]\trg{\rtexpr{[\cdot];\emptyset}{}}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{, bop\ \oplus,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n_{1}'}}\comp{\src{S}}{\trg{T}}{\src{m_{1}}}_v\trg{\rtexpr{,[\cdot];\emptyset}{bop\ \oplus,\overline{c},[\cdot]}}
n,[];  e2eST,e1eST,bop ,c,[] nectx  vvST,[];c,[]\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{\overline{c},[\cdot]}}
 

The proof continues by setting n=n2+(n1+1)\trg{n'}=\trg{n_2'+(n_1'+1)} obtaining the goal [];  e2eST,e1eST,bop ,[] n2+(n1+1)ectx  vvST,[];[]\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,[\cdot]}\trg{\nstepsarrow{}{n_2'+(n_1'+1)}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}. By “transitivity” of  n0+m0ectx \trg{\nstepsarrow{}{n_0+m_0}} (for any n0,m0\trg{n_0,m_0}), the goal can be split (as done in the other attempt, this time setting the stack to m2vST,[]\comp{\src{S}}{\trg{T}}{\src{m_2}}_v\trg{,[\cdot]}):

  1. [];  e2eST,e1eST,bop ,c,[] n2ectx  m2vST,[];  e1eST,bop ,c,[]\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n_2'}\ }\comp{\src{S}}{\trg{T}}{\src{m_2}}_v\trg{,[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,\overline{c},[\cdot]}

  2. m2vST,[];  e1eST,bop ,c,[] n1+1ectx  vvST,[];c,[]\comp{\src{S}}{\trg{T}}{\src{m_2}}_v\trg{,[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n_1'+1}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{\overline{c},[\cdot]}}

The first goal has exactly the shape of IH2\text{IH}_2' , so this follows by assumption. In order to prove the other goal , split again by “transitivity” (using stack m2vST,m1ST,[]\comp{\src{S}}{\trg{T}}{\src{m_2}}_v\trg{,}\comp{\src{S}}{\trg{T}}{\src{m_1}}\trg{,[\cdot]}):

  1. m2vST,[];  e1eST,bop ,c,[] n1ectx  m2vST,m1ST,[];bop ,c,[]\comp{\src{S}}{\trg{T}}{\src{m_2}}_v\trg{,[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n_1'}\ }\comp{\src{S}}{\trg{T}}{\src{m_2}}_v\trg{,}\comp{\src{S}}{\trg{T}}{\src{m_1}}\trg{\rtexpr{,[\cdot];\emptyset}{bop\ \oplus,\overline{c},[\cdot]}}

  2. m2vST,m1ST,[];  bop ,c,[] ectx  vvST,[];c,[]\comp{\src{S}}{\trg{T}}{\src{m_2}}_v\trg{,}\comp{\src{S}}{\trg{T}}{\src{m_1}}\trg{,[\cdot];\emptyset\ \triangleright\ bop\ \oplus,\overline{c},[\cdot]}\trg{\ectxsteparrow{}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{\overline{c},[\cdot]}}

Second goal goes by T-e-bop\trg{T}\text{-e-bop} and HexecH_{\src{exec}}'' . For the other goal , we would really like to apply IH1\text{IH}_1' . But, notice how the induction hypothesis requires an empty stack []\trg{[\cdot]}, while the current state of the stack for this goal is m22ST,[]\comp{\src{S}}{\trg{T}}{\src{m_2}}_2\trg{,[\cdot]}. So, the proof is again stuck, because the induction hypothesis is not general enough.

Let’s try again, this time also generalizing the whole statement on stacks:

Lem. Forward Simulation (3rd try)

If e nectx v,then n,Ψ;  eeST,c nectx  vvST,Ψ;c\begin{array}{l}\text{If }\src{\rtexpr{}{e}\nstepsarrow{}{n}\rtexpr{}{v}},\\\text{then }\exists\trg{n'},\trg{\Psi;\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e}}_e\trg{,\overline{c}}\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,\Psi;\emptyset}{\overline{c}}}\end{array}

Proof.

Induction on e\src{e}.

Case e=e1e2\src{e}=\src{e_1\oplus e_2}:

Have Comment Want
IH1\text{IH}_1
 : 
cIH ΨIH nIH vIH,if e1 nIHectx val vIH, then nIH,ΨIH;e1eST,cIH nIHectx vIHvST,ΨIH;cIH,[]\forall \trg{\overline{c_{IH}}}\ \trg{\Psi_{IH}}\ \src{n_{IH}}\ \src{v_{IH}}, \text{if }\src{\rtexpr{}{e_1}\nstepsarrow{}{n_{IH}}\rtexpr{}{val\ v_{IH}}}\text{, then }\exists \trg{n_{IH}},\trg{\rtexpr{\Psi_{IH};\emptyset}{}}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,\overline{c_{IH}}}\trg{\nstepsarrow{}{n_{IH}}}\comp{\src{S}}{\trg{T}}{\src{v_{IH}}}_v\trg{\rtexpr{,\Psi_{IH};\emptyset}{\overline{c_{IH}},[\cdot]}}
IH2\text{IH}_2
 : 
cIH ΨIH nIH vIH,if e2 nIHectx val vIH, then nIH,ΨIH;e2eST,cIH nIHectx vIHvST,ΨIH;cIH,[]\forall \trg{\overline{c_{IH}}}\ \trg{\Psi_{IH}}\ \src{n_{IH}}\ \src{v_{IH}}, \text{if }\src{\rtexpr{}{e_2}\nstepsarrow{}{n_{IH}}\rtexpr{}{val\ v_{IH}}}\text{, then }\exists \trg{n_{IH}},\trg{\rtexpr{\Psi_{IH};\emptyset}{}}\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,\overline{c_{IH}}}\trg{\nstepsarrow{}{n_{IH}}}\comp{\src{S}}{\trg{T}}{\src{v_{IH}}}_v\trg{\rtexpr{,\Psi_{IH};\emptyset}{\overline{c_{IH}},[\cdot]}}
 

IH1\text{IH}_1 and IH2\text{IH}_2 are, again, more general.

 
n\src{n}
 : 
Number\text{Number}
e1,e2\src{e_1,e_2}
 : 
S exprs\src{S}\text{ exprs}
v\src{v}
 : 
S value\src{S}\text{ value}
HexecH_{\src{exec}}
 : 
e1e2 nectx val v\src{\rtexpr{}{e_1\oplus e_2}\nstepsarrow{}{n}\rtexpr{}{val\ v}}
n,Ψ;  e1e2eST,c,[] nectx  vvST,Ψ;c,[]\exists\trg{n'},\trg{\Psi;\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_1\oplus e_2}}_e\trg{,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,\Psi;\emptyset}{\overline{c},[\cdot]}}
m1,m2\src{m_1,m_2}
 : 
S number/value\src{S}\text{ number/value}
’invert’  Hexec\text{'invert' $\ H_{\src{exec}}$}
n1,n2\src{n_1,n_2}
 : 
Number\text{Number}
Hexec1H_{\src{exec_1}}
 : 
e1 n1ectx val m1\src{\rtexpr{}{e_1}\nstepsarrow{}{n_1}\rtexpr{}{val\ m_1}}
Hexec2H_{\src{exec_2}}
 : 
e2 n2ectx val m2\src{\rtexpr{}{e_2}\nstepsarrow{}{n_2}\rtexpr{}{val\ m_2}}
HexecH_{\src{exec}}'
 : 
n=1+n1+n2\src{n}=1+\src{n_1}+\src{n_2}
HexecH_{\src{exec}}''
 : 
v=eval_binop(;m1;m2)\src{v}=\text{eval\_binop}(\src{\oplus};\src{m_1};\src{m_2})
(unfold eST)(\text{unfold$\ \comp{\src{S}}{\trg{T}}{\src{\bullet}}_e$})
n,[];  e2eST,e1eST,bop ,c,[] nectx  vvST,[];c,[]\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{\overline{c},[\cdot]}}
 

The specialization of IH2\text{IH}_2 is as earlier, but with the stack []\trg{[\cdot]}.

As for IH1\text{IH}_1' , the specialization uses the stack m2vST,[]\comp{\src{S}}{\trg{T}}{\src{m_2}}_v\trg{,[\cdot]}:

 
n1,n2\trg{n_1',n_2'}
 : 
Number\text{Number}
IH2\text{IH}_2'
 : 
[];e2eST,e1eST,bop ,c,[] n2ectx m2vST,[];,e1eST,bop ,c,[]\trg{\rtexpr{[\cdot];\emptyset}{}}\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{, bop\ \oplus,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n_{2}'}}\comp{\src{S}}{\trg{T}}{\src{m_{2}}}_v\trg{\rtexpr{,[\cdot];\emptyset}{,}}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{, bop\ \oplus,\overline{c},[\cdot]}
IH1\text{IH}_1'
 : 
m2vST,[];e1eST,bop ,c,[] n1ectx m1vST,m2vST,[];bop ,c,[]\comp{\src{S}}{\trg{T}}{\src{m_2}}_v\trg{\rtexpr{,[\cdot];\emptyset}{}}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{, bop\ \oplus,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n_{1}'}}\comp{\src{S}}{\trg{T}}{\src{m_{1}}}_v\trg{,}\comp{\src{S}}{\trg{T}}{\src{m_2}}_v\trg{\rtexpr{,[\cdot];\emptyset}{bop\ \oplus,\overline{c},[\cdot]}}
n,[];  e2eST,e1eST,bop ,c,[] nectx  vvST,[];c,[]\exists\trg{n'},\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e_2}}_e\trg{,}\comp{\src{S}}{\trg{T}}{\src{e_1}}_e\trg{,bop\ \oplus,\overline{c},[\cdot]}\trg{\nstepsarrow{}{n'}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{\overline{c},[\cdot]}}
 

The rest now follows as expected. It’s exactly the same as in the previous proof attempt, but this time, the assumptions apply, making the proof go through.

This statement is finally general enough so that we can prove it with an induction. A few issues are left from a “logical” perspective: First, the lemma statement is hardcoded to use an empty heap \trg{\emptyset}, so it is not applicable for cases where we want a non-empty heap. The solution to this is to reformulate the lemma, generalizing on heaps. The proof stays exactly the same up to instantiating the heap in the induction hypotheses. Second, the lemma statement is also hardcoded to executions that do not emit any events whatsoever, i.e., non-crashing executions. Likewise, the final runtime state should be generalized to be either  v\src{\triangleright\ v} (as before) or :zap:\src{\triangleright} \text{:zap:} (crash). To do so, it is necessary to define a compiler on runtime states r\src{r} giving runtime programs r\trg{r}.

These technical complications are not interesting enough to warrant inclusion in this blog-post, however, it should be at least mentioned that the given forward simulation statement is not the most general/usable one. For example, it can (and probably should) be further generalized by accounting for arbitrary substitutions into e\src{e}. The interested reader is invited to take a look at the mechanization, where these details are being taken care of.

Backward Simulation

While the intuition for backward simulation is “the compiled version does steps, so does the source”, we need to constrain programs to safe programs. This is a reasonable constraint, however, since practical compiler implementations include, e.g., a typechecker that ensures that only safe programs are compiled. As long as the compiler for values is injective and the execution of T\trg{T} deterministic (normalforms are unique), backward simulation is implied by forward simulation:

Lem. Backward Simulation

If Γe:τand [];  eeST,[] nectx  vvST,[];[],then n,e nectx v\begin{array}{l}\text{If }\src{\Gamma\vdash e:\tau}\\\text{and }\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e}}_e\trg{,[\cdot]}\trg{\nstepsarrow{}{n}\ }\comp{\src{S}}{\trg{T}}{\src{v}}_v\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}},\\\text{then }\exists\src{n'},\src{\rtexpr{}{e}\nstepsarrow{}{n'}\rtexpr{}{v}}\end{array}

Proof.

Have Comment Want
HSchecksH_{\src{S}-\text{checks}}
 : 
Γe:τ\src{\Gamma}\vdash\src{e}:\src{\tau}
HTexecH_{\trg{T}-exec}
 : 
[];  eeST nectx vST,[];[]\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e}}_e\trg{\nstepsarrow{}{n}}\comp{\src{S}}{\trg{T}}{\src{v}}\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}
n,e nectx v\exists\src{n'},\src{\rtexpr{}{e}\nstepsarrow{}{n'}\rtexpr{}{v}}
(apply ’typesafety’ on HSchecks)(\text{apply 'typesafety' on $H_{\src{S}-\text{checks}}$})
only showing the case with value as result\text{only showing the case with value as result}
v\src{v'}
 : 
S-value\text{$\src{S}$-value}
HSexecH_{\src{S}-exec}
 : 
e nectx v\src{\rtexpr{}{e}\nstepsarrow{}{n}\rtexpr{}{v'}}
n\trg{n'}
 : 
Number\text{Number}
(apply forward sim. on HSexec)(\text{apply forward sim. on $H_{\src{S}-exec}$})
HTexecH_{\trg{T}-exec}'
 : 
[];  eeST nectx vST,[];[]\trg{[\cdot];\emptyset\ \triangleright\ }\comp{\src{S}}{\trg{T}}{\src{e}}_e\trg{\nstepsarrow{}{n'}}\comp{\src{S}}{\trg{T}}{\src{v'}}\trg{\rtexpr{,[\cdot];\emptyset}{[\cdot]}}
(by determinism on HTexec and HTexec)(\text{by determinism on $H_{\trg{T}-exec}$ and $H_{\trg{T}-exec}'$})
Heq1H_{eq1}
 : 
n=n\trg{n}=\trg{n'}
Heq2H_{eq2}
 : 
vvST=vvST\comp{\src{S}}{\trg{T}}{\src{v}}_v=\comp{\src{S}}{\trg{T}}{\src{v'}}_v
apply ’value compiler is injective’ on Heq2\text{apply 'value compiler is injective' on $H_{eq2}$}
Heq3H_{eq3}
 : 
v=v\src{v}=\src{v'}
let  n=n\text{let $\ \src{n'}=\src{n}$}
e nectx v\src{\rtexpr{}{e}\nstepsarrow{}{n}\rtexpr{}{v}}
goal is solved by HSexec\text{goal is solved by $H_{\src{S}-exec}$}

The other case left out (where the well-typed program yields a crash) will eventually contradict with the fact that it is assumed that it does not crash in the target. i.e., apply a general form of forward simulation that also accounts for crashing programs, obtaining that the program crashes in the target, contradicting (due to deterministic semantics of T\trg{T}) the assumed fact that it does produce a value in the target.

Backtranslation of Traces

The goal is to complete the earlier proofsketch of RTP . For this, it is necessary to build a source-level context C\src{C} that generates the same trace as the given target-level execution. Given the simple setup, there are only four possible cases:

  1. Nothing crashes.
  2. After returning from the component, the context fails.
  3. The component fails.
  4. The context fails before calling the component.

With the information on the trace, it is relatively easy to define the backtranslation, generating an S\src{S}-level context from the trace:

(StartCall ? n0Ret ! n1End v)=(y;v(n0);v(v))(StartCall ? n0Ret ! n1:zap:)=(y;v(n0);abort())(StartCall ? n0:zap:)=(y;v(n0);42)(Start:zap:)=(y;abort();42)(StartCall ? b:zap:)=(y;abort();42)\begin{aligned} \uparrow \left( \trg{Start} \cdot \trg{Call\ ?\ n_0} \cdot \trg{Ret\ !\ n_1} \cdot \trg{End\ v} \right) & = & \src{(y;}\uparrow_v(\trg{n_0})\src{;}\uparrow_v(\trg{v})\src{)} \\ \uparrow \left( \trg{Start} \cdot \trg{Call\ ?\ n_0} \cdot \trg{Ret\ !\ n_1} \cdot \trg{:zap:} \right) & = & \src{(y;}\uparrow_v(\trg{n_0})\src{; abort())} \\ \uparrow \left( \trg{Start} \cdot \trg{Call\ ?\ n_0} \cdot \trg{:zap:} \right) & = & \src{(y;}\uparrow_v(\trg{n_0})\src{; 42)} \\ \uparrow \left( \trg{Start} \cdot \trg{:zap:} \right) & = & \src{(y; abort(); 42)} \\ \uparrow \left( \trg{Start} \cdot \trg{Call\ ?\ b} \cdot \trg{:zap:} \right) & = & \src{(y;abort(); 42)} \\ \end{aligned}
(** The backtranslation generates an S level context from a trace. *)
Definition bt (As : tracepref) : option S.LinkageContext :=
match As with
| Sstart ::
  Scrash :: nil => Some(S.LContext "y" (S.Xabort) S.Xabort)
| Sstart ::
  Scall Qctxtocomp (inl v__pre) ::
  Scrash :: nil => Some(S.LContext "y" (S.Xval(S.Vnat v__pre)) (S.Xval(S.Vnat 5)))
| Sstart ::
  Scall Qctxtocomp (inl v__pre) ::
  Sret Qcomptoctx _ ::
  Scrash :: nil => Some(S.LContext "y"
                        ((S.Xval (S.Vnat v__pre)))
                        S.Xabort)
| Sstart ::
  Scall Qctxtocomp (inl v__pre) ::
  Sret Qcomptoctx _ ::
  Send v__end :: nil => Some(S.LContext "y" (S.Xval(S.Vnat v__pre)) (S.Xval(S.vs v__end)))
| Sstart ::
  Scall Qctxtocomp (inr v__pre) ::
  _ => Some(S.LContext "y" (S.Xabort) (S.Xval(S.Vnat 5)))
| _ => None
end%string
.

v\uparrow_v is just the inverse of the value compiler . Now, suppose we have the component (x;var x+1)\src{(x;var\ x+1)}. Because of the way the static typing of S\src{S} is setup, it is impossible to call it with a boolean value. As established earlier, the compiler inserts wrapper code to ensure this static property dynamically, since T\trg{T} does not have any static typing, but we still want to forbid certain T\trg{T}-level contexts to call our component. The compiled version of the component is (x;quote [abort(),[]],get x,has N,negb,if,get x,push 1,bop +,[])\trg{(x;quote\ [abort(),[\cdot]],get\ x,has\ \mathbb{N},negb,if,get\ x,push\ 1,bop\ +,[\cdot])}. Consider linking and running it with the context (y;true;get y)\trg{(y;true;get\ y)}. The resulting trace is StartCall ? true:zap:\trg{Start\cdot Call\ ?\ true\cdot :zap:}. Now, we “want” a source-level context that gives “the same” trace in order to prove robust preservation . But, as (re-)established just now, language S\src{S} is statically typed and ensures type-safety that way, so we cannot construct a context that exhibits the same behavior without violating well-typing, i.e., the program resulting from linking the component and the context generated by the backtranslation is not well-typed. The backtranslation tries to detect these ill-typed contexts and generates an S\src{S}-level context that crashes immediately, i.e., (y;abort();42)\src{(y;abort();42)}. Unfortunately, linking and running this gives a different trace, since the component is now not called at all: Start:zap:\src{Start\cdot :zap:}.

So, we’ve got a situation where we have a target execution giving trace StartCall ? true:zap:\trg{Start\cdot Call\ ?\ true\cdot :zap:} and a source execution giving trace Start:zap:\src{Start\cdot :zap:}. We want these traces to be considered equal, because in both of them, the component-level code does not run at all, either the wrapper or the context resulting from the backtranslation will halt program execution.

Moreover, this situation gives rise to another technicality. Consider the component (x;abort())\src{(x;abort())} and compile, link, and run it with context (y;push 1;get y)\trg{(y;push\ 1;get\ y)}. Clearly, the resulting trace is StartCall ? 1:zap:\trg{Start\cdot Call\ ?\ 1\cdot :zap:}. But, this time, the backtranslation generates context (y;val 1;var y)\src{(y;val\ 1;var\ y)}, the resulting S\src{S}-level program is obviously well-typed. In the previously described situtation, we also obtain a trace where the component crashes, but here, code of the component is actually being run!

Anyhow, we proceed to define what it means for traces to be considered equal:

v0=v(v0)v1=v(v1)v2=v(v2)\begin{array}{ ccc }\src{v_0}=\uparrow_v(\trg{v_0}) & \src{v_1}=\uparrow_v(\trg{v_1}) & \src{v_2}=\uparrow_v(\trg{v_2})\end{array}
StartCall ? v0Ret ! v1End v2StartCall ? v0Ret ! v1End v2\trg{Start} \cdot \trg{Call\ ?\ v_0} \cdot \trg{Ret\ !\ v_1} \cdot \trg{End\ v_2} \approx \src{Start} \cdot \src{Call\ ?\ v_0} \cdot \src{Ret\ !\ v_1} \cdot \src{End\ v_2}
(trace-eq-ok)\text{\footnotesize(trace-eq-ok)}
v0=v(v0)v1=v(v1)\begin{array}{ cc }\src{v_0}=\uparrow_v(\trg{v_0}) & \src{v_1}=\uparrow_v(\trg{v_1})\end{array}
StartCall ? v0Ret ! v1StartCall ? v0Ret ! v1\trg{Start} \cdot \trg{Call\ ?\ v_0} \cdot \trg{Ret\ !\ v_1} \cdot \trg{⚡} \approx \src{Start} \cdot \src{Call\ ?\ v_0} \cdot \src{Ret\ !\ v_1} \cdot \src{⚡}
(trace-eq-fail0)\text{\footnotesize(trace-eq-fail0)}
v0=v(v0)\begin{array}{ c }\src{v_0}=\uparrow_v(\trg{v_0})\end{array}
StartCall ? v0StartCall ? v0\trg{Start} \cdot \trg{Call\ ?\ v_0} \cdot \trg{⚡} \approx \src{Start} \cdot \src{Call\ ?\ v_0} \cdot \src{⚡}
(trace-eq-fail1)\text{\footnotesize(trace-eq-fail1)}
 
StartStart\trg{Start} \cdot \trg{⚡} \approx \src{Start} \cdot \src{⚡}
(trace-eq-fail2)\text{\footnotesize(trace-eq-fail2)}
 
StartCall ? vStart\trg{Start} \cdot \trg{Call\ ?\ v} \cdot \trg{⚡} \approx \src{Start} \cdot \src{⚡}
(trace-eq-fail3)\text{\footnotesize(trace-eq-fail3)}
(** We need a different kind of trace equality than strict equality.
  The backtranslated context may fail early if the component fails, e.g.,
  a context tries to pass a `bool` and the compiler wrapper fails.
  This, in turn, would mean that the backtranslation cannot generated a well-typed S context,
  so the only option is to abort. However, the abort then happens prior to
  calling into the component, so there is mismatch in the generated traces
  that we "need" to account for by using a different kind of "equality".
*)
Inductive trace_eq : tracepref -> tracepref -> Prop :=
| empty_eq : trace_eq nil nil 
| nocrash_eq : forall (v0 v1 v2 : nat + bool),
  trace_eq (Sstart :: Scall Qctxtocomp v0 :: Sret Qcomptoctx v1 :: Send v2 :: nil)
           (Sstart :: Scall Qctxtocomp v0 :: Sret Qcomptoctx v1 :: Send v2 :: nil) 
| crashctx1_eq :
  trace_eq (Sstart :: Scrash :: nil)
           (Sstart :: Scrash :: nil) 
| crashctx2_eq : forall (v : nat + bool),
  trace_eq (Sstart :: Scall Qctxtocomp v :: Scrash :: nil)
           (Sstart :: Scrash :: nil) 
| crashcomp_eq : forall (v : nat + bool),
  trace_eq (Sstart :: Scall Qctxtocomp v :: Scrash :: nil)
           (Sstart :: Scall Qctxtocomp v :: Scrash :: nil) 
| crashctx3_eq : forall (v0 v1 : nat + bool),
  trace_eq (Sstart :: Scall Qctxtocomp v0 :: Sret Qcomptoctx v1 :: Scrash :: nil)
           (Sstart :: Scall Qctxtocomp v0 :: Sret Qcomptoctx v1 :: Scrash :: nil) 
.
#[local]
Infix "≂" := (trace_eq) (at level 81).
Local Hint Constructors trace_eq : core.

For technical reasons concerned with this blog-post, we’ll need an axiom about this equality:

Def. Trace Equality Preserves

If aa, then aπaπ \text{If }\trg{\trace}\approx\src{\trace},\text{ then }\trg{\trace}\in\pi\leftrightarrow\src{\trace}\in\pi

Extending Robust Trace Preservation with Trace Relation

Recall the definition of RTP and the first few proof steps we did. As of now, the definition does not account for custom trace equalities. That’s why the definition needs an extension to account for this, as done in existing work ( Citation: Abate et.al. 2021 nov Carmine Abate and Roberto Blanco and Ştefan Ciobâcă and Adrien Durier and Deepak Garg and Cătălin Hriţcu and Marco Patrignani and Éric Tanter and Jérémy Thibault. 2021 nov . An Extended Account of Trace-Relating Compiler Correctness and Secure Compilation ) .

For any property  π \ \pi\ and  S\ \src{S}-level program  p \ \src{p}\ , if  C a r,prog C p a  r    aπ\ \forall\src{C}\ \src{\trace}\ \src{r}, \src{prog\ C\ p}\progarrow{\src{\trace}}{}\ \src{r} \implies \src{\trace}\in\pi, then  C a r, prog C γ(p) a r    aa and aπ\ \forall\trg{C}\ \trg{\trace}\ \trg{r},\ \trg{prog\ C\ }\gamma(\src{p})\progarrow{\trg{\trace}}\trg{r}\implies \src{\trace}\approx\trg{\trace}\text{ and }\trg{\trace}\in\pi.

(** We define a "weakened" version of RSC that has a different notion of trace equality. *)
Definition rscwt (cc : compiler__ST) :=
forall (p : S.symbol) (ctx__T : T.LinkageContext) (As : tracepref),
  forall r, T.progstep (T.plug' ctx__T (cc p)) As r ->
  exists r' (ctx__S : S.LinkageContext) As',
    trace_eq As As' /\
    S.progstep (S.plug' ctx__S p) As' r'
.
Notation "'[' '|-' cc ':' 'rscwt' ']'" := (rscwt cc) (at level 81, cc at next level).

The given definition unfolds robust preservation , because it needs to relate the traces hidden “inside it”, albeit this statement is to be taken with a grain of salt, since the qantifiers need to change. In the original definition of robust preservation, traces where only quantified inside the robust satisfaction parts, while now at least the scope of the source-level trace leaks into the target-level robust satisfaction in order to relate the source with the target trace. Ultimately, this is the definition that we will prove soon.

Back to Backtranslation

While we have seen the definition for the backtranslation , we do not - yet - know wether it is correct or not. Intuitively, a target-level execution with whatever context and some compiled component, we want to identify a source-level context (this is done with the backtranslation \uparrow) which successfully links with the component and yields a trace upon execution that relates to the given one with our custom trace equality \approx.

Lem. Backtranslation Correctness

If prog C pST a r, thenC r a,prog C p a r and aa\begin{array}{l}\text{If }\trg{prog\ C\ }\comp{\src{S}}{\trg{T}}{\src{p}}\progarrow{\trg{\trace}}\trg{r},\text{ then}\exists \src{C}\ \src{r}\ \src{\trace}, \src{prog\ C\ p}\progarrow{\src{\trace}}\src{r} \text{ and } \trg{\trace}\approx\src{\trace}\end{array}

Proof Sketch.

First, perform a case-analysis on the compiler output, which either yields the compiled component code or a singleton abort\trg{abort} commandlist. We focus in the following on the first. Here, there are four cases, i.e., (1) all ok, (2) context fails after component is done, (3) component fails, (4) or context fails. The failing cases are all somewhat similar, so we chose to sketch only case (1) and case (3).

1st Case. Note that we have assumed an execution that yields a trace of shape StartCall ? v0Ret ! v1End v2\trg{Start\cdot Call\ ?\ v_0\cdot Ret\ !\ v_1\cdot End\ v_2}. Instantiate existential with result of backtranslation and trace StartCall ? v(v0)Ret ! v(v1) End v(v2)\src{Start\cdot Call\ ?\ }\uparrow_v(\trg{v_0})\src{\cdot Ret\ !\ }\uparrow_v(\trg{v_1})\src{\cdot\ End\ }\uparrow_v(\trg{v_2}). Furthermore, split the goal into the execution and trace equality part, since it’s a conjunction.

Have Comment Want
......
 : 
......
equality part\text{equality part}
(by definition)(\text{by definition})
StartCall ? v0Ret ! v1End v2StartCall ? v(v0)Ret ! v(v1) End v(v2)\trg{Start\cdot Call\ ?\ v_0\cdot Ret\ !\ v_1\cdot End\ v_2}\approx\src{Start\cdot Call\ ?\ }\uparrow_v(\trg{v_0})\src{\cdot Ret\ !\ }\uparrow_v(\trg{v_1})\src{\cdot\ End\ }\uparrow_v(\trg{v_2})
Have Comment Want
execution part\text{execution part}
p\src{p}
 : 
S-level Component\text{$\src{S}$-level Component}
HchecksH_{\src{checks}}
 : 
x:Np:τ\src{x:\nat\vdash p:\tau}
compiler does not fail, so it must be well-typed\text{compiler does not fail, so it must be well-typed}
v0,v1,v2\trg{v_0},\trg{v_1},\trg{v_2}
 : 
T-level Values\text{$\trg{T}$-level Values}
C\trg{C}
 : 
T-level Context\text{$\trg{T}$-level Context}
r\trg{r}
 : 
T-level Runtime Program\text{$\trg{T}$-level Runtime Program}
HexecH_{\trg{exec}}
 : 
prog C pST StartCall ? v0Ret ! v1End v2 r\trg{prog\ C\ }\comp{\src{S}}{\trg{T}}{\src{p}}\progarrow{\trg{Start}\cdot\trg{Call\ ?\ v_0}\cdot\trg{Ret\ !\ v_1}\cdot\trg{End\ v_2}}\trg{r}
prog (y;v(v0);v(v2)) p StartCall ? v(v0)Ret ! v(v1) End v(v2) rrST\src{prog\ (y;}\uparrow_v(\trg{v_0})\src{;}\uparrow_v(\trg{v_2})\src{)\ p}\progarrow{\src{Start\cdot Call\ ?\ }\uparrow_v(\trg{v_0})\src{\cdot Ret\ !\ }\uparrow_v(\trg{v_1})\src{\cdot\ End\ }\uparrow_v(\trg{v_2})}\comp{\src{S}}{\trg{T}}{\trg{r}}_r
invert  Hexec, i.e., compute the execution up to the call of the component\text{invert $\ H_{\trg{exec}}$, i.e., compute the execution up to the call of the component}
nn
 : 
Number\text{Number}
HwrapH_{wrap}
 : 
Ψ;xv0wrap(x),pST,[] nectx v1,Ψ;xv0[]\trg{\rtexpr{\Psi;x\mapsto v_0}{}}\operatorname{wrap}(\trg{x})\trg{,}\comp{\src{S}}{\trg{T}}{\src{p}},\trg{[\cdot]}\nstepsarrow{}{n}\trg{\rtexpr{v_1,\Psi;x\mapsto v_0}{[\cdot]}}
further computation requires case analysis on wether wrap(x)  fails or not\text{further computation requires case analysis on wether $\operatorname{wrap}(\trg{x})\ $ fails or not}
 
wrapper fails\text{wrapper fails}
computation contradicts shape of the trace, which does not fail\text{computation contradicts shape of the trace, which does not fail}
 
HpH_{\trg{p}}
 : 
Ψ;xv0pST,[] n5ectx v1,Ψ;xv0[]\trg{\rtexpr{\Psi;x\mapsto v_0}{}}\comp{\src{S}}{\trg{T}}{\src{p}},\trg{[\cdot]}\nstepsarrow{}{n-5}\trg{\rtexpr{v_1,\Psi;x\mapsto v_0}{[\cdot]}}
wrapper not failing\text{wrapper not failing}
back-sim. on  Hp  and  Hchecks\text{back-sim. on $\ H_{\trg{p}}\ $ and $\ H_{\src{checks}}$}
nn'
 : 
Number\text{Number}
HexecH_{\src{exec}}
 : 
p[subst x for v(v0)] nectx (v1)\src{\rtexpr{}{p}}[\text{subst}\ \src{x}\ \text{for}\ \uparrow_v(\trg{v_0})]\nstepsarrow{}{n'}\src{\rtexpr{}{}}\uparrow(\trg{v_1})
prog (y;v(v0);v(v2)) p StartCall ? v(v0)Ret ! v(v1) End v(v2) rrST\src{prog\ (y;}\uparrow_v(\trg{v_0})\src{;}\uparrow_v(\trg{v_2})\src{)\ p}\progarrow{\src{Start\cdot Call\ ?\ }\uparrow_v(\trg{v_0})\src{\cdot Ret\ !\ }\uparrow_v(\trg{v_1})\src{\cdot\ End\ }\uparrow_v(\trg{v_2})}\comp{\src{S}}{\trg{T}}{\trg{r}}_r
compute goal up to component call\text{compute goal up to component call}
n0,p[subst x for v(v0)] n0ectx (v1)\exists n'_0,\src{\rtexpr{}{p}}[\text{subst}\ \src{x}\ \text{for}\ \uparrow_v(\trg{v_0})]\nstepsarrow{}{n'_0}\src{\rtexpr{}{}}\uparrow(\trg{v_1})
by  Hexec\text{by $\ H_{\src{exec}}$}
the context executions are trivial\text{the context executions are trivial}

3rd Case. Run the program right up to the wrapper. Then case-analysis on the wether the wrapper fails or not.

tedious exercise. see the details in the mechanization.

Final Proof

We conclude the post with the final proof.

Lem. RTP

For any property π and S-level program p, if C a r,prog C p a  r    aπ, then C a r,prog C γ(p) a r    aa and aπ.\begin{array}{l}\text{For any property } \pi \text{ and } \src{S}\text{-level program } \src{p}, \text{ if } \forall\src{C}\ \src{\trace}\ \src{r}, \src{prog\ C\ p}\progarrow{\src{\trace}}{}\ \src{r} \implies \src{\trace}\in\pi,\\\text{ then } \forall\trg{C}\ \trg{\trace}\ \trg{r}, \trg{prog\ C\ }\gamma(\src{p})\progarrow{\trg{\trace}}\trg{r}\implies \trg{\trace}\approx\src{\trace}\text{ and }\trg{\trace}\in\pi.\end{array}

Have Comment Want
π\pi
 : 
Property\text{Property}
p\src{p}
 : 
S-level component\src{S}\text{-level component}
H1H_1
 : 
aTrace,rRuntime Expressions,CS-level program,if prog C p a r,then aπ\begin{array}{l}\forall\src{\trace}\in\text{Trace},\\\forall\src{r}\in\text{Runtime Expressions},\\\forall\src{C}\in\src{S}\text{-level program},\\\text{if }\src{prog\ C\ p}\progarrow{\src{\trace}}\src{r},\\\text{then }\src{\trace}\in\pi\end{array}
CT-level program,aTrace,rRuntime Program,if prog C pST a r,then aa and aπ\begin{array}{l}\forall\trg{C}\in\trg{T}\text{-level program},\\\forall\trg{\trace}\in\text{Trace},\\\forall\trg{r}\in\text{Runtime Program},\\\text{if }\trg{prog\ C\ }\comp{\src{S}}{\trg{T}}{\src{p}}\progarrow{\trg{\trace}}\trg{r},\\\text{then }\trg{\trace}\approx\src{\trace}\text{ and }\trg{\trace}\in\pi\end{array}
a\trg{\trace}
 : 
Trace\text{Trace}
assume from goal\text{assume from goal}
C\trg{C}
 : 
T-level context\trg{T}\text{-level context}
HT-execH_{\trg{T}\text{-exec}}
 : 
prog C pST a r\trg{prog\ C\ }\comp{\src{S}}{\trg{T}}{\src{p}}\progarrow{\trg{\trace}}\trg{r}
aaaπ\trg{\trace}\approx\src{\trace}\wedge\trg{\trace}\in\pi
use backtranslation correctness on HT-exec\text{use backtranslation correctness on $H_{\trg{T}\text{-exec}}$}
C\src{C}
 : 
S-level context\text{$\src{S}$-level context}
r\src{r}
 : 
Runtime program\text{Runtime program}
a\src{\trace}
 : 
Trace\text{Trace}
HexecH_{\src{exec}}
 : 
prog C p a r\src{prog\ C\ p}\progarrow{\src{\trace}}\src{r}
HeqH_{eq}
 : 
aa\trg{\trace}\approx\src{\trace}
follows from HeqH1 and Hexec\text{follows from $H_{eq}$, $H_1$ and $H_{\src{exec}}$, }
(relying on the trace equality axiom)(\text{relying on the trace equality axiom})

As a last note, RTP variants usually also have a property-free characterization which can be more elegant to do proofs with. Essentially, you almost always just have to construct the source context and then argue that there exists some kind of source-level execution that behaves similar to the given target-level execution. One does not need the trace equality axiom in the property-free version. Also note that it may be possible to lower this axiom to a lemma whenever one is interested in a concrete property and wants to prove the compiler secure for that concrete property alone. I refrained from presenting it here for “pedagogical reasons” (the post is already long enough).

You can find the Coq development containing all the details here. Thanks for reading and feel free to ping me on socials (preferrably ) for questions! :sweat_smile: