Note
This is the full implementation of the CSE machine in source (call it “final CSE machine” to differentiate from the theoretical CSE machine)
Symbol vs name: symbol is an object that refers to the value, name is the literal string
How is evaluate different
function evaluate(expr) {
let C = list(expr); // this is the control
let S = null; // this is the stash
let E = the_global_environment; // this is the environment
while (!is_null(C)) {
const command = head(C); // take the topmost control command
C = tail(C); // remove it from the control
// then, process the command
if (is_literal(command)) {
S = pair(literal_value(command), S);
} else if () {} // handle all other commands
else { error ("unknown command"); }
}
return head(S);
}- How we process the command is based on the CSE Machine.
- The control list stores expressions, as well as instructions, such as
pop(which are also tagged lists)
Evaluating names
// evaluate ...
else if (is_name(command)) {
S = pair(lookup_symbol_value(symbol_of_name(command), E),
S);
}- The name is looked up in the current environment.
- If not found, we keep looking through the enclosing environments until reaching the outermost environment
- Note: If the name is not found, we get the error “unbound name [name]”
Handling Sequences
// evaluate ...
else if (is_sequence(command)) {
const body = sequence_statements(command);
C = is_null(body)
? pair(make_literal(undefined), C)
: is_null(tail(body))
? pair(head(body), C)
: pair(head(body),
pair(make_pop_instr(),
pair(make_sequence(tail(body)), C)));
}- If there are no items in the sequence, add “undefined” to the control (this is how empty blocks cause undefined to be on the Stash)
- If there is one item in the sequence, just add it to the control (this means the sequence is not empty, therefore does not need an “undefined”)
- If there is more than one item in the sequence
- Add the rest of the sequence back to the control (as a sequence)
- Add a
popinstruction to the control - Add the first item in the body to the control
- Note: this means that unlike playground, where the whole sequence is decomposed at once, final CSE machine handles the sequence one item at a time
Handling Blocks
// evaluate ...
else if (is_block(command)) {
const scan_out_declarations(block_body(command));
const unassigneds = list_of_unassigned(locals);
// ^ same as before
C = pair(block_body(command),
pair(make_env_instr(E), C));
// ^ this is adding an env instruction,
// which points to the current environtment to the control,
// then adds the body of the block (which is a sequence)
// to the control
E = extend_environment(locals, unassigneds, E);
// ^ this extendns the environment with a new frame (if needed)
// all the values in the new frame will be unassigned
}Handling Declarations
// evaluate ...
else if (is_declaration(command)) {
C = pair(make_assignment(
decl_symbol(command),
decl_value_expr(command)),
C);
}- A declaration is treated as an assignment
- This is because all the declarations would have been scanned out earlier when evaluating the current block, so we just need to assign a value to the symbol
Handling Assignment
x = y + 7; // < example
// evaluate ...
else if (is_assignment(command)) {
C = pair(assignment_value_expression(command),
pair(make_assign_instr(assignment_symbol(command)),
C));
}- This adds the
asgninstruction to the control, soasgn x - Then it adds the assignment value expression to the control, so
y + 7 - Note: final CSE machine does not check if a variable is a
constorletbefore assigning. The only time it checks forletis when evaluating a for loopfor(let i <- here - Note: If the name is not found, we get the error “unbound name — assignment [name]”
Functions
- There is a difference between simple functions and compound functions, in whether the
markinstruction is created, andundefinedreturn value.- A function is simple in the final CSE machine if
- it is a lambda with no block, or
- there is only one expression in the function body, and it is
return
- If the function is a compound function:
markinstruction will be pushed onto control, to let us know where to skip to after the function returns- a return
undefinedstatement will be added to the end of the function body sequence, so that we know to pushundefinedonto the stash if the function does not return anything
- A function is simple in the final CSE machine if
- The function application/call process is the same as in CSE Machine
callinstruction: it removes the required number of arguments from the stash, and stores them in a listargs, which is then passed toextend_environment(function_parameters(fun), args, function_environment(fun));Handlingreturninstruction:
// evaluate ...
else if (is_return_instr(command)) {
if (is_marker(head(C))) {
C = tail(C); // just remove the marker
} else {
C = pair(command, tail(C));
// ^ remove the top item from the control,
// then add the return instruction back
}Other things to note
Proper tail calls
This is to make tail calls truly iterative (prevent env instructions from building up)
Just need to know: if there is already an env instruction as the latest item on the control, do not push a new env instruction
Value-producing statements
Declarations (const and let), as well as blocks ({ 1 + 1; }) are the ONLY non value-producing statements. Everything else (loops, assignments etc.) are value-producing statements.
This means: if we encounter a non value-producing statement, we need to add a pop instruction after it
Loop details
If the body of a loop is not run at all, it produces undefined. e.g. (while(false) {1;})) produces undefined.
If the body of a loop runs, then the loop will produce the value that the body sequence produces