Preamble
VIP: <to be assigned>
Title: Implement lexical block-level scoping in Viper
Author: @yzhang90 @daejunpark @pdaian @grosu
Type: Standard Track
Status: Draft
Created: 2017-12-1
Simple Summary
Viper should correctly implement block-level scoping semantics while fully disallowing shadowing to avoid misleading and error-prone code.
Abstract
The semantics of variable scoping is very important for a programming language because the scoping determines how a (variable) name binds to the actual object. The current Viper compiler mixes different scoping approaches and makes it very hard to reason about programs. We propose uniform use of lexical block-level scoping.
Motivation
The principles behind Viper are security, simplicity and auditability. However, the current scoping in Viper is not clear and can make the code misleading and error-prone.
In Viper, a local variable is accessible throughout the whole function after it is declared.
For example, in the following program, b is equal to 1 if choice is true and 0 otherwise. Even though a is declared and assigned inside if, it is still accessible after the if statement finishes:
def foo(choice: bool):
if (choice):
a = 1
b = a
However, index variables are treated differently from other local variables.
The scope of index variables is the for loop and the index variables are not accessible outside the loop. For example, Viper rejects the following program:
def foo():
for i in range(10):
pass
i += 1
What's more, an annotated assignment a:num, which normally also initializes a to 0, is only executed once in the loop and thus b equals 10 in the end:
def foo():
for i in range(10):
a:num
a += 1
b = a
With the current scoping rules, a program which declares a variable inside the if...then...else branch or the for loop is almost equivalent to declaring the variable before the if statement or the for loop. For instance,
def foo(choice: bool) -> num:
if choice:
a = 1
return a
else:
b = 2
return a + b
is equivalent to
def foo(choice: bool) -> num:
a: num
b: num
if choice:
a = 1
return a
else:
b = 2
return a + b
The difference is that for the second program, one can negate the choice variable and swap the then and else branch, but you could not do the same thing for the first program (because a is not defined before expression a + b!).
Situations like the above make it unnecessarily difficult to reason about programs and program equivalence.
In general, it is much more convenient to formally analyze/verify programs with block-scoped variables, because that allows for compositional reasoning without modifying the original program (e.g., without moving the variable declarations to the beginning of the method). Compositional verification is very important in practice, because it allows us to decompose large proof obligations into smaller proof tasks and even do those in parallel.
Specification
Throughout the specification, before/after means lexically appears before/after.
Since global variables have their own namespace self.<variable name>, we only specify the lexical block-level scoping rules for local variables inside a function declaration.
Variable declaration and initialization
A variable can be declared only with a statement of the form <varname>: <type> = <initial_value>, which also initializes it.
Moreover, a variable cannot be read or written unless it has been previously declared.
This is in sharp contrast with the current Viper compiler, which allows statements like a:num without explicit initialization to declare a variable and implicitly initialize it to zero, as well as statements like a = 1 without explicit declaration of a to initialize, implicitly declaring a as a variable of a type inferred by the compiler.
Function parameters are special: they are not initialized when declared but when the function is called.
Block scope
A block scope is created by a function declaration, a conditional if statement, or by a for loop statement.
-
function declaration
def foo(a:num): stmts introduces a block scope. The block scope is started with all the formal parameters (i.e, a in our case) declared and followed by stmts. The block scope ends when the function declaration finishes.
-
if statement:
There are two cases.
if(exp) then stmts introduces 1 block scope and all the statements in the then branch are executed inside this block scope.
if(exp) then stmts1 else stmts2 introduce 2 block scopes. stmts1 is executed in one block scope and stmts2 is executed in the other. We say that the two block scopes are parallel.
-
for loop: for i in exp: stmt introduces 2 nested blocks, with the outer block declaring the index variable i and the inner block holding the loop body stmt. Introducing 2 nested blocks is necessary because i should be visible inside the loop body and i should be alive throughout the loop. However, any variable declared inside the loop body is only alive in the current loop iteraton.
Rules
- Variables must be lexically declared (and initialized) before being used.
For example, the following program should be rejected
and so should be the following
because a is not declared.
The assignment's semantics is to only update an already declared variable, and not to
declare the variable as well in case it was not declared before.
- Variables must only be declared once inside a block.
For example, the following program should be rejected:
def foo():
a:num = 1
a:num = 2
But the following program is valid:
def foo():
a:num = 1
a = 2
- Variables declared in one block must be freed when the block ends.
For example, the following program is valid:
def foo(choice: bool):
if (choice):
a:num = 0
a += 1
else:
a:num = 1
a += 2
a should be 1 or 3 at the end of corresponding block depending on the value of choice, but none of the two declared a variables is visible outside of the conditional statement.
- Blocks can be nested (e.g., you can declare an
if inside a for loop).
Variables declared in the outer-blocks are visible to the inner-blocks, but not the other way around.
For example,
def foo(choice: bool):
a:num = 10
if (choice):
a += 1
a is 11 if choice is true.
However, the following program is not allowed because a is declared in the if block and invisible to the outer-block:
def foo(choice: bool):
if (choice):
a:num = 1
a += 1
- No shadowing: Shadowing is notoriously error-prone.
Following good practices in other languages, we propose that blocks cannot be used for variable shadowing.
For example, the following program should be rejected because the a inside the if statement shadows the a outside.
def foo(choice:bool):
a:num =0
if (choice):
a:bool = true
Shadowing is so evil and correctness is so paramount in Viper, that we encourage the language designers to consider an event stronger discipline than in other languages. We propose to even disallow programs like the following one:
def foo(choice:bool):
if (choice):
a:bool = true
a:num = 0
In java, similar programs are allowed, because, technically, the second declaration of a is totally orthogonal to the former, due to the block scoping of the former.
But Viper implementations, to save memory, may prefer to do variable hoisting to lift all declarations to the top of the block.
The program above does not allow hoisting the second declaration of a.
A simple all declared variables in any block are different policy will give the language both clarity and performance, at a price for developers that in our view is not much to pay. This change will also ease reasoning while auditing programs, reducing confusion between identically named variables in different scopes.
Backwards Compatibility
This change is backward compatible with some Viper programs. This includes all programs that, for instance, declare variables at the beginning of their functions, or do not use blocks (or declarations in blocks) in their functions. Some Viper programs will need to be minorly modified for compatibility with these new rules.
Copyright
Copyright and related rights waived via CC0
Preamble
Simple Summary
Viper should correctly implement block-level scoping semantics while fully disallowing shadowing to avoid misleading and error-prone code.
Abstract
The semantics of variable scoping is very important for a programming language because the scoping determines how a (variable) name binds to the actual object. The current Viper compiler mixes different scoping approaches and makes it very hard to reason about programs. We propose uniform use of lexical block-level scoping.
Motivation
The principles behind Viper are security, simplicity and auditability. However, the current scoping in Viper is not clear and can make the code misleading and error-prone.
In Viper, a local variable is accessible throughout the whole function after it is declared.
For example, in the following program,
bis equal to1ifchoiceistrueand0otherwise. Even thoughais declared and assigned insideif, it is still accessible after theifstatement finishes:However, index variables are treated differently from other local variables.
The scope of index variables is the
forloop and the index variables are not accessible outside the loop. For example, Viper rejects the following program:What's more, an annotated assignment
a:num, which normally also initializesato0, is only executed once in the loop and thusbequals10in the end:With the current scoping rules, a program which declares a variable inside the
if...then...elsebranch or theforloop is almost equivalent to declaring the variable before theifstatement or theforloop. For instance,is equivalent to
The difference is that for the second program, one can negate the
choicevariable and swap thethenandelsebranch, but you could not do the same thing for the first program (becauseais not defined before expressiona + b!).Situations like the above make it unnecessarily difficult to reason about programs and program equivalence.
In general, it is much more convenient to formally analyze/verify programs with block-scoped variables, because that allows for compositional reasoning without modifying the original program (e.g., without moving the variable declarations to the beginning of the method). Compositional verification is very important in practice, because it allows us to decompose large proof obligations into smaller proof tasks and even do those in parallel.
Specification
Since global variables have their own namespace
self.<variable name>, we only specify the lexical block-level scoping rules for local variables inside a function declaration.Variable declaration and initialization
A variable can be declared only with a statement of the form
<varname>: <type> = <initial_value>, which also initializes it.Moreover, a variable cannot be read or written unless it has been previously declared.
This is in sharp contrast with the current Viper compiler, which allows statements like
a:numwithout explicit initialization to declare a variable and implicitly initialize it to zero, as well as statements likea = 1without explicit declaration ofato initialize, implicitly declaringaas a variable of a type inferred by the compiler.Function parameters are special: they are not initialized when declared but when the function is called.
Block scope
A block scope is created by a function declaration, a conditional
ifstatement, or by aforloop statement.function declaration
def foo(a:num): stmtsintroduces a block scope. The block scope is started with all the formal parameters (i.e,ain our case) declared and followed bystmts. The block scope ends when the function declaration finishes.if statement:
There are two cases.
if(exp) then stmtsintroduces 1 block scope and all the statements in thethenbranch are executed inside this block scope.if(exp) then stmts1 else stmts2introduce 2 block scopes.stmts1is executed in one block scope andstmts2is executed in the other. We say that the two block scopes are parallel.for loop:
for i in exp: stmtintroduces 2 nested blocks, with the outer block declaring the index variableiand the inner block holding the loop bodystmt. Introducing 2 nested blocks is necessary becauseishould be visible inside the loop body andishould be alive throughout the loop. However, any variable declared inside the loop body is only alive in the current loop iteraton.Rules
For example, the following program should be rejected
and so should be the following
because
ais not declared.The assignment's semantics is to only update an already declared variable, and not to
declare the variable as well in case it was not declared before.
For example, the following program should be rejected:
But the following program is valid:
For example, the following program is valid:
ashould be 1 or 3 at the end of corresponding block depending on the value ofchoice, but none of the two declaredavariables is visible outside of the conditional statement.ifinside aforloop).Variables declared in the outer-blocks are visible to the inner-blocks, but not the other way around.
For example,
ais 11 ifchoiceis true.However, the following program is not allowed because
ais declared in theifblock and invisible to the outer-block:Following good practices in other languages, we propose that blocks cannot be used for variable shadowing.
For example, the following program should be rejected because the
ainside the if statement shadows theaoutside.Shadowing is so evil and correctness is so paramount in Viper, that we encourage the language designers to consider an event stronger discipline than in other languages. We propose to even disallow programs like the following one:
In java, similar programs are allowed, because, technically, the second declaration of
ais totally orthogonal to the former, due to the block scoping of the former.But Viper implementations, to save memory, may prefer to do variable hoisting to lift all declarations to the top of the block.
The program above does not allow hoisting the second declaration of
a.A simple all declared variables in any block are different policy will give the language both clarity and performance, at a price for developers that in our view is not much to pay. This change will also ease reasoning while auditing programs, reducing confusion between identically named variables in different scopes.
Backwards Compatibility
This change is backward compatible with some Viper programs. This includes all programs that, for instance, declare variables at the beginning of their functions, or do not use blocks (or declarations in blocks) in their functions. Some Viper programs will need to be minorly modified for compatibility with these new rules.
Copyright
Copyright and related rights waived via CC0