Project B

This page describes the second project for the verification course, which is concerned with developing verification methodologies. Read the full description carefully. If you need clarification on a task, contact the lecturer.

Deadline: November 30, 23:59

The project can be solved in many different ways. It is essential that you reasonably justify your design decisions.

A written report is not required. However, you must explain your approach and justify your design decisions for each task. We recommend putting your explanations in comments at the beginning of the task's Viper file.

Submission: push your solution to the GitLab repository of your homework group in a directory named project-b. The latest push before the deadline will be considered as your submission.

Files: Viper code skeleton

Tools: All tasks are to be completed using Viper. Your submission must work with the Viper and Z3 versions distributed with the current version of Viper IDE. Your solutions should work with at least one Viper backend (Silicon or Carbon; indicate which one works if it makes a difference).

Background

Throughout the course, we focused on program correctness, that is, how to specify what a program should do and how to verify that it does what has been specified. Apart from writing correct programs, we are also interested in writing efficient programs.

The goal of this project is thus to develop a verification methodology for reasoning about time complexity and to automate that methodology with Viper.

Time complexity

Upper bounds on the time complexity of algorithms are typically described using Big O notation, an abstraction that allows us to reason about complexity classes. For example, does a function's runtime grow linearly, quadratically, or exponentially with respect to the size of the function's input? By using Big O notation, we also abstract from the exact runtime costs of individual instructions (in terms of clock cycles or seconds).

For this project, we will use a simplified notion of time complexity. Instead of using Big O notation, we will assume an abstract time unit and use a runtime model in which every method call and every loop iteration consumes one abstract time unit; all other statements are considered to consume no time at all.

Time units

We considered Viper predicates as a mechanism for abstracting over heap-based data structures. In particular, predicate arguments enable abstracting over the content of a data structure. To gain access to the internals of a data structure, and to hide them again afterwards, we need to unfold and fold the predicate manually.

In this project, we will use predicates to represent an abstract, non-duplicable resource that does not refer to anything on the heap. Instead, a predicate instance will stand for one abstract unit of time, called a time credit for short.

predicate time_credit()

Notice that the predicate has neither arguments nor a body. Our runtime model also supplies an abstract method letting time pass by consuming a time credit:

method consume_time_credit()
    requires time_credit()

Whenever a statement consumes time, we invoke the above method to consume a time credit. Intuitively, the number of time credits available to our program thus represents the time that can pass during its execution. Verification will fail if we do not have enough time credits left.

For example, the method below does not verify. In the precondition, we claim that we can execute it with just two time credits, but we consume three time credits in its body:

method example()
    requires time_credit() && time_credit()
{
    consume_time_credit()
    consume_time_credit()
    consume_time_credit() // error: not enough time credits
}

Alternatively, we can express the above precondition using fractional permissions:

    requires acc(time_credit(), 2/1)

Note that, unlike for heap locations, we can hold more than one full permission to an abstract predicate. This notation also allows us to hold arbitrarily many time credits, for example, an amount depending on a method argument:

method example2(n: Int)
  requires acc(time_credit(), n/1)
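For example, the following method (a small sketch, not part of any skeleton file; the name example3 is ours) takes n credits, consumes two of them, and hands the remaining n - 2 credits back to its caller via the postcondition:

method example3(n: Int)
    requires n >= 2
    requires acc(time_credit(), n/1)
    ensures acc(time_credit(), (n - 2)/1)
{
    consume_time_credit()
    consume_time_credit()
}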

Runtime model

Throughout this project we consider a runtime model that adheres to the following two rules:

  1. Every (non-ghost) method implementation must start with a call to consume_time_credit(). Hence, every (executable) method consumes at least one time unit.

  2. Every loop body must start with a call to consume_time_credit(). Hence, every loop iteration consumes at least one time unit.

We will not create time credits out of thin air. Instead, the number of time credits available is determined by the precondition. If we manage to verify a program with these time credits, then the initial number of time credits is an upper bound on the program's execution time in our simple runtime model.

More precisely, provided that every operation we use correctly consumes its time credits, we show that the number of time credits given in the precondition is sufficient to execute the given method. We thus prove an upper bound on the method's execution time with respect to our runtime model.

Note that the precondition may allocate more time credits than actually required; the proven bound need not be tight.
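
To illustrate the two rules, consider the following sketch (the method name and the chosen bound are illustrative, not part of any skeleton file): a method whose loop runs n times needs n + 1 time credits, one for the call itself and one per loop iteration. The loop invariant keeps track of the credits that are still available:

method countdown(n: Int)
    requires n >= 0
    requires acc(time_credit(), (n + 1)/1)
{
    consume_time_credit() // rule 1: the method body starts by consuming one credit

    var i: Int := n
    while (i > 0)
        invariant 0 <= i
        invariant acc(time_credit(), i/1) // one credit per remaining iteration
    {
        consume_time_credit() // rule 2: every loop iteration consumes one credit
        i := i - 1
    }
}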

Amortized Analysis

Some data structure operations have a costly worst-case execution time but are still very efficient when considering a longer sequence of operations. For example, adding an element to a (dynamically allocated) vector or array is typically constant-time. However, its worst-case execution time may be linear: if we attempt to add an element to a full array, we first need to allocate a larger array and copy over all existing values. Such a worst-case scenario rarely happens since we can add many more elements before reaching the new array's capacity.

Hence, when considering sufficiently many add operations, the average time per method call becomes almost constant. The general technique for analyzing such average time consumption over a sequence of operations is called amortized analysis.

There are multiple approaches to amortized analysis, but the one we will consider is the accounting method (also known as the banker's method). The general idea is that a data structure can store time credits. Each operation is charged a cost that may exceed its actual running time, and the surplus is saved as credit in the data structure. Most calls to a method thus increase the data structure's credit balance, while the occasional expensive calls pay for their extra work by consuming the stored credit.
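
As a minimal sketch of this idea in Viper (the field and predicate names below are illustrative and not taken from any skeleton file), a predicate can hold time credits in its body, for example one credit per element it accounts for; folding the predicate deposits credits, and unfolding it makes them available for consumption again:

field count: Int

predicate counter(this: Ref) {
    acc(this.count) && 0 <= this.count &&
    acc(time_credit(), this.count/1) // one saved credit per counted element
}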

Task 1: Basic runtime analysis

To get familiar with reasoning about runtimes, we will analyze three methods that we have verified before. In each subtask listed below, you are given an existing method implemented in Viper with a correct functional specification. Find an upper bound on each method's runtime and specify it using time credits. Try to find the most precise upper bound possible!

  1. Analyze the iterative implementation of the Fibonacci numbers in skeleton/task1/fib_iter.vpr.
  2. Analyze the recursive implementation of the Fibonacci numbers in skeleton/task1/fib_rec.vpr.
  3. Analyze the fast exponentiation algorithm in skeleton/task1/fast_pow.vpr.

Hint: For the third subtask, you may want to prove a logarithmic bound. Instead of defining and using a logarithm function directly, it may be easier to use ghost variables to keep track of the logarithm of some variable. You may also need to introduce lemmas for proving properties of logarithms.
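
As an illustration of this technique (a sketch only, unrelated to the actual fast_pow code; the function pow2 and the method halve are placeholders we introduce here), a ghost argument g that bounds the logarithm of a variable can be decremented alongside each halving step to justify a logarithmic credit bound:

function pow2(k: Int): Int
    requires 0 <= k
    ensures result >= 1
{
    k == 0 ? 1 : 2 * pow2(k - 1)
}

method halve(n: Int, g: Int)
    requires 0 <= n && 0 <= g
    requires n < pow2(g) // g bounds the logarithm of n
    requires acc(time_credit(), (g + 1)/1)
{
    consume_time_credit()

    var x: Int := n
    var k: Int := g
    while (x > 0)
        invariant 0 <= k && x < pow2(k)
        invariant acc(time_credit(), k/1)
    {
        consume_time_credit()
        x := x / 2
        k := k - 1
    }
}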

Task 2: Binary search trees

In this task, we will consider binary trees and binary search trees (BSTs).

A binary tree is a heap-based data structure consisting of nodes containing a value, an optional left subtree, and an optional right subtree.

A binary search tree is a binary tree where all node values in the left subtree are strictly smaller than the current node’s value, and all node values in the right subtree are strictly greater than the current node’s value. Note that this definition also implies that there are no duplicates in the tree.
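
For reference, such a tree node is commonly modeled along the following lines (a sketch only; the field and predicate names are illustrative, and your definitions may differ):

field value: Int
field left: Ref
field right: Ref

predicate tree(this: Ref) {
    acc(this.value) && acc(this.left) && acc(this.right) &&
    (this.left != null ==> tree(this.left)) &&
    (this.right != null ==> tree(this.right))
}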

Modify the file skeleton/task2/trees.vpr when solving the following tasks:

  1. Define a binary tree predicate with nodes containing integer values.
  2. Define a BST predicate that ensures sortedness of the node values.
  3. Implement, specify, and verify a method bst_insert, which takes a value and a BST and inserts the value into the BST, such that the resulting tree remains a BST after the operation.
  4. Use the time credit model to prove that the runtime of bst_insert is at most logarithmic in our simple runtime model. To this end, you may want to prove that bst_insert terminates within h+1 time units, where h is the tree's height.

Hints:

  • The BST property can be specified without using quantifiers. Consider using abstraction to specify the minimum and maximum values in the given (sub)tree.
  • Remember that your implementation of bst_insert must be sufficiently efficient such that you can prove the runtime bound in part 4.

Task 3: Dynamic arrays

In this task, we will consider dynamic arrays, which are arrays that grow as values are inserted. A typical dynamic array has a capacity indicating the number of elements it can store before it needs to be resized. It also has a length indicating the number of elements currently stored. The length is always less than or equal to the capacity.

Modify the file skeleton/task3/dyn_array.vpr when solving the following tasks:

  1. Define a predicate for a dynamic array of integers using quantified permissions. Use the provided abstract function array_offset to obtain the heap locations of the array. Use abstraction to define the contents of the array as a sequence.
  2. Implement, specify, and verify a naive append method that never resizes the array; this method can thus only be called when there is space for another element.
  3. Implement, specify, and verify a method that resizes the array by doubling its capacity. To model reallocation, you can inhale the permissions for the new array, but you should copy the old contents using a loop.
  4. Implement, specify, and verify a full append method that calls either of the previous two methods depending on the state of the given dynamic array.
  5. Amortized time analysis: Using the time credits stored in the dynamic array predicate, prove that the full append method has an amortized constant runtime.
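
As a rough illustration of what an amortized specification could look like (the predicate name, the method signature, and the constant number of credits below are assumptions, not requirements), the full append method would be charged a fixed number of credits per call, and any surplus is stored inside the dynamic array predicate:

predicate dyn_array(this: Ref) // placeholder for your predicate from part 1, extended to store credits

method append(this: Ref, v: Int)
    requires dyn_array(this)
    requires acc(time_credit(), 3/1) // some fixed constant per call; the exact value depends on your design
    ensures dyn_array(this)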

Task 4: Circular buffers

In this task, we will consider circular buffers and linked lists. A circular buffer is a data structure that uses a fixed-size array to implement a FIFO (first in, first out) queue. One offset into the array, the head, marks the position where elements are read from; another offset, the tail, marks the position where elements are written to. A write operation writes data at the tail offset and increments the tail by one; a read operation reads data at the head offset and increments the head by one. The buffer is called circular because the offset increments use a modulo operation to wrap around the array's length.
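
For example, advancing an offset by one position with wraparound can be expressed as follows (a sketch only; the names are placeholders and are not taken from the skeleton):

method advance(offset: Int, cap: Int) returns (res: Int)
    requires 0 < cap && 0 <= offset && offset < cap
    ensures 0 <= res && res < cap
{
    res := (offset + 1) % cap
}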

Modify the file skeleton/task4/circular.vpr when solving the following tasks. The skeleton contains an abstract predicate for arrays and a small set of functions and methods for using arrays.

  1. Define a circular buffer predicate. Use abstraction to define the contents stored in the circular buffer.
  2. Implement, specify, and verify methods to read one element from a circular buffer and to write one element into a circular buffer.
  3. Define a predicate for a linked list of integers.
  4. Using linked lists, implement, specify, and verify a method to reverse the contents of a circular buffer.