Stanford Invariant Generator

 Latest

 07/06/2013 Version that compiles under PPL 1.0 posted. 05/03/2006 Compiles under PPL 0.9 (latest version). Thanks to Prof. Bagnara for his help. 01/31/2005 Source code released under GPL. 09/21/2004 Version 1.1 released. Numerous bug fixes both in my code and from the PPL side. New release uses ppl-0.6.1 08/19/2004 Binaries Released for version 1.0 just in time for SAS 2004.

 Introduction

StInG computes linear invariants for linear transition systems, automatically. Automatic Loop-Invariant Generation is an old problem that has been studied since the 70s. Consider the following program:

 integer x,y; (x,y) := (0,0) while (...) do if (...) then x:= x+ 4 else ( x,y ) := (x+ 2, y+1) end-if end-while

The branch conditions denoted by (...) are treated as non-deterministic choices. The loop-invariant for the main while loop is x >= 2 and y >= 0 and x -2y >= 2 . The first two are easy to justify. The third is slightly non-trivial to see. StInG takes in a description of such programs in the form of transition systems [MP95], and computes loop invariants. Transition Systems also correspond to control-flow graphs. In this sense, our analysis is intra-procedural since we do not handle function calls directly. There are numerous applications for this sort of invariant discovery in the traditional domains of program analysis and more recently in the non-traditional domain of reasoning about biological system models, and analyzing process models.

StInG is primarily a research tool, and is constantly evolving. It implements three techniques for the invariant generation problem. Two of them belong to the more traditional Propagation-based techniques for invariant generation. These are contrasted against our own approach to the problem that we call the Constraint-based technique. The tool has been built only as a platform to test ideas. A new release will contain at least two more algorithms for invariant generation and compilers/translators for many more benchmark examples.

StInG computes invariants by constraint solving . In effect it reads in the transition system description and compiles them into constraints. Any solution to these constraints is an invariant. StInG has two main components, (1) Compiles linear transition system into Parametric-Linear Constraints, and (2) a heuristic constraint-solver that uses a lazy Constraint-Logic Programming like technique to solve non-linear constraints of a very special structure.

Compilation:

The compilation into constraints is achieved by using a very standard mathematical technique called dualization. A linear invariant for the program looks like this template below:

 c1 x + c2 y + d >= 0

the idea is to specify constraints on the unknown c1,c2 and ds so that any solution to the constraints when plugged-in to the template will yield an invariant. The details are available in our SAS [SSM04] and our CAV [CSS03] papers.

Solver:

The constraint solver core in StInG has three components to it. A linear equality solver, a linear inequality solver which uses the polyhedral library Parma Polyhedral Library [PPL], and an incomplete but sound disequality solver. The solver then constantly rewrites, uses the store to cast the constraints in a canonical form, and splits cases, until it is able to solve. It also implements subsumption rules (refer to our paper[SSM04] for details). These rules let us decide when to abandon a particular branch in the solution search.

Implements Cousot-Halbwachs

Most invariant generation algorithms before StInG are based on Abstract Interpretation([ CC77]. Specifically the Cousot-Halbwachs algorithm ([CH79]) works by propagating polyhedra. I will try to present a rough idea of how it works. The algorithm maintains an estimate of all the states that a program can reach. A state for a program with n real variables is a point in the n-dimensional space. An estimate is a polyhedron in this space. The initial estimate is that no state is reachable. We then form a current estimate of what is reachable. If this is not a polyhedron, it is forced to be by computing the convex-polyhedral hull. So every estimate is forced to be a polyhedron this way. To the ith estimate, we add those states that can be reached in one more step by an operation called post-condition. So we try to close the polyhedral estimates under post. But this does not terminate, and so we are forced to take a guess. For example consider the following sequence of estimates on x,

 0 <= x <= 10 0 <= x <= 20 0 <= x <= 40 .....

We could either chase it ad-infinitum or smartly widen it up to 0 <= x . Widening takes a sequence of estimates and guesses where things are heading. It does this by dropping sides from the polyhedron that were not also sides of the previous estimate. Widening works well in practice, because,
• It keeps the polyhedra simple, computing on complicated 10-dimensional polyhedra with a 1000 sides is much much harder than that on a 100 sided polyhedron.
• It makes sure that the algorithm terminates with some invariants that can be used.
But at times, widening becomes too weak to be of use. There are numerous documented examples. Plus, no one is reasonably sure where, when and how to apply widening. Strangely enough, it is only now in 2004, 25 years after the algorithm first appeared that people are studying this issue in detail. A much more sophisticated widening operator recently appeared[BHRZ03]. It has been implemented in the Parma Polyhedral Library, and we shall refer to it as the BHRZ03 widening after its developers. Similarly the original widening will be called the CH79 widening. We shall use these as comparisons for our method.

 Results

Results were obtained on benchmark examples which came mainly in two categories based on where they came from:

• Those designed and typed out by the author of this document,
• Examples (with citations) from the related work, especially from the FAST developers.
and also based on how complex they are, we split them into low-dimensional examples, and higher dimensional examples. The table below shows only the timings and comparisons. More detailed results are reported here [ SSM04]. The timings may vary a lot when run on the latest release because of the many bugs that were unearthed since the paper appeared in print. Finally, we have dropped all support for strict inequalities thus achieving good overall speedups (Thanks to Prof. Bagnara for pointing this out to us).

Low-Dimensional Exampes

Results on Low Dimensional Examples
Name SSM04 CH79 BHRZ03
Name Dimension Time (secs) Invariant Time +- Time +-
See-Saw 2 0.03 see-saw.inv 0 + 0 +
Robot-HH96 3 0.02 robot.inv 0.01 = 0.01 =
Train-HPR97 3 0.86 Train-HPR97 0.02 +++= 0.02 +++=
Berkeley 4 0.06 Berkeley 0.01 <> 0.01 <>
Berkeley-nat 4 0.04 Berkeley-nat 0.01 + 0.01 +
Heapsort 5 0.1 Heapsort 0.02 <> 0.02 <>
Train-RM03 6 1.16 Train-RM03 0.06 + 0.07 =
EFM 6 0.36 EFM 0.0 - 0.0 -
EFM1 6 0.32 EFM1 0.0 - 0.01 -
LIFO 7 0.88 LIFO 0.29 <> 0.32 <>
LIFO-NAT 7 10.13 LIFO-NAT 0.27 + 0.32 +
Cars-Midpt 7 0.1 Cars-Midpt 32.8 + > 3600 +
barber 8 1.68 barber 0.18 + 20.41 +
Swim-pool 9 0.42 Swim-pool 0.08 - 0.61 -
Swim-pool-1 9 0.88 Swim-pool-1 0.07 = 0.59 =

Higher Dimensional Examples (Scheduler:)

Results on Scheduler Examples
Num. Processes SSM04 CH79 BHRZ03
# Dim Time (secs) Invariant Time +- Time +-
2 7 0.54 scheduler-2p.inv 0.15 <> - 0.19 <> -
3 10 8.21 scheduler-3p.inv 2232 + > 3600 <>
4 13 284 scheduler-4p.inv >3600 + > 3600 +
5 16 > 3600 scheduler-5p.inv >3600 ? > 3600 ?

Higer Dimensional Examples (Convoy of Cars:)

Results on Cars Examples
Num. Processes SSM04 CH79 BHRZ03
# Dim Time (secs) Invariant Time +- Time +-
2 10 3.54 cars-2p.inv 4.23 + 443 <>
3 14 20.54 cars-3p.inv >3600 + > 3600 <>
4 18 1006 cars-4p.inv >3600 + > 3600 +

 Instructions

Installing and Running

The main binary is \$STING_HOME/bin/lsting. It was compiled with gcc on a i386_linux24 system. Before running, please either ensure that PPL/GMP are installed on your system, or set LD_LIBRARY_PATH to include \$STING_HOME/lib/GMP and \$STING_HOME/lib/PPL.

Please read the README file under the main directory for the disclaimer. Any use of this code in your research should be cited. If you use the examples, you will also need to cite the FAST project papers [FAST]. To run sting, type:

 % lsting < input.in > output.out
Please note that lsting accepts input from stdin and outputs to stdout. Therefore, redirects are required to read in and out of files. The usage information is shown below: Command line options:

Option (default) Description
one (off) Use a single solver instance for the entire LTS. This is efficient if your system has a single location.
many (on) Use many solver instances, one per location and use 0/1 instantiation for the inter-location transitions automatically.
noch79 (off) Do not do ch79.
nobhrz03 (off) Do not do bhrz03.
invcheck (off) Check invariants obtained for consecution. If this ever fails then please notify us immediately!!! Running invcheck can significantly slow down the process for larger examples.

Input file description:

This description is a temporary measure. We hope to adopt one of the standard descriptions from which we can draw benchmark examples.

The input file starts with a header describing the variables and how many steps of propagation to be used (optional). For example, the header

 variable [b1 s d] propsteps(3)

declares three variables b1,s and d, along with a directive asking the generator to perform 3 steps of initial propagation for the CH79 and BHRZ03 methods.

This is followed by a description of the system itself. The system description consists of a description of location , transitions and invariants . These can appear in any order but we recommend describing all the locations first, the transitions next followed by the invariants.

The location description is of the form

 Location loc_id optional_initial_conditions

Here is an example with initial conditions

 Location late1 b1 -s +10 =0 d=0 b1 >=0 s>=0

Here is one without, the initial condition is taken to be false , i.e, control cannot start from such a location,

 Location late1

Locations without initial conditions need not be declared. Their presence will be inferred when a transition originating from/pointing to that location is encountered.

Transition Descriptors have the following form

 Transition trs_id : preloc_id , optional_postloc_id transition_relation preserve[ set_of_preserved_variables ]

Each transition id must be unique. We reference transitions internally based on their strings. The prelocation is compulsory, so is the comma following it. If the postlocation is the same as the prelocation, then it can be ommitted. The preserve command is optional. Here is an example

 Transition back_on_time_ob: onbrake,ontime, b1 - s <=0 'd=0 preserve[s,b1]

Note that 'd refers to the next state value of d, instead of the traditional d'.

The final component of a description is an invariant . An invariant is a forced condition on a location. In our semantics, a transition is taken from a state that satisfies the invariant of the pre-location. For the time being, we do not enforce the symmetric condition of invariant condition of the post-location to hold. Invariants provide a convenient way of adding something common in the guards of transitions. We can also iterate the invariant generation process by adding previously computed invariants back into the guards of the transitions.

 invariant loc_id : invariant_conditions

Here is an example

 invariant late1: b1 - s <= 0 b1 >=0 s>= 0

And finally do not forget to add end to the end of your description. Comment lines begin with #.

 References

 [SSM04] Sriram Sankaranarayanan, Henny Sipma and Zohar Manna, "Constraint-based Linear-Relations Analysis", in Static Analysis Symposium, 2004. [CSS03] Michael Colon, Sriram Sankaranarayanan, and Henny Sipma, "Linear Invariant Generation using Non-linear Constraint Solving", in Computer-aided Verification (CAV), 2003. [CC77] Patrick Cousot and Radhia Cousot, "Abstract Interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints". In the Fourth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL 1977). [CH78] Patrick Cousot and Nicolas Halbwachs. "Automatic discovery of linear restraints among variables of a program". In the Fifth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Program ming Languages, (POPL 1978). [MP95] Zohar Manna and Amir Pnueli, "The Temporal Verification of Reactive Systems: Safety", 1996. [BHRZ03] Roberto Bagnara, Patricia M. Hill. Elisa Ricci, and Enea Zaffanella, "Precise Widening Operators for Convex Polyhedra", in Static Analysis Symposium, 2003. [FAST] The FAST project page at http://www.lsv.ens-cachan.fr/fast. [PPL] The Parma Polyhedra Library project page at http://www.cs.unipr.it/ppl.

 Credits

The key algorithms behind the library were conceived jointly by Sriram Sankaranarayanan, Michael Colon, Henny Sipma and Zohar Manna.

The code was designed and implemented by Sriram Sankaranarayanan.

The project uses the PPL and the GMP libraries at its core for which we are extremely grateful. Discussions with the PPL developers after this release enabled us to use their library better, for which we are doubly thankful!

This project was partially funded by the NSF-EHS program (grant CCR-02-20134), by the NSF-ITR program (grant CCR-01-21403) and by the DARPA SEC program (ARPA/AF contract F33615-99-C-3014).

This page has been designed and is being maintained by Sriram Sankaranarayanan.