Vai al contenuto| Home page|

   Ti trovi in: HOME »Programmi, progetti e risultati »I progetti »PRIN - Programmi di ricerca di Rilevante Interesse Nazionale»Programma di ricerca»Unità di ricerca
INIZIO_TESTO_DA_INDICIZZARE

UNITA' DI RICERCA

italiano - english

Research program

AIDA - Abstract Interpretation Design and Applications
University Co-ordinator
Università degli Studi di PARMA - MATEMATICA - PARMA(PR)
Research Unit Leader
Roberto BAGNARA
Description
Most of the work of this Research Unit will be done in the context of Workpackage 1 (Static Analysis) of the project and will look at the specification, development and implementation of abstract domains and operators, including widenings and narrowings operators, that can be used to enforce or accelerate the convergence of the abstract computation. More specifically, the main objective of this research is the theoretical study and practical implementation of a complete and robust library of reusable numerical abstract domains: intervals, bounded differences and quotients, octagons, convex polyhedra, grids and Z-polyhedra.The domain of convex polyhedra has several operations that are characterized by an exponential worst-case complexity for both time and space resources. On the other hand, the interval domain, while being much simpler from a computational point of view, often yields approximations that are too coarse. An intermediate solution is given by the domains of bounded differences and octagons. Industrial users of static analysis tools often need the accuracy provided by the most precise abstract domains; however, they also need some form of control over time and space consumption, since approximate results within a given time frame are preferred to more precise results at an unknown date. We will thus pursue the possibility of dynamically combining polyhedra with these simpler constraint domains in order to solve the precision and scalability trade-off. The static analysis or verification procedure, by default, should consider the more descriptive (and costly) domains, such as convex polyhedra, yet specify maximal resource bounds (computation time, memory occupation, magnitude of the numerical coefficient involved) for each operation. When the computation of an operation exceeds the imposed space or time limits, a simpler, less accurate algorithm is used, which still provides correct results for an abstract interpretation, i.e., such that the resulting polyhedron is a superset of the exact result. The less accurate algorithm will perform a change of representation, for instance by approximating the polyhedra by a system of bounded differences and computing the required operation on that simpler domain. If the cost of computing with bounded differences still happens to be unaffordable, then the system should perform a further change of representation, e.g., scaling down to the domain of intervals, where most of the interesting operations have linear complexity. After the execution of the requested operation on the less precise domain, the result may eventually be converted back into, say, a convex polyhedron. The final objective is the realization of a robust library of operators on polyhedra able to deal with large sets of many hundreds (or, possibly, thousands) of constraints in high dimensional spaces.It is important to notice that such a scenario is feasible only if all the involved abstract domains have been designed and implemented so that they consistently react, without any loss of system resources, to events such as a timeout or an exception thrown by the memory allocation routines. The fact that we have achieved this in the ``Parma Polyhedra Library'' [BRZH02], which will be one of the key and more complex components in our library of numerical abstract domains, is thus essential.Other features that are important for all members of the library are: support for every operator needed by an abstract interpreter; support for interoperability, that is, the ability to build any kind of description starting from any other kind of description; support for space decomposition operators that can help speed the execution [HMPV03]; availability of a database of difficult non-regression tests for each operator.
IntervalsA good abstract domain for intervals is highly desirable not only as the fall-back domain, the most efficient (though not very precise) domain to be used when everything else has failed, but as a key ingredient for the realization of more complex domains such as bounded differences and quotients [Dav87, Bag97]. Unfortunately, no existing interval library is really adequate to constitute the basis of such an abstract domain. In fact, the available libraries either lack support for non-closed intervals (so that they are unable to represent constraints of the form `ax < b'), or they do not provide the right support for approximation in the sense of partial correctness (e.g., division by an interval containing zero gives rise to a runtime error instead of yielding an interval containing the result under the assumption that the concrete division approximated was not a division by zero), or they disregard rounding errors and are therefore unsafe. For instance, the filib++ library [LTW+01], which is very sophisticated and highly regarded in its own application field, suffers from the first two problems mentioned above.We will therefore study and implement a complete abstract domain based on intervals. Such intervals will be parametric on a number of features and will support open as well as closed boundaries. It will be possible to choose boundaries within one of several families of numbers, including native integers, unlimited precision integers, rationals, and native floating point types. When boundaries are floating point numbers, several methods of controlling the rounding directions will be provided with different characteristics concerning portability and efficiency. Independently from the type of the boundaries, both intervals of real numbers and intervals of integer numbers will be supported.
Bounded Differences and QuotientsA domain of bounded differences allows to express the relative values of two quantities by means of finite sets of constraints of the form `x-y in S', where `S' is a subset of the reals [Dav87]. The first proposal for the use in static analysis of this and other descriptions originated in the field of artificial intelligence is due to [BGL92, BGL93] and was further elaborated in [Bag97, Chapter 5]. With a suitable domain for (extended) intervals the implementation of the basic operations for bounded differences is relatively easy. However, as is the case for octagons (see below), for several important operations, including a proper widening, no efficient realization is known to date. Further theoretical work is thus required in order to obtain a fully fledged abstract domain based on bounded differences. For the widening operators, we plan to use the ideas presented in [BHRZ03, BHZ03b].Once one has fixed a base for logarithms, a system of bounded quotients captures relations of the form `log |x| - log |y| in S' [AK85, Bag97]. Bounded quotients are quite similar to bounded differences and several aspects of the implementations can be reused. However, much work remains to be done, both at the theoretical and at the practical levels, in order to turn this idea into an usable abstract domain. Notice that, differently from all the other domains we want to study in this project, the domain of bounded quotients involves non-linear constraints. The need, in some application of static analysis, to go beyond the expressive power of linear constraints has been recently highlighted in [BCC+03], where the definition and use of an abstract domain based on ellipsoids proved to be crucial for the success of the analysis. We conjecture that the domain of bounded quotients we propose will be quite powerful for analyses such as compile-time overflow detection.
Octagons and BeyondWhile the octagon abstract domain is characterized by a quadratic memory cost (in the number of variables), the cubic time cost that is advertised for them is currently only available for a fraction of the operations of interest. The limitations of the current definition (and, of course, of the available implementations) of the octagon domain include the following: the efficient addition of a new linear constraint is only available for constraints that are of the special form above; the computation of affine images and preimages has only been defined for a very limited class of affine transformation; an efficient definition of the time-elapse operation [HPR97], which is useful for the analysis of linear hybrid systems, is unavailable; no proper widening has been defined for octagons: only extrapolation operators that do not provide any convergence guarantee are available. In addition, no currently available implementation supports strict constraints, so that they do not support octagons that are not necessarily closed (NNC). In this project we aim at overcoming all these limitations of the octagon domain. In particular, we will research efficient algorithms to realize the operations mentioned above, using the techniques described in [BHRZ03, BHZ03b] to endow the domain with very precise, proper widening operators. In a first phase we will experiment with the octagon implementation based on difference-bound matrices proposed in [Min01]. We will then bring forward the proposal in [Bag97] and study the realization of octagons as constraint networks based on the class of intervals outlined above. This will immediately result in a class of not necessarily closed octagons and, more generally will allow the generalization to abstract domains that are more expressive than octagons yet characterized by an acceptable computational complexity.
Convex PolyhedraRegarding the domain of convex polyhedra, we plan to investigate specialized algorithms working on restricted classes of polyhedra that are commonly used in several application fields. An example is the class of non-negative polyhedra (for approximating those subsets of the vector space whose elements have all coordinates greater than or equal to zero). The ``Parma Polyhedra Library'' will also be extended with an efficient version of the simplex algorithm: this will, in turn, enable the addition of several new features such as the efficient computation of the envelope of two polyhedra as an approximation of their convex polyhedral hull.We will also target the adequate handling of NNC polyhedra, i.e., polyhedra that can be described by systems of constraints where strict inequalities are allowed to occur. Strict inequalities are important, for instance, in order to directly represent non-intersecting temporal regions, as it is often the case when modeling applications where concurrency, synchronization protocols, asynchronous interactions and temporal constraints come into play. Current implementations based on the double description method represent NNC polyhedra as closed polyhedra in a higher dimension vector space. While this encoding allows for a substantial fraction of code to be reused (almost) unchanged, some of the resulting operators suffer from unnecessary performance and/or precision losses. A precision loss may be incurred, for instance, when enforcing the convergence of the fixpoint computation by widening the closed representations of NNC polyhedra. This could be avoided by providing a widening operator specifically designed for NNC polyhedra. As another example, a performance penalty can be met after the computation of a description in minimal form for an NNC polyhedron. The currently available minimization procedures are not incremental so that, when minimizing the constraint (resp., generator) description, they invalidate the dual generator (resp., constraint) description. A possible solutions to this problem is the wider adoption of lazy computation techniques, which would allow to preserve both descriptions until, by looking at the following computation, it would be possible to make a more reasoned choice. Another possibility is the specification of a fully-incremental minimization procedure, allowing for both the addition and the removal of constraints/generators.
Grids and Z-PolyhedraMany applications of static analysis such as buffer overflow detection, verification of termination or automatic parallelization are concerned with discrete numerical values. Relevant candidates for these applications are those domains whose elements are lattices of vectors having integer coordinates [Anc91, BK97, CL98, DRS01, Gra91, LW97, Pug92]. However, for many applications these domains are inadequate as the values may not be integral or the lattice components may be sets of vectors rather than single vectors [Gra97, Mas93]. Thus we plan to develop a truly generic grids domain whose elements are lattices having components in other numerical abstract domains [Gra97, Mas93]. These grids will be provided with all the basic operations needed in a static analyzer.One important and useful analysis domain that uses grids is that of Z-polyhedra whose elements can be defined as the intersection of a polyhedron and a grid [Anc91, NR00, QRR96]. The existing operations on this domain are often inefficient or not fully specified. Research is therefore needed in formalizing and implementing the grid domain and then, building on this work, to develop a grid-polyhedral domain.In collaboration with the other units and with other research groups (see http:/www.cs.unipr.it/ppl/Credits/), the Parma research unit will closely follow any further work on static analysis applications relevant to these and similar abstract domains.
Application to Complexity AnalysisAnother line of research will be pursued in the context of Workpackage 5 (Constraints) of the project. We plan to study the problem of fully automatic complexity analysis, for which the research unit is in a unique position. In fact, in order to attack this problem, we would combine the work on numerical abstract domains outlined above with our work on the automatic solution and approximation of recurrence relations (see http://www.cs.unipr.it/purrs/ and [BZZ03, BZ04]). One way to express such a combination of data-flow and control-flow information is by a loosely coupled integration of the corresponding constraint systems, so as to keep the complexity of the resulting interactions under control. In fact, recurrence relations can be seen as constraints on sequences of real numbers, therefore relating the size of a problem (e.g., a measure of some of its input parameters) with the upper or lower bounds on the amount of resources needed for solving it (e.g., an upper bound on the number of computation steps). Besides deepening the study of automatic methods to solve or approximate recurrence relations, we will study the problem of their extraction from the analyzed software systems, an operation that usually hides other forms of approximation that can greatly influence the overall accuracy of such an analysis. In order to provide a first proof-of-concept, we will initially focus on a suitable fragment of Objective CAML and extend the techniques to larger fragments in later phases. When the object-oriented aspects of the language will be considered, precise complexity analysis will require class and escape analyses, which we plan to realize with the integration of set-constraints [DPPR00].