
RESEARCH UNIT

Research program

Cryptographic databases
University Co-ordinator
Università degli Studi di BERGAMO - INGEGNERIA GESTIONALE E DELL'INFORMAZIONE
Research Unit Leader
Stefano Paraboschi
Description
The presentation of the scientific base demonstrated that the topic we want to explore is relevant and that the previous work of our research unit in this area is a good starting point for the research goals of the project. We now describe the problems we consider and the approach we plan to follow. All the problems we consider have an important role in the scenario and need to be solved in order to fully exploit the potential of the "untrusted database" paradigm.

The UNIBG unit cooperates equally with UNIMI and UNISA in WP0 and WP8, supports the work of UNISA in WP3 and of UNIMI in WP4, is responsible for WP6 (with support from UNIMI) and is responsible for WP7 (with support from UNISA). Each of the "core" workpackages (WP1 to WP7) addresses a different aspect that arises in the realization of a secure database service hosted by an untrusted third party.

WP0: State of the art

This workpackage aims to evaluate the state of the art of the field. Even though the UNIBG unit has previous direct experience in the context of the project, the field is rapidly evolving, and a precise assessment of the main research directions and of the recent results is extremely useful as preparation for the work of the following workpackages. The contribution of the UNIBG unit to workpackage WP0 will be an intermediate report, WP0-BG-D1-R. The UNIBG unit will also be responsible for combining this report with the contributions produced by the UNIMI and UNISA units to produce the project report WP0-BG-D2-R, which represents the main deliverable of this workpackage.

WP3: Prototype of cryptographic services

This workpackage aims to build a prototype of the cryptographic services designed by the UNISA unit in WP1 and WP2. The role of UNIBG in this WP is to provide the implementation requirements that will permit the integration of this prototype within the software system built in WP7.

WP4: Access control models for encrypted databases

The work in WP4 starts from the consideration that most database applications need to manage different users, with specific confidentiality and integrity requirements that distinguish the access privileges of each user. This line of research has recently received significant interest.

Relational DBMSs use an access control model that is based on the description of authorizations as triples (subject, object, action) that are stored in a table of the data dictionary. Such a mechanism requires that a system component with the privileges of the database owner implements a reference monitor that is responsible for evaluating each access request coming from the users. In the scenario we are studying, such an approach would be extremely limiting, since it would force each database user to contact the database owner in order to retrieve exactly the information that the user has the right to access. Instead, the full realization of the potential of this architecture requires the definition of mechanisms that allow users to access the database directly, with protection from incorrect read/write accesses provided by adequate cryptographic solutions.
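The traditional triple-based mechanism described above can be illustrated with a minimal sketch. The class and variable names (Authorization, ReferenceMonitor, the sample users and table) are hypothetical and chosen only for illustration; a real DBMS stores these triples in its data dictionary and supports much richer rules.

```python
from typing import Iterable, NamedTuple


class Authorization(NamedTuple):
    """One (subject, object, action) triple, as stored in the data dictionary."""
    subject: str
    object: str
    action: str


class ReferenceMonitor:
    """Owner-side component that evaluates every access request.

    This is the centralized check that the untrusted-server scenario
    would like to avoid routing every request through.
    """

    def __init__(self, authorizations: Iterable[Authorization]) -> None:
        self._granted = set(authorizations)

    def is_allowed(self, subject: str, obj: str, action: str) -> bool:
        # A request is permitted only if the exact triple has been granted.
        return Authorization(subject, obj, action) in self._granted


# Hypothetical sample data.
monitor = ReferenceMonitor([
    Authorization("alice", "orders", "select"),
    Authorization("alice", "orders", "update"),
    Authorization("bob", "orders", "select"),
])
```

In this sketch every request must pass through is_allowed, which is exactly the bottleneck that makes the approach limiting when the data is held by a remote untrusted server.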

In WP4, the UNIBG unit will support the initiative of UNIMI on the design of mechanisms that allow the management of a plurality of users with different access privileges. Confidentiality requirements will probably require the introduction of families of keys to encrypt different portions of the data, with each key protecting data characterized by the same protection requirements. Integrity will probably be based on the design of signature services that permit a user to verify whether the piece of information retrieved from the database is correct.
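As a rough illustration of what a family of keys could look like, the sketch below assigns one key to each distinct set of authorized readers, so that tuples with the same protection requirements share a key. The derivation through HMAC-SHA256 from an owner secret is an assumption made for this example and is not prescribed by the project; the names key_for_class, acl and master are hypothetical.

```python
import hashlib
import hmac
from typing import Dict, FrozenSet, Set


def key_for_class(master_secret: bytes, readers: FrozenSet[str]) -> bytes:
    """Derive a key for one protection class (a set of authorized readers).

    Assumption: keys are derived with HMAC-SHA256 from a secret held by
    the data owner. Sorting makes the label independent of set ordering;
    the NUL separator assumes user names never contain that byte.
    """
    label = "\x00".join(sorted(readers)).encode("utf-8")
    return hmac.new(master_secret, label, hashlib.sha256).digest()


# Hypothetical access control lists: tuple identifier -> authorized readers.
acl: Dict[str, Set[str]] = {
    "t1": {"alice", "bob"},
    "t2": {"bob", "alice"},
    "t3": {"alice"},
}

master = b"hypothetical-owner-secret"
keys = {tid: key_for_class(master, frozenset(readers)) for tid, readers in acl.items()}
```

Here t1 and t2 share a key because they have the same readers, while t3 gets a different one. How each user obtains the keys of the classes they belong to is the key management problem, which this sketch does not address.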

One of the main obstacles in the realization of a system that supports confidentiality and integrity requirements in this context is the design of an adequate key management solution. Considerable attention will have to be paid to recent work on these topics. Another issue that will be considered in WP4 is the introduction of trust management services in this context, which would offer the opportunity for the construction of many advanced distributed services.

WP6: Integration of data encryption with current relational systems

The goal of WP6 is the identification of techniques able to support the realization of the "database-as-a-service" scenario. We will start from the current status of database technology, aiming to identify the key issues that need to be solved to permit a full realization of the potential of this architecture. There are indeed several issues that need to be solved in order to deploy this architecture in a real environment. We organize the work of this workpackage into 4 tasks; task 2 is decomposed into two subtasks.

Task 1: Indexing structures

Even if computer systems continue to increase their performance at costs that continue to decrease, an adequate performance level still remains a major concern in the construction of a DBMS-centered solution. An architecture that relies on the "database-as-a-service" paradigm is not competitive if it offers its secure services at the cost of a significant performance reduction. Indeed, many security solutions are not effectively deployed because of their inability to offer a level of efficiency adequate to what is expected by the users.

Cryptography necessarily imposes an overhead on the computations of the system; it is important to design a system where the performance impact is not significantly greater than what derives from the presence of the cryptographic services themselves.

Access to data contained in relational databases occurs with a set-oriented approach. Starting from a declarative specification of the user request (typically, a SQL query), the optimizer translates the query into an internal algebraic representation, which makes use of the operators supported by the relational engine. Performance of the query usually depends on the presence of adequate indexes, which are fundamental for efficient access to the information contained within the database.

The strategy we plan to follow in order to satisfy this requirement consists in designing novel index structures. These index structures must take as a starting point the solutions that are currently supported by relational engines (trees in their many variants, hash structures, bitmaps, spatial access methods), adapting them to the specific features of this environment.

Task 2: Integration with the Query Optimizer

Another aspect to consider is the identification of a strategy that would permit the integration of the above index structures with the query optimizers of existing DBMSs. There are two main issues that have to be solved in order to realize this integration:

- Task 2.1, Novel Cost Model: a cost model should be constructed for the novel operators responsible for the remote access;

- Task 2.2, Feasibility of Query Optimization: an approach should be identified that limits the increase in the solution space that would derive from the introduction of a complex remote access service with additional operators and novel index structures.

Task 2.1: Novel Cost Model

The construction of a novel model for the representation of the cost of the remote access operators is necessary to support the integration of storage of encrypted data within a remote database. A major difference between the cost model needed in this context and those that characterize current DBMSs is that the cost models of DBMSs focus on the number and organization (random or sequential) of disk I/Os, which is traditionally the main parameter controlling the performance of the relational engine. Instead, in most of the contexts where this paradigm is going to be used, we can assume that the major bottleneck of the system is the network bandwidth available to transfer data between the server and the client. The major parameter to control is therefore the amount of data that needs to be transferred to the client as an answer to an access request; another parameter with an important impact on performance is the latency of the access to the remote database, i.e., the delay expected for the receipt of the first result.

The definition of such a cost model will require the design of an abstract model that represents the behavior of the different components of the architecture. The heuristics that are currently used to solve the optimization problem of relational queries should then be adapted to this context. Finally, experiments should be carried out to verify in concrete terms the quality of the solutions returned by the novel cost model.
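To make the two parameters named in this task concrete, the following sketch estimates the cost of a remote access from the amount of data transferred and the access latency. The linear formula, the per-round-trip charging of latency, and all the numbers are assumptions made for illustration; the actual cost model is an expected outcome of this task and is not defined here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkProfile:
    """Hypothetical description of the client/server link."""
    latency_s: float               # delay before the first result arrives
    bandwidth_bytes_per_s: float   # sustained transfer rate


def transfer_cost(result_bytes: int, round_trips: int, net: NetworkProfile) -> float:
    """Estimated seconds to obtain a result from the remote server.

    Assumption: each round trip pays the latency once, and the payload
    is limited only by bandwidth. Disk I/O on the server is ignored,
    since the network is taken to be the bottleneck.
    """
    return round_trips * net.latency_s + result_bytes / net.bandwidth_bytes_per_s


net = NetworkProfile(latency_s=0.05, bandwidth_bytes_per_s=1_000_000)

# Plan A: one request returning 10,000 candidate rows of 200 bytes each,
# which the client then filters locally.
plan_a = transfer_cost(10_000 * 200, round_trips=1, net=net)

# Plan B: a more selective access returning 500 rows, at the price of
# a second round trip.
plan_b = transfer_cost(500 * 200, round_trips=2, net=net)
```

Under these invented numbers the more selective plan is cheaper despite the extra round trip, which is the kind of trade-off between transferred volume and latency that an optimizer using such a model would have to weigh.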

Task 2.2: Feasibility of the Query Optimizer

The goal of this task is to identify an optimization strategy that keeps the query optimization problem tractable. It is useful to consider the similarities and differences between this context and a distributed database scenario. As in a distributed database, in this scenario there are relational engines in a network that have to cooperate in the computation of queries. Previous research on distributed query optimization has demonstrated that the problem is extremely difficult, since the solution space is considerably larger than in the centralized setting and the application of standard query optimization strategies in this context is too expensive. The lesson to be learned from this research is that it is important to aim at an effective decoupling of the problem, since considering all the aspects of the system at once makes the problem intractable. Fortunately, it appears that in this scenario only the optimization of the query on the client needs to be considered significant (the query cost on the server can be treated as an optimization problem for the service provider); moreover, minimizing the network costs should also minimize the amount of information that the untrusted server is asked to return, reducing its costs too. These considerations lead us to believe that it is possible to define a framework that offers adequate performance in the optimization phase.

Task 3: Modular Architecture

Another important aspect that the research in WP6 should consider is the design of an architecture that would allow the easy integration of this approach within existing DBMSs. Current DBMSs are often huge pieces of software, which implement many sophisticated services with an adequate level of performance. The use of a remote untrusted server is reasonable only if it continues to offer most of the services of a current DBMS, otherwise it would not be able to support the requirements of most business applications.

The re-implementation from scratch of all the DBMS services within a system supporting untrusted remote servers is out of the question, as it would require an enormous effort. A strategy that we plan to investigate is one that aims to modularize the components implementing the access to remote untrusted servers, in order to permit an almost "transparent" integration of this novel service within an existing DBMS.

Ideally, the software architecture of the DBMS should permit the identification of a few modules that would need to be adapted for the introduction of this novel service. We plan to carry out this analysis using, as a reference, DBMSs that are characterized by an open design. Since the service of an untrusted remote server can be considered as simple storage of tables, we will specifically investigate the integration of this service as the introduction of novel physical data access operators. In this way, the service would be introduced as a physical design option, aiming to increase the robustness of storage. Current encrypted DBMSs, where encryption is only applied within the functions that access the content of disk blocks, demonstrate that a physical access component using encryption can be used within existing DBMSs with limited effort. In this context the service interface operates at a considerably higher level and the challenges it poses are considerably greater; nonetheless, it appears to us that, at the cost of a loss of efficiency that needs to be quantified, it is possible to introduce an abstraction layer that separates the access service to the untrusted database and makes it possible to integrate it within an existing DBMS. This line of research obviously requires significant experimental support, which will be one of the goals of the activities in WP7.
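The idea of an abstraction layer can be sketched as a common interface for physical table access, behind which local storage and the remote untrusted server are interchangeable. Everything in this sketch is hypothetical: the names TableAccess, LocalTable, RemoteUntrustedTable and select are invented, and base64-encoded JSON is used only as a stand-in for ciphertext so that the example runs with the standard library. It provides no confidentiality.

```python
import base64
import json
from abc import ABC, abstractmethod
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


class TableAccess(ABC):
    """Physical data access operator as seen by the rest of the engine."""

    @abstractmethod
    def scan(self) -> Iterator[Row]:
        """Yield the rows of the table in plaintext form."""


class LocalTable(TableAccess):
    """Conventional storage: rows are available directly."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = rows

    def scan(self) -> Iterator[Row]:
        return iter(self._rows)


class RemoteUntrustedTable(TableAccess):
    """Storage on an untrusted server: opaque blobs are fetched and decoded."""

    def __init__(self, fetch_blobs: Callable[[], Iterable[bytes]],
                 decode: Callable[[bytes], Row]) -> None:
        self._fetch_blobs = fetch_blobs   # stands in for the network call
        self._decode = decode             # stands in for client-side decryption

    def scan(self) -> Iterator[Row]:
        for blob in self._fetch_blobs():
            yield self._decode(blob)


def select(table: TableAccess, predicate: Callable[[Row], bool]) -> List[Row]:
    """Engine-level operator, unaware of where the table is stored."""
    return [row for row in table.scan() if predicate(row)]


# Placeholder codec: NOT encryption, used only to keep the sketch runnable.
def encode(row: Row) -> bytes:
    return base64.b64encode(json.dumps(row).encode("utf-8"))


def decode(blob: bytes) -> Row:
    return json.loads(base64.b64decode(blob).decode("utf-8"))


rows = [{"id": 1, "city": "Bergamo"}, {"id": 2, "city": "Milano"}]
server_blobs = [encode(r) for r in rows]

local = LocalTable(rows)
remote = RemoteUntrustedTable(lambda: server_blobs, decode)
```

The select operator returns the same rows for local and remote, which is the kind of transparency the modular architecture aims for. The sketch ignores the loss of efficiency mentioned above, since every remote scan here transfers the whole table.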

Task 4: Physical Design Methodology

The model produced in the previous task, which presents the use of remote untrusted servers as a physical design option, provides the opportunity for the design of an updated physical design methodology. Successfully confining this approach to the physical design phase of the database design process would present the great advantage of reducing the impact on current database design methodologies, permitting a relatively easy introduction of this paradigm within existing applications.

WP7: Prototype implementation

The goal of workpackage WP7 is the implementation of a prototype that will be used as a testbed for the evaluation of the feasibility and applicability of the solutions produced in the other workpackages. We will operate in two phases.

The first phase will focus on the construction of an architecture that will build from scratch a simple relational engine, in order to demonstrate the behavior of the system. This prototype will extend the relational engine that has been described in [DDFPSJ-03].

The second phase will instead aim to demonstrate how the approach can be integrated within an existing DBMS. This work will extend an existing open source database engine. Our research unit has some experience in the extension of open source DBMS engines. Specifically, an activity has recently analyzed the support of triggers in PostgreSQL and has produced a variant that increased the level of support that PostgreSQL offers to trigger priorities and to their set-oriented processing. This prototype will integrate the cryptographic services implemented by UNISA in WP3.

The idea of working in parallel on a research prototype and on the extension of existing DBMS engines is inspired by the experience in the active database scenario, where, at the beginning of the '90s, members of the UNIBG unit built several research prototypes from scratch at Politecnico di Milano to demonstrate the advanced features that triggers could realize; at the same time, the same group of researchers analyzed the introduction of triggers within existing commercial DBMSs.

WP8: Dissemination

The organization of a Workshop at the end of the project represents a convenient way to promote the results of the project.

Apart from the Workshop, the members of all the project units have significant research experience and will be able to promote the results of the project with publications in major conferences and journals. This dissemination activity, crucial to the success of the project, will occur during all the phases of the project and we expect it will produce a significant number of publications. The continuous feedback that the scientific community will give after a submission to a journal or a conference will be of great help in identifying the road to follow and will help in the verification of the potential of the project. Frequent meetings and coordination activities among the research units will allow the coordinator of the project to assess the progress toward the project goals.