Vai al contenuto| Home page|

   Ti trovi in: HOME »Programmi, progetti e risultati »I progetti »PRIN - Programmi di ricerca di Rilevante Interesse Nazionale»Programma di ricerca»Unità di ricerca
INIZIO_TESTO_DA_INDICIZZARE

UNITA' DI RICERCA

italiano - english

Research program

Algorithms for the Next Generation Internet and Web: Methodologies, Design, and Experiments
University Co-ordinator
Universita' degli Studi di ROMA - INFORMATICA E SISTEMISTICA - ROMA(RM)
Research Unit Leader
Giorgio AUSIELLO
Description
In the following, for each Task, we present the proposed research activities involving this Research Unit.Task 1.1 - Models and Algorithmic ParadigmsIntroductionOur goal in this task is to design and implement a software library able to provide basic functionalities needed to handle massive graphs stored in secondary memory. We want to implement known techniques to store, retrieve and measure graphs, and, when necessary, to design new techniques. Depending on the problem, we will focus on algorithms to deal with graphs that exhibit the known properties of the Webgraph, i.e. the power-law in-degree, a huge scc and a small diameter (if compared with the number of nodes). Later we plan to use the library to measure samples of the Webgraph.Phase 1.In this phase we plan to develop a software library to handle basic operations over massive graphs in external memory. We need to define the external data structures, implement the basic techniques as well as routines able to import/export from/to standard graph file formats (i.e. Dimacs) and web repositories (i.e. WebBase). On a parallel line we would like to collect samples of the Web using different crawling strategies.Expected Results.Open source software library to handle massive graphs. Collection of samples from the Web.Phase 2.We plan to use the tools developed in Phase one to analyze samples of massive data from the Web. In particular we want to investigate if different crawling strategies (starting from a common node set) can lead to statistically different samples. We also plan to adapt the library to deal with temporal data related to the link.Expected Results.An empirical study of properties of samples from the Web. Guidelines for crawling strategies.Task 1.2 - Visualization MethodologiesIntroduction.We plan to create new Java-based tools to support the realization of animated visualizations over the Web, which can be used in studying network-related aspects. In particular, we aim at developing light and easy-to-use visualization tools for producing highly customized and interactive presentations. Furthermore, we will exploit the potentialities offered by the Java language so as to make our tools highly independent of specific operating systems. This activity will be carried on in strict cooperation with another Research Unit within the project.Phase 1.In this Phase, we plan to devise new approaches to automate (or at least to simplify) the visualization of network-related data over the Web. In particular, we plan to develop light and easy-to-use Java-based tools to support the creation of animated visualizations by integrating editor capabilities with the traditional program-driven animation approach.Expected Results.Light and easy-to-use Java-based tools for creating animations over the Web.Phase 2.We will evaluate the effectiveness of the visualization systems developed during Phase 1, cooperating with other Research Units in the evaluation process.Expected Results.Evaluation of the visualization tools developed during Phase 1.Task 2.1 - Algorithms for Internet RoutingIntroduction.We plan to design and engineer new algorithms for Internet routing, based on shortest path techniques, to be employed in networking software tools. Since the network dynamics are extremely fast (due to the traffic variability as well as to the variations in the network connectivity and in the availability of switching and transmission resources), the recomputation of the routing tables must be as fast. Following up on a very recent technique [DI03], which is substantially different from previous approaches, we plan to use novel combinatorial properties of graphs based on the well-known optimal-substructure property of shortest paths as the kernel for solving dynamic single-source shortest paths.Furthermore, we plan to devise methods for maintaining dynamically the t-spanner of a graph, i.e., a subgraph where the original distances are approximated within a factor of t. This is especially important for reducing the number of physical links in a communication network, which may be very expensive, without increasing too much the distances between routers.A different research line is based on tools from Game Theory. We aim at a game-theoretic analysis of the Internet switching and routing problems. Later, we intend to design mechanisms for network cost sharing that are strongly truthful, i.e. the dominant strategy of every user is to report its truth utility even in presence of coalition, efficient, and allow to recover a good share of the network cost.Part of the activities in this will be carried on in strict cooperation with another Research Unit.Phase 1.We will investigate new algorithmic techniques for maintaining dynamic single-source shortest paths. We emphasize that this is a very ambitious goal. Differently from other dynamic graph problems, such as for instance the dynamic maintenance of minimum spanning trees [HLT98], dynamic shortest path problems seem much harder to solve: indeed, despite the existence of some previous studies [FMN99,RR96], finding a solution that is better than recomputing from scratch a shortest path tree is still an elusive goal since over four decades. To achieve better bounds, we plan to extend the latest results for dynamic all pairs shortest paths presented in [DI03].We also plan to study methods for maintaining dynamically the t-spanner of a graph.Furthermore, in the framework provided by Game Theory, we aim at studying the Internet switching problem in the packetized model with utility functions depending on the individual delay and throughput, without any assumption on the network traffic.Expected Results.New algorithms for dynamic single-source shortest paths and dynamic t-spanners. A model for Internet switching management. The theoretical and experimental study of Nash equilibria and price of anarchy [KP99] of simple FIFO and Random Early Detection buffer and congestion management policies. Phase 2.We plan to evaluate experimentally the algorithmic techniques studied in Phase 1, developing new heuristics for improving the performances of the proposed algorithms in real environments, and testing them on open source platforms. We will also tune and engineer known dynamic routing algorithms by extensively testing their performance on suitable data sets generated according to specific models for next generation Internet and we will design new ones based on the gained insight.We aim at designing mechanisms to share in a fair manner the cost of a network infrastructure [JV01]. In particular we will study the Rent-or-buy network design problem [PT03,LS04] in which we aim at deploying a network in which link connections are either bought at some fixed cost or rented at a 'per unit of flow' cost.Expected Results.Engineering of the algorithms devised in Phase 1 for Internet routing. Cost sharing mechanisms for single source rent-or-buynetwork design that use randomization and share the expected cost of the network infrastructure. Extension of these mechanisms to the Multi-commodity setting.Task 3.1 - Algorithms for Web SearchingIntroduction.In Web searching one of the crucial problems is the ranking of the documents related to a user query. Modern search engines use ranking algorithms based on link analysis. We are interested in the stability of rank analysis algorithm. We say that a ranking algorithm is stable if a small perturbation of the input produces a small variation in the output.Phase 1.We plan to study necessary and sufficient conditions for stability and similarity of link analysis ranking algorithms in terms of the spectral properties of the underlying Webgraph. In particular we plan to study under which conditions the HITS algorithm is stable and it produces a rank that is substantially similar to the In-degree algorithm.Expected Results.Theoretical analysis of the stability and similarity of link analysis ranking algorithms.Phase 2.We plan to extend our work to the study of the stability of link analysis ranking algorithms on specific models for the Webgraph such as the product graph model. We also intend to compare the theoretical analysis with the empirical study of a corpus of Web documents.Expected Results.Analysis of link analysis algorithms for specific models for the Web.Empirical validation of the theoretical results.Task 3.2 - Discovery and Visualization of the WebIntroduction.The goal of our research with respect to this task is to develop a more precise picture of the structure and dynamics of the Web. We aim at studying in detail the characteristic regions of the Webgraph, i.e. the Core, IN and OUT node sets, and, by analyzing samples provided us from Web repositories, we plan to characterize the dynamics of the Webgraph. It is not necessary to point out that the dynamics of the Webgraph plays a crucial role in modern Web search engines, that are based on link analysis algorithms.Phase 1.We plan to study in deeper details the topological properties of the Web, specifically in the context of the BowTie structure. We intend to map all strongly connected components of large crawls of the Webgraph [WebBase], look closer at the topological structure of the IN and of the OUT regions, exploit the empirical findings to design better crawling strategies.Expected Results.Empirical study of the strongly connected components of the Webgraph.Phase 2.Define new concepts and methods that allow to reconstruct a picture of the Web at a specific time from data collected over time and available in Web repositories. Design methods that allow to associate vertices and hyperlinks with temporal information. Determine the evolution over time of the main statistical properties of the Webgraph, as for instance degree-distribution, connectivity properties, spectral properties, quality of pages, the topical regions of the Web.Expected Results.A framework for studying the temporal evolution of the topological structure of the Webgraph.