RESEARCH UNIT
Research program
Informatics and geo-linguistic research: ALS: micro-areal atlases and data base fruition
University Co-ordinator
Università degli Studi di PALERMO - INGEGNERIA INFORMATICA - PALERMO (PA)
Research Unit Leader
Antonio GENTILE
Description
The research proposed here aims to add several features to the digital repository developed for the Linguistic Atlas of Sicily (ALS), including intelligent data retrieval, automatic word spotting, and transcription alignment. Within the ALS framework, this research unit will investigate artificial intelligence-based techniques for information retrieval over complex speech and textual data. Automatic, data-driven language model construction techniques will be studied to enhance the system's capabilities. The proposed activity is articulated in the following four phases:
1 - Study of Artificial Intelligence inspired methodologies for complex data management
2 - Robust classification based on artificial neural networks of speech phonetic features
3 - Study of systems for automatic language model construction
4 - Design of systems for complex query generation, execution, and result visualization.
Phase 1
Study of Artificial Intelligence inspired methodologies for complex data management
During phase 1, the research unit will study methodologies derived from artificial intelligence to address tasks in the areas of Intelligent Data Analysis, Knowledge Discovery, Statistical Natural Language Processing, and Speech Recognition. Sequential decision theory might also suggest some interesting approaches to these tasks: according to this theory, it is possible to define active agents with optimal statistical prediction capabilities in a priori unknown environments. Based on the "optimal reinforcement learning" paradigm, these agents allow unified data processing, with reduced computational requirements and simplified treatment of high-level knowledge. The identified methodologies will be applied specifically to the ALS data body, so as to offer friendlier, more efficient information retrieval. To reach this objective, sub-symbolic word coding methodologies based on latent semantic analysis will be studied.
Phase 2
Robust classification based on artificial neural networks of speech phonetic features
To support and assist linguistic studies on the ALS corpus, state-of-the-art speech processing capabilities will be added to the existing information system. In particular, the activity in phase 2 will focus on designing robust classifiers of speech phonetic features using artificial neural networks. This research unit has long, well-documented experience in the design of neural classifiers for a variety of applications. Its neural network designs are currently employed in the National Science Foundation Automatic Speech Attribute Transcription (ASAT) project at the Georgia Institute of Technology (USA), within a long-established, ongoing scientific collaboration program between this research unit and the group led by Prof. Mark Clements of the Georgia Tech Center for Signal and Image Processing. The proposed activity consists of segmenting speech signals and determining the classification probabilities for six basic phonetic classes, namely vowels, nasals, semivowels, liquids, fricatives, and plosives. Two neural network architecture designs will be explored: the hyperspheric neuron multilayer perceptron and the alphanet perceptron. The alphanet perceptron is a multilayer perceptron in which the activation function is automatically learnt as a truncated series expansion of Hermite polynomials. For both architectures, the training engine is based on a modified Powell conjugate gradient optimization algorithm.
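As an illustration only, the idea of an activation function expressed as a truncated Hermite series, followed by class probabilities over the six phonetic classes, can be sketched as below. This is not the unit's alphanet implementation: the use of unnormalized physicists' Hermite polynomials, the softmax output, and all function names are assumptions made for the example, and the coefficients would in practice be learnt during training rather than supplied by hand.

```python
import math

# The six phonetic classes named in the text.
PHONETIC_CLASSES = ["vowel", "nasal", "semivowel", "liquid", "fricative", "plosive"]


def hermite_values(x, order):
    """Return [H_0(x), ..., H_order(x)] for physicists' Hermite polynomials.

    Uses the recurrence H_{n+1}(x) = 2x H_n(x) - 2n H_{n-1}(x).
    The choice of this polynomial family is an assumption of the sketch.
    """
    values = [1.0]
    if order >= 1:
        values.append(2.0 * x)
    for n in range(1, order):
        values.append(2.0 * x * values[n] - 2.0 * n * values[n - 1])
    return values


def hermite_activation(x, coeffs):
    """Truncated series f(x) = sum_k coeffs[k] * H_k(x).

    In a trainable network the coefficients would be parameters;
    here they are passed in explicitly (hypothetical values).
    """
    basis = hermite_values(x, len(coeffs) - 1)
    return sum(c * h for c, h in zip(coeffs, basis))


def softmax(scores):
    """Turn raw output scores into probabilities that sum to 1 (assumed output layer)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores):
    """Map six output scores to (class name, probability) pairs."""
    return list(zip(PHONETIC_CLASSES, softmax(scores)))
```

For example, `hermite_activation(0.5, [0.0, 1.0])` evaluates only the first-order term, and `classify([0.0] * 6)` assigns equal probability to each class.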
Phase 3
Study of systems for automatic language model construction
The objective of phase 3 is to investigate sub-symbolic methodologies for automatic, data-driven language model construction. Modules will be developed that can be easily integrated into standard, state-of-the-art toolkits for speech recognition, such as HTK or SPHINX. To reach this objective, a robust estimator will be designed to efficiently extract a parametric estimate of the corpus population. This estimate will include both short-range and long-range constraints. The n-gram model (for small n) will be used to satisfy short-range constraints. To satisfy long-range constraints, a low-dimensional semantic space will be used for sub-symbolic word representation. This space will be constructed using latent semantic analysis techniques, employing truncated singular value decomposition and statistics based on Hellinger metrics. Specific talking agents (vocal chat bots) will be designed to offer a more appealing, natural dialogic interaction with the user by means of speech recognition and synthesis.
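The short-range part of such a model can be illustrated with a minimal bigram (n = 2) estimator. This is a sketch under stated assumptions, not the estimator the project proposes: whitespace tokenization, the `<s>` and `</s>` boundary markers, add-one smoothing, and the function names are all choices made for the example. The long-range semantic component is not covered here.

```python
from collections import Counter


def train_bigram(sentences):
    """Count bigram contexts and pairs over whitespace-tokenized sentences.

    Returns (context_counts, bigram_counts, vocabulary). Sentence boundary
    markers <s> and </s> are added; this convention is an assumption.
    """
    context_counts = Counter()
    bigram_counts = Counter()
    vocabulary = set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocabulary.update(tokens)
        for prev, word in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return context_counts, bigram_counts, vocabulary


def bigram_prob(prev, word, context_counts, bigram_counts, vocabulary):
    """P(word | prev) with add-one smoothing (an assumed smoothing scheme)."""
    numerator = bigram_counts[(prev, word)] + 1
    denominator = context_counts[prev] + len(vocabulary)
    return numerator / denominator
```

With add-one smoothing, the probabilities of all vocabulary words after a given context sum to 1, and word pairs never seen in training still receive a small nonzero probability.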
Phase 4
Design of systems for complex query generation, execution, and result visualization.
The objective of phase 4 is the design of systems for complex query generation, execution, and result visualization. Queries will be executed on the ALS corpus. Algorithms will be studied to enable semantic querying: starting from a user-provided set of keywords, the system will automatically "expand" the original query by adding other terms that are conceptually or semantically correlated. A semantic meta-search engine will be designed, which can use traditional search engines and automatically expand user queries on the ALS database. To this end, sub-symbolic word coding will be performed using an appropriate lexicon or document corpus. This coding will associate each word with a numeric vector in a semantic space. The neighborhood of a given vector will then contain the vectors of words semantically correlated with the original one. It will thus be possible to find documents on topics related to the sought words even when those documents do not contain the words themselves. The set of resulting documents can then be further filtered by automatic classification, based on neural classifiers trained on the semantic space. This will allow the full system to inherit the ability of neural networks to deal with noisy or contradictory input data.
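The query expansion step described above can be sketched as a nearest-neighbor lookup in the semantic space. The example below is illustrative only: cosine similarity as the neighborhood measure, the `top_k` and `min_similarity` parameters, and the two-dimensional toy vectors are all assumptions, since the text does not specify how the neighborhood is defined. Real word vectors would come from the sub-symbolic coding of a lexicon or document corpus.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def expand_query(keywords, word_vectors, top_k=2, min_similarity=0.8):
    """Add to the query the nearest neighbors of each keyword in the semantic space.

    top_k and min_similarity are hypothetical tuning parameters.
    Keywords without a vector are kept but not expanded.
    """
    expanded = list(keywords)
    for keyword in keywords:
        if keyword not in word_vectors:
            continue
        scored = [
            (cosine(word_vectors[keyword], vector), word)
            for word, vector in word_vectors.items()
            if word != keyword
        ]
        scored.sort(reverse=True)
        for similarity, word in scored[:top_k]:
            if similarity >= min_similarity and word not in expanded:
                expanded.append(word)
    return expanded


# Hand-made toy vectors, purely for illustration (not derived from any corpus).
TOY_VECTORS = {
    "bread": [0.9, 0.1],
    "loaf": [0.8, 0.2],
    "oven": [0.7, 0.4],
    "boat": [0.1, 0.9],
}
```

With these toy vectors, `expand_query(["bread"], TOY_VECTORS)` adds "loaf" and "oven" to the query, while "boat" lies outside the neighborhood and is left out. The expanded keyword list could then be passed to a traditional search engine.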



