
RESEARCH UNIT


Research program

The statistical information in agriculture: present needs and future developments
University Co-ordinator
Università degli Studi di CASSINO - ECONOMIA E TERRITORIO - CASSINO(FR)
Research Unit Leader
Massimo SABBATINI
Description
The objective of the PRIN project is to provide solutions that improve the efficiency of the agricultural statistical system by reducing data-collection costs and enhancing its information capacity. The Operative Unit in Cassino contributes to this goal by assessing the impact on the agricultural statistical system of alternative criteria for the statistical definition of a farm.

The goal of the research is to identify an optimal criterion for defining the statistical unit of censuses and sample surveys, in order to reduce the cost of data collection without compromising the information value. We develop a statistical approach that uses the information collected on a subsample of the universe to make inferences about the excluded observations. Adopting a more restrictive definition criterion implies a loss of information about the excluded farms. Such information, however, may be valuable, especially given the fragmented structure of the Italian farm system. Excluding smaller farms from the census reduces both the number of observations and the total farmland covered. For example, according to the 2000 Agricultural Census, farms with less than 1 hectare of tillable land number 1,163,793 (45% of the total) and hold 4.8% of total farmland.

Because of their share of the total, smaller farms may be relevant from a social perspective and from a policy point of view. The information loss that follows the adoption of a restrictive definition criterion may introduce a distortion, providing misleading data to policy makers and, more generally, to users of the statistics. The possibility of using statistical inference to describe the excluded farms is therefore a prerequisite for adopting the new criteria.

On the other hand, the high number of farms was one of the major cost drivers of the 2000 Census. The census reported approximately 2.6 million farms in Italy, a sizable share of which are of negligible economic size. The issue is debated at Eurostat, where the criteria for statistical relevance are defined. Adopting the standard Eurostat criteria would remove approximately 1.1 million farms from the census. The resulting cost reduction is offset by the exclusion of a significant percentage of farms, especially in the southern regions. The evaluation of the information loss, the comparison across alternative criteria, and the monitoring instruments are the areas of interest of this research.

In particular, the research aims to measure and compare the distortion in the census data and in the sample statistics induced by the alternative farm definitions. Adopting a new and more restrictive definition has a direct effect on the total value of the variables and an indirect effect through the stratification of future sample surveys. A reliable assessment of the costs and benefits of each definition calls for a careful evaluation of both effects.

The analysis is based on farm-level data from the 2000 agricultural census. In a first step, a set of alternative definitions of farm is identified, based on the literature and on the current approaches of international statistical organizations (EUROSTAT, RICA, USDA, FAO, etc.). Innovative approaches are also proposed. Then, the direct effect will be evaluated by calculating the total values of the key census variables under each definition. This makes it possible to compare the effect of each definition on the aggregate national and local values. The indirect effect will be estimated by simulating sample surveys (with multiple objectives) based on the alternative definitions and by measuring the statistical error. A Monte Carlo process will allow us to check the theoretical distribution of the error. Finally, the alternative criteria will be compared on the basis of the expected cost reduction and the measured information distortion.
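The two effects described above can be illustrated with a minimal Python sketch. It is not the project's procedure: the population is synthetic, the size threshold of 1 hectare is used only as an example of a definition criterion, the survey is a simple random sample rather than a stratified multi-objective design, and all function and variable names are hypothetical.

```python
import numpy as np

def direct_effect(area, value, threshold):
    """Share of farms, and share of the variable total, lost when
    farms with area below `threshold` are excluded."""
    excluded = area < threshold
    return excluded.mean(), value[excluded].sum() / value.sum()

def simulate_survey_error(area, value, threshold, n, reps, rng):
    """Monte Carlo simulation: repeatedly draw a simple random sample of
    size n from the included farms, estimate the total by expansion, and
    record the relative error against the total of ALL farms."""
    included = value[area >= threshold]
    n_included = included.size
    true_total = value.sum()
    errors = np.empty(reps)
    for r in range(reps):
        sample = rng.choice(included, size=n, replace=False)
        errors[r] = (n_included * sample.mean() - true_total) / true_total
    return errors

# Synthetic population (assumption): skewed farm sizes in hectares and an
# output variable roughly proportional to size.
rng = np.random.default_rng(0)
area = rng.lognormal(mean=0.5, sigma=1.2, size=5000)
value = area * rng.uniform(0.5, 1.5, size=5000)

share_farms, share_value = direct_effect(area, value, threshold=1.0)
errors = simulate_survey_error(area, value, 1.0, n=200, reps=500, rng=rng)
print(f"farms excluded: {share_farms:.1%}, value excluded: {share_value:.1%}")
print(f"mean relative error: {errors.mean():.3f}, std: {errors.std():.3f}")
```

In this sketch the expansion estimator targets only the included farms, so its expected relative error equals minus the excluded share of the total. That gap is what inference on the excluded farms would have to fill.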

In summary, the project can be broken down into four phases:
1. Preliminary analysis
2. Methodology development
3. Empirics
4. Conclusions and recommendations

Preliminary analysis. The first step of the project comprises i) a survey and review of the existing literature on these issues and ii) the construction of the dataset for the empirical analysis. The review of the issues related to the definition of farm will allow us to identify the key elements and variables in the process. The dataset will be built by processing the census data and linking them with additional data sources such as administrative data, sample surveys, and the population and industry censuses. From these sources it is possible to identify the vector of auxiliary information used to estimate the data for the excluded farms.

Methodology development. In this step, the operative unit will perform four main tasks: identification of the population distribution of the target variable, identification of the vector of auxiliary information, identification of an evaluation criterion for the definitions, and development of an algorithm for the threshold values of each criterion. In summary:
a. The approach will estimate the excluded observations by using regression estimators and auxiliary information from a variety of sources. In particular, we will use generalized regression estimators.
b. An unbiased specification of the information vectors for each variable, for the excluded or partially included observation categories, is used as a starting point for optimizing the threshold values of the discriminating variables. In particular, global optimization algorithms will be used, with objective functions based on the estimates of the variables from point a.
c. The information vectors will be used to design a sampling plan with multivariate optimal allocation, aimed at obtaining information about the excluded observations. The allocation model is modified to take into account the presence of regression estimators. The model will use the Bethel-Chromy allocation algorithm.
d. Finally, a general mixed logit, based on the aforementioned vector of auxiliary information, will estimate the characteristics of the excluded farms from the list of observations by small area. In this way, the quantitative information from the classical small area estimators is combined with qualitative information. The characteristics of the model will be specified based on estimates obtained by Markov Chain Monte Carlo methods.
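Point a refers to generalized regression (GREG) estimators. As a rough illustration, the sketch below implements the textbook GREG estimator of a total, which corrects the weighted sample total using known population totals of the auxiliary variables. The project's actual specification is not given in the text, so the function name, the unit variance structure, and the synthetic data are all assumptions.

```python
import numpy as np

def greg_total(y, X, weights, x_totals):
    """Textbook GREG estimate of the population total of y.

    y        : sampled values of the target variable, shape (n,)
    X        : sampled auxiliary variables, shape (n, p)
    weights  : design weights (inverse inclusion probabilities), shape (n,)
    x_totals : known population totals of the auxiliary variables, shape (p,)
    """
    # Weighted least-squares coefficients of y on X.
    B = np.linalg.solve(X.T @ (weights[:, None] * X), X.T @ (weights * y))
    ht_y = weights @ y   # expansion (Horvitz-Thompson) estimate of the y total
    ht_x = weights @ X   # expansion estimates of the auxiliary totals
    # Correct the expansion estimate by the auxiliary-total discrepancy.
    return ht_y + (x_totals - ht_x) @ B

# Synthetic check (assumption): y is an exact linear function of the
# auxiliary variables, so GREG should recover the true total.
rng = np.random.default_rng(1)
N, n = 1000, 50
x_pop = rng.uniform(0.0, 10.0, size=N)
X_pop = np.column_stack([np.ones(N), x_pop])  # intercept plus one auxiliary
y_pop = 2.0 + 3.0 * x_pop

idx = rng.choice(N, size=n, replace=False)    # simple random sample
w = np.full(n, N / n)
est = greg_total(y_pop[idx], X_pop[idx], w, X_pop.sum(axis=0))
print(est, y_pop.sum())
```

When the target variable is an exact linear function of the auxiliary variables, as in this synthetic check, the estimate matches the true total up to floating-point error. With real data the gain depends on how well the auxiliary information predicts the target.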

Empirics. In this step, the census data and the auxiliary sources will be used to evaluate the alternative farm definition criteria. In particular, we will assess both the distortions in the total estimates and the predictive power of the new sample for inference about the excluded observations. The study will compare the existing definition criteria with the optimal criterion identified in the previous step.

Conclusions and recommendations. The last step of the project proposes a cost-benefit evaluation of the classification criteria. Using the standard cost measures developed by ISTAT, it is possible to estimate the future cost of censuses and sample surveys. The results from step 3 will be used to compare the savings with the loss of information. The results will be presented according to two guiding principles: maximization of information under a cost constraint and minimization of cost under an information constraint.
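The two presentation principles can be read as two constrained selections over the candidate criteria. The sketch below shows one simple way to express them. The criterion names, costs, and information-loss figures are hypothetical placeholders, not project results, and summarizing information loss in a single number is itself a simplifying assumption.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Criterion:
    name: str
    cost: float       # estimated survey cost (hypothetical units)
    info_loss: float  # summary measure of information distortion (hypothetical)

def max_info_under_cost(criteria, budget):
    """Least information loss among the criteria whose cost fits the budget."""
    feasible = [c for c in criteria if c.cost <= budget]
    return min(feasible, key=lambda c: c.info_loss) if feasible else None

def min_cost_under_info(criteria, max_loss):
    """Cheapest criterion whose information loss stays within the tolerance."""
    feasible = [c for c in criteria if c.info_loss <= max_loss]
    return min(feasible, key=lambda c: c.cost) if feasible else None

# Placeholder candidates, for illustration only.
candidates = [
    Criterion("A", cost=100.0, info_loss=0.00),
    Criterion("B", cost=70.0, info_loss=0.03),
    Criterion("C", cost=55.0, info_loss=0.08),
]

print(max_info_under_cost(candidates, budget=80.0))
print(min_cost_under_info(candidates, max_loss=0.05))
```

With these placeholder figures both selections pick criterion B. With real estimates the two principles need not agree, which is a reason to present both.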