A Neural Network Architecture for Data Editing in the Bank of Italy's Business Surveys
This paper presents an application of neural network models to predictive classification for data quality control. Our aim is to identify data affected by measurement error in the Bank of Italy's business surveys. We build an architecture consisting of three feed-forward networks, for variables related to employment, sales and investment respectively. The networks are trained on input matrices that are extracted from the error-free final survey database for the 2003 wave and then subjected to stochastic transformations reproducing known error patterns. A binary indicator of unit perturbation is used as the output variable. The networks are trained with the Resilient Propagation learning algorithm. On the training and validation sets, correct predictions occur in about 90 percent of the records for employment, 94 percent for sales, and 75 percent for investment. On independent test sets, the respective shares average 92, 80 and 70 percent. On our data, neural networks perform much better as classifiers than logistic regression, one of the most popular competing methods. They appear to provide a valid means of improving the efficiency of the quality control process and, ultimately, the reliability of survey data.
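As an illustration of how such a training set could be assembled, the following minimal sketch perturbs a random subset of clean records and records a binary indicator of perturbation as the target. It is not the procedure used in the paper: the function name `perturb_records`, the `error_rate` parameter, the example field values and the single error pattern (a unit-of-measure slip by a factor of 1000) are all assumptions made for the example.

```python
import random


def perturb_records(records, error_rate=0.3, seed=0):
    """Return (inputs, labels) built from clean records.

    labels[i] is 1 if record i was perturbed and 0 otherwise.
    Hypothetical helper: the error pattern below is an assumption,
    not one of the patterns documented in the paper.
    """
    rng = random.Random(seed)
    inputs, labels = [], []
    for record in records:
        row = list(record)  # copy, so the clean data stay untouched
        if rng.random() < error_rate:
            j = rng.randrange(len(row))
            # Assumed error pattern: unit-of-measure slip (x1000 or /1000).
            row[j] = row[j] * rng.choice([1000.0, 0.001])
            labels.append(1)
        else:
            labels.append(0)
        inputs.append(row)
    return inputs, labels


# Illustrative clean records: (employees, sales, investment), made-up values.
clean = [
    [120.0, 35000.0, 2100.0],
    [48.0, 9100.0, 300.0],
    [510.0, 160000.0, 12500.0],
    [75.0, 18000.0, 950.0],
]
X, y = perturb_records(clean, error_rate=0.5, seed=42)
```

The pairs `(X, y)` would then serve as network inputs and binary targets. In the architecture described above, one such training set would be built separately for each of the three variable groups.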