A generic tool to assess the impact of changing edit rules in a business survey - an application to the UK Annual Business Inquiry Part 2
Business surveys often use complex sets of edit rules (edits, for short) to check returned questionnaires (records), locate suspicious or unacceptable responses, and support data cleaning before the survey responses are used to estimate the required target parameters. These sets of edits are complex because they may involve large numbers of survey questionnaires and variables, they may contain a large number of edits, and the edits may depend on a large number of tolerance parameters. When such sets of edits are used, they may cause large numbers of record failures and generate substantial revision costs, especially if edit failures are dealt with by clerical operations, such as reviewing original paper questionnaires or digital images of these, and re-contacting businesses to clarify or correct the responses provided. Costs can be high both in the resources required and in the timeliness of survey processing, because they delay the availability of the survey data for estimation and publication.
In this paper we describe a generic tool developed through a collaboration between the University of Southampton and the ONS. The tool helps to assess the potential impact of changing the edits in a specified business survey. It is a SAS macro, written in the IML language, that calculates a number of edit performance and data quality indicators. Changes that aim to ‘relax’ the existing edits, so that failure rates decrease and efficiency savings are achieved, are assessed by means of several edit-related performance indicators, such as failure rates, hit rates and false hit rates. Data quality indicators include the proportion of errors missed and estimates of the bias that results from missing errors under a specified revision of the set of edits.
Edit designers and managers can then aim to fine-tune their edits so that failure rates, false hit rates and editing costs are reduced while data quality is preserved. An illustration is provided by applying the tool to revise the edits used for the UK Annual Business Inquiry Part 2 for the reference year 2007.
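The indicators named above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the SAS/IML macro described in this paper. The definitions in it are assumptions, since the text above does not define the indicators: a "hit" is taken to be a failing record whose reported value differs from its fully edited value, a "missed error" is an erroneous record that passes the candidate edits, and the bias is taken relative to the total of the fully edited values. The `Record` class, the `edit_indicators` function and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Record:
    raw: float    # value as reported by the business
    clean: float  # value after full editing (assumed here to be the truth)
    fails: bool   # True if the record fails the candidate set of edits


def edit_indicators(records):
    """Compute illustrative edit performance and data quality indicators."""
    n = len(records)
    failed = [r for r in records if r.fails]
    in_error = [r for r in records if r.raw != r.clean]
    hits = [r for r in failed if r.raw != r.clean]
    missed = [r for r in in_error if not r.fails]

    # Assumption: only failing records get corrected; passing records
    # keep their reported values, so missed errors feed into the total.
    total_candidate = sum(r.clean if r.fails else r.raw for r in records)
    total_clean = sum(r.clean for r in records)  # assumed non-zero

    return {
        "failure_rate": len(failed) / n,
        "hit_rate": len(hits) / len(failed) if failed else 0.0,
        "false_hit_rate": (len(failed) - len(hits)) / len(failed) if failed else 0.0,
        "errors_missed": len(missed) / len(in_error) if in_error else 0.0,
        "relative_bias": (total_candidate - total_clean) / total_clean,
    }


# Hypothetical sample: one false hit, one hit, one missed error.
records = [
    Record(raw=100, clean=100, fails=False),
    Record(raw=250, clean=250, fails=True),    # fails, but no error: false hit
    Record(raw=9000, clean=900, fails=True),   # fails and is in error: hit
    Record(raw=40, clean=50, fails=False),     # error that passes: missed
    Record(raw=300, clean=300, fails=False),
]
result = edit_indicators(records)
```

Under these assumed definitions, relaxing an edit typically lowers `failure_rate` and `false_hit_rate` and may raise `errors_missed` and the magnitude of `relative_bias`, which is the trade-off the tool is meant to expose.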