Using Stata to Manage and Create a Research Data Bank

We manage a longitudinal research data bank that contains 3,000 variables and grows by 25,000 observations per year. Data are batch-converted from SQL to Stata daily, producing 20 preliminary data sets. We then use Stata to perform quality control on the data and to prepare a single research data set, which the data analyst can augment as required by calling specialized programs that access the additional data sets. Our philosophy is that most of the quality control, programming, and data set preparation should be built into the data set creation process rather than left to the data user. For example, data quality checks and complex data preparation of items such as costs and hospital and mortality codes are programmed into the data set creation process, and relevant additional data sets are created automatically to reflect such new data. The basic data set consists of the research and control variables needed for most analyses. With simple programming statements such as -getwork- and -getcosts-, preprocessed work and cost data, for example, are merged with the basic data set. Global macros identify file locations, database versions, and variable sets, making updating and sharing simple.
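The pattern described in the abstract, where global macros locate versioned files and a one-word program merges preprocessed data onto the basic data set, might look roughly like the following minimal Stata sketch. It is not the authors' code: the macro names DBDIR and DBVER, the variables patid, year, age, and totalcost, the file names, and the body of getcosts are all hypothetical stand-ins, and the merge syntax assumes Stata 11 or later rather than the 2003-era syntax the authors would have used.

```stata
* Illustrative sketch only: every name, path, and value here is an assumption.
clear all
set more off

* Global macros identify the file location and database version (assumed names).
global DBDIR "."
global DBVER "v1"

* Toy preprocessed cost data, standing in for one of the additional data sets.
input long patid int year double totalcost
1 2001 1200
1 2002 1500
2 2001  800
end
save "${DBDIR}/costs_${DBVER}.dta", replace

* Toy basic data set of research and control variables.
clear
input long patid int year byte age
1 2001 60
1 2002 61
2 2001 45
2 2002 46
end
save "${DBDIR}/basic_${DBVER}.dta", replace

* Hypothetical stand-in for -getcosts-: merge costs onto the data in memory,
* keeping every observation of the basic set whether or not costs exist.
capture program drop getcosts
program define getcosts
    version 11
    merge 1:1 patid year using "${DBDIR}/costs_${DBVER}.dta", ///
        keep(master match) nogenerate
end

* Analyst workflow: load the basic set, then augment it with one call.
use "${DBDIR}/basic_${DBVER}.dta", clear
getcosts
list, sepby(patid)
```

Because the file paths are built from the global macros, pointing the same program at a new database version or a colleague's directory would, under these assumptions, only require changing the two global definitions at the top.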


Year of publication:	2003-01-08
Authors:	Wolfe, Frederick ; Michaud, Kaleb
Institutions:	Stata User Group


Series:	North American Stata Users' Group Meetings 2003.
Type of publication:	Book / Working Paper
Notes:	The text is part of the series North American Stata Users' Group Meetings 2003, Number 11
Source:	RePEc - Research Papers in Economics

Persistent link: https://www.econbiz.de/10005102757