Title: SURVEY DATA REPAIR USING HOT-DECK IMPUTATION PROCEDURE
Accession Number: 00939796
Record Type: Component
Availability: Transportation Research Board Business Office 500 Fifth Street, NW
Abstract: The objective of any survey is to estimate finite population quantities, such as the population mean of a variable, from sample quantities such as the sample mean. Accurate determination of the population values requires a complete response for the desired variables among the sampling units. However, it is common to find non-responses among sampling units for some variables, such as income. The problem of incomplete data has received attention only recently, although the errors associated with incomplete data have been recognized for the last three or four decades. The costs associated with missing data can be very large, given that the average cost of a household travel survey is on the order of $150 and that an entire household can be lost because of one or two items of missing data. The objective of this study was to repair survey data, i.e., to find a method to replace missing data items. A review of the literature showed that imputation methods for replacing missing data are gaining importance, especially in the fields of Biometrics and Agriculture. This study attempted to correct item non-response and unit non-response (where an entire record is missing) in transportation surveys using the Hot-Deck imputation procedure. The primary input to this study was the Baton Rouge Personal Transportation Survey data collected in 1997, which has unit and item non-response. In this research, an effective method for data repair was identified and the data were repaired so that no household is excluded from analysis. For unit non-response, the data were studied before imputation to improve their completeness by inference from other members of the household who had responded in the survey. Inference corrected the non-response to an extent, although not completely.
Finally, imputation produced a complete data set. Comparing the statistics obtained from the repaired data (using Hot-Deck imputation) with those from the unrepaired data showed that the survey estimates changed after imputation. In addition, a test was run in which certain items in complete data were made artificially missing and then repaired by the same procedure. This test showed that Hot-Deck imputation provided estimates that were closer to the true values than those obtained from either the data with missing items or the data excluding households with missing items. These results indicate the importance of accurate coding of the survey data and the need to repair data by inference and other data repair methods before any analysis. The paper provides a clear procedure for repairing data using both inference and Hot-Deck imputation that could be applied to any survey data.
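Illustrative note: the abstract does not state which imputation classes or donor-selection rule the authors used, so the following is only a minimal sketch of a generic random within-class hot deck, not the paper's procedure. The function name `hot_deck_impute`, the field names (`hh_size`, `vehicles`, `income`) and the sample values are all hypothetical. A missing item is filled with the value reported by a randomly chosen respondent (donor) that matches the recipient on the class variables.

```python
import random
from collections import defaultdict


def hot_deck_impute(records, target, class_vars, seed=0):
    """Return copies of `records` with missing `target` values (None) filled
    from a random donor in the same imputation class.

    Hypothetical helper: the class variables and the random donor rule are
    assumptions made for illustration, not taken from the paper.
    """
    rng = random.Random(seed)

    # Build the donor pool: observed target values grouped by class.
    donors = defaultdict(list)
    for rec in records:
        if rec[target] is not None:
            key = tuple(rec[v] for v in class_vars)
            donors[key].append(rec[target])

    repaired = []
    for rec in records:
        new_rec = dict(rec)  # leave the input records untouched
        if new_rec[target] is None:
            pool = donors.get(tuple(rec[v] for v in class_vars))
            if pool:  # stays missing when the class has no donor
                new_rec[target] = rng.choice(pool)
        repaired.append(new_rec)
    return repaired


# Made-up household records; None marks item non-response on income.
households = [
    {"hh_size": 2, "vehicles": 1, "income": 35000},
    {"hh_size": 2, "vehicles": 1, "income": None},
    {"hh_size": 4, "vehicles": 2, "income": 60000},
    {"hh_size": 4, "vehicles": 2, "income": 72000},
    {"hh_size": 4, "vehicles": 2, "income": None},
    {"hh_size": 1, "vehicles": 0, "income": None},
]
repaired = hot_deck_impute(households, "income", ["hh_size", "vehicles"])
```

In this sketch a record whose class has no donor is left missing rather than guessed. A validation run like the one the abstract describes could blank out known values, impute them, and compare the resulting estimates with the true ones.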
Supplemental Notes: The CD-ROM contains the proceedings of the sixth, seventh and eighth conferences. The eighth conference proceedings were published in October 2001.
Language: English
Corporate Authors: Transportation Research Board 500 Fifth Street, NW
Authors: Dudala, T; Stopher, P R
Editors: Donnelly, R; Bennett, G
Pagination: p. 109-121
Publication Date: 2002
Conference: Eighth TRB Conference on the Application of Transportation Planning Methods
Location: Corpus Christi, Texas
Features: References (6); Tables (13)
Subject Areas: Highways; Planning and Forecasting; Public Transportation; I72: Traffic and Transport Planning
Files: TRIS, TRB
Created Date: Mar 19 2003 12:00AM