Missing Data and Multiple Imputation
This online course provides an introduction to the theory and application of Multiple Imputation (MI) (Rubin 1987), which has become a very popular way of handling missing data because it allows for valid statistical inference in the presence of missing data. With the advent of MI algorithms implemented in standard statistical software (R, SAS, Stata, SPSS, …), the method has become more accessible to data analysts. For didactic purposes, we start by introducing some naive ways of handling missing data and examine their weaknesses in order to build an understanding of the Multiple Imputation framework. The first day of the course is somewhat theoretical in nature, but we believe that a fundamental understanding of the MI principle helps participants adapt to a wider range of practical problems than a focus on a few select situations would. We will subsequently shift to the more practical aspects of statistical analysis with missing data and address frequent problems such as regression with missing data. Further examples, predominantly based on the statistical language R, will be covered throughout the course. We recommend basic R skills for this course, but it is possible to follow the course contents without prior knowledge of R, as the main MI algorithms are almost identical across all major software packages.
For additional details on the course and a day-to-day schedule, please download the full-length syllabus.
Lecturer(s): Florian Meinfelder, Angelina Hammon