class: center, middle, inverse, title-slide # Missing data methods I ### Dr. Olanrewaju Michael Akande ### Oct 24, 2019 --- ## Announcements - Final reports for Team Project 2 now due 11:59pm, Monday, November 4. - Project proposals are now due 11:59pm, Tuesday, November 5. --- ## Outline - Missing data + Introduction + Types of missing data and mechanisms + Mathematical formulation - Strategies for handling missing data + Complete cases/available cases analyses + Single imputation methods + Multiple imputation --- class: center, middle # Missing data --- ## Motivation - Most real-world datasets suffer from nonresponse, that is, they contain missing values. -- - Ideally, analysts should decide how to deal with missing data before moving on to analysis. -- - One needs to make assumptions and ask tons of questions, for example, + why are the values missing? + what is the pattern of missingness? + what is the proportion of missing values in the data? -- - As a Bayesian, one could treat the missing values as parameters and estimate them simultaneously with the analysis, but even in that case, one must still ask the same questions. -- - Ask as many questions as possible to help you figure out the most plausible assumptions! --- ## Motivation - Simplest approach: complete/available case analyses -- delete cases with missing data. Often problematic because: + it is sometimes just not feasible (small `\(n\)`, large `\(p\)` problem) -- when we have a small number of observations but a large number of variables, we simply cannot afford to throw away data, even when the proportion of missing data is small. -- + information loss -- even when we do not have the small `\(n\)`, large `\(p\)` problem, we still lose information when we delete cases. -- + biased results -- because the missing data mechanism is rarely random, features of the observed data can be completely different from those of the missing data. -- - More principled approach: impute the missing data (in a statistically proper fashion) and analyze the imputed data. --- ## Why should we care? - <font color="red">Loss of power</font> due to the smaller sample size + can't regain lost power -- - Any analysis must make an <font color="red">untestable assumption</font> about the missing data + wrong assumption `\(\Rightarrow\)` <font color="red">biased estimates</font> -- - Some popular analyses with missing data get <font color="red">biased standard errors</font> + resulting in wrong p-values and confidence intervals -- - Some popular analyses with missing data are <font color="red">inefficient</font> + so that confidence intervals are wider than they need to be --- ## What to do: loss of power Approach by design: - minimize the amount of missing data + good communication with participants, for example, patients in a clinical trial, participants in surveys and censuses, etc. + follow up as much as possible; make repeated attempts using different methods -- - reduce the impact of missing data + collect reasons for missing data + collect information predictive of missing values --- ## What to do: analysis - A suitable method of analysis would: + make the correct (or plausible) assumption about the missing data + give an unbiased estimate (under that assumption) + give an unbiased standard error (so that p-values and confidence intervals are correct) + be efficient (make the best use of the available data) -- - However, we can never be sure about what the correct assumption is `\(\Rightarrow\)` sensitivity analyses are essential! --- ## How to approach the analysis? - Start by knowing: + extent of missing data + pattern of missing data (e.g., is `\(X_1\)` always missing whenever `\(X_2\)` is also missing?) + predictors of missing data and of outcome (see the short R sketch below)
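A minimal R sketch for inspecting the extent and pattern of missingness; `df`, `x1`, and `x2` are hypothetical names used only for illustration, and `mice::md.pattern()` is one packaged option for the pattern summary.

```r
# Minimal sketch: extent and pattern of missingness in a data frame `df`
# (`df`, `x1`, `x2` are hypothetical names used only for illustration).
colMeans(is.na(df))               # proportion missing per variable
table(x1_missing = is.na(df$x1),  # e.g., is x1 missing whenever x2 is?
      x2_missing = is.na(df$x2))
# library(mice); md.pattern(df)   # tabulates the joint missingness patterns
```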
-- - Principled approach to missing data: + identify a plausible assumption (through discussions between you as a data scientist and your clients) + choose an analysis method that's valid under that assumption -- - Just because a method is simple to use does not make it plausible; some analysis methods are simple to describe but have complex and/or implausible assumptions. --- ## Types of nonresponse (missing data) - .hlight[Unit nonresponse]: the individual has no values recorded for any of the variables. For example, when participants do not complete a survey questionnaire at all. - .hlight[Item nonresponse]: the individual has values recorded for at least one variable, but not all variables. <table> <caption>Unit nonresponse vs item nonresponse</caption> <tr> <th> </th> <th height="30px" colspan="3">Variables</th> </tr> <tr> <th> </th> <td height="30px" style="text-align:center" width="100px"> X<sub>1</sub> </td> <td style="text-align:center" width="100px"> X<sub>2</sub> </td> <td style="text-align:center" width="100px"> Y </td> </tr> <tr> <td height="30px" style="text-align:left"> Complete cases </td> <td style="text-align:center"> ✔ </td> <td style="text-align:center"> ✔ </td> <td style="text-align:center"> ✔ </td> </tr> <tr> <td rowspan="3"> Item nonresponse </td> <td rowspan="3" style="text-align:center"> ✔ </td> <td height="30px" style="text-align:center"> ✔ </td> <td style="text-align:center"> ❓ </td> </tr> <tr> <td height="30px" style="text-align:center"> ❓ </td> <td style="text-align:center"> ❓ </td> </tr> <tr> <td height="30px" style="text-align:center"> ❓ </td> <td style="text-align:center"> ✔ </td> </tr> <tr> <td height="30px" style="text-align:left"> Unit nonresponse </td> <td style="text-align:center"> ❓ </td> <td style="text-align:center"> ❓ </td> <td style="text-align:center"> ❓ </td> </tr> </table>
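To make the distinction concrete, here is a small hedged R sketch that classifies each row of a hypothetical survey data frame `df` into the three groups in the table:

```r
# Sketch: complete case vs item vs unit nonresponse (hypothetical `df`).
n_miss <- rowSums(is.na(df))        # number of missing items per respondent
status <- ifelse(n_miss == 0,        "complete case",
          ifelse(n_miss == ncol(df), "unit nonresponse",
                                     "item nonresponse"))
table(status)
```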
--- ## Types of missing data mechanisms - Data are said to be .hlight[missing completely at random (MCAR)] if the reason for missingness does not depend on the values of the observed data or the missing data. -- - For example, suppose - you handed out a double-sided survey questionnaire of 20 questions to a sample of participants; - questions 1-15 were on the first page but questions 16-20 were at the back; and - some of the participants did not respond to questions 16-20. -- - Then, the values for questions 16-20 for those people who did not respond would be .hlight[missing completely at random] if they simply did not realize the pages were double-sided; they had no reason to ignore those questions. -- - This is rarely plausible in practice! --- ## Types of missing data mechanisms - Data are said to be .hlight[missing at random (MAR)] if the reason for missingness may depend on the values of the observed data but not the missing data (conditional on the values of the observed data). -- - Using our previous example, suppose - questions 1-15 include demographic information such as age and education; - questions 16-20 include income-related questions; and - once again, some of the participants did not respond to questions 16-20. -- - Then, the values for questions 16-20 for those people who did not respond would be .hlight[missing at random] if younger people are more likely not to respond to those income-related questions than older people, where age is observed for all participants. -- - This is the most commonly assumed mechanism in practice! --- ## Types of missing data mechanisms - Data are said to be .hlight[missing not at random (MNAR or NMAR)] if the reason for missingness depends on the actual values of the missing (unobserved) data. -- - Continuing with our previous example, suppose again that - questions 1-15 include demographic information such as age and education; - questions 16-20 include income-related questions; and - once again, some of the participants did not respond to questions 16-20. -- - Then, the values for questions 16-20 for those people who did not respond would be .hlight[missing not at random] if people who earn more money are less likely to respond to those income-related questions than people who earn less. -- - This is usually the case in real analyses, but the analysis can be complex! --- ## Mathematical formulation - Consider the classical multiple regression setting with .block[ .small[ `$$\boldsymbol{Y} = \boldsymbol{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}; \ \ \boldsymbol{\epsilon} \sim N(0, \sigma^2_\epsilon \boldsymbol{I})$$` ] ] where `\(\boldsymbol{I}\)` is the identity matrix and .block[ .small[ $$ \boldsymbol{Y} = `\begin{bmatrix} y_1 \\ y_2 \\ \vdots\\ y_n \\ \end{bmatrix}` \hspace{0.5em} \boldsymbol{X} = `\begin{bmatrix} 1 & x_{11} & x_{12} & \ldots & x_{1p} \\ 1 & x_{21} & x_{22} & \ldots & x_{2p} \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & x_{n1} & x_{n2} & \ldots & x_{np} \\ \end{bmatrix}` \hspace{0.5em} \boldsymbol{\beta} = `\begin{bmatrix} \beta_0\\ \beta_1\\ \beta_2 \\ \vdots \\ \beta_p \\ \end{bmatrix}` \hspace{0.5em} \boldsymbol{\epsilon} = `\begin{bmatrix} \epsilon_1\\ \epsilon_2 \\ \vdots \\ \epsilon_n \\ \end{bmatrix}` $$ ] ] -- - Suppose for now that `\(\boldsymbol{Y}\)` contains missing values but `\(\boldsymbol{X}\)` is fully observed. -- - We can separate `\(\boldsymbol{Y}\)` into the observed and missing parts, that is, `\(\boldsymbol{Y} = (\boldsymbol{Y}_{obs},\boldsymbol{Y}_{mis})\)`. -- - We can also do the same for `\(\boldsymbol{X}\)` if it contains missing values. --- ## Mathematical formulation - Let + `\(r_i = 1\)` when `\(y_i\)` is missing, + `\(r_i = 0\)` when `\(y_i\)` is observed. -- - Also, let + `\(\boldsymbol{R} = (r_1,\ldots,r_n)\)` (this is the vector of missing indicators for `\(\boldsymbol{Y}\)`), + `\(\boldsymbol{\theta}\)` be the set of parameters associated with `\(\boldsymbol{R}\)`. -- - When `\(\boldsymbol{X}\)` contains missing values, we can also create a vector of missing indicators for each variable in `\(\boldsymbol{X}\)` with missing entries. -- - Assume `\(\boldsymbol{\theta}\)` and `\((\boldsymbol{\beta},\sigma^2_\epsilon)\)` are distinct, that is, the parameters of the missingness mechanism carry no information about `\((\boldsymbol{\beta},\sigma^2_\epsilon)\)`. (The sketch below shows how `\(\boldsymbol{R}\)` is constructed in practice.)
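In R, the indicator vector `\(\boldsymbol{R}\)` and the split into `\(\boldsymbol{Y}_{obs}\)` and `\(\boldsymbol{Y}_{mis}\)` fall directly out of `is.na()`; a tiny sketch with made-up values:

```r
# Sketch: building the missingness indicator R for Y (toy values).
y <- c(3.1, NA, 5.7, NA, 2.2)  # Y with two missing entries
r <- as.integer(is.na(y))      # r_i = 1 when y_i is missing, 0 when observed
y_obs <- y[r == 0]             # Y_obs = (3.1, 5.7, 2.2)
which(r == 1)                  # positions of Y_mis: 2 and 4
```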
--- ## Mathematical formulation - MCAR: .block[ `$$f(\boldsymbol{R} | \boldsymbol{Y},\boldsymbol{X},\boldsymbol{\theta},\boldsymbol{\beta},\sigma^2_\epsilon) = f(\boldsymbol{R} | \boldsymbol{\theta})$$` ] -- - MAR: .block[ `$$f(\boldsymbol{R} | \boldsymbol{Y},\boldsymbol{X},\boldsymbol{\theta},\boldsymbol{\beta},\sigma^2_\epsilon) = f(\boldsymbol{R} | \boldsymbol{Y}_{obs},\boldsymbol{X},\boldsymbol{\theta})$$` ] -- - MNAR: .block[ `$$f(\boldsymbol{R} | \boldsymbol{Y},\boldsymbol{X},\boldsymbol{\theta},\boldsymbol{\beta},\sigma^2_\epsilon) = f(\boldsymbol{R} | \boldsymbol{Y}_{obs},\boldsymbol{Y}_{mis},\boldsymbol{X},\boldsymbol{\theta})$$` ] --- ## Implications for the likelihood function - Each type of mechanism has a different implication for the likelihood of the observed data `\(\boldsymbol{Y}_{obs}\)` and the missing data indicator `\(\boldsymbol{R}\)`. -- - Without missingness in `\(\boldsymbol{Y}\)`, the likelihood function is .block[ `$$L(\boldsymbol{\beta},\sigma^2_\epsilon | \boldsymbol{Y}_{obs}, \boldsymbol{X}) \propto f(\boldsymbol{Y}_{obs} | \boldsymbol{X}, \boldsymbol{\beta},\sigma^2_\epsilon)$$` ] -- - With missingness in `\(\boldsymbol{Y}\)`, the likelihood function is instead .block[ $$ `\begin{split} L(\boldsymbol{\beta},\sigma^2_\epsilon,\boldsymbol{\theta} | \boldsymbol{Y}_{obs},\boldsymbol{X}, \boldsymbol{R}) & \propto f(\boldsymbol{Y}_{obs}, \boldsymbol{R} |\boldsymbol{X}, \boldsymbol{\beta},\sigma^2_\epsilon,\boldsymbol{\theta}) \\ & = \int f(\boldsymbol{R} | \boldsymbol{Y}_{obs},\boldsymbol{Y}_{mis},\boldsymbol{X},\boldsymbol{\theta}) f(\boldsymbol{Y}_{obs},\boldsymbol{Y}_{mis} | \boldsymbol{X}, \boldsymbol{\beta},\sigma^2_\epsilon) \textrm{d}\boldsymbol{Y}_{mis} \\ \end{split}` $$ ] -- - Since we do not actually observe `\(\boldsymbol{Y}_{mis}\)`, we would like to be able to integrate it out so we don't have to deal with it directly. --- ## Likelihood function: MCAR For MCAR, we have: .block[ $$ `\begin{split} f(\boldsymbol{Y}_{obs}, \boldsymbol{R} | \boldsymbol{X},\boldsymbol{\beta},\sigma^2_\epsilon,\boldsymbol{\theta}) & = \int f(\boldsymbol{R} | \boldsymbol{Y}_{obs},\boldsymbol{Y}_{mis},\boldsymbol{X},\boldsymbol{\theta}) f(\boldsymbol{Y}_{obs},\boldsymbol{Y}_{mis} | \boldsymbol{X}, \boldsymbol{\beta},\sigma^2_\epsilon) \textrm{d}\boldsymbol{Y}_{mis} \\ & = \int f(\boldsymbol{R} | \boldsymbol{\theta}) f(\boldsymbol{Y}_{obs},\boldsymbol{Y}_{mis} | \boldsymbol{X}, \boldsymbol{\beta},\sigma^2_\epsilon) \textrm{d}\boldsymbol{Y}_{mis} \\ & = f(\boldsymbol{R} | \boldsymbol{\theta}) \int f(\boldsymbol{Y}_{obs},\boldsymbol{Y}_{mis} | \boldsymbol{X}, \boldsymbol{\beta},\sigma^2_\epsilon) \textrm{d}\boldsymbol{Y}_{mis} \\ & = f(\boldsymbol{R} | \boldsymbol{\theta}) f(\boldsymbol{Y}_{obs} | \boldsymbol{X}, \boldsymbol{\beta},\sigma^2_\epsilon). \\ \end{split}` $$ ] -- That is, the likelihood function is .block[ $$ `\begin{split} L(\boldsymbol{\beta},\sigma^2_\epsilon,\boldsymbol{\theta} | \boldsymbol{Y}_{obs}, \boldsymbol{X}, \boldsymbol{R}) & \propto f(\boldsymbol{Y}_{obs}, \boldsymbol{R} | \boldsymbol{X}, \boldsymbol{\beta},\sigma^2_\epsilon,\boldsymbol{\theta}) \\ & = f(\boldsymbol{R} | \boldsymbol{\theta}) f(\boldsymbol{Y}_{obs} | \boldsymbol{X}, \boldsymbol{\beta},\sigma^2_\epsilon). \\ \end{split}` $$ ] -- For inference on `\(\boldsymbol{\beta}\)` and `\(\sigma^2_\epsilon\)`, we can simply focus on `\(f(\boldsymbol{Y}_{obs} | \boldsymbol{X}, \boldsymbol{\beta},\sigma^2_\epsilon)\)` in the likelihood function, since the factor `\(f(\boldsymbol{R} | \boldsymbol{\theta})\)` involves neither `\(\boldsymbol{Y}_{obs}\)` nor `\((\boldsymbol{\beta},\sigma^2_\epsilon)\)`.
--- ## Likelihood function: MAR For MAR, we have: .block[ $$ `\begin{split} f(\boldsymbol{Y}_{obs}, \boldsymbol{R} | \boldsymbol{X},\boldsymbol{\beta},\sigma^2_\epsilon,\boldsymbol{\theta}) & = \int f(\boldsymbol{R} | \boldsymbol{Y}_{obs},\boldsymbol{Y}_{mis},\boldsymbol{X},\boldsymbol{\theta}) f(\boldsymbol{Y}_{obs},\boldsymbol{Y}_{mis} | \boldsymbol{X}, \boldsymbol{\beta},\sigma^2_\epsilon) \textrm{d}\boldsymbol{Y}_{mis} \\ & = \int f(\boldsymbol{R} | \boldsymbol{Y}_{obs},\boldsymbol{X},\boldsymbol{\theta}) f(\boldsymbol{Y}_{obs},\boldsymbol{Y}_{mis} | \boldsymbol{X}, \boldsymbol{\beta},\sigma^2_\epsilon) \textrm{d}\boldsymbol{Y}_{mis} \\ & = f(\boldsymbol{R} | \boldsymbol{Y}_{obs},\boldsymbol{X},\boldsymbol{\theta}) \int f(\boldsymbol{Y}_{obs},\boldsymbol{Y}_{mis} | \boldsymbol{X}, \boldsymbol{\beta},\sigma^2_\epsilon) \textrm{d}\boldsymbol{Y}_{mis} \\ & = f(\boldsymbol{R} | \boldsymbol{Y}_{obs},\boldsymbol{X},\boldsymbol{\theta}) f(\boldsymbol{Y}_{obs} | \boldsymbol{X}, \boldsymbol{\beta},\sigma^2_\epsilon). \\ \end{split}` $$ ] -- That is, the likelihood function is .block[ $$ `\begin{split} L(\boldsymbol{\beta},\sigma^2_\epsilon,\boldsymbol{\theta} | \boldsymbol{Y}_{obs}, \boldsymbol{X},\boldsymbol{R}) & \propto f(\boldsymbol{Y}_{obs}, \boldsymbol{R} | \boldsymbol{X},\boldsymbol{\beta},\sigma^2_\epsilon,\boldsymbol{\theta}) \\ & = f(\boldsymbol{R} | \boldsymbol{Y}_{obs},\boldsymbol{X},\boldsymbol{\theta}) f(\boldsymbol{Y}_{obs} | \boldsymbol{X}, \boldsymbol{\beta},\sigma^2_\epsilon). \\ \end{split}` $$ ] -- For inference on `\(\boldsymbol{\beta}\)` and `\(\sigma^2_\epsilon\)`, we can again focus on `\(f(\boldsymbol{Y}_{obs} | \boldsymbol{X}, \boldsymbol{\beta},\sigma^2_\epsilon)\)` in the likelihood function: although `\(f(\boldsymbol{R} | \boldsymbol{Y}_{obs},\boldsymbol{X},\boldsymbol{\theta})\)` does involve the observed data, it carries no information about `\((\boldsymbol{\beta},\sigma^2_\epsilon)\)` once we assume `\(\boldsymbol{\theta}\)` is distinct from `\((\boldsymbol{\beta},\sigma^2_\epsilon)\)`. This is why MAR (with distinctness) is often called .hlight[ignorable] missingness. --- ## Likelihood function: MNAR For MNAR, we have: .block[ $$ `\begin{split} f(\boldsymbol{Y}_{obs}, \boldsymbol{R} | \boldsymbol{X},\boldsymbol{\beta},\sigma^2_\epsilon,\boldsymbol{\theta}) & = \int f(\boldsymbol{R} | \boldsymbol{Y}_{obs},\boldsymbol{Y}_{mis},\boldsymbol{X},\boldsymbol{\theta}) f(\boldsymbol{Y}_{obs},\boldsymbol{Y}_{mis} | \boldsymbol{X}, \boldsymbol{\beta},\sigma^2_\epsilon) \textrm{d}\boldsymbol{Y}_{mis}. \\ \end{split}` $$ ] -- + The likelihood under MNAR cannot be simplified any further. + In this case, we cannot ignore the missingness mechanism when making inferences about `\(\boldsymbol{\beta}\)` and `\(\sigma^2_\epsilon\)`. --- ## Types of missing data mechanisms: how to tell in practice? So how can we tell the type of mechanism we are dealing with? -- In general, we don't know!!! -- - Rare that data are MCAR (unless planned beforehand) -- - Possible that data are MNAR -- - Compromise: assume data are MAR if we include enough variables in the model for the missing data indicator `\(\boldsymbol{R}\)`. (A partial empirical check for MCAR is sketched below.)
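One partial empirical check is Little's (1988) chi-squared test of MCAR; it can provide evidence against MCAR but can never separate MAR from MNAR. A hedged sketch, assuming the naniar package's `mcar_test()` implementation:

```r
# Sketch: Little's test of MCAR, assuming naniar provides mcar_test()
# and `df` is a hypothetical data frame with missing values.
library(naniar)
mcar_test(df)  # a small p-value is evidence against MCAR
# NB: failing to reject does not establish MCAR, and no test based on
# the observed data alone can distinguish MAR from MNAR.
```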
--- class: center, middle # Strategies for handling missing data --- ## Strategies for handling missing data - Item nonresponse: + complete/available cases analyses + single imputation methods + multiple imputation + model-based methods -- - Unit nonresponse: + weighting adjustments + model-based methods (identifiability issues!). -- - We will only focus on item nonresponse. If you are interested in building models for both unit and item nonresponse, here is a paper on research I have done on the topic: https://arxiv.org/pdf/1907.06145.pdf --- ## Complete/available cases analyses What can happen when using available case analyses with different types of missing data? -- - MCAR: unbiased when disregarding missing data; variance increases (we lose partially complete cases) -- - MAR: biased (depending on the strength of MAR and the amount of missing data) when the missing data mechanism is not modeled; variance increases (we lose partially complete cases). -- - MNAR: generally biased! --- ## Why should we care? - <div class="question"> Why should we care in practice? What does bias really mean here? How exactly does using only the complete cases affect our results for the three mechanisms? </div> - Let's attempt to answer these questions via simulations. - Set `\(n = 10,000\)`. For `\(i=1,\ldots,n\)`, generate + `\(x_i \overset{iid}{\sim} N(2, 1); \ \ \ y_i|x_i \overset{iid}{\sim} N(-1 + 2 x_{i}, \sigma^2=5^2)\)` + `\(r_i|y_i, x_i \sim \textrm{Bernoulli}(\pi_i); \ \ \ \textrm{log}\left(\dfrac{\pi_i}{1-\pi_i}\right) = \theta_0 + \theta_1 y_i + \theta_2 x_i\)` - Next, set `\(y_i\)` missing whenever `\(r_i = 1\)`. - Set different values for `\(\boldsymbol{\theta} = (\theta_0, \theta_1, \theta_2)\)` to reflect MCAR, MAR and MNAR. - Let's use the R script [here](https://ids-702-f19.github.io/Course-Website/slides/lec-slides/MissingDataSim.R); a minimal sketch of the same idea follows below.
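Here is a minimal sketch of that simulation (not the course's MissingDataSim.R script itself; the `\(\boldsymbol{\theta}\)` values below are illustrative choices). Compare each complete-case fit with the true coefficients `\((-1, 2)\)`:

```r
# Minimal sketch of the simulation above; theta values are illustrative.
set.seed(702)
n <- 10000
x <- rnorm(n, mean = 2, sd = 1)
y <- rnorm(n, mean = -1 + 2 * x, sd = 5)

complete_case_fit <- function(theta0, theta1, theta2) {
  logit_pi <- theta0 + theta1 * y + theta2 * x  # log-odds of missingness
  r <- rbinom(n, 1, plogis(logit_pi))           # r_i = 1 => y_i missing
  y_cc <- ifelse(r == 1, NA, y)
  coef(lm(y_cc ~ x))  # lm() drops the incomplete cases by default
}

complete_case_fit(-1, 0,   0)    # MCAR: depends on neither y nor x
complete_case_fit(-1, 0,   0.5)  # MAR:  depends only on the observed x
complete_case_fit(-1, 0.5, 0)    # MNAR: depends on the missing y itself
```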
--- ## Single imputation methods -- - Marginal/conditional mean imputation -- - Nearest neighbor imputation: + hot deck imputation + cold deck imputation -- - Use an observation from a previous time period (for panel data) + LOCF -- last observation carried forward + BOCF -- baseline observation carried forward --- ## Mean imputation Plug in the variable mean for missing values. -- - Point estimates of means OK under MCAR. -- - Variances and covariances underestimated. -- - Distributional characteristics altered. -- - Regression coefficients inaccurate. -- Similar problems arise for plug-in conditional means. --- ## Nearest neighbor imputation Plug in donors' observed values. -- - Hot deck: for each non-respondent, find a respondent who "looks like" the non-respondent in the same dataset. -- - Cold deck: find potential donors in an external but similar dataset. For example, respondents from a 2016 election poll survey might serve as potential donors for non-respondents in the 2018 version of the same survey. -- - Common metrics: statistical distance, adjustment cells, propensity scores. --- ## Nearest neighbor imputation - Point estimates of means OK under MAR. -- - Variances and covariances underestimated. -- - Distributional characteristics OK. -- - Regression coefficients OK under MAR. --- ## Multiple imputation (MI) - Fill in the dataset `\(m\)` times with imputations. -- - Analyze the repeated datasets separately, then combine the estimates from each one. -- - Imputations drawn from probability models for missing data. -- <img src="img/MultipleImp.png" height="370px" style="display: block; margin: auto;" /> --- ## MI example Suppose - `\(Y =\)` income (unit of measurement is $10,000) -- - `\(X =\)` level of education (0 = undergraduate, 1 = graduate) -- <img src="img/MIExample-I.png" height="370px" style="display: block; margin: auto;" /> --- ## MI: inferences from multiply-imputed datasets Rubin (1987) - Population estimand: `\(Q\)` - Sample estimate: `\(q\)` - Variance of `\(q\)`: `\(u\)` - In each imputed dataset `\(d_j\)`, where `\(j = 1,\ldots,m\)`, calculate `$$q_j = q(d_j)$$` `$$u_j = u(d_j)$$` --- ## MI example: inferences from multiply-imputed datasets Suppose we are interested in estimating the mean income in our example. Then - `\(Q = \mu_Y\)` -- - `\(q = \bar{y} = \dfrac{1}{n} \sum\limits_{i=1}^{n} y_i\)` -- - `\(u = \hat{\mathbb{V}}[\bar{y}] = \dfrac{s^2}{n}\)` -- - In each imputed dataset `\(d_j\)`, calculate `$$q_j = \bar{y}_j \ \ \ \textrm{and} \ \ \ u_j = \dfrac{s_j^2}{n}$$` --- ## MI: quantities needed for inference - .block[ `$$\bar{q}_m = \sum\limits_{j=1}^m \dfrac{q_j}{m}$$` ] - .block[ `$$b_m = \sum\limits_{j=1}^m \dfrac{(q_j - \bar{q}_m)^2}{m-1}$$` ] - .block[ `$$\bar{u}_m = \sum\limits_{j=1}^m \dfrac{u_j}{m}$$` ] --- ## MI: inferences from multiply-imputed datasets - MI estimate of `\(Q\)`: .block[ `$$\bar{q}_m$$` ] -- - MI estimate of variance: .block[ `$$T_m = (1+1/m)b_m + \bar{u}_m$$` ] -- - Use t-distribution inference for `\(Q\)` .block[ `$$\bar{q}_m \pm t_{1-\alpha/2} \sqrt{T_m}$$` ] where the t quantile has degrees of freedom `\(\nu_m = (m-1)\left(1 + \dfrac{\bar{u}_m}{(1+1/m)b_m}\right)^2\)` (Rubin, 1987). .hlight[Notice that the variance incorporates uncertainty both from within and between the m datasets.] --- ## MI example Back to our income example, <img src="img/MIExample-II.png" height="400px" style="display: block; margin: auto;" /> -- By the way, `\(\bar{y}=12.64\)` from the "true complete dataset". --- ## MI example - MI estimate of `\(Q\)`: .small[ `$$\bar{q}_m = \sum\limits_{j=1}^m \dfrac{q_j}{m} = \dfrac{12.66 + 13.14 + 12.90}{3} = 12.90$$` ] -- - Between variance: .small[ `$$b_m = \sum\limits_{j=1}^m \dfrac{(q_j - \bar{q}_m)^2}{m-1} = 0.06$$` ] -- - Within variance: .small[ `$$\bar{u}_m = \sum\limits_{j=1}^m \dfrac{u_j}{m} = \dfrac{0.37 + 0.29 + 0.32}{3} = 0.33$$` ] -- - MI estimate of variance: .small[ `$$T_m = (1+1/m)b_m + \bar{u}_m = (1+1/3)(0.06) + 0.33 = 0.41$$` ] -- <div class="question"> Where should the imputations come from? We will answer that in the next class! </div>
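Before then, as a quick check on the arithmetic above, here is a small R sketch of Rubin's combining rules; the function name `pool_rubin` is ours (in practice, e.g., `mice::pool()` automates this).

```r
# Sketch: Rubin's (1987) combining rules; `pool_rubin` is our own name.
pool_rubin <- function(q, u, alpha = 0.05) {
  m     <- length(q)
  q_bar <- mean(q)                  # MI point estimate
  b     <- var(q)                   # between-imputation variance
  u_bar <- mean(u)                  # within-imputation variance
  T_m   <- (1 + 1 / m) * b + u_bar  # total variance
  nu    <- (m - 1) * (1 + u_bar / ((1 + 1 / m) * b))^2  # Rubin's df
  ci    <- q_bar + c(-1, 1) * qt(1 - alpha / 2, nu) * sqrt(T_m)
  list(estimate = q_bar, T_m = T_m, df = nu, ci = ci)
}

# Income example: q_j = 12.66, 13.14, 12.90 and u_j = 0.37, 0.29, 0.32
pool_rubin(q = c(12.66, 13.14, 12.90), u = c(0.37, 0.29, 0.32))
# estimate = 12.90 and T_m of about 0.40-0.41, matching the slides' rounding
```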