
Include independent variables with missing values

Posted: 25 Jan 2022, 15:14
by Anna
Hi,

I am trying to find out how to handle missing data in the following case: a variable that significantly influences the choices in my model has some missing values (not missing at random). At the moment I simply delete all rows with a missing value. However, these observations are valuable and I would rather not delete them.

Is there a way to include the variable (if available) and sort of ignore it if the value is missing? Could I apply the method for Joint estimation of multiple model components and treat the data as two sets of data, one with the information and one without?

Thanks,
Anna

Re: Include independent variables with missing values

Posted: 25 Jan 2022, 15:26
by stephanehess
Anna

It's quite a common case, and one that is easily accommodated via a separate parameter.

So let's imagine we're looking at age, and that this is measured continuously in the data, with -99 for missing.

Then let's say we want to interact time sensitivity with age. You would then use:

( beta_time + shift_btime_age * ( age > 0 ) * age + shift_btime_age_missing * ( age == -99 ) ) * time

So there would be a separate effect on time sensitivity for those with missing age, while the age interaction only applies where age is observed.
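To see what the two indicator terms do, here is a minimal sketch that evaluates the expression for a single respondent. It is plain Python rather than model syntax, the function names are made up for illustration, and the parameter values are hypothetical, not estimates.

```python
MISSING = -99  # missing-age code, as in the expression above


def time_coefficient(age, beta_time, shift_btime_age, shift_btime_age_missing):
    """Time sensitivity for one respondent, given their (possibly missing) age."""
    observed = 1 if age > 0 else 0        # plays the role of ( age > 0 )
    missing = 1 if age == MISSING else 0  # plays the role of ( age == -99 )
    return (beta_time
            + shift_btime_age * observed * age
            + shift_btime_age_missing * missing)


def time_utility(age, time, beta_time, shift_btime_age, shift_btime_age_missing):
    """Contribution of travel time to utility: coefficient times time."""
    coefficient = time_coefficient(age, beta_time, shift_btime_age,
                                   shift_btime_age_missing)
    return coefficient * time


# Hypothetical parameter values, for illustration only
params = dict(beta_time=-0.05, shift_btime_age=0.0005,
              shift_btime_age_missing=0.01)

u_observed = time_utility(age=40, time=30, **params)      # age shift applies
u_missing = time_utility(age=MISSING, time=30, **params)  # only the missing shift applies
```

For the respondent with age 40 the missing-age shift drops out, and for the respondent coded -99 the -99 never gets multiplied into the utility, because ( age > 0 ) is zero for that row.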

Stephane

Re: Include independent variables with missing values

Posted: 25 Jan 2022, 15:34
by Anna
Brilliant! That makes sense. The suggestions on stat stackexchange had me a little worried.

Thanks for the quick response!

Cheers
Anna