Include independent variables with missing values

Posted: 25 Jan 2022, 15:14
by Anna
Hi,

I am trying to find out how to handle missing data in the following case: I have a variable with some missing values (not missing at random). In a model where I simply delete all the rows with a missing value, this variable significantly influences the choices. However, these observations are valuable and I would rather not delete them.

Is there a way to include the variable (if available) and sort of ignore it if the value is missing? Could I apply the method for Joint estimation of multiple model components and treat the data as two sets of data, one with the information and one without?

Thanks,
Anna

Re: Include independent variables with missing values

Posted: 25 Jan 2022, 15:26
by stephanehess
Anna

it's quite a common case, and one that is easily accommodated via a separate parameter.

So let's imagine we're looking at age, and that this is measured continuously in the data, with -99 for missing.

Then let's say we want to interact time sensitivity with age. You would then use:

( beta_time + shift_btime_age * ( age > 0 ) * age + shift_btime_age_missing * ( age == -99 ) ) * time

So there would be a separate effect for those with missing age. Because ( age > 0 ) is zero for them, the -99 code itself never enters the utility.
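
For anyone who wants to check the arithmetic outside Apollo, here is a minimal numerical sketch of the expression above in Python with NumPy. It is not Apollo code: the function name `time_coefficient` and the parameter values are made up for illustration, and only the -99 coding and the parameter names come from the post.

```python
import numpy as np

MISSING = -99  # missing-value code for age, as in the post


def time_coefficient(age, beta_time, shift_btime_age, shift_btime_age_missing):
    """Per-row coefficient on time (hypothetical helper, for illustration only)."""
    age = np.asarray(age, dtype=float)
    observed = age > 0        # True where age was actually measured
    missing = age == MISSING  # True where age is coded as missing
    # Observed rows get the linear age shift; missing rows get their own constant shift.
    return (beta_time
            + shift_btime_age * observed * age
            + shift_btime_age_missing * missing)


# Illustrative values only, not estimates from any model.
age = np.array([30.0, 50.0, -99.0])
coef = time_coefficient(age,
                        beta_time=-0.05,
                        shift_btime_age=-0.001,
                        shift_btime_age_missing=-0.02)
time = np.array([20.0, 20.0, 20.0])
utility_time_part = coef * time
```

The third respondent, whose age is missing, only picks up beta_time plus shift_btime_age_missing, so the -99 is never multiplied into the utility.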

Stephane

Re: Include independent variables with missing values

Posted: 25 Jan 2022, 15:34
by Anna
Brilliant! That makes sense. The suggestions on stat stackexchange had me a little worried.

Thanks for the quick response!

Cheers
Anna

Re: Include independent variables with missing values

Posted: 08 Feb 2024, 02:36
by Koolau
Greetings. While looking to see how Apollo handles missing values, I noticed this earlier post. The solution of creating a new parameter makes sense if there are only one or a few demographic variables with missing values. However, once we start working with latent variables, it's quite common to have at least a small number of missing values for multiple indicator variables. Have there been any new developments in Apollo to handle missing values?

Thanks for any feedback on this.

Re: Include independent variables with missing values

Posted: 05 May 2024, 12:10
by stephanehess
Hi

In that case, the missing data is in a dependent variable (an indicator in the hybrid choice model), and you would skip those rows for that dependent variable only. See the 'rows' setting.
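
To illustrate the idea of skipping rows for one dependent variable only, here is a rough sketch in Python with NumPy. It is not Apollo's implementation. It assumes, purely for illustration, a continuous indicator with a normal measurement model and a -99 missing code; the function `indicator_loglik` and all values are hypothetical. Rows flagged as excluded contribute a log-likelihood of zero (a likelihood of one) for this indicator, so the same respondents still count fully in the choice model and in the other indicators.

```python
import numpy as np

MISSING = -99  # assumed missing-value code for the indicator


def indicator_loglik(indicator, fitted_mean, sigma, rows):
    """Row-wise log-likelihood of a normally distributed indicator.

    `rows` is a boolean vector: True = include the row, False = skip it.
    Skipped rows contribute 0 to the log-likelihood (likelihood of 1).
    """
    indicator = np.asarray(indicator, dtype=float)
    fitted_mean = np.broadcast_to(np.asarray(fitted_mean, dtype=float),
                                  indicator.shape)
    rows = np.asarray(rows, dtype=bool)
    ll = np.zeros(indicator.shape)
    z = (indicator[rows] - fitted_mean[rows]) / sigma
    ll[rows] = -0.5 * z ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))
    return ll


# Illustrative data: the second respondent did not answer this indicator.
indicator = np.array([3.0, -99.0, 4.0])
rows = indicator != MISSING
ll = indicator_loglik(indicator, fitted_mean=3.5, sigma=1.0, rows=rows)
```

Each indicator would get its own `rows` vector, so a respondent with one missing indicator is dropped from that measurement equation only.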

Stephane