Data set with missing values

Posted: 03 Apr 2022, 11:08
by Blake Huang
Hello Prof. Hess,

My data set inherently has missing values. I designed 9 SP scenarios in which the set of variables differs across blocks of 3 scenarios, while the alternatives are the same in all scenarios. For example, Scenarios 1-3 include 3 variables, Scenarios 4-6 include 2 additional variables, and Scenarios 7-9 include 3 further variables. Since Apollo cannot deal with a data set containing missing values, what should be done with the data in this case?

ID  X1  X2   X3  X4  X5  X6  X7  X8  choice
1   3   85   14  -   -   -   -   -   2
2   2   105  11  -   -   -   -   -   1
3   3   95   14  -   -   -   -   -   3
4   1   105  11  50  3   -   -   -   3
5   1   85   8   30  1   -   -   -   1
6   3   105  14  50  3   -   -   -   2
7   1   1    1   1   1   1   1   1   1
8   2   2    2   2   2   2   2   2   2
9   3   3    3   3   3   3   3   3   3

Best regards
Yue

Re: Data set with missing values

Posted: 04 Apr 2022, 17:23
by stephanehess
Hi

could you be a bit more specific about the setup of your data and how you want to model it?

Thanks

Re: Data set with missing values

Posted: 05 Apr 2022, 03:26
by Blake Huang
Hi, Prof. Hess,
I am very sorry for not describing my problem clearly. For example, let's assume that each respondent is required to complete 18 (9+9) hypothetical scenarios.

Scenarios 1-9: 8 attributes, with the 9 scenarios generated through a uniform design.
Scenarios 10-18: 10 attributes (the same 8 plus 2 newly added ones), again with 9 scenarios generated through a uniform design.

An illustrative data format is shown as follows:
ID X1 X2 X3 X4 X5 choice
1 1 1 1 - - 1
1 2 2 2 - - 2
1 3 3 3 - - 3
1 4 4 4 4 4 3
1 5 5 5 5 5 2
1 6 6 6 6 6 1

Since the choice options are the same in all scenarios, can we pool the data from both sets of scenarios to estimate a single MNL model?

Re: Data set with missing values

Posted: 05 Apr 2022, 10:49
by stephanehess
Sorry, this is still not clear. Can you show the entire data, plus your proposed model specification?

Re: Data set with missing values

Posted: 05 Apr 2022, 14:02
by Blake Huang
Dear Prof. Hess,

This is my model specification, and you can check the attachment for the complete data format.

V[['alt1']] = asc_1 + b_tt * cartime + b_pollution * carpollution

V[['alt2']] = asc_2 + b_tt * prtime + b_fee * prfee + b_capacity_2 * capacity + b_comfort2_2 * comfort2 + b_comfort3_2 * comfort3 + b_ratio_2 * ratio + b_comment1_2 * comment1 + b_comment2_2 * comment2 + b_pollution * prpollution

V[['alt3']] = asc_3 + b_tt * bustime + b_fee * busfee + b_pollution * buspollution

Re: Data set with missing values

Posted: 06 Apr 2022, 07:58
by stephanehess
Hi

You can just set the attribute to 0 when it's missing. See also this example file, where the service quality attribute is not present for RP in the data: http://apollochoicemodelling.com/files/ ... NL_RP_SP.r

Stephane
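
As a minimal sketch of the suggestion above, the missing entries can be replaced with 0 in base R before the data is passed to Apollo. The data frame and the column names X4 and X5 are hypothetical, loosely mirroring the illustrative format earlier in the thread; they are not taken from the attached data.

```r
# Hypothetical data: X4 and X5 are only present in the later scenarios (NA elsewhere)
database <- data.frame(
  ID     = c(1, 1, 1, 1, 1, 1),
  X1     = c(1, 2, 3, 4, 5, 6),
  X4     = c(NA, NA, NA, 4, 5, 6),
  X5     = c(NA, NA, NA, 4, 5, 6),
  choice = c(1, 2, 3, 3, 2, 1)
)

# Columns that are absent by design in some scenarios (assumed names)
partial_cols <- c("X4", "X5")

# Replace NA with 0 so that beta * attribute adds nothing to the utility
for (col in partial_cols) {
  database[[col]][is.na(database[[col]])] <- 0
}

# Sanity check: no missing values remain
stopifnot(!anyNA(database))
```

With the attribute set to 0, its term drops out of the utility in the scenarios where it was not shown, so the associated coefficient is informed only by the scenarios that include it.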

Re: Data set with missing values

Posted: 08 Apr 2022, 12:51
by Blake Huang
That's exactly how I deal with it. I just want to make sure it's the right way to do it. Thank you, Prof. Hess.

Re: Data set with missing values

Posted: 08 Apr 2022, 13:16
by stephanehess
The only thing you need to be careful with in that context is categorical variables. For a continuous variable, using 0 for missing data will of course make sense, but for a categorical variable, you need to think about whether 0 is also already a level that is used.
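
One possible way to handle the categorical case, sketched here under assumptions, is to build the dummy variables explicitly so that a missing attribute yields 0 on every dummy, and to keep a separate indicator for whether the attribute was shown at all. The variable `comfort`, its 0/1/2 coding, and the `comfort_shown` indicator are hypothetical and are not part of the specification posted above.

```r
# Hypothetical categorical attribute coded 0/1/2, where 0 is a real level;
# NA marks scenarios in which the attribute was not shown
comfort <- c(NA, NA, 0, 1, 2, 0)

# Indicator for whether the attribute was presented at all
comfort_shown <- as.numeric(!is.na(comfort))

# Dummies for the non-base levels; all are 0 when the attribute is missing
comfort1 <- as.numeric(!is.na(comfort) & comfort == 1)
comfort2 <- as.numeric(!is.na(comfort) & comfort == 2)

# Simply recoding NA as 0 would make rows 1-2 indistinguishable from rows 3 and 6
naive <- ifelse(is.na(comfort), 0, comfort)
```

Whether an extra term for `comfort_shown` belongs in the utility is a modelling decision. Without it, "not shown" and the base level are treated as having the same effect.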