Best way to model directly correlated attributes.
Posted: 24 Feb 2021, 08:33
Hello everyone,
I am estimating a mixed logit model on data from an unlabeled stated choice experiment with four attributes.
I will explain my problem using a similar, generic situation. The attributes are:
1) Attribute A
2) Attribute B
3) Cash back
4) Total Cost
I can think of three ways to estimate the parameters:
a) Keep everything as it is and estimate parameters for all four attributes.
b) Modify the data slightly after data collection, as follows, and still estimate four parameters:
1) Attribute A
2) Attribute B
3) Cash back
4) Net cost = (Total cost - Cash back)
c) Reduce the four attributes to three after data collection and estimate three parameters:
1) Attribute A
2) Attribute B
3) Net cost = (Total cost - Cash back)
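To make the relationship between the three specifications concrete, here is a minimal sketch that assumes a linear-in-parameters utility with fixed coefficients. The function names, attribute values and coefficient values are all hypothetical and are not taken from my data. Under that assumption, b) is just a reparameterisation of a), and c) is b) with the Cash back coefficient fixed at zero.

```python
import math

def net_cost(total_cost, cash_back):
    """Net cost as defined in b) and c): Total cost - Cash back."""
    return total_cost - cash_back

def utility_a(x, b_a, b_b, b_cb, b_tc):
    """Specification a): Cash back and Total cost enter separately."""
    return (b_a * x["A"] + b_b * x["B"]
            + b_cb * x["cash_back"] + b_tc * x["total_cost"])

def utility_b(x, b_a, b_b, g_cb, g_nc):
    """Specification b): Cash back and Net cost enter separately."""
    return (b_a * x["A"] + b_b * x["B"]
            + g_cb * x["cash_back"]
            + g_nc * net_cost(x["total_cost"], x["cash_back"]))

def utility_c(x, b_a, b_b, g_nc):
    """Specification c): only Net cost, i.e. b) with g_cb fixed at 0."""
    return utility_b(x, b_a, b_b, 0.0, g_nc)

# Hypothetical alternative and coefficients (illustrative values only).
alt = {"A": 1.0, "B": 2.0, "cash_back": 5.0, "total_cost": 40.0}
b_a, b_b, b_cb, b_tc = 0.5, -0.25, 0.125, -0.0625

# a) with (b_cb, b_tc) gives the same utility as b) with (b_cb + b_tc, b_tc).
u_a = utility_a(alt, b_a, b_b, b_cb, b_tc)
u_b = utility_b(alt, b_a, b_b, b_cb + b_tc, b_tc)

# c) gives the same utility as a) under the restriction b_cb = -b_tc.
u_c = utility_c(alt, b_a, b_b, b_tc)
u_a_restricted = utility_a(alt, b_a, b_b, -b_tc, b_tc)

print(math.isclose(u_a, u_b), math.isclose(u_c, u_a_restricted))
```

With random coefficients the same mapping holds draw by draw, but independent distributions for the a) coefficients imply correlated coefficients in b), so mixed logit versions of a) and b) with independent random parameters need not fit identically.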
Here are the results from the different models; I would like guidance on which is the best way to model this.
- For models a) and b), the final log-likelihoods are almost the same, with similar AIC and BIC values. However, the correlation between the parameter estimates for Cash back and Total cost in a) is higher than the correlation between the estimates for Cash back and Net cost in b), which seems expected (the correlations among the other attributes also decrease). Between a) and b), the estimates for Attribute A and Attribute B change only slightly. The cost parameter changes substantially, but the Cash back parameter changes the most.
- When modelling as per c), the fit worsens slightly by comparison (in terms of LL, possibly because there is one fewer explanatory variable). The estimates for Attribute A and Attribute B change slightly, but the Net cost estimate changes substantially.
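One way the loss of fit in c) could be judged is a likelihood-ratio test against a model that nests it, for example b) with the Cash back coefficient restricted to zero. The sketch below is only an illustration: the log-likelihood values are made up and are not my results, the helper names are hypothetical, and `df=2` assumes the restriction removes both a mean and a standard deviation for Cash back. When a restricted standard deviation sits at its boundary of zero, the usual chi-square reference distribution is only approximate.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + 2*phi(sqrt(x)) * sum_r x^(r - 1/2) / (1*3*...*(2r-1))
    phi = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = 0.0, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        term = math.sqrt(x) if r == 1 else term * x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(x / 2)) + 2 * phi * total

def likelihood_ratio_test(ll_restricted, ll_unrestricted, df):
    """Return the LR statistic 2*(LL_u - LL_r) and its chi-square p-value."""
    stat = 2.0 * (ll_unrestricted - ll_restricted)
    return stat, chi2_sf(stat, df)

# Hypothetical log-likelihoods, NOT results from the models in this post.
ll_b = -1000.0  # larger model, e.g. b)
ll_c = -1003.0  # restricted model, e.g. c)

stat, p_value = likelihood_ratio_test(ll_c, ll_b, df=2)
print(f"LR statistic = {stat:.3f}, p-value = {p_value:.4f}")
```

A small p-value would suggest that the restriction behind c) (Cash back valued exactly like a cost reduction) is rejected by the data; a large one would suggest that c) is an acceptable simplification.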
So my question is: since the choice tasks were presented in format a), should I stick with model a)? Or, since the correlations (among all attributes) are lower in model b), should I use model b), on the assumption that respondents applied an attribute processing strategy when choosing among the alternatives? Or even model c)? (I ask because these models imply different trade-off values; apologies if this is in fact a straightforward question.)