
Dummy coding of categorical variables in MNL_SP example.

Bob123
Posts: 9
Joined: 20 Jun 2023, 13:39

Post by Bob123 »

Dear Prof. Hess,

This might be a simple question, but it might also be useful to others starting out in Apollo.

In the Apollo MNL_SP model example, there is a categorical attribute for service (service_rail), which has 4 levels: 1 for no-frills, 2 for wifi, 3 for food, 0 if not used. Also in this example, "b_no_frills" is kept at its starting value by listing it in apollo_fixed. In the utility functions, the levels of this attribute enter as: ( service_rail == 1 ), ( service_rail == 2 ) and ( service_rail == 3 ).

I have a couple of questions relating to this example and to dummy coding of categorical variables in Apollo.

Is my understanding of coding categorical variables (with more than two categories) in Apollo correct:

1) The ability to write utility functions in this way in Apollo for categorical attributes (e.g., service_rail == 2) removes the need to create multiple dummy variables in the data (csv file). It is my understanding that in most other analyses (outside of Apollo), we would usually create k-1 dummy (binary) variables (where k is the number of levels, here 4) and select one level as the reference category by coding it as 0 on all of the dummies, with the other levels interpreted relative to it? In the MNL_SP csv file, for example, there are no such dummy variables, which makes me think that this is the case. This was the only case I could find in the Apollo examples where a categorical variable with more than 2 levels was used in any of the csv files, so I wanted to be clear that my understanding was correct. I think this is due to the ease of interpretation of binary categorical variables versus those with more than 2 levels. As a follow-on the interpretation then

2) Also, listing b_no_frills in apollo_fixed effectively performs the same function as selecting a reference category by coding it as all zeros on the dummies, if the data were coded that way. Including this level (service_rail == 1) in the utility functions is therefore optional, as no beta will be estimated for it, but having a 0.000 in the output simplifies interpretation?

Thanks in advance,

Robin
stephanehess
Site Admin
Posts: 998
Joined: 24 Apr 2020, 16:29

Re: Dummy coding of categorical variables in MNL_SP example.

Post by stephanehess »

Hi

yes, there is no need to create separate attributes in what some people refer to as recoding of the attributes. You can just do it directly in the utility functions.
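
As a minimal sketch of this point, the base R snippet below (no Apollo required) compares the two approaches. The data vector and the coefficient values are made up for illustration and are not taken from the MNL_SP example or its estimates; only the level coding (1 = no-frills, 2 = wifi, 3 = food, 0 = not used) follows the description in the question.

```r
# Hypothetical data, coded as described in the thread:
# 1 = no-frills, 2 = wifi, 3 = food, 0 = attribute not present
service_rail <- c(1, 2, 3, 0, 2, 1)

# Illustrative coefficient values (assumptions, not estimates)
b_no_frills <- 0
b_wifi      <- 1.0
b_food      <- 0.4

# Approach 1: logical comparisons written directly in the utility expression.
# R treats TRUE as 1 and FALSE as 0 when multiplying.
v_inline <- b_no_frills * (service_rail == 1) +
            b_wifi      * (service_rail == 2) +
            b_food      * (service_rail == 3)

# Approach 2: dummy columns created in the data beforehand
database <- data.frame(service_rail = service_rail)
database$d_no_frills <- as.numeric(database$service_rail == 1)
database$d_wifi      <- as.numeric(database$service_rail == 2)
database$d_food      <- as.numeric(database$service_rail == 3)

v_dummies <- b_no_frills * database$d_no_frills +
             b_wifi      * database$d_wifi +
             b_food      * database$d_food

# Both approaches give the same contribution to utility
print(all.equal(v_inline, v_dummies))
```

The comparison in parentheses simply builds the dummy on the fly, so the extra columns in the csv file add nothing to the model.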

Btw, this attribute has 3 levels, not 4. The "not used" level is really a separate thing: the attribute doesn't exist at all in the RP data. In effect, though, this is then the same as no-frills.
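
A small base R sketch of why the two coincide, assuming b_no_frills is fixed at a starting value of 0 (as the 0.000 in the output mentioned in the question suggests). The values for b_wifi and b_food are illustrative assumptions only.

```r
# One observation per coded value: 0 = not used, 1 = no-frills, 2 = wifi, 3 = food
service_rail <- c(0, 1, 2, 3)

b_no_frills <- 0    # assumed fixed at 0, i.e. the base level
b_wifi      <- 1.0  # illustrative value, not an estimate
b_food      <- 0.4  # illustrative value, not an estimate

# Service contribution with the base level written out explicitly
v_with_base <- b_no_frills * (service_rail == 1) +
               b_wifi      * (service_rail == 2) +
               b_food      * (service_rail == 3)

# Same contribution with the base-level term dropped
v_without_base <- b_wifi * (service_rail == 2) +
                  b_food * (service_rail == 3)

# Rows with service_rail 0 and 1 both receive a contribution of 0,
# and dropping the fixed term changes nothing.
print(data.frame(service_rail, v_with_base, v_without_base))
```

Under that assumption, keeping the b_no_frills term in the utility function only affects how the output reads, not the utilities themselves.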

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk