Loading tab-delimited data

Posted: 03 Feb 2021, 10:49
by svenne
Dear all,

I am running a post-estimation analysis of market share predictions for segments (subgroups) of my sample. The sample is cross-sectional and has 434 observations.

I estimated a simple MNL model named MyModel. The database contains a categorical variable inc with 5 levels.

When I run

Code:

sharesTest_settings=list()
sharesTest_settings[["alternatives"]] = c(A=1,B=2,C=3,D=4,E=5)
sharesTest_settings[["choiceVar"]]    = database$choice
sharesTest_settings[["subsamples"]]   = list(
                                             inc10k = (database$inc == 1),
                                             inc20k = (database$inc == 2),
                                             inc30k = (database$inc == 3),
                                             inc40k = (database$inc == 4),
                                             inc50k = (database$inc == 5)
                                           )
apollo_inputs = apollo_validateInputs()
apollo_sharesTest(MyModel,apollo_probabilities,apollo_inputs,sharesTest_settings)
I get

Code:

Error in data.frame(..., check.names = FALSE) : 
  arguments imply differing number of rows: 0, 434
In addition: Warning message:
In split.default(x = seq_len(nrow(x)), f = f, drop = drop, ...) :
  data length is not a multiple of split variable
A web search suggests that this may have something to do with the row IDs, but I am stuck here. Any help would be greatly appreciated.
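Since the error mentions a length of 0 against 434 rows, one quick sanity check is that every subsample is a logical vector with one entry per database row. A minimal sketch on toy data (the toy values and the helper name check_subsamples are hypothetical and not part of Apollo):

```r
# Toy stand-in for the real data: 10 rows, income category 1..5 (hypothetical values)
database <- data.frame(ID = 1:10, inc = rep(1:5, times = 2))

subsamples <- list(
  inc10k = (database$inc == 1),
  inc20k = (database$inc == 2)
)

# Hypothetical helper (not an Apollo function): returns TRUE for each subsample
# that is a logical vector with exactly one entry per row and no missing values
check_subsamples <- function(subsamples, n_rows) {
  vapply(subsamples,
         function(s) is.logical(s) && length(s) == n_rows && !anyNA(s),
         logical(1))
}

ok <- check_subsamples(subsamples, nrow(database))
print(ok)
```

If all entries come back TRUE, as they should for comparisons like database$inc == 1, the subsample definitions themselves are unlikely to be the source of the length mismatch.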

Best
Sven

Re: Error for market share recovery for subgroups of data (Sec. 9.6)

Posted: 03 Feb 2021, 17:28
by stephanehess
Hi Sven

this is not an error message we've ever come across.

Can you confirm which version of Apollo you're using? Also, are you making any changes to the data between estimation and calling apollo_sharesTest?

Stephane

Re: Error for market share recovery for subgroups of data (Sec. 9.6)

Posted: 03 Feb 2021, 17:52
by svenne
Hi Stephane,

Thanks for the prompt reply. I am using Apollo 0.2.2, and I do not modify the data between estimation and prediction.

However, I think I have narrowed down the error to some extent: in estimation, the reported number of individuals is wrong. See the following excerpt from the model output.

Model run using Apollo for R, version 0.2.2 on Darwin by svenmueller
www.ApolloChoiceModelling.com

Model name : Base
Model description : Basline model: challenge question 1
Model run at : 2021-02-03 12:13:37
Estimation method : bfgs
Model diagnosis : successful convergence
Number of individuals : 1
Number of rows in database : 434
Number of modelled outcomes : 434

However, apollo_control does point to the ID column:

Code:

apollo_initialise()
apollo_control=list(modelName="Base",
                    modelDescr="Basline model",
                    indivID="ID",
                    panelData = FALSE
)
My database looks like this:

Code:

> head(database[,22:26])
# A tibble: 6 x 5
  avail3 avail4 avail5  ones    ID
   <dbl>  <dbl>  <dbl> <dbl> <int>
1      1      0      1     1     1
2      1      0      1     1     2
3      1      0      1     1     3
4      1      0      1     1     4
5      1      0      1     1     5
6      1      0      1     1     6
When I run out-of-sample validation, I get

Code:

> ## cross Validation / Out of Sample Test
> apollo_outOfSample(apollo_beta, apollo_fixed,
+                    apollo_probabilities, apollo_inputs)
10 separate runs will be conducted, each using a random subset of 90% for estimation and
  the remainder for validation.
Error in apollo_outOfSample(apollo_beta, apollo_fixed, apollo_probabilities,  : 
  validationSize must be between 1 and (nIndivs-1).
I strongly suspect that both errors (out-of-sample validation and market shares for subsamples) stem from the wrong number of individuals.

Unfortunately, I do not see how to correct this.

Best
Sven

Re: Error for market share recovery for subgroups of data (Sec. 9.6)

Posted: 03 Feb 2021, 18:07
by stephanehess
Sven

very strange, you may have found a bug. Could you drop the line panelData = FALSE, as you don't need it, and let's see if this changes things.

Thanks

Stephane

Re: Error for market share recovery for subgroups of data (Sec. 9.6)

Posted: 03 Feb 2021, 18:12
by svenne
Stephane,

If I drop panelData, the call below prints the following message:

Code:

apollo_inputs = apollo_validateInputs()
Several observations per individual detected based on the value of ID. Setting panelData
  in apollo_control set to TRUE.
All checks on apollo_control completed.
All checks on database completed.
Should I send you my .R script and the data?

Best
Sven

Re: Error for market share recovery for subgroups of data (Sec. 9.6)

Posted: 03 Feb 2021, 18:13
by stephanehess
Sven

thanks. So does the data in fact contain multiple rows for some people?
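A quick way to check this is to compare the number of rows with the number of distinct IDs. A minimal base R sketch on toy data (the toy values are hypothetical; with the real data, only the three assignments after the first line are needed):

```r
# Toy stand-in for the real data (hypothetical values): one row per person
database <- data.frame(ID = 1:6, choice = c(1, 3, 2, 5, 4, 1))

n_rows      <- nrow(database)                 # number of observations
n_indivs    <- length(unique(database$ID))    # number of distinct IDs
has_repeats <- anyDuplicated(database$ID) > 0 # TRUE if any ID occurs more than once

cat("rows:", n_rows, " individuals:", n_indivs, " repeated IDs:", has_repeats, "\n")
```

For cross-sectional data, the two counts should match and has_repeats should be FALSE.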

Stephane

Re: Error for market share recovery for subgroups of data (Sec. 9.6)

Posted: 03 Feb 2021, 18:19
by svenne
Stephane,

Actually, no. I use an excerpt of the residential telephone service data (Train et al. 1987). As far as I know, this is cross-sectional RP (revealed preference) data.

Best
Sven

Re: Error for market share recovery for subgroups of data (Sec. 9.6)

Posted: 03 Feb 2021, 19:29
by svenne
Thanks to Stephane, the issue is resolved. I had loaded the data with readr's read_delim():

Code:

library(readr)  # provides read_delim(), which returns a tibble
database <- read_delim("mydata.txt", 
                       "\t", escape_double = FALSE, 
                       trim_ws = TRUE)
However, loading it with base R's read.delim() instead

Code:

database <- read.delim("mydata.txt")
does the job.
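A possible explanation, sketched below, is that read_delim() returns a tibble while read.delim() returns a plain data.frame, and the two behave differently when a single column is selected with [, "ID"]. That Apollo 0.2.2 extracts the ID column in this way is an assumption; the thread does not confirm it. The toy data are hypothetical, and the tibble part only runs if the tibble package is installed.

```r
# Plain data.frame, the class returned by read.delim() (toy data, hypothetical values)
df <- data.frame(ID = 1:434, inc = rep(1:5, length.out = 434))
n_df <- length(unique(df[, "ID"]))   # df[, "ID"] is an integer vector

if (requireNamespace("tibble", quietly = TRUE)) {
  # Tibble, the class returned by readr::read_delim()
  tb <- tibble::as_tibble(df)
  # tb[, "ID"] stays a one-column tibble, so length() counts columns, not IDs
  n_tb <- length(unique(tb[, "ID"]))
  # Converting back to a plain data.frame restores the vector behaviour
  n_back <- length(unique(as.data.frame(tb)[, "ID"]))
  cat("data.frame:", n_df, " tibble:", n_tb, " converted back:", n_back, "\n")
}
```

A count of 1 for the tibble would be consistent with the "Number of individuals : 1" line in the estimation output. If this is indeed the cause, wrapping the read_delim() result in as.data.frame() may work as an alternative, though only the switch to read.delim() was actually tested here.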

Best
Sven