
Data contain too many observations per person, even with workInLogs=TRUE

Posted: 29 Jan 2024, 03:58
by toshi
Hi, I am working with a choice model in which each person is observed about 2,300 times, choosing among up to 7 alternatives.
With this data, even when I set "workInLogs=TRUE", R reaches its numerical limits on the choice probability when I use apollo_panelProd.
(e.g. exp(log(0.14)*2300) = 0)
I tried to use "Rmpfr" to extend the numerical limits by modifying apollo_panelProd.
(e.g. exp(log(Rmpfr::mpfr(0.14, 32))*2300) is tiny but nonzero, on the order of 1e-1964)
However, the function's behaviour depends on the type of its input object, and too many modifications seem to be required to use Rmpfr in apollo_panelProd.
I also suspect that several other functions that make up Apollo would need to be changed as well.
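
The underflow described above can be reproduced in base R without Apollo. The sketch below is only an illustration under stated assumptions: the helper `log_mean_exp` and the three log-likelihood values are hypothetical, and this is not how Apollo is implemented internally. It shows that the product of 2,300 probabilities underflows while the sum of logs stays finite, and that averaging over draws can be kept in log space by subtracting the maximum first (the log-sum-exp trick), which is one alternative to arbitrary-precision arithmetic.

```r
# One person with 2300 choices, each with probability 0.14
p <- rep(0.14, 2300)
prod(p)       # underflows to 0 in double precision
sum(log(p))   # stays finite in log space

# Hypothetical helper (not part of Apollo): log of the mean of exp(logL),
# computed without ever forming exp() of a very negative number
log_mean_exp <- function(logL) {
  m <- max(logL)
  m + log(mean(exp(logL - m)))
}

# Hypothetical person-level log-likelihoods at three draws of a random parameter
logL_draws <- c(-4522.1, -4530.4, -4519.8)

log(mean(exp(logL_draws)))   # naive version: exp() underflows, result is -Inf
log_mean_exp(logL_draws)     # stable version: finite
```

Whether this can be slotted into an existing estimation routine depends on where that routine leaves log space, so it is a starting point rather than a drop-in fix.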

I would appreciate it if you would consider introducing an option to increase the numerical precision of the choice probability when using panel data, if you think it is worthwhile to fix this problem.

Thank you for your consideration,
Toshifumi

Re: Data contain too many observations per person, even with workInLogs=TRUE

Posted: 29 Jan 2024, 16:08
by stephanehess
Hi

this is an extreme case. Do you have that large a number of choices for each person in the data, or just for one person (in which case they will dominate the data)?

If you're not going to run a model with random parameters, then you may consider treating the data as if it came from separate people, which would avoid the issue, but of course means not correcting the standard errors for repeated choices.
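
A minimal sketch of how this could be set up on the data side, assuming a data frame called `database` with an `ID` column (both names are hypothetical here). The first option gives every row its own identifier, so each choice is treated as coming from a separate person. The second option, splitting each person into short blocks, is a variation that is not suggested in this thread and is shown only as an assumption about how panels could be shortened.

```r
# Toy data: 2 people with 5 choices each (hypothetical column names)
database <- data.frame(
  ID     = rep(1:2, each = 5),
  choice = c(1, 2, 1, 3, 2, 2, 2, 1, 3, 1)
)

# Option 1: every row becomes its own pseudo-person
database$pseudoID_row <- seq_len(nrow(database))

# Option 2 (assumption, not from the thread): split each person's choices
# into blocks of at most block_size rows
block_size <- 2
obs_in_person <- ave(seq_len(nrow(database)), database$ID, FUN = seq_along)
database$pseudoID_block <- paste(
  database$ID, (obs_in_person - 1) %/% block_size, sep = "_"
)

# The chosen column would then be named in apollo_control$indivID,
# e.g. indivID = "pseudoID_row"
```

Either way, the standard errors no longer account for the repeated choices made by the same person.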

Stephane

Re: Data contain too many observations per person, even with workInLogs=TRUE

Posted: 30 Jan 2024, 02:19
by toshi
Thank you for your suggestion.

Yes, I have a large number of choices for each person. The number of individuals is about 5,000. Therefore, the data contains 9M rows.
This was the first time I tried Apollo after being unable to estimate with various other packages.
Because Apollo makes explicit what it calculates one step at a time, it made me recognize why simple logit models could be estimated on this data, but models with individual-specific parameters could not.
I am currently trying alternative approaches, such as dropping from the data those choice occasions that are not important to the problem, or writing the whole estimation code myself using Rmpfr.

Thank you for your consideration,
Toshifumi

Re: Data contain too many observations per person, even with workInLogs=TRUE

Posted: 30 Jan 2024, 14:43
by stephanehess
Hi

you could also try some sampling approaches
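
One possible reading of this, sketched in base R under assumptions: keep a random subset of at most `n_keep` choices per person, so that the person-level product is taken over fewer observations. The data frame, column names and `n_keep` are hypothetical, and other sampling schemes (for example sampling people rather than choices) may be what is meant.

```r
set.seed(1)

# Toy data: 3 people with 10, 4 and 7 choices (hypothetical column names)
database <- data.frame(ID = rep(1:3, times = c(10, 4, 7)))
database$obs <- ave(seq_len(nrow(database)), database$ID, FUN = seq_along)

n_keep <- 5  # maximum number of choices kept per person

# For each person, keep all rows if there are few, else a random sample
rows_by_person <- split(seq_len(nrow(database)), database$ID)
keep <- unlist(lapply(rows_by_person, function(rows) {
  if (length(rows) <= n_keep) {
    rows
  } else {
    sort(rows[sample.int(length(rows), n_keep)])
  }
}))

database_sub <- database[keep, ]
table(database_sub$ID)  # no person has more than n_keep rows
```

Repeating the estimation over several such subsamples would give a sense of how sensitive the estimates are to the rows that were dropped.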

Stephane