Out of sample results

Posted: 12 Nov 2022, 13:08
by JuliavB
Dear Stephane,

I've run the out-of-sample test function in Apollo, which gave the following output:

         LL per obs in estimation sample  LL per obs in validation sample  % difference
1                                -1.0348                          -1.0144          1.97
2                                -1.0338                          -1.0256          0.79
3                                -1.0236                          -1.1150         -8.94
4                                -1.0296                          -1.0592         -2.88
5                                -1.0289                          -1.0652         -3.53
6                                -1.0386                          -0.9809          5.56
7                                -1.0367                          -0.9970          3.83
8                                -1.0339                          -1.0222          1.13
9                                -1.0378                          -0.9923          4.38
10                               -1.0312                          -1.0448         -1.32
Average                          -1.0329                          -1.0317          0.10

Unfortunately, I could not find any specific, quantifiable guideline in the manual for interpreting the out-of-sample output.
Can you tell me how the results should be interpreted? And is it okay to stay with the default 10% validation sample for a total sample size of N=330?

Thanks in advance for your support.
Best,
J.

Re: Out of sample results

Posted: 25 Nov 2022, 13:25
by stephanehess
Hi

There is no hard rule on what size of difference is acceptable, but your differences look pretty small, suggesting no particular risk of overfitting.

Stephane
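
For readers wondering how the "% difference" column in the table above relates to the two LL columns, here is a minimal Python sketch. The formula in `pct_difference` is an assumption, not taken from the Apollo manual: it is the change from estimation to validation LL per observation, relative to the absolute estimation value. It reproduces the reported column to within rounding of the four-decimal inputs. All variable and function names are made up for illustration.

```python
# LL per observation for each row, copied from the table in this thread.
estimation = [-1.0348, -1.0338, -1.0236, -1.0296, -1.0289,
              -1.0386, -1.0367, -1.0339, -1.0378, -1.0312]
validation = [-1.0144, -1.0256, -1.1150, -1.0592, -1.0652,
              -0.9809, -0.9970, -1.0222, -0.9923, -1.0448]
reported = [1.97, 0.79, -8.94, -2.88, -3.53, 5.56, 3.83, 1.13, 4.38, -1.32]


def pct_difference(ll_est, ll_val):
    """ASSUMED formula: change in LL per obs, relative to |estimation LL|."""
    return 100.0 * (ll_val - ll_est) / abs(ll_est)


recomputed = [pct_difference(e, v) for e, v in zip(estimation, validation)]

# The "Average" row appears to be the mean of the per-row differences.
mean_diff = sum(recomputed) / len(recomputed)

for row, (r, p) in enumerate(zip(recomputed, reported), start=1):
    print(f"{row:>2}  recomputed {r:6.2f}  reported {p:6.2f}")
print(f"mean of per-row differences: {mean_diff:.2f}")
```

Under this assumed formula, a negative value means the model fits the validation sample worse per observation than the estimation sample, and a positive value means it fits it better. The individual rows range from about -9% to about +6%, while their mean is close to zero.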

Re: Out of sample results

Posted: 09 Feb 2024, 13:00
by kkavta
Dear Prof. Stephane,

I hope this email finds you well.

I have some follow-up questions on the same topic. How should we interpret the average difference in log-likelihood (LL) between the estimation sample and the validation sample? Does a smaller percentage difference imply that the estimated model fits the validation sample better than in situations with a larger difference?

In my case, I'm obtaining an average difference of -2.7% for the MNL model and -4.7% for the MMNL model. Is this acceptable?

Also, is there any way to calculate the "% correct predicted" metric for validation in Apollo?

Thank you for your assistance.

Best regards,
K.

Re: Out of sample results

Posted: 06 May 2024, 08:03
by stephanehess
Hi

Apologies for the slow reply.

So what you're finding is that your MMNL model overfits the estimation data a bit more than the MNL model, but these differences are small.

No, Apollo does not compute the % correct predicted metric. It is a misleading measure that is at odds with probabilistic choice models and should never be used. See Kenneth Train's book.

Stephane
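
As a small illustration of the point about "% correct predicted", the Python sketch below compares two hypothetical models on the same four choices. It is not Apollo code and is not taken from the book mentioned above. The probabilities, choices and function names are all invented. It shows only one aspect of the argument: the hit rate looks solely at which alternative has the highest probability, so it cannot tell a cautious model from an overconfident one, whereas the log-likelihood per observation can.

```python
import math


def hit_rate(probs, chosen):
    """Share of observations where the highest-probability alternative was chosen."""
    hits = 0
    for p, c in zip(probs, chosen):
        predicted = max(range(len(p)), key=lambda j: p[j])
        hits += int(predicted == c)
    return hits / len(chosen)


def ll_per_obs(probs, chosen):
    """Average log of the probability assigned to the chosen alternative."""
    return sum(math.log(p[c]) for p, c in zip(probs, chosen)) / len(chosen)


# Hypothetical data: index of the chosen alternative in four binary choices.
chosen = [0, 0, 0, 1]

# Hypothetical predicted probabilities for (alternative 0, alternative 1).
model_a = [[0.51, 0.49]] * 4  # only slightly favours alternative 0
model_b = [[0.99, 0.01]] * 4  # almost certain of alternative 0

for name, probs in [("A", model_a), ("B", model_b)]:
    print(f"model {name}: hit rate {hit_rate(probs, chosen):.2f}, "
          f"LL per obs {ll_per_obs(probs, chosen):.4f}")
```

Both models get the same hit rate because they "predict" alternative 0 every time. Their log-likelihoods differ a lot, because model B is heavily penalised for assigning a probability of 0.01 to the one choice of alternative 1.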