Version 12, last updated by SylvainPL at January 03, 2011 16:56 UTC
SDA results (table requested on April 26)
The table presented here is the one requested in class on April 26. All models have the same capacity and the same hyper-parameters. Only a few details change, and they are explained in the table.
3 hidden layers, 1000 units per layer, about 12 million examples seen by each layer during pre-train, pre-train learning rate of 0.01, finetune learning rate starting at 0.1 and divided by 2 after each epoch of P07 (or 100 epochs of NIST), Theano version c02f8559e0f2+, DA corruption level of 0.2. The validation set used by the model is always NIST. Calculations were made on GPU. The models' directory path is /data/lisa/data/ift6266h10/experiments_SDA/
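The finetune learning-rate schedule described above can be sketched as follows. This is only an illustration, not the code that was run: the function name and its parameters are hypothetical, and it assumes that the first epoch uses 0.1 and that "epoch" means one pass over P07 (or 100 passes over NIST).

```python
def finetune_learning_rate(epoch, initial_lr=0.1, decay_factor=2.0):
    """Learning rate for fine-tuning epoch `epoch` (0-based).

    Assumption: epoch 0 uses `initial_lr`, and the rate is divided by
    `decay_factor` once per completed epoch.
    """
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    return initial_lr / (decay_factor ** epoch)


# Learning rates for the first four epochs under this assumption.
schedule = [finetune_learning_rate(e) for e in range(4)]
print(schedule)
```

With this schedule the rate shrinks quickly, so most of the change in the weights happens during the first few passes over the finetune dataset.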
| Pre-train dataset | Type of non-linearities | Type of finetune done | IFT6266 version | Validation error (NIST) | Validation error (P07) | Validation error (PNIST) | Test error on NIST (with stderror) | Test error on P07 (with stderror) | Test error on PNIST07 (with stderror) | # examples seen during finetune | Model directory | Test error on NIST digits (with class prior) | Test error on NIST lower case char. (with class prior) | Test error on NIST upper case char. (with class prior) | Computation time (hours) |
| NIST | tanh and sigmoid for output | NIST | 543ae35e387e+ | 12.018750 % | 96.03 % | 64.13 % | 25.827044 % (0.152290) | 96.351250 % (0.066291) | 66.190000 % (0.334506) | 29 000 000 | NIST_petit_sigmoid_tanh | 2.82 % | 12.9 % | 5.94 % | 1.85 |
| NIST | sigmoid and softmax for output | NIST | 88cb95007670+ | 11.942500 % | 97.41 % | 63.61 % | 23.679886 % (0.147917) | 97.452500 % (0.055707) | 65.235000 % (0.336741) | 61 000 000 | NIST_petit_softmax_sigmoid | 2.70 % | 12.8 % | 5.19 % | 4.1 |
| NIST | tanh and softmax for output | NIST | 543ae35e387e+ | 12.190000 % | 95.94 % | 64.31 % | 25.132671 % (0.150930) | 96.287500 % (0.066846) | 66.305000 % (0.334226) | 31 000 000 | NIST_petit_softmax_tanh | 2.60 % | 12.27 % | 5.93 % | 1.85 |
| P07 | tanh and sigmoid for output | P07 | d391ad815d89+ | 13.085000 % | 37.79 % | 32.03 % | 18.719114 % (0.135721) | 39.991250 % (0.173199) | 33.575000 % (0.333933) | 439 000 000 | P07_demo | 1.73 % | 11.70 % | 3.76 % | 20.7 |
| P07 | tanh and sigmoid for output | P07 + NIST | d391ad815d89+ | 10.463750 % | 48.65 % | 37.48 % | 20.669478 % (0.140895) | 51.395000 % (0.176708) | 40.050000 % (0.346482) | 12 000 000 (of NIST) | P07_demo | 1.38 % | 10.41 % | 3.21 % | 21.3 |
| P07 | sigmoid and softmax for output | P07 | 88cb95007670+ | 14.357500 % | 40.36 % | 34.56 % | 20.290056 % (0.139929) | 42.637500 % (0.174850) | 36.145000 % (0.339709) | 472 000 000 | P07_petit_softmax_sigmoid | 2.20 % | 12.91 % | 4.51 % | 24.5 |
| P07 | sigmoid and softmax for output | P07 + NIST | 88cb95007670+ | 11.382500 % | 49.57 % | 41.38 % | 21.774137 % (0.143600) | 53.583750 % (0.176322) | 44.350000 % (0.351289) | 12 000 000 (of NIST) | P07_petit_softmax_sigmoid | 1.82 % | 11.11 % | 3.70 % | 25 |
| P07 | tanh and softmax for output | P07 | 543ae35e387e+ | 13.677500 % | 39.42 % | 33.38 % | 19.604464 % (0.138135) | 41.827500 % (0.174399) | 34.980000 % (0.337224) | 301 000 000 | P07_petit_softmax_tanh | 2.08 % | 12.23 % | 4.21 % | 14.2 |
| P07 | tanh and softmax for output | P07 + NIST | 543ae35e387e+ | 10.733750 % | 49.60 % | 40.68 % | 21.767902 % (0.143586) | 52.220000 % (0.176602) | 43.365000 % (0.350427) | 17 000 000 (of NIST) | P07_petit_softmax_tanh | 1.57 % | 10.25 % | 3.24 % | 15 |
| PNIST | tanh and sigmoid for output | PNIST | eb42bed0c13b+ | 11.812500 % | 94.03 % | 27.90 % | 17.286549 % (0.131569) | 94.598750 % (0.079918) | 29.745000 % (0.323244) | 401 000 000 | PNIST_petit_sigmoid_tanh | 1.47 % | 10.42 % | 2.72 % | 20.9 |
| PNIST | tanh and sigmoid for output | PNIST + NIST | eb42bed0c13b+ | 9.911250 % | 94.47 % | 33.72 % | 20.393629 % (0.140195) | 94.703750 % (0.079181) | 36.635000 % (0.340689) | 25 000 000 (of NIST) | PNIST_petit_sigmoid_tanh | 1.24 % | 9.85 % | 2.60 % | 22.3 |
| PNIST | sigmoid and softmax for output | PNIST | eb42bed0c13b+ | 12.563750 % | 94.57 % | 30.15 % | 18.282736 % (0.134489) | 95.021250 % (0.076900) | 31.870000 % (0.329492) | 536 000 000 | PNIST_petit_softmax_sigmoid | 1.74 % | 11.03 % | 3.49 % | 26.3 |
| PNIST | sigmoid and softmax for output | PNIST + NIST | eb42bed0c13b+ | 11.133750 % | 95.02 % | 34.48 % | 20.691450 % (0.140950) | 95.696250 % (0.071751) | 37.350000 % (0.342051) | 3 000 000 (of NIST) | PNIST_petit_softmax_sigmoid | 1.55 % | 10.58 % | 3.22 % | 26.4 |
| PNIST | tanh and softmax for output | PNIST | 88cb95007670+ | 11.760000 % | 93.73 % | 27.82 % | 17.082130 % (0.130950) | 94.235000 % (0.082406) | 29.715000 % (0.323150) | 471 000 000 | PNIST_petit_softmax_tanh | 1.38 % | 10.16 % | 2.96 % | 20.9 |
| PNIST | tanh and softmax for output | PNIST + NIST | 88cb95007670+ | 10.076250 % | 94.05 % | 32.31 % | 20.719295 % (0.141020) | 94.448750 % (0.080956) | 35.085000 % (0.337457) | 7 000 000 (of NIST) | PNIST_petit_softmax_tanh | 1.19 % | 9.63 % | 2.60 % | 21.2 |
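This page does not say how the standard errors in parentheses were computed. A plausible reading, given here only as an assumption, is the usual binomial standard error sqrt(p(1-p)/n) expressed in percent, where p is the error rate and n the number of test examples. The sketch below computes that quantity and inverts it to recover the test-set size implied by a reported (error, stderror) pair. The function names are hypothetical; the numbers are taken from the first row of the table.

```python
import math


def binomial_stderr(error_rate, n_examples):
    """Standard error of an error rate (a fraction in [0, 1]) measured on n examples."""
    return math.sqrt(error_rate * (1.0 - error_rate) / n_examples)


def implied_sample_size(error_rate, stderr):
    """Invert the binomial formula: the n that yields `stderr` for `error_rate`."""
    return error_rate * (1.0 - error_rate) / stderr ** 2


# First table row: (test error in %, reported stderror in %).
reported = {
    "NIST": (25.827044, 0.152290),
    "P07": (96.351250, 0.066291),
    "PNIST07": (66.190000, 0.334506),
}

# Test-set sizes implied by the binomial assumption (not stated on this page).
implied = {
    name: implied_sample_size(err / 100.0, se / 100.0)
    for name, (err, se) in reported.items()
}

for name, n in implied.items():
    print(name, round(n))
```

If the assumption holds, every row of the table should give about the same implied size for a given test set, which is a quick way to spot a transcription error in one of the cells.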
Note that a pre-train on P07 or PNIST, especially PNIST, greatly improves the test error on NIST. Note also that a second finetune on NIST makes the validation error go down, but the test error goes up. This might indicate over-fitting, or the test distribution may be more like P07 and PNIST than NIST.
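The effect of the second finetune can be read directly from the table. The short sketch below pairs each P07 or PNIST model with its "+ NIST" counterpart and prints the change in NIST validation and test error. The values are copied from the table; the dictionary layout and the function name are only illustrative.

```python
# key: (pre-train dataset, non-linearities)
# value: ((val NIST, test NIST) after the first finetune,
#         (val NIST, test NIST) after the second finetune on NIST), in percent
runs = {
    ("P07", "tanh/sigmoid"): ((13.085000, 18.719114), (10.463750, 20.669478)),
    ("P07", "sigmoid/softmax"): ((14.357500, 20.290056), (11.382500, 21.774137)),
    ("P07", "tanh/softmax"): ((13.677500, 19.604464), (10.733750, 21.767902)),
    ("PNIST", "tanh/sigmoid"): ((11.812500, 17.286549), (9.911250, 20.393629)),
    ("PNIST", "sigmoid/softmax"): ((12.563750, 18.282736), (11.133750, 20.691450)),
    ("PNIST", "tanh/softmax"): ((11.760000, 17.082130), (10.076250, 20.719295)),
}


def second_finetune_deltas(runs):
    """Return {key: (change in validation error, change in test error)}."""
    deltas = {}
    for key, ((val_1, test_1), (val_2, test_2)) in runs.items():
        deltas[key] = (val_2 - val_1, test_2 - test_1)
    return deltas


deltas = second_finetune_deltas(runs)
for (dataset, nonlin), (d_val, d_test) in deltas.items():
    print(f"{dataset:5s} {nonlin:16s} validation {d_val:+.2f}  test {d_test:+.2f}")
```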
The choice of the final non-linearity (sigmoid or softmax) does not seem to be an important factor for performance, but the hidden-layer non-linearity (tanh or sigmoid) seems to have a greater effect. The effect of the hidden-layer non-linearity is not the same with NIST, P07 and PNIST. With NIST, the sigmoid non-linearity seems to be the better choice. On the other hand, with P07 and PNIST, tanh seems to be by far the better choice.
The error rate on the NIST test set is higher when a second finetune on NIST is done after a first one on P07 or PNIST. This is strange, but the opposite holds for the a priori error rate on the NIST test set (the error rate when the model knows the class of the image, reported in the "with class prior" columns). When a second finetune is done on NIST, the a priori error rate on the NIST test set goes down. The second finetune helps discrimination within a class, but not between classes. I do not have a hypothesis about this strange behaviour yet.
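To make the "with class prior" error concrete, here is a minimal sketch of one way such a prediction could be made: the arg-max is restricted to the outputs belonging to the known class (digit, upper case or lower case). This is an illustration, not the evaluation code used for the table. The 62-output layout (digits 0-9, upper case 10-35, lower case 36-61) and all names below are assumptions.

```python
# Hypothetical output layout: 10 digits, 26 upper case, 26 lower case.
DIGITS = range(0, 10)
UPPER = range(10, 36)
LOWER = range(36, 62)


def predict(scores, allowed=None):
    """Index of the highest score, optionally restricted to `allowed` indices."""
    candidates = range(len(scores)) if allowed is None else allowed
    return max(candidates, key=lambda k: scores[k])


def error_rate(all_scores, labels, allowed=None):
    """Fraction of examples whose (possibly restricted) prediction is wrong."""
    errors = sum(
        1 for scores, label in zip(all_scores, labels)
        if predict(scores, allowed) != label
    )
    return errors / len(labels)


# A digit "0" that the model slightly prefers to read as upper case "O".
scores = [0.0] * 62
scores[0] = 0.40    # digit 0
scores[24] = 0.45   # upper case O under the assumed layout (10 + 14)

print(predict(scores))          # unrestricted prediction
print(predict(scores, DIGITS))  # prediction when the image is known to be a digit
```

In this example the unrestricted prediction is wrong while the restricted one is right, which is the kind of between-class confusion that the class prior removes.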
During the training process, the validation set used was always NIST, but the results in the table reveal a problem: a low validation error rate on NIST is not predictive of a low test error rate. As an experiment, we also calculated the final validation error on P07 and PNIST for all the models. With these tests, we can see that PNIST is more representative of the NIST test set: a low validation error rate on PNIST is predictive of a low test error on NIST.
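One simple way to check this claim against the table is to compare how well each validation error tracks the NIST test error across the 15 models, for example with a Pearson correlation. The sketch below is only an illustration of that check; the choice of Pearson correlation and the helper name are assumptions, and the values are copied from the table in row order.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# One entry per table row, in the same order (all values in percent).
val_nist = [12.018750, 11.942500, 12.190000, 13.085000, 10.463750,
            14.357500, 11.382500, 13.677500, 10.733750, 11.812500,
            9.911250, 12.563750, 11.133750, 11.760000, 10.076250]
val_pnist = [64.13, 63.61, 64.31, 32.03, 37.48,
             34.56, 41.38, 33.38, 40.68, 27.90,
             33.72, 30.15, 34.48, 27.82, 32.31]
test_nist = [25.827044, 23.679886, 25.132671, 18.719114, 20.669478,
             20.290056, 21.774137, 19.604464, 21.767902, 17.286549,
             20.393629, 18.282736, 20.691450, 17.082130, 20.719295]

r_nist = pearson(val_nist, test_nist)
r_pnist = pearson(val_pnist, test_nist)
print(f"validation NIST  vs test NIST: r = {r_nist:.2f}")
print(f"validation PNIST vs test NIST: r = {r_pnist:.2f}")
```

A correlation over 15 models is a rough indicator only, but it gives a single number to compare the two candidate validation sets.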
Please also note that the IFT6266 version is given for information only. The exact code that was run is in the model's directory reported in the table. That code may differ slightly from the repository.
| IFT6266 version | Validation error (NIST) | Test error on NIST | Test error on P07 | # examples seen during finetune | Model directory | Test error on digits | Test error on lower case char. | Test error on upper case char. |