Use the deep learning tutorials as a canary in the coal mine for Theano performance issues
Right now the Theano unit tests check only correctness, not performance. This is reasonable: checking performance (a) isn't necessary on new installs and (b) would slow down development. However, performance testing is important -- cf. http://groups.google.com/group/theano-dev/browse_thread/thread/c03365bb03d19604. The deep learning tutorials, which run the same code over the same data set, are an excellent 'realistic use' test case for Theano. I propose setting up a buildbot that runs this code, with significant variations in runtime vs. the prior run or vs. some known standard triggering an investigation into the change (whether good or bad).
Our buildbot already does something like that. We could update its reporting to show the running time in the title as well as in the text, as it does now. That would make it more obvious when time is being lost, since I don't read the email every day.
Sounds good. Ideally, there should be some kind of automated check on the times, above and beyond putting them in the title of the email. That way anyone can see there is something worth looking into, without needing prior knowledge of normal running times. Any time you can get knowledge out of people's heads and into code, that's a good thing. :)
I hardcoded some expected times on our buildbot computer. I print the expected time and the measured time.
If the run time is more than 5% above the expected time, I consider it a speed failure and report it in the title.
Can you think of a better way of doing this? I don't know how to make this resilient to the computer type, except by comparing the % of time spent in each op in profile mode. Even that won't be very useful, since the numbers will differ greatly depending on the BLAS used.
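The check described above can be sketched like this. Everything here is illustrative, not the actual buildbot code: the tutorial names, the expected times, and the helper name `check_speed` are all hypothetical, and only the 5%-over-expected rule comes from the discussion.

```python
# Hypothetical hardcoded expected times in seconds, per tutorial job.
# Real values would be measured once on the buildbot machine.
EXPECTED_TIMES = {
    "logistic_sgd": 10.0,
    "mlp": 95.0,
}

# A run more than 5% slower than expected counts as a speed failure.
TOLERANCE = 0.05


def check_speed(name, measured):
    """Compare a measured run time against the hardcoded expectation.

    Returns (ok, message); ok is False when the run exceeds the
    tolerance, so the caller can flag a speed failure in the title.
    """
    expected = EXPECTED_TIMES[name]
    print("%s: expected %.1fs, measured %.1fs" % (name, expected, measured))
    if measured > expected * (1.0 + TOLERANCE):
        pct = 100.0 * (measured / expected - 1.0)
        return False, "SPEED FAILURE: %s is %.0f%% slower than expected" % (name, pct)
    return True, "ok"
```

As noted above, this is fragile across machines: the hardcoded times only hold for one computer, which is exactly the limitation being discussed.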
I was keeping it open as I read them, but there are always errors, as the expected times don't seem correct. Also, there is weekly variation in execution time that I don't understand. So yes, it is printed, but useless in its current form, since we can't tell whether there is a regression.
So I am opening it again... But it can wait until after the next release.
I am not reordering the ticket, as we are not sure when 0.3.1 will go out. If we do a sprint for it, I will try to fix all of this during the sprint. If there is no sprint, I will wait for the following sprint.