Review doc and all c code for the use of preallocated output.
Hi,
Currently in 2 cases we provide the output to use to the op(in the output_storage). This is not about inplace stuff. Most code have been written to check the shape of the output before reusing it. Most c code don't check the stride and suppose the output to be c contiguous.
In the current 2 cases, this is true, but we should review this and add this check if the code suppose the output to be c contiguous.
This is some place where it have been discussed:
UPDATE: Here are some things still left to do.
Currently in 2 cases we provide the output to use to the op(in the output_storage). This is not about inplace stuff. Most code have been written to check the shape of the output before reusing it. Most c code don't check the stride and suppose the output to be c contiguous.
In the current 2 cases, this is true, but we should review this and add this check if the code suppose the output to be c contiguous.
This is some place where it have been discussed:
- https://groups.google.com/group/theano-dev/browse_thread/thread/5c5d15ea9a8ccb28
- https://groups.google.com/group/theano-dev/msg/f8e970dc727e7b57
- ticket #509
UPDATE: Here are some things still left to do.
- Do not preallocate all memory buffers at the same time, but one map at a time (only previous, then only c_contiguous, etc.).
- More stride patterns, implement 'neg_stride' should be good.
- Buffers with wrong shape (too small, too big, right overall size but wrong shape on some dimensions).
- Copy only non-destroyed inputs (also for regular DebugMode).
- New error or error messages that reflect more faithfully what the problem really is.
- Check on memory buffers that were already provided in the initial storage_map.
Leave a comment
There is now a new flag, config.DebugMode.check_preallocated_output that tries various preallocated output storages.
I have yet to
- test more thoroughly the behaviour of this flag,
- run the whole test suite with it, to detect problems in the existing Ops.
Not so critical, but should be done at some point:
- test pre-allocated output for Sparse, maybe Scalar,
- test more patterns (negative strides, non-contiguous memory),
See also* 435.
I have yet to
- test more thoroughly the behaviour of this flag,
- run the whole test suite with it, to detect problems in the existing Ops.
Not so critical, but should be done at some point:
- test pre-allocated output for Sparse, maybe Scalar,
- test more patterns (negative strides, non-contiguous memory),
See also* 435.
Done in https://github.com/Theano/Theano/pull/604, I'll close the ticket when the PR is accepted.