Heap exhaustion on large file uploads with OnDiskFileParamHolder
This is caused by SHtml.fileUpload using the file member of FileParamHolder to determine length instead of delegating to the backing store.
Per Maarten Koopmans:
Hi,
I have tried uploading a large (1GB) file to Lift in both developer and
production mode (i.e. set mode production in SBT). File was created by
"mkfile 1024M 1GFile"
My SBT settings are:
java -XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=192m -Xmx1500M -jar `dirname $0`/sbt.jar "$@"
I have set LiftRules in Boot.boot to:
LiftRules.maxMimeFileSize = 1173741824L
LiftRules.maxMimeSize = 1273741824L
//Make sure we don't put stuff in memory for uploads
LiftRules.handleMimeFile = OnDiskFileParamHolder.apply
And here's the upload snippet, that basically does nothing:
class TestUpload {
  var fileHolder : Box[FileParamHolder] = Empty
  def uploader(xhtml: NodeSeq): NodeSeq = {
    bind("upload",xhtml,
      "loader" -> SHtml.fileUpload((f) => {fileHolder = new Full(f)}),
      "submit" -> SHtml.submit(S ? "Upload",() => {Console println("File uploaded! Path = "+cloudfiles.fullkey)})
    )
  }
}
And here's the output:
Message: java.lang.OutOfMemoryError: Java heap space
java.util.Arrays.copyOf(Arrays.java:2786)
java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
net.liftweb.util.IoHelpers$class.readOnce$2(IoHelpers.scala:99)
net.liftweb.util.IoHelpers$class.readWholeStream(IoHelpers.scala:103)
net.liftweb.util.Helpers$.readWholeStream(Helpers.scala:34)
net.liftweb.http.OnDiskFileParamHolder.file(Req.scala:159)
net.liftweb.http.SHtml$$anonfun$29.apply(SHtml.scala:1602)
net.liftweb.http.SHtml$$anonfun$29.apply(SHtml.scala:1602)
net.liftweb.http.S$BinFuncHolder.apply(S.scala:2350)
net.liftweb.http.S$ProxyFuncHolder.apply(S.scala:2332)
net.liftweb.http.LiftSession$$anonfun$buildFunc$1$1$$anonfun$apply$35.apply(LiftSession.scala:659)
net.liftweb.http.LiftSession$$anonfun$buildFunc$1$1$$anonfun$apply$35.apply(LiftSession.scala:659)
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
scala.collection.immutable.List.foreach(List.scala:45)
scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
scala.collection.immutable.List.map(List.scala:45)
net.liftweb.http.LiftSession$$anonfun$buildFunc$1$1.apply(LiftSession.scala:659)
net.liftweb.http.LiftSession$$anonfun$16$$anonfun$apply$41.apply(LiftSession.scala:677)
net.liftweb.http.LiftSession$$anonfun$16$$anonfun$apply$41.apply(LiftSession.scala:677)
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
scala.collection.immutable.List.foreach(List.scala:45)
scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
scala.collection.immutable.List.map(List.scala:45)
net.liftweb.http.LiftSession$$anonfun$16.apply(LiftSession.scala:677)
net.liftweb.http.LiftSession$$anonfun$16.apply(LiftSession.scala:666)
scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:227)
scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:227)
scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
With just me testing (i.e. one concurrent user) this really scares me.
You'd say that an OnDiskFileParamHolder simply uses smart buffering
and that the data transfer itself would never cause more than a few MB
for the stream in terms of memory allocation.
Going through the source code, I see that Req.scala in
net.liftweb.http reads the temp file into a ByteArrayOutputStream (see line
159) which I think is causing the problems. Any thoughts? It looks
like a bug to me, or there is a very simple thing I am missing.
The functionality is kind of crucial to me, so if this is a bug,
please point me on how to test a modified lift repo against a sample
project locally, and how to contribute back (off-list is good, too).
--Maarten
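For illustration, the "smart buffering" described in the email could look like the minimal sketch below: data is moved through a small fixed-size buffer, so memory use stays at roughly the buffer size regardless of file size. The object name `StreamCopy`, the method `copy`, and the 8192-byte default are assumptions made for this example; none of them are part of Lift.

```scala
import java.io.{InputStream, OutputStream}

object StreamCopy {
  // Copies everything from `in` to `out` through one reusable buffer and
  // returns the number of bytes copied. Only `bufferSize` bytes are held
  // in memory at a time (hypothetical helper, not Lift code).
  def copy(in: InputStream, out: OutputStream, bufferSize: Int = 8192): Long = {
    val buffer = new Array[Byte](bufferSize)
    var total = 0L
    var read = in.read(buffer)
    while (read != -1) {
      out.write(buffer, 0, read)
      total += read
      read = in.read(buffer)
    }
    total
  }
}
```

By contrast, the readWholeStream call in the trace above accumulates the full contents in a ByteArrayOutputStream, which is why a 1GB upload exhausts a 1500M heap.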
On 2011-01-14 00:30, by dchenbecker:
Status changed from Accepted to Test
Work remaining changed from 1.0 to 0.0
Fix committed on dcb-issue-841 for testing by Maarten.
Tested with a 1GB file upload, works.
Needs to go through review board before committing to master
(In revision:ec47233e3c424a1e5ed5df87a9e02e4cd7fee203) Fixed heap exhaustion with large file uploads
Closes #841
SHtml.fileUpload was using FileParamHolder.file,
which with OnDiskFileParamHolder induced the entire
file to be read into memory. Instead, I added a
length member to FileParamHolder that does the
right thing depending on whether it's an in-memory
or on-disk file upload.
Branch: master
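The idea behind the fix can be sketched as follows. This is a simplified illustration, not the Lift source: `UploadHolder`, `InMemoryHolder`, and `OnDiskHolder` are hypothetical stand-ins for FileParamHolder and its in-memory and on-disk variants, and the member names are assumptions. The point is that `length` is answered by the backing store, so an on-disk upload never has to be loaded into memory just to learn its size.

```scala
import java.io.{ByteArrayInputStream, File, FileInputStream, InputStream}
import java.nio.file.Files

// Hypothetical stand-in for FileParamHolder (not the real Lift trait).
sealed trait UploadHolder {
  def file: Array[Byte]        // whole contents; costly for on-disk uploads
  def fileStream: InputStream  // streaming access to the contents
  def length: Long             // size, without materialising the contents
}

// In-memory upload: the size is simply the array length.
final class InMemoryHolder(bytes: Array[Byte]) extends UploadHolder {
  def file: Array[Byte] = bytes
  def fileStream: InputStream = new ByteArrayInputStream(bytes)
  def length: Long = bytes.length.toLong
}

// On-disk upload: the size comes from file metadata, so nothing is read.
final class OnDiskHolder(localFile: File) extends UploadHolder {
  def file: Array[Byte] = Files.readAllBytes(localFile.toPath)
  def fileStream: InputStream = new FileInputStream(localFile)
  def length: Long = localFile.length
}

object LengthDemo {
  def main(args: Array[String]): Unit = {
    val tmp = File.createTempFile("upload", ".bin")
    tmp.deleteOnExit()
    Files.write(tmp.toPath, Array.fill[Byte](4096)(1))

    val holders: List[UploadHolder] =
      List(new InMemoryHolder(Array[Byte](1, 2, 3)), new OnDiskHolder(tmp))

    // Callers ask for length uniformly; neither call reads the file contents.
    holders.foreach(h => println(h.length))
  }
}
```

Code that only needs the size should call `length`; code that needs the data should prefer `fileStream` over `file` for on-disk uploads.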