clojure.contrib.sql runs out of memory on very large datasets
If you have a query that results in a very large dataset you will run out of memory because by default, with-query-results pulls the entire result set into memory (even though it creates a lazy-seq using resultset-seq).
This issue has been discussed previously here:
http://groups.google.com/group/clojure/browse_thread/thread/7b0c250e0ba6c9eb/fb9001522b49c20a
The fix is simple, just call (.setFetchSize stmt 1). But, with-query-results doesn't give you a way to do that currently. I propose adding an optional parameter after sql-params that is a hash of attributes to set on the PreparedStatement before executing it.
So you can do this:
The new code in clojure.contrib.sql is very simple, but it depends on a new contrib namespace called clojure.contrib.bean. Here is more info on my proposed fix:
http://groups.google.com/group/clojure-dev/browse_thread/thread/d8334759f10f3f45
This issue has been discussed previously here:
http://groups.google.com/group/clojure/browse_thread/thread/7b0c250e0ba6c9eb/fb9001522b49c20a
The fix is simple, just call (.setFetchSize stmt 1). But, with-query-results doesn't give you a way to do that currently. I propose adding an optional parameter after sql-params that is a hash of attributes to set on the PreparedStatement before executing it.
So you can do this:
(with-connection {…}
(.setAutoCommit (sql/connection) false) ;; needed for postgres
(with-query-results results ["SELECT id, data FROM nodes"]
{:fetch-size 1000}
(doseq [r results]
…)))
The new code in clojure.contrib.sql is very simple, but it depends on a new contrib namespace called clojure.contrib.bean. Here is more info on my proposed fix:
http://groups.google.com/group/clojure-dev/browse_thread/thread/d8334759f10f3f45
Leave a comment
on 2010-09-03 15:48 *
By justin.balthrop
Hey Steve,
Any progress on this? Another option instead of passing a map of attributes and using bean would be to allow the user to pass an optional function that would be called on the statement object. I've seen code like this in compojure, I think:
If this sounds better to you, I can submit a patch.
Cheers,
Justin
Any progress on this? Another option instead of passing a map of attributes and using bean would be to allow the user to pass an optional function that would be called on the statement object. I've seen code like this in compojure, I think:
(with-connection {…}
(.setAutoCommit (sql/connection) false) ;; needed for postgres
(with-query-results results ["SELECT id, data FROM nodes"]
#(.setFetchSize % 1)
(doseq [r results]
…)))
If this sounds better to you, I can submit a patch.
Cheers,
Justin
on 2010-10-17 22:45 *
By justin.balthrop
Kyle Burton just posted a blog post about this issue: http://asymmetrical-view.com/2010/10/14/clojure-lazy-walk-sql-table.html
His fix is to add a new method called with-query-results-cursor which turns off auto-commit and sets the fetch size.
Steve, would this approach work better for you?
His fix is to add a new method called with-query-results-cursor which turns off auto-commit and sets the fetch size.
Steve, would this approach work better for you?