append-spit should only write out an encoding marker once
In clojure.contrib.duck-streams, append-spit writes out an encoding
marker (for UnicodeLittle, for example, this is the byte-order mark
U+FEFF, stored as the bytes FF FE) each time it appends to a file.
This should happen only when the file is initially created.
Test case for reproducing this behaviour:
(use 'clojure.contrib.duck-streams)
(binding [*default-encoding* "UnicodeLittle"]
  (append-spit "/foo.txt" "Line 1\n"))
(binding [*default-encoding* "UnicodeLittle"]
  (append-spit "/foo.txt" "Line 2\n"))
(slurp "c:/foo.txt" "UnicodeLittle")
The slurp outputs
"Line 1\n?Line 2\n"
The expected output is:
"Line 1\nLine 2\n"
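The stray "?" is the second marker, decoded as an ordinary character in the middle of the text. A minimal sketch of why it appears, using java.io directly rather than duck-streams: every append opens a fresh writer, and a fresh writer for a marker-carrying encoding emits the marker again. The helper name encoded-bytes is hypothetical, and the sketch assumes the JDK in use recognises the "UnicodeLittle" alias.

```clojure
(import '(java.io ByteArrayOutputStream OutputStreamWriter))

(defn encoded-bytes
  "Returns, as a vector of signed bytes, what a fresh writer for
  `encoding` produces when `s` is written through it."
  [s encoding]
  (let [out (ByteArrayOutputStream.)]
    ;; with-open closes the writer, which flushes it into `out`
    (with-open [w (OutputStreamWriter. out ^String encoding)]
      (.write w ^String s))
    (vec (.toByteArray out))))

;; Marker bytes FF FE (-1 -2 as signed bytes) precede the text.
(encoded-bytes "A" "UnicodeLittle") ; => [-1 -2 65 0]

;; The marker-free counterpart writes only the text.
(encoded-bytes "A" "UTF-16LE")      ; => [65 0]
```

Two appends therefore produce two markers, one at the start of the file and one directly before "Line 2".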
Comment by stuart.halloway on 2010-04-13 12:18
Assigned to set to stuart.halloway
Milestone changed from Backlog to Release 1.2
I am not sure there is a good answer here. The code above chooses an encoding with an explicit marker, and gets what it asks for. :-(
One proposed solution (http://github.com/sergey-miryanov/clojure-contrib/commits/bug-30) tries to detect this scenario, and recover via a hard-coded mapping between encodings-with-markers and similar-encodings-without. But I don't think this can work in general, because the set of possible encodings is open and the Charset API doesn't provide a mapping between the with-markers and without-markers versions.
Sorry, and please feel free to reopen this if I am missing an obvious approach.
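For illustration only, the hard-coded-mapping idea could look roughly like the sketch below. This is not the code from the linked branch: the names marker-free-encoding and append-text are hypothetical, the table holds just two entries, and it uses clojure.java.io instead of duck-streams. It also shows the limitation described above, since any encoding missing from the table is passed through unchanged and would still repeat its marker.

```clojure
(require '[clojure.java.io :as io])

;; Hypothetical, hand-maintained table from encodings that write a
;; marker to counterparts that do not. It can never be complete,
;; because the set of available charsets is open.
(def marker-free-encoding
  {"UnicodeLittle" "UTF-16LE"
   "UTF-16"        "UTF-16BE"})

(defn append-text
  "Appends `s` to `path`. The requested encoding is used as-is while
  the file is missing or empty; afterwards the marker-free counterpart
  from the table is used, if one is known."
  [path s encoding]
  (let [f      (io/file path)
        fresh? (zero? (.length f)) ; length is 0 for a missing file
        enc    (if fresh?
                 encoding
                 (get marker-free-encoding encoding encoding))]
    (with-open [w (io/writer f :append true :encoding enc)]
      (.write w ^String s))))
```

With this sketch, two calls such as (append-text "/foo.txt" "Line 1\n" "UnicodeLittle") followed by the same call for "Line 2\n" leave a single marker at the start of the file.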