I have found a performance issue when using TextIO.StreamIO.input1 to read a functional stream. Looking at gc/non-gc times and using PolyML.profiling, it appears that garbage collection accounts for most of the time. There is some code below to demonstrate with stats that include comparison with SML/NJ.
The profiling shows that readFromReader in basis/BasicStreamIO.sml is responsible for creating values that are being garbage collected. Looking at this code, I can see various things that would contribute to this garbage collection but nothing that is obviously problematic. Is it simply the case that overheads in the implementation mean that it is not suitable for a large number of small reads?
Thanks, Phil
(* `makeTextFile (filename, length)` generates a text file *)
(* called `filename` of length `length` where every character *)
(* is the digit '0'. *)
local
  val l = 2048;
  val s = implode (List.tabulate (l, fn _ => #"0"));
in
  fun makeTextFile (filename, length) =
    let
      open TextIO;
      val ostream = getOutstream (openOut filename);
      fun write n =
        if n < l
        then StreamIO.output (ostream, String.substring (s, 0, n))
        else (StreamIO.output (ostream, s); write (n - l));
    in
      write length;
      StreamIO.closeOut ostream
    end;
end;
(* `readTextFile filename` reads the file called `filename` *)
(* one character at a time using a functional input stream *)
(* and returns the time taken to read it. *)
fun readTextFile filename =
  let
    open TextIO;
    val src = getInstream (openIn filename);
    fun read src =
      case StreamIO.input1 src of
        SOME (_, src') => read src'
      | NONE => ();
    val startTime = Timer.startCPUTimer ();
  in
    read src;
    (fn {sys, usr} => Time.+ (sys, usr)) (Timer.checkCPUTimer startTime)
      before StreamIO.closeIn src
  end;
(* Create files and then read *)
makeTextFile ("text-01MB", 1024 * 1024 * 1);
makeTextFile ("text-02MB", 1024 * 1024 * 2);
makeTextFile ("text-03MB", 1024 * 1024 * 3);
makeTextFile ("text-04MB", 1024 * 1024 * 4);
makeTextFile ("text-05MB", 1024 * 1024 * 5);

readTextFile "text-01MB";
readTextFile "text-02MB";
readTextFile "text-03MB";
readTextFile "text-04MB";
readTextFile "text-05MB";
(* read times in seconds on Intel Core 2 Duo CPU T7500 @ 2.20GHz, 2GB *)
(*
            Poly/ML 4.1.3   Poly/ML 5.2   SML/NJ 110.52
text-01MB   0.7 - 0.8       0.3 - 1.5     0.03 - 0.05
text-02MB   2.2 - 2.3       1.9 - 2.8     0.06 - 0.09
text-03MB   4.3 - 4.5       2.0 - 4.3     0.08 - 0.14
text-04MB   7.2 - 7.5       4.2 - 6.6     0.11 - 0.16
text-05MB   11.0 - 11.1     4.0 - 5.8     0.14 - 0.18
*)
(* for large files, Poly/ML 4.1.3 can run out of store: *)
makeTextFile ("text-40MB", 1024 * 1024 * 40);
readTextFile "text-40MB";
(*
            Poly/ML 4.1.3   Poly/ML 5.2   SML/NJ 110.52
text-40MB   out of store    101.4         1.1 - 1.2

4.1.3 gives the following message:

  Run out of store - interrupting console processes
  Exception- Interrupt raised
*)
Philip Clayton wrote:
> [...] Is it simply the case that overheads in the implementation mean that it is not suitable for a large number of small reads?
I think I need to look again at the functional IO part of Poly/ML's basis library.
The idea of functional IO is that a stream should be repeatable, i.e. if a stream, f, has returned some data then re-reading from f should return the same data. The definition of functional IO in the basis library that I used when implementing this in Poly/ML had a number of program snippets that implied that it was not just the content that had to be repeatable but also the way the content was broken down. So, if the stream was read using "input1" to return a single character then a subsequent call to "input" on that same functional stream must return precisely one character:
val str = getInstream(openIn "/tmp/abc");
val str = ? : TextIO.StreamIO.instream
StreamIO.input1 str;
val it = SOME (#"0", ?) : (TEXT_STREAM_IO.elem * TextIO.StreamIO.instream) option
StreamIO.input str;
val it = ("0", ?) : TEXT_STREAM_IO.vector * TextIO.StreamIO.instream
However, I think many of these examples were left out when the book was published. Although it doesn't seem to be stated formally, I think the idea is that only the content needs to be repeatable. So "input" should return a string whose first character is the same as that returned by "input1" but whose length is unspecified. This, at least, seems to be what SML/NJ does. The problem in your example appears to be that Poly/ML builds up an enormous stream of single-character elements, and that this overwhelms the storage management.
Although the basis library defines imperative IO in terms of functional IO the implementation in Poly/ML is different so that it doesn't suffer from these problems.
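To make the "only the content needs to be repeatable" reading concrete, here is a minimal sketch. It assumes a string-backed stream created with TextIO.openString so that no file is needed; the helper `first` and the value names are purely illustrative. Note that it deliberately checks only the first character of what "input" returns, since the length of that string is exactly what is left unspecified.

```sml
(* Illustrative helper: read one element or fail loudly. *)
fun first strm =
  case TextIO.StreamIO.input1 strm of
    SOME (c, rest) => (c, rest)
  | NONE => raise Fail "unexpected end of stream";

(* A functional stream value always denotes the same position. *)
val s0 = TextIO.getInstream (TextIO.openString "abc");

(* Reading from s0 twice yields the same element both times. *)
val (c1, s1) = first s0;
val (c2, _) = first s0;

(* "input" on s0 returns a non-empty chunk starting at the same
   position; its length is not relied upon here. *)
val (chunk, _) = TextIO.StreamIO.input s0;

(* Reading from the stream returned by the first call moves on. *)
val (c3, _) = first s1;

val contentRepeatable =
  c1 = #"a" andalso c2 = c1
  andalso String.sub (chunk, 0) = c1
  andalso c3 = #"b";
```

Under either interpretation discussed above, `contentRepeatable` should be true; the two interpretations differ only in how long `chunk` is.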
David
An interesting question! I don't think your interpretation was what we intended - the input function was supposed to return whatever content was available at the time of the call. Given that Andrew and John were the main instigators of the functional IO subsystem and they wrote the SML/NJ version, it would seem safe to go with the SML/NJ behaviour.
Best,
Dave.
At 12:49 01/09/2008, David Matthews wrote:
> [...] So, if the stream was read using "input1" to return a single character then a subsequent call to "input" on that same functional stream must return precisely one character.
Philip Clayton wrote:
> [...] Is it simply the case that overheads in the implementation mean that it is not suitable for a large number of small reads?
I've finished redoing the functional IO code and I think I've eliminated most of the inefficiencies. The new version is now in CVS. Could you rerun your tests and let me know if it all works satisfactorily?
The new version is actually faster than using TextIO.input1 because TextIO.input1 has to lock a mutex to be thread-safe whereas TextIO.StreamIO.input1, being functional, only needs to lock a mutex when reading a block.
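For comparison, the two styles of single-character reading look roughly like this. This is only a sketch of the two interfaces, not a benchmark: the function names are made up for illustration, and a string-backed stream stands in for a file. The imperative loop calls TextIO.input1 on a shared, mutable stream, while the functional loop threads the stream value returned by TextIO.StreamIO.input1 through the recursion.

```sml
(* Count characters using the imperative interface. *)
fun countImperative (strm : TextIO.instream) =
  let
    fun loop n =
      case TextIO.input1 strm of
        SOME _ => loop (n + 1)
      | NONE => n
  in
    loop 0
  end;

(* Count characters using the functional interface; each call
   returns the rest of the stream as a new value. *)
fun countFunctional (strm : TextIO.StreamIO.instream) =
  let
    fun loop (s, n) =
      case TextIO.StreamIO.input1 s of
        SOME (_, s') => loop (s', n + 1)
      | NONE => n
  in
    loop (strm, 0)
  end;

val text = "0000000000";
val nImperative = countImperative (TextIO.openString text);
val nFunctional =
  countFunctional (TextIO.getInstream (TextIO.openString text));
```

Both counts should equal the length of `text`; any timing difference between the two loops would have to be measured on a real file, as in the tests earlier in the thread.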
David
Thanks for sorting that out - that has certainly solved the problem!
I have rerun the tests using the latest CVS snapshot and the results are below.
Phil
(* read times in seconds on Intel Core 2 Duo CPU T7500 @ 2.20GHz, 2GB *)
(*
            Poly/ML 4.1.3   Poly/ML 5.2   SML/NJ 110.52   Poly/ML latest
text-01MB   0.7 - 0.8       0.3 - 1.5     0.03 - 0.05     0.035 - 0.075
text-02MB   2.2 - 2.3       1.9 - 2.8     0.06 - 0.09     0.067 - 0.096
text-03MB   4.3 - 4.5       2.0 - 4.3     0.08 - 0.14     0.107 - 0.147
text-04MB   7.2 - 7.5       4.2 - 6.6     0.11 - 0.16     0.135 - 0.170
text-05MB   11.0 - 11.1     4.0 - 5.8     0.14 - 0.18     0.191 - 0.233

text-40MB   out of store    101.4         1.1 - 1.2       1.54 - 1.57
*)