Hi Peter, Yes, it doesn't look very efficient, particularly as it uses Array.sub and Array.update that check the array bound on every access.
move_words is normally implemented inline by the code-generator. The assembly code is only used as a fall-back in special circumstances. The version produced by the code-generator does not deal correctly with overlapping arrays.
I wonder if the best solution to this is to have two different RTS calls, one that deals with overlapping arrays and one that doesn't, similar to memcpy/memmove. They could both be implemented by the same assembly code but only the non-overlapping case would be implemented in the code-generator. There would then be a small, constant, overhead of a function call for ArraySlice.copy compared with Array.copy.
There's always a question about how much effort to expend in producing the optimum version of library functions. It's possible to spend a lot of time implementing a function in the most efficient way but if no one uses it it's wasted effort. Thanks for tracking this down. Regards, David
On 20/06/2013 11:06, Peter Lammich wrote:
Hi all,
(Referring to "PolyML 5.5.0 release")
I am trying to copy the initial part of an array into another array, something like OCamls: "Array.blit src 0 dst 0 len"
In PolyML, I have Array.copy, that copies the whole array, i.e., fixes len=length src, which is to restrictive for my purpose.
I can solve my problem with "ArraySlice.copy {src=ArraySlice.slice(src,0,SOME len),di=0,dst=dst}"
However, this solution has very poor performance. Looking at the code, it becomes clear why: Array.copy is implemented natively by "move_words", however, ArraySlice.copy is implemented as a recursive function inside ML. This is a performance difference of roughly two orders of magnitude!
There is a comment on ArraySlice.copy that "We can't use System_move_words because of the potential overlap problem."
However, looking at the implementations of move_words, I find:
x86asm.asm: ;# Must deal with the case of overlapping segments correctly. (which the implementation there does)
run_time.cpp: memmove(dest, source, words*sizeof(PolyWord)); /* must work for overlapping segments. */ (which it does according to C++ specification)
Nevertheless, a quick test reveals that System_move_words does not work for overlapping memory! What am I missing?
So, this results in a few questions:
Is there a fast solution two my problem within the SML-standard, where I can even assume that src and dst are different arrays?
Can ArraySlice.copy be patched to use System_move_words after a check that the arrays are actually not the same?
Where is the code executed by System_move_words, and can this be fixed to correctly handle overlapping segments?
Best, Peter
polyml mailing list polyml at inf.ed.ac.uk http://lists.inf.ed.ac.uk/mailman/listinfo/polyml