Hi all,
(Referring to "PolyML 5.5.0 release")
I am trying to copy the initial part of an array into another array, something like OCamls: "Array.blit src 0 dst 0 len"
In PolyML, I have Array.copy, that copies the whole array, i.e., fixes len=length src, which is to restrictive for my purpose.
I can solve my problem with "ArraySlice.copy {src=ArraySlice.slice(src,0,SOME len),di=0,dst=dst}"
However, this solution has very poor performance. Looking at the code, it becomes clear why: Array.copy is implemented natively by "move_words", however, ArraySlice.copy is implemented as a recursive function inside ML. This is a performance difference of roughly two orders of magnitude!
There is a comment on ArraySlice.copy that "We can't use System_move_words because of the potential overlap problem."
However, looking at the implementations of move_words, I find:
x86asm.asm: ;# Must deal with the case of overlapping segments correctly. (which the implementation there does)
run_time.cpp: memmove(dest, source, words*sizeof(PolyWord)); /* must work for overlapping segments. */ (which it does according to C++ specification)
Nevertheless, a quick test reveals that System_move_words does not work for overlapping memory! What am I missing?
So, this results in a few questions: 1) Is there a fast solution two my problem within the SML-standard, where I can even assume that src and dst are different arrays?
2) Can ArraySlice.copy be patched to use System_move_words after a check that the arrays are actually not the same?
3) Where is the code executed by System_move_words, and can this be fixed to correctly handle overlapping segments?
Best, Peter
Hi Peter, Yes, it doesn't look very efficient, particularly as it uses Array.sub and Array.update that check the array bound on every access.
move_words is normally implemented inline by the code-generator. The assembly code is only used as a fall-back in special circumstances. The version produced by the code-generator does not deal correctly with overlapping arrays.
I wonder if the best solution to this is to have two different RTS calls, one that deals with overlapping arrays and one that doesn't, similar to memcpy/memmove. They could both be implemented by the same assembly code but only the non-overlapping case would be implemented in the code-generator. There would then be a small, constant, overhead of a function call for ArraySlice.copy compared with Array.copy.
There's always a question about how much effort to expend in producing the optimum version of library functions. It's possible to spend a lot of time implementing a function in the most efficient way but if no one uses it it's wasted effort. Thanks for tracking this down. Regards, David
On 20/06/2013 11:06, Peter Lammich wrote:
Hi all,
(Referring to "PolyML 5.5.0 release")
I am trying to copy the initial part of an array into another array, something like OCamls: "Array.blit src 0 dst 0 len"
In PolyML, I have Array.copy, that copies the whole array, i.e., fixes len=length src, which is to restrictive for my purpose.
I can solve my problem with "ArraySlice.copy {src=ArraySlice.slice(src,0,SOME len),di=0,dst=dst}"
However, this solution has very poor performance. Looking at the code, it becomes clear why: Array.copy is implemented natively by "move_words", however, ArraySlice.copy is implemented as a recursive function inside ML. This is a performance difference of roughly two orders of magnitude!
There is a comment on ArraySlice.copy that "We can't use System_move_words because of the potential overlap problem."
However, looking at the implementations of move_words, I find:
x86asm.asm: ;# Must deal with the case of overlapping segments correctly. (which the implementation there does)
run_time.cpp: memmove(dest, source, words*sizeof(PolyWord)); /* must work for overlapping segments. */ (which it does according to C++ specification)
Nevertheless, a quick test reveals that System_move_words does not work for overlapping memory! What am I missing?
So, this results in a few questions:
Is there a fast solution two my problem within the SML-standard, where I can even assume that src and dst are different arrays?
Can ArraySlice.copy be patched to use System_move_words after a check that the arrays are actually not the same?
Where is the code executed by System_move_words, and can this be fixed to correctly handle overlapping segments?
Best, Peter
polyml mailing list polyml at inf.ed.ac.uk http://lists.inf.ed.ac.uk/mailman/listinfo/polyml
Hi Matt. Thanks for the quick answer.
What's about
- Can ArraySlice.copy be patched to use System_move_words after a check that the arrays are actually not the same?
This would only be a minor change.
Is the patch below safe?
fun copy {src = Slice{array=s, start=srcStart, length=srcLen}, dst, di: int} = if di < 0 orelse di+srcLen > Array.length dst then raise General.Subscript * else if s <> dst then * System_move_words(RunCall.unsafeCast s, srcStart+1, RunCall.unsafeCast dst, di+1, srcLen) else (* We can't use System_move_words because of the potential overlap problem. *) let fun copyUp n = if n = srcLen then () else (Array.update(dst, n+di, Array.sub(s, n+srcStart)); copyUp(n+1))
and copyDown n = if n < 0 then () else (Array.update(dst, n+di, Array.sub(s, n+srcStart)); copyDown(n-1)) in if di > srcStart then copyDown(srcLen-1) else copyUp 0 end
Best, Peter
I've just committed the change I suggested. Give it a try and let me know how it works. Regards, David
On 20/06/2013 12:27, Peter Lammich wrote:
Hi Matt. Thanks for the quick answer.
What's about
2) Can ArraySlice.copy be patched to use System_move_words after a check that the arrays are actually not the same?
This would only be a minor change.
Is the patch below safe?
fun copy {src = Slice{array=s, start=srcStart, length=srcLen}, dst, di: int} = if di < 0 orelse di+srcLen > Array.length dst then raise General.Subscript
else if s <> dst then
System_move_words(RunCall.unsafeCast s, srcStart+1, RunCall.unsafeCast dst, di+1, srcLen) else (* We can't use System_move_words because of the potential overlap problem. *) let fun copyUp n = if n = srcLen then () else (Array.update(dst, n+di, Array.sub(s, n+srcStart)); copyUp(n+1)) and copyDown n = if n < 0 then () else (Array.update(dst, n+di, Array.sub(s, n+srcStart)); copyDown(n-1)) in if di > srcStart then copyDown(srcLen-1) else copyUp 0 end
Best, Peter
On Do, 2013-06-20 at 12:30 +0100, David Matthews wrote:
I've just committed the change I suggested. Give it a try and let me know how it works.
Fast like hell! For big arrays, it has the same speed as Array.copy.
Now, the following function is efficient:
fun blit (src,si,dst,di,len) = ArraySlice.copy { src = ArraySlice.slice(src,si,SOME len), dst = dst, di = di }
Thanks a lot, Peter
Regards, David
On 20/06/2013 12:27, Peter Lammich wrote:
Hi Matt. Thanks for the quick answer.
What's about
2) Can ArraySlice.copy be patched to use System_move_words after a check that the arrays are actually not the same?
This would only be a minor change.
Is the patch below safe?
fun copy {src = Slice{array=s, start=srcStart, length=srcLen}, dst, di: int} = if di < 0 orelse di+srcLen > Array.length dst then raise General.Subscript
else if s <> dst then
System_move_words(RunCall.unsafeCast s, srcStart+1, RunCall.unsafeCast dst, di+1, srcLen) else (* We can't use System_move_words because of the potential overlap problem. *) let fun copyUp n = if n = srcLen then () else (Array.update(dst, n+di, Array.sub(s, n+srcStart)); copyUp(n+1)) and copyDown n = if n < 0 then () else (Array.update(dst, n+di, Array.sub(s, n+srcStart)); copyDown(n-1)) in if di > srcStart then copyDown(srcLen-1) else copyUp 0 end
Best, Peter