[polyml] Segmentation Fault When Porting

18 Jan 2016


James,
I've managed to set up a big-endian mips debian virtual machine using 
qemu inside a virtual debian machine in virtualbox on Windows.  Despite 
all the layers of virtualisation it works and more importantly Poly/ML 
actually builds successfully.  It does crash with some larger examples, 
such as Tests/Succeed/Test133.ML, and I've seen some other crashes in 
the garbage-collector.  I suspect that there is a problem with 
endian-ness somewhere but it may be possible to narrow this down with gdb.
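
To illustrate the kind of endian-ness problem suspected here, the following is a minimal sketch and not code from the Poly/ML sources; all the helper names are hypothetical. It shows that the same four bytes give a different word depending on whether they are read with a plain host-order load or with an explicit byte order, which is how data written under one assumption can be misread on a big-endian machine.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only; none of these names
// come from the Poly/ML runtime.

// True when the host stores the most significant byte first.
inline bool hostIsBigEndian() {
    const std::uint32_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);   // inspect the lowest-addressed byte
    return first == 0;
}

// Read four bytes exactly as a plain word load on the host would.
inline std::uint32_t loadHost32(const unsigned char *p) {
    std::uint32_t w;
    std::memcpy(&w, p, sizeof w);
    return w;
}

// Read four bytes as a big-endian word, independent of the host.
inline std::uint32_t loadBigEndian32(const unsigned char *p) {
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8)  |  std::uint32_t(p[3]);
}

// Read four bytes as a little-endian word, independent of the host.
inline std::uint32_t loadLittleEndian32(const unsigned char *p) {
    return (std::uint32_t(p[3]) << 24) | (std::uint32_t(p[2]) << 16) |
           (std::uint32_t(p[1]) << 8)  |  std::uint32_t(p[0]);
}
```

Code that reads multi-byte values through the explicit loaders behaves the same on mips and mipsel; code that mixes byte-wise access with loadHost32-style word access is the sort of place worth inspecting in gdb.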
Regards,
David
On 16/01/2016 17:01, James Clarke wrote:
...
Hi David,
I just tried building on mipsel, and that compiles and passes the test suite with the same compiler flags. Endianness is looking like a strong candidate, given that the only architectures it fails on are big-endian, although compiler optimisations are "responsible". I shall see if a very old version works on big-endian mips; if so, I will try and do a git bisect, otherwise it might have to be some painful debugging.
Regards,
James
...
On 15 Jan 2016, at 11:59, James Clarke <jrtc27 at jrtc27.com> wrote:
They are all big-endian. I haven't tried mipsel; that could help narrow it down. One thing making me not so sure it's an endianness issue is that you support 32-bit PowerPC, and that runs properly. Also the mips builds are broken by GCC's optimisations; adding -fno-omit-frame-pointer made it work for some reason, if I remember correctly.
James
...
On 15 Jan 2016, at 11:29, David Matthews <David.Matthews at prolingua.co.uk> wrote:
I wish I could help but there's not much I can suggest.  The only idea that occurs to me is that there is some endian-ness issue that has crept in.  Are these little-endian or big-endian?  In theory the interpreter should work on both big-endian and little-endian but I've only tested the most recent version on X86.  Have a look at an earlier version of Poly/ML and see if you have any more success with that.
David
...
On 12/01/2016 14:52, James Clarke wrote:
Hi,
I've been trying to port Poly/ML to mips and IBM's S/390 (the 64-bit version, often referred to as s390x). For both, I tried just adding an extra case in configure.ac, along with corresponding HOSTARCHITECTURE macros and cases in libpolyml/elfexport.cpp. However, these all seem to segfault when polyimport is run when building (both with 5.5.2 and git commit ee26375, "Merge branch 'PICTest'"). I can't seem to get a meaningful stack trace out of the mips segfault, but it crashes just after "Use: basis/Socket.sml" is printed. However, on s390x, it crashes before anything is printed, and valgrind gave me the following (with no errors before this point) when running ee26375's polyimport:
==16138== Thread 3:
==16138== Invalid read of size 8
==16138==    at 0x489EA50: Offset (globals.h:315)
==16138==    by 0x489EA50: GetConstSegmentForCode (globals.h:344)
==16138==    by 0x489EA50: GetConstSegmentForCode (globals.h:350)
==16138==    by 0x489EA50: ConstPtrForCode (globals.h:355)
==16138==    by 0x489EA50: buildStackList(TaskData*, PolyWord*, PolyWord*) (run_time.cpp:413)
==16138==    by 0x489EC87: exceptionToTraceException(TaskData*, SaveVecEntry*) (run_time.cpp:471)
==16138==    by 0x48AC9ED: IntTaskData::SwitchToPoly() (interpret.cpp:877)
==16138==    by 0x48ACC33: IntTaskData::EnterPolyCode() (interpret.cpp:1428)
==16138==    by 0x489324D: NewThreadFunction(void*) (processes.cpp:1128)
==16138==    by 0x48E591D: start_thread (pthread_create.c:335)
==16138==    by 0x4C8CEA9: ??? (in /lib/s390x-linux-gnu/libc-2.21.so)
==16138==  Address 0xe000000005ab5b38 is not stack'd, malloc'd or (recently) free'd
==16138==
==16138==
==16138== Process terminating with default action of signal 11 (SIGSEGV)
==16138==  Access not within mapped region at address 0xE000000005AB5000
==16138==    at 0x489EA50: Offset (globals.h:315)
==16138==    by 0x489EA50: GetConstSegmentForCode (globals.h:344)
==16138==    by 0x489EA50: GetConstSegmentForCode (globals.h:350)
==16138==    by 0x489EA50: ConstPtrForCode (globals.h:355)
==16138==    by 0x489EA50: buildStackList(TaskData*, PolyWord*, PolyWord*) (run_time.cpp:413)
==16138==    by 0x489EC87: exceptionToTraceException(TaskData*, SaveVecEntry*) (run_time.cpp:471)
==16138==    by 0x48AC9ED: IntTaskData::SwitchToPoly() (interpret.cpp:877)
==16138==    by 0x48ACC33: IntTaskData::EnterPolyCode() (interpret.cpp:1428)
==16138==    by 0x489324D: NewThreadFunction(void*) (processes.cpp:1128)
==16138==    by 0x48E591D: start_thread (pthread_create.c:335)
==16138==    by 0x4C8CEA9: ??? (in /lib/s390x-linux-gnu/libc-2.21.so)
==16138==  If you believe this happened as a result of a stack
==16138==  overflow in your program's main thread (unlikely but
==16138==  possible), you can try to increase the size of the
==16138==  main thread stack using the --main-stacksize= flag.
==16138==  The main thread stack size used in this run was 8388608.
(the ??? for libc is because valgrind does not yet understand compressed debug info; I removed a whole load of warnings to that effect)
Have you ever come across anything like this? Do you have any thoughts for where to start with hunting this down?
Regards,
James Clarke

polyml mailing list
polyml at inf.ed.ac.uk
http://lists.inf.ed.ac.uk/mailman/listinfo/polyml

