Martin von Gagern wrote:
Addition to "Segfault on SMP Linux kernels": I get those problems on the /third/ call to ReserveAddressSpace: (compiled with -O0, otherwise ReserveAddressSpace will be inlined)
#0 ReserveAddressSpace (addr=0x3f000000 <Address 0x3f000000 out of bounds>, len=8192) at mmap.c:364
#1 0x080591fe in ReserveMLSpace (bottom=0x3f000000 <Address 0x3f000000 out of bounds>, top=0x3f002000 <Address 0x3f002000 out of bounds>) at mmap.c:378
#2 0x080592f1 in ReserveMLSpaces () at mmap.c:406
#3 0x080521f3 in main (argc=1, argv=0xbfffe4f4) at mpoly.c:710
Thanks for that information. It does help to narrow down the problem but doesn't, unfortunately, give me an immediate solution.
It looks as though some recent changes to the Linux kernel have changed the addresses used by various items, and that is causing a conflict with Poly/ML. The problem arises because Poly/ML really needs to be able to load the database(s) at a specific address. The databases contain all the ML data structures that make up the program and data, and these inevitably contain the absolute addresses of other items. That means a database has to be loaded at the same address at which it was created; otherwise, every time it was loaded, the driver program would have to work through the whole database relocating addresses.
The database also contains the addresses of items in a vector (the "IO area") which are set at run-time to entry points in the run-time system. Again these have to be fixed addresses, so, for example, 0x3f000140 is the address of the "commit" function, and it looks as though the problem is with allocating this vector. It's compounded because in Unix the mmap function, unlike the equivalent in Windows, doesn't return an error if asked to allocate at a specific address (with MAP_FIXED) when there is something else, perhaps a dynamic library, already loaded there. It simply overwrites what's there, and I guess it's this that is causing the seg fault.
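To illustrate the mmap behaviour described above, here is a minimal sketch in C. It assumes the run-time system passes MAP_FIXED, which the description suggests but which I have not checked against mmap.c. The function name fixed_mapping_clobbers is hypothetical and is not part of Poly/ML; an anonymous mapping stands in for the dynamic library.

```c
#define _GNU_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper, not part of Poly/ML.
 * Returns 1 if a MAP_FIXED request silently replaced an existing mapping,
 * 0 if it did not, and -1 if either mmap call failed. */
int fixed_mapping_clobbers(size_t len)
{
    assert(len > 0);

    /* Stand-in for "something else already loaded there": let the kernel
     * pick an address and leave a marker byte in the mapping. */
    unsigned char *first = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (first == MAP_FAILED)
        return -1;
    first[0] = 0xAB;

    /* Request exactly the same address again, this time with MAP_FIXED. */
    unsigned char *second = mmap(first, len, PROT_READ | PROT_WRITE,
                                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (second == MAP_FAILED) {
        munmap(first, len);
        return -1;
    }

    /* No error was reported; the old contents are gone and the page reads as zero. */
    int clobbered = (second == first) && (second[0] == 0);
    munmap(second, len);
    return clobbered;
}
```

On Linux, fixed_mapping_clobbers(8192) should return 1: the second call succeeds at the same address and the marker byte has been discarded. One way to detect a conflict instead would be to drop MAP_FIXED, so that the address is only a hint, and compare the returned pointer with the requested one.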
The addresses for various operating systems are hard-wired into the program in the addresses.h file. It is possible to change them and then run the disc garbage collector (poly -d) on the database, which will relocate the database to the new addresses.
It would help if people who have had these crashes could send me the output of "cat /proc/self/maps". This will show the virtual address space used by the program, in this case "cat" itself, and I may be able to find an area that will be safe.
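The same information can be checked from inside a program. The following is only a sketch in C: the function name range_is_mapped is hypothetical, and it assumes each line of /proc/self/maps begins with "start-end" in hexadecimal, as it does on Linux. It reports whether a candidate range overlaps anything already mapped in the calling process.

```c
#define _GNU_SOURCE /* exposes getline under strict -std= modes */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of Poly/ML.
 * Returns 1 if the half-open range [lo, hi) overlaps an existing mapping
 * of the calling process, 0 if it is free, and -1 if /proc/self/maps
 * cannot be opened. */
int range_is_mapped(unsigned long lo, unsigned long hi)
{
    assert(lo < hi);

    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    char *line = NULL;
    size_t cap = 0;
    int found = 0;

    /* getline reads whole lines, however long the trailing path name is. */
    while (getline(&line, &cap, maps) != -1) {
        unsigned long start, end;
        if (sscanf(line, "%lx-%lx", &start, &end) != 2)
            continue; /* skip anything that is not "start-end ..." */
        if (start < hi && lo < end) { /* the two ranges intersect */
            found = 1;
            break;
        }
    }

    free(line);
    fclose(maps);
    return found;
}
```

For the addresses in the backtrace, the call would be range_is_mapped(0x3f000000, 0x3f002000). A result of 1 before the run-time system has reserved anything would suggest that something else already occupies the IO area.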
David.