Hello!
I know this problem was already mentioned in two threads, but I have additional information.
Addition to "Segfault on SMP Linux kernels": I get those problems on the /third/ call to ReserveAddressSpace (compiled with -O0, otherwise ReserveAddressSpace will be inlined):
#0 ReserveAddressSpace (addr=0x3f000000 <Address 0x3f000000 out of bounds>, len=8192) at mmap.c:364
#1 0x080591fe in ReserveMLSpace (bottom=0x3f000000 <Address 0x3f000000 out of bounds>, top=0x3f002000 <Address 0x3f002000 out of bounds>) at mmap.c:378
#2 0x080592f1 in ReserveMLSpaces () at mmap.c:406
#3 0x080521f3 in main (argc=1, argv=0xbfffe4f4) at mpoly.c:710
Addition to "Seg. fault - 2.6.9 kernel": I have a plain vanilla 2.6.9, no patches. Still this does not work for me.
I am running Gentoo; you may want to keep an eye on the related Gentoo bug report as well: http://bugs.gentoo.org/show_bug.cgi?id=35548
I only had a rather brief look at the source code, but is there a special reason why the mmapped area should be at a fixed address?
Greetings, Martin von Gagern
Martin von Gagern wrote:
Addition to "Segfault on SMP Linux kernels": I get those problems on the /third/ call to ReserveAddressSpace (compiled with -O0, otherwise ReserveAddressSpace will be inlined):
#0 ReserveAddressSpace (addr=0x3f000000 <Address 0x3f000000 out of bounds>, len=8192) at mmap.c:364
#1 0x080591fe in ReserveMLSpace (bottom=0x3f000000 <Address 0x3f000000 out of bounds>, top=0x3f002000 <Address 0x3f002000 out of bounds>) at mmap.c:378
#2 0x080592f1 in ReserveMLSpaces () at mmap.c:406
#3 0x080521f3 in main (argc=1, argv=0xbfffe4f4) at mpoly.c:710
Thanks for that information. It does help to narrow down the problem but doesn't, unfortunately, give me an immediate solution.
It looks as though some recent changes to the Linux kernel have changed the addresses used by various items, and that is causing a conflict with Poly/ML. The problem arises because Poly/ML really needs to be able to load the database(s) at a specific address. The databases contain all the ML data structures that make up the program and its data, and these inevitably contain the absolute addresses of other items. That means the database has to be loaded at the same address where it was created; otherwise, every time it was loaded, the driver program would have to work through the whole database relocating addresses.
The database also contains the addresses of items in a vector (the "IO area") which are set at run-time to entry points to the run-time system. Again these have to be fixed addresses, so, for example, 0x3f000140 is the address of the "commit" function, and it looks as though the problem is with allocating this vector. It's compounded because in Unix the mmap function, unlike the equivalent in Windows, doesn't return an error if asked to allocate at a specific address (with MAP_FIXED) when there is something else, perhaps a dynamic library, already loaded there. It simply overwrites what's there, and I guess it's this that is causing the seg fault.
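
One way to notice such a conflict instead of silently overwriting an existing mapping is to pass the wanted address to mmap as a hint only, without MAP_FIXED, and compare the address that comes back. The following is a minimal sketch of that idea; reserve_exact is a hypothetical helper and is not taken from Poly/ML's mmap.c.

```c
#define _DEFAULT_SOURCE        /* makes MAP_ANONYMOUS visible under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper, not part of Poly/ML.
 * Try to reserve len bytes exactly at addr without replacing anything.
 * Returns addr on success, or NULL if the mapping could not be placed there. */
static void *reserve_exact(void *addr, size_t len)
{
    assert(len > 0);
    /* Without MAP_FIXED the address is only a hint, so the kernel never
     * replaces a mapping that already occupies the range. */
    void *p = mmap(addr, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    if (p != addr) {
        /* The kernel picked a different place: the requested range was not
         * available, so undo the mapping and report failure. */
        munmap(p, len);
        return NULL;
    }
    return p;
}
```

A caller could try reserve_exact((void *)0x3f000000, 8192) and report a clear error when it returns NULL. Whether this fits the way the run-time system sets up its address spaces is an assumption that would need checking against the real code.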
The addresses for various operating systems are hard-wired into the program in the addresses.h file. It is possible to change them and then run the disc garbage collector (poly -d ) on the database, which relocates the database.
It would help if people who have had these crashes could send me the output of
cat /proc/self/maps
This will show the virtual address space used by the program, in this case "cat" itself, and I may be able to find an area that will be safe.
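
The same information can be read from inside a program. As a rough sketch, assuming the usual Linux maps format in which every line starts with "start-end" in hexadecimal, the hypothetical helper range_in_use below reports whether a candidate range overlaps any existing mapping.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if [lo, hi) overlaps a mapping listed in the
 * maps file at path, 0 if no listed mapping overlaps it, and -1 if the file
 * cannot be opened. */
static int range_in_use(const char *path, unsigned long lo, unsigned long hi)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    char line[512];
    int used = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        unsigned long start, end;
        /* Skip anything that does not begin with a "start-end" pair. */
        if (sscanf(line, "%lx-%lx", &start, &end) != 2)
            continue;
        /* Two half-open ranges overlap when each starts before the other ends. */
        if (start < hi && lo < end) {
            used = 1;
            break;
        }
    }
    fclose(f);
    return used;
}
```

For example, range_in_use("/proc/self/maps", 0x3f000000UL, 0x3f002000UL) checks the range from the backtrace against the calling process. The answer is only a snapshot, since libraries loaded later can still land in the range.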
David.
David Matthews wrote:
It would help if people who have had these crashes could send me the output of cat /proc/self/maps
08048000-0804c000 r-xp 00000000 03:03 2427817 /bin/cat
0804c000-0804d000 rw-p 00003000 03:03 2427817 /bin/cat
0804d000-0806e000 rw-p 0804d000 00:00 0
49bfb000-49c0f000 r-xp 00000000 03:03 309512 /lib/ld-2.3.4.so
49c0f000-49c10000 rw-p 00013000 03:03 309512 /lib/ld-2.3.4.so
4a342000-4a446000 r-xp 00000000 03:03 310146 /lib/libc-2.3.4.so
4a446000-4a449000 rw-p 00104000 03:03 310146 /lib/libc-2.3.4.so
4a449000-4a44c000 rw-p 4a449000 00:00 0
b7da8000-b7da9000 r--p 004c1000 03:03 865118 /usr/.../locale-archive
b7da9000-b7ddd000 r--p 0048b000 03:03 865118 /usr/.../locale-archive
b7ddd000-b7de4000 r--p 0046c000 03:03 865118 /usr/.../locale-archive
b7de4000-b7fe4000 r--p 00000000 03:03 865118 /usr/.../locale-archive
b7fe4000-b7fe5000 rw-p b7fe4000 00:00 0
b7fff000-b8000000 rw-p b7fff000 00:00 0
bfffe000-c0000000 rw-p bfffe000 00:00 0
ffffe000-fffff000 ---p 00000000 00:00 0
I have the same problem on 2.6.9 without SMP.
Would it be possible to let the linker reserve some appropriate memory and then use this as the base address where to mmap the files?
How much overhead would be inflicted if every pointer in the mmapped data was indirect, i.e. a pointer difference relative to the beginning of the mmapped area?
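
To make the question concrete, here is a minimal sketch of what I mean by indirect pointers. The cell layout and all names (rel_ptr, deref, to_rel, sum_list) are made up for illustration and are not Poly/ML's actual data structures. Links are stored as byte offsets from the start of the region, so the same bytes work wherever the region ends up being mapped, at the cost of one extra addition on each dereference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: a link is a byte offset from the region base rather
 * than an absolute address. */
typedef uint32_t rel_ptr;
#define REL_NULL ((rel_ptr)0)   /* offset 0 is reserved to mean "no cell" */

struct cell {
    int32_t value;
    rel_ptr next;
};

/* Turn an offset into a usable pointer: one addition per dereference. */
static struct cell *deref(void *base, rel_ptr p)
{
    return p == REL_NULL ? NULL : (struct cell *)((char *)base + p);
}

/* Turn a pointer inside the region back into an offset. */
static rel_ptr to_rel(void *base, struct cell *c)
{
    if (c == NULL)
        return REL_NULL;
    assert((char *)c > (char *)base);
    return (rel_ptr)((char *)c - (char *)base);
}

/* Sum the values of a list by following relative links. */
static long sum_list(void *base, rel_ptr head)
{
    long total = 0;
    for (struct cell *c = deref(base, head); c != NULL; c = deref(base, c->next))
        total += c->value;
    return total;
}
```

This shows only the mechanism. It does not measure the cost, which would depend on how often the run-time system follows pointers and on whether the base address can be kept in a register.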
Greetings, Martin von Gagern