Ok. Got me there. .../intel-cc/8.1.030/lib/libipr.so.6 is the culprit.
Rocky
-----Original Message-----
From: owner-manta@sci.utah.edu [mailto:owner-manta@sci.utah.edu] On Behalf Of James Bigler
Sent: Wednesday, May 25, 2005 3:06 PM
Cc: manta@sci.utah.edu
Subject: Re: [MANTA] bad exit from bin/manta on Altix
A Google search on those files indicates that libffio.so is a Fortran
library commonly found on SGI machines. libeag_ffio.so also appears to
be an SGI-oriented library.
Try this for me:
ldd bin/manta | awk '{print $3}' | \
xargs --max-args=1 grep -l "Incorrect Phase"
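If the pipeline above gets noisy (ldd prints a few entries with no resolved path, and grep then complains about them), a slightly more defensive variant might look like the sketch below. The function names resolved_paths and libs_with_string are made up for illustration, and `xargs -r` assumes GNU xargs.

```shell
# Print only the resolved absolute paths from `ldd` output read on stdin.
# Entries without a "=> /path" part (vdso, the loader itself) are dropped.
resolved_paths() {
    awk '$3 ~ /^\// { print $3 }'
}

# List the shared libraries of binary $1 that contain the string $2.
# -r (GNU xargs) skips running grep when no library paths were found.
libs_with_string() {
    ldd "$1" | resolved_paths | xargs -r grep -l -- "$2"
}
```

It would be called as: libs_with_string bin/manta "Incorrect Phase"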
James
Rocky Rhodes wrote:
It still fails the same way on exit.
Another clue (red herring?) is that the "Incorrect Phase" error message
is coming from either libffio.so or libeag_ffio.so on the Altix. These
are the only two .so files in /usr/lib that contain this string.
Rocky
-----Original Message-----
From: owner-manta@sci.utah.edu [mailto:owner-manta@sci.utah.edu] On Behalf Of James Bigler
Sent: Wednesday, May 25, 2005 11:08 AM
Cc: manta@sci.utah.edu
Subject: Re: [MANTA] bad exit from bin/manta on Altix
Rocky,
Can you try something for me? The only thread code that is specific to
the Altix is the barrier code. In Thread_pthread.cc there are a few
places that use __ia64__. Could you replace these with __ia64_noway__
to see if you still have problems? (I want to keep the ia64-specific
code from being compiled here.)
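For what it's worth, the rename can be done in one pass with sed. This is only a sketch: the function name disable_ia64_guards is hypothetical, and the location of Thread_pthread.cc inside the source tree is not spelled out here, so pass in whatever path applies.

```shell
# Rename every __ia64__ guard in the file $1 to __ia64_noway__, so the
# ia64-only code is no longer compiled. The untouched original is kept
# next to it as <file>.orig. Note that running this a second time would
# overwrite that backup with the already-edited file.
disable_ia64_guards() {
    sed -i.orig 's/__ia64__/__ia64_noway__/g' "$1"
}
```

Usage would be something like: disable_ia64_guards path/to/Thread_pthread.cc, then rebuild. Moving the .orig file back restores the original.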
Thanks
James
I had tried older versions of the main trunk and didn't have any luck
isolating a change. I tried versions back to 308 that all failed
similarly, and version 300 doesn't build for me. It has probably always
been this way and I just had this environment variable set when I tried
it before.
Rocky
-----Original Message-----
From: owner-manta@sci.utah.edu [mailto:owner-manta@sci.utah.edu] On Behalf Of James Bigler
Sent: Tuesday, May 24, 2005 7:59 PM
Cc: manta@sci.utah.edu
Subject: Re: [MANTA] bad exit from bin/manta on Altix
You could try checking out an older version to see if it still happens.
The Thread_pthread.cc file is pretty hairy. The Altix is the only
machine we've had complaints about, though. Should this be showing up
in other modern distributions? What do the SGI docs say about that?
James
Rocky Rhodes wrote:
If I run "bin/manta -bench 10 10 -imagedisplay null -np 2" on an Altix,
the program exits in a bad way, complaining of "ERROR: Incorrect Phase"
and then telling me that "Thread 'idle or main'" got a SIGSEGV. If I
run this again with the LD_ASSUME_KERNEL environment variable set to
"2.4.19" it exits cleanly.
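To make the two cases easy to repeat side by side, something like the following could be used. It is a rough sketch, and the function name compare_assume_kernel is made up; it simply runs the same command twice and reports each exit status.

```shell
# Run the given command twice: once with the environment as it is, and
# once with LD_ASSUME_KERNEL=2.4.19 set for that single invocation only.
# The exit status of each run is printed after the command's own output.
compare_assume_kernel() {
    status=0
    "$@" || status=$?
    echo "default: exit status $status"
    status=0
    LD_ASSUME_KERNEL=2.4.19 "$@" || status=$?
    echo "LD_ASSUME_KERNEL=2.4.19: exit status $status"
}
```

For the case here it would be invoked as: compare_assume_kernel bin/manta -bench 10 10 -imagedisplay null -np 2. A process killed by SIGSEGV normally shows up in the shell as status 139 (128 plus signal 11).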
SGI's documentation says that this behavior is indicative of an
application "which depends on behaviors in which the LinuxThreads
implementation deviates from the POSIX standard". The LD_ASSUME_KERNEL
environment variable forces the application to use the old LinuxThreads
implementation rather than NPTL (Native POSIX Thread Library). I think
the new thread package was included with ProPack 3.0 on the Altix. You
might not see this problem on your Altix if it is running an earlier
version of the system software.
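One way to check which thread library a given machine picks up by default is to ask glibc through getconf. The sketch below assumes a glibc-based system that supports the GNU_LIBPTHREAD_VERSION variable; the helper name thread_lib_kind is made up for illustration.

```shell
# Classify the string reported by `getconf GNU_LIBPTHREAD_VERSION`.
# glibc reports something like "NPTL 2.3.4" or "linuxthreads-0.10".
thread_lib_kind() {
    case "$1" in
        NPTL*)         echo nptl ;;
        linuxthreads*) echo linuxthreads ;;
        *)             echo unknown ;;
    esac
}

# Report the default thread library; prints "unknown" if getconf is
# missing or does not know this variable.
thread_lib_kind "$(getconf GNU_LIBPTHREAD_VERSION 2>/dev/null)"
```

On a system that ships both implementations, prefixing the getconf call with LD_ASSUME_KERNEL=2.4.19 should show whether the variable really switches the process over to LinuxThreads.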
I thought I had tried this earlier and didn't have this problem, but as
it is just an environment variable, now I'm wondering if this has
always been broken this way. Is anyone aware of any changes in the
pthread code made over the last week or so that may have changed this
behavior?
Does anyone feel more qualified than I do about mucking around in this
code and trying to understand what goes on when the program exits?
Rocky