- From: Rocky Rhodes <rhodes@sgi.com>
- To: "'James Bigler'" <bigler@cs.utah.edu>
- Cc: manta@sci.utah.edu
- Subject: RE: [MANTA] bad exit from bin/manta on Altix
- Date: Wed, 25 May 2005 18:19:32 -0700
On the Altix with icc, the exit_handler function is called after the
pthread_exit(0) function is called in Thread_shutdown. This is caused by a
call to "atexit(exit_handler)" that is made when the Thread is initialized.
Exit_handler then calls Thread_shutdown again, where it fails with a segv
because it tries to reference a data structure that has already been
deleted.
When running the gcc compiled version, the exit_handler function is not
called on thread exit, but only when the entire program exits.
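
One way to guard against the double shutdown described above is to make the shutdown path idempotent, so a late call from the atexit handler becomes a no-op. This is a minimal C++ sketch, not Manta's actual code: `ThreadState`, `thread_initialize_sketch`, `thread_shutdown_once`, and `exit_handler_sketch` are hypothetical stand-ins for the real data structure, the Thread initialization, Thread_shutdown, and exit_handler. It only shows the guard; it does not explain why the handler fires at a different time under icc than under gcc.

```cpp
#include <atomic>
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for the bookkeeping that the real Thread_shutdown
// deletes; this is NOT Manta's actual data structure.
struct ThreadState {
    int active_threads = 1;
};

static ThreadState* g_state = nullptr;        // created in initialize, freed in shutdown
static std::atomic<bool> g_shut_down{false};  // set by the first caller of shutdown
static int g_teardown_count = 0;              // number of real teardowns performed

// Tears the state down at most once. A second call (for example from an
// atexit handler that runs after an explicit shutdown) returns immediately
// instead of touching memory that has already been deleted.
void thread_shutdown_once() {
    if (g_shut_down.exchange(true)) {
        return;  // someone already shut down; nothing left to reference
    }
    assert(g_state != nullptr && "shutdown called before initialize");
    delete g_state;
    g_state = nullptr;
    ++g_teardown_count;
}

// Handler registered with atexit(); assumed to forward to shutdown, as the
// message says exit_handler does.
void exit_handler_sketch() {
    thread_shutdown_once();
}

// Mirrors the "atexit(exit_handler)" call made when the Thread is initialized.
void thread_initialize_sketch() {
    g_state = new ThreadState();
    std::atexit(exit_handler_sketch);
}

// Accessor used only to check the sketch.
int teardown_count() { return g_teardown_count; }
```

Calling `thread_initialize_sketch()`, then `thread_shutdown_once()`, then `exit_handler_sketch()` performs exactly one teardown; the handler that runs again at process exit is also a no-op.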
I'm on my way out, but I'll research the "expected" behavior of atexit() in
a pthread program tomorrow morning if nobody tells me what it is supposed to
do before I get to it.
Rocky

> -----Original Message-----
> From: owner-manta@sci.utah.edu [mailto:owner-manta@sci.utah.edu] On Behalf Of Rocky Rhodes
> Sent: Wednesday, May 25, 2005 5:09 PM
> To: 'James Bigler'
> Cc: manta@sci.utah.edu
> Subject: RE: [MANTA] bad exit from bin/manta on Altix
>
> Yep. Works fine compiled with gcc/g++ (although a bit slower).
>
> Rocky
>
> > -----Original Message-----
> > From: owner-manta@sci.utah.edu [mailto:owner-manta@sci.utah.edu] On Behalf Of James Bigler
> > Sent: Wednesday, May 25, 2005 3:30 PM
> > Cc: manta@sci.utah.edu
> > Subject: Re: [MANTA] bad exit from bin/manta on Altix
> >
> > Can you build manta with GCC instead of ICC, since this is an ICC
> > specific library?
> >
> > James
> >
> > Rocky Rhodes wrote:
> > > Ok. Got me there. .../intel-cc/8.1.030/lib/libipr.so.6 is the culprit.
> > >
> > > Rocky
> > >
> > >>-----Original Message-----
> > >>From: owner-manta@sci.utah.edu [mailto:owner-manta@sci.utah.edu] On Behalf Of James Bigler
> > >>Sent: Wednesday, May 25, 2005 3:06 PM
> > >>Cc: manta@sci.utah.edu
> > >>Subject: Re: [MANTA] bad exit from bin/manta on Altix
> > >>
> > >>Doing a google on those files indicates that libffio.so is a Fortran
> > >>library commonly found on SGI machines. libeag_ffio.so also appears
> > >>to be an SGI oriented library.
> > >>
> > >>Try this for me:
> > >>
> > >>ldd bin/manta | awk '{print $3}' | \
> > >>xargs --max-args=1 grep -l "Incorrect Phase"
> > >>
> > >>James
> > >>
> > >>Rocky Rhodes wrote:
> > >>
> > >>>It still fails the same way on exit.
> > >>>
> > >>>Another clue (red herring?) is that the "Incorrect Phase" error
> > >>>message is coming from either libffio.so or libeag_ffio.so on the
> > >>>Altix. These are the only two .so files in /usr/lib that contain
> > >>>this string.
> > >>>
> > >>> Rocky
> > >>>
> > >>>>-----Original Message-----
> > >>>>From: owner-manta@sci.utah.edu [mailto:owner-manta@sci.utah.edu] On Behalf Of James Bigler
> > >>>>Sent: Wednesday, May 25, 2005 11:08 AM
> > >>>>Cc: manta@sci.utah.edu
> > >>>>Subject: Re: [MANTA] bad exit from bin/manta on Altix
> > >>>>
> > >>>>Rocky,
> > >>>>
> > >>>>Can you try something for me? The only thread code that is specific
> > >>>>to the altix is the barrier code. In Thread_pthread.cc there are a
> > >>>>few places that use __ia64__. Could you replace these with
> > >>>>__ia64_noway__ to see if you still have problems (I want to not
> > >>>>compile the ia64 specific code here)?
> > >>>>
> > >>>>Thanks
> > >>>>James
> > >>>>
> > >>>>>>>I had tried older versions of the main trunk and didn't have any
> > >>>>>>>luck isolating a change. I tried versions back to 308 that all
> > >>>>>>>failed similarly, and version 300 doesn't build for me. It has
> > >>>>>>>probably always been this way and I just had this environment
> > >>>>>>>variable set when I tried it before.
> > >>>>>>>
> > >>>>>>> Rocky
> > >>>>>>>
> > >>>>>>>>-----Original Message-----
> > >>>>>>>>From: owner-manta@sci.utah.edu [mailto:owner-manta@sci.utah.edu] On Behalf Of James Bigler
> > >>>>>>>>Sent: Tuesday, May 24, 2005 7:59 PM
> > >>>>>>>>Cc: manta@sci.utah.edu
> > >>>>>>>>Subject: Re: [MANTA] bad exit from bin/manta on Altix
> > >>>>>>>>
> > >>>>>>>>You could try checking out an older version and see if it
> > >>>>>>>>happens.
> > >>>>>>>>
> > >>>>>>>>The Thread_pthread.cc file is pretty hairy. The altix is the
> > >>>>>>>>only machine we've had complaints about, though. Should this be
> > >>>>>>>>showing up in other modern distributions? What do the SGI docs
> > >>>>>>>>say about that?
> > >>>>>>>>
> > >>>>>>>>James
> > >>>>>>>>
> > >>>>>>>>Rocky Rhodes wrote:
> > >>>>>>>>
> > >>>>>>>>>If I run "bin/manta -bench 10 10 -imagedisplay null -np 2" on
> > >>>>>>>>>an Altix, the program exits in a bad way, complaining of
> > >>>>>>>>>"ERROR: Incorrect Phase" and then telling me that "Thread
> > >>>>>>>>>'idle or main'" got a SIGSEGV. If I run this again with the
> > >>>>>>>>>LD_ASSUME_KERNEL environment variable set to "2.4.19" it exits
> > >>>>>>>>>cleanly.
> > >>>>>>>>>
> > >>>>>>>>>SGI's documentation says that this behavior is indicative of an
> > >>>>>>>>>application "which depends on behaviors in which the
> > >>>>>>>>>LinuxThreads implementation deviates from the POSIX standard".
> > >>>>>>>>>The LD_ASSUME_KERNEL environment variable forces the
> > >>>>>>>>>application to use the old LinuxThreads implementation rather
> > >>>>>>>>>than NPTL (Native POSIX Thread Library). I think the new thread
> > >>>>>>>>>package was included with ProPack 3.0 on the Altix. You might
> > >>>>>>>>>not see this problem on your Altix if it is running an earlier
> > >>>>>>>>>version of the system software.
> > >>>>>>>>>
> > >>>>>>>>>I thought I had tried this earlier and didn't have this
> > >>>>>>>>>problem, but as it is just an environment variable, now I'm
> > >>>>>>>>>wondering if this has always been broken this way. Is anyone
> > >>>>>>>>>aware of any changes in the pthread code made over the last
> > >>>>>>>>>week or so that may have changed this behavior? Does anyone
> > >>>>>>>>>feel more qualified than I do about mucking around in this code
> > >>>>>>>>>and trying to understand what goes on when the program exits?
> > >>>>>>>>>
> > >>>>>>>>> Rocky
> > >>>>>>>>>