manta - Re: [MANTA] loop unrolled?

Manta Interactive Ray Tracer Development Mailing List

From: Hansong Zhang <hansong@sgi.com>
To: "Steven G. Parker" <sparker@cs.utah.edu>
Cc: "'manta@sci.utah.edu'" <manta@sci.utah.edu>
Subject: Re: [MANTA] loop unrolled?
Date: Wed, 21 Dec 2005 14:45:52 -0800

some more info on this:

Unrolling is part of the story. The effectiveness of inlining is the other part, and the compiler doesn't seem to be doing that well.

Exactly. Unrolling is just the beginning. Itanium has so many registers
that there is no good reason for any loads or stores to any intermediate
vectors in the computation! It is really hard for me to understand
how the compiler is making such serious mistakes. It almost seems as
if the compiler is making the transformations in the wrong order.
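
To make the point about intermediate vectors concrete, here is a minimal sketch. Vec3f, scaleAddOperators, scaleAddFlattened and checkScaleAdd are hypothetical names invented for this illustration, not Manta code. The operator form creates a temporary vector for a * s; the flattened form is what one would hope the compiler reduces it to after inlining and unrolling, with every value able to stay in a register. Whether a given compiler actually gets there has to be checked in the generated assembly.

```cpp
#include <cassert>

// Hypothetical three-component vector, standing in for VectorT<float, 3>.
struct Vec3f {
  float data[3];
};

// Each operator returns a temporary vector by value.
inline Vec3f operator*(const Vec3f& a, float s) {
  Vec3f r;
  for (int i = 0; i < 3; i++) r.data[i] = a.data[i] * s;
  return r;
}

inline Vec3f operator+(const Vec3f& a, const Vec3f& b) {
  Vec3f r;
  for (int i = 0; i < 3; i++) r.data[i] = a.data[i] + b.data[i];
  return r;
}

// Operator form: a * s produces an intermediate Vec3f before the add.
inline Vec3f scaleAddOperators(const Vec3f& a, float s, const Vec3f& b) {
  return a * s + b;
}

// Hand-flattened form: three independent scalar expressions and no
// intermediate vector, so nothing needs to be stored between steps.
inline Vec3f scaleAddFlattened(const Vec3f& a, float s, const Vec3f& b) {
  Vec3f r;
  r.data[0] = a.data[0] * s + b.data[0];
  r.data[1] = a.data[1] * s + b.data[1];
  r.data[2] = a.data[2] * s + b.data[2];
  return r;
}

// Hypothetical check: both forms agree on exactly representable inputs.
inline void checkScaleAdd() {
  Vec3f a = {{1.0f, 2.0f, 3.0f}};
  Vec3f b = {{10.0f, 20.0f, 30.0f}};
  Vec3f x = scaleAddOperators(a, 2.0f, b);
  Vec3f y = scaleAddFlattened(a, 2.0f, b);
  for (int i = 0; i < 3; i++) assert(x.data[i] == y.data[i]);
}
```

If the two functions compile to the same instructions, the temporaries have been optimized away; extra loads and stores in the operator form would be the symptom described above.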

Steven G. Parker wrote:

It would be straightforward (although tedious) to create specializations for 3-dimensional vectors/points of floats, which might help. You will also find the same pattern in the ColorSpace class...
Steve
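
A minimal sketch of what such a specialization could look like. The VectorT below is a stripped-down stand-in with a single operator, not the real Manta class, and scaledExample is a hypothetical helper added only to exercise it. The full specialization for <float, 3> writes out the three multiplies, so there is no loop left for the compiler to unroll.

```cpp
#include <cassert>

// Stand-in for Manta's VectorT; only the members needed here (assumed layout).
template <typename T, int Dim>
class VectorT {
public:
  T data[Dim];

  // Generic version: relies on the compiler to unroll the loop.
  VectorT<T, Dim>& operator*=(T s) {
    for (int i = 0; i < Dim; i++)
      data[i] *= s;
    return *this;
  }
};

// Full specialization for three floats: the body is written out by hand.
template <>
class VectorT<float, 3> {
public:
  float data[3];

  VectorT<float, 3>& operator*=(float s) {
    data[0] *= s;
    data[1] *= s;
    data[2] *= s;
    return *this;
  }
};

// Hypothetical helper: scales (1, 2, 3) by s using the specialization.
inline VectorT<float, 3> scaledExample(float s) {
  VectorT<float, 3> v = {{1.0f, 2.0f, 3.0f}};
  v *= s;
  assert(v.data[0] == s);  // 1 * s
  return v;
}
```

The tedious part is that a full class specialization has to repeat every member of the generic class, which is presumably why it is not the default approach.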

On Dec 21, 2005, at 12:35 PM, Hansong Zhang wrote:

Steven G. Parker wrote:

I have confirmed that gcc unrolls this loop (mac and x86) when the -funroll-loops flag is enabled.

Thanks, Steve. Now I'm really torn between gcc and icc :-)
On the other hand, this gives Itanium some hope because the code it's running now is rather crappy.

Hansong

On Dec 21, 2005, at 12:17 PM, Hansong Zhang wrote:

In Manta, vector operations like the following are implemented for generic dimensionality:

   VectorT<T, Dim>& operator*=(T s) {
     for(int i=0;i<Dim;i++)
       data[i] *= s;
     return *this;
   }

I understand that this is done in the hope that the compiler can unroll the loop, making it just as efficient as an explicit 3- or 4-vector implementation. However, we have observed that, on Altix/Itanium with the Intel compiler, the above function consumes a lot of time because it is not properly inlined on many occasions. The generated assembly code is, to say the least, not pretty.

So the question is: has anybody verified with certainty that the above loop is unrolled on other platforms (Mac, gcc, ...)? If not, fixing this could be a boost to all platforms. The compiler plays a much bigger role on Itanium than on other processors, i.e. if the unrolling doesn't happen, Itanium suffers much more. I wonder whether other platforms just hide it better or whether other compilers are smarter.

Thanks,
Hansong
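
One way to stop depending on the optimizer's loop unroller, sketched here as an assumption rather than anything taken from the Manta sources, is to let template recursion expand the loop at compile time. ScaleUnrolled, UnrolledVectorT and checkAgainstLoop are hypothetical names. The technique still relies on the compiler inlining the nested apply calls, so it addresses the unrolling half of the problem only.

```cpp
#include <cassert>

// Recursive helper: the compiler instantiates one multiply per element,
// so no runtime loop remains for the optimizer to unroll.
template <typename T, int N>
struct ScaleUnrolled {
  static inline void apply(T* data, T s) {
    ScaleUnrolled<T, N - 1>::apply(data, s);  // elements 0 .. N-2
    data[N - 1] *= s;                         // element N-1
  }
};

// Base case: zero elements left, ends the recursion.
template <typename T>
struct ScaleUnrolled<T, 0> {
  static inline void apply(T*, T) {}
};

// Stand-in for VectorT with the loop replaced by the recursive helper.
template <typename T, int Dim>
class UnrolledVectorT {
public:
  T data[Dim];

  UnrolledVectorT<T, Dim>& operator*=(T s) {
    ScaleUnrolled<T, Dim>::apply(data, s);
    return *this;
  }
};

// Hypothetical check: the unrolled operator matches the plain loop.
inline void checkAgainstLoop() {
  UnrolledVectorT<float, 4> v = {{1.0f, 2.0f, 3.0f, 4.0f}};
  float ref[4] = {1.0f, 2.0f, 3.0f, 4.0f};
  v *= 3.0f;
  for (int i = 0; i < 4; i++) {
    ref[i] *= 3.0f;
    assert(v.data[i] == ref[i]);
  }
}
```

To see what a particular compiler does with either version, one option is to emit assembly (for example with g++ -O2 -S) and check whether a branch back to the top of a loop survives in the operator's body.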
