I have confirmed that gcc unrolls this loop (on both Mac and x86) when the -funroll-loops flag is enabled.
On Dec 21, 2005, at 12:17 PM, Hansong Zhang wrote:
In Manta, vector operations like the following are implemented for generic dimensionality:
    VectorT<T, Dim>& operator*=(T s) {
        for (int i = 0; i < Dim; i++)
            data[i] *= s;
        return *this;
    }
I understand that this is done in the hope that the compiler will unroll the loop, making it just as efficient as an explicit 3- or 4-component vector implementation. However, we have observed that on Altix/Itanium with the Intel compiler, the above function consumes a lot of time because in many cases it is not properly inlined. The generated assembly code is, to say the least, not pretty.
So the question is: has anybody verified with certainty that the above loop is unrolled on other platforms (Mac, gcc, ...)? If not, fixing this could give a boost on all platforms. The compiler plays a much bigger role on Itanium than on other processors; i.e., if the unrolling doesn't happen, Itanium suffers much more. I wonder whether other platforms just hide it better or whether other compilers are smarter.
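For illustration, one way to stop depending on the compiler's loop unroller is to express the expansion with template recursion, so there is no runtime loop left to unroll. The following is only a minimal sketch under stated assumptions: ScaleUnrolled is a hypothetical helper, not something in Manta, and the VectorT shown is a stripped-down stand-in for the real class. The compiler still has to inline the recursive calls, so the generated assembly would need to be checked on each platform.

```cpp
#include <cassert>

// Hypothetical helper (not part of Manta): scales data[0..N-1] by s using
// compile-time recursion instead of a runtime loop.
template <typename T, int N>
struct ScaleUnrolled {
    static inline void apply(T* data, T s) {
        ScaleUnrolled<T, N - 1>::apply(data, s);  // elements 0..N-2
        data[N - 1] *= s;                          // element N-1
    }
};

// Base case: nothing left to scale, which ends the recursion.
template <typename T>
struct ScaleUnrolled<T, 0> {
    static inline void apply(T*, T) {}
};

// Simplified stand-in for Manta's VectorT; only the members needed here.
template <typename T, int Dim>
class VectorT {
public:
    // Value-initialize every component.
    VectorT() {
        for (int i = 0; i < Dim; ++i)
            data[i] = T();
    }

    // Same interface as the original operator, but the per-element work is
    // generated at compile time by ScaleUnrolled.
    VectorT& operator*=(T s) {
        ScaleUnrolled<T, Dim>::apply(data, s);
        return *this;
    }

    T& operator[](int i) {
        assert(i >= 0 && i < Dim);
        return data[i];
    }

    const T& operator[](int i) const {
        assert(i >= 0 && i < Dim);
        return data[i];
    }

private:
    T data[Dim];
};
```

With this sketch, `VectorT<double, 3> v; v *= 2.0;` instantiates three element-wise multiplies and an empty base case. Whether that ends up faster than the plain loop is something to measure per compiler, not something this example demonstrates.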
Thanks,
Hansong