Manta Interactive Ray Tracer Development Mailing List



[Manta] Re: Info about Manta multi-threading


  • From: Biagio Cosenza < >
  • To: Abe Stephens < >
  • Cc:
  • Subject: [Manta] Re: Info about Manta multi-threading
  • Date: Tue, 9 Jun 2009 15:06:39 +0200

Thanks to both of you for your answers.

I've successfully implemented the if (I'm the first) construct using the AtomicCounter.
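For readers following the thread, the "first thread in" construct can be sketched roughly as below. This is only an illustration: it uses std::atomic from the C++ standard library rather than Manta's AtomicCounter, and the names FirstArrival, iAmFirst, reset and countWinners are hypothetical.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical stand-in for a per-frame "am I the first thread here?" check.
// This is NOT Manta's AtomicCounter API; std::atomic<int> is used instead.
class FirstArrival {
public:
  // Returns true for exactly one caller per round: the one that
  // observes the counter at zero before incrementing it.
  bool iAmFirst() {
    return counter_.fetch_add(1, std::memory_order_acq_rel) == 0;
  }

  // Re-arms the check. Only safe once every thread has passed
  // iAmFirst() for the current round, e.g. after a barrier.
  void reset() { counter_.store(0, std::memory_order_release); }

private:
  std::atomic<int> counter_{0};
};

// Sends numThreads threads through one round and returns how many
// of them were told they were first.
int countWinners(int numThreads) {
  FirstArrival gate;
  std::atomic<int> winners{0};
  std::vector<std::thread> pool;
  for (int i = 0; i < numThreads; ++i) {
    pool.emplace_back([&gate, &winners] {
      if (gate.iAmFirst()) winners.fetch_add(1);
    });
  }
  for (auto& t : pool) t.join();
  return winners.load();
}
```

Calling countWinners(4) should report a single winner regardless of thread timing, since fetch_add returns the previous value atomically.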

As for the scalability issue, performance improved once I used a significantly higher granularity (500), as Abe suggested.

I'm testing a sufficiently complex scene with a highly unbalanced load, at two resolutions: 512x512 and 1024x1024.
1 to 8 rays per pixel. Area lights. Frame rates range from 0.5 to 10.0 fps.

However, I still get a strange result for one of the test scenes (the cheapest one) with 4 threads, 512x512, and 1 sample per pixel. Scalability is linear up to 3 threads, but performance appears to drop when the 4th thread is added. This happens only with that scene.


Thanks for your help
Biagio



2nd)
I implemented two ImageTraversers using different load balancing strategies: the first uses the WorkQueue approach (with granularity=5), the second a trivial static scheduling of packets among threads.

I also tested these two balancers on 3 complex scenes (from 200 000 to 8 000 000 primary rays + reflections).

The test platform is an Intel Core 2 Quad Q6600 (4 cores).
Surprisingly, scalability is higher with the static approach...
Am I using the WorkQueue in the wrong way? Which is the best load balancer available in Manta for my target platform?

The better load balancing approach is probably more dependent on the graphics workload than on the hardware platform.  The rendering cost can be visualized by pressing 't' and then using 't+ctrl' or 't+shift' to adjust the color map. How does changing granularity affect scalability?
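To make the workload dependence concrete, here is a small deterministic model of the two strategies. It is only a sketch under stated assumptions: per-tile costs are known in advance, handing out a chunk is free, and the function names staticMakespan and dynamicMakespan are hypothetical, not Manta code. Because it ignores the per-chunk cost of touching the shared queue, it shows only the balancing side of the trade-off; in practice that overhead is what argues for a larger granularity.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Static scheduling: thread t renders one contiguous block of tiles.
// Returns the time at which the slowest thread finishes.
double staticMakespan(const std::vector<double>& cost, int threads) {
  if (threads <= 0) return 0.0;
  const std::size_t n = cost.size();
  const std::size_t p = static_cast<std::size_t>(threads);
  double worst = 0.0;
  for (std::size_t t = 0; t < p; ++t) {
    const std::size_t begin = n * t / p;
    const std::size_t end = n * (t + 1) / p;
    double sum = 0.0;
    for (std::size_t i = begin; i < end; ++i) sum += cost[i];
    worst = std::max(worst, sum);
  }
  return worst;
}

// Dynamic scheduling: chunks of `granularity` consecutive tiles are handed
// out in order, each to whichever thread becomes free first.
// Assumption: taking a chunk from the queue costs nothing.
double dynamicMakespan(const std::vector<double>& cost, int threads,
                       std::size_t granularity) {
  if (threads <= 0 || granularity == 0) return 0.0;
  std::vector<double> busyUntil(static_cast<std::size_t>(threads), 0.0);
  for (std::size_t begin = 0; begin < cost.size(); begin += granularity) {
    const std::size_t end = std::min(cost.size(), begin + granularity);
    double sum = 0.0;
    for (std::size_t i = begin; i < end; ++i) sum += cost[i];
    // The earliest-free thread picks up the next chunk.
    *std::min_element(busyUntil.begin(), busyUntil.end()) += sum;
  }
  return *std::max_element(busyUntil.begin(), busyUntil.end());
}
```

With made-up tile costs {8, 1, 1, 1, 1, 1, 1, 1} and 2 threads, the static split finishes at 11, a granularity of 1 finishes at 8, and a granularity of 8 degenerates to a single chunk finishing at 15. On a uniformly expensive scene the static split would already be balanced, which is one way the static approach can come out ahead.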


I also noticed that scalability is fine at low fps (close to 4x), but it drops at higher fps (3x).

What are the actual refresh rates for low and high? Also, what changes to produce the lower or higher frame rate, e.g. does the scene become more complicated, the output resolution greater, etc.?

I wonder if this is due to a relatively higher cost of the barrier synch.

The barriers only become expensive if there is another source of load imbalance in the pipeline that the load balancer is unable to compensate for. For example, the animation callbacks occur in a separate section of the control loop from rendering, so if one thread executes a very expensive animation update, the other threads might end up idling at the barrier, which would hurt scalability. This type of load imbalance is usually visible in a profiler.
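The cost of idling at a barrier can be estimated with a short calculation. The sketch below is an illustration only: barrierEfficiency is a hypothetical helper, and the per-thread times would have to come from your own measurements of one phase of the control loop.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Fraction of total thread time spent doing useful work in one phase
// that ends with a barrier. Every thread waits for the slowest one,
// so the phase lasts max(t_i) while the useful work is sum(t_i).
// Returns 1.0 for an empty or all-zero input.
double barrierEfficiency(const std::vector<double>& perThreadSeconds) {
  if (perThreadSeconds.empty()) return 1.0;
  const double slowest =
      *std::max_element(perThreadSeconds.begin(), perThreadSeconds.end());
  if (slowest <= 0.0) return 1.0;
  const double useful =
      std::accumulate(perThreadSeconds.begin(), perThreadSeconds.end(), 0.0);
  return useful / (slowest * static_cast<double>(perThreadSeconds.size()));
}
```

For example, with illustrative times of {1, 1, 1, 3} seconds, where one thread runs an expensive update, the efficiency is 6 / 12 = 0.5, so half of the available thread time in that phase is spent waiting.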


Abe







