Manta Interactive Ray Tracer Development Mailing List



[Manta] Re: information about the acceleration structures?


  • From: Thiago Ize
  • Subject: [Manta] Re: information about the acceleration structures?
  • Date: Mon, 12 Apr 2010 16:02:32 -0600

RecursiveGrid is sometimes better than DynBVH and sometimes worse, regardless of the kind of ray tracing. Which one wins can depend on the scene and even on the camera view.  As a general guideline, I'd say DynBVH is a safe bet when you know the rays are coherent, and RecursiveGrid or KDTree (using single ray traversal) when they are not. If performance is an issue, try all of the structures each time to see which is better.  Here's an example where I path trace the conference room scene:

64 rays per packet
DynBVH:         .175fps
RecursiveGrid:  .217fps
KDTree:         .241fps

16 rays per packet
DynBVH:         .181fps
RecursiveGrid:  .219fps
KDTree:         .251fps

1 ray per packet
DynBVH:         .199fps
RecursiveGrid:  .218fps
KDTree:         .315fps

So in this case the KDTree is actually the best, then RecursiveGrid, and finally DynBVH. Note that smaller ray packets were better for the KDTree and DynBVH since they didn't have to deal with the overhead of breaking up the packets or forcing rays to travel into nodes they do not belong in.  Yes, the recursive grid, because it already traces individual rays, will not suffer from the incoherence introduced during path tracing, but it will also not gain from any available coherence.  However, DynBVH will somewhat gracefully degrade to tracing individual rays and there is still some coherence even in path tracing (for instance, the camera rays are all coherent), so aside from some overhead, it won't perform too poorly.  The KDTree did well here, but in other situations its inability to break up rays into subpackets as well as DynBVH does during traversal could cause it to be (much) slower unless you specifically set it to traverse just single rays.

Here we path trace just the Stanford bunny.  In this scene most camera rays will hit the bunny and then bounce off into space, so the rays that are traversed will be more coherent even though this is path tracing.

64 rays per packet
DynBVH:         .544fps
RecursiveGrid:  .354fps
KDTree:         .440fps

16 rays per packet
DynBVH:         .537fps
RecursiveGrid:  .345fps
KDTree:         .412fps

1 ray per packet
DynBVH:         .198fps
RecursiveGrid:  .278fps
KDTree:         .272fps

So this was an example of path tracing where DynBVH did best and all the structures did best with large packets.  Even the grid was faster with the bigger packets, although this is not due to the grid getting faster but to other parts of Manta performing better with large packets (for instance, the bunny is a single material so the packets do not need to be broken up for shading).
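
As a rough way to read the two sets of timings above, the following Python sketch tabulates them and picks out the fastest configuration per scene. The frame rates are copied from this message; the names RESULTS, best_configuration and best_packet_size are made up for illustration and are not part of Manta.

```python
# Frame rates (fps) copied from the two path tracing runs in this message.
# Inner keys are rays per packet; higher fps is better.
RESULTS = {
    "conference": {
        64: {"DynBVH": 0.175, "RecursiveGrid": 0.217, "KDTree": 0.241},
        16: {"DynBVH": 0.181, "RecursiveGrid": 0.219, "KDTree": 0.251},
        1: {"DynBVH": 0.199, "RecursiveGrid": 0.218, "KDTree": 0.315},
    },
    "bunny": {
        64: {"DynBVH": 0.544, "RecursiveGrid": 0.354, "KDTree": 0.440},
        16: {"DynBVH": 0.537, "RecursiveGrid": 0.345, "KDTree": 0.412},
        1: {"DynBVH": 0.198, "RecursiveGrid": 0.278, "KDTree": 0.272},
    },
}


def best_configuration(scene):
    """Return (structure, packet_size, fps) with the highest fps for a scene."""
    candidates = [
        (fps, structure, packet_size)
        for packet_size, by_structure in RESULTS[scene].items()
        for structure, fps in by_structure.items()
    ]
    fps, structure, packet_size = max(candidates)
    return structure, packet_size, fps


def best_packet_size(scene, structure):
    """Return the packet size at which one structure ran fastest in a scene."""
    return max(RESULTS[scene], key=lambda size: RESULTS[scene][size][structure])


if __name__ == "__main__":
    for scene in RESULTS:
        print(scene, best_configuration(scene))
```

Calling best_packet_size("conference", "DynBVH") and best_packet_size("bunny", "DynBVH") shows the same structure preferring opposite packet sizes in the two scenes, which is the reason for trying every combination when performance matters.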

As for path tracing on the GPU, that was one of my tests.  OptiX and Manta both have a Cornell box path tracer demo, so I was able to make a direct comparison, and in this situation I still found Manta to be 20% faster.  It is very possible that for more complicated scenes the results could be different.  Since the OptiX folks are doing lots of path tracer work, my guess is that they've put more time and effort into this than what went into Manta's path tracer, so perhaps they do more clever things that result in OptiX outperforming Manta for more complex path tracing scenes.  But until someone does a direct comparison we will have to rely on the Cornell box test that says Manta is faster.

For most scenes I've tested, doing a full material sort usually ended up being worse.  Path tracing in Manta is not yet mature and there are likely many things that could be done to make it faster/better.  Manta's path tracer has never sorted packets based on origin or direction, and the material sorting doesn't break packets into subpackets (it only sorts), so it has little effect on how the acceleration structure would perform.  Using packets with path tracing can still give performance boosts, as the bunny example above shows.

The gain made by BSP is decent (I have a paper online that discusses this if you are interested) and it does a great job handling situations where other acceleration structures break down.  But the build is currently orders of magnitude slower.  If you can wait a day to build a multi-million triangle scene, then go for it. But usually I'm not that patient :-)

DynBVH will only break up a packet in certain situations. If you have 64 rays and the first one enters the left child, then the next 62 rays enter just the right child, and finally the last ray enters the left child, then you will traverse all 64 rays into the left child instead of just 2 rays.  In this situation a smaller ray packet would have had less overhead.
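
The 64-ray case above can be sketched in Python as follows. This is only an illustration under an assumption: that the packet is narrowed to a single contiguous range running from the first ray that enters a child to the last one. The function names are hypothetical and the actual DynBVH code may work differently.

```python
def rays_traversed_with_interval(enters_child):
    """Count rays sent into a child when the packet is narrowed to one range.

    enters_child holds one boolean per ray, in packet order. The range runs
    from the first ray that enters the child to the last one, so rays in
    between are carried along even if they miss the child.
    """
    active = [i for i, hit in enumerate(enters_child) if hit]
    if not active:
        return 0
    return active[-1] - active[0] + 1


def rays_traversed_individually(enters_child):
    """Count rays sent into a child when every ray is handled on its own."""
    return sum(enters_child)


# The case from the message: 64 rays, only the first and last enter the left child.
left = [True] + [False] * 62 + [True]

if __name__ == "__main__":
    print(rays_traversed_with_interval(left), rays_traversed_individually(left))
```

With four packets of 16 rays instead, the same pattern would carry only one ray into the left child for each of the first and last packets, and none for the two middle packets, which is one way to see why a smaller packet has less overhead here.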

Thiago

Bo Huang wrote:
It seems RecursiveGrid is a better choice than DynBVH for path tracing due to incoherence. As for Manta's path tracer implementation, where the RayPacketData is constantly sorted to be material coherent, can I assume RecursiveGrid is still better because, sorting or not, the directions taken by the rays are still incoherent due to varying pdfs?

Regarding the more solid performance achieved by Manta's standard ray tracer compared to the GPU version, would any path tracer comparison be similar? The intensive sorting of RayPacketData mentioned above frequently yields sub-packets of length <5, for example, and I wonder what tricks GPU developers may employ. Of course, I am also curious in general where Manta's path tracer excels or falls short, other than the points I mentioned in a thread a while back. For example, on the CPU, is sorting still better than single ray traversal in Manta's path tracer?

What are some reasons the gain made by BSP is minuscule compared to others, assuming geometries are static?

Why does reducing packet size (default is 64) influence DynBVH?

Thanks

Bo


-----Original Message-----
From: Thiago Ize
Sent: Sun 4/11/2010 4:21 PM
Cc: Carson Brownlee
Subject: [Manta] Re: Re: Re: Re: information about the acceleration structures?
 
Packet size should be an issue for certain acceleration structures when 
you have lots of triangles or incoherent rays, such as in path tracing.  
If the structure only traces a single ray at a time, then this clearly 
doesn't matter much, nor does it matter too much if the acceleration 
structure can gracefully degenerate to a single ray type of performance, 
which DynBVH can sort of do in some situations (although the state of 
the art in this has improved since DynBVH was written).  Some 
structures, like the kdtree, do not handle large packets well when each 
ray wants to traverse different nodes and so in this case smaller 
packets or even an explicit single ray traversal might work better.

I'm not familiar with GridSpheres, but I would assume that RecursiveGrid
would be a much better grid structure than that.  RecursiveGrid has the
nice property that it scales to really large numbers of primitives,
using only linear memory while offering decent traversal performance for
a single ray traversal algorithm.  Note that the build is not at all
optimized so it can be a bit slow.

While you mentioned Manta performance compared to GPU rasterizer
performance, in the case of GPU ray tracing performance I've come across
some interesting results.  I've done some light benchmarking of Manta
running on a nice system (8 Nehalem cores) versus a GPU-only ray tracer
(NVIDIA's OptiX) running on a Tesla S870 (4 GPUs for a total of 2 TFLOPS
of performance), all on the same system, and found that Manta was about
20% faster in all my tests and could scale up to large datasets since it
was not restricted to a tiny amount of GPU memory.  Of course, you can
now get a Tesla system that is twice as fast, but the same also goes for
the CPUs (plus Manta has access to all of the cheap system memory versus
the small expensive memory restrictions of GPUs).  The take home message
is that Manta should be pretty fast if you can run it using SSE and on a
machine with many cores.  That of course is the important point.  If you
have a really fancy and expensive graphics card(s) and run Manta on a 2
core machine with SSE turned off, then of course Manta won't stand a
chance.

Thiago

David E DeMarle wrote:
  
I'll try to post a better (but still informal) breakdown to the manta list
later this week. Here is a sketch of what performance looks like to
me:

The timings are for the MantaBenchmark program in vtkManta that I
wrote, I will export the data to something pure manta can read and
see how that compares this week.

The best accel structure in my benchmark is DynBVH. With it, starting
around 50k triangles, the framerate plummets from 60fps to 30fps at
100k and to 6fps at 1 million. Playing with max packet size changes the
1 million tri case by about 1fps on either side.

GridSpheres 3 is better between 50k and 100k (it drops barely at all),
but shortly after 100k the build runs out of memory and crashes. The
build time is too long to use in practice for sci-vis, where change is
the norm and pure rendering (just camera motion) comes in bursts.

Manta pretty handily outperforms Mesa GL (it is roughly even for < 50k
and against a single Manta thread), but it never outperforms GL on my
NVidia card. It does start to be competitive above 1 million tris though.

David E DeMarle
Kitware, Inc.
R&D Engineer
28 Corporate Drive
Clifton Park, NY 12065-8662
Phone: 518-371-3971 x109



On Sun, Apr 11, 2010 at 3:34 AM, Carson Brownlee wrote:
  
    
How many triangles are you talking about and would you mind giving a
breakdown of your timings?  I believe Thiago's comment on decreasing
packet size may help if you have a small window size and a lot of threads,
which might choke the load balancer.  It should not affect single threaded
performance however.
Carson


On Apr 9, 2010, at 5:35 PM, David E DeMarle wrote:

    
      
Awesome, thanks.

David E DeMarle
Kitware, Inc.
R&D Engineer
28 Corporate Drive
Clifton Park, NY 12065-8662
Phone: 518-371-3971 x109



On Fri, Apr 9, 2010 at 7:30 PM, Thiago Ize wrote:
      
        
BSP and CellSkipper you probably would never want to use since the BSP
has a super slow build for relatively minor speed gains and CellSkipper
is slow and just interesting from an academic view.

RecursiveGrid is a very good choice when the rays are very incoherent and
single ray traversal is just as good (or better) than packet traversal.
This happens with path tracing for instance.  The number of levels to use
is described in my dissertation, but 3 is usually the right amount.  The
build is decent, but could be made much faster if someone spent the time
to optimize it.

KDTree is ok, but the version in Manta is not as optimized as it could
be, so I'd probably use DynBVH over it.

DynBVH is usually a good acceleration structure to use.  The build can be
made faster by going into cmake and turning
MANTA_USE_DYNBVH_APPROXIMATE   to  ON
however that will result in slower traversal performance.  Faster build
algorithms exist and someone should implement them as well as parallelize
the build.

The poor scaling with increasing triangles could be a result of the
packet size being too big. Try turning it down to see if it improves
performance. The exception to this rule is RecursiveGrid, which already
does a single ray traversal so packet size is not an issue.

If possible, I'd recommend flattening the groups so that there is just a
single group and then building a tree over this since this will give the
best performance. Otherwise, the type of structures to use would depend
on how the objects are distributed within a group and how the groups are
distributed.
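
As a minimal sketch of what flattening the groups means, here is a Python illustration that collects every primitive from a nested hierarchy into one flat list, over which a single tree could then be built. It assumes a group can be modeled as a Python list and a primitive as any other value; the names flatten, scene and flat_group are hypothetical and this is not Manta's API.

```python
def flatten(node):
    """Yield every primitive under a nested group, depth first, in order.

    A group is modeled as a Python list; anything else is treated as a
    primitive (triangle, sphere, cylinder, ...).
    """
    if isinstance(node, list):
        for child in node:
            yield from flatten(child)
    else:
        yield node


# Hypothetical scene: groups nested inside groups, mixing primitive types.
scene = [["tri0", "tri1"], [["sphere0"], "cyl0"], "tri2"]

# One flat group holding all primitives, ready for a single tree build.
flat_group = list(flatten(scene))

if __name__ == "__main__":
    print(flat_group)
```

The point of the single flat list is that one acceleration structure then sees all primitives at once, instead of each group getting its own structure with traversal hopping between them.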

Thiago

David E DeMarle wrote:
        
          
Can anyone tell me about, or point me to descriptions of, the ?five?
different acceleration structures (BSP, CellSkipper, RecursiveGrid,
KDTree, DynBVH) in Manta?

vtkManta uses DynBVH, and it appears to be the fastest so far, but it
doesn't seem to scale all that well when the number of triangles
increases. I am wondering if one of the others (with well chosen
settings) would be a better choice.

So far, vtkManta isn't changing the contents of the DynBVH, so the
Dynamic nature of it isn't that important (but quick build times are
important).

We do want to be able to have (groups of) triangles, cylinders and
spheres in the acceleration structure so a non-homogeneous acceleration
structure is preferable.

thanks for any pointers,

David E DeMarle
Kitware, Inc.
R&D Engineer
28 Corporate Drive
Clifton Park, NY 12065-8662
Phone: 518-371-3971 x109

          
            
    
      


  


