RecursiveGrid is sometimes better than DynBVH and sometimes worse, no
matter what kind of ray tracing you do. It can end up being just
scene and even camera view dependent. As a general guideline I'd say
DynBVH is a safe bet when you know the rays are coherent, and
RecursiveGrid or KDTree (using a single ray traversal) if they are
not. If performance is an issue, all structures should be tried each
time to see which is better. Here's an example where I path trace the
conference room scene:

64 rays per packet
  DynBVH:        .175fps
  RecursiveGrid: .217fps
  KDTree:        .241fps

16 rays per packet
  DynBVH:        .181fps
  RecursiveGrid: .219fps
  KDTree:        .251fps

1 ray per packet
  DynBVH:        .199fps
  RecursiveGrid: .218fps
  KDTree:        .315fps

So in this case the KDTree is actually the best, then RecursiveGrid
and finally DynBVH. Note that smaller ray packets were better for the
KDTree and DynBVH since they didn't have to deal with the overhead of
breaking up the packets or forcing rays to travel into nodes they do
not belong in.

Yes, RecursiveGrid, because it is already tracing individual rays,
will not suffer from the incoherence introduced during path tracing,
but it will also not gain from any available coherence. However,
DynBVH will somewhat gracefully degrade to tracing individual rays
and there is still some coherence even in path tracing (for instance,
the camera rays are all coherent), so aside from some overhead, it
won't perform too poorly. The KDTree did well here, but in other
situations its inability to break up rays into subpackets as well as
DynBVH does during traversal could cause it to be (much) slower
unless you specifically set it to traverse just single rays.

Here we path trace just the Stanford bunny. In this scene most camera
rays will hit the bunny and then bounce off into space, so the rays
that are traversed will be more coherent even though this is path
tracing.

64 rays per packet
  DynBVH:        .544fps
  RecursiveGrid: .354fps
  KDTree:        .440fps

16 rays per packet
  DynBVH:        .537fps
  RecursiveGrid: .345fps
  KDTree:        .412fps

1 ray per packet
  DynBVH:        .198fps
  RecursiveGrid: .278fps
  KDTree:        .272fps

So this was an example of path tracing where DynBVH did best and all
the structures did best with large packets.
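
For anyone who wants to compare the two runs side by side, here is a
small sketch that just tabulates the fps figures quoted in this
message and looks up the fastest structure per packet size and the
best packet size per structure. The dictionary layout and the two
helper names (fastest_structure, best_packet_size) are made up for
this illustration; nothing here is part of Manta.

```python
# Frames per second copied from the two path tracing runs in this message.
# Layout: scene -> rays per packet -> structure -> fps.
RESULTS = {
    "conference": {
        64: {"DynBVH": 0.175, "RecursiveGrid": 0.217, "KDTree": 0.241},
        16: {"DynBVH": 0.181, "RecursiveGrid": 0.219, "KDTree": 0.251},
        1: {"DynBVH": 0.199, "RecursiveGrid": 0.218, "KDTree": 0.315},
    },
    "bunny": {
        64: {"DynBVH": 0.544, "RecursiveGrid": 0.354, "KDTree": 0.440},
        16: {"DynBVH": 0.537, "RecursiveGrid": 0.345, "KDTree": 0.412},
        1: {"DynBVH": 0.198, "RecursiveGrid": 0.278, "KDTree": 0.272},
    },
}


def fastest_structure(scene, packet_size):
    """Name of the structure with the highest fps for one scene and packet size."""
    fps = RESULTS[scene][packet_size]
    return max(fps, key=fps.get)


def best_packet_size(scene, structure):
    """Packet size at which the given structure reached its highest fps."""
    by_size = RESULTS[scene]
    return max(by_size, key=lambda size: by_size[size][structure])


if __name__ == "__main__":
    for scene, by_size in RESULTS.items():
        for size in by_size:
            print(scene, size, fastest_structure(scene, size))
```

With these numbers the KDTree wins at every packet size in the
conference room, while for the bunny DynBVH wins at 64 and 16 rays
per packet and RecursiveGrid wins at 1 ray per packet.
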
Even the grid was faster with the bigger packets, although this is
not due to the grid getting faster but to other parts of Manta
performing better with large packets (for instance, the bunny is a
single material so the packets do not need to be broken up for
shading).

As for path tracing on the GPU, that was one of my tests. OptiX and
Manta both have a Cornell box path tracer demo so I was able to make
a direct comparison, and in this situation I still found Manta to be
20% faster. It is very possible that for more complicated scenes the
results could be different. Since the OptiX folks are doing lots of
path tracer work, my guess is that they've put more time and effort
into this than what went into Manta's path tracer, so perhaps they do
more clever things that result in OptiX outperforming Manta for more
complex path tracing scenes. But until someone does a direct
comparison we will have to rely on the Cornell box test that says
Manta is faster.

For most scenes I've tested, doing a full material sort usually ended
up being worse. Path tracing in Manta is not yet mature and there are
likely many things that could be done to make it faster/better.
Manta's path tracer has never sorted packets based on origin or
direction, and the material sorting doesn't break packets into
subpackets (it only sorts), so it has little effect on how the
acceleration structure would perform. Using packets with path tracing
can still give performance boosts, as the bunny example above shows.

The gain made by BSP is decent (I have a paper online that discusses
this if you are interested) and it does a great job handling
situations where other acceleration structures break down. But the
build currently is orders of magnitude slower. If you can wait a day
to build a multi-million triangle scene, then go for it. But usually
I'm not that patient :-)

DynBVH will only break up a packet in certain situations.
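
To make the packet-splitting limitation concrete, here is a rough
sketch of the 64-ray case discussed in this message. It assumes the
traversal keeps only a contiguous first..last range of active rays
per packet, so every ray between the first and last ray entering a
node is carried along. That is one reading of the behaviour described
here, not Manta's actual code, and the function names are
hypothetical.

```python
def active_range(enters_node):
    """Return (first, last) indices of the rays that enter the node, or None."""
    hits = [i for i, hit in enumerate(enters_node) if hit]
    if not hits:
        return None
    return hits[0], hits[-1]


def rays_traversed(enters_node, packet_size):
    """Count rays carried into a node when rays are grouped into packets.

    Assumption for this sketch: each packet descends with its whole
    first..last active range, including rays inside the range that
    miss the node.
    """
    total = 0
    for start in range(0, len(enters_node), packet_size):
        span = active_range(enters_node[start:start + packet_size])
        if span is not None:
            first, last = span
            total += last - first + 1
    return total


# First and last ray enter the left child, the 62 rays in between do not.
left_child = [True] + [False] * 62 + [True]

if __name__ == "__main__":
    for size in (64, 16, 1):
        print(size, rays_traversed(left_child, size))
```

Under that assumption a single 64-ray packet carries all 64 rays into
the left child, while packets of 16 or single rays carry only the 2
rays that actually enter it.
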
If you have 64 rays and the first one enters the left child, then the
next 62 rays enter just the right child, and finally the last ray
enters the left child, then you will traverse all 64 rays into the
left child instead of just 2 rays. In this situation a smaller ray
packet would have had less overhead.

Thiago

Bo Huang wrote:

It seems RecursiveGrid is the better choice than DynBVH for path
tracing due to incoherence. As for Manta's path tracer implementation
where the RayPacketData is constantly sorted to be material coherent,
can I assume RecursiveGrid is still better because, sorting or not,
the directions taken by the rays are still incoherent due to varying
pdfs?

Regarding the more solid performance achieved by Manta's standard ray
tracer compared to the GPU version, would any path tracer comparison
be similar? The intensive sorting of RayPacketData mentioned
frequently yields sub-packets with length <5 for example, and I
wonder whether there are any tricks GPU developers may employ.

Of course, I am also curious in general what Manta's path tracer
excels at or lacks, other than the ones I mentioned in a thread a
while back. For example, on the CPU, is sorting still better than
single ray traversal in Manta's path tracer?

What are some reasons the gain made by BSP is minuscule compared to
others, assuming geometries are static?

Why does reducing packet size (default is 64) influence DynBVH?

Thanks
Bo

-----Original Message-----
From: Thiago Ize
Sent: Sun 4/11/2010 4:21 PM
Cc: Carson Brownlee
Subject: [Manta] Re: Re: Re: Re: information about the acceleration structures?

Packet size should be an issue for certain acceleration structures
when you have lots of triangles or incoherent rays, such as in path
tracing.
If the structure only traces a single ray at a time, then packet size
clearly doesn't matter much, nor does it matter too much if the
acceleration structure can gracefully degenerate to a single ray type
of performance, which DynBVH can sort of do in some situations
(although the state of the art in this has improved since DynBVH was
written). Some structures, like the KDTree, do not handle large
packets well when each ray wants to traverse different nodes, so in
this case smaller packets or even an explicit single ray traversal
might work better.

I'm not familiar with GridSpheres, but I would assume that
RecursiveGrid would be a much better grid structure than that.
RecursiveGrid has the nice property that it scales to really large
numbers of primitives, both in performance and in memory (it uses
only linear memory), while offering decent traversal performance for
a single ray traversal algorithm. Note that the build is not at all
optimized so it can be a bit slow.

While you mentioned Manta performance compared to GPU rasterizer
performance, in the case of GPU ray tracing performance I've come
across some interesting results. I've done some light benchmarking of
Manta running on a nice system (8 Nehalem cores) versus a GPU-only
ray tracer (NVIDIA's OptiX) running on a Tesla S870 (4 GPUs for a
total of 2TFlops of performance), all on the same system, and found
that Manta was about 20% faster in all my tests and could scale up to
large datasets since it was not restricted to a tiny amount of GPU
memory. Of course, you can now get a Tesla system that is twice as
fast, but the same also goes for the CPUs (plus Manta has access to
all of the cheap system memory versus the small, expensive memory of
GPUs).

The take home message is that Manta should be pretty fast if you can
run it using SSE and on a machine with many cores. That of course is
the important point.
If you have really fancy and expensive graphics card(s) and run Manta
on a 2 core machine with SSE turned off, then of course Manta won't
stand a chance.

Thiago

David E DeMarle wrote:

I'll try to post a better (but still informal) breakdown to the manta
list later this week. Here is a sketch of what performance looks like
to me:

The timings are for the MantaBenchmark program in vtkManta that I
wrote. I will export the data to something pure Manta can read and
see how that compares this week.

The best accel structure in my benchmark is DynBVH. With that,
starting around 50k triangles the framerate starts to plummet, from
60fps to 30fps at 100k and 6fps at 1 million. Playing with max packet
size changes the 1 million tri case by about 1fps on either side.

GridSpheres 3 is better between 50k and 100k (it barely drops at
all), but shortly after 100k the build runs out of memory and
crashes. The build time is too long to use in practice for sci-vis,
where change is the norm and pure rendering (just camera motion)
comes in bursts.

Manta pretty handily outperforms Mesa GL (it is roughly even for
< 50k and against a single Manta thread), but it never outperforms GL
on my NVidia card. It does start to be competitive above 1 million
tris though.

David E DeMarle
Kitware, Inc.
R&D Engineer
28 Corporate Drive
Clifton Park, NY 12065-8662
Phone: 518-371-3971 x109

On Sun, Apr 11, 2010 at 3:34 AM, Carson Brownlee wrote:

How many triangles are you talking about and would you mind giving a
breakdown of your timings? I believe Thiago's comment on decreasing
packet size may help if you have a small window size and a lot of
threads, which might choke the load balancer. It should not affect
single threaded performance however.

Carson

On Apr 9, 2010, at 5:35 PM, David E DeMarle wrote:

Awesome, thanks.

David E DeMarle
Kitware, Inc.
R&D Engineer
28 Corporate Drive
Clifton Park, NY 12065-8662
Phone: 518-371-3971 x109

On Fri, Apr 9, 2010 at 7:30 PM, Thiago Ize wrote:

BSP and CellSkipper you probably would never want to use, since the
BSP has a super slow build for relatively minor speed gains and
CellSkipper is slow and just interesting from an academic view.

RecursiveGrid is a very good choice when the rays are very incoherent
and single ray traversal is just as good as (or better than) packet
traversal. This happens with path tracing for instance. The number of
levels to use is described in my dissertation, but 3 is usually the
right amount. The build is decent, but could be made much faster if
someone spent the time to optimize it.

KDTree is ok, but the version in Manta is not as optimized as it
could be, so I'd probably use DynBVH over it.

DynBVH is usually a good acceleration structure to use. The build can
be made faster by going into cmake and turning
MANTA_USE_DYNBVH_APPROXIMATE to ON; however, that will result in
slower traversal performance. Faster build algorithms exist and
someone should implement one, as well as parallelize the build.

The poor scaling with increasing triangles could be a result of the
packet size being too big. Try turning it down to see if it improves
performance. The exception to this rule is RecursiveGrid, which
already does a single ray traversal so packet size is not an issue.

If possible, I'd recommend flattening the groups so that there is
just a single group and then building a tree over this, since this
will give the best performance. Otherwise, the type of structures to
use would depend on how the objects are distributed within a group
and how the groups are distributed.

Thiago

David E DeMarle wrote:

Can anyone tell me about, or point me to descriptions of, the ?five?
different acceleration structures (BSP, CellSkipper, RecursiveGrid,
KDTree, DynBVH) in Manta?
vtkManta uses DynBVH, and it appears to be the fastest so far, but it
doesn't seem to scale all that well when the number of triangles
increases. I am wondering if one of the others (with well chosen
settings) would be a better choice.

So far, vtkManta isn't changing the contents of the DynBVH, so the
dynamic nature of it isn't that important (but quick build times are
important). We do want to be able to have (groups of) triangles,
cylinders and spheres in the acceleration structure, so a
non-homogeneous acceleration structure is preferable.

thanks for any pointers,

David E DeMarle
Kitware, Inc.
R&D Engineer
28 Corporate Drive
Clifton Park, NY 12065-8662
Phone: 518-371-3971 x109