Takahiro HARADA

This is a summary of “Rendering Vector Displacement Mapped Surfaces in a GPU Ray Tracer” by Takahiro Harada, published in GPU Pro 6.

Overview

The paper presents a method for directly ray tracing vector displacement mapped (VDM) surfaces on the GPU, implemented in OpenCL and tested on an AMD FirePro W9100 GPU. The technique avoids pre-tessellation and instead builds geometry on the fly, which allows high-detail rendering with low memory usage and competitive performance.

Key Concepts

1. Vector Displacement Mapping (VDM)

Scalar Displacement Mapping displaces vertices along the surface normal using scalar values.
Vector Displacement Mapping uses vector values to displace vertices in arbitrary directions, allowing complex geometry like overhangs.
VDM provides greater modeling freedom but poses challenges for ray tracing, especially on GPUs.
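The difference between the two mappings can be illustrated with a minimal sketch. It assumes the displacement vector is expressed in a tangent frame (t, b, n); the summary does not say which space the paper uses, and the function names are hypothetical.

```python
def displace_scalar(p, n, h):
    """Move point p along the unit normal n by the scalar height h."""
    return tuple(pi + h * ni for pi, ni in zip(p, n))


def displace_vector(p, t, b, n, d):
    """Move point p by d = (dt, db, dn), given in the frame (t, b, n).

    Assumption: tangent-space encoding; other encodings are possible.
    """
    dt, db, dn = d
    return tuple(
        pi + dt * ti + db * bi + dn * ni
        for pi, ti, bi, ni in zip(p, t, b, n)
    )


p = (0.0, 0.0, 0.0)
t, b, n = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)

# Scalar displacement can only move the point along n.
scalar_result = displace_scalar(p, n, 0.5)
# Vector displacement can also shift it sideways, which is what
# makes overhangs possible.
vector_result = displace_vector(p, t, b, n, (0.3, 0.0, 0.5))
```

With the scalar map the point stays on the normal line through p, while the vector map moves it off that line.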

2. Ray Tracing with VDM

Challenges

Displacement in VDM is not restricted to the normal direction, which makes intersection tests expensive.
Direct ray tracing requires on-the-fly tessellation and displacement, followed by BVH construction for acceleration.

Solution

Build a bounding volume hierarchy (BVH) dynamically for each vector displacement (VD) patch.
Batch rays intersecting a VD patch to amortize tessellation and BVH build costs.
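The batching idea can be sketched on the CPU as a simple grouping step. This is only an illustration: the `(ray_id, patch_id)` pair layout and the function name are assumptions, and a GPU implementation would more likely sort a buffer by patch ID than build a dictionary.

```python
from collections import defaultdict


def batch_rays_by_patch(hits):
    """Group (ray_id, patch_id) pairs so each patch is handled once."""
    batches = defaultdict(list)
    for ray_id, patch_id in hits:
        batches[patch_id].append(ray_id)
    return dict(batches)


# Five rays touch two patches (hypothetical IDs).
hits = [(0, 7), (1, 3), (2, 7), (3, 3), (4, 7)]
batches = batch_rays_by_patch(hits)

# Patch 7 is tessellated and its BVH built once for rays 0, 2 and 4,
# rather than once per ray.
builds_without_batching = len(hits)
builds_with_batching = len(batches)
```

In this toy input the number of tessellation and build passes drops from one per ray to one per patch.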

3. Quad BVH Construction

Structure

Each VD patch is recursively subdivided into quads.
Leaf nodes store:
- Quantized AABBs (2-byte integers)
- Texture coordinates and normals (4 bytes each)
Nodes are stored in breadth-first order.
Stackless traversal is used to reduce memory traffic and register pressure.
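One way to traverse a complete quad tree in breadth-first layout without a stack is plain index arithmetic, sketched below. Whether the paper uses exactly this scheme is not stated in this summary, so treat it as an assumption; `overlaps` is a hypothetical stand-in for the ray versus node box test.

```python
def first_child(i):
    """Index of the first of node i's four children."""
    return 4 * i + 1


def parent(i):
    """Index of node i's parent (i must not be the root)."""
    return (i - 1) // 4


def next_node(i):
    """Next node after skipping i's subtree, or -1 when traversal ends."""
    while i > 0:
        if i % 4 != 0:   # i is not the last of its four siblings
            return i + 1
        i = parent(i)    # last sibling: climb one level and retry
    return -1


def traverse(depth, overlaps):
    """Return indices of leaves reached through overlapping nodes only."""
    first_leaf = (4 ** depth - 1) // 3  # index of the first leaf node
    found = []
    i = 0
    while i != -1:
        if not overlaps(i):
            i = next_node(i)
        elif i >= first_leaf:
            found.append(i)
            i = next_node(i)
        else:
            i = first_child(i)
    return found


all_leaves = traverse(2, lambda i: True)
some_leaves = traverse(2, lambda i: i in {0, 2, 9, 10})
```

Because parent, child and sibling positions follow from the node index alone, no per-ray stack has to be kept in registers or memory.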

Compression

AABBs are quantized to reduce memory footprint.
Texture coordinates and normals are also compressed.
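A conservative 16-bit quantization of an AABB might look like the following sketch. It assumes the box is quantized relative to an enclosing reference box with non-zero extent on every axis; the reference box the paper actually uses is not given in this summary, and the function names are hypothetical.

```python
import math


def quantize_aabb(lo, hi, ref_lo, ref_hi, bits=16):
    """Quantize the box [lo, hi] relative to the box [ref_lo, ref_hi].

    Minima round down and maxima round up, so the quantized box
    always contains the original one.
    """
    scale = (1 << bits) - 1
    qlo, qhi = [], []
    for a, b, r0, r1 in zip(lo, hi, ref_lo, ref_hi):
        extent = r1 - r0  # assumed non-zero
        qlo.append(max(0, math.floor((a - r0) / extent * scale)))
        qhi.append(min(scale, math.ceil((b - r0) / extent * scale)))
    return qlo, qhi


def dequantize_aabb(qlo, qhi, ref_lo, ref_hi, bits=16):
    """Recover floating-point bounds from the quantized integers."""
    scale = (1 << bits) - 1
    lo = [r0 + q / scale * (r1 - r0) for q, r0, r1 in zip(qlo, ref_lo, ref_hi)]
    hi = [r0 + q / scale * (r1 - r0) for q, r0, r1 in zip(qhi, ref_lo, ref_hi)]
    return lo, hi


ref_lo, ref_hi = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
box_lo, box_hi = (0.25, 0.5, 0.1), (0.75, 0.9, 0.3)

qlo, qhi = quantize_aabb(box_lo, box_hi, ref_lo, ref_hi)
dlo, dhi = dequantize_aabb(qlo, qhi, ref_lo, ref_hi)
```

Each bound then fits in 2 bytes instead of a 4-byte float, at the cost of a slightly looser box.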

4. GPU Implementation with OpenCL

Parallelization Strategy

Rays intersecting VD patches are grouped and sorted.
A work group processes one VD patch at a time:
- Computes LOD (Level of Detail)
- Builds BVH
- Performs ray intersection
BVH data is stored in global memory because it does not fit in the limited local data share (LDS).

Atomic Operations

Atomic operations are used to update hit information safely across work groups.
A 64-bit atomic min operates on a value that encodes the hit distance together with other hit data.
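The packing trick behind a 64-bit atomic min can be sketched as follows. It relies on the fact that non-negative IEEE 754 floats sort in the same order as their bit patterns read as unsigned integers. Placing a 32-bit primitive ID in the low half is an assumption, since the summary only says "other data", and Python's `min` stands in for the GPU atomic.

```python
import struct


def pack_hit(distance, prim_id):
    """Pack a non-negative float32 distance and a 32-bit ID into 64 bits.

    The distance goes in the high half so that comparing packed keys
    compares distances first.
    """
    (bits,) = struct.unpack("<I", struct.pack("<f", distance))
    return (bits << 32) | (prim_id & 0xFFFFFFFF)


def unpack_hit(key):
    """Split a packed key back into (distance, prim_id)."""
    (distance,) = struct.unpack("<f", struct.pack("<I", key >> 32))
    return distance, key & 0xFFFFFFFF


# Candidate hits for one ray, as (distance, hypothetical primitive ID).
candidates = [(2.5, 11), (0.75, 42), (1.0, 7)]

# On the GPU each candidate would be written with an atomic min on the
# ray's 64-bit slot; here min() plays that role.
closest = min(pack_hit(d, i) for d, i in candidates)
closest_distance, closest_id = unpack_hit(closest)
```

A single atomic then updates the distance and its payload together, so no lock is needed to keep them consistent.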

5. Integration into Ray Tracer

Three-Level BVH Hierarchy

Top-Level: Stores meshes and transforms.
Middle-Level: Stores primitives (triangles, quads, VD patches).
Bottom-Level: Built on-the-fly for VD patches.

Traversal Strategy

Top and middle-level BVHs are traversed first.
VD patch intersections are deferred and processed in batches.

6. Performance and Results

Memory Usage

Memory usage is dramatically reduced compared to pre-tessellation:
- “Party” scene: 52 GB → 16 MB
- “Pumpkin” scene: 380 MB → 0.12 MB
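To put these figures in proportion, the reduction factors can be worked out directly. The only assumption is the unit convention of 1 GB = 1024 MB, which the summary does not specify.

```python
def reduction_factor(before_mb, after_mb):
    """Ratio of pre-tessellated size to direct ray tracing size."""
    return before_mb / after_mb


# "Party": 52 GB expressed in MB (assuming 1 GB = 1024 MB) versus 16 MB.
party = reduction_factor(52 * 1024, 16)
# "Pumpkin": 380 MB versus 0.12 MB.
pumpkin = reduction_factor(380, 0.12)
```

Both scenes come out at a reduction of roughly three thousand times.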

Rendering Time

Direct ray tracing is faster in most cases, despite the extra work of building geometry and BVHs on the fly.
Pre-tessellation is faster only for low-detail scenes.
The bottom-level BVH build is the most time-consuming part.

Indirect Illumination

Performance degrades with deeper ray bounces due to repeated BVH builds.
Caching BVHs is suggested as a future optimization.

Conclusion

The proposed method enables efficient GPU ray tracing of vector displacement mapped surfaces with:

Low memory footprint
Competitive or superior performance
Scalability for complex scenes

The paper emphasizes that optimizing bottom-level BVH construction and traversal is more critical than optimizing for simple primitives.