Neural Texture Block Compression
Paper: Neural Texture Block Compression
Authors: Shin Fujieda, Takahiro Harada
1. Problem & Motivation
- Material textures (diffuse, normal, roughness, etc.) are often stored as block‑compressed (BC) formats (e.g., BC1‑BC5) to keep GPU memory low.
- Traditional BC encoders treat each texture independently, ignoring the strong spatial and semantic correlations that exist across the multiple maps of a single material.
- Existing neural‑based texture compressors either:
- Encode each map separately (low compression gain), or
Random‑access neural compressors that need large latent grids (high memory, slow inference).
- Encode each map separately (low compression gain), or
Goal: Learn a compact, shared neural representation for all texture maps of a material, enabling sub‑linear storage while preserving the ability to random‑access decode each texel in real time.
2. Core Idea
- Treat a material as a set of correlated 2‑D signals (RGB maps + single‑channel maps) and learn a single neural field that can reconstruct any map on demand.
- Use a tiny MLP (≈ 2 k parameters) plus a learned hash‑grid encoding (as in Instant‑NGP) to map a 2‑D coordinate (u, v) to a latent feature vector.
- The latent vector is then passed through two lightweight heads:
- Endpoint head – predicts the four palette colors (the “endpoints”) for a BC block.
- Weight head – predicts the per‑pixel index (0‑3 for BC1, 0‑7 for BC4) using a soft‑argmax (straight‑through estimator) so the whole pipeline stays differentiable.
- The network is trained jointly on all maps of a material, sharing the same hash‑grid and MLP weights.
- At inference time, the hash‑grid is baked into a small binary table (≈ 10 KB) and the MLP weights are stored once per material.
3. Technical Contributions
| Contribution | What it solves | How it’s done | |————–|—————-|—————| | Unified material‑wide neural field | Captures inter‑map redundancy (e.g., diffuse and roughness often share edges) | Same hash‑grid + MLP for all maps; separate output heads only. | | Learned BC palette & indices | Produces valid BC1/BC4 data that can be decoded by existing hardware pipelines | Soft‑max over distances → expected weight; STE for argmax to obtain hard indices. | | Compact storage format | Reduces per‑material footprint from ~48 MB (standard BC) to 13 MB (aggressive) or 27 MB (conservative) while keeping random‑access. | Store: (i) 4 × 4 byte endpoints per 4×4 block, (ii) a tiny latent table (hash‑grid) and MLP weights. | | Real‑time random‑access decoding | No need to decode whole texture; suitable for streaming or on‑the‑fly material swaps. | Decode a texel by querying the hash‑grid → MLP → endpoint/weight → reconstruct color. |
4. Results (Key Numbers)
Dataset | Material | Original BC size | Aggressive NTBC | Conservative NTBC | PSNR (dB) – Diffuse |
---|---|---|---|---|---|
ambientCG – MetalPlates013 | 6 maps (RGB + 4 single‑channel) | 48 MB | 13.4 MB (72 % reduction) | 26.7 MB (45 % reduction) | 38.6 (aggr) / 38.6 (cons) vs. 42.5 (reference) |
Poly Haven – roof_09 | 5 maps | 40 MB | 13.4 MB | 26.7 MB | 37.4 / 37.4 vs. 40.3 (reference) |
Poly Haven – forest_sand_01 (hard case) | 5 high‑freq maps | 40 MB | 13.4 MB | 26.7 MB | 25.9 / 26.0 vs. 31.4 (reference) |
- Visual quality: Naïve BC (or a naïve NN that predicts weights directly) shows obvious block artifacts. NTBC removes most artifacts, at the cost of slight blur.
- Performance: Encoding (training) takes ~20 k steps per material (≈ 5 min on a RTX 3090). Decoding a texel is < 1 µs, fully compatible with real‑time shaders.
5. Significance & Impact
- Storage‑efficient material libraries: Game engines and asset pipelines can store entire material sets (diffuse, normal, roughness, AO, etc.) in a fraction of the usual BC footprint without sacrificing random‑access.
- Drop‑in compatibility: The compressed data can be fed to existing BC decoders; only the encoding stage is neural.
- Scalable to higher‑quality BC formats: The same framework can be extended to BC7 or ASTC, promising even better visual fidelity.
- Foundation for future work: Combining this with texture‑specific encodings (e.g., latent‑space for specular maps) or learned hash probing could push quality further while retaining the compact, random‑access nature.
6. Take‑away TL;DR
The authors present a material‑wide neural compressor that learns a tiny shared neural field for all texture maps of a material, outputs valid block‑compressed data (BC1/BC4), and achieves > 70 % storage reduction with real‑time random‑access decoding. It bridges the gap between classic BC compression (fast, hardware‑friendly) and modern neural codecs (high compression, but usually non‑random‑access), opening a practical path for neural texture compression in real‑time rendering pipelines.