Stingray Renderer Walkthrough #4: Sorting

Introduction

This post focuses on the ordering of commands in the RenderContexts. I briefly touched on this subject in the last post, and if you’ve implemented a rendering engine before you’re probably not new to this problem. Basically, we need a way to make sure our RenderJobPackages (draw calls) end up on the screen in the correct order, from both a visual and a performance point of view. Some concrete examples:

  1. Make sure g-buffers and shadow maps are rendered before any lighting happens.
  2. Make sure opaque geometry is rendered front to back to reduce overdraw.
  3. Make sure transparent geometry is rendered back to front for alpha blending to generate correct results.
  4. Make sure the sky dome is rendered after all opaque geometry but before any transparent geometry.
  5. All of the above but also strive to reduce state switches as much as possible.
  6. All of the above but depending on GPU architecture maybe shift some work around to better utilize the hardware.

There are many ways of tackling this problem, and it’s not uncommon for engines to use multiple sorting systems and spend quite a lot of frame time getting this right.

Personally I’m a big fan of explicit ordering with a single stable sort. What I mean by explicit ordering is that every command recorded to a RenderContext already knows when it will be executed relative to other commands. For us this knowledge takes the form of a 64-bit sort_key. When two commands have the exact same sort_key, we rely on the sort being stable so that no temporal instabilities are introduced in the final output.

The reasons I like this approach are many:

  1. It’s trivial to implement compared to various bucketing schemes and sorting of those buckets.
  2. We only need to visit renderable objects once per view (when calling their render() function), no additional pre-visits for sorting are needed.
  3. The sort is typically fast, and cost is isolated and easy to profile.
  4. Parallel rendering works out of the box: we can just take the Command arrays of all the RenderContexts and merge them before sorting.
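
To make the last point concrete, here is a minimal sketch of merging the command arrays from several contexts and running one stable sort over them. The `Command` and `Context` structs and the `merge_and_sort` function are hypothetical stand-ins, not the actual Stingray types, and `std::stable_sort` is used only for brevity.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a recorded command: a sort key plus an opaque payload.
struct Command {
    uint64_t sort_key;
    int      payload;   // stands in for a reference to a RenderJobPackage or similar
};

// Hypothetical stand-in for a RenderContext: just the array of recorded commands.
struct Context {
    std::vector<Command> commands;
};

// Concatenate the command arrays of all contexts, then run a single stable sort on the key.
// Commands with identical keys keep the order in which they were concatenated.
std::vector<Command> merge_and_sort(const std::vector<Context> &contexts)
{
    std::vector<Command> merged;
    for (const Context &c : contexts)
        merged.insert(merged.end(), c.commands.begin(), c.commands.end());

    std::stable_sort(merged.begin(), merged.end(),
        [](const Command &a, const Command &b) { return a.sort_key < b.sort_key; });
    return merged;
}
```

Because the sort is stable, two commands with the same sort_key come out in the same relative order every frame, as long as the contexts are merged in a consistent order.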

To make this work each command needs to know its absolute sort_key. Let’s break down the sort_key we use when working with our data-driven rendering pipe in Stingray. (Note: if the user doesn’t care about playing nicely with our system for data-driven rendering, it is fine to completely ignore the bit allocation patterns described below and roll their own.)

sort_key breakdown

Most significant bit on the left, here are our bit ranges:

MSB [ 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ] LSB
      ^ ^       ^  ^                                   ^^                 ^
      | |       |  |                                   ||                 |- 3 bits - Shader System (Pass Immediate)
      | |       |  |                                   ||- 16 bits - Depth
      | |       |  |                                   |- 1 bit - Instance bit
      | |       |  |- 32 bits - User defined
      | |       |- 3 bits - Shader System (Pass Deferred)
      | |- 7 bits - Layer System
      |- 2 bits - Unused
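
The same layout can be written down as shifts and widths. This is a sketch only: the constant names and the `field` / `extract` helpers are made up for illustration, and only the widths and their order come from the diagram.

```cpp
#include <cstdint>

// Field widths, from least to most significant. Names are hypothetical.
constexpr unsigned PASS_IMMEDIATE_BITS = 3;
constexpr unsigned DEPTH_BITS          = 16;
constexpr unsigned INSTANCE_BITS       = 1;
constexpr unsigned USER_BITS           = 32;
constexpr unsigned PASS_DEFERRED_BITS  = 3;
constexpr unsigned LAYER_BITS          = 7;

// The shift of each field is the sum of the widths of all fields to its right.
constexpr unsigned PASS_IMMEDIATE_SHIFT = 0;
constexpr unsigned DEPTH_SHIFT          = PASS_IMMEDIATE_SHIFT + PASS_IMMEDIATE_BITS; // 3
constexpr unsigned INSTANCE_SHIFT       = DEPTH_SHIFT + DEPTH_BITS;                   // 19
constexpr unsigned USER_SHIFT           = INSTANCE_SHIFT + INSTANCE_BITS;             // 20
constexpr unsigned PASS_DEFERRED_SHIFT  = USER_SHIFT + USER_BITS;                     // 52
constexpr unsigned LAYER_SHIFT          = PASS_DEFERRED_SHIFT + PASS_DEFERRED_BITS;   // 55

static_assert(LAYER_SHIFT + LAYER_BITS + 2 == 64, "fields plus the 2 unused bits fill 64 bits");

// Mask `value` to `bits` bits and move it into position.
constexpr uint64_t field(uint64_t value, unsigned bits, unsigned shift)
{
    return (value & ((uint64_t(1) << bits) - 1)) << shift;
}

// Read a field back out of a key.
constexpr uint64_t extract(uint64_t key, unsigned bits, unsigned shift)
{
    return (key >> shift) & ((uint64_t(1) << bits) - 1);
}
```

Since the layer range sits in the most significant used bits, it dominates the comparison. Pass (deferred), user bits, instance bit, depth and pass (immediate) only break ties within a layer, in that order.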

2 bits - Unused

Nothing to see here, moving on… (Not really sure why these 2 bits are unused, I guess they weren’t at some point but for the moment they are always zero) :)

7 bits - Layer System

This 7-bit range is managed by the “Layer system”. The Layer system is responsible for controlling the overall scheduling of a frame and is set up in the render_config file. It’s a central part of the data-driven rendering architecture in Stingray. It allows you to configure which layers to expose to the shader system and in which order these layers should be drawn. We will look closer at the implementation of the layer system in a later post, but in the interest of clarifying how it interoperates with the sort_key here’s a small example:


default = [
    // sort_key = [ 00000000 10000000 00000000 00000000 00000000 00000000 00000000 00000000 ]
    { name="gbuffer" render_targets=["gbuffer0", "gbuffer1", "gbuffer2", "gbuffer3"]
      depth_stencil_target="depth_stencil_buffer" sort="FRONT_BACK" profiling_scope="gbuffer" }

    // sort_key = [ 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ]
    { name="decals" render_targets=["gbuffer0" "gbuffer1"] depth_stencil_target="depth_stencil_buffer"
      profiling_scope="decal" sort="EXPLICIT" }

    // sort_key = [ 00000001 10000000 00000000 00000000 00000000 00000000 00000000 00000000 ]
    { resource_generator="lighting" profiling_scope="lighting" }

    // sort_key = [ 00000010 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ]
    { name="emissive" render_targets=["hdr0"] depth_stencil_target="depth_stencil_buffer"
      sort="FRONT_BACK" profiling_scope="emissive" }
]

Above we have three layers exposed to the shader system and one kick of a resource_generator called lighting (more about resource_generators in a later post). The layers are rendered in the order they are declared. This is handled by letting each new entry (layer or resource_generator kick) increment the 7-bit range belonging to the Layer System by 1, as can be seen in the sort_key comments above.

The shader author dictates into which layer(s) it wants to render. When a RenderJobPackage is recorded to the RenderContext (as described in the last post) the correct layer sort_keys are looked up from the layer system and the result is bitwise ORed together with the sort_key value piped as argument to RenderContext::render().
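
As a rough sketch of that lookup-and-OR step: the `LayerTable` class and `final_sort_key` function below are hypothetical and only mirror the behaviour described here, not the real layer system. Unnamed entries, such as the resource_generator kick in the example, are represented by an empty string and still consume a slot.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

constexpr unsigned LAYER_SHIFT = 55;  // 3 + 16 + 1 + 32 + 3 bits sit to the right of the layer range

// Hypothetical layer table: entries receive increasing values in the 7-bit layer range,
// in declaration order, starting at 1 as in the render_config comments.
class LayerTable {
public:
    explicit LayerTable(const std::vector<std::string> &names_in_declaration_order)
    {
        uint64_t index = 0;
        for (const std::string &name : names_in_declaration_order) {
            ++index;                       // every entry, named or not, takes one slot
            if (!name.empty())
                _keys[name] = index << LAYER_SHIFT;
        }
    }

    // Returns the layer bits for `name`, or 0 if the layer is unknown.
    uint64_t layer_key(const std::string &name) const
    {
        auto it = _keys.find(name);
        return it == _keys.end() ? 0 : it->second;
    }

private:
    std::unordered_map<std::string, uint64_t> _keys;
};

// OR the layer bits into the sort_key the caller passed in.
uint64_t final_sort_key(const LayerTable &layers, const std::string &layer, uint64_t user_sort_key)
{
    return layers.layer_key(layer) | user_sort_key;
}
```

With the four entries from the example, "gbuffer" maps to 0x0080000000000000, "decals" to 0x0100000000000000 and "emissive" to 0x0200000000000000, which matches the bit patterns in the config comments. A 7-bit range leaves room for at most 127 entries when counting from 1.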

3 bits - Shader System (Pass Deferred)

The next 3 bits are controlled by the Shader System. These three bits encode the shader pass index within a layer. When I say shader in this context I refer to our ShaderTemplate::Context, which is basically a wrapper around multiple linked shaders rendering into one or many layers. (Nathan Reed recently blogged about “The Many Meanings of ‘Shader’”; in his analogy our ShaderTemplate is the same as an “Effect”.)

Since we can have a multi-pass shader rendering into the same layer we need to encode the pass index into the sort_key, that is what this 3 bit range is used for.

32 bits - User defined

We then have 32 user-defined bits. These bits are primarily used by our “Resource Generator” system (I will be covering this system in the post about render_config & data-driven rendering later), but the user is free to use them any way they like and still maintain compatibility with the data-driven rendering system.

1 bit - Instance bit

This single bit also comes from the Shader System and is set if the shader implements support for “Instance Merging”. I will be covering this in a bit more detail in my next post about the RenderDevice, but essentially this bit allows us to scan through all commands and find ranges of commands that can potentially be merged into fewer draw calls.

16 bits - Depth

One of the arguments piped to RenderContext::render() is an unsigned normalized depth value (0.0-1.0). This value gets quantized into these 16 bits and is what drives the front-to-back vs back-to-front sorting of RenderJobPackages. If the sorting criteria for the layer (see layer example above) is set to back-to-front we simply flip the bits in this range.
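
A minimal sketch of the quantization and the bit flip could look like this. The function name, the enum and the choice of rounding to nearest are assumptions made for illustration; the text only states that the value is quantized to 16 bits and flipped for back-to-front layers.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

constexpr unsigned DEPTH_BITS  = 16;
constexpr unsigned DEPTH_SHIFT = 3;                                 // above the 3 "Pass Immediate" bits
constexpr uint64_t DEPTH_MAX   = (uint64_t(1) << DEPTH_BITS) - 1;   // 0xFFFF

enum class DepthSort { FrontToBack, BackToFront };

// Quantize a normalized depth in [0, 1] into the 16-bit depth range of the key.
// For back-to-front layers the bits are inverted so that larger depths sort first.
// Out-of-range input is clamped; NaN is not handled in this sketch.
uint64_t depth_bits(float normalized_depth, DepthSort mode)
{
    float d = std::min(std::max(normalized_depth, 0.0f), 1.0f);
    uint64_t q = static_cast<uint64_t>(std::lround(d * float(DEPTH_MAX)));
    if (mode == DepthSort::BackToFront)
        q = ~q & DEPTH_MAX;
    return q << DEPTH_SHIFT;
}
```

Flipping the bits turns an ascending sort on the key into a descending sort on depth, so the same sort serves both kinds of layers.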

3 bits - Shader System (Pass Immediate)

A shader can be configured to run in “Immediate Mode” instead of “Deferred Mode” (default). This forces passes in a multi-pass shader to run immediately after each other and is achieved by moving the pass index bits into the least significant bits of the sort_key. The concept is probably easiest to explain with an artificial example and some pseudo code:

Take a simple scene with a few instances of the same mesh, where each mesh records one RenderJobPackage to one or many RenderContexts and all RenderJobPackages are rendered with the same multi-pass shader.

In “Deferred Mode” (i.e. pass indices encoded in the “Shader System (Pass Deferred)” range) you would get something like this:

foreach (pass in multi-pass-shader)
    foreach (render-job in render-job-packages)
        render (render-job, pass)
    end
end

If the shader is configured to run in “Immediate Mode” you would instead get something like this:

foreach (render-job in render-job-packages)
    foreach (pass in multi-pass-shader)
        render (render-job, pass)
    end
end

As you can probably imagine, the latter results in more shader / state switches but can sometimes be necessary to guarantee correctly rendered results. A typical example is multi-pass shaders that do alpha blending.
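
The two loop orders fall out of the key alone, which the following sketch tries to show. Everything here (`Draw`, `record_and_sort`, the pre-quantized depth array) is hypothetical; only the bit positions come from the sort_key breakdown earlier in the post.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

constexpr unsigned PASS_IMMEDIATE_SHIFT = 0;
constexpr unsigned DEPTH_SHIFT          = 3;
constexpr unsigned PASS_DEFERRED_SHIFT  = 52;

struct Draw {
    uint64_t sort_key;
    int      job;
    int      pass;
};

// Record every (job, pass) pair, placing the pass index in either the deferred
// or the immediate bit range, then sort once on the key.
// depth_of_job[j] is the already-quantized 16-bit depth of job j.
std::vector<Draw> record_and_sort(const std::vector<uint16_t> &depth_of_job,
                                  int pass_count, bool immediate)
{
    std::vector<Draw> draws;
    for (int job = 0; job < int(depth_of_job.size()); ++job) {
        for (int pass = 0; pass < pass_count; ++pass) {
            uint64_t key = uint64_t(depth_of_job[job]) << DEPTH_SHIFT;
            key |= uint64_t(pass) << (immediate ? PASS_IMMEDIATE_SHIFT : PASS_DEFERRED_SHIFT);
            draws.push_back({key, job, pass});
        }
    }
    std::stable_sort(draws.begin(), draws.end(),
        [](const Draw &a, const Draw &b) { return a.sort_key < b.sort_key; });
    return draws;
}
```

With two jobs at different depths and two passes, the deferred variant sorts to (job 0, pass 0), (job 1, pass 0), (job 0, pass 1), (job 1, pass 1), while the immediate variant sorts to (job 0, pass 0), (job 0, pass 1), (job 1, pass 0), (job 1, pass 1). Note that in this simplified sketch two jobs with identical quantized depth and no distinguishing user bits would still interleave by pass in immediate mode.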

Wrap up

The actual sort is implemented using a standard stable radix sort and happens immediately after the user has called RenderDevice::dispatch() handing over n-number of RenderContexts to the RenderDevice for translation into graphics API calls.
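
For readers who haven’t written one, a stable radix sort over 64-bit keys can be sketched as below. This is a textbook least-significant-digit variant with 8-bit digits, not the Stingray implementation, and the `Command` struct is a hypothetical stand-in.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Command {
    uint64_t sort_key;
    uint32_t payload;   // stands in for whatever the command refers to
};

// Stable least-significant-digit radix sort: 8 passes of 8 bits over the 64-bit key.
void radix_sort(std::vector<Command> &commands)
{
    std::vector<Command> scratch(commands.size());
    for (unsigned shift = 0; shift < 64; shift += 8) {
        // offsets[d + 1] first counts how many keys have digit d ...
        std::array<std::size_t, 257> offsets{};
        for (const Command &c : commands)
            ++offsets[((c.sort_key >> shift) & 0xFF) + 1];
        // ... and after the prefix sum, offsets[d] is where digit d starts in the output.
        for (std::size_t i = 1; i < offsets.size(); ++i)
            offsets[i] += offsets[i - 1];
        // Scatter in input order, which is what keeps each pass (and the whole sort) stable.
        for (const Command &c : commands)
            scratch[offsets[(c.sort_key >> shift) & 0xFF]++] = c;
        commands.swap(scratch);
    }
}
```

The cost is linear in the number of commands for a fixed key width, which fits the earlier point about the sort being fast and easy to profile.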

Next post will cover this and give an overview of what a typical rendering back-end (RenderDevice) looks like in Stingray. Stay tuned.
