Render Resources
Before any rendering can happen we need a way to reason about GPU resources. Since we want all graphics-API-specific code to stay isolated, we need some kind of abstraction on the engine side; for that we have an interface called RenderDevice. All calls to graphics APIs like D3D, OpenGL, GNM, Metal, etc. stay behind this interface. We will be covering the RenderDevice in a later post, so for now just know that it is there.
We want a graphics-API-agnostic representation for a number of different types of resources, and we need to link these representations to their counterparts on the RenderDevice side. This linking is handled through a POD struct called RenderResource:
struct RenderResource
{
    enum {
        TEXTURE, RENDER_TARGET, DEPENDENT_RENDER_TARGET, BACK_BUFFER_WRAPPER,
        CONSTANT_BUFFER, VERTEX_STREAM, INDEX_STREAM, RAW_BUFFER,
        BATCH_INFO, VERTEX_DECLARATION, SHADER,
        NOT_INITIALIZED = 0xFFFFFFFF
    };
    uint32_t render_resource_handle;
};
Any engine resource that also needs a representation on the RenderDevice side inherits from this struct. It contains a single member, render_resource_handle, which is used to look up the correct graphics-API-specific representation in the RenderDevice.
The most significant 8 bits of render_resource_handle hold the type enum; the lower 24 bits are simply an index into an array for that specific resource type inside the RenderDevice.
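The bit layout described above can be sketched with a pair of small helpers. These are illustrative only (the names and exact code are my own, not the actual Stingray implementation), but they show how a type and a per-type array index pack into the single 32-bit handle:

```cpp
#include <cstdint>

// Hypothetical helpers for the handle layout described above:
// type enum in the top 8 bits, array index in the lower 24 bits.
static uint32_t make_render_resource_handle(uint32_t type, uint32_t index)
{
    // Assumes type < 256 and index < 2^24.
    return (type << 24) | (index & 0x00FFFFFF);
}

static uint32_t resource_type(uint32_t handle)
{
    return handle >> 24;
}

static uint32_t resource_index(uint32_t handle)
{
    return handle & 0x00FFFFFF;
}
```

With this layout the RenderDevice can dispatch on resource_type() and then index straight into the array for that resource type with resource_index().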
Various Render Resources
Let’s take a look at the different render resources that can be found in Stingray:
- Texture - A regular texture. This object wraps the various texture layouts such as 2D, Cube, and 3D.
- RenderTarget - Basically the same as Texture but writable from the GPU.
- DependentRenderTarget - Similar to RenderTarget but with logic for inheriting properties from another RenderTarget. This is used for creating render targets that need to be reallocated when the output window (swap chain) is resized.
- BackBufferWrapper - Special type of RenderTarget created inside the RenderDevice as part of the swap chain creation. Almost all render targets are explicitly created by the user; this is the only exception, as the back buffer associated with the swap chain is typically created together with the swap chain.
- ShaderConstantBuffer - Shader constant buffers designed for explicit update and sharing between multiple shaders, mainly used for “view-global” state.
- VertexStream - A regular vertex buffer.
- VertexDeclaration - Describes the contents of one or more VertexStreams.
- IndexStream - A regular index buffer.
- RawBuffer - A linear memory buffer. Can be set up for GPU writing through a UAV (Unordered Access View).
- Shader - For now just think of this as something containing everything needed to build a full pipeline state object (PSO): basically a wrapper over a number of shaders, render states, sampler states, etc. I will cover the shader system in a later post.
Most of the above resources have a few things in common:
- They describe a buffer either populated by the CPU or by the GPU
- CPU-populated buffers have a validity field describing their update frequency:
  - STATIC - The buffer is immutable and won’t change after creation; typically most buffers coming from DCC assets are STATIC.
  - UPDATABLE - The buffer can be updated but changes less than once per frame, e.g. UI elements, post-processing geometry and similar.
  - DYNAMIC - The buffer changes frequently, at least once per frame but potentially many times in a single frame, e.g. particle systems.
- They carry enough data for creating a graphics-API-specific representation inside the RenderDevice, i.e. they know about strides, sizes, view requirements (e.g. whether a UAV should be created or not), etc.
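To make the commonalities above concrete, here is a hypothetical sketch of what a CPU-populated buffer resource might carry. The field names and layout are my own invention for illustration, not the actual Stingray structs:

```cpp
#include <cstdint>

// Illustrative sketch only: a vertex-buffer resource carrying the
// validity field plus enough data (stride, element count) for the
// RenderDevice to create its API-specific representation.
struct VertexStream /* : RenderResource */
{
    enum Validity { STATIC, UPDATABLE, DYNAMIC };

    Validity validity;   // update frequency, as described above
    uint32_t stride;     // bytes per vertex
    uint32_t n_vertices; // number of elements in the buffer

    uint32_t size() const { return stride * n_vertices; } // total byte size
};
```

The key point is that the engine-side struct is pure data; everything the RenderDevice needs to allocate the real GPU buffer travels with it.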
Render Resource Context
With the RenderResource concept sorted, we’ll go through the interface for creating and destroying the RenderDevice representations of these resources. That interface is called RenderResourceContext (RRC).
We want resource creation to be thread safe, and while the RenderResourceContext in itself isn’t, we can achieve free threading by allowing the user to create any number of RRCs; as long as they don’t touch the same RRC from multiple threads, everything will be fine.
Similar to many other rendering systems in Stingray, the RRC is basically just a small helper class wrapping an abstract “command buffer”. On this command buffer we put what we call “packages”, describing everything that is needed for creating/destroying RenderResource objects. These packages have variable length depending on what kind of object they represent. In addition, the RRC can also hold platform-specific allocators that allow allocating/deallocating GPU-mapped memory directly, avoiding any additional memory shuffling in the RenderDevice. This mechanism allows streaming e.g. textures and other immutable buffers directly into GPU memory on platforms that provide that kind of low-level control.
Typically the only two functions the user needs to care about are:
class RenderResourceContext
{
public:
    void alloc(RenderResource *resource);
    void dealloc(RenderResource *resource);
};
When the user is done allocating/deallocating resources, they hand the RRC over either directly to the RenderDevice or to the RenderInterface.
class RenderDevice
{
public:
    virtual void dispatch(uint32_t n_contexts, RenderResourceContext **rrc,
        uint32_t gpu_affinity_mask = RenderContext::GPU_DEFAULT) = 0;
};
Handing it over directly to the RenderDevice requires the caller to be on the controller thread for rendering, as RenderDevice::dispatch() isn’t thread safe. If the caller is on any other thread (e.g. one of the worker threads or the resource streaming thread), RenderInterface::dispatch() should be used instead. We will cover the RenderInterface in a later post, so for now just think of it as a way of piping data into the renderer from an arbitrary thread.
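The overall flow can be sketched with some minimal stubs. Everything below is hypothetical and heavily simplified (the real packages, platform allocators, and gpu_affinity_mask are omitted); it only illustrates the pattern of filling per-thread RRCs and dispatching them from the controller thread:

```cpp
#include <cstdint>
#include <vector>

// Stub: engine-side resource, as in the struct shown earlier.
struct RenderResource
{
    uint32_t render_resource_handle = 0xFFFFFFFF; // NOT_INITIALIZED
};

// Stub: queues creation "packages"; one instance per thread, never shared.
struct RenderResourceContext
{
    std::vector<RenderResource*> pending;
    void alloc(RenderResource *r)  { pending.push_back(r); }
    void dealloc(RenderResource *) { /* would queue a destroy package */ }
};

// Stub: consumes any number of contexts; must be called from the
// controller thread for rendering, since dispatch() isn't thread safe.
struct RenderDevice
{
    uint32_t n_dispatched = 0;
    void dispatch(uint32_t n_contexts, RenderResourceContext **rrc)
    {
        for (uint32_t i = 0; i != n_contexts; ++i)
            n_dispatched += (uint32_t)rrc[i]->pending.size();
    }
};
```

A typical use would be two worker threads each filling their own RRC, then both contexts being handed to dispatch() in one call from the controller thread.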
Wrap up
The main reason for having the RenderResourceContext concept, instead of exposing allocate()/deallocate() functions directly in the RenderDevice/RenderInterface interfaces, is efficiency. We need to allocate and deallocate lots of resources, sometimes in parallel from multiple threads. Decoupling the interface for doing so makes it easy to schedule when in the frame the actual RenderDevice representations get created; it also makes the code easier to maintain, as we don’t have to worry about the thread safety of the RenderResourceContext.
In the next post we will discuss RenderJobs and RenderContexts, which are the two main building blocks for creating and scheduling draw calls and state changes.
Stay tuned.