= Ray Tracing
Unlike draw commands, which use rasterization, ray tracing is a rendering
method that generates an image by tracing the path of rays which have a
single origin and using shaders to determine the final colour of an image.
Ray tracing uses a separate rendering pipeline from both the graphics and
compute pipelines (see <<pipelines-raytracing,Ray tracing Pipeline>>).
It has a unique set of programmable and fixed function stages.
image::{images}/raypipe.svg[align="center",title="Ray tracing Pipeline",opts="{imageopts}"]
Interaction between the different shader stages in the ray tracing pipeline
== Ray Tracing Commands
_Ray tracing commands_ provoke work in the ray tracing pipeline.
Ray tracing commands are recorded into a command buffer and when executed by
a queue will produce work that executes according to the currently bound ray
tracing pipeline.
A ray tracing pipeline must: be bound to a command buffer before any ray
tracing commands are recorded in that command buffer.
Each ray tracing call operates on a set of shader stages that are specific
to the ray tracing pipeline as well as a set of
sname:VkAccelerationStructureNV objects, which describe the scene geometry
in an implementation-specific way.
The relationship between the ray tracing pipeline object and the
acceleration structures is passed into the ray tracing command in a
slink:VkBuffer object known as a _shader binding table_.
During execution, control alternates between scheduling and other
operations.
The scheduling functionality is implementation-specific and is responsible
for workload execution.
The shader stages are programmable.
_Traversal_, which refers to the process of traversing acceleration
structures to find potential intersections of rays with geometry, is fixed
function.
The programmable portions of the pipeline are exposed in a single-ray
programming model.
Each GPU thread handles one ray at a time.
Memory operations can: be synchronized using standard memory barriers.
However, communication and synchronization between threads is not allowed.
In particular, the use of compute pipeline synchronization functions is not
supported in the ray tracing pipeline.
[open,refpage='vkCmdTraceRaysNV',desc='Initialize a ray tracing dispatch',type='protos']
:refpage: vkCmdTraceRaysNV
To dispatch a ray tracing call use:
* pname:commandBuffer is the command buffer into which the command will be
recorded.
* pname:raygenShaderBindingTableBuffer is the buffer object that holds the
shader binding table data for the ray generation shader stage.
* pname:raygenShaderBindingOffset is the offset in bytes (relative to
pname:raygenShaderBindingTableBuffer) of the ray generation shader being
used for the trace.
* pname:missShaderBindingTableBuffer is the buffer object that holds the
shader binding table data for the miss shader stage.
* pname:missShaderBindingOffset is the offset in bytes (relative to
pname:missShaderBindingTableBuffer) of the miss shader being used for
the trace.
* pname:missShaderBindingStride is the size in bytes of each shader
binding table record in pname:missShaderBindingTableBuffer.
* pname:hitShaderBindingTableBuffer is the buffer object that holds the
shader binding table data for the hit shader stages.
* pname:hitShaderBindingOffset is the offset in bytes (relative to
pname:hitShaderBindingTableBuffer) of the hit shader group being used
for the trace.
* pname:hitShaderBindingStride is the size in bytes of each shader binding
table record in pname:hitShaderBindingTableBuffer.
* pname:callableShaderBindingTableBuffer is the buffer object that holds
the shader binding table data for the callable shader stage.
* pname:callableShaderBindingOffset is the offset in bytes (relative to
pname:callableShaderBindingTableBuffer) of the callable shader being
used for the trace.
* pname:callableShaderBindingStride is the size in bytes of each shader
binding table record in pname:callableShaderBindingTableBuffer.
* pname:width is the width of the ray trace query dimensions.
* pname:height is the height of the ray trace query dimensions.
* pname:depth is the depth of the ray trace query dimensions.
When the command is executed, a ray generation group of [eq]#pname:width
{times} pname:height {times} pname:depth# rays is assembled.
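As a minimal sketch of the arithmetic above, the number of rays in the ray
generation group can be computed on the host as shown below.
The helper name `trace_ray_count` is hypothetical and is not part of the
API; it assumes the product fits in 64 bits.

```c
#include <stdint.h>

/* Number of rays assembled for one dispatch: width * height * depth.
 * The first operand is widened to 64 bits so the multiplication is not
 * performed in 32-bit arithmetic. Assumes the product fits in 64 bits. */
static uint64_t trace_ray_count(uint32_t width, uint32_t height, uint32_t depth)
{
    return (uint64_t)width * height * depth;
}
```

For example, a dispatch with pname:width 1920, pname:height 1080 and
pname:depth 1 assembles 2073600 rays.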
.Valid Usage
* [[VUID-vkCmdTraceRaysNV-raygenShaderBindingOffset-02455]]
pname:raygenShaderBindingOffset must: be less than the size of
* [[VUID-vkCmdTraceRaysNV-raygenShaderBindingOffset-02456]]
pname:raygenShaderBindingOffset must: be a multiple of
* [[VUID-vkCmdTraceRaysNV-missShaderBindingOffset-02457]]
pname:missShaderBindingOffset must: be less than the size of
* [[VUID-vkCmdTraceRaysNV-missShaderBindingOffset-02458]]
pname:missShaderBindingOffset must: be a multiple of
* [[VUID-vkCmdTraceRaysNV-hitShaderBindingOffset-02459]]
pname:hitShaderBindingOffset must: be less than the size of
* [[VUID-vkCmdTraceRaysNV-hitShaderBindingOffset-02460]]
pname:hitShaderBindingOffset must: be a multiple of
* [[VUID-vkCmdTraceRaysNV-callableShaderBindingOffset-02461]]
pname:callableShaderBindingOffset must: be less than the size of
* [[VUID-vkCmdTraceRaysNV-callableShaderBindingOffset-02462]]
pname:callableShaderBindingOffset must: be a multiple of
* [[VUID-vkCmdTraceRaysNV-missShaderBindingStride-02463]]
pname:missShaderBindingStride must: be a multiple of
* [[VUID-vkCmdTraceRaysNV-hitShaderBindingStride-02464]]
pname:hitShaderBindingStride must: be a multiple of
* [[VUID-vkCmdTraceRaysNV-callableShaderBindingStride-02465]]
pname:callableShaderBindingStride must: be a multiple of
* [[VUID-vkCmdTraceRaysNV-missShaderBindingStride-02466]]
pname:missShaderBindingStride must: be less than or equal to
* [[VUID-vkCmdTraceRaysNV-hitShaderBindingStride-02467]]
pname:hitShaderBindingStride must: be less than or equal to
* [[VUID-vkCmdTraceRaysNV-callableShaderBindingStride-02468]]
pname:callableShaderBindingStride must: be less than or equal to
* [[VUID-vkCmdTraceRaysNV-width-02469]]
pname:width must: be less than or equal to
* [[VUID-vkCmdTraceRaysNV-height-02470]]
pname:height must: be less than or equal to
* [[VUID-vkCmdTraceRaysNV-depth-02471]]
pname:depth must: be less than or equal to
== Shader Binding Table
A _shader binding table_ is a resource which establishes the relationship
between the ray tracing pipeline and the acceleration structures that were
built for the ray tracing query.
It indicates the shaders that operate on each geometry in an acceleration
structure.
In addition, it contains the resources accessed by each shader, including
indices of textures and constants.
The application allocates and manages _shader binding tables_ as
slink:VkBuffer objects.
Each entry in the shader binding table consists of
pname:shaderGroupHandleSize bytes of data as queried by
flink:vkGetRayTracingShaderGroupHandlesNV to refer to the shader that it
The remainder of the data specified by the stride is application-visible
data that can be referenced by a code:shaderRecordNV block in the shader.
The shader binding tables to use in a ray tracing query are passed to
Shader binding tables are read-only in shaders that are executing on the ray
tracing pipeline.
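The record layout described above (a shader group handle followed by
application-visible data, padded out to the stride) can be sketched on the
host roughly as follows.
This is an illustration only: `align_up`, `sbt_record_stride`,
`sbt_write_record` and the `record_alignment` parameter are hypothetical
names, and the table is assumed to be a CPU-side staging copy that the
application later uploads to the slink:VkBuffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Round value up to the next multiple of alignment.
 * Assumes alignment is non-zero and the sum does not overflow 32 bits. */
static uint32_t align_up(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0);
    return (value + alignment - 1) / alignment * alignment;
}

/* Stride of one record: handle bytes plus application data, padded to
 * record_alignment (a caller-supplied, hypothetical parameter). */
static uint32_t sbt_record_stride(uint32_t handle_size,
                                  uint32_t app_data_size,
                                  uint32_t record_alignment)
{
    return align_up(handle_size + app_data_size, record_alignment);
}

/* Write record number `index` into a staging copy of the table:
 * the handle first, then the application data directly after it. */
static void sbt_write_record(uint8_t *table, uint32_t stride, uint32_t index,
                             const void *handle, uint32_t handle_size,
                             const void *app_data, uint32_t app_data_size)
{
    uint8_t *record = table + (size_t)stride * index;

    assert(handle_size + app_data_size <= stride);
    memcpy(record, handle, handle_size);
    if (app_data_size != 0)
        memcpy(record + handle_size, app_data, app_data_size);
}
```

With a 16-byte handle, 20 bytes of application data and a 16-byte record
alignment, this sketch yields a stride of 48 bytes.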
=== Indexing Rules
In order to execute the correct shaders and access the correct resources
during a ray tracing dispatch, the implementation must: be able to locate
shader binding table entries at various stages of execution.
This is accomplished by defining a set of indexing rules that compute shader
binding table record positions relative to the buffer's base address in
memory.
The application must: organize the contents of the shader binding table's
memory in a way that application of the indexing rules will lead to correct
==== Ray Generation Shaders
Only one ray generation shader is executed per ray tracing dispatch.
Its location is passed into flink:vkCmdTraceRaysNV using the
pname:raygenShaderBindingTableBuffer and pname:raygenShaderBindingOffset
parameters.
No indexing is applied.
==== Hit Shaders
The base for the computation of intersection, any-hit and closest hit shader
locations is the code:instanceShaderBindingTableRecordOffset value stored
with each instance of a top-level acceleration structure.
This value determines the beginning of the shader binding table records for
a given instance.
Each geometry in the instance must: have at least one hit program record.
In the following rule, _geometryIndex_ refers to the location of the
geometry within the instance.
The code:sbtRecordStride and code:sbtRecordOffset values are passed in as
parameters to code:traceNV() calls made in the shaders.
See Section 8.19 (Ray Tracing Functions) of the OpenGL Shading Language
Specification for more details.
In SPIR-V, these correspond to the code:SBTStride and code:SBTOffset
parameters to the code:OpTraceNV instruction, respectively.
The result of this computation is then added to
pname:hitShaderBindingOffset, a base offset passed to
The complete rule to compute a hit shader binding table record address in
the pname:hitShaderBindingTableBuffer is:
:: [eq]#pname:hitShaderBindingOffset {plus} pname:hitShaderBindingStride
{times} ( code:instanceShaderBindingTableRecordOffset {plus}
_geometryIndex_ {times} code:sbtRecordStride {plus}
code:sbtRecordOffset )#
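The rule above can be transcribed into host code, for example to check a
shader binding table layout while debugging.
The following is a sketch only; the function and parameter names are
hypothetical and simply mirror the terms of the rule.

```c
#include <stdint.h>

/* Byte position of a hit record within hitShaderBindingTableBuffer:
 *   hitShaderBindingOffset + hitShaderBindingStride *
 *   (instanceShaderBindingTableRecordOffset
 *    + geometryIndex * sbtRecordStride + sbtRecordOffset)          */
static uint64_t hit_record_offset(uint64_t hit_binding_offset,
                                  uint64_t hit_binding_stride,
                                  uint32_t instance_record_offset,
                                  uint32_t geometry_index,
                                  uint32_t sbt_record_stride,
                                  uint32_t sbt_record_offset)
{
    uint64_t record_index = (uint64_t)instance_record_offset
                          + (uint64_t)geometry_index * sbt_record_stride
                          + sbt_record_offset;

    return hit_binding_offset + hit_binding_stride * record_index;
}
```

For instance, with a base offset of 64, a stride of 32, an instance record
offset of 4, _geometryIndex_ 2, code:sbtRecordStride 2 and
code:sbtRecordOffset 1, the record index is 4 + 2 × 2 + 1 = 9 and the record
starts at byte 64 + 32 × 9 = 352.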
==== Miss Shaders
A miss shader is executed whenever a ray query fails to find an intersection
for the given scene geometry.
Multiple miss shaders may: be executed throughout a ray tracing dispatch.
The base for the computation of miss shader locations is
pname:missShaderBindingOffset, a base offset passed into
The code:missIndex value is passed in as a parameter to code:traceNV() calls
made in the shaders.
See Section 8.19 (Ray Tracing Functions) of the OpenGL Shading Language
Specification for more details.
In SPIR-V, this corresponds to the code:MissIndex parameter to the
code:OpTraceNV instruction.
The complete rule to compute a miss shader binding table record address in
the pname:missShaderBindingTableBuffer is:
:: [eq]#pname:missShaderBindingOffset {plus} pname:missShaderBindingStride
{times} code:missIndex#
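Expressed as host code, the miss rule is a single multiply-add.
The sketch below uses hypothetical names that mirror the terms of the rule.

```c
#include <stdint.h>

/* Byte position of a miss record within missShaderBindingTableBuffer:
 *   missShaderBindingOffset + missShaderBindingStride * missIndex   */
static uint64_t miss_record_offset(uint64_t miss_binding_offset,
                                   uint64_t miss_binding_stride,
                                   uint32_t miss_index)
{
    return miss_binding_offset + miss_binding_stride * miss_index;
}
```

With a base offset of 128 and a stride of 32, code:missIndex 3 selects the
record starting at byte 224.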
==== Callable Shaders
A callable shader is executed when requested by a ray tracing shader.
Multiple callable shaders may: be executed throughout a ray tracing
dispatch.
The base for the computation of callable shader locations is
pname:callableShaderBindingOffset, a base offset passed into
The code:sbtRecordIndex value is passed in as a parameter to
code:executeCallableNV() calls made in the shaders.
See Section 8.19 (Ray Tracing Functions) of the OpenGL Shading Language
Specification for more details.
In SPIR-V, this corresponds to the code:SBTIndex parameter to the
code:OpExecuteCallableNV instruction.
The complete rule to compute a callable shader binding table record address
in the pname:callableShaderBindingTableBuffer is:
:: [eq]#pname:callableShaderBindingOffset {plus}
pname:callableShaderBindingStride {times} code:sbtRecordIndex#
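The callable rule can be combined with a range check when validating a table
on the host.
The sketch below is illustrative: `struct sbt_region` and
`callable_record_offset` are hypothetical names, and the check that a
record's handle lies inside the buffer is a debugging aid added here, not a
rule stated in this section.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one shader binding table region. */
struct sbt_region {
    uint64_t buffer_size;  /* size in bytes of the backing buffer    */
    uint64_t offset;       /* callableShaderBindingOffset            */
    uint64_t stride;       /* callableShaderBindingStride            */
};

/* Computes offset + stride * sbtRecordIndex and stores it in *out.
 * Returns false if a handle of handle_size bytes starting at that
 * position would extend past the end of the buffer. */
static bool callable_record_offset(const struct sbt_region *region,
                                   uint32_t sbt_record_index,
                                   uint32_t handle_size,
                                   uint64_t *out)
{
    uint64_t pos = region->offset + region->stride * sbt_record_index;

    if (pos > region->buffer_size || region->buffer_size - pos < handle_size)
        return false;
    *out = pos;
    return true;
}
```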
== Acceleration Structures
_Acceleration structures_ are data structures used by the implementation to
efficiently manage the scene geometry as it is traversed during a ray
tracing query.
The application is responsible for managing acceleration structure objects
(see <<resources-acceleration-structures,Acceleration Structures>>),
including allocation, destruction, executing builds or updates, and
synchronizing resources used during ray tracing queries.
There are two types of acceleration structures, _top level acceleration
structures_ and _bottom level acceleration structures_.
image::{images}/accelstruct.svg[align="center",title="Acceleration Structure",opts="{imageopts}"]
The diagram shows the relationship between top and bottom level acceleration
structures.
=== Instances
_Instances_ are found in top level acceleration structures and contain data
that refer to a single bottom-level acceleration structure, a transform
matrix, and shading information.
Multiple instances can: point to a single bottom level acceleration
structure.
An instance is defined in a slink:VkBuffer by a structure consisting of 64
bytes of data.
* pname:transform is 12 floats representing a transform matrix of 3 rows
and 4 columns in row-major order.
* pname:instanceCustomIndex The low 24 bits of a 32-bit integer after the
This value appears in the builtin code:gl_InstanceCustomIndexNV
* pname:mask The high 8 bits of the same integer as
This is the visibility mask.
The instance may: only be hit if `(rayMask & instance.mask) != 0`.
* pname:instanceOffset The low 24 bits of the next 32-bit integer.
The value contributed by this instance to the hit shader binding table
index computation as code:instanceShaderBindingTableRecordOffset.
* pname:flags The high 8 bits of the same integer as pname:instanceOffset.
elink:VkGeometryInstanceFlagBitsNV values that apply to this instance.
* pname:accelerationStructureHandle
The 8 byte value returned by flink:vkGetAccelerationStructureHandleNV
for the bottom level acceleration structure referred to by this
instance.
The C language spec does not define the ordering of bit-fields, but in
practice, this struct produces the layout described above:
struct VkGeometryInstanceNV {
    float          transform[12];
    uint32_t       instanceCustomIndex : 24;
    uint32_t       mask : 8;
    uint32_t       instanceOffset : 24;
    uint32_t       flags : 8;
    uint64_t       accelerationStructureHandle;
};
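Because bit-field ordering is not defined by the C language, an application
that wants a layout independent of the compiler could pack the two 32-bit
words with explicit shifts instead.
The following is a sketch under that assumption; `struct packed_instance`,
`pack_24_8` and `instance_visible` are hypothetical names and are not part
of the API.

```c
#include <assert.h>
#include <stdint.h>

/* Bit-field-free mirror of the 64-byte instance layout described above. */
struct packed_instance {
    float    transform[12];            /* 48 bytes                          */
    uint32_t custom_index_and_mask;    /* low 24: instanceCustomIndex,
                                          high 8: mask                      */
    uint32_t offset_and_flags;         /* low 24: instanceOffset,
                                          high 8: flags                     */
    uint64_t acceleration_structure_handle;
};

_Static_assert(sizeof(struct packed_instance) == 64,
               "instance data is expected to occupy 64 bytes");

/* Combine a 24-bit value (low bits) and an 8-bit value (high bits). */
static uint32_t pack_24_8(uint32_t low24, uint32_t high8)
{
    assert(low24 <= 0xFFFFFFu && high8 <= 0xFFu);
    return low24 | (high8 << 24);
}

/* Visibility test: the instance can only be hit if the ray mask and the
 * instance mask share at least one set bit. */
static int instance_visible(uint32_t ray_mask,
                            const struct packed_instance *instance)
{
    uint32_t mask = instance->custom_index_and_mask >> 24;

    return (ray_mask & mask) != 0;
}
```

The shifts operate on integer values, so the packing does not depend on how
a particular compiler orders bit-fields.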
[open,refpage='VkGeometryInstanceFlagBitsNV',desc='Instance flag bits',type='enums']
Possible values of pname:flags in the instance modifying the behavior of
that instance are:
culling for this instance.
indicates that the front face of the triangle for culling purposes is
the face that is counter clockwise in object space relative to the ray
Because the facing is determined in object space, an instance transform
matrix does not change the winding, but a geometry transform does.
* ename:VK_GEOMETRY_INSTANCE_FORCE_OPAQUE_BIT_NV causes this instance to
act as though ename:VK_GEOMETRY_OPAQUE_BIT_NV were specified on all
geometries referenced by this instance.
This behavior can: be overridden by the ray flag
* ename:VK_GEOMETRY_INSTANCE_FORCE_NO_OPAQUE_BIT_NV causes this instance
to act as though ename:VK_GEOMETRY_OPAQUE_BIT_NV were not specified on
all geometries referenced by this instance.
This behavior can: be overridden by the ray flag
ename:VK_GEOMETRY_INSTANCE_FORCE_OPAQUE_BIT_NV must: not be used in the same
[open,refpage='VkGeometryInstanceFlagsNV',desc='Bitmask of VkGeometryInstanceFlagBitsNV',type='flags']
tname:VkGeometryInstanceFlagsNV is a bitmask type for setting a mask of zero
or more elink:VkGeometryInstanceFlagBitsNV.
=== Geometry
_Geometries_ refer to a triangle or axis-aligned bounding box.
=== Top Level Acceleration Structures
A _top level acceleration structure_ is an opaque acceleration structure
for an array of instances.
The descriptor referencing it is the starting point for tracing.
=== Bottom Level Acceleration Structures
A _bottom level acceleration structure_ is an opaque acceleration structure
for an array of geometries.
=== Building Acceleration Structures
[open,refpage='vkCmdBuildAccelerationStructureNV',desc='Build an acceleration structure',type='protos']
To build an acceleration structure call:
* pname:commandBuffer is the command buffer into which the command will be
recorded.
* pname:pInfo contains the shared information for the acceleration
structure's structure.
* pname:instanceData is the buffer containing instance data that will be
used to build the acceleration structure as described in
<<acceleration-structure-instance, Acceleration structure instances>>.
This parameter must: be dlink:VK_NULL_HANDLE for bottom level acceleration
structures.
* pname:instanceOffset is the offset in bytes (relative to the start of
pname:instanceData) at which the instance data is located.
* pname:update specifies whether to update the pname:dst acceleration
structure with the data in pname:src.
* pname:dst points to the target acceleration structure for the build.
* pname:src points to an existing acceleration structure that is to be
used to update the pname:dst acceleration structure.
* pname:scratch is the slink:VkBuffer that will be used as scratch memory
for the build.
* pname:scratchOffset is the offset in bytes relative to the start of
pname:scratch that will be used as scratch memory.
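Since scratch memory begins pname:scratchOffset bytes into pname:scratch,
the space available to a build is the buffer size minus that offset.
A host-side check could look like the sketch below; `scratch_fits` is a
hypothetical helper, and `required_size` is assumed to be a scratch size the
application obtained earlier from the memory requirements query.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if required_size bytes fit in the scratch buffer when the
 * scratch region starts scratch_offset bytes from the buffer's start. */
static bool scratch_fits(uint64_t required_size,
                         uint64_t scratch_buffer_size,
                         uint64_t scratch_offset)
{
    if (scratch_offset > scratch_buffer_size)
        return false;  /* offset lies past the end of the buffer */
    return required_size <= scratch_buffer_size - scratch_offset;
}
```

Comparing against the difference, rather than adding the offset to the
required size, avoids unsigned overflow for very large values.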
.Valid Usage
* [[VUID-vkCmdBuildAccelerationStructureNV-geometryCount-02241]]
pname:geometryCount must: be less than or equal to
* [[VUID-vkCmdBuildAccelerationStructureNV-dst-02488]]
pname:dst must: have been created with compatible
slink:VkAccelerationStructureInfoNV where
slink:VkAccelerationStructureInfoNV::pname:type and
slink:VkAccelerationStructureInfoNV::pname:flags are identical,
slink:VkAccelerationStructureInfoNV::pname:instanceCount and
slink:VkAccelerationStructureInfoNV::pname:geometryCount for pname:dst
are greater than or equal to the build size and each geometry in
slink:VkAccelerationStructureInfoNV::pname:pGeometries for pname:dst has
greater than or equal to the number of vertices, indices, and AABBs.
* [[VUID-vkCmdBuildAccelerationStructureNV-update-02489]]
If pname:update is ename:VK_TRUE, pname:src must: not be
* [[VUID-vkCmdBuildAccelerationStructureNV-update-02490]]
If pname:update is ename:VK_TRUE, pname:src must: have been built before
* [[VUID-vkCmdBuildAccelerationStructureNV-update-02491]]
If pname:update is ename:VK_FALSE, the pname:size member of the
slink:VkMemoryRequirements structure returned from a call to
flink:vkGetAccelerationStructureMemoryRequirementsNV with
set to pname:dst and
slink:VkAccelerationStructureMemoryRequirementsInfoNV::pname:type set to
must: be less than or equal to the size of pname:scratch minus
* [[VUID-vkCmdBuildAccelerationStructureNV-update-02492]]
If pname:update is ename:VK_TRUE, the pname:size member of the
slink:VkMemoryRequirements structure returned from a call to
flink:vkGetAccelerationStructureMemoryRequirementsNV with
set to pname:dst and
slink:VkAccelerationStructureMemoryRequirementsInfoNV::pname:type set to
must: be less than or equal to the size of pname:scratch minus
=== Copying Acceleration Structures
An additional command exists for copying acceleration structures without
updating their contents.
The acceleration structure object can: be compacted in order to improve
Before copying, an application must: query the size of the resulting
acceleration structure.
[open,refpage='vkCmdWriteAccelerationStructuresPropertiesNV',desc='Write acceleration structure result parameters to query results.',type='protos']
To query acceleration structure size parameters call:
* pname:commandBuffer is the command buffer into which the command will be
recorded.
* pname:accelerationStructureCount is the count of acceleration structures
for which to query the property.
* pname:pAccelerationStructures points to an array of existing previously
built acceleration structures.
* pname:queryType is a elink:VkQueryType value specifying the type of
queries managed by the pool.
* pname:queryPool is the query pool that will manage the results of the
query.
* pname:firstQuery is the first query index within the query pool that
will contain the pname:accelerationStructureCount number of results.
.Valid Usage
* [[VUID-vkCmdWriteAccelerationStructuresPropertiesNV-queryType-02242]]
pname:queryType must: be
* [[VUID-vkCmdWriteAccelerationStructuresPropertiesNV-queryPool-02493]]
pname:queryPool must: have been created with a pname:queryType matching
* [[VUID-vkCmdWriteAccelerationStructuresPropertiesNV-queryPool-02494]]
The queries identified by pname:queryPool and pname:firstQuery must: be
* [[VUID-vkCmdWriteAccelerationStructuresPropertiesNV-accelerationStructures-02495]]
All acceleration structures in pname:pAccelerationStructures must: have
been built with
pname:queryType is
[open,refpage='vkCmdCopyAccelerationStructureNV',desc='Copy an acceleration structure',type='protos']
To copy an acceleration structure call:
* pname:commandBuffer is the command buffer into which the command will be
recorded.
* pname:dst points to the target acceleration structure for the copy.
* pname:src points to the source acceleration structure for the copy.
* pname:mode is a elink:VkCopyAccelerationStructureModeNV value that
specifies additional operations to perform during the copy.
.Valid Usage
* [[VUID-vkCmdCopyAccelerationStructureNV-mode-02496]]
* [[VUID-vkCmdCopyAccelerationStructureNV-src-02497]]
pname:src must: have been built with
[open,refpage='VkCopyAccelerationStructureModeNV',desc='Acceleration structure copy mode',type='enums']
Possible values of flink:vkCmdCopyAccelerationStructureNV::pname:mode,
specifying additional operations to perform during the copy, are:
of the acceleration structure specified in pname:src into the one
specified by pname:dst.
The pname:dst acceleration structure must: have been created with the
same parameters as pname:src.
compact version of an acceleration structure pname:src into pname:dst.
The acceleration structure pname:dst must: have been created with a
pname:compactedSize corresponding to the one returned by
flink:vkCmdWriteAccelerationStructuresPropertiesNV after the build of
the acceleration structure specified by pname:src.