675 lines
29 KiB
Plaintext
675 lines
29 KiB
Plaintext
// Copyright (c) 2015-2017 Khronos Group. This work is licensed under a
|
|
// Creative Commons Attribution 4.0 International License; see
|
|
// http://creativecommons.org/licenses/by/4.0/
|
|
|
|
[[shaders]]
|
|
= Shaders
|
|
|
|
A shader specifies programmable operations that execute for each vertex,
|
|
control point, tessellated vertex, primitive, fragment, or workgroup in the
|
|
corresponding stage(s) of the graphics and compute pipelines.
|
|
|
|
Graphics pipelines include vertex shader execution as a result of
|
|
<<drawing,primitive assembly>>, followed, if enabled, by tessellation
|
|
control and evaluation shaders operating on
|
|
<<drawing-primitive-topologies-patches,patches>>, geometry shaders, if
|
|
enabled, operating on primitives, and fragment shaders, if present,
|
|
operating on fragments generated by <<primsrast,Rasterization>>.
|
|
In this specification, vertex, tessellation control, tessellation evaluation
|
|
and geometry shaders are collectively referred to as vertex processing
|
|
stages and occur in the logical pipeline before rasterization.
|
|
The fragment shader occurs logically after rasterization.
|
|
|
|
Only the compute shader stage is included in a compute pipeline.
|
|
Compute shaders operate on compute invocations in a workgroup.
|
|
|
|
Shaders can: read from input variables, and read from and write to output
|
|
variables.
|
|
Input and output variables can: be used to transfer data between shader
|
|
stages, or to allow the shader to interact with values that exist in the
|
|
execution environment.
|
|
Similarly, the execution environment provides constants that describe
|
|
capabilities.
|
|
|
|
Shader variables are associated with execution environment-provided inputs
|
|
and outputs using _built-in_ decorations in the shader.
|
|
The available decorations for each stage are documented in the following
|
|
subsections.
|
|
|
|
|
|
[[shader-modules]]
|
|
== Shader Modules
|
|
|
|
[open,refpage='VkShaderModule',desc='Opaque handle to a shader module object',type='handles']
|
|
--
|
|
|
|
_Shader modules_ contain _shader code_ and one or more entry points.
|
|
Shaders are selected from a shader module by specifying an entry point as
|
|
part of <<pipelines,pipeline>> creation.
|
|
The stages of a pipeline can: use shaders that come from different modules.
|
|
The shader code defining a shader module must: be in the SPIR-V format, as
|
|
described by the <<spirvenv,Vulkan Environment for SPIR-V>> appendix.
|
|
|
|
Shader modules are represented by sname:VkShaderModule handles:
|
|
|
|
include::../api/handles/VkShaderModule.txt[]
|
|
|
|
--
|
|
|
|
[open,refpage='vkCreateShaderModule',desc='Creates a new shader module object',type='protos']
|
|
--
|
|
|
|
To create a shader module, call:
|
|
|
|
include::../api/protos/vkCreateShaderModule.txt[]
|
|
|
|
* pname:device is the logical device that creates the shader module.
|
|
* pname:pCreateInfo parameter is a pointer to an instance of the
|
|
sname:VkShaderModuleCreateInfo structure.
|
|
* pname:pAllocator controls host memory allocation as described in the
|
|
<<memory-allocation, Memory Allocation>> chapter.
|
|
* pname:pShaderModule points to a sname:VkShaderModule handle in which the
|
|
resulting shader module object is returned.
|
|
|
|
Once a shader module has been created, any entry points it contains can: be
|
|
used in pipeline shader stages as described in <<pipelines-compute,Compute
|
|
Pipelines>> and <<pipelines-graphics,Graphics Pipelines>>.
|
|
|
|
ifdef::VK_NV_glsl_shader[]
|
|
If the shader stage fails to compile ename:VK_ERROR_INVALID_SHADER_NV will
|
|
be generated and the compile log will be reported back to the application by
|
|
+VK_EXT_debug_report+ if enabled.
|
|
endif::VK_NV_glsl_shader[]
|
|
|
|
include::../validity/protos/vkCreateShaderModule.txt[]
|
|
--
|
|
|
|
[open,refpage='VkShaderModuleCreateInfo',desc='Structure specifying parameters of a newly created shader module',type='structs']
|
|
--
|
|
|
|
The sname:VkShaderModuleCreateInfo structure is defined as:
|
|
|
|
include::../api/structs/VkShaderModuleCreateInfo.txt[]
|
|
|
|
* pname:sType is the type of this structure.
|
|
* pname:pNext is `NULL` or a pointer to an extension-specific structure.
|
|
* pname:flags is reserved for future use.
|
|
* pname:codeSize is the size, in bytes, of the code pointed to by
|
|
pname:pCode.
|
|
* pname:pCode points to code that is used to create the shader module.
|
|
The type and format of the code is determined from the content of the
|
|
memory addressed by pname:pCode.
|
|
|
|
.Valid Usage
|
|
****
|
|
* [[VUID-VkShaderModuleCreateInfo-codeSize-01085]]
|
|
pname:codeSize must: be greater than 0
|
|
ifndef::VK_NV_glsl_shader[]
|
|
* [[VUID-VkShaderModuleCreateInfo-codeSize-01086]]
|
|
pname:codeSize must: be a multiple of 4
|
|
* [[VUID-VkShaderModuleCreateInfo-pCode-01087]]
|
|
pname:pCode must: point to valid SPIR-V code, formatted and packed as
|
|
described by the <<spirv-spec,Khronos SPIR-V Specification>>
|
|
* [[VUID-VkShaderModuleCreateInfo-pCode-01088]]
|
|
pname:pCode must: adhere to the validation rules described by the
|
|
<<spirvenv-module-validation, Validation Rules within a Module>> section
|
|
of the <<spirvenv-capabilities,SPIR-V Environment>> appendix
|
|
endif::VK_NV_glsl_shader[]
|
|
ifdef::VK_NV_glsl_shader[]
|
|
* [[VUID-VkShaderModuleCreateInfo-pCode-01376]]
|
|
If pname:pCode points to SPIR-V code, pname:codeSize must: be a multiple
|
|
of 4
|
|
* [[VUID-VkShaderModuleCreateInfo-pCode-01377]]
|
|
pname:pCode must: point to either valid SPIR-V code, formatted and
|
|
packed as described by the <<spirv-spec,Khronos SPIR-V Specification>>
|
|
or valid GLSL code which must: be written to the +GL_KHR_vulkan_glsl+
|
|
extension specification
|
|
* [[VUID-VkShaderModuleCreateInfo-pCode-01378]]
|
|
If pname:pCode points to SPIR-V code, that code must: adhere to the
|
|
validation rules described by the <<spirvenv-module-validation,
|
|
Validation Rules within a Module>> section of the
|
|
<<spirvenv-capabilities,SPIR-V Environment>> appendix
|
|
* [[VUID-VkShaderModuleCreateInfo-pCode-01379]]
|
|
If pname:pCode points to GLSL code, it must: be valid GLSL code written
|
|
to the +GL_KHR_vulkan_glsl+ GLSL extension specification
|
|
endif::VK_NV_glsl_shader[]
|
|
* [[VUID-VkShaderModuleCreateInfo-pCode-01089]]
|
|
pname:pCode must: declare the code:Shader capability for SPIR-V code
|
|
* [[VUID-VkShaderModuleCreateInfo-pCode-01090]]
|
|
pname:pCode must: not declare any capability that is not supported by
|
|
the API, as described by the <<spirvenv-module-validation,
|
|
Capabilities>> section of the <<spirvenv-capabilities,SPIR-V
|
|
Environment>> appendix
|
|
* [[VUID-VkShaderModuleCreateInfo-pCode-01091]]
|
|
If pname:pCode declares any of the capabilities that are listed as not
|
|
required by the implementation, the relevant feature must: be enabled,
|
|
as listed in the <<spirvenv-capabilities-table,SPIR-V Environment>>
|
|
appendix
|
|
****
|
|
|
|
include::../validity/structs/VkShaderModuleCreateInfo.txt[]
|
|
--
|
|
|
|
[open,refpage='vkDestroyShaderModule',desc='Destroy a shader module module',type='protos']
|
|
--
|
|
|
|
To destroy a shader module, call:
|
|
|
|
include::../api/protos/vkDestroyShaderModule.txt[]
|
|
|
|
* pname:device is the logical device that destroys the shader module.
|
|
* pname:shaderModule is the handle of the shader module to destroy.
|
|
* pname:pAllocator controls host memory allocation as described in the
|
|
<<memory-allocation, Memory Allocation>> chapter.
|
|
|
|
A shader module can: be destroyed while pipelines created using its shaders
|
|
are still in use.
|
|
|
|
.Valid Usage
|
|
****
|
|
* [[VUID-vkDestroyShaderModule-shaderModule-01092]]
|
|
If sname:VkAllocationCallbacks were provided when pname:shaderModule was
|
|
created, a compatible set of callbacks must: be provided here
|
|
* [[VUID-vkDestroyShaderModule-shaderModule-01093]]
|
|
If no sname:VkAllocationCallbacks were provided when pname:shaderModule
|
|
was created, pname:pAllocator must: be `NULL`
|
|
****
|
|
|
|
include::../validity/protos/vkDestroyShaderModule.txt[]
|
|
--
|
|
|
|
|
|
[[shaders-execution]]
|
|
== Shader Execution
|
|
|
|
At each stage of the pipeline, multiple invocations of a shader may: execute
|
|
simultaneously.
|
|
Further, invocations of a single shader produced as the result of different
|
|
commands may: execute simultaneously.
|
|
The relative execution order of invocations of the same shader type is
|
|
undefined.
|
|
Shader invocations may: complete in a different order than that in which the
|
|
primitives they originated from were drawn or dispatched by the application.
|
|
However, fragment shader outputs are written to attachments in
|
|
<<primrast-order,rasterization order>>.
|
|
|
|
The relative order of invocations of different shader types is largely
|
|
undefined.
|
|
However, when invoking a shader whose inputs are generated from a previous
|
|
pipeline stage, the shader invocations from the previous stage are
|
|
guaranteed to have executed far enough to generate input values for all
|
|
required inputs.
|
|
|
|
|
|
[[shaders-execution-memory-ordering]]
|
|
== Shader Memory Access Ordering
|
|
|
|
The order in which image or buffer memory is read or written by shaders is
|
|
largely undefined.
|
|
For some shader types (vertex, tessellation evaluation, and in some cases,
|
|
fragment), even the number of shader invocations that may: perform loads and
|
|
stores is undefined.
|
|
|
|
In particular, the following rules apply:
|
|
|
|
* <<shaders-vertex-execution,Vertex>> and
|
|
<<shaders-tessellation-evaluation-execution,tessellation evaluation>>
|
|
shaders will be invoked at least once for each unique vertex, as defined
|
|
in those sections.
|
|
* <<shaders-fragment-execution,Fragment>> shaders will be invoked zero or
|
|
more times, as defined in that section.
|
|
* The relative order of invocations of the same shader type are undefined.
|
|
A store issued by a shader when working on primitive B might complete
|
|
prior to a store for primitive A, even if primitive A is specified prior
|
|
to primitive B.
|
|
This applies even to fragment shaders; while fragment shader outputs are
|
|
always written to the framebuffer in <<primrast-order, rasterization
|
|
order>>, stores executed by fragment shader invocations are not.
|
|
* The relative order of invocations of different shader types is largely
|
|
undefined.
|
|
|
|
[NOTE]
|
|
.Note
|
|
====
|
|
The above limitations on shader invocation order make some forms of
|
|
synchronization between shader invocations within a single set of primitives
|
|
unimplementable.
|
|
For example, having one invocation poll memory written by another invocation
|
|
assumes that the other invocation has been launched and will complete its
|
|
writes in finite time.
|
|
====
|
|
|
|
Stores issued to different memory locations within a single shader
|
|
invocation may: not be visible to other invocations, or may: not become
|
|
visible in the order they were performed.
|
|
|
|
The code:OpMemoryBarrier instruction can: be used to provide stronger
|
|
ordering of reads and writes performed by a single invocation.
|
|
code:OpMemoryBarrier guarantees that any memory transactions issued by the
|
|
shader invocation prior to the instruction complete prior to the memory
|
|
transactions issued after the instruction.
|
|
Memory barriers are needed for algorithms that require multiple invocations
|
|
to access the same memory and require the operations to be performed in a
|
|
partially-defined relative order.
|
|
For example, if one shader invocation does a series of writes, followed by
|
|
an code:OpMemoryBarrier instruction, followed by another write, then the
|
|
results of the series of writes before the barrier become visible to other
|
|
shader invocations at a time earlier or equal to when the results of the
|
|
final write become visible to those invocations.
|
|
In practice it means that another invocation that sees the results of the
|
|
final write would also see the previous writes.
|
|
Without the memory barrier, the final write may: be visible before the
|
|
previous writes.
|
|
|
|
Writes that are the result of shader stores through a variable decorated
|
|
with code:Coherent automatically have available writes to the same buffer,
|
|
buffer view, or image view made visible to them, and are themselves
|
|
automatically made available to access by the same buffer, buffer view, or
|
|
image view.
|
|
Reads that are the result of shader loads through a variable decorated with
|
|
code:Coherent automatically have available writes to the same buffer, buffer
|
|
view, or image view made visible to them.
|
|
The order that coherent writes to different locations become available is
|
|
undefined, unless enforced by a memory barrier instruction or other memory
|
|
dependency.
|
|
|
|
.Note
|
|
[NOTE]
|
|
====
|
|
Explicit memory dependencies must: still be used to guarantee availability
|
|
and visibility for access via other buffers, buffer views, or image views.
|
|
====
|
|
|
|
The built-in atomic memory transaction instructions can: be used to read and
|
|
write a given memory address atomically.
|
|
While built-in atomic functions issued by multiple shader invocations are
|
|
executed in undefined order relative to each other, these functions perform
|
|
both a read and a write of a memory address and guarantee that no other
|
|
memory transaction will write to the underlying memory between the read and
|
|
write.
|
|
Atomic operations ensure automatic availability and visibility for writes
|
|
and reads in the same way as those to code:Coherent variables.
|
|
|
|
.Note
|
|
[[NOTE]]
|
|
====
|
|
Memory accesses performed on different resource descriptors with the same
|
|
memory backing may: not be well-defined even with the code:Coherent
|
|
decoration or via atomics, due to things such as image layouts or ownership
|
|
of the resource - as described in the <<synchronization, Synchronization and
|
|
Cache Control>> chapter.
|
|
====
|
|
|
|
[NOTE]
|
|
.Note
|
|
====
|
|
Atomics allow shaders to use shared global addresses for mutual exclusion or
|
|
as counters, among other uses.
|
|
====
|
|
|
|
|
|
[[shaders-inputs]]
|
|
== Shader Inputs and Outputs
|
|
|
|
Data is passed into and out of shaders using variables with input or output
|
|
storage class, respectively.
|
|
User-defined inputs and outputs are connected between stages by matching
|
|
their code:Location decorations.
|
|
Additionally, data can: be provided by or communicated to special functions
|
|
provided by the execution environment using code:BuiltIn decorations.
|
|
|
|
In many cases, the same code:BuiltIn decoration can: be used in multiple
|
|
shader stages with similar meaning.
|
|
The specific behavior of variables decorated as code:BuiltIn is documented
|
|
in the following sections.
|
|
|
|
|
|
[[shaders-vertex]]
|
|
== Vertex Shaders
|
|
|
|
Each vertex shader invocation operates on one vertex and its associated
|
|
<<fxvertex-attrib,vertex attribute>> data, and outputs one vertex and
|
|
associated data.
|
|
Graphics pipelines must: include a vertex shader, and the vertex shader
|
|
stage is always the first shader stage in the graphics pipeline.
|
|
|
|
|
|
[[shaders-vertex-execution]]
|
|
=== Vertex Shader Execution
|
|
|
|
A vertex shader must: be executed at least once for each vertex specified by
|
|
a draw command.
|
|
ifdef::VK_KHX_multiview[]
|
|
If the subpass includes multiple views in its view mask, the shader may: be
|
|
invoked separately for each view.
|
|
endif::VK_KHX_multiview[]
|
|
During execution, the shader is presented with the index of the vertex and
|
|
instance for which it has been invoked.
|
|
Input variables declared in the vertex shader are filled by the
|
|
implementation with the values of vertex attributes associated with the
|
|
invocation being executed.
|
|
|
|
If the same vertex is specified multiple times in a draw command (e.g. by
|
|
including the same index value multiple times in an index buffer) the
|
|
implementation may: reuse the results of vertex shading if it can statically
|
|
determine that the vertex shader invocations will produce identical results.
|
|
|
|
[NOTE]
|
|
.Note
|
|
==================
|
|
It is implementation-dependent when and if results of vertex shading are
|
|
reused, and thus how many times the vertex shader will be executed.
|
|
This is true also if the vertex shader contains stores or atomic operations
|
|
(see <<features-features-vertexPipelineStoresAndAtomics,
|
|
pname:vertexPipelineStoresAndAtomics>>).
|
|
==================
|
|
|
|
|
|
|
|
[[shaders-tessellation-control]]
|
|
== Tessellation Control Shaders
|
|
|
|
The tessellation control shader is used to read an input patch provided by
|
|
the application and to produce an output patch.
|
|
Each tessellation control shader invocation operates on an input patch
|
|
(after all control points in the patch are processed by a vertex shader) and
|
|
its associated data, and outputs a single control point of the output patch
|
|
and its associated data, and can: also output additional per-patch data.
|
|
The input patch is sized according to the pname:patchControlPoints member of
|
|
slink:VkPipelineTessellationStateCreateInfo, as part of input assembly.
|
|
The size of the output patch is controlled by the code:OpExecutionMode
|
|
code:OutputVertices specified in the tessellation control or tessellation
|
|
evaluation shaders, which must: be specified in at least one of the shaders.
|
|
The size of the input and output patches must: each be greater than zero and
|
|
less than or equal to
|
|
sname:VkPhysicalDeviceLimits::pname:maxTessellationPatchSize.
|
|
|
|
|
|
[[shaders-tessellation-control-execution]]
|
|
=== Tessellation Control Shader Execution
|
|
|
|
A tessellation control shader is invoked at least once for each _output_
|
|
vertex in a patch.
|
|
ifdef::VK_KHX_multiview[]
|
|
If the subpass includes multiple views in its view mask, the shader may: be
|
|
invoked separately for each view.
|
|
endif::VK_KHX_multiview[]
|
|
|
|
Inputs to the tessellation control shader are generated by the vertex
|
|
shader.
|
|
Each invocation of the tessellation control shader can: read the attributes
|
|
of any incoming vertices and their associated data.
|
|
The invocations corresponding to a given patch execute logically in
|
|
parallel, with undefined relative execution order.
|
|
However, the code:OpControlBarrier instruction can: be used to provide
|
|
limited control of the execution order by synchronizing invocations within a
|
|
patch, effectively dividing tessellation control shader execution into a set
|
|
of phases.
|
|
Tessellation control shaders will read undefined values if one invocation
|
|
reads a per-vertex or per-patch attribute written by another invocation at
|
|
any point during the same phase, or if two invocations attempt to write
|
|
different values to the same per-patch output in a single phase.
|
|
|
|
|
|
[[shaders-tessellation-evaluation]]
|
|
== Tessellation Evaluation Shaders
|
|
|
|
The Tessellation Evaluation Shader operates on an input patch of control
|
|
points and their associated data, and a single input barycentric coordinate
|
|
indicating the invocation's relative position within the subdivided patch,
|
|
and outputs a single vertex and its associated data.
|
|
|
|
|
|
[[shaders-tessellation-evaluation-execution]]
|
|
=== Tessellation Evaluation Shader Execution
|
|
|
|
A tessellation evaluation shader is invoked at least once for each unique
|
|
vertex generated by the tessellator.
|
|
ifdef::VK_KHX_multiview[]
|
|
If the subpass includes multiple views in its view mask, the shader may: be
|
|
invoked separately for each view.
|
|
endif::VK_KHX_multiview[]
|
|
|
|
|
|
[[shaders-geometry]]
|
|
== Geometry Shaders
|
|
|
|
The geometry shader operates on a group of vertices and their associated
|
|
data assembled from a single input primitive, and emits zero or more output
|
|
primitives and the group of vertices and their associated data required for
|
|
each output primitive.
|
|
|
|
|
|
[[shaders-geometry-execution]]
|
|
=== Geometry Shader Execution
|
|
|
|
A geometry shader is invoked at least once for each primitive produced by
|
|
the tessellation stages, or at least once for each primitive generated by
|
|
<<drawing,primitive assembly>> when tessellation is not in use.
|
|
The number of geometry shader invocations per input primitive is determined
|
|
from the invocation count of the geometry shader specified by the
|
|
code:OpExecutionMode code:Invocations in the geometry shader.
|
|
If the invocation count is not specified, then a default of one invocation
|
|
is executed.
|
|
ifdef::VK_KHX_multiview[]
|
|
If the subpass includes multiple views in its view mask, the shader may: be
|
|
invoked separately for each view.
|
|
endif::VK_KHX_multiview[]
|
|
|
|
|
|
[[shaders-fragment]]
|
|
== Fragment Shaders
|
|
|
|
Fragment shaders are invoked as the result of rasterization in a graphics
|
|
pipeline.
|
|
Each fragment shader invocation operates on a single fragment and its
|
|
associated data.
|
|
With few exceptions, fragment shaders do not have access to any data
|
|
associated with other fragments and are considered to execute in isolation
|
|
of fragment shader invocations associated with other fragments.
|
|
|
|
|
|
[[shaders-fragment-execution]]
|
|
=== Fragment Shader Execution
|
|
|
|
For each fragment generated by rasterization, a fragment shader may: be
|
|
invoked.
|
|
A fragment shader must: not be invoked if the <<fragops-early,Early
|
|
Per-Fragment Tests>> cause it to have no coverage.
|
|
ifdef::VK_KHX_multiview[]
|
|
If the subpass includes multiple views in its view mask, the shader may: be
|
|
invoked separately for each view.
|
|
endif::VK_KHX_multiview[]
|
|
|
|
Furthermore, if it is determined that a fragment generated as the result of
|
|
rasterizing a first primitive will have its outputs entirely overwritten by
|
|
a fragment generated as the result of rasterizing a second primitive in the
|
|
same subpass, and the fragment shader used for the fragment has no other
|
|
side effects, then the fragment shader may: not be executed for the fragment
|
|
from the first primitive.
|
|
|
|
Relative ordering of execution of different fragment shader invocations is
|
|
not defined.
|
|
|
|
The number of fragment shader invocations produced per-pixel is determined
|
|
as follows:
|
|
|
|
* If per-sample shading is enabled, the fragment shader is invoked once
|
|
per covered sample.
|
|
* Otherwise, the fragment shader is invoked at least once per fragment but
|
|
no more than once per covered sample.
|
|
|
|
In addition to the conditions outlined above for the invocation of a
|
|
fragment shader, a fragment shader invocation may: be produced as a _helper
|
|
invocation_.
|
|
A helper invocation is a fragment shader invocation that is created solely
|
|
for the purposes of evaluating derivatives for use in non-helper fragment
|
|
shader invocations.
|
|
Stores and atomics performed by helper invocations must: not have any effect
|
|
on memory, and values returned by atomic instructions in helper invocations
|
|
are undefined.
|
|
|
|
|
|
[[shaders-fragment-earlytest]]
|
|
=== Early Fragment Tests
|
|
|
|
An explicit control is provided to allow fragment shaders to enable early
|
|
fragment tests.
|
|
If the fragment shader specifies the code:EarlyFragmentTests
|
|
code:OpExecutionMode, the per-fragment tests described in
|
|
<<fragops-early-mode,Early Fragment Test Mode>> are performed prior to
|
|
fragment shader execution.
|
|
Otherwise, they are performed after fragment shader execution.
|
|
|
|
ifdef::VK_EXT_post_depth_coverage[]
|
|
[[shaders-fragment-earlytest-postdepthcoverage]]
|
|
If the fragment shader additionally specifies the code:PostDepthCoverage
|
|
code:OpExecutionMode, the value of a variable decorated with the
|
|
<<interfaces-builtin-variables-samplemask,code:SampleMask>> built-in
|
|
reflects the coverage after the early fragment tests.
|
|
Otherwise, it reflects the coverage before the early fragment tests.
|
|
endif::VK_EXT_post_depth_coverage[]
|
|
|
|
[[shaders-compute]]
|
|
== Compute Shaders
|
|
|
|
Compute shaders are invoked via flink:vkCmdDispatch and
|
|
flink:vkCmdDispatchIndirect commands.
|
|
In general, they have access to similar resources as shader stages executing
|
|
as part of a graphics pipeline.
|
|
|
|
Compute workloads are formed from groups of work items called workgroups and
|
|
processed by the compute shader in the current compute pipeline.
|
|
A workgroup is a collection of shader invocations that execute the same
|
|
shader, potentially in parallel.
|
|
Compute shaders execute in _global workgroups_ which are divided into a
|
|
number of _local workgroups_ with a size that can: be set by assigning a
|
|
value to the code:LocalSize execution mode or via an object decorated by the
|
|
code:WorkgroupSize decoration.
|
|
An invocation within a local workgroup can: share data with other members of
|
|
the local workgroup through shared variables and issue memory and control
|
|
flow barriers to synchronize with other members of the local workgroup.
|
|
|
|
|
|
[[shaders-interpolation-decorations]]
|
|
== Interpolation Decorations
|
|
|
|
Interpolation decorations control the behavior of attribute interpolation in
|
|
the fragment shader stage.
|
|
Interpolation decorations can: be applied to code:Input storage class
|
|
variables in the fragment shader stage's interface, and control the
|
|
interpolation behavior of those variables.
|
|
|
|
Inputs that could be interpolated can: be decorated by at most one of the
|
|
following decorations:
|
|
|
|
* code:Flat: no interpolation
|
|
* code:NoPerspective: linear interpolation (for
|
|
<<line_linear_interpolation,lines>> and
|
|
<<triangle_linear_interpolation,polygons>>).
|
|
|
|
Fragment input variables decorated with neither code:Flat nor
|
|
code:NoPerspective use perspective-correct interpolation (for
|
|
<<line_perspective_interpolation,lines>> and
|
|
<<triangle_perspective_interpolation,polygons>>).
|
|
|
|
The presence of and type of interpolation is controlled by the above
|
|
interpolation decorations as well as the auxiliary decorations code:Centroid
|
|
and code:Sample.
|
|
|
|
A variable decorated with code:Flat will not be interpolated.
|
|
Instead, it will have the same value for every fragment within a triangle.
|
|
This value will come from a single <<vertexpostproc-flatshading,provoking
|
|
vertex>>.
|
|
A variable decorated with code:Flat can: also be decorated with
|
|
code:Centroid or code:Sample, which will mean the same thing as decorating
|
|
it only as code:Flat.
|
|
|
|
For fragment shader input variables decorated with neither code:Centroid nor
|
|
code:Sample, the assigned variable may: be interpolated anywhere within the
|
|
pixel and a single value may: be assigned to each sample within the pixel.
|
|
|
|
code:Centroid and code:Sample can: be used to control the location and
|
|
frequency of the sampling of the decorated fragment shader input.
|
|
If a fragment shader input is decorated with code:Centroid, a single value
|
|
may: be assigned to that variable for all samples in the pixel, but that
|
|
value must: be interpolated to a location that lies in both the pixel and in
|
|
the primitive being rendered, including any of the pixel's samples covered
|
|
by the primitive.
|
|
Because the location at which the variable is interpolated may: be different
|
|
in neighboring pixels, and derivatives may: be computed by computing
|
|
differences between neighboring pixels, derivatives of centroid-sampled
|
|
inputs may: be less accurate than those for non-centroid interpolated
|
|
variables.
|
|
ifdef::VK_EXT_post_depth_coverage[]
|
|
The <<shaders-fragment-earlytest-postdepthcoverage,code:PostDepthCoverage>>
|
|
execution mode does not affect the determination of the centroid location.
|
|
endif::VK_EXT_post_depth_coverage[]
|
|
If a fragment shader input is decorated with code:Sample, a separate value
|
|
must: be assigned to that variable for each covered sample in the pixel, and
|
|
that value must: be sampled at the location of the individual sample.
|
|
When pname:rasterizationSamples is ename:VK_SAMPLE_COUNT_1_BIT, the pixel
|
|
center must: be used for code:Centroid, code:Sample, and undecorated
|
|
attribute interpolation.
|
|
|
|
Fragment shader inputs that are signed or unsigned integers, integer
|
|
vectors, or any double-precision floating-point type must: be decorated with
|
|
code:Flat.
|
|
|
|
ifdef::VK_AMD_shader_explicit_vertex_parameter[]
|
|
When the +VK_AMD_shader_explicit_vertex_parameter+ device extension is
|
|
enabled inputs can: be also decorated with the code:CustomInterpAMD
|
|
interpolation decoration, including fragment shader inputs that are signed
|
|
or unsigned integers, integer vectors, or any double-precision
|
|
floating-point type.
|
|
Inputs decorated with code:CustomInterpAMD can: only be accessed by the
|
|
extended instruction code:InterpolateAtVertexAMD and allows accessing the
|
|
value of the input for individual vertices of the primitive.
|
|
endif::VK_AMD_shader_explicit_vertex_parameter[]
|
|
|
|
|
|
[[shaders-staticuse]]
|
|
== Static Use
|
|
|
|
A SPIR-V module declares a global object in memory using the code:OpVariable
|
|
instruction, which results in a pointer code:x to that object.
|
|
A specific entry point in a SPIR-V module is said to _statically use_ that
|
|
object if that entry point's call tree contains a function that contains a
|
|
memory instruction or image instruction with code:x as an code:id operand.
|
|
See the "`Memory Instructions`" and "`Image Instructions`" subsections of
|
|
section 3 "`Binary Form`" of the SPIR-V specification for the complete list
|
|
of SPIR-V memory instructions.
|
|
|
|
Static use is not used to control the behavior of variables with code:Input
|
|
and code:Output storage.
|
|
The effects of those variables are applied based only on whether they are
|
|
present in a shader entry point's interface.
|
|
|
|
[[shaders-invocationgroups]]
|
|
== Invocation and Derivative Groups
|
|
|
|
An _invocation group_ (see the subsection "`Control Flow`" of section 2 of
|
|
the SPIR-V specification) for a compute shader is the set of invocations in
|
|
a single local workgroup.
|
|
For graphics shaders, an invocation group is an implementation-dependent
|
|
subset of the set of shader invocations of a given shader stage which are
|
|
produced by a single drawing command.
|
|
For indirect drawing commands with pname:drawCount greater than one,
|
|
invocations from separate draws are in distinct invocation groups.
|
|
|
|
[NOTE]
|
|
.Note
|
|
====
|
|
Because the partitioning of invocations into invocation groups is
|
|
implementation-dependent and not observable, applications generally need to
|
|
assume the worst case of all invocations in a draw belonging to a single
|
|
invocation group.
|
|
====
|
|
|
|
A _derivative group_ (see the subsection "`Control Flow`" of section 2 of
|
|
the SPIR-V 1.00 Revision 4 specification) for a fragment shader is the set
|
|
of invocations generated by a single primitive (point, line, or triangle),
|
|
including any helper invocations generated by that primitive.
|
|
Derivatives are undefined for a sampled image instruction if the instruction
|
|
is in flow control that is not uniform across the derivative group.
|