Vulkan-Docs/doc/specs/vulkan/appendices/VK_EXT_shader_subgroup_vote.txt
Jon Leech df88ded281 Change log for September 5, 2017 Vulkan 1.0.60 spec update:
* Bump API patch number and header version number to 60 for this update.

Github Issues:

  * Document that <<queries-timestamps, Timestamp Queries>> can only be
    meaningfully compared when they are written from the same queue (public
    issue 216).
  * Document that the `<extension>` tag `type` attribute is required for
    non-disabled extensions (derived from, but does not close public issue
    354).
  * Clean up registry schema length attribute descriptions to be consistent
    and correct (public issue 555).

Internal Issues:

  * Replace as much of the hand-written extension appendix metadata as
    possible with asciidoc includes generated from corresponding attributes
    of +vk.xml+, and enhance the style guide to match. This avoids
    inconsistencies between +vk.xml+ and the appendices, and produces a more
    uniform style (internal issue 137).
  * Remove the generated extDependency.{py,sh} files from the tree and
    create them dynamically on demand instead, reducing merge conflicts
    (internal issue 713).
  * Add a prototype tool for generating in-place difference markup for
    sections guarded by asciidoc conditionals, and new syntax for open
    blocks to support it (internal issue 833).
  * Remove unnecessary restriction of etext:*SYNC_FD_BIT_KHR external handle
    types to the same physical device in the
    slink:VkPhysicalDeviceIDPropertiesKHR,
    flink:VkImportMemoryWin32HandleInfoKHR,
    slink:VkImportFenceWin32HandleInfoKHR, slink:VkImportFenceFdInfoKHR,
    slink:VkImportSemaphoreWin32HandleInfoKHR,
    slink:VkImportSemaphoreFdInfoKHR
    <<external-memory-handle-types-compatibility, External memory handle
    types compatibility>>, <<external-semaphore-handle-types-compatibility,
    External semaphore handle types compatibility>>, and
    <<external-fence-handle-types-compatibility, External fence handle types
    compatibility>> sections (internal issue 956).

Other Issues:

  * Remove dependency of +VK_KHX_device_group+ on +VK_KHR_swapchain+ (there
    is an interaction, but not a strict dependency), and add a new
    `extension` attribute to the `<require` XML tag to allow classifying a
    subset of interfaces of an extension as requiring another extension.
    Update the registry schema and documentation accordingly.

New Extensions:

  * `VK_AMD_shader_fragment_mask` (and related `GL_AMD_shader_fragment_mask`
    GLSL extension)
  * `VK_EXT_sample_locations`
  * `VK_EXT_validation_cache`
2017-09-04 03:06:55 -07:00

135 lines
4.1 KiB
Plaintext

include::meta/VK_EXT_shader_subgroup_vote.txt[]
*Status*::
Draft
*Last Modified Date*::
2016-11-28
*IP Status*::
No known IP claims.
*Interactions and External Dependencies*::
- This extension requires the
https://www.khronos.org/registry/spir-v/extensions/KHR/SPV_KHR_subgroup_vote.html[+SPV_KHR_subgroup_vote+]
SPIR-V extension.
- This extension requires the
https://www.khronos.org/registry/OpenGL/extensions/ARB/ARB_shader_group_vote.txt[+GL_ARB_shader_group_vote+]
extension for GLSL source languages.
*Contributors*::
- Neil Henning, Codeplay
- Daniel Koch, NVIDIA Corporation
This extension adds support for the following SPIR-V extension in Vulkan:
* +SPV_KHR_subgroup_vote+
This extension provides new SPIR-V instructions:
* code:OpSubgroupAllKHR,
* code:OpSubgroupAnyKHR, and
* code:OpSubgroupAllEqualKHR.
to compute the composite of a set of boolean conditions across a group of
shader invocations that are running concurrently (a _subgroup_).
These composite results may be used to execute shaders more efficiently on a
slink:VkPhysicalDevice.
When using GLSL source-based shader languages, the following shader
functions from GL_ARB_shader_group_vote can map to these SPIR-V
instructions:
* code:anyInvocationARB() -> code:OpSubgroupAnyKHR,
* code:allInvocationsARB() -> code:OpSubgroupAllKHR, and
* code:allInvocationsEqualARB() -> code:OpSubgroupAllEqualKHR.
The subgroup across which the boolean conditions are evaluated is
implementation-dependent, and this extension provides no guarantee over how
individual shader invocations are assigned to subgroups.
In particular, a subgroup has no necessary relationship with the compute
shader _local workgroup_ -- any pair of shader invocations in a compute
local workgroup may execute in different subgroups as used by these
instructions.
Compute shaders operate on an explicitly specified group of threads (a local
workgroup), but many implementations will also group non-compute shader
invocations and execute them concurrently.
When executing code like
[source,c++]
----------------------------------------
if (condition) {
result = do_fast_path();
} else {
result = do_general_path();
}
----------------------------------------
where code:condition diverges between invocations, an implementation might
first execute code:do_fast_path() for the invocations where code:condition
is true and leave the other invocations dormant.
Once code:do_fast_path() returns, it might call code:do_general_path() for
invocations where code:condition is false and leave the other invocations
dormant.
In this case, the shader executes *both* the fast and the general path and
might be better off just using the general path for all invocations.
This extension provides the ability to avoid divergent execution by
evaluating a condition across an entire subgroup using code like:
[source,c++]
----------------------------------------
if (allInvocationsARB(condition)) {
result = do_fast_path();
} else {
result = do_general_path();
}
----------------------------------------
The built-in function code:allInvocationsARB() will return the same value
for all invocations in the group, so the group will either execute
code:do_fast_path() or code:do_general_path(), but never both.
For example, shader code might want to evaluate a complex function
iteratively by starting with an approximation of the result and then
refining the approximation.
Some input values may require a small number of iterations to generate an
accurate result (code:do_fast_path) while others require a larger number
(code:do_general_path).
In another example, shader code might want to evaluate a complex function
(code:do_general_path) that can be greatly simplified when assuming a
specific value for one of its inputs (code:do_fast_path).
=== New Object Types
None.
=== New Enum Constants
None.
=== New Enums
None.
=== New Structures
None.
=== New Functions
None.
=== New Built-In Variables
None.
=== New SPIR-V Capabilities
* <<spirvenv-capabilities-table-subgroupvote,SubgroupVoteKHR>>
=== Issues
None.
=== Version History
* Revision 1, 2016-11-28 (Daniel Koch)
- Initial draft