// Copyright (c) 2017-2019 Khronos Group. This work is licensed under a
// Creative Commons Attribution 4.0 International License; see
// http://creativecommons.org/licenses/by/4.0/

[appendix]
[[memory-model]]
= Memory Model

[[memory-model-agent]]
== Agent

_Operation_ is a general term for any task that is executed on the system.

NOTE: An operation is by definition something that is executed, thus if an
instruction is skipped due to flow control it does not constitute an
operation.

Each operation is executed by a particular _agent_.
Possible agents include each shader invocation, each host thread, and each
fixed-function stage of the pipeline.

[[memory-model-memory-location]]
== Memory Location

A _memory location_ identifies unique storage for 8 bits of data.
Memory operations access a _set of memory locations_ consisting of one or
more memory locations at a time, e.g. an operation accessing a 32-bit
integer in memory would read/write a set of four memory locations.
Two sets of memory locations _overlap_ if the intersection of their sets of
memory locations is non-empty.
A memory operation must: not affect memory at a memory location not within
its set of memory locations.
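
As an informal illustration of the definitions above, the following Python sketch models a memory location as an integer index and an access as the set of locations it touches.
The integer addressing and the helper names `location_set` and `overlap` are assumptions made for this sketch only; they are not part of Vulkan or SPIR-V.

```python
def location_set(first_location: int, size_in_bits: int) -> frozenset:
    """Return the set of 8-bit memory locations touched by one access."""
    if size_in_bits <= 0 or size_in_bits % 8 != 0:
        raise ValueError("access size must be a positive multiple of 8 bits")
    return frozenset(range(first_location, first_location + size_in_bits // 8))


def overlap(a: frozenset, b: frozenset) -> bool:
    """Two sets of memory locations overlap if their intersection is non-empty."""
    return not a.isdisjoint(b)


# Three hypothetical 32-bit accesses starting at locations 0, 2 and 4.
access_at_0 = location_set(0, 32)
access_at_2 = location_set(2, 32)
access_at_4 = location_set(4, 32)
```

In this model the accesses at 0 and 2 share locations 2 and 3 and therefore overlap, while the accesses at 0 and 4 do not.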

Memory locations for buffers and images are explicitly allocated in
slink:VkDeviceMemory objects, and are implicitly allocated for SPIR-V
variables in each shader invocation.

[[memory-model-allocation]]
== Allocation

The values stored in newly allocated memory locations are determined by a
SPIR-V variable's initializer, if present, or else are undefined.
At the time an allocation is created there have been no
<<memory-model-memory-operation,memory operations>> to any of its memory
locations.
The initialization is not considered to be a memory operation.

NOTE: For tessellation control shader output variables, a consequence of
initialization not being considered a memory operation is that some
implementations may need to insert a barrier between the initialization of
the output variables and any reads of those variables.

[[memory-model-memory-operation]]
== Memory Operation

For an operation A and memory location M:

* [[memory-model-access-read]] A _reads_ M if and only if the data stored
  in M is an input to A.
* [[memory-model-access-write]] A _writes_ M if and only if the data
  output from A is stored to M.
* [[memory-model-access-access]] A _accesses_ M if and only if it either
  reads or writes (or both) M.

NOTE: A write whose value is the same as what was already in those memory
locations is still considered to be a write and has all the same effects.

[[memory-model-references]]
== Reference

A _reference_ is an object that a particular agent can: use to access a set
of memory locations.
On the host, a reference is a host virtual address.
On the device, a reference is:

* The descriptor that a variable is bound to, for variables in Image,
  Uniform, or StorageBuffer storage classes.
  If the variable is an array (or array of arrays, etc.) then each element
  of the array may: be a unique reference.
ifdef::VK_EXT_buffer_device_address[]
* The address range for a buffer in code:PhysicalStorageBufferEXT storage
  class, where the base of the address range is queried with
  flink:vkGetBufferDeviceAddressEXT and the length of the range is the
  size of the buffer.
endif::VK_EXT_buffer_device_address[]
* The variable itself for variables in other storage classes.

Two memory accesses through distinct references may: require availability
and visibility operations as defined
<<memory-model-location-ordered,below>>.

[[memory-model-program-order]]
== Program-Order

A _dynamic instance_ of an instruction is defined in SPIR-V
(https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#DynamicInstance)
as a way of referring to a particular execution of a static instruction.
Program-order is an ordering on dynamic instances of instructions executed
by a single shader invocation:

* (Basic block): If instructions A and B are in the same basic block, and
  A is listed in the module before B, then the n'th dynamic instance of A
  is program-ordered before the n'th dynamic instance of B.
* (Branch): The dynamic instance of a branch or switch instruction is
  program-ordered before the dynamic instance of the OpLabel instruction
  to which it transfers control.
* (Call entry): The dynamic instance of a function call instruction is
  program-ordered before the dynamic instances of the
  code:OpFunctionParameter instructions and the body of the called
  function.
* (Call exit): The dynamic instance of the instruction following a
  function call instruction is program-ordered after the dynamic instance
  of the return instruction executed by the called function.
* (Transitive Closure): If dynamic instance A of any instruction is
  program-ordered before dynamic instance B of any instruction and B is
  program-ordered before dynamic instance C of any instruction then A is
  program-ordered before C.
* (Complete definition): No other dynamic instances are program-ordered.
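
The rules above amount to taking a small set of base edges and closing it under transitivity.
The following Python sketch shows that construction for one invocation; the dynamic-instance names and the `base_edges` set are hypothetical and were chosen by hand to stand in for the edges that the basic block, branch, call entry and call exit rules would produce.

```python
from itertools import product


def transitive_closure(edges):
    """Return the smallest transitive relation containing `edges`."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(tuple(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


# Hypothetical dynamic instances executed by a single shader invocation.
base_edges = {
    ("load", "branch"),               # basic block rule
    ("branch", "label"),              # branch rule
    ("label", "call"),                # basic block rule
    ("call", "callee_body"),          # call entry rule
    ("callee_body", "callee_return"), # basic block rule inside the callee
    ("callee_return", "after_call"),  # call exit rule
}
program_order = transitive_closure(base_edges)
```

Because of the complete definition rule, any pair that is not in `program_order` is simply unordered in this model, including any pair that involves another invocation.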

For instructions executed on the host, the source language defines the
program-order relation (e.g. as "`sequenced-before`").

[[memory-model-scope]]
== Scope

A _scope_ describes a set of shader invocations, where each such set is a
_scope instance_.
Scopes are defined hierarchically such that a more inclusive scope includes
one or more sets of less inclusive scope instances.
The scopes defined by SPIR-V are as follows, listed from most inclusive to
least inclusive:

* code:CrossDevice identifies all shader invocations in a Vulkan instance
  across all shader launches, and all host threads interacting with that
  instance.
* code:Device identifies all shader invocations that execute on a given
  device, including those from different shader launches.
* code:QueueFamilyKHR identifies all shader invocations that execute on any
  queue in a given queue family, including those from different shader
  launches.
* code:Workgroup identifies all invocations in a single workgroup.
* code:Subgroup identifies all invocations in a single subgroup.
* code:Invocation identifies a single invocation.

Atomic and barrier instructions include scopes which identify sets of shader
invocations that must: obey the requested ordering and atomicity rules of
the operation, as defined below.
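
One informal way to picture scope instances is as prefixes of a path that locates an invocation in the hierarchy.
The Python sketch below is an illustration under simplifying assumptions: the `Invocation` record and its integer fields are hypothetical, workgroup and subgroup numbers are assumed to be unique within their parent, and host threads and distinct shader launches are not modeled.

```python
from dataclasses import dataclass

# Scope names ordered from most inclusive to least inclusive.
SCOPES = ["CrossDevice", "Device", "QueueFamilyKHR",
          "Workgroup", "Subgroup", "Invocation"]


@dataclass(frozen=True)
class Invocation:
    device: int
    queue_family: int
    workgroup: int
    subgroup: int
    index: int


def scope_instance(inv: Invocation, scope: str) -> tuple:
    """Identify the instance of `scope` that contains `inv`."""
    path = (inv.device, inv.queue_family, inv.workgroup,
            inv.subgroup, inv.index)
    depth = SCOPES.index(scope)  # CrossDevice keeps no path components
    return (scope,) + path[:depth]


def same_scope_instance(a: Invocation, b: Invocation, scope: str) -> bool:
    return scope_instance(a, scope) == scope_instance(b, scope)


inv_a = Invocation(device=0, queue_family=0, workgroup=1, subgroup=0, index=0)
inv_b = Invocation(device=0, queue_family=0, workgroup=1, subgroup=1, index=5)
inv_c = Invocation(device=0, queue_family=0, workgroup=2, subgroup=0, index=0)
```

Here `inv_a` and `inv_b` share a code:Workgroup instance but not a code:Subgroup instance, while `inv_a` and `inv_c` only share instances from code:QueueFamilyKHR upward.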

[[memory-model-atomic-operation]]
== Atomic Operation

An _atomic operation_ on the device is any SPIR-V operation whose name
begins with code:OpAtomic.
An atomic operation on the host is any operation performed with an
std::atomic typed object.

Each atomic operation has a memory <<memory-model-scope,scope>> and a
<<memory-model-memory-semantics,semantics>>.
Informally, the scope determines which other agents it is atomic with
respect to, and the <<memory-model-memory-semantics,semantics>> constrains
its ordering against other memory accesses.
Device atomic operations have explicit scopes and semantics.
Each host atomic operation implicitly uses the code:CrossDevice scope, and
uses a memory semantics equivalent to a C++ std::memory_order value of
relaxed, acquire, release, acq_rel, or seq_cst.

Two atomic operations A and B are _potentially-mutually-ordered_ if and only
if all of the following are true:

* They access the same set of memory locations.
* They use the same reference.
* A is in the instance of B's memory scope.
* B is in the instance of A's memory scope.
* A and B are not the same operation (irreflexive).

Two atomic operations A and B are _mutually-ordered_ if and only if they are
potentially-mutually-ordered and any of the following are true:

* A and B are both device operations.
* A and B are both host operations.
* A is a device operation, B is a host operation, and the implementation
  supports concurrent host- and device-atomics.
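
The two definitions above can be read as predicates over pairs of atomic operations.
The following Python sketch is a simplified model, not a normative statement: the `AtomicOp` record is hypothetical, a reference is reduced to an opaque label, a scope instance is reduced to the set of agent names it contains, and the mixed host/device case is treated symmetrically, which is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtomicOp:
    name: str
    locations: frozenset      # set of memory locations accessed
    reference: str            # opaque label standing in for the reference
    scope_members: frozenset  # agents in the instance of this op's memory scope
    agent: str                # agent that executes the operation
    on_host: bool


def potentially_mutually_ordered(a: AtomicOp, b: AtomicOp) -> bool:
    return (a.locations == b.locations
            and a.reference == b.reference
            and a.agent in b.scope_members
            and b.agent in a.scope_members
            and a is not b)  # irreflexive: an operation is not ordered with itself


def mutually_ordered(a: AtomicOp, b: AtomicOp,
                     mixed_atomics_supported: bool = False) -> bool:
    if not potentially_mutually_ordered(a, b):
        return False
    if a.on_host == b.on_host:  # both device operations or both host operations
        return True
    # One host and one device operation (assumed symmetric in this sketch).
    return mixed_atomics_supported


everyone = frozenset({"inv0", "inv1", "host0"})
workgroup_only = frozenset({"inv0", "inv1"})
locs = frozenset({0, 1, 2, 3})
dev_a = AtomicOp("dev_a", locs, "ssbo", everyone, "inv0", False)
dev_b = AtomicOp("dev_b", locs, "ssbo", everyone, "inv1", False)
dev_wg = AtomicOp("dev_wg", locs, "ssbo", workgroup_only, "inv1", False)
host_c = AtomicOp("host_c", locs, "ssbo", everyone, "host0", True)
```

In this model `dev_wg` and `host_c` are not even potentially-mutually-ordered, because the host agent is outside the instance of the memory scope used by `dev_wg`.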

NOTE: If two atomic operations are not mutually-ordered, and if their sets
of memory locations overlap, then each must: be synchronized against the
other as if they were non-atomic operations.

[[memory-model-scoped-modification-order]]
== Scoped Modification Order

For a given atomic write A, all atomic writes that are mutually-ordered with
A occur in an order known as A's _scoped modification order_.
A's scoped modification order relates no other operations.

NOTE: Invocations outside the instance of A's memory scope may: observe the
values at A's set of memory locations becoming visible to them in an order
that disagrees with the scoped modification order.

NOTE: It is valid to have non-atomic operations or atomics in a different
scope instance to the same set of memory locations, as long as they are
synchronized against each other as if they were non-atomic (if they are not,
it is treated as a <<memory-model-access-data-race,data race>>).
That means this definition of A's scoped modification order could include
atomic operations that occur much later, after intervening non-atomics.
That is a bit non-intuitive, but it helps to keep this definition simple and
non-circular.

[[memory-model-memory-semantics]]
== Memory Semantics

Non-atomic memory operations, by default, may: be observed by one agent in a
different order than they were written by another agent.

Atomics and some synchronization operations include _memory semantics_,
which are flags that constrain the order in which other memory accesses
(including non-atomic memory accesses and
<<memory-model-availability-visibility,availability and visibility
operations>>) performed by the same agent can: be observed by other agents,
or can: observe accesses by other agents.

Device instructions that include semantics are code:OpAtomic*,
code:OpControlBarrier, code:OpMemoryBarrier, and code:OpMemoryNamedBarrier.
Host instructions that include semantics are some std::atomic methods and
memory fences.

SPIR-V supports the following memory semantics:

* Relaxed: No constraints on order of other memory accesses.
* Acquire: A memory read with this semantic performs an _acquire
  operation_.
  A memory barrier with this semantic is an _acquire barrier_.
* Release: A memory write with this semantic performs a _release
  operation_.
  A memory barrier with this semantic is a _release barrier_.
* AcquireRelease: A memory read-modify-write operation with this semantic
  performs both an acquire operation and a release operation, and inherits
  the limitations on ordering from both of those operations.
  A memory barrier with this semantic is both a release and acquire
  barrier.

NOTE: SPIR-V does not support "`consume`" semantics on the device.

The memory semantics operand also includes _storage class semantics_ which
indicate which storage classes are constrained by the synchronization.
SPIR-V storage class semantics include:

* UniformMemory
* WorkgroupMemory
* ImageMemory
* OutputMemoryKHR

Each SPIR-V memory operation accesses a single storage class.
Semantics in synchronization operations can include a combination of storage
classes.

The UniformMemory storage class semantic applies to accesses to memory in
the
ifdef::VK_EXT_buffer_device_address[]
PhysicalStorageBufferEXT,
endif::VK_EXT_buffer_device_address[]
Uniform and StorageBuffer storage classes.
The WorkgroupMemory storage class semantic applies to accesses to memory in
the Workgroup storage class.
The ImageMemory storage class semantic applies to accesses to memory in the
Image storage class.
The OutputMemoryKHR storage class semantic applies to accesses to memory in
the Output storage class.

NOTE: Informally, these constraints limit how memory operations can be
reordered, and these limits apply not only to the order of accesses as
performed in the agent that executes the instruction, but also to the order
the effects of writes become visible to all other agents within the same
instance of the instruction's memory scope.

NOTE: Release and acquire operations in different threads can: act as
synchronization operations, to guarantee that writes that happened before
the release are visible after the acquire.
(This is not a formal definition, just an informative forward reference.)

NOTE: The OutputMemoryKHR storage class semantic is only useful in
tessellation control shaders, which is the only execution model where output
variables are shared between invocations.

The memory semantics operand also optionally includes availability and
visibility flags, which apply optional availability and visibility
operations as described in
<<memory-model-availability-visibility,availability and visibility>>.
The availability/visibility flags are:

* MakeAvailable: Semantics must: be Release or AcquireRelease.
  Performs an availability operation before the release operation or
  barrier.
* MakeVisible: Semantics must: be Acquire or AcquireRelease.
  Performs a visibility operation after the acquire operation or barrier.
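
The pairing rules for the two flags can be checked mechanically.
The following Python sketch is illustrative only; the function `flags_are_valid` and the use of plain strings for the semantics are assumptions of this sketch rather than any SPIR-V or validation layer interface.

```python
# Ordering semantics that include a release or an acquire component.
RELEASE_LIKE = {"Release", "AcquireRelease"}
ACQUIRE_LIKE = {"Acquire", "AcquireRelease"}


def flags_are_valid(semantics: str,
                    make_available: bool = False,
                    make_visible: bool = False) -> bool:
    """Check the MakeAvailable/MakeVisible pairing rules listed above."""
    if make_available and semantics not in RELEASE_LIKE:
        return False
    if make_visible and semantics not in ACQUIRE_LIKE:
        return False
    return True
```

For example, a MakeAvailable flag combined with Acquire-only semantics is rejected by this check, while AcquireRelease accepts both flags together.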

The specifics of these operations are defined in
<<memory-model-availability-visibility-semantics,Availability and Visibility
Semantics>>.

Host atomic operations may: support a different list of memory semantics and
synchronization operations, depending on the host architecture and source
language.

[[memory-model-release-sequence]]
== Release Sequence

After an atomic operation A performs a release operation on a set of memory
locations M, the _release sequence headed by A_ is the longest continuous
subsequence of A's scoped modification order that consists of:

* the atomic operation A as its first element
* atomic read-modify-write operations on M by any agent

NOTE: The atomics in the last bullet must: be mutually-ordered with A by
virtue of being in A's scoped modification order.

NOTE: This intentionally omits "`atomic writes to M performed by the same
agent that performed A`", which is present in the corresponding C++
definition.
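
As an informal illustration, the release sequence can be computed by walking A's scoped modification order from A and stopping at the first write that is not a read-modify-write.
The Python sketch below assumes that the scoped modification order is already known as a list, that every entry writes M, and that the head performs a release operation; the `AtomicWrite` record and the operation names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtomicWrite:
    name: str
    is_read_modify_write: bool = False


def release_sequence(scoped_modification_order, head):
    """Return `head` followed by the contiguous run of RMW operations after it."""
    start = scoped_modification_order.index(head)
    sequence = [head]
    for op in scoped_modification_order[start + 1:]:
        if not op.is_read_modify_write:
            break  # a plain atomic write ends the sequence, whoever issued it
        sequence.append(op)
    return sequence


order = [
    AtomicWrite("init_store"),
    AtomicWrite("A"),  # assumed to perform a release operation
    AtomicWrite("rmw1", is_read_modify_write=True),
    AtomicWrite("rmw2", is_read_modify_write=True),
    AtomicWrite("later_store"),
    AtomicWrite("rmw3", is_read_modify_write=True),
]
names = [op.name for op in release_sequence(order, order[1])]
```

In this example `names` contains `A`, `rmw1` and `rmw2`; `rmw3` is excluded because `later_store` interrupts the contiguous run.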

[[memory-model-synchronizes-with]]
== Synchronizes-With

_Synchronizes-with_ is a relation between operations, where each operation
is either an atomic operation or a memory barrier (aka fence on the host).

If A and B are atomic operations, then A synchronizes-with B if and only if
all of the following are true:

* A performs a release operation
* B performs an acquire operation
* A and B are mutually-ordered
* B reads a value written by A or by an operation in the release sequence
  headed by A
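
The atomic-to-atomic rule above can be sketched as a predicate over a small execution record.
This Python sketch covers only that rule and makes several simplifying assumptions: the `Atomic` record is hypothetical, the mutually-ordered relation is supplied as a set of unordered name pairs, `read_source` maps each reading operation to the name of the write whose value it read, and the release sequence headed by A is supplied as a set of names that includes A itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atomic:
    name: str
    performs_release: bool = False
    performs_acquire: bool = False


def synchronizes_with(a, b, mutually_ordered, read_source, release_sequence_of_a):
    """Atomic-to-atomic rule only; the barrier rules are not modeled here."""
    return (a.performs_release
            and b.performs_acquire
            and frozenset((a.name, b.name)) in mutually_ordered
            and read_source.get(b.name) in release_sequence_of_a)


store = Atomic("store", performs_release=True)
load = Atomic("load", performs_acquire=True)
relaxed_load = Atomic("relaxed_load")

mutually_ordered = {
    frozenset(("store", "load")),
    frozenset(("store", "relaxed_load")),
}
# "load" read the value written by "rmw", which is in the release sequence
# headed by "store"; "relaxed_load" read the value written by "store" itself.
read_source = {"load": "rmw", "relaxed_load": "store"}
release_sequence_of_store = {"store", "rmw"}
```

With this record, `store` synchronizes-with `load`, but not with `relaxed_load`, since the latter does not perform an acquire operation.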

code:OpControlBarrier, code:OpMemoryBarrier, and code:OpMemoryNamedBarrier
are _memory barrier_ instructions in SPIR-V.

If A is a release barrier and B is an atomic operation that performs an
acquire operation, then A synchronizes-with B if and only if all of the
following are true:

* there exists an atomic write X (with any memory semantics)
* A is program-ordered before X
* X and B are mutually-ordered
* B reads a value written by X or by an operation in the release sequence
  headed by X
** If X is relaxed, it is still considered to head a hypothetical release
   sequence for this rule
* A and B are in the instance of each other's memory scopes
* X's storage class is in A's semantics.

If A is an atomic operation that performs a release operation and B is an
acquire barrier, then A synchronizes-with B if and only if all of the
following are true:

* there exists an atomic read X (with any memory semantics)
* X is program-ordered before B
* X and A are mutually-ordered
* X reads a value written by A or by an operation in the release sequence
  headed by A
* A and B are in the instance of each other's memory scopes
* X's storage class is in B's semantics.

If A is a release barrier and B is an acquire barrier, then A
synchronizes-with B if all of the following are true:

* there exists an atomic write X (with any memory semantics)
* A is program-ordered before X
* there exists an atomic read Y (with any memory semantics)
* Y is program-ordered before B
* X and Y are mutually-ordered
* Y reads the value written by X or by an operation in the release
  sequence headed by X
** If X is relaxed, it is still considered to head a hypothetical release
   sequence for this rule
* A and B are in the instance of each other's memory scopes
* X's and Y's storage class is in A's and B's semantics.
** NOTE: X and Y must have the same storage class, because they are
   mutually-ordered.

If A is a release barrier and B is an acquire barrier and C is a control
barrier (where A can optionally equal C and B can optionally equal C), then
A synchronizes-with B if all of the following are true:

* A is program-ordered before (or equals) C
* C is program-ordered before (or equals) B
* A and B are in the instance of each other's memory scopes
* A and B are in the instance of C's execution scope

NOTE: This is similar to the barrier-barrier synchronization above, but with
a control barrier filling the role of the relaxed atomics.

No other release and acquire barriers synchronize-with each other.

[[memory-model-system-synchronizes-with]]
== System-Synchronizes-With

_System-synchronizes-with_ is a relation between arbitrary operations on the
device or host.
Certain operations system-synchronize-with each other, which informally
means the first operation occurs before the second and that the
synchronization is performed without using application-visible memory
accesses.

If there is an <<synchronization-dependencies-execution,execution
dependency>> between two operations A and B, then the operation in the first
synchronization scope system-synchronizes-with the operation in the second
synchronization scope.

NOTE: This covers all Vulkan synchronization primitives, including device
operations executing before a synchronization primitive is signaled, wait
operations happening before subsequent device operations, signal operations
happening before host operations that wait on them, and host operations
happening before flink:vkQueueSubmit.
The list is spread throughout the synchronization chapter, and is not
repeated here.

System-synchronizes-with implicitly includes all storage class semantics and
has code:CrossDevice scope.

If A system-synchronizes-with B, we also say A is
_system-synchronized-before_ B and B is _system-synchronized-after_ A.

[[memory-model-non-private]]
== Private vs. Non-Private

By default, non-atomic memory operations are treated as _private_, meaning
such a memory operation is not intended to be used for communication with
other agents.
Memory operations with the NonPrivatePointerKHR/NonPrivateTexelKHR bit set
are treated as _non-private_, and are intended to be used for communication
with other agents.

More precisely, making private memory operations
<<memory-model-location-ordered,Location-Ordered>> between distinct agents
requires using system-synchronizes-with rather than shader-based
synchronization.
Non-private memory operations still obey program-order.

Atomic operations are always considered non-private.

[[memory-model-inter-thread-happens-before]]
== Inter-Thread-Happens-Before

Let SC be a non-empty set of storage class semantics.
Then (using template syntax) operation A _inter-thread-happens-before_<SC>
operation B if and only if any of the following is true:

* A system-synchronizes-with B
* A synchronizes-with B, and both A and B have all of SC in their
  semantics
* A is an operation on memory in a storage class in SC or that has all of
  SC in its semantics, B is a release barrier or release atomic with all
  of SC in its semantics, and A is program-ordered before B
* A is an acquire barrier or acquire atomic with all of SC in its
  semantics, B is an operation on memory in a storage class in SC or that
  has all of SC in its semantics, and A is program-ordered before B
* A and B are both host operations and A inter-thread-happens-before B as
  defined in the host language spec
* A inter-thread-happens-before<SC> some X and X
  inter-thread-happens-before<SC> B

[[memory-model-happens-before]]
== Happens-Before

Operation A _happens-before_ operation B if and only if any of the following
is true:

* A is program-ordered before B
* A inter-thread-happens-before<SC> B for some set of storage classes SC

_Happens-after_ is defined similarly.

NOTE: Unlike C++, happens-before is not always sufficient for a write to be
visible to a read.
Additional <<memory-model-availability-visibility,availability and
visibility>> operations may: be required for writes to be
<<memory-model-visible-to,visible-to>> other memory accesses.

NOTE: Happens-before is not transitive, but each of program-order and
inter-thread-happens-before<SC> is transitive.
These can be thought of as covering the "`single-threaded`" case and the
"`multi-threaded`" case, and it's not necessary (and not valid) to form
chains between the two.
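
The lack of transitivity can be seen in a very small abstract model.
In the Python sketch below, the two relations are given as hand-written sets of pairs that are assumed to be already transitively closed, the storage class parameter SC is collapsed into a single relation, and the operation names `x`, `y` and `z` are hypothetical.

```python
def happens_before(a, b, program_order, inter_thread_happens_before):
    """A happens-before B if either relation orders A before B directly."""
    return (a, b) in program_order or (a, b) in inter_thread_happens_before


# Hypothetical relations: x is program-ordered before y, and
# y inter-thread-happens-before z.
program_order = {("x", "y")}
inter_thread_happens_before = {("y", "z")}

x_before_y = happens_before("x", "y", program_order, inter_thread_happens_before)
y_before_z = happens_before("y", "z", program_order, inter_thread_happens_before)
x_before_z = happens_before("x", "z", program_order, inter_thread_happens_before)
```

Here `x_before_y` and `y_before_z` both hold, but `x_before_z` does not, because the definition never chains a program-order edge with an inter-thread-happens-before edge.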

[[memory-model-availability-visibility]]
== Availability and Visibility

_Availability_ and _visibility_ are states of a write operation, which
(informally) track how far the write has permeated the system, i.e. which
agents and references are able to observe the write.
Availability state is per _memory domain_.
Visibility state is per (agent,reference) pair.
Availability and visibility states are per-memory location for each write.

Memory domains are named according to the agents whose memory accesses use
the domain.
Domains used by shader invocations are organized hierarchically into
multiple smaller memory domains which correspond to the different
<<memory-model-scope, scopes>>.
The memory domains defined in Vulkan include:

* _host_ - accessible by host agents
* _device_ - accessible by all device agents for a particular device
* _shader_ - accessible by shader agents for a particular device,
  corresponding to the code:Device scope
* _queue family instance_ - accessible by shader agents in a single queue
  family, corresponding to the code:QueueFamilyKHR scope.
* _workgroup instance_ - accessible by shader agents in the same
  workgroup, corresponding to the code:Workgroup scope.
* _subgroup instance_ - accessible by shader agents in the same subgroup,
  corresponding to the code:Subgroup scope.

NOTE: These do not correspond to storage classes or device-local and
|
|
|
|
host-local slink:VkDeviceMemory allocations; rather, they indicate whether a
|
|
|
|
write can be made visible only to agents in the same subgroup, same
|
|
|
|
workgroup, in any shader invocation, or anywhere on the device, or host.
|
|
|
|
The shader, queue family instance, workgroup instance, and subgroup instance
|
|
|
|
domains are only used for shader-based availability/visibility operations; in
|
|
|
|
other cases writes can be made available from/visible to the shader via the
|
|
|
|
device domain.
|
|
|
|
|
|
|
|
_Availability operations_, _visibility operations_, and _memory domain
|
|
|
|
operations_ alter the state of the write operations that happen-before them,
|
|
|
|
and which are included in their _source scope_ to be available or visible to
|
|
|
|
their _destination scope_.
|
|
|
|
|
|
|
|
* For an availability operation, the source scope is a set of
|
|
|
|
(agent,reference,memory location) tuples, and the destination scope is a
|
|
|
|
set of memory domains.
|
|
|
|
* For a memory domain operation, the source scope is a memory domain and
|
|
|
|
the destination scope is a memory domain.
|
|
|
|
* For a visibility operation, the source scope is a set of memory domains
|
|
|
|
and the destination scope is a set of (agent,reference,memory location)
|
|
|
|
tuples.
|
|
|
|
|
|
|
|
How the scopes are determined depends on the specific operation.
|
|
|
|
Availability and memory domain operations expand the set of memory domains
|
|
|
|
to which the write is available.
|
|
|
|
Visibility operations expand the set of (agent,reference,memory location)
|
|
|
|
tuples to which the write is visible.
|
|
|
|
|
|
|
|
Recall that availability and visibility states are per-memory location, and
|
|
|
|
let W be a write operation to one or more locations performed by agent A via
|
|
|
|
reference R. Let L be one of the locations written.
|
|
|
|
(W,L) (the write W to L) is initially not available to any memory domain
|
|
|
|
and only visible to (A,R,L).
|
|
|
|
An availability operation AV that happens-after W and that includes (A,R,L)
|
|
|
|
in its source scope makes (W,L) _available_ to the memory domains in its
|
|
|
|
destination scope.
|
|
|
|
|
|
|
|
A memory domain operation DOM that happens-after AV and for which (W,L) is
|
|
|
|
available in the source scope makes (W,L) available in the destination
|
|
|
|
memory domain.
|
|
|
|
|
|
|
|
A visibility operation VIS that happens-after AV (or DOM) and for which
|
|
|
|
(W,L) is available in any domain in the source scope makes (W,L) _visible_
|
|
|
|
to all (agent,reference,L) tuples included in its destination scope.
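As an informal illustration of the AV, DOM, and VIS rules above, the following Python sketch tracks the state of a single (W,L) pair. The names `WriteState`, `availability_op`, `domain_op`, and `visibility_op` are hypothetical and are not part of Vulkan or SPIR-V. The sketch assumes that each call happens-after the previous one, and it models scopes as plain Python sets of (agent, reference) pairs or domain names.

```python
class WriteState:
    """Availability/visibility state of one write W to one location L."""

    def __init__(self, agent, reference):
        # (W,L) starts out available to no domain and visible only to (A,R,L).
        self.writer = (agent, reference)
        self.available_in = set()
        self.visible_to = {(agent, reference)}

    def availability_op(self, source_scope, dest_domains):
        # AV applies only if the writer's (agent, reference) is in the source scope.
        if self.writer in source_scope:
            self.available_in |= set(dest_domains)

    def domain_op(self, source_domain, dest_domain):
        # DOM applies only if (W,L) is already available in the source domain.
        if source_domain in self.available_in:
            self.available_in.add(dest_domain)

    def visibility_op(self, source_domains, dest_scope):
        # VIS applies if (W,L) is available in any domain of the source scope.
        if self.available_in & set(source_domains):
            self.visible_to |= set(dest_scope)


# Hypothetical agents and references, for illustration only.
w = WriteState("host_thread", "mapped_ptr")
w.visibility_op({"device"}, {("shader_inv", "ssbo")})  # no effect: not yet available
too_early = ("shader_inv", "ssbo") in w.visible_to
w.availability_op({("host_thread", "mapped_ptr")}, {"host"})
w.domain_op("host", "device")
w.visibility_op({"device"}, {("shader_inv", "ssbo")})
```

In this sketch a visibility operation issued before the write is available has no effect, which mirrors the requirement that VIS happens-after AV (or DOM).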
|
|
|
|
|
|
|
|
If write W~2~ happens-after W, and their sets of memory locations overlap,
then W is no longer available/visible to any agent/reference for those
memory locations that overlap (and future AV/DOM/VIS operations cannot revive
W's write to those locations).
|
|
|
|
|
|
|
|
Availability, memory domain, and visibility operations are treated like
|
|
|
|
other non-atomic memory accesses for the purpose of
|
|
|
|
<<memory-model-memory-semantics,memory semantics>>, meaning they can be
|
|
|
|
ordered by release-acquire sequences or memory barriers.
|
|
|
|
|
|
|
|
An _availability chain_ is a sequence of availability operations of
|
|
|
|
increasing scope where element N+1 of the chain is performed in the same
|
|
|
|
scope instance as the destination of element N and element N happens-before
|
|
|
|
element N+1.
|
|
|
|
An example is an availability operation with destination scope of the
|
|
|
|
workgroup instance domain that happens-before an availability operation to
|
|
|
|
the shader domain performed by an invocation in the same workgroup.
|
|
|
|
An availability chain AVC that happens-after W and that includes (A,R,L) in
|
|
|
|
the source scope makes (W,L) _available_ to the memory domains in its final
|
|
|
|
destination scope.
|
|
|
|
An availability chain with a single element is just the availability
|
|
|
|
operation.
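The chain conditions can be checked mechanically. The sketch below is a simplified model, not part of the specification: the names `SCOPE_RANK` and `is_availability_chain` are hypothetical, "increasing scope" is interpreted as strictly increasing (an assumption), and the operations are assumed to be listed in happens-before order.

```python
# Destination domains ordered from smallest to largest scope (illustrative names).
SCOPE_RANK = {"subgroup": 0, "workgroup": 1, "queue_family": 2, "shader": 3}


def is_availability_chain(ops):
    """Check the structural conditions of an availability chain.

    ops is a list of (instances, dest) pairs in happens-before order.
    instances maps a scope name to the ID of the scope instance that contains
    the invocation performing the operation; dest is the destination scope.
    """
    for (inst_n, dest_n), (inst_next, dest_next) in zip(ops, ops[1:]):
        if SCOPE_RANK[dest_next] <= SCOPE_RANK[dest_n]:
            return False  # scope must increase along the chain
        if inst_next[dest_n] != inst_n[dest_n]:
            return False  # element N+1 must run in element N's destination instance
    return True


# Hypothetical invocations: a and b share workgroup 7, c is in workgroup 8.
a = {"subgroup": 0, "workgroup": 7}
b = {"subgroup": 1, "workgroup": 7}
c = {"subgroup": 0, "workgroup": 8}

same_workgroup = is_availability_chain([(a, "workgroup"), (b, "shader")])
other_workgroup = is_availability_chain([(a, "workgroup"), (c, "shader")])
single_element = is_availability_chain([(a, "workgroup")])
```

The first chain matches the example in the text: an availability operation to the workgroup instance domain followed by one to the shader domain from an invocation in the same workgroup.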
|
|
|
|
|
|
|
|
Similarly, a _visibility chain_ is a sequence of visibility operations of
|
|
|
|
decreasing scope where element N of the chain is performed in the same scope
|
|
|
|
instance as the source of element N+1 and element N happens-before element
|
|
|
|
N+1.
|
|
|
|
An example is a visibility operation with source scope of the shader domain
|
|
|
|
that happens-before a visibility operation with source scope of the
|
|
|
|
workgroup instance domain performed by an invocation in the same
|
|
|
|
workgroup.
|
|
|
|
A visibility chain VISC that happens-after AVC (or DOM) and for which (W,L)
|
|
|
|
is available in any domain in the source scope makes (W,L) _visible_ to all
|
|
|
|
(agent,reference,L) tuples included in its final destination scope.
|
|
|
|
A visibility chain with a single element is just the visibility operation.
|
|
|
|
|
|
|
|
[[memory-model-vulkan-availability-visibility]]
|
|
|
|
== Availability, Visibility, and Domain Operations
|
|
|
|
|
|
|
|
The following operations generate availability, visibility, and domain
|
|
|
|
operations.
|
|
|
|
When multiple availability/visibility/domain operations are described, they
|
|
|
|
are system-synchronized-with each other in the order listed.
|
|
|
|
|
|
|
|
An operation that performs a <<synchronization-dependencies-memory,memory
|
|
|
|
dependency>> generates:
|
|
|
|
|
|
|
|
* If the source access mask includes ename:VK_ACCESS_HOST_WRITE_BIT, then
|
|
|
|
the dependency includes a memory domain operation from host domain to
|
|
|
|
device domain.
|
|
|
|
* An availability operation with source scope of all writes in the first
|
|
|
|
<<synchronization-dependencies-access-scopes,access scope>> of the
|
|
|
|
dependency and a destination scope of the device domain.
|
|
|
|
* A visibility operation with source scope of the device domain and
|
|
|
|
destination scope of the second access scope of the dependency.
|
|
|
|
* If the destination access mask includes ename:VK_ACCESS_HOST_READ_BIT or
|
|
|
|
ename:VK_ACCESS_HOST_WRITE_BIT, then the dependency includes a memory
|
|
|
|
domain operation from device domain to host domain.
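The list above can be read as a small decision procedure. The following Python sketch is illustrative only: `memory_dependency_ops` is a hypothetical helper, the operations are returned as descriptive strings in the order listed, and the two access-flag values are copied from the Vulkan headers.

```python
VK_ACCESS_HOST_READ_BIT = 0x00002000   # value from the Vulkan headers
VK_ACCESS_HOST_WRITE_BIT = 0x00004000  # value from the Vulkan headers


def memory_dependency_ops(src_access_mask, dst_access_mask):
    """Return the operations a memory dependency generates, in order."""
    ops = []
    if src_access_mask & VK_ACCESS_HOST_WRITE_BIT:
        ops.append("DOM(host -> device)")
    ops.append("AV(first access scope -> device domain)")
    ops.append("VIS(device domain -> second access scope)")
    if dst_access_mask & (VK_ACCESS_HOST_READ_BIT | VK_ACCESS_HOST_WRITE_BIT):
        ops.append("DOM(device -> host)")
    return ops


VK_ACCESS_SHADER_WRITE_BIT = 0x00000040  # value from the Vulkan headers

# A shader write that is later read back on the host.
readback = memory_dependency_ops(VK_ACCESS_SHADER_WRITE_BIT, VK_ACCESS_HOST_READ_BIT)
```

For the readback case the sketch yields an availability operation, a visibility operation, and a final device-to-host domain operation, with no leading host-to-device domain operation.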
|
|
|
|
|
|
|
|
flink:vkFlushMappedMemoryRanges performs an availability operation, with a
|
|
|
|
source scope of (agents,references) = (all host threads, all mapped memory
|
|
|
|
ranges passed to the command), and destination scope of the host domain.
|
|
|
|
|
|
|
|
flink:vkInvalidateMappedMemoryRanges performs a visibility operation, with a
|
|
|
|
source scope of the host domain and a destination scope of
|
|
|
|
(agents,references) = (all host threads, all mapped memory ranges passed to
|
|
|
|
the command).
|
|
|
|
|
|
|
|
flink:vkQueueSubmit performs a memory domain operation from host to device,
|
|
|
|
and a visibility operation with source scope of the device domain and
|
|
|
|
destination scope of all agents and references on the device.
|
|
|
|
|
|
|
|
[[memory-model-availability-visibility-semantics]]
|
|
|
|
== Availability and Visibility Semantics
|
|
|
|
|
|
|
|
A memory barrier or atomic operation via agent A that includes MakeAvailable
|
|
|
|
in its semantics performs an availability operation whose source scope
|
|
|
|
includes agent A and all references in the storage classes in that
|
|
|
|
instruction's storage class semantics, and all memory locations, and whose
|
|
|
|
destination scope is a set of memory domains selected as specified below.
|
|
|
|
The implicit availability operation is program-ordered between the barrier
|
|
|
|
or atomic and all other operations program-ordered before the barrier or
|
|
|
|
atomic.
|
|
|
|
|
|
|
|
A memory barrier or atomic operation via agent A that includes MakeVisible
|
|
|
|
in its semantics performs a visibility operation whose source scope is a set
|
|
|
|
of memory domains selected as specified below, and whose destination scope
|
|
|
|
includes agent A and all references in the storage classes in that
|
|
|
|
instruction's storage class semantics, and all memory locations.
|
|
|
|
The implicit visibility operation is program-ordered between the barrier or
|
|
|
|
atomic and all other operations program-ordered after the barrier or atomic.
|
|
|
|
|
|
|
|
The memory domains are selected based on the memory scope of the instruction
|
|
|
|
as follows:
|
|
|
|
|
|
|
|
* code:Device scope uses the shader domain
|
|
|
|
* code:QueueFamilyKHR scope uses the queue family instance domain
|
|
|
|
* code:Workgroup scope uses the workgroup instance domain
|
|
|
|
* code:Subgroup scope uses the subgroup instance domain
|
|
|
|
* code:Invocation scope performs no availability/visibility operations.
|
|
|
|
|
|
|
|
When an availability operation performed by an agent A includes a memory
|
|
|
|
domain D in its destination scope, where D corresponds to scope instance S,
|
|
|
|
it also includes the memory domains that correspond to each smaller scope
|
|
|
|
instance S' that is a subset of S and that includes A. Similarly for
|
|
|
|
visibility operations.
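The scope-to-domain selection, including the implied smaller scope instances, can be sketched as a lookup. This is a simplified illustration with hypothetical names (`DOMAIN_FOR_SCOPE`, `SCOPE_ORDER`, `domains_for_scope`); it returns domain kinds only, and the actual instances are those that contain the agent performing the operation.

```python
# Memory scope -> memory domain, as listed above.
DOMAIN_FOR_SCOPE = {
    "Device": "shader",
    "QueueFamilyKHR": "queue family instance",
    "Workgroup": "workgroup instance",
    "Subgroup": "subgroup instance",
}

# Scopes from smallest to largest.
SCOPE_ORDER = ["Subgroup", "Workgroup", "QueueFamilyKHR", "Device"]


def domains_for_scope(scope):
    """Return the domains used by an operation with the given memory scope.

    The selected domain is accompanied by the domains of every smaller scope
    instance that contains the agent, smallest first.
    """
    if scope == "Invocation":
        return []  # no availability/visibility operation is performed
    idx = SCOPE_ORDER.index(scope)
    return [DOMAIN_FOR_SCOPE[s] for s in SCOPE_ORDER[: idx + 1]]


workgroup_domains = domains_for_scope("Workgroup")
```

Here `workgroup_domains` contains both the subgroup instance and workgroup instance domains, reflecting that a code:Workgroup scope operation also covers the agent's own subgroup instance.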
|
|
|
|
|
|
|
|
[[memory-model-instruction-av-vis]]
|
|
|
|
== Per-Instruction Availability and Visibility Semantics
|
|
|
|
|
|
|
|
A memory write instruction that includes MakePointerAvailable, or an image
|
|
|
|
write instruction that includes MakeTexelAvailable, performs an availability
|
|
|
|
operation whose source scope includes the agent and reference used to
|
|
|
|
perform the write and the memory locations written by the instruction, and
|
|
|
|
whose destination scope is a set of memory domains selected by the Scope
|
|
|
|
operand specified in <<memory-model-availability-visibility-semantics,
|
|
|
|
Availability and Visibility Semantics>>.
|
|
|
|
The implicit availability operation is program-ordered between the write and
|
|
|
|
all other operations program-ordered after the write.
|
|
|
|
|
|
|
|
A memory read instruction that includes MakePointerVisible, or an image read
|
|
|
|
instruction that includes MakeTexelVisible, performs a visibility operation
|
|
|
|
whose source scope is a set of memory domains selected by the Scope operand
|
|
|
|
as specified in <<memory-model-availability-visibility-semantics,
|
|
|
|
Availability and Visibility Semantics>>, and whose destination scope
|
|
|
|
includes the agent and reference used to perform the read and the memory
|
|
|
|
locations read by the instruction.
|
|
|
|
The implicit visibility operation is program-ordered between the read and all
|
|
|
|
other operations program-ordered before the read.
|
|
|
|
|
|
|
|
NOTE: Although reads with per-instruction visibility only perform visibility
|
|
|
|
ops from the shader or workgroup instance or subgroup instance domain, they
|
|
|
|
will also see writes that were made visible via the device domain, i.e.
|
|
|
|
those writes previously performed by non-shader agents and made visible via
|
|
|
|
API commands.
|
|
|
|
|
|
|
|
NOTE: It is expected that all invocations in a subgroup execute on the same
|
|
|
|
processor with the same path to memory, and thus availability and visibility
|
|
|
|
operations with subgroup scope can be expected to be "`free`".
|
|
|
|
|
|
|
|
[[memory-model-location-ordered]]
|
|
|
|
== Location-Ordered
|
|
|
|
|
|
|
|
Let X and Y be memory accesses to overlapping sets of memory locations M,
|
|
|
|
where X != Y. Let (A~X~,R~X~) be the agent and reference used for X, and
|
|
|
|
(A~Y~,R~Y~) be the agent and reference used for Y. For now, let "`->`"
|
|
|
|
denote happens-before and "`->^rcpo^`" denote the reflexive closure of
|
|
|
|
program-ordered before.
|
|
|
|
|
|
|
|
If D~1~ and D~2~ are different memory domains, then let DOM(D~1~,D~2~) be a
|
|
|
|
memory domain operation from D~1~ to D~2~.
|
|
|
|
Otherwise, let DOM(D,D) be a placeholder such that X->DOM(D,D)->Y if and
|
|
|
|
only if X->Y.
|
|
|
|
|
|
|
|
X is _location-ordered_ before Y for a location L in M if and only if any of
|
|
|
|
the following is true:
|
|
|
|
|
|
|
|
* A~X~ == A~Y~ and R~X~ == R~Y~ and X->Y
|
|
|
|
** NOTE: this case means no availability/visibility ops are required when it's
|
|
|
|
the same (agent,reference).
|
|
|
|
|
|
|
|
* X is a read, both X and Y are non-private, and X->Y
|
|
|
|
* X is a read, and X (transitively) system-synchronizes with Y
|
|
|
|
|
|
|
|
* If R~X~ == R~Y~ and A~X~ and A~Y~ access a common memory domain D (e.g.
|
|
|
|
are in the same workgroup instance if D is the workgroup instance
|
|
|
|
domain), and both X and Y are non-private:
|
|
|
|
** X is a write, Y is a write, AVC(A~X~,R~X~,D,L) is an availability chain
|
|
|
|
making (X,L) available to domain D, and X->^rcpo^AVC(A~X~,R~X~,D,L)->Y
|
|
|
|
** X is a write, Y is a read, AVC(A~X~,R~X~,D,L) is an availability chain
|
|
|
|
making (X,L) available to domain D, VISC(A~Y~,R~Y~,D,L) is a visibility
|
|
|
|
chain making writes to L available in domain D visible to Y, and
|
|
|
|
X->^rcpo^AVC(A~X~,R~X~,D,L)->VISC(A~Y~,R~Y~,D,L)->^rcpo^Y
|
|
|
|
** If
|
|
|
|
slink:VkPhysicalDeviceVulkanMemoryModelFeaturesKHR::pname:vulkanMemoryModelAvailabilityVisibilityChains
|
|
|
|
is ename:VK_FALSE, then AVC and VISC must: each only have a single
|
|
|
|
element in the chain, in each sub-bullet above.
|
|
|
|
|
|
|
|
* Let D~X~ and D~Y~ each be either the device domain or the host domain,
|
|
|
|
depending on whether A~X~ and A~Y~ execute on the device or host:
|
|
|
|
** X is a write and Y is a write, and
|
|
|
|
X->AV(A~X~,R~X~,D~X~,L)->DOM(D~X~,D~Y~)->Y
|
|
|
|
** X is a write and Y is a read, and
|
|
|
|
X->AV(A~X~,R~X~,D~X~,L)->DOM(D~X~,D~Y~)->VIS(A~Y~,R~Y~,D~Y~,L)->Y
|
|
|
|
|
|
|
|
NOTE: The final bullet (synchronization through device/host domain) requires
|
|
|
|
API-level synchronization operations, since the device/host domains are not
|
|
|
|
accessible via shader instructions.
|
|
|
|
Also, "`device domain`" is not to be confused with "`device scope`", which
|
|
|
|
synchronizes through the "`shader domain`".
|
|
|
|
|
|
|
|
[[memory-model-access-data-race]]
|
|
|
|
== Data Race
|
|
|
|
|
|
|
|
Let X and Y be operations that access overlapping sets of memory locations
|
|
|
|
M, where X != Y, and at least one of X and Y is a write, and X and Y are not
|
|
|
|
mutually-ordered atomic operations.
|
|
|
|
If there does not exist a location-ordered relation between X and Y for each
|
|
|
|
location in M, then there is a _data race_.
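The definition can be expressed as a predicate over a caller-supplied location-ordered relation. The sketch below is illustrative: `has_data_race` and its parameters are hypothetical names, operations are modelled as dictionaries with `id` and `is_write` keys, and `location_ordered(a, b, loc)` is assumed to report whether a is location-ordered before b for location loc.

```python
def has_data_race(x, y, overlap, location_ordered, mutually_ordered_atomics=False):
    """Return True if operations x and y race on any location in overlap (M)."""
    if x["id"] == y["id"] or not overlap:
        return False  # the definition requires X != Y and overlapping locations
    if not (x["is_write"] or y["is_write"]):
        return False  # at least one access must be a write
    if mutually_ordered_atomics:
        return False  # mutually-ordered atomics are excluded
    # A race exists if some location has no location-ordered relation either way.
    return any(
        not (location_ordered(x, y, loc) or location_ordered(y, x, loc))
        for loc in overlap
    )


# Hypothetical relation: X is location-ordered before Y for location 0 only.
ordered = {("X", "Y", 0)}


def lo(a, b, loc):
    return (a["id"], b["id"], loc) in ordered


x = {"id": "X", "is_write": True}
y = {"id": "Y", "is_write": False}

race_on_loc0 = has_data_race(x, y, {0}, lo)
race_on_both = has_data_race(x, y, {0, 1}, lo)
```

The second query reports a race because location 1 is in the overlap but has no location-ordered relation between X and Y.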
|
|
|
|
|
|
|
|
Applications must: ensure that no data races occur during the execution of
|
|
|
|
their application.
|
|
|
|
|
|
|
|
NOTE: Data races can only occur due to instructions that are actually
|
|
|
|
executed, and for example an instruction skipped due to flow control must
|
|
|
|
not contribute to a data race.
|
|
|
|
|
|
|
|
[[memory-model-visible-to]]
|
|
|
|
== Visible-To
|
|
|
|
|
|
|
|
Let X be a write and Y be a read whose sets of memory locations overlap, and
|
|
|
|
let M be the set of memory locations that overlap.
|
|
|
|
Let M~2~ be a non-empty subset of M. Then X is _visible-to_ Y for memory
|
|
|
|
locations M~2~ if and only if all of the following are true:
|
|
|
|
|
|
|
|
* X is location-ordered before Y for each location L in M~2~.
|
|
|
|
* There does not exist another write Z to any location L in M~2~ such that
|
|
|
|
X is location-ordered before Z for location L and Z is location-ordered
|
|
|
|
before Y for location L.
|
|
|
|
|
|
|
|
If X is visible-to Y, then Y reads the value written by X for locations
|
|
|
|
M~2~.
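The two conditions can be sketched as a predicate. This is a simplified illustration with hypothetical names: `visible_to` takes the write X, the read Y, the subset M~2~, the collection of all writes, and a caller-supplied `lo(a, b, loc)` function that is assumed to return False whenever either operation does not access loc.

```python
def visible_to(x, y, m2, writes, lo):
    """Return True if write x is visible-to read y for the locations in m2."""
    if not m2:
        return False  # M2 must be a non-empty subset of the overlap
    # Condition 1: x is location-ordered before y for every location in m2.
    if not all(lo(x, y, loc) for loc in m2):
        return False
    # Condition 2: no other write z is location-ordered between x and y.
    for z in writes:
        if z == x:
            continue
        if any(lo(x, z, loc) and lo(z, y, loc) for loc in m2):
            return False
    return True


# Hypothetical relation: Z overwrites location 1 between X and Y.
ordered = {("X", "Y", 0), ("X", "Y", 1), ("X", "Z", 1), ("Z", "Y", 1)}


def lo(a, b, loc):
    return (a, b, loc) in ordered


visible_loc0 = visible_to("X", "Y", {0}, ["X", "Z"], lo)
visible_both = visible_to("X", "Y", {0, 1}, ["X", "Z"], lo)
```

In this example X remains visible-to Y for location 0 even though Z intervenes on location 1, so only the query over both locations fails.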
|
|
|
|
|
|
|
|
NOTE: It is possible for there to be a write between X and Y that overwrites
|
|
|
|
a subset of the memory locations, but the remaining memory locations (M~2~)
|
|
|
|
will still be visible-to Y.
|
|
|
|
|
|
|
|
[[memory-model-acyclicity]]
|
|
|
|
== Acyclicity
|
|
|
|
|
|
|
|
_Reads-from_ is a relation between operations, where the first operation is a
|
|
|
|
write, the second operation is a read, and the second operation reads the
|
|
|
|
value written by the first operation.
|
|
|
|
_From-reads_ is a relation between operations, where the first operation is a
|
|
|
|
read, the second operation is a write, and the first operation reads a value
|
|
|
|
written earlier than the second operation in the second operation's scoped
|
|
|
|
modification order (or the first operation reads from the initial value, and
|
|
|
|
the second operation is any write to the same locations).
|
|
|
|
|
|
|
|
Then the implementation must: guarantee that no cycles exist in the union of
|
|
|
|
the following relations:
|
|
|
|
|
|
|
|
* location-ordered
|
|
|
|
* scoped modification order (over all atomic writes)
|
|
|
|
* reads-from
|
|
|
|
* from-reads
|
|
|
|
|
|
|
|
NOTE: This is a "`consistency`" axiom, which informally guarantees that
|
|
|
|
sequences of operations can't violate causality.
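Checking the acyclicity requirement for a candidate execution amounts to cycle detection on the union of the listed relations. The sketch below is one possible way to do this, using a depth-first search; `has_cycle` is a hypothetical helper, and each relation is assumed to be given as (earlier, later) pairs of operation names.

```python
def has_cycle(*relations):
    """Return True if the union of the given relations contains a cycle."""
    # Build an adjacency map from the union of all relations.
    graph = {}
    for rel in relations:
        for a, b in rel:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set())

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GREY
        for succ in graph[node]:
            # Reaching a node on the current path closes a cycle.
            if color[succ] == GREY or (color[succ] == WHITE and visit(succ)):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)


# Hypothetical execution: read R is location-ordered before write W,
# yet R reads from W. The union of the two relations is cyclic.
location_ordered = [("R", "W")]
reads_from = [("W", "R")]
violates_causality = has_cycle(location_ordered, reads_from)
```

The example execution is rejected because a read cannot take its value from a write that it is ordered before.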
|
|
|
|
|
|
|
|
[[memory-model-scoped-modification-order-coherence]]
|
|
|
|
=== Scoped Modification Order Coherence
|
|
|
|
|
|
|
|
Let A and B be mutually-ordered atomic operations, where A is
|
|
|
|
location-ordered before B.
|
|
|
|
Then the following rules are a consequence of acyclicity:
|
|
|
|
|
|
|
|
* If A and B are both reads and A does not read the initial value, then the
|
|
|
|
write that A takes its value from must: be earlier in its own scoped
|
|
|
|
modification order than (or the same as) the write that B takes its value
|
|
|
|
from (no cycles between location-order, reads-from, and from-reads).
|
|
|
|
* If A is a read and B is a write and A does not read the initial value,
|
|
|
|
then A must: take its value from a write earlier than B in B's scoped
|
|
|
|
modification order (no cycles between location-order, scoped modification
|
|
|
|
order, and reads-from).
|
|
|
|
* If A is a write and B is a read, then B must: take its value from A or a
|
|
|
|
write later than A in A's scoped modification order (no cycles between
|
|
|
|
location-order, scoped modification order, and from-reads).
|
|
|
|
* If A and B are both writes, then A must: be earlier than B in A's scoped
|
|
|
|
modification order (no cycles between location-order and scoped
|
|
|
|
modification order).
|
|
|
|
* If A is a write and B is a read-modify-write and B reads the value
|
|
|
|
written by A, then B comes immediately after A in A's scoped modification
|
|
|
|
order (no cycles between scoped modification order and from-reads).
|
|
|
|
|
|
|
|
[[memory-model-shader-io]]
|
|
|
|
== Shader I/O
|
|
|
|
|
|
|
|
If a shader invocation A in a shader stage other than code:Vertex performs a
|
|
|
|
memory read operation X from an object in the code:Input storage class, then
|
|
|
|
X is system-synchronized-after all writes to the corresponding code:Output
|
|
|
|
storage variable(s) in the upstream shader invocation(s) that contribute to
|
|
|
|
generating invocation A, and those writes are all visible-to X.
|
|
|
|
|
|
|
|
NOTE: It is not necessary for the upstream shader invocations to have
|
|
|
|
completed execution; they only need to have generated the output that is
|
|
|
|
being read.
|
|
|
|
|
|
|
|
[[memory-model-deallocation]]
|
|
|
|
== Deallocation
|
|
|
|
|
|
|
|
A call to flink:vkFreeMemory must: happen-after all memory operations on all
|
|
|
|
memory locations in that slink:VkDeviceMemory object.
|
|
|
|
|
|
|
|
NOTE: Normally, device memory operations in a given queue are synchronized
|
|
|
|
with flink:vkFreeMemory by having a host thread wait on a fence signalled by
|
|
|
|
that queue, and the wait happens-before the call to flink:vkFreeMemory on
|
|
|
|
the host.
|
2018-09-08 22:52:13 +00:00
|
|
|
|
|
|
|
The deallocation of SPIR-V variables is managed by the system and
|
|
|
|
happens-after all operations on those variables.
|
|
|
|
|
|
|
|
[[memory-model-informative-descriptions]]
== Informative Descriptions

This subsection is non-normative, and offers more easily understandable
consequences of the memory model for app/compiler developers.

Let SC be the storage class(es) specified by a release or acquire operation
or barrier.

  * An atomic write with release semantics must not be reordered against any
    read or write to SC that is program-ordered before it (regardless of the
    storage class the atomic is in).
  * An atomic read with acquire semantics must not be reordered against any
    read or write to SC that is program-ordered after it (regardless of the
    storage class the atomic is in).
  * Any write to SC program-ordered after a release barrier must not be
    reordered against any read or write to SC program-ordered before that
    barrier.
  * Any read from SC program-ordered before an acquire barrier must not be
    reordered against any read or write to SC program-ordered after the
    barrier.
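The four rules above can be illustrated by analogy with C++11 atomics.
The following is a minimal, non-normative sketch and is not Vulkan or SPIR-V
code: it assumes that a plain `int` stands in for data in SC, that
`std::atomic_thread_fence` stands in for a release or acquire barrier, and
that scopes and storage classes (which C++ does not have) can be ignored.
All names in it (`Channel`, `pass_with_atomic_semantics`,
`pass_with_barriers`) are hypothetical.

```cpp
#include <atomic>
#include <cassert>  // used only by the usage example that follows
#include <thread>

// Hypothetical stand-in: `payload` plays the role of data in SC,
// `flag` is the atomic used for synchronization.
struct Channel {
    int payload = 0;
    std::atomic<int> flag{0};
};

// Rules 1 and 2: release/acquire semantics carried by the atomic itself.
int pass_with_atomic_semantics() {
    Channel ch;
    int seen = 0;
    std::thread producer([&ch] {
        ch.payload = 42;  // program-ordered before the release write
        // The release write must not be reordered before the payload write.
        ch.flag.store(1, std::memory_order_release);
    });
    std::thread consumer([&ch, &seen] {
        // The acquire read must not be reordered after the payload read.
        while (ch.flag.load(std::memory_order_acquire) == 0) {
            std::this_thread::yield();
        }
        seen = ch.payload;  // program-ordered after the acquire read
    });
    producer.join();
    consumer.join();
    return seen;
}

// Rules 3 and 4: relaxed atomics ordered by separate barriers (fences).
int pass_with_barriers() {
    Channel ch;
    int seen = 0;
    std::thread producer([&ch] {
        ch.payload = 42;
        // Release barrier: the flag write after it must not be reordered
        // against the payload write before it.
        std::atomic_thread_fence(std::memory_order_release);
        ch.flag.store(1, std::memory_order_relaxed);
    });
    std::thread consumer([&ch, &seen] {
        while (ch.flag.load(std::memory_order_relaxed) == 0) {
            std::this_thread::yield();
        }
        // Acquire barrier: the flag read before it must not be reordered
        // against the payload read after it.
        std::atomic_thread_fence(std::memory_order_acquire);
        seen = ch.payload;
    });
    producer.join();
    consumer.join();
    return seen;
}
```

In both functions the consumer can only observe the flag as set after the
payload write has been ordered before it, so a caller could check, for
example, `assert(pass_with_atomic_semantics() == 42)` and
`assert(pass_with_barriers() == 42)`.
Replacing the release or acquire orderings with relaxed ones and dropping the
fences would turn the payload access into a data race in the C++ model.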
|
|
|
|
|
|
|
|
A control barrier (even if it has no memory semantics) must not be reordered
against any memory barriers.

This memory model allows memory accesses with and without availability and
visibility operations, as well as atomic operations, all to be performed on
the same memory location.
This is critical to allow the model to reason about memory that is reused in
multiple ways, e.g. across the lifetime of different shader invocations or
draw calls.
While GLSL (and legacy SPIR-V) applies the "`coherent`" decoration to
variables (for historical reasons), this model treats each memory access
instruction as having optional implicit availability/visibility operations.
GLSL to SPIR-V compilers should map all (non-atomic) operations on a
coherent variable to Make{Pointer,Texel}{Available}{Visible} flags in this
model.

Atomic operations implicitly have availability/visibility operations, and
the scope of those operations is taken from the atomic operation's scope.
|
|
|
|
|
|
|
|
[[memory-model-tessellation-output-ordering]]
== Tessellation Output Ordering

For SPIR-V that uses the Vulkan Memory Model, the code:OutputMemory storage
class is used to synchronize accesses to tessellation control output
variables.
For legacy SPIR-V that does not enable the Vulkan Memory Model via
code:OpMemoryModel, tessellation outputs can be ordered using a control
barrier with no particular memory scope or semantics, as defined below.

Let X and Y be memory operations performed by shader invocations A~X~ and
A~Y~.
Operation X is _tessellation-output-ordered_ before operation Y if and only
if all of the following are true:

  * There is a dynamic instance of an code:OpControlBarrier instruction C
    such that X is program-ordered before C in A~X~ and C is program-ordered
    before Y in A~Y~.
  * A~X~ and A~Y~ are in the same instance of C's execution scope.

If shader invocations A~X~ and A~Y~ in the code:TessellationControl
execution model execute memory operations X and Y, respectively, on the
code:Output storage class, and X is tessellation-output-ordered before Y
with a scope of code:Workgroup, then X is location-ordered before Y, and if
X is a write and Y is a read then X is visible-to Y.
|
|
|
|
|
|
|
|
ifdef::VK_NV_cooperative_matrix[]

[[memory-model-cooperative-matrix]]
== Cooperative Matrix Memory Access

For each dynamic instance of a cooperative matrix load or store instruction
(code:OpCooperativeMatrixLoadNV or code:OpCooperativeMatrixStoreNV), a
single implementation-dependent invocation within the instance of the
matrix's scope performs a non-atomic load or store (respectively) to each
memory location that is defined to be accessed by the instruction.

endif::VK_NV_cooperative_matrix[]
|