2018-09-16 01:35:16 +00:00
include::meta/VK_NV_shader_image_footprint.txt[]

*Last Modified Date*::
2018-09-13
*IP Status*::
No known IP claims.
*Contributors*::
- Pat Brown, NVIDIA
- Chris Lentini, NVIDIA
- Daniel Koch, NVIDIA
- Jeff Bolz, NVIDIA

This extension adds Vulkan support for the `SPV_NV_shader_image_footprint`
SPIR-V extension.
That SPIR-V extension provides a new instruction
code:OpImageSampleFootprintNV allowing shaders to determine the set of
texels that would be accessed by an equivalent filtered texture lookup.

Instead of returning a filtered texture value, the instruction returns a
structure that can be interpreted by shader code to determine the footprint
of a filtered texture lookup.
This structure includes integer values that identify a small neighborhood of
texels in the image being accessed and a bitfield that indicates which
texels in that neighborhood would be used.
The structure also includes a bitfield where each bit identifies whether any
texel in a small aligned block of texels would be fetched by the texture
lookup.
The size of each block is specified by an access _granularity_ provided by
the shader.
The minimum granularity supported by this extension is 2x2 (for 2D textures)
or 2x2x2 (for 3D textures); the maximum granularity is 256x256 (for 2D
textures) or 64x32x32 (for 3D textures).
Each footprint query returns the footprint from a single texture level.
When using minification filters that combine accesses from multiple mipmap
levels, shaders must perform separate queries for the two levels accessed
("`fine`" and "`coarse`").
The footprint query also returns a flag indicating if the texture lookup
would access texels from only one mipmap level or from two neighboring
levels.

This extension should be useful for multi-pass rendering operations that do
an initial expensive rendering pass to produce a first image that is then
used as a texture for a second pass.
If the second pass ends up accessing only portions of the first image (e.g.,
due to visibility), the work spent rendering the non-accessed portion of the
first image was wasted.
With this feature, an application can limit this waste using an initial pass
over the geometry in the second image that performs a footprint query for
each visible pixel to determine the set of pixels that it needs from the
first image.
This pass would accumulate an aggregate footprint of all visible pixels into
a separate "`footprint image`" using shader atomics.
Then, when rendering the first image, the application can kill all shading
work for pixels not in this aggregate footprint.

This extension has a number of limitations.
The code:OpImageSampleFootprintNV instruction only supports two- and
three-dimensional textures.
Footprint evaluation only supports the CLAMP_TO_EDGE wrap mode; results are
undefined for all other wrap modes.
Only a limited set of granularity values is supported, and that set does
not support separate coverage information for each texel in the original
image.

When using SPIR-V generated from the OpenGL Shading Language, the new
instruction will be generated from code using the new
code:textureFootprint*NV built-in functions from the
`GL_NV_shader_texture_footprint` shading language extension.

=== New Object Types

None.

=== New Enum Constants

* Extending elink:VkStructureType:
** ename:VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_IMAGE_FOOTPRINT_FEATURES_NV

=== New Enums

None.

=== New Structures

* slink:VkPhysicalDeviceShaderImageFootprintFeaturesNV

=== New Functions

None.

=== New SPIR-V Capability

* <<spirvenv-capabilities-table-imagefootprint,ImageFootprintNV>>

=== Issues

(1) The footprint returned by the SPIR-V instruction is a structure that
includes an anchor, an offset, and a mask that represents an 8x8 or 4x4x4
neighborhood of texel groups.
But the bits of the mask are not stored in simple pitch order.
Why is the footprint built this way?

*RESOLVED*: We expect that applications using this feature will want to use
a fixed granularity and accumulate coverage information from the returned
footprints into an aggregate "`footprint image`" that tracks the portions of
an image that would be needed by regular texture filtering.
If an application is using a two-dimensional image with 4x4 pixel
granularity, we expect that the footprint image will use 64-bit texels where
each bit in an 8x8 array of bits corresponds to coverage for a 4x4 block in
the original image.
Texel (0,0) in the footprint image would correspond to texels (0,0) through
(31,31) in the original image.

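As a rough illustration of this mapping, the following host-side sketch computes which footprint image texel, and which bit within it, covers a given texel of the original image.
It assumes a 2D image, 4x4 granularity, and bits numbered in row-major order within each 8x8 array (bit 0 in the top left block, matching the numbering used by the diagram in this issue).
The names `FootprintLocation` and `locateFootprintBit` are hypothetical and are not part of the extension.

[source,c++]
----
#include <cassert>
#include <cstdint>

// Hypothetical helper type: location of one coverage bit in the footprint image.
struct FootprintLocation {
    uint32_t texelX;  // x coordinate of the 64-bit footprint image texel
    uint32_t texelY;  // y coordinate of the 64-bit footprint image texel
    uint32_t bit;     // bit index (0..63) within that footprint texel
};

// Maps texel (x, y) of the original image to its coverage bit.
// Assumption: 4x4 granularity and row-major bit numbering within each 8x8 array.
inline FootprintLocation locateFootprintBit(uint32_t x, uint32_t y)
{
    const uint32_t kGranularity = 4;                      // each bit covers a 4x4 block
    const uint32_t kBitsPerRow  = 8;                      // 8x8 bits per footprint texel
    const uint32_t kRegion      = kGranularity * kBitsPerRow;  // 32x32 texels per footprint texel

    FootprintLocation loc;
    loc.texelX = x / kRegion;
    loc.texelY = y / kRegion;

    // Position of the 4x4 block inside its 32x32 aligned region.
    const uint32_t blockX = (x % kRegion) / kGranularity;
    const uint32_t blockY = (y % kRegion) / kGranularity;
    loc.bit = blockY * kBitsPerRow + blockX;

    assert(loc.bit < 64);
    return loc;
}
----

Under these assumptions, texel (31,31) of the original image maps to bit 63 of footprint texel (0,0), and texel (32,0) maps to bit 0 of footprint texel (1,0).
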
In the usual case, the footprint for a single access will be fully contained
in a 32x32 aligned region of the original texture, which corresponds to a
single 64-bit texel in the footprint image.
In that case, the implementation will return an anchor coordinate pointing
at the single footprint image texel, an offset vector of (0,0), and a mask
whose bits are aligned with the bits in the footprint texel.
For this case, the shader can simply atomically OR the mask bits into the
contents of the footprint texel to accumulate footprint coverage.

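The single-texel case can be sketched in host-side C++ as follows.
This is only a CPU analogue of the shader logic: `FootprintImage` and `accumulateAligned` are hypothetical names, and `std::atomic::fetch_or` stands in for whatever 64-bit atomic OR the shader would use on the footprint image.

[source,c++]
----
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU stand-in for a footprint image with 64-bit texels.
struct FootprintImage {
    uint32_t width;
    uint32_t height;
    std::vector<std::atomic<uint64_t>> texels;

    FootprintImage(uint32_t w, uint32_t h)
        : width(w), height(h), texels(static_cast<size_t>(w) * h)
    {
        // Start with no coverage recorded.
        for (auto& t : texels) t.store(0, std::memory_order_relaxed);
    }
};

// Usual case (offset == (0,0)): the mask is already aligned with the
// footprint texel at the anchor, so one atomic OR accumulates it.
inline void accumulateAligned(FootprintImage& image,
                              uint32_t anchorX, uint32_t anchorY,
                              uint64_t mask)
{
    assert(anchorX < image.width && anchorY < image.height);
    const size_t index = static_cast<size_t>(anchorY) * image.width + anchorX;
    image.texels[index].fetch_or(mask, std::memory_order_relaxed);
}
----

Because OR is commutative and idempotent, the order in which invocations contribute their masks does not affect the accumulated coverage.
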
In the worst case, the footprint for a single access spans multiple 32x32
aligned regions and may require updates to four separate footprint image
texels.
In this case, the implementation will return an anchor coordinate pointing
at the lower right footprint image texel and an offset that identifies how
many "`columns`" and "`rows`" of the returned 8x8 mask correspond to
footprint texels to the left and above the anchor texel.
If the offset is (2,3), the 64 bits of the returned mask are arranged
spatially as follows, where each 4x4 block is assigned a bit number that
matches its bit number in the footprint image texels:

----
+-------------------------+-------------------------+
| -- -- -- -- -- -- -- -- | -- -- -- -- -- -- -- -- |
| -- -- -- -- -- -- -- -- | -- -- -- -- -- -- -- -- |
| -- -- -- -- -- -- -- -- | -- -- -- -- -- -- -- -- |
| -- -- -- -- -- -- -- -- | -- -- -- -- -- -- -- -- |
| -- -- -- -- -- -- -- -- | -- -- -- -- -- -- -- -- |
| -- -- -- -- -- -- 46 47 | 40 41 42 43 44 45 -- -- |
| -- -- -- -- -- -- 54 55 | 48 49 50 51 52 53 -- -- |
| -- -- -- -- -- -- 62 63 | 56 57 58 59 60 61 -- -- |
+-------------------------+-------------------------+
| -- -- -- -- -- -- 06 07 | 00 01 02 03 04 05 -- -- |
| -- -- -- -- -- -- 14 15 | 08 09 10 11 12 13 -- -- |
| -- -- -- -- -- -- 22 23 | 16 17 18 19 20 21 -- -- |
| -- -- -- -- -- -- 30 31 | 24 25 26 27 28 29 -- -- |
| -- -- -- -- -- -- 38 39 | 32 33 34 35 36 37 -- -- |
| -- -- -- -- -- -- -- -- | -- -- -- -- -- -- -- -- |
| -- -- -- -- -- -- -- -- | -- -- -- -- -- -- -- -- |
| -- -- -- -- -- -- -- -- | -- -- -- -- -- -- -- -- |
+-------------------------+-------------------------+
----

To accumulate coverage for each of the four footprint image texels, a shader
can AND the returned mask with simple masks derived from the x and y offset
values and then atomically OR the updated mask bits into the contents of the
corresponding footprint texel.

[source,c++]
----
// Combine the two 32-bit mask components into one 64-bit mask.
uint64_t returnedMask = (uint64_t(footprint.mask.x) | (uint64_t(footprint.mask.y) << 32));
// Low (8 - offset.x) bits of every 8-bit row: columns belonging to the right-hand texels.
uint64_t rightMask = ((0xFF >> footprint.offset.x) * 0x0101010101010101UL);
// Low (8 - offset.y) rows: rows belonging to the bottom texels.
uint64_t bottomMask = 0xFFFFFFFFFFFFFFFFUL >> (8 * footprint.offset.y);
uint64_t bottomRight = returnedMask & bottomMask & rightMask;
uint64_t bottomLeft = returnedMask & bottomMask & (~rightMask);
uint64_t topRight = returnedMask & (~bottomMask) & rightMask;
uint64_t topLeft = returnedMask & (~bottomMask) & (~rightMask);
----

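A self-contained variant of this computation can be checked on the host.
The sketch below uses hypothetical types (`Footprint2D`, `SplitMasks`) in place of the shader's footprint structure, and uses `ULL` literals so that the constants are 64 bits wide on all common C++ platforms.
It assumes that the left and upper neighbors of the anchor texel are at (anchor.x - 1) and (anchor.y - 1), respectively.

[source,c++]
----
#include <cassert>
#include <cstdint>

// Hypothetical host-side mirror of the values returned by a 2D footprint query.
struct Footprint2D {
    uint32_t anchorX, anchorY;  // footprint image texel at the lower right
    uint32_t offsetX, offsetY;  // columns/rows belonging to left/upper texels
    uint32_t maskLo, maskHi;    // low and high 32 bits of the 8x8 mask
};

// Partial masks, each already aligned with its destination footprint texel.
struct SplitMasks {
    uint64_t bottomRight;  // for texel (anchorX,     anchorY)
    uint64_t bottomLeft;   // for texel (anchorX - 1, anchorY)      (assumed)
    uint64_t topRight;     // for texel (anchorX,     anchorY - 1)  (assumed)
    uint64_t topLeft;      // for texel (anchorX - 1, anchorY - 1)  (assumed)
};

inline SplitMasks splitFootprintMask(const Footprint2D& fp)
{
    assert(fp.offsetX < 8 && fp.offsetY < 8);

    const uint64_t returned = uint64_t(fp.maskLo) | (uint64_t(fp.maskHi) << 32);
    // Replicate the per-row column mask into all eight rows.
    const uint64_t rightMask  = uint64_t(0xFFu >> fp.offsetX) * 0x0101010101010101ULL;
    // Keep the rows that are not above the anchor texel.
    const uint64_t bottomMask = 0xFFFFFFFFFFFFFFFFULL >> (8 * fp.offsetY);

    SplitMasks out;
    out.bottomRight = returned &  bottomMask &  rightMask;
    out.bottomLeft  = returned &  bottomMask & ~rightMask;
    out.topRight    = returned & ~bottomMask &  rightMask;
    out.topLeft     = returned & ~bottomMask & ~rightMask;
    return out;
}
----

With an offset of (2,3) and every mask bit set, the bottom right mask holds bits 0-5 of rows 0-4 and the top left mask holds bits 6-7 of rows 5-7, which matches the quadrants of the diagram above.
No shifting is needed before the atomic OR because the returned bits already sit at their destination bit positions.
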
(2) What should an application do to ensure maximum performance when
accumulating footprints into an aggregate footprint image?

*RESOLVED*: We expect that the most common usage of this feature will be to
accumulate aggregate footprint coverage, as described in the previous issue.
Even if you ignore the anisotropic filtering case where the implementation
may return a granularity larger than that requested by the caller, each
shader invocation will need to use atomic functions to update up to four
footprint image texels for each level of detail accessed.
Having each active shader invocation perform multiple atomic operations can
be expensive, particularly when neighboring invocations will want to update
the same footprint image texels.

Techniques that can be used to reduce the number of atomic operations
performed when accumulating coverage include:

* Have logic that detects returned footprints where all components of the
returned offset vector are zero.
In that case, the mask returned by the footprint function is guaranteed
to be aligned with the footprint image texels and affects only a single
footprint image texel.
* Have fragment shaders communicate using built-in functions from the
`VK_NV_shader_subgroup_partitioned` extension or other shader subgroup
extensions.
If you have multiple invocations in a subgroup that need to update the
same texel (x,y) in the footprint image, compute an aggregate footprint
mask across all invocations in the subgroup updating that texel and have
a single invocation perform an atomic operation using that aggregate
mask.
* When the returned footprint spans multiple texels in the footprint
image, each invocation needs to perform four atomic operations.
In the previous issue, we had an example that computed separate masks
for "`topLeft`", "`topRight`", "`bottomLeft`", and "`bottomRight`".
When the invocations in a subgroup have good locality, it might be the
case that the "`top left`" texel for some invocations refers to footprint
image texel (10,10), while neighbors might have their "`top left`"
texels at (11,10), (10,11), and (11,11).
If you compute separate masks for even/odd x and y values instead of
left/right or top/bottom, the "`odd/odd`" mask for all invocations in
the subgroup holds coverage for footprint image texel (11,11), which can
be updated by a single atomic operation for the entire subgroup.

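The even/odd technique from the last item can be simulated on the host to see why it helps.
The sketch below is not shader code: a `std::vector` of invocations stands in for a subgroup, a plain loop stands in for the subgroup OR reduction, and all type and function names are hypothetical.
It also assumes that the left and upper neighbors of the anchor texel are at (anchor.x - 1) and (anchor.y - 1).

[source,c++]
----
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One pending update: OR `mask` into footprint image texel (x, y).
struct TexelUpdate {
    uint32_t x, y;
    uint64_t mask;
};

// Hypothetical per-invocation result: anchor plus the four split masks.
struct InvocationFootprint {
    uint32_t anchorX, anchorY;  // assumed >= 1 so that neighbors exist
    uint64_t bottomRight, bottomLeft, topRight, topLeft;
};

// Slot index from coordinate parity: bit 0 set if x is odd, bit 1 if y is odd.
inline uint32_t paritySlot(uint32_t x, uint32_t y)
{
    return (x & 1u) | ((y & 1u) << 1);
}

// Re-sorts one invocation's four masks by the parity of their destination texel.
inline std::array<TexelUpdate, 4> toParitySlots(const InvocationFootprint& f)
{
    assert(f.anchorX >= 1 && f.anchorY >= 1);
    const TexelUpdate quads[4] = {
        {f.anchorX,     f.anchorY,     f.bottomRight},
        {f.anchorX - 1, f.anchorY,     f.bottomLeft},
        {f.anchorX,     f.anchorY - 1, f.topRight},
        {f.anchorX - 1, f.anchorY - 1, f.topLeft},
    };
    // The four texels differ in x and/or y by one, so each parity slot is hit once.
    std::array<TexelUpdate, 4> slots{};
    for (const TexelUpdate& q : quads) slots[paritySlot(q.x, q.y)] = q;
    return slots;
}

// Simulated subgroup reduction: within each parity slot, masks aimed at the
// same texel are ORed together, leaving one update (one atomic) per texel.
inline std::vector<TexelUpdate> reduceSubgroup(const std::vector<InvocationFootprint>& subgroup)
{
    std::vector<TexelUpdate> updates;
    for (uint32_t slot = 0; slot < 4; ++slot) {
        const size_t first = updates.size();
        for (const InvocationFootprint& f : subgroup) {
            const TexelUpdate u = toParitySlots(f)[slot];
            if (u.mask == 0) continue;  // nothing to contribute
            bool merged = false;
            for (size_t i = first; i < updates.size() && !merged; ++i) {
                if (updates[i].x == u.x && updates[i].y == u.y) {
                    updates[i].mask |= u.mask;
                    merged = true;
                }
            }
            if (!merged) updates.push_back(u);
        }
    }
    return updates;
}
----

For four invocations whose "`top left`" texels are (10,10), (11,10), (10,11), and (11,11), the anchors are (11,11), (12,11), (11,12), and (12,12), and every invocation's odd/odd slot targets texel (11,11), so their contributions to that texel collapse into a single update.
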
=== Examples

TBD

=== Version History

* Revision 2, 2018-09-13 (Pat Brown)
- Add issue (2) with performance tips.

* Revision 1, 2018-08-12 (Pat Brown)
- Initial draft