6980 Commits

Author SHA1 Message Date
Lynne
cf0ce1b7e4 hwcontext_vulkan: enable the VK_EXT_shader_replicated_composites extension
It's required to use certain features in SPIR-V that glslang uses.
2026-01-12 17:28:41 +01:00
Lynne
e844b43776 vulkan: support shader compression 2026-01-12 17:28:41 +01:00
Lynne
7ce22b085e cuda/load_helper: move zlib decompression into a separate file
Allows it to be reused for Vulkan
2026-01-12 17:28:41 +01:00
Lynne
540c4df5c7 vulkan: add support for precompiled shaders 2026-01-12 17:28:40 +01:00
Lynne
40edf7d75d vulkan: switch to static allocation for temporary descriptor data
Simplifies management, and the hardware is limited to 4 descriptor sets
and whatever bindings.
2026-01-12 17:28:35 +01:00
Cosmin Stejerean
3474ec01e7 avutil/dovi_meta - fix L11 dovi metadata definition
deprecate the incorrect fields in AVDOVIDmLevel11 and schedule them
for removal
2026-01-07 13:14:11 +00:00
Zhao Zhili
66f7e9db71 avutil/crc: use arm64 crc32 instruction
On rpi5 A76
crc_32_ieee_le_c:                       23146.3 ( 1.00x)
crc_32_ieee_le_crc:                      1060.1 (21.83x)

On RK3566 A55
crc_32_ieee_le_c:                       28773.8 ( 1.00x)
crc_32_ieee_le_crc:                      2602.4 (11.06x)

Co-authored-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-01-04 15:49:30 +01:00
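For reference, the checksum these CRC commits accelerate can be sketched as a portable bit-at-a-time loop. This is not FFmpeg's implementation: the name `crc32_ieee_le_ref` is hypothetical, and the initial and final inversion follow the conventional CRC-32 definition, which is an assumption since the commit message does not state which conventions the benchmarked routines use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference routine, not part of libavutil: reflected
 * CRC-32 (IEEE 802.3), polynomial 0xEDB88320, one bit per step. */
static uint32_t crc32_ieee_le_ref(uint32_t crc, const uint8_t *buf, size_t len)
{
    assert(buf || !len);
    crc = ~crc;                           /* conventional initial inversion */
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) /* conditionally xor in the polynomial */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;                          /* conventional final inversion */
}
```

A hardware CRC instruction or a carry-less multiply routine replaces the inner loop with far fewer operations per byte, which is where speedups of the kind quoted above come from.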
Zhao Zhili
0645c48453 avutil/cpu: add CPU feature flag for arm crc32
Co-Authored-by: Martin Storsjö <martin@martin.st>
2026-01-04 15:49:30 +01:00
Martin Storsjö
3dcbcce80c configure: Check for the AArch64 CRC extension
Name the feature "arm_crc" rather than plain "crc", to make it
clear that this is about a CPU feature extension, not CRC
implementations in general.

This requires dealing with the extension slightly differently
than other extensions, as the name of the feature and the
".arch_extension" extension name differ.

Naming it with an "arm" prefix rather than "aarch64", as the
CPU extension also is available in 32 bit ARM form, even though
we don't intend to use it there.
2026-01-04 15:49:30 +01:00
Andreas Rheinhardt
dc03cffe9c avutil/crc: Use x86 clmul for CRC when available
Observed near 10x speedup on AMD Zen4 7950x:
av_crc_c:                                            22057.0 ( 1.00x)
av_crc_clmul:                                         2202.8 (10.01x)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-01-04 15:49:30 +01:00
Shreesh Adiga
1b6571c765 avutil/crc: add x86 SSE4.2 clmul SIMD implementation for av_crc
Implemented the algorithm described in the paper titled
"Fast CRC Computation for Generic Polynomials Using PCLMULQDQ Instruction"
by Intel.
It is not used yet; the integration will be added in a separate commit.

Observed near 10x speedup on AMD Zen4 7950x:
av_crc_c:                                            22057.0 ( 1.00x)
av_crc_clmul:                                         2202.8 (10.01x)
2026-01-04 15:49:30 +01:00
Andreas Rheinhardt
52190efade avutil/crc: Don't assert AVCRCId to be valid
This function is supposed to return NULL on failure.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-01-04 15:49:30 +01:00
Shreesh Adiga
e382772e4a avutil/cpu: add x86 CPU feature flag for clmul
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-01-04 15:49:30 +01:00
Russell Greene
38e89fe502 hwcontext_vulkan: add support for implicit DRM sync for export
When a frame is exported to DRM, it may be written to or read from in an asynchronous fashion. Make sure, on unmap of a Vulkan frame that was mapped to DRM, to import any fences that were put on the dmabuf.
2025-12-31 15:53:20 +00:00
Lynne
c3530d9a70 vulkan: remove FFVkBuffer.stage and access
Keeping global state for every buffer is unnecessary and possibly
suboptimal.
2025-12-31 15:00:47 +01:00
Lynne
95baff9b61 vulkan: add ff_vk_buf_barrier()
This is a shorthand way of writing buffer barrier structures.
2025-12-31 15:00:46 +01:00
Lynne
b7d2469e4c vulkan_functions: add vkCmdDispatchBase
It's useful for multi-stage operations.
2025-12-31 15:00:46 +01:00
Lynne
9f3a04d2f6 vulkan: use HOST_CACHED memory flag only if such a heap exists
NVK does not offer such, so our code failed to allocate memory.
2025-12-31 15:00:46 +01:00
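The HOST_CACHED fallback described in 9f3a04d2f6 can be sketched as follows. This is a minimal illustration, not the code from the commit: `drop_unavailable_cached` and the `type_flags` array are hypothetical stand-ins for a scan over the device's reported memory type properties. Only the numeric value of the cached bit is taken from the Vulkan headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same numeric value as VK_MEMORY_PROPERTY_HOST_CACHED_BIT. */
#define PROP_HOST_CACHED 0x00000008u

/* Hypothetical helper: type_flags[] stands in for the property flags of
 * each memory type the device reports. */
static uint32_t drop_unavailable_cached(uint32_t requested,
                                        const uint32_t *type_flags,
                                        size_t nb_types)
{
    assert(type_flags || !nb_types);
    if (!(requested & PROP_HOST_CACHED))
        return requested;               /* nothing to filter */
    for (size_t i = 0; i < nb_types; i++)
        if (type_flags[i] & PROP_HOST_CACHED)
            return requested;           /* a cached memory type exists */
    return requested & ~PROP_HOST_CACHED;
}
```

Dropping the flag up front lets the later allocation search succeed on drivers that expose no cached host memory, instead of failing outright.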
Lynne
bb30126349 hwcontext_vulkan: enable the explicit shader workgroup extension
It's useful as it allows us to alias memory in shaders.
2025-12-31 15:00:45 +01:00
Lynne
d70c6cb511 hwcontext_vulkan: enable subgroup extended types
We were already using them, but had forgotten to enable them.
2025-12-31 15:00:12 +01:00
Kacper Michajłow
eea648ef1d avutil/hwcontext_vulkan: fix logic error when checking for encode support
Both FF_VK_EXT_VIDEO_ENCODE_QUEUE and FF_VK_EXT_VIDEO_MAINTENANCE_1 are
required, not only one of them.

Found by VVL.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2025-12-31 10:30:36 +00:00
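The class of bug fixed in eea648ef1d is easy to reproduce with plain bitmasks. The flag values below are hypothetical placeholders (the real FF_VK_EXT_* definitions live in FFmpeg's Vulkan headers), and the helper names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define EXT_VIDEO_ENCODE_QUEUE  (1ULL << 0)
#define EXT_VIDEO_MAINTENANCE_1 (1ULL << 1)

static_assert((EXT_VIDEO_ENCODE_QUEUE & EXT_VIDEO_MAINTENANCE_1) == 0,
              "flags must occupy distinct bits");

/* True if at least one bit of mask is set: too weak for "both required". */
static int has_any(uint64_t exts, uint64_t mask)
{
    return (exts & mask) != 0;
}

/* True only if every bit of mask is set: the check "both required" needs. */
static int has_all(uint64_t exts, uint64_t mask)
{
    return (exts & mask) == mask;
}
```

With only one of the two extensions enabled, `has_any` reports support while `has_all` correctly rejects it.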
James Almer
d7ee7ac20f avutil/iamf: remove default value from demixing_matrix_def
It's not required since the previous commit, and fixes memleaks introduced by
a6e5fa3fbb.

Signed-off-by: James Almer <jamrial@gmail.com>
2025-12-30 20:15:32 -03:00
James Almer
c6e7243f11 avutil/opt: fix av_opt_is_set_to_default() for array options with no default value
If AVOptionArrayDef.def is NULL, av_opt_is_set_to_default() should return true
when the field in the object is NULL.

Signed-off-by: James Almer <jamrial@gmail.com>
2025-12-30 20:14:56 -03:00
Benjamin Cheng
2298978b47 hwcontext_vulkan: Remove unnecessary validation filters
With the latest changes, I can no longer get these to show up.
2025-12-30 14:39:19 -05:00
Benjamin Cheng
e41e21a1ec avutil/hwcontext_vulkan: Limit usages for lone DPBs
Lone DPBs will only be requested when the implementation requires
a separate DPB. These images are never given out, and are never used for
anything other than a DPB. Also most implementations requiring these
won't support any other usages on DPBs, so limiting the usages would fix
some validation errors.
2025-12-30 14:15:43 -05:00
James Almer
a6e5fa3fbb avutil/iamf: add an AVOption for AVIAMFLayer.demixing_matrix
Plus a length field, to fulfill the requirements of AV_OPT_TYPE_FLAG_ARRAY options.

Signed-off-by: James Almer <jamrial@gmail.com>
2025-12-29 11:59:36 -03:00
Zhao Zhili
09d8300779 avutil/imgutils: use AV_CEIL_RSHIFT 2025-12-23 03:39:16 +00:00
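The commit above has no body, so as background: a ceiling right shift rounds up rather than down, which is what is typically wanted when deriving subsampled plane dimensions from odd sizes. The helper below is a hypothetical sketch of that arithmetic, not the definition of `AV_CEIL_RSHIFT`, and it assumes non-negative inputs.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: divide a by 2^b, rounding up. */
static int ceil_rshift_sketch(int a, int b)
{
    assert(b >= 0 && b < 31);
    assert(a >= 0 && a <= INT_MAX - ((1 << b) - 1)); /* avoid overflow */
    return (a + (1 << b) - 1) >> b;
}
```

For a 1921 pixel wide image with horizontally halved chroma, this yields 961 chroma samples per row, where a plain shift would give 960 and lose the last column.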
Lynne
6656b11a7b hwcontext_vulkan: disable host transfers on MoltenVK
https://github.com/KhronosGroup/MoltenVK/issues/2618
2025-12-22 19:46:27 +01:00
Lynne
d33bf66519 vulkan_functions: add vkCmdUpdateBuffer
It embeds the data to be uploaded directly into the command buffer,
which makes it useful for uploading transient data.
2025-12-22 19:46:27 +01:00
Rémi Denis-Courmont
65018b3e83 lavu/float_dsp: fix R-V V scalarproduct_double with ILP32 ABI 2025-12-22 18:55:16 +02:00
Rémi Denis-Courmont
435623cbda lavu/float_dsp: fix R-V V vector_dmul_scalar with ILP32 ABI 2025-12-22 18:55:16 +02:00
Rémi Denis-Courmont
56d933b0a7 lavu/float_dsp: fix R-V V vector_dmac_scalar with ILP32 ABI 2025-12-22 18:55:16 +02:00
Rémi Denis-Courmont
a583639bf0 lavu/fixed_dsp: fix scalarproduct on riscv32
On riscv32, the result must be narrowed from 63 to 32 bit before being
moved to the scalar side.
2025-12-22 18:55:13 +02:00
James Almer
5fac8addc5 avutil/log: use atomics to load and store logging level, flags and callback pointer
Based on code setting cpu flags in libavutil/cpu.c

Signed-off-by: James Almer <jamrial@gmail.com>
2025-12-13 21:33:11 +00:00
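The pattern described in 5fac8addc5 can be sketched with C11 atomics. All names below are hypothetical and this is not the code from libavutil/log.c. The relaxed memory ordering is an assumption chosen for illustration, and the initial level of 32 is an arbitrary placeholder.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef void (*log_cb_sketch)(int level, const char *msg);

static void default_cb_sketch(int level, const char *msg)
{
    (void)level;
    (void)msg;
}

/* Global logging state, accessed only through atomic loads and stores. */
static atomic_int             level_sketch = 32;
static _Atomic(log_cb_sketch) cb_sketch    = default_cb_sketch;

static int get_level_sketch(void)
{
    return atomic_load_explicit(&level_sketch, memory_order_relaxed);
}

static void set_level_sketch(int level)
{
    atomic_store_explicit(&level_sketch, level, memory_order_relaxed);
}

static log_cb_sketch get_cb_sketch(void)
{
    return atomic_load_explicit(&cb_sketch, memory_order_relaxed);
}

static void set_cb_sketch(log_cb_sketch cb)
{
    assert(cb != NULL); /* a NULL callback would crash on the next log call */
    atomic_store_explicit(&cb_sketch, cb, memory_order_relaxed);
}
```

Atomic accessors make concurrent reads and writes of these globals well defined, so one thread can change the level while others are logging without a data race.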
Zhao Zhili
1e2d86201f Revert "avutil/tx_template: extend to 2M"
This reverts commits 8f48a62, 9af8782, and bd3e71b.

Commit 8f48a62 extends tx to 2M, resulting in the tx_float bss
section reaching a size of 4M.

This isn't an issue on devices with normal memory sizes and an OS
supporting virtual memory. But it's a real issue for embedded devices
with a realtime OS, which may not support virtual memory, e.g., Nuttx.
This 4M bss section maps directly to physical memory, which is a
scarce resource on embedded devices.
2025-12-13 15:14:38 +00:00
James Almer
44862a9d68 avutil/aarch64/cpu: fix check for SME on Linux
SME is an AT_HWCAP2 entry, not AT_HWCAP.

Signed-off-by: James Almer <jamrial@gmail.com>
2025-12-10 21:51:11 -03:00
Georgii Zagoruiko
cdb14bc74d configure: add detection of assembler support for SME
All changes are made during development/testing of SVE/SME for ffmpeg (vvc). Tested on Apple M4
2025-12-09 21:38:38 +00:00
Cameron Gutman
212eb8413a hwcontext_vulkan: add APIs to get optional extensions
These provide a way for apps that initialize Vulkan themselves to know
which extensions we may be able to use without having to hardcode it.

Signed-off-by: Cameron Gutman <aicommander@gmail.com>
2025-12-08 23:22:31 +00:00
Kacper Michajłow
6083c9bb8c avutil/hwcontext_vaapi: mark try_all with av_unused to suppress warning
Fixes: warning: variable 'try_all' set but not used [-Wunused-but-set-variable]
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2025-12-08 21:31:13 +00:00
Kacper Michajłow
9ed71a837b avutil/vulkan: fix device memory size truncation
size_t cannot fit VK_WHOLE_SIZE on 32-bit builds.

Fixes: warning: conversion from 'long long unsigned int' to 'size_t' {aka 'unsigned int'} changes value from '18446744073709551615' to '4294967295'

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2025-12-03 23:45:44 +00:00
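The truncation reported in 9ed71a837b can be shown without any Vulkan headers. `WHOLE_SIZE_SKETCH` is a stand-in with the same value as `VK_WHOLE_SIZE` (`~0ULL`), and `uint32_t` models a 32-bit `size_t`. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in with the same value as VK_WHOLE_SIZE (~0ULL). */
#define WHOLE_SIZE_SKETCH UINT64_MAX

static_assert(WHOLE_SIZE_SKETCH > UINT32_MAX,
              "the sentinel does not fit in a 32-bit size type");

/* Models storing a 64-bit device size in a 32-bit size_t. */
static uint32_t store_in_32bit_size(uint64_t device_size)
{
    return (uint32_t)device_size; /* the upper 32 bits are silently dropped */
}

/* Keeping the full 64-bit width preserves the sentinel. */
static int is_whole_size(uint64_t device_size)
{
    return device_size == WHOLE_SIZE_SKETCH;
}
```

After truncation the value is 4294967295, matching the compiler warning quoted above, and it no longer compares equal to the 64-bit sentinel.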
Lynne
a8e8daa276 hwcontext_vulkan: fix final error to let old header files work
........
2025-12-03 22:34:32 +01:00
Lynne
bce14bb160 hwcontext_vulkan: fix compilation with older header versions 2025-12-03 21:22:54 +01:00
Andreas Rheinhardt
5d9270df7f libavutil/internal: Remove {SIZE,PTRDIFF}_SPECIFIER
Possible since 222127418b.

Reviewed-by: Kacper Michajłow <kasper93@gmail.com>
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-12-03 11:52:54 +01:00
averne
1e90047fe6 vulkan: fix host copy stride
memoryRowLength is in texels, not bytes
2025-12-01 15:40:40 +01:00
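The unit mismatch behind 1e90047fe6 amounts to one division. The helper below is a hypothetical sketch, not the code from the commit. It assumes the byte stride is an exact multiple of the texel size, which may not hold for every format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert a row stride in bytes into the texel count
 * that a field such as memoryRowLength expects. */
static uint32_t row_length_in_texels(size_t linesize_bytes, size_t bytes_per_texel)
{
    assert(bytes_per_texel > 0);
    assert(linesize_bytes % bytes_per_texel == 0); /* assumed exact multiple */
    return (uint32_t)(linesize_bytes / bytes_per_texel);
}
```

Passing the byte stride directly would overstate the row length by a factor of the texel size, for example 7680 instead of 1920 for a row of 4-byte texels.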
llyyr
7043522fe0 avutil/hwcontext_d3d12va: use hwdev context for logging
This fixes a warning about av_log being called with a NULL AVClass, which
is also an API violation.

Fixes: https://trac.ffmpeg.org/ticket/11335
2025-12-01 03:15:25 +00:00
Lynne
932a872dbc hwcontext_vulkan: fix VkImageToMemoryCopyEXT.sType
It was copy pasted from the upload path.
Somehow, it was missed, despite god knows how many validation layer runs.
2025-11-30 23:11:46 +01:00
Russell Greene
3beaa2d70f hwcontext_vulkan: remove VK_HOST_IMAGE_COPY_MEMCPY flag
According to the spec, this flag copies the data verbatim, including any swizzling/tiling. This has two issues:

1. the format may not be what ffmpeg expects elsewhere, as it is expecting normal pitch linear host memory in `swf`
2. the size of the copied data may not match the size of buffer provided, causing heap buffer overflow

It seems like addition of this flag is an oversight as it seems to be for caching/backups of image data, just to be used with copying back to the GPU with the MEMCPY flag, which is *not* how it's used in ffmpeg.

Additionally, set memoryRowLength, because if it isn't set, it assumes pitch = width_in_bytes, which I don't think is necessarily the case
2025-11-30 21:47:12 +00:00
Andreas Rheinhardt
59d75bf9e4 avutil/x86/Makefile: Only compile ASM init files when X86ASM is enabled
To do so, simply add these init files to X86ASM-OBJS instead of OBJS
in the Makefile. The former is already used for the actual assembly
files, but using them for the C init files just works, because the build
system uses file extensions to derive whether it is a C or a NASM file.

This avoids compiling unused function stubs and also reduces our
reliance on DCE: We don't add %if checks to the asm files except
for AVX, AVX2, FMA3, FMA4, XOP and AVX512, so all the MMX-SSE4
functions will be available. It also allows removing HAVE_X86ASM checks
in these init files.

Reviewed-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-30 22:20:13 +01:00
Andreas Rheinhardt
0ec9c1b68d avutil/x86/x86inc: Use parentheses in has_epilogue
Prevents surprises.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-11-30 00:15:43 +01:00
Diego de Souza
9c76d7db86 avutil/hwcontext_cuda: Expands pixel formats support
Add support for additional pixel formats in CUDA hardware context:
- Planar formats (yuv420p10, yuv422p, yuv422p10, yuv444p10)
- Semiplanar formats (nv16, p210, p216)

Signed-off-by: Diego de Souza <ddesouza@nvidia.com>
2025-11-27 22:11:57 +01:00