Prior to this, the results were not saturated into the uchar/ushort range before
being written. The characteristics of the Lanczos filter exposed this issue.
In addition, the results were truncated rather than rounded, which resulted
in checkerboard artifacts in solid color areas that were noticeable when
using Lanczos with 8-bit input.
Example:
ffmpeg -init_hw_device cuda -f lavfi -i testsrc2=s=960x540,format=yuv420p \
-vf hwupload,scale_cuda=format=yuv420p:w=-2:h=720:interp_algo=lanczos \
-c:v h264_nvenc -qp:v 20 -t 1 <OUTPUT>
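
The difference between the old and new behavior can be sketched in plain C.
This is an illustration only, not the actual CUDA kernel code: the helper
names sat_round_u8() and trunc_u8() are hypothetical. Lanczos kernels have
negative lobes, so a filtered sample can land slightly outside [0, 255].

```c
#include <stdint.h>

/* Hypothetical helper: round to nearest and saturate into the 8-bit range. */
static uint8_t sat_round_u8(float v)
{
    if (!(v > 0.0f))      /* negative values and NaN clamp to 0 */
        return 0;
    if (v > 255.0f)       /* overshoot clamps to 255 */
        return 255;
    return (uint8_t)(v + 0.5f);   /* v is in [0, 255], so this rounds safely */
}

/* Old-style conversion for contrast: truncates and wraps when out of range. */
static uint8_t trunc_u8(float v)
{
    return (uint8_t)(int)v;
}
```

With these helpers, an overshoot such as 260.4 saturates to 255 instead of
wrapping to a small value, and 127.6 rounds to 128 instead of truncating to
127.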
Fix #20784
Signed-off-by: nyanmisaka <nst799610810@gmail.com>
The swscale internals currently have a quirk which causes the memcpy
backend to be called when the pixfmts match. Obviously, this doesn't do
what is expected, as hardware frames cannot just be copied.
Check for this.
Sponsored-by: Sovereign Tech Fund
swscale gets runtime-defined assembly once again!
This commit splits the Vulkan backend into two, SPIR-V and GLSL,
enabling falling back onto the GLSL implementation if an instruction
is unavailable, or simply for testing.
Sponsored-by: Sovereign Tech Fund
This commit adds a SPIR-V assembler header file. It was partially generated
from the SPIR-V header file JSON definition, then edited by hand to template
and reduce its size as much as possible.
It only implements the essentials required for SPIR-V assembly that swscale
requires.
Sponsored-by: Sovereign Tech Fund
Uniform buffers are much simpler to index, and require no work from
the driver compiler to optimize.
In SPIR-V, large 2D shader constants can be spilled into scratch memory,
since you need to create a function variable to index them during runtime.
Sponsored-by: Sovereign Tech Fund
The issue is that very often, hardware has limited support for BGRA
formats.
As this is a limitation of Vulkan itself, we cannot work around this
in a compatible way.
Sponsored-by: Sovereign Tech Fund
FFmpeg has had an issue with GLSL compilation libraries since they
were first merged 6 years ago. The libraries don't have a stable ABI,
are very difficult for packagers to compile and integrate, are slow,
not threadsafe, and uncomfortable to use. The decision to switch all
Vulkan code to either compile-time GLSL or SPIR-V assembly was taken
in January. Since then, including in the FFmpeg 8.1 release, work has
steadily progressed on eliminating all remaining runtime GLSL
compilation.
Sponsored-by: Sovereign Tech Fund
The main issue is that BGR formats only semi-exist in Vulkan. Unlike all
other formats, they require the user to manually remap the pixel order, and
are also forbidden from being written to without a format in shaders. The main
reason for this was conservative - Vulkan is supposed to work everywhere, including
platforms where there is no write-time remapping/swizzling support.
Sponsored-by: Sovereign Tech Fund
The issue is that with multiplane images, or packed images,
there may be some mismatching between what .elems has, and what
we need.
Descriptors are cheap, so just always reserve 4.
Sponsored-by: Sovereign Tech Fund
The issue is that the main Vulkan context is shared between possibly
multiple shaders, and registering a new shader requires allocating
descriptors.
Sponsored-by: Sovereign Tech Fund
Multiple demuxers call avio_read() without checking its return
value. When input is truncated, destination buffers remain
uninitialized but are still used for offset calculations, memcmp,
and metadata handling. This results in undefined behavior
(detectable with Valgrind/MSan).
Fix this by checking the return value of avio_read() in:
- dss.c: dss_read_seek() — check before using header buffer
- dtshddec.c: FILEINFO chunk — check before using value buffer
- mlvdec.c: check_file_header() — check before memcmp on version
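
The pattern applied in each of these places can be sketched as follows. This
is a self-contained illustration, not the demuxer code: MemInput, mem_read()
and check_magic() are hypothetical stand-ins, and mem_read() only mimics the
short-read behavior of avio_read() (the real function reports errors
differently).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory input, standing in for an AVIOContext. */
typedef struct MemInput {
    const unsigned char *data;
    size_t size;
    size_t pos;
} MemInput;

/* Stand-in for avio_read(): returns the number of bytes actually read,
 * which may be fewer than requested when the input is truncated. */
static int mem_read(MemInput *in, unsigned char *buf, int size)
{
    size_t left, n;

    if (size < 0)
        return -1;
    left = in->size - in->pos;
    n    = (size_t)size < left ? (size_t)size : left;
    memcpy(buf, in->data + in->pos, n);
    in->pos += n;
    return (int)n;
}

/* Returns 1 if the magic matches, 0 if it does not, -1 on truncated input. */
static int check_magic(MemInput *in, const char *magic, int len)
{
    unsigned char buf[16];

    if (len <= 0 || len > (int)sizeof(buf))
        return -1;
    if (mem_read(in, buf, len) != len)
        return -1;   /* buf is only partially filled, so do not compare it */
    return memcmp(buf, magic, (size_t)len) == 0;
}
```

The key point is that the buffer is only inspected after the read is known to
have filled it completely.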
Fixes: #21520
This fixes dummy warnings when link/lld-link is called by clang:
lld-link: warning: ignoring unknown argument '--as-needed'
lld-link: warning: ignoring unknown argument '-rpath-link=:libswresample:libswscale:libavfilter:libavdevice:libavformat:libavcodec:libavutil'
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
Fixes host binaries compilation on platforms without math lib.
Fixes clang host compilation, which replaces `-lm` with `m.lib` that
does not exist:
LINK : fatal error LNK1181: cannot open input file 'm.lib'
clang: error: linker command failed with exit code 1181 (use -v to see invocation)
Fixes MSVC (cl) host warning:
cl : Command line warning D9002 : ignoring unknown option '-lm'
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
This uses llvm tools. The `clang-*` toolchain is left mostly for backward
compatibility, although it doesn't use llvm tools, only clang. On top of
that, it's for enabling sanitizers, while the `llvm` toolchain can be used
without a sanitizer suffix.
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
handle_rtx_packet() constructs an RTX packet by shifting the payload
of a history entry to insert the original sequence number. It uses
memmove with length (ori_size - 12), but never checks that ori_size
is at least 12 bytes (the minimum RTP header size).
Zero-initialized history slots have seq == 0 and size == 0.
rtp_history_find() only compares sequence numbers, so an RTCP NACK
requesting seq 0 early in a session matches such a slot. The
subtraction then wraps to a huge value when converted to size_t,
causing a stack buffer overflow in memmove().
Add a small size check to reject history entries smaller than a
valid RTP header before any arithmetic on their size.
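
A minimal sketch of such a check, assuming a simplified history entry. The
HistEntry struct and build_rtx() are hypothetical and copy into a separate
output buffer instead of shifting in place with memmove(), so this is not the
actual handle_rtx_packet() code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RTP_HEADER_SIZE 12   /* minimum RTP header size */

/* Hypothetical history entry; field names are illustrative only. */
typedef struct HistEntry {
    uint16_t seq;
    int      size;
    uint8_t  buf[1500];
} HistEntry;

/* Build an RTX packet in out: RTP header, 2-byte original sequence number,
 * then the original payload. Returns the new size, or -1 if the entry is
 * too small to hold a valid RTP header or does not fit in out. */
static int build_rtx(const HistEntry *e, uint8_t *out, size_t out_cap)
{
    size_t payload;

    /* Reject empty or undersized slots before doing any size arithmetic. */
    if (e->size < RTP_HEADER_SIZE || (size_t)e->size > sizeof(e->buf))
        return -1;
    if ((size_t)e->size + 2 > out_cap)
        return -1;

    payload = (size_t)e->size - RTP_HEADER_SIZE;   /* cannot wrap here */
    memcpy(out, e->buf, RTP_HEADER_SIZE);
    out[RTP_HEADER_SIZE]     = (uint8_t)(e->seq >> 8);
    out[RTP_HEADER_SIZE + 1] = (uint8_t)(e->seq & 0xff);
    memcpy(out + RTP_HEADER_SIZE + 2, e->buf + RTP_HEADER_SIZE, payload);
    return e->size + 2;
}
```

A zero-initialized slot (size == 0) is rejected by the first check, so the
subtraction is never evaluated on it.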
Found-by: Pwno
It was a bit clunky, lacked semantic contextual information, and made it
harder to reason about the effects of extending this struct. There should be
zero runtime overhead, since this is already a big union.
I made the changes in this commit by hand, but due to the length and noise
level of the commit, I used Opus 4.6 to verify that I did not accidentally
introduce any bugs or typos.
Signed-off-by: Niklas Haas <git@haasn.dev>
This has the side benefit of not relying on the q2pixel macro to avoid division
by zero, since we can now explicitly avoid operating on undefined clear values.
Signed-off-by: Niklas Haas <git@haasn.dev>
Apple VideoToolbox is the dominant producer of hevc-alpha videos, but
early versions generate non-standard VPS extensions that fail to
parse and return AVERROR_INVALIDDATA. Fix this by returning
AVERROR_PATCHWELCOME instead of AVERROR_INVALIDDATA for unsupported
VPS extension configurations. Also set poc_lsb_not_present for the
alpha layer in the fallback path when it has no direct dependency
on the base layer, so that IDR slices on the alpha layer won't
incorrectly read pic_order_cnt_lsb.
Fix #22384
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
ff_frame_new_side_data() may set sd to NULL and return 0 when
side_data_pref() determines that existing side data should be
preferred.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
libvidstab's vsTransformPrepare() takes different internal code paths
for in-place (src == dest) vs. separate-buffer operation. The
separate-buffer path stores a shallow copy of the source frame pointer
in td->src without allocating internal memory (srcMalloced stays 0).
When a subsequent frame takes the in-place path, vsFrameIsNull(&td->src)
is false so vsFrameAllocate() is skipped, and vsFrameCopy() writes into
the stale pointer left over from the previous frame, corrupting memory
that the caller no longer owns.
Whether a given frame is writable depends on pipeline scheduling and
frame reference management, which can change between FFmpeg versions.
Since FFmpeg 8.1, changes in the scheduler caused some frames to arrive
as non-writable, leading to alternation between in-place and
separate-buffer paths that triggered the bug.
Fix this by marking the input pad with AVFILTERPAD_FLAG_NEEDS_WRITABLE.
Fix #22595
We currently don't have any cases where this is needed, but include
it for completeness and clarity.
These macros for BTI were added in 08b4716a9e.
A later comment in this file, added in 248986a0db, referenced the macro
AARCH64_VALID_JUMP_CALL_TARGET, which was never added here before.
Unit test covering av_video_enc_params_alloc,
av_video_enc_params_block, and
av_video_enc_params_create_side_data.
Tests allocation for all three codec types (VP9, H264, MPEG2) and
the NONE type, with 0 and 4 blocks, with and without size output.
Verifies block getter indexing by writing and reading back
coordinates, dimensions, and delta_qp values. Tests frame-level qp
and delta_qp fields, and side data creation with frame attachment.
Coverage for libavutil/video_enc_params.c: 0.00% -> 86.21%
(remaining uncovered lines are OOM error paths)
Signed-off-by: marcos ashton <marcosashiglesias@gmail.com>
Unit test covering av_detection_bbox_alloc, av_get_detection_bbox,
and av_detection_bbox_create_side_data.
Tests allocation with 0, 1, and 4 bounding boxes, with and without
size output. Verifies bbox getter indexing by writing and reading
back coordinates, labels, and confidence values. Tests classify
fields (labels and confidences), the header source field, and
side data creation with frame attachment.
Coverage for libavutil/detection_bbox.c: 0.00% -> 86.67%
(remaining uncovered lines are OOM error paths)
Signed-off-by: marcos ashton <marcosashiglesias@gmail.com>
Unit test covering all 4 public API functions in libavutil/spherical.c:
av_spherical_alloc, av_spherical_projection_name, av_spherical_from_name,
and av_spherical_tile_bounds.
Tests allocation with and without size output, all 7 projection type
name lookups, projection name round-trip verification, out-of-range
handling, and tile bounds computation for full-frame, quarter-tile,
and centered-tile configurations.
Coverage for libavutil/spherical.c: 0.00% -> 100.00%
Signed-off-by: marcos ashton <marcosashiglesias@gmail.com>
It is only needed in the unlikely codepath. The ordinary one
only uses six xmm registers.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>