There is an x86 implementation using MMX registers, but it actually
issues emms on its own (since 57a29f2e7d).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The last MMX(EXT) convert_unscaled functions have been removed
in 61e851381f. And anyway, there
is no emms_c cleaning up after these functions, so they must not
clobber the FPU state; that they did so at the time this checkasm
test was added was a bug introduced by
e934194b6a and fixed by the removal
of said MMX(EXT) functions.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fix the following issues with the keep option:
- Add similarity check during keep period. Previously, the code
returned early during the keep period without checking if the
frame is actually similar to the reference.
- Reset keep_count on different frames. Previously, the counter
could accumulate across non-consecutive similar frames, causing
frames to be dropped earlier than expected.
- Keep the same frame reference if appropriate. Previously, the
code made similar frames the new reference, causing reference
drift and gradual scene changes.
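
The three fixes above can be sketched as a small decision function. This is
only an illustration under stated assumptions: the struct, the function name
and the single-integer "frame" with an absolute-difference similarity test are
all hypothetical and are not taken from the filter's source.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical state; the names are illustrative, not the filter's fields. */
typedef struct KeepState {
    int keep;        /* number of similar frames to keep before dropping */
    int keep_count;  /* consecutive similar frames kept so far */
    int has_ref;     /* whether a reference frame has been stored */
    int ref;         /* reference "frame", modelled as a single value */
} KeepState;

/* Returns 1 if the frame should be kept, 0 if it should be dropped. */
static int keep_decide(KeepState *s, int frame, int threshold)
{
    assert(s && s->keep >= 0);

    /* The similarity check always runs, including during the keep period. */
    int similar = s->has_ref && abs(frame - s->ref) <= threshold;

    if (!similar) {
        /* A different frame becomes the new reference and resets the count. */
        s->ref        = frame;
        s->has_ref    = 1;
        s->keep_count = 0;
        return 1;
    }

    /* Similar frame: the reference is left unchanged, so slow changes are
     * measured against the same frame instead of drifting along with it. */
    if (s->keep_count < s->keep) {
        s->keep_count++;
        return 1;
    }
    return 0;
}
```

With keep set to 2 in this sketch, a run of near-identical values is kept
twice after the reference and then dropped, while a value that differs from
the unchanged reference by more than the threshold starts a new run.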
Signed-off-by: Dana Feng <danaf@twosigma.com>
Test the integer math utility functions: av_gcd, av_rescale,
av_rescale_rnd (all rounding modes including PASS_MINMAX),
av_rescale_q, av_compare_ts, av_compare_mod, av_rescale_delta,
and av_add_stable. Includes large-value tests that exercise the
128-bit multiply path in av_rescale_rnd.
av_bessel_i0 is not tested since it uses floating point math
that is not bitexact across platforms.
Coverage for libavutil/mathematics.c: 0.00% -> 82.03%
Remaining uncovered lines are av_bessel_i0 (float, 23 lines)
and one edge case fallback in av_rescale_delta.
Test all public API functions: name/format round-trip lookups,
bytes_per_sample, is_planar, packed/planar conversions,
alt_sample_fmt, get_sample_fmt_string, samples_get_buffer_size,
samples_alloc, samples_alloc_array_and_samples, samples_copy,
and samples_set_silence. OOM error paths are exercised via
av_max_alloc().
Coverage for libavutil/samplefmt.c: 0.00% -> 95.28%
Remaining uncovered lines are the fill_arrays failure path
and the overlapping memmove branch in samples_copy.
Test the three public API functions: av_rc4_alloc, av_rc4_init,
and av_rc4_crypt. Verifies keystream output against RFC 6229
test vectors for 40, 56, 64, and 128-bit keys, encrypt/decrypt
round-trip, inplace operation, and the invalid key_bits error path.
Coverage for libavutil/rc4.c: 0.00% -> 100.00%
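
For readers unfamiliar with what such a test checks, the following is a
minimal standalone RC4 sketch with a round-trip helper. It does not use the
av_rc4_* API; all names here (RC4Sketch, rc4_sketch_*) are hypothetical, and
emitting raw keystream when src is NULL is a choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical standalone cipher state, not libavutil's AVRC4. */
typedef struct RC4Sketch {
    uint8_t s[256];
    uint8_t i, j;
} RC4Sketch;

/* Key scheduling: permute the identity table using the key bytes. */
static void rc4_sketch_init(RC4Sketch *c, const uint8_t *key, size_t key_len)
{
    assert(c && key && key_len > 0 && key_len <= 256);
    for (int k = 0; k < 256; k++)
        c->s[k] = (uint8_t)k;
    uint8_t j = 0;
    for (int k = 0; k < 256; k++) {
        j = (uint8_t)(j + c->s[k] + key[(size_t)k % key_len]);
        uint8_t t = c->s[k]; c->s[k] = c->s[j]; c->s[j] = t;
    }
    c->i = c->j = 0;
}

/* XOR src with the keystream into dst; dst may equal src (in place).
 * If src is NULL, the raw keystream is written instead. */
static void rc4_sketch_crypt(RC4Sketch *c, uint8_t *dst,
                             const uint8_t *src, size_t len)
{
    for (size_t n = 0; n < len; n++) {
        c->i = (uint8_t)(c->i + 1);
        c->j = (uint8_t)(c->j + c->s[c->i]);
        uint8_t t = c->s[c->i]; c->s[c->i] = c->s[c->j]; c->s[c->j] = t;
        uint8_t k = c->s[(uint8_t)(c->s[c->i] + c->s[c->j])];
        dst[n] = src ? (uint8_t)(src[n] ^ k) : k;
    }
}

/* Encrypt in place, re-initialise, decrypt in place, compare with a copy.
 * Returns 1 if the round trip restores the input. Handles up to 64 bytes. */
static int rc4_sketch_roundtrip_ok(const uint8_t *key, size_t key_len,
                                   const uint8_t *msg, size_t len)
{
    uint8_t buf[64];
    RC4Sketch c;
    if (len > sizeof(buf))
        return 0;
    memcpy(buf, msg, len);
    rc4_sketch_init(&c, key, key_len);
    rc4_sketch_crypt(&c, buf, buf, len);
    rc4_sketch_init(&c, key, key_len);
    rc4_sketch_crypt(&c, buf, buf, len);
    return memcmp(buf, msg, len) == 0;
}
```

A known-answer test would additionally compare the keystream produced with
src set to NULL against published vectors for each key size.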
Prior to this, the results were not saturated into the uchar/ushort range before
being written. The characteristics of the Lanczos filter exposed this issue.
In addition, the results were truncated rather than rounded, which resulted
in checkerboard artifacts in solid color areas that were noticeable when
using Lanczos with 8-bit input.
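
The store step described above can be sketched in plain C as follows. This is
not the CUDA kernel code; sat_round, store_u8 and store_u16 are hypothetical
names used only for illustration. A filter with negative lobes such as Lanczos
can produce values below zero or above the maximum, which is why the clamp has
to come before the conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp v to [0, max], then round to nearest instead of truncating. */
static int sat_round(float v, int max)
{
    assert(max > 0);
    if (!(v > 0.0f))        /* negative values and NaN saturate to 0 */
        return 0;
    if (v >= (float)max)    /* overshoot saturates to the maximum */
        return max;
    return (int)(v + 0.5f); /* v is in (0, max), so the result fits */
}

/* 8-bit (uchar) and 16-bit (ushort) stores built on the helper. */
static uint8_t store_u8(float v)
{
    return (uint8_t)sat_round(v, 255);
}

static uint16_t store_u16(float v)
{
    return (uint16_t)sat_round(v, 65535);
}
```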
Example:
ffmpeg -init_hw_device cuda -f lavfi -i testsrc2=s=960x540,format=yuv420p \
-vf hwupload,scale_cuda=format=yuv420p:w=-2:h=720:interp_algo=lanczos \
-c:v h264_nvenc -qp:v 20 -t 1 <OUTPUT>
Fix #20784
Signed-off-by: nyanmisaka <nst799610810@gmail.com>
The swscale internals currently have a quirk which causes the memcpy
backend to be called when the pixfmts match. Obviously, this doesn't do
what is expected, as hardware frames cannot just be copied.
Check for this.
Sponsored-by: Sovereign Tech Fund
swscale gets runtime-defined assembly once again!
This commit splits the Vulkan backend into two, SPIR-V and GLSL,
enabling falling back onto the GLSL implementation if an instruction
is unavailable, or simply for testing.
Sponsored-by: Sovereign Tech Fund
This commit adds a SPIR-V assembler header file. It was partially generated
from the SPIR-V header file JSON definition, then edited by hand to template
and reduce its size as much as possible.
It only implements the essentials required for SPIR-V assembly that swscale
requires.
Sponsored-by: Sovereign Tech Fund
Uniform buffers are much simpler to index, and require no work from
the driver compiler to optimize.
In SPIR-V, large 2D shader constants can be spilled into scratch memory,
since you need to create a function variable to index them during runtime.
Sponsored-by: Sovereign Tech Fund
The issue is that very often, hardware has limited support for BGRA
formats.
As this is a limitation of Vulkan itself, we cannot work around this
in a compatible way.
Sponsored-by: Sovereign Tech Fund
FFmpeg has had an issue with GLSL compilation libraries since they
were first merged 6 years ago. The libraries don't have a stable ABI,
are very difficult for packagers to compile and integrate, are slow,
not threadsafe, and uncomfortable to use. The decision to switch all
Vulkan code to either compile-time GLSL or SPIR-V assembly was taken
in January. Since then, including in the FFmpeg 8.1 release, work has
steadily progressed on eliminating all remaining runtime GLSL
compilation.
Sponsored-by: Sovereign Tech Fund
The main issue is that BGR formats only semi-exist in Vulkan. Unlike all
other formats, they require the user to manually remap the pixel order, and
are also forbidden from being written to without a format in shaders. The main
reason for this was conservatism: Vulkan is supposed to work everywhere, including
platforms where there is no write-time remapping/swizzling support.
Sponsored-by: Sovereign Tech Fund
The issue is that with multiplane images, or packed images,
there may be a mismatch between what .elems has and what
we need.
Descriptors are cheap, so just always reserve 4.
Sponsored-by: Sovereign Tech Fund
The issue is that the main Vulkan context is shared between possibly
multiple shaders, and registering a new shader requires allocating
descriptors.
Sponsored-by: Sovereign Tech Fund
Multiple demuxers call avio_read() without checking its return
value. When input is truncated, destination buffers remain
uninitialized but are still used for offset calculations, memcmp,
and metadata handling. This results in undefined behavior
(detectable with Valgrind/MSan).
Fix this by checking the return value of avio_read() in:
- dss.c: dss_read_seek() — check before using header buffer
- dtshddec.c: FILEINFO chunk — check before using value buffer
- mlvdec.c: check_file_header() — check before memcmp on version
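
The pattern applied in each of these places can be sketched as below. The
sketch does not use avio_read(); MemReader, mem_read() and check_magic() are
hypothetical stand-ins, assuming a read call that reports how many bytes it
actually delivered.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory input, standing in for an I/O context. */
typedef struct MemReader {
    const unsigned char *data;
    size_t size, pos;
} MemReader;

/* Stand-in read call: returns the number of bytes actually read,
 * which is smaller than len when the input is truncated. */
static int mem_read(MemReader *r, unsigned char *buf, int len)
{
    assert(r && buf && len >= 0);
    size_t left = r->size - r->pos;
    size_t n    = (size_t)len < left ? (size_t)len : left;
    memcpy(buf, r->data + r->pos, n);
    r->pos += n;
    return (int)n;
}

/* Returns 1 on match, 0 on mismatch, -1 on truncated or oversized input. */
static int check_magic(MemReader *r, const char *magic)
{
    unsigned char buf[16];
    size_t len = strlen(magic);

    if (len > sizeof(buf))
        return -1;
    /* On a short read the tail of buf is uninitialized, so bail out
     * before memcmp() or any other use of its contents. */
    if (mem_read(r, buf, (int)len) != (int)len)
        return -1;
    return memcmp(buf, magic, len) == 0;
}
```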
Fixes: #21520
This fixes spurious warnings when link/lld-link is called by clang:
lld-link: warning: ignoring unknown argument '--as-needed'
lld-link: warning: ignoring unknown argument '-rpath-link=:libswresample:libswscale:libavfilter:libavdevice:libavformat:libavcodec:libavutil'
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
Fixes host binaries compilation on platforms without math lib.
Fixes clang host compilation, which replaces `-lm` with `m.lib` that
does not exist:
LINK : fatal error LNK1181: cannot open input file 'm.lib'
clang: error: linker command failed with exit code 1181 (use -v to see invocation)
Fixes MSVC (cl) host warning:
cl : Command line warning D9002 : ignoring unknown option '-lm'
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
This uses llvm tools. The `clang-*` toolchain is left mostly for backward
compatibility, although it doesn't use llvm tools, only clang. On top of
that, it is used for enabling sanitizers, while the `llvm` toolchain can be
used without a sanitizer suffix.
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
handle_rtx_packet() constructs an RTX packet by shifting the payload
of a history entry to insert the original sequence number. It uses
memmove with length (ori_size - 12), but never checks that ori_size
is at least 12 bytes (the minimum RTP header size).
Zero-initialized history slots have seq == 0 and size == 0.
rtp_history_find() only compares sequence numbers, so an RTCP NACK
requesting seq 0 early in a session matches such a slot. The
subtraction then wraps to a huge value when converted to size_t,
causing a stack buffer overflow in memmove().
Add a size check to reject history entries smaller than a
valid RTP header before any arithmetic on their size.
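
A simplified sketch of the guarded construction follows. It is not the code
of handle_rtx_packet(): build_rtx() is a hypothetical helper that writes into
a separate destination buffer instead of shifting in place, assumes a fixed
12-byte header without CSRC entries or extensions, and leaves out the header
rewriting a real RTX packet needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RTP_HEADER_MIN 12

/* Copies the RTP header, inserts the original sequence number (bytes 2..3
 * of the header) after it, then appends the payload.
 * Returns the new packet size, or -1 if the entry is unusable. */
static int build_rtx(uint8_t *dst, size_t dst_size,
                     const uint8_t *ori, size_t ori_size)
{
    assert(dst && ori);

    /* Reject short entries first: with ori_size == 0 the expression
     * ori_size - 12 would wrap to a huge size_t value. */
    if (ori_size < RTP_HEADER_MIN)
        return -1;
    if (dst_size < ori_size + 2)
        return -1;

    memcpy(dst, ori, RTP_HEADER_MIN);
    dst[12] = ori[2];
    dst[13] = ori[3];
    memcpy(dst + 14, ori + RTP_HEADER_MIN, ori_size - RTP_HEADER_MIN);
    return (int)(ori_size + 2);
}
```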
Found-by: Pwno
It was a bit clunky, lacked semantic contextual information, and made it
harder to reason about the effects of extending this struct. There should be
zero runtime overhead, since this is already a big union.
I made the changes in this commit by hand, but due to the length and noise
level of the commit, I used Opus 4.6 to verify that I did not accidentally
introduce any bugs or typos.
Signed-off-by: Niklas Haas <git@haasn.dev>
This has the side benefit of not relying on the q2pixel macro to avoid division
by zero, since we can now explicitly avoid operating on undefined clear values.
Signed-off-by: Niklas Haas <git@haasn.dev>
Apple VideoToolbox is the dominant producer of hevc-alpha videos, but
early versions generate non-standard VPS extensions that fail to
parse and return AVERROR_INVALIDDATA. Fix this by returning
AVERROR_PATCHWELCOME instead of AVERROR_INVALIDDATA for unsupported
VPS extension configurations. Also set poc_lsb_not_present for the
alpha layer in the fallback path when it has no direct dependency
on the base layer, so that IDR slices on the alpha layer won't
incorrectly read pic_order_cnt_lsb.
Fix #22384
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
ff_frame_new_side_data() may set sd to NULL and return 0 when
side_data_pref() determines that existing side data should be
preferred.
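
The consequence for callers can be sketched as follows. The function below
is a hypothetical stand-in that only mimics the "return 0 with a NULL
pointer" behaviour described above; it is not ff_frame_new_side_data() and
does not share its signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical side data type, for illustration only. */
typedef struct SideDataSketch {
    int payload;
} SideDataSketch;

/* Stand-in allocator: succeeds in both cases, but hands back NULL
 * when existing side data is preferred. */
static int new_side_data_sketch(int prefer_existing,
                                SideDataSketch *storage,
                                SideDataSketch **psd)
{
    assert(storage && psd);
    *psd = prefer_existing ? NULL : storage;
    return 0;
}

/* Returns <0 on error, 0 if nothing was written, 1 if data was written. */
static int export_value(int prefer_existing, SideDataSketch *storage, int value)
{
    SideDataSketch *sd;
    int ret = new_side_data_sketch(prefer_existing, storage, &sd);

    if (ret < 0)
        return ret;
    if (!sd)            /* success without new side data: skip, not an error */
        return 0;
    sd->payload = value;
    return 1;
}
```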
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>