Prevents valgrind from complaining about operating on uninitialized bytes.
This should be cheap as it's only done once during setup().
Signed-off-by: Niklas Haas <git@haasn.dev>
This code made the input read conditional on the byte count, but not the
output, leading to a lot of over-write for cases like 15, 5.
Signed-off-by: Niklas Haas <git@haasn.dev>
These align the filter size to a multiple of the internal tap grouping
(either 1/2/4 for vpgatherdd, or the XMM size for the 4x4 transposed kernel).
This may over-read past the natural end of the input buffer, if the aligned
size exceeds the true size.
Signed-off-by: Niklas Haas <git@haasn.dev>
The V-Nova LCEVC pipeline processes frames on internal background
worker threads. LCEVC_ReceiveDecoderPicture returns LCEVC_Again (-1)
when the worker has not yet completed the frame, which is the
documented "not ready, try again" response. The original code treated
any non-zero return as a fatal error (AVERROR_EXTERNAL), causing decode
to abort mid-stream.
Poll until LCEVC_Success or a genuine error is returned.
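
The retry loop can be sketched roughly as follows. This is a minimal illustration, not the actual patch: the SKETCH_* codes, the callback type and the fake worker are hypothetical stand-ins, and only the "0 is success, -1 is again" values are taken from the text above.

```c
#include <errno.h>

/* Hypothetical stand-ins for the decoder return codes. Only the values
 * 0 (success) and -1 (again) come from the commit text; -2 is assumed. */
enum { SKETCH_SUCCESS = 0, SKETCH_AGAIN = -1, SKETCH_ERROR = -2 };

/* Hypothetical receive callback returning one of the codes above. */
typedef int (*sketch_receive_fn)(void *opaque);

/* Keep calling receive() while it reports "again". Returns 0 on success
 * and a negative errno-style code on any other (genuine) error. */
int sketch_receive_blocking(sketch_receive_fn receive, void *opaque)
{
    int ret;

    do {
        ret = receive(opaque);
    } while (ret == SKETCH_AGAIN);

    return ret == SKETCH_SUCCESS ? 0 : -EIO;
}

/* Fake worker for illustration: reports "again" until the counter
 * pointed to by opaque reaches zero, then reports success. */
int sketch_fake_receive(void *opaque)
{
    int *remaining = opaque;

    if (*remaining > 0) {
        (*remaining)--;
        return SKETCH_AGAIN;
    }
    return SKETCH_SUCCESS;
}
```

A real caller would likely want to yield or sleep between attempts rather than spin; the sketch leaves that out.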
Signed-off-by: Peter von Kaenel <Peter.vonKaenel@harmonicinc.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Avoids the post_process_opaque_free callback; the only user of
this is already a RefStruct reference and presumably other users
would want to use a pool for this, too, so they would use
RefStruct-objects, too.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
When use_loop == true and idx < 0, we would incorrectly check
in_stride[idx], which is an OOB read. Reorder the conditions to avoid that.
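
As an illustration of the reordering, assuming a hypothetical helper (the function name and the "stride is non-zero" test are made up; only use_loop, idx and in_stride come from the text above):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: reports whether plane idx has a usable stride.
 * The index check comes before the array access, and && short-circuits,
 * so in_stride[idx] is never evaluated when idx is negative. */
bool sketch_stride_in_use(bool use_loop, int idx, const ptrdiff_t *in_stride)
{
    return use_loop && idx >= 0 && in_stride[idx] != 0;
}
```

With the checks in the other order (in_stride[idx] before idx >= 0), a negative idx would be dereferenced before it is rejected.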
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
When the WAV muxer's `-rf64 auto` option is used, the output is intended
to be a normal WAV file if possible, only extended to RF64 format when
the file size grows too large. This was accomplished by reserving space
for the extra RF64-specific data using a standard JUNK chunk (ignored by
readers), then overwriting the reserved space later with a ds64 chunk if
needed.
In the original rf64 auto implementation, the JUNK chunk was placed
right after the RIFF/WAVE file header, before the fmt chunk; this is the
design suggested by the "Achieving compatibility between BWF and RF64"
section of the RF64 spec:
RIFF 'WAVE' <JUNK chunk> <fmt-ck> ...
However, this approach means that the fmt chunk is no longer in its
conventional location at the beginning of the file, and some WAV-reading
tools are confused by this layout. For example, the `file` tool is not
able to show the format information for a file with the extra JUNK chunk
before fmt.
This change shuffles the order of the chunks for `-rf64 auto` mode so
that the reserved space follows fmt instead of preceding it:
RIFF 'WAVE' <fmt-ck> <JUNK chunk> ...
With this small modification, tools expecting the fmt chunk to be the
first chunk in the file work with files produced by `-rf64 auto`.
This means the fmt chunk won't be in the location required by RF64, so
if the automatic RF64 conversion is triggered, the fmt chunk needs to be
relocated by rewriting it following the ds64 chunk during the conversion:
RF64 'WAVE' <ds64 chunk> <fmt-ck> ...
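
The two layouts above can be compared with a small offset calculation. This is only a sketch under stated assumptions: the names are hypothetical, the reserved JUNK body is assumed to be exactly as large as a minimal 28-byte ds64 body, and chunk bodies are assumed to have even sizes so that no pad bytes are needed.

```c
#include <stdint.h>

/* Sizes in bytes. 12 = "RIFF"/"RF64" + size + "WAVE"; 8 = chunk id + size.
 * 28 is the minimal ds64 body (three 64-bit sizes plus a 32-bit table
 * length); treating the JUNK reservation as the same size is an assumption. */
enum { SKETCH_RIFF_HDR = 12, SKETCH_CK_HDR = 8, SKETCH_DS64_BODY = 28 };

typedef struct SketchLayout {
    uint32_t first_off;  /* offset of the first chunk after the file header */
    uint32_t second_off; /* offset of the chunk that follows it */
    uint32_t data_off;   /* offset of whatever comes after both chunks */
} SketchLayout;

/* Compute chunk offsets for two consecutive chunks with the given body sizes. */
SketchLayout sketch_layout(uint32_t first_body, uint32_t second_body)
{
    SketchLayout l;

    l.first_off  = SKETCH_RIFF_HDR;
    l.second_off = l.first_off + SKETCH_CK_HDR + first_body;
    l.data_off   = l.second_off + SKETCH_CK_HDR + second_body;
    return l;
}
```

For a 16-byte fmt body, sketch_layout(16, SKETCH_DS64_BODY) describes the WAV layout (fmt, then JUNK) and sketch_layout(SKETCH_DS64_BODY, 16) the converted RF64 layout (ds64, then fmt). Both end at the same offset, which is why the fmt chunk can be rewritten after ds64 without moving anything that follows.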
H.264 only uses these functions with height 2 or 4 and
the aarch64, arm and mips versions of them optimize based
on this. Yet this is not true when these functions are used
by the lowres code in mpegvideo_dec.c. So revert back to
the C versions of these functions for mpegvideo_dec so that
the H.264 decoder can still use fully optimized functions.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Frame side data unfortunately lacks padding, which CBS needs, so we can't reuse
the existing AVBufferRef.
Signed-off-by: James Almer <jamrial@gmail.com>
These can randomly trigger the alpha/zero fast paths, resulting in spurious
tests or randomly diverging performance if the backend happens to implement
that particular fast path.
Signed-off-by: Niklas Haas <git@haasn.dev>
This was not actually testing the integer path. Additionally, for integer
scales, there is a special fast path for expansion from bits to full range,
which we should separate from the random value test.
The overhead of the loop and memcpy call is less than the overhead of
possibly spilling into one extra unnecessary cache line. 64 is still a
good rule of thumb for L1 cache line size in 2026.
I leave it to future code archeologists to find and tweak this constant if
it ever becomes unnecessary.
Signed-off-by: Niklas Haas <git@haasn.dev>
Most of these filters don't test anything meaningfully different relative to
each other; the only filters that really have special significance are POINT
(for now) and maybe BILINEAR down the line.
Apart from that, SINC, combined with the src size loop, already tests both
extreme cases (large and small filters), with large, oscillating unwindowed
weights.
The other filters are not adding anything of substance to this, while massively
slowing down the runtime of this test. We can, of course, change this if the
backends ever get more nuanced handling.
checkasm: all 855 tests passed (down from 1575)
Signed-off-by: Niklas Haas <git@haasn.dev>
The current code was a bit clumsy in that it always picked the first
available backend when choosing the new function. This meant that some x86
paths were not being tested at all, whenever the memcpy backend (which has
higher priority) could serve the request.
This change makes it so that each backend is explicitly tested against only
implementations provided by that same backend.
checkasm: all 1575 tests passed (up from 1305)
As an aside, it also lets us benchmark the memcpy backend directly against
the C reference backend.
Signed-off-by: Niklas Haas <git@haasn.dev>
These don't actually exist at runtime, and will soon be removed from the
backends as well.
This commit is intentionally a bit incomplete, as I will rewrite this
based on the auto-generated macros in the upcoming ops_micro series.
Signed-off-by: Niklas Haas <git@haasn.dev>
Check that the driver supports both BUFFER_OFFSET and BYTES_WRITTEN
encode feedback flags before creating the query pool, failing with
EINVAL if either is missing.
Set these flags explicitly instead of masking off HAS_OVERRIDES with a
bitwise NOT, which could pass unrecognized bits from newer drivers to
vkCreateQueryPool, causing validation errors and crashes.
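
The difference between the two approaches can be sketched as below. The SKETCH_FB_* macros are local stand-ins for the Vulkan feedback flag bits (their values here are not taken from the Vulkan headers), and the function name is hypothetical.

```c
#include <errno.h>
#include <stdint.h>

/* Local stand-ins for the encode feedback bits named in the text above. */
#define SKETCH_FB_BUFFER_OFFSET (1u << 0)
#define SKETCH_FB_BYTES_WRITTEN (1u << 1)
#define SKETCH_FB_HAS_OVERRIDES (1u << 2)

/* Pick the feedback flags to request, given what the driver reports.
 * Returns 0 on success or -EINVAL if a required flag is unsupported. */
int sketch_pick_feedback_flags(uint32_t supported, uint32_t *out_flags)
{
    const uint32_t required = SKETCH_FB_BUFFER_OFFSET |
                              SKETCH_FB_BYTES_WRITTEN;

    if ((supported & required) != required)
        return -EINVAL;

    /* Request exactly the known bits. The masking alternative,
     * supported & ~SKETCH_FB_HAS_OVERRIDES, would also forward any bits
     * this code does not know about. */
    *out_flags = required;
    return 0;
}
```

With a driver reporting 0x87 (the three known bits plus an unknown one), the explicit form requests 0x3, while the masked form would request 0x83.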
Forward-declaring an enum is not legal C (the underlying type of
the enum may depend upon the enum constants, so this may cause
ABI issues with -fshort-enums); compilers warn about this
with -pedantic.
This essentially reverts 7e84865cff.
Notice that almost* all files that include codec_internal.h also
need to include avcodec.h, so this does not lead to unnecessary
rebuilds.
This addresses part of #22684.
*: The only file I am aware of that defines an FFCodec and does not
need AVCodecContext as complete type is null.c (but even it already
includes it implicitly); the avcodec.c test tool seems to be the only
file where this commit actually leads to an unnecessary avcodec.h
inclusion.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
If any of the dstStrides is not aligned mod 16, the warning
above this one will be triggered, setting stride_unaligned_warned,
so that the following check for stride_unaligned_warned will
always be false.
Reviewed-by: Niklas Haas <ffmpeg@haasn.dev>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The glue code doesn't care about types, so long as the functions are
chained correctly. Let's not pretend there is any type safety there, as
the function pointers were cast from unrelated types anyway.
In particular, some f32 and u32 functions are shared.
This fixes errors like so:
src/libswscale/ops_tmpl_int.c:471:1: runtime error: call to function linear_diagoff3_f32 through pointer to incorrect function type 'void (*)(struct SwsOpIter *, const struct SwsOpImpl *, unsigned int *, unsigned int *, unsigned int *, unsigned int *)'
libswscale/ops_tmpl_float.c:208: note: linear_diagoff3_f32 defined here
Fixes: #22332
It was added to force auto vectorization on GCC builds. Since then, auto
vectorization has been enabled for the whole code base in 1464930696.
According to the GCC documentation, the optimize attribute should be used
for debugging purposes only. It is not suitable for production code.
In particular, it's unclear whether the attribute is applied, as it is
actually lost when the function is inlined, so its usage is quite fragile.
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
Fixes: out of array access
no testcase
Found-by: Joshua Rogers <joshua@joshua.hu> with ZeroPath
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The makeinfo_html variable wasn't being disabled when the makeinfo test
failed, which prevented texi2html from being probed.
Fixes 589da160b2.
Found-by: Luke Jolliffe <luke.jolliffe@bbc.co.uk>
tape_length * 8 overflows 32-bit int for large input widths. Then
av_malloc_array() allocates a tiny buffer while the subsequent
loop writes tape_length*8 BilinearMap entries, causing a
heap-buffer-overflow.
Validate the value in float before converting to int and left
shifting, to avoid both float-to-int and signed left shift
overflow UB. Also split av_malloc_array() arguments to avoid
the multiplication overflow.
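
A rough sketch of the two steps, assuming a 32-bit int. SketchMap, the function name and the exact bounds are hypothetical, and calloc() stands in for av_malloc_array() as an allocator that takes the count and element size separately:

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical element type standing in for BilinearMap. */
typedef struct SketchMap { int32_t v[4]; } SketchMap;

/* Validate a float length, then allocate length * 8 entries.
 * Returns NULL if the length is out of range or allocation fails. */
SketchMap *sketch_alloc_tape(float tape_length_f, size_t *nb_out)
{
    /* 2^28 with a 32-bit int; exactly representable as a float. Any
     * float strictly below it fits in int and survives the * 8. */
    const float limit = (float)(INT_MAX / 8 + 1);
    SketchMap *tape;
    int tape_length;

    /* Check the range while still in float: converting an out-of-range
     * float to int is undefined. The negated form also rejects NaN. */
    if (!(tape_length_f >= 1.0f && tape_length_f < limit))
        return NULL;
    tape_length = (int)tape_length_f;

    /* Pass count and element size separately so that the allocator can
     * detect overflow of the product, instead of pre-multiplying in int. */
    tape = calloc((size_t)tape_length, 8 * sizeof(*tape));
    if (!tape)
        return NULL;

    *nb_out = (size_t)tape_length * 8;
    return tape;
}
```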
Fixes: #21511
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>