6872 Commits

Author SHA1 Message Date
Andreas Rheinhardt
7fd2be97b9 avcodec/x86/h264_chromamc: Avoid mmx in chroma_mc8_ssse3 functions
No impact on performance here.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-06 11:28:49 +02:00
Andreas Rheinhardt
72058ccdf8 tests/checkasm/sw_scale: Don't use declare_func_emms in yuv2nv12cX check
There are no implementations of yuv2nv12cX clobbering the fpu state,
so make the test stricter to ensure that it stays that way.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-06 11:28:49 +02:00
Andreas Rheinhardt
135cc04c3b tests/checkasm/sw_yuv2yuv: Don't use declare_func_emms
It is not needed (there are no MMX functions here) and
given that there is no emms_c() cleaning up after convert_unscaled,
convert_unscaled must not clobber the fpu state.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-06 11:28:49 +02:00
Andreas Rheinhardt
14c30b9d19 tests/checkasm/png: Don't use declare_func_emms for add_paeth_pred
There is an x86 implementation using MMX registers, but it actually
issues emms on its own (since 57a29f2e7d).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-06 11:28:49 +02:00
Andreas Rheinhardt
fcea2aa75d tests/checkasm/vf_fspp: Don't use declare_func_emms for store_slice
Forgotten in ff85a20b7d.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-06 11:28:49 +02:00
Andreas Rheinhardt
96f0e6e927 tests/checkasm/sw_yuv2rgb: Don't use declare_func_emms unnecessarily
The last MMX(EXT) convert_unscaled functions were removed in
61e851381f. In any case, there is no emms_c() cleaning up after these
functions, so they must not clobber the fpu state; that they did so
when this checkasm test was added was a bug introduced by e934194b6a
and fixed by the removal of said MMX(EXT) functions.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-06 11:28:49 +02:00
Sankalpa Sarkar
7b49a69f43 fate: add unit tests for libavutil/timecode functions 2026-04-05 22:23:08 +02:00
Sankalpa Sarkar
b462674645 fate/hlsenc: Add tests for untested features 2026-04-05 14:02:48 +00:00
Dana Feng
31a711aa68 vf_mpdecimate: Add comprehensive tests for keep and max options
Add tests for the mpdecimate filter to verify correct behavior:
- fate-filter-mpdecimate-keep: tests keep=3 option
- fate-filter-mpdecimate-keep1: tests keep=1 option
- fate-filter-mpdecimate-maxdrop-pos: tests max=3 (positive) option
- fate-filter-mpdecimate-maxdrop-neg: tests max=-3 (negative) option

Signed-off-by: Dana Feng <danaf@twosigma.com>
2026-04-05 00:26:55 +00:00
marcos ashton
e18c8c533d tests/fate/libavutil: add FATE test for mathematics
Test the integer math utility functions: av_gcd, av_rescale,
av_rescale_rnd (all rounding modes including PASS_MINMAX),
av_rescale_q, av_compare_ts, av_compare_mod, av_rescale_delta,
and av_add_stable. Includes large-value tests that exercise the
128-bit multiply path in av_rescale_rnd.

av_bessel_i0 is not tested since it uses floating point math
that is not bitexact across platforms.

Coverage for libavutil/mathematics.c: 0.00% -> 82.03%

Remaining uncovered lines are av_bessel_i0 (float, 23 lines)
and one edge case fallback in av_rescale_delta.
2026-04-05 00:12:29 +00:00
marcos ashton
66b1dbfb98 tests/fate/libavutil: add FATE test for samplefmt
Test all public API functions: name/format round-trip lookups,
bytes_per_sample, is_planar, packed/planar conversions,
alt_sample_fmt, get_sample_fmt_string, samples_get_buffer_size,
samples_alloc, samples_alloc_array_and_samples, samples_copy,
and samples_set_silence. OOM error paths are exercised via
av_max_alloc().

Coverage for libavutil/samplefmt.c: 0.00% -> 95.28%

Remaining uncovered lines are the fill_arrays failure path
and the overlapping memmove branch in samples_copy.
2026-04-05 00:12:29 +00:00
marcos ashton
117897bcd0 tests/fate/libavutil: add FATE test for rc4
Test the three public API functions: av_rc4_alloc, av_rc4_init,
and av_rc4_crypt. Verifies keystream output against RFC 6229
test vectors for 40, 56, 64, and 128-bit keys, encrypt/decrypt
round-trip, inplace operation, and the invalid key_bits error path.

Coverage for libavutil/rc4.c: 0.00% -> 100.00%
2026-04-05 00:12:29 +00:00
Niklas Haas
85bef2c2bc swscale/ops: split SwsConst up into op-specific structs
It was a bit clunky, lacked semantic contextual information, and made it
harder to reason about the effects of extending this struct. There should be
zero runtime overhead as a result of the fact that this is already a big
union.

I made the changes in this commit by hand, but due to the length and noise
level of the commit, I used Opus 4.6 to verify that I did not accidentally
introduce any bugs or typos.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-02 11:48:15 +00:00
marcos ashton
878eabdfef tests/fate/libavutil: add FATE test for video_enc_params
Unit test covering av_video_enc_params_alloc,
av_video_enc_params_block, and
av_video_enc_params_create_side_data.

Tests allocation for all three codec types (VP9, H264, MPEG2) and
the NONE type, with 0 and 4 blocks, with and without size output.
Verifies block getter indexing by writing and reading back
coordinates, dimensions, and delta_qp values. Tests frame-level qp
and delta_qp fields, and side data creation with frame attachment.

Coverage for libavutil/video_enc_params.c: 0.00% -> 86.21%
(remaining uncovered lines are OOM error paths)

Signed-off-by: marcos ashton <marcosashiglesias@gmail.com>
2026-03-31 18:05:51 +01:00
marcos ashton
c8ec660d78 tests/fate/libavutil: add FATE test for detection_bbox
Unit test covering av_detection_bbox_alloc, av_get_detection_bbox,
and av_detection_bbox_create_side_data.

Tests allocation with 0, 1, and 4 bounding boxes, with and without
size output. Verifies bbox getter indexing by writing and reading
back coordinates, labels, and confidence values. Tests classify
fields (labels and confidences), the header source field, and
side data creation with frame attachment.

Coverage for libavutil/detection_bbox.c: 0.00% -> 86.67%
(remaining uncovered lines are OOM error paths)

Signed-off-by: marcos ashton <marcosashiglesias@gmail.com>
2026-03-31 18:05:51 +01:00
marcos ashton
be2fa77344 tests/fate/libavutil: add FATE test for spherical
Unit test covering all 4 public API functions in libavutil/spherical.c:
av_spherical_alloc, av_spherical_projection_name, av_spherical_from_name,
and av_spherical_tile_bounds.

Tests allocation with and without size output, all 7 projection type
name lookups, projection name round-trip verification, out-of-range
handling, and tile bounds computation for full-frame, quarter-tile,
and centered-tile configurations.

Coverage for libavutil/spherical.c: 0.00% -> 100.00%

Signed-off-by: marcos ashton <marcosashiglesias@gmail.com>
2026-03-31 18:05:51 +01:00
Jun Zhao
514f57f85d tests/checkasm: add HEVC intra prediction test
Add checkasm test for HEVC intra prediction covering DC, planar, and
angular modes at all block sizes (4x4 to 32x32) for 8-bit and 10-bit
depth.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-03-30 14:32:10 +00:00
Ramiro Polla
a1bfaa0e78 swscale/aarch64: introduce tool to enumerate sws_ops for NEON backend
The NEON sws_ops backend will use a build-time code generator for the
various operation functions it needs to implement. This build time code
generator (ops_asmgen) will need a list of the operations that must be
implemented. This commit adds a tool (sws_ops_aarch64) that generates
such a list (ops_entries.c).

The list is generated by iterating over all possible conversion
combinations and collecting the parameters for each NEON assembly
function that has to be implemented, defined by a unique set of
parameters derived from SwsOp. Whenever swscale evolves, with improved
optimization passes, new pixel formats, or improvements to the backend
itself, this file (ops_entries.c) should be regenerated by running:
    $ make sws_ops_entries_aarch64

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-30 11:38:35 +00:00
Soham Kute
e3bcb9ac76 avformat/tests: add FATE tests for yuv4mpegpipe pixel formats
The existing fate-lavf-yuv420p.y4m covers only the default format.
Add four entries that pass -pix_fmt explicitly to the lavf_video
macro: yuv422p, yuv444p, yuv411p, and gray.

These exercise the branches in yuv4mpegpipe_write_header() that write
the "C422", "C444", "C411", and "Cmono" chroma descriptor strings in
the stream header.  All four are gated on ENCDEC(RAWVIDEO,YUV4MPEGPIPE)
and added to FATE_LAVF_VIDEO_SCALE so they inherit the requirement for
CONFIG_SCALE_FILTER that lavf_video's -auto_conversion_filters needs.

Reference files were generated from the actual encoder output and
follow the md5+size+CRC format used by the other lavf references.

Signed-off-by: Soham Kute <officialsohamkute@gmail.com>
2026-03-29 23:01:39 +00:00
Soham Kute
9bf999c24f avcodec/tests: add encoder-parser API test for H.261
Add tests/api/api-enc-parser-test.c, a generic encoder+parser round-trip
test that takes codec_name, width, and height on the command line
(defaults: h261 176 144).

Three cases are tested:

garbage - a single av_parser_parse2() call on 8 bytes with no Picture
Start Code; verifies out_size == 0 so the parser emits no spurious data.

bulk - encodes 2 frames, concatenates the raw packets, feeds the whole
buffer to a fresh parser in one call, then flushes.  Verifies that
exactly 2 non-empty frames come out and that the parser found the PSC
boundary between them.

split - the same buffer fed in two halves (chunk boundary falls inside
frame 0).  Verifies the parser still emits exactly 2 frames when input
arrives incrementally, and that the collected bytes are identical to
the bulk output (checked with memcmp).

Implementation notes: avcodec_get_supported_config() selects the pixel
format; chroma height uses AV_CEIL_RSHIFT with log2_chroma_h from
AVPixFmtDescriptor; data[1] and data[2] are checked independently so
semi-planar formats work; the encoded buffer is given
AV_INPUT_BUFFER_PADDING_SIZE zero bytes at the end; parse_stream()
skips the fed chunk if consumed==0 to prevent an infinite loop.

Two FATE entries in tests/fate/api.mak: QCIF (176x144) and CIF
(352x288), both standard H.261 resolutions.

Signed-off-by: Soham Kute <officialsohamkute@gmail.com>
2026-03-29 23:01:39 +00:00
Soham Kute
dc8183377c avutil/tests/file: replace trivial test with error-path coverage
The original test only mapped the source file and printed its content,
exercising none of the error branches in av_file_map().

Replace it with a test that maps a real file (path via argv[1] for
out-of-tree builds) and verifies it is non-empty, then calls
av_file_map() on a nonexistent file twice: once with log_offset=0 to
confirm the error is logged at AV_LOG_ERROR, and once with log_offset=1
to confirm the level is raised by one, covering the
log_level_offset_offset path in av_vlog().  A custom av_log callback
captures the emitted level independently of the global log level.
The two error cases share a single for() loop to avoid duplication.

Add a FATE entry in tests/fate/libavutil.mak with CMP=null since
there is no fixed stdout to compare.

Signed-off-by: Soham Kute <officialsohamkute@gmail.com>
2026-03-29 23:01:39 +00:00
Niklas Haas
f6a2d41fe2 swscale/ops: keep track of correct dither min/max
Mostly, this just affects the metadata in benign ways, e.g.:

 rgb24 -> yuv444p:
   [ u8 +++X] SWS_OP_READ         : 3 elem(s) packed >> 0
     min: {0, 0, 0, _}, max: {255, 255, 255, _}
   [ u8 +++X] SWS_OP_CONVERT      : u8 -> f32
     min: {0, 0, 0, _}, max: {255, 255, 255, _}
   [f32 ...X] SWS_OP_LINEAR       : matrix3+off3 [...]
     min: {16, 16, 16, _}, max: {235, 240, 240, _}
   [f32 ...X] SWS_OP_DITHER       : 16x16 matrix + {0 3 2 -1}
-    min: {33/2, 33/2, 33/2, _}, max: {471/2, 481/2, 481/2, _}
+    min: {16.001953, 16.001953, 16.001953, _}, max: {235.998047, 240.998047, 240.998047, _}
   [f32 +++X] SWS_OP_CONVERT      : f32 -> u8
     min: {16, 16, 16, _}, max: {235, 240, 240, _}
   [ u8 XXXX] SWS_OP_WRITE        : 3 elem(s) planar >> 0
     (X = unused, z = byteswapped, + = exact, 0 = zero)

However, it surprisingly actually includes a semantic change, whenever
converting from limited range to monob or monow:

 yuv444p -> monow:
   [ u8 +XXX] SWS_OP_READ         : 1 elem(s) planar >> 0
     min: {0, _, _, _}, max: {255, _, _, _}
   [ u8 +XXX] SWS_OP_CONVERT      : u8 -> f32
     min: {0, _, _, _}, max: {255, _, _, _}
   [f32 .XXX] SWS_OP_LINEAR       : luma [...]
     min: {-20/219, _, _, _}, max: {235/219, _, _, _}
   [f32 .XXX] SWS_OP_DITHER       : 16x16 matrix + {0 -1 -1 -1}
-    min: {179/438, _, _, _}, max: {689/438, _, _, _}
+    min: {-0.089371, _, _, _}, max: {2.071106, _, _, _}
+  [f32 .XXX] SWS_OP_MAX          : {0 0 0 0} <= x
+    min: {0, _, _, _}, max: {2.071106, _, _, _}
   [f32 .XXX] SWS_OP_MIN          : x <= {1 _ _ _}
-    min: {179/438, _, _, _}, max: {1, _, _, _}
+    min: {0, _, _, _}, max: {1, _, _, _}
   [f32 +XXX] SWS_OP_CONVERT      : f32 -> u8
     min: {0, _, _, _}, max: {1, _, _, _}
   [ u8 XXXX] SWS_OP_WRITE        : 1 elem(s) planar >> 3
     (X = unused, z = byteswapped, + = exact, 0 = zero)

Note the presence of an extra SWS_OP_MAX, to correctly clamp sub-blacks
(values below 16) to 0.0, rather than underflowing. This was previously
undetected because the dither was modelled as adding 0.5 to every pixel value,
but that's only true on average - not always.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 12:13:11 +02:00
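
Illustration for f6a2d41fe2 (not part of the commit): a minimal sketch of why the tracked range changes. It assumes the 16x16 dither matrix holds the values (k + 0.5) / 256 for k = 0..255, which is consistent with the printed bounds (16 + 1/512 and 235 + 511/512) but is not taken from the swscale source. The Range type and both helper names are hypothetical.

```c
/* Hypothetical value-range pair; not an swscale type. */
typedef struct Range {
    double min, max;
} Range;

/* Old model: the dither is treated as adding 0.5 to every pixel. */
static Range dither_range_avg(Range in)
{
    return (Range){ in.min + 0.5, in.max + 0.5 };
}

/* New model: track the smallest and largest dither value separately.
 * Assumption: a (1 << size_log2) square matrix with entries
 * (k + 0.5) / n for k = 0..n-1, where n is the number of entries. */
static Range dither_range_exact(Range in, int size_log2)
{
    const int    n  = 1 << (2 * size_log2);
    const double lo = 0.5 / n;        /* 1/512 for a 16x16 matrix   */
    const double hi = (n - 0.5) / n;  /* 511/512 for a 16x16 matrix */
    return (Range){ in.min + lo, in.max + hi };
}
```

With the limited-range luma input {-20/219, 235/219} from the yuv444p -> monow listing, the averaged model yields a positive minimum (179/438), so no lower clamp appears necessary; the exact model yields a negative minimum, which is what makes the extra SWS_OP_MAX necessary.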
Niklas Haas
e8f6c9dbf2 swscale/ops: only print SWS_OP_SCALE denom if not 1
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 09:39:09 +00:00
Niklas Haas
804041045e swscale/ops: remove redundant unused mask from ops printout
This is now fully redundant with the previous op's output, because unused
components are always marked as garbage on the input side.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 09:39:09 +00:00
Niklas Haas
6fb0efb35c swscale/ops: strip value range from garbage components
Just removes the unnecessary value range after the WRITE, as a result of
the previous change and the fact that we already skipped printing these for
unused components.

 rgb24 -> bgr24:
   [ u8 XXXX -> +++X] SWS_OP_READ         : 3 elem(s) packed >> 0
     min: {0 0 0 _}, max: {255 255 255 _}
   [ u8 ...X -> +++X] SWS_OP_SWIZZLE      : 2103
     min: {0 0 0 _}, max: {255 255 255 _}
   [ u8 ...X -> XXXX] SWS_OP_WRITE        : 3 elem(s) packed >> 0
-    min: {0 0 0 _}, max: {255 255 255 _}
     (X = unused, z = byteswapped, + = exact, 0 = zero)

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 09:39:09 +00:00
Niklas Haas
7d94d9fc52 swscale/ops: mark all unused components as GARBAGE
This only affects the print-out of the SWS_OP_WRITE at the end of every op
list, because the ops list print-out was otherwise already checking the unused
mask.

 rgb24 -> bgr24:
   [ u8 XXXX -> +++X] SWS_OP_READ         : 3 elem(s) packed >> 0
     min: {0 0 0 _}, max: {255 255 255 _}
   [ u8 ...X -> +++X] SWS_OP_SWIZZLE      : 2103
     min: {0 0 0 _}, max: {255 255 255 _}
-  [ u8 ...X -> +++X] SWS_OP_WRITE        : 3 elem(s) packed >> 0
+  [ u8 ...X -> XXXX] SWS_OP_WRITE        : 3 elem(s) packed >> 0
     min: {0 0 0 _}, max: {255 255 255 _}
     (X = unused, z = byteswapped, + = exact, 0 = zero)

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 09:39:09 +00:00
Niklas Haas
d8b82c1097 tests/checkasm/sw_ops: add tests for SWS_OP_FILTER_H/V
These tests check that the (fused) read+filter ops work.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas
0402ecc270 tests/checkasm/sw_ops: set value range on op list input
May allow more efficient implementations that rely on the value range being
constrained.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas
43242e8a88 tests/checkasm/sw_ops: increase line count
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas
8fae195395 swscale/ops: avoid printing values for ignored components
Makes the list output a tiny bit tidier. This is cheap to support now thanks
to the print_q4() helper.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 16:48:13 +00:00
Niklas Haas
0d54a1b53a swscale/ops: remove , from comp min/max print-out for consistency
Interferes with an upcoming simplification, otherwise.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 16:48:13 +00:00
Niklas Haas
95e6c68707 swscale/ops: print exact constant on SWS_OP_SCALE
More informative.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 16:48:13 +00:00
Andreas Rheinhardt
e4e5beb394 tests/checkasm/sbcdsp: Add test for calc_scalefactors
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
Andreas Rheinhardt
cd886bf0a5 avcodec/x86/sbcdsp: Port ff_sbc_analyze_[48]_mmx to SSE2
Halves the number of pmaddwd instructions and improves performance a lot:
sbc_analyze_4_c:                                        55.7 ( 1.00x)
sbc_analyze_4_mmx:                                       7.0 ( 7.94x)
sbc_analyze_4_sse2:                                      4.3 (12.93x)
sbc_analyze_8_c:                                       131.1 ( 1.00x)
sbc_analyze_8_mmx:                                      22.4 ( 5.84x)
sbc_analyze_8_sse2:                                     10.7 (12.25x)

It also saves 224B of .text and allows removing the emms_c()
from sbcenc.c (note that ff_sbc_calc_scalefactors_mmx()
issues emms on its own, so it already abides by the ABI).

Hint: A pshufd could be avoided per function if the constants
were reordered.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
Andreas Rheinhardt
7cf5e90586 tests/checkasm: Add sbcdsp tests
Only sbc_analyze_4 and sbc_analyze_8 for now.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
Andreas Rheinhardt
af45345f7e tests/fate: Add SBC tests
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
marcos ashton
5d70f0844c libavutil/stereo3d: fix prefix matching in *_from_name() functions
The three *_from_name() functions used av_strstart() for prefix matching,
which returns incorrect results when one name is a prefix of another.

av_stereo3d_from_name("side by side (quincunx subsampling)") matched
"side by side" at index 1 and returned AV_STEREO3D_SIDEBYSIDE instead of
AV_STEREO3D_SIDEBYSIDE_QUINCUNX. Similarly,
av_stereo3d_primary_eye_from_name("nonexistent") matched "none" and
returned AV_PRIMARY_EYE_NONE instead of -1.

Switch all three functions from av_strstart() to strcmp() for exact
matching. No in-tree callers rely on prefix matching.

Signed-off-by: marcos ashton <marcosashiglesias@gmail.com>
2026-03-25 01:32:20 +00:00
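
Illustration for 5d70f0844c (not part of the commit): a minimal sketch of how a first-match prefix lookup picks the wrong entry when one name is a prefix of another. The name table and both lookup functions are hypothetical stand-ins; they do not reproduce the libavutil table or call av_strstart().

```c
#include <string.h>

/* Hypothetical name table for illustration only. */
static const char *const type_names[] = {
    "2D",
    "side by side",
    "top and bottom",
    "side by side (quincunx subsampling)",
};
#define NB_NAMES (sizeof(type_names) / sizeof(type_names[0]))

/* Prefix lookup: returns the first index whose table entry is a
 * prefix of `name`, or -1 if there is none. */
static int from_name_prefix(const char *name)
{
    for (size_t i = 0; i < NB_NAMES; i++)
        if (!strncmp(name, type_names[i], strlen(type_names[i])))
            return (int)i;
    return -1;
}

/* Exact lookup: returns the index whose table entry equals `name`,
 * or -1 if there is none. */
static int from_name_exact(const char *name)
{
    for (size_t i = 0; i < NB_NAMES; i++)
        if (!strcmp(name, type_names[i]))
            return (int)i;
    return -1;
}
```

In this sketch, from_name_prefix("side by side (quincunx subsampling)") stops at index 1 ("side by side"), while from_name_exact() reaches index 3.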
Romain Beauxis
053fb462d8 tests/fate/ogg-*.mak: Make sure that copy tests do not run when
$(FFMPEG) is not compiled.
2026-03-23 10:53:33 -05:00
James Almer
711b1a52bd avformat/movenc: check if a packet is to be discarded when calculating edit list durations
Demuxers like mov will export packets not meant for presentation (e.g. because
an edit list doesn't include them) by flagging them as discard, but the mov
muxer completely ignored this, resulting in output edit lists considering every
packet.

Fixes issue #22552

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-21 23:35:39 -03:00
marcos ashton
924cc51ffe tests/fate/pcm: add FATE tests for pcm_bluray encoder and decoder
Add enc_dec_pcm roundtrip tests for the pcm_bluray codec covering
mono, stereo, 5.1, 7.0, and 7.1 channel layouts in s16. The 5.1
and 7.0 tests use an explicit pan filter for channel layout
conversion so the PAN_FILTER dependency is declared only where
needed. An additional s32 test uses a FATE sample file with real
>16-bit content (divertimenti_2ch_96kHz_s24.wav) and decodes to
s32le to verify the full 32-bit round-trip.

enc_dec_pcm is used instead of transcode because the MPEGTS muxer
produces different binary output on 32-bit and 64-bit platforms,
causing the intermediate file checksum to fail on 32-bit CI.

Coverage for libavcodec/pcm-bluray.c: 0.00% -> 93.75%
Coverage for libavcodec/pcm-blurayenc.c: 0.00% -> 91.71%

Signed-off-by: marcos ashton <marcosashiglesias@gmail.com>
2026-03-21 01:04:20 +00:00
marcos ashton
345071f747 tests/fate/libavutil: add FATE test for stereo3d
Add a unit test covering av_stereo3d_alloc, av_stereo3d_alloc_size,
av_stereo3d_create_side_data, av_stereo3d_type_name,
av_stereo3d_from_name, av_stereo3d_view_name,
av_stereo3d_view_from_name, and av_stereo3d_primary_eye_name.
The from_name calls are driven by a static name table so each
string appears exactly once. Round-trip inverse checks verify
that type_name/from_name and view_name/view_from_name are
consistent with each other.

Coverage for libavutil/stereo3d.c: 0.00% -> 100.00%

Signed-off-by: marcos ashton <marcosashiglesias@gmail.com>
2026-03-21 01:04:20 +00:00
marcos ashton
ed19c181c2 tests/fate/libavutil: add FATE test for film_grain_params
Add a unit test covering alloc, create_side_data, and select
for AV1 and H.274 film grain parameter types (22 cases).

Coverage for libavutil/film_grain_params.c: 0.00% -> 97.73%

Signed-off-by: marcos ashton <marcosashiglesias@gmail.com>
2026-03-21 01:04:20 +00:00
Andreas Rheinhardt
b33d1d1ba2 avcodec/x86/mpeg4videodsp: Add gmc_ssse3
It beats MMX by a lot, because it has to process eight words.
Also notice that the MMX code expects registers to be preserved
between separate inline assembly blocks which is not guaranteed;
the new code meanwhile does not presume this.

Benchmarks:
gmc_c:                                                 817.8 ( 1.00x)
gmc_mmx:                                               210.7 ( 3.88x)
gmc_ssse3:                                              80.7 (10.14x)

The MMX version has been removed.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-19 14:44:37 +01:00
Andreas Rheinhardt
338316f0a3 tests/checkasm: Add test for mpeg4videodsp
It already uncovered a bug in the MMX version of gmc.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-19 14:44:30 +01:00
Andreas Rheinhardt
42ebefbd98 tests/checkasm/rv34dsp: Don't use unnecessarily large buffers
RV34 uses 4x4 blocks.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-18 18:05:06 +01:00
Andreas Rheinhardt
c90cf2aa1f avcodec/x86/rv34dsp: Port ff_rv34_idct_dc_noround_mmxext to sse2
No change in benchmarks here.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-18 18:04:44 +01:00
Zhao Zhili
9800032722 checkasm/aarch64: fix operator precedence bug in ARG_STACK
The expression ((8*(MAX_ARGS - 8) + 15) & ~15 + 16)
evaluates to zero on Apple platforms due to assembler operator
precedence differences. LLVM's integrated assembler uses different
precedence rules depending on the target:

unsigned AsmParser::getBinOpPrecedence(AsmToken::TokenKind K,
                                       MCBinaryExpr::Opcode &Kind) {
  bool ShouldUseLogicalShr = MAI.shouldUseLogicalShr();
  return IsDarwin ? getDarwinBinOpPrecedence(K, Kind, ShouldUseLogicalShr)
                  : getGNUBinOpPrecedence(MAI, K, Kind, ShouldUseLogicalShr);
}

In Darwin mode (Apple targets), arithmetic operators (+, -) have
higher precedence than bitwise operators (&, |, ^), similar to C.
In GNU mode (ELF targets), bitwise operators have higher precedence
than arithmetic operators.
2026-03-18 13:48:18 +00:00
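
Illustration for 9800032722 (not part of the commit): a minimal sketch that spells out the two groupings with explicit parentheses. MAX_ARGS = 15 is an assumed value chosen for the example, and both function names are hypothetical.

```c
/* Assumed value for illustration only. */
enum { MAX_ARGS = 15 };

/* Darwin-style (C-like) precedence: '+' binds tighter than '&',
 * so the mask becomes (~15 + 16), which is 0. */
static int arg_stack_darwin(void)
{
    return (8 * (MAX_ARGS - 8) + 15) & (~15 + 16);
}

/* GNU-style precedence: '&' binds tighter than '+',
 * so the size is rounded up to 16 first and 16 is added afterwards. */
static int arg_stack_gnu(void)
{
    return ((8 * (MAX_ARGS - 8) + 15) & ~15) + 16;
}
```

Under the assumed MAX_ARGS, the GNU grouping gives 80 and the Darwin grouping gives 0; the Darwin result is 0 for any MAX_ARGS, because the mask itself is 0.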
James Almer
d1431d3f50 avcodec/bsf/extract_extradata: write correct length start codes for h26x
The specification for H.26{4,5,6} states that start codes may be three or four
bytes long except for the first NALU in an AU, and for NALUs of parameter
set types, which must be four bytes long.
This is checked by ff_cbs_h2645_unit_requires_zero_byte(), which is made
available outside of CBS for this change.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-15 19:20:06 -03:00
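
Illustration for d1431d3f50 (not part of the commit): a minimal sketch of writing a three- or four-byte Annex B start code. The helper is hypothetical; its needs_zero_byte argument stands in for the result of a check such as ff_cbs_h2645_unit_requires_zero_byte(), whose real signature is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes 00 00 01, preceded by an extra 00 when
 * needs_zero_byte is nonzero. dst must have room for 4 bytes.
 * Returns the number of bytes written. */
static size_t write_start_code(uint8_t *dst, int needs_zero_byte)
{
    size_t n = 0;

    if (needs_zero_byte)
        dst[n++] = 0x00; /* extra leading zero byte */
    dst[n++] = 0x00;
    dst[n++] = 0x00;
    dst[n++] = 0x01;
    return n;
}
```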
James Almer
6bc257e292 avformat/nal: remove trailing zeroes from NALUs
Based on the behaviour from cbs_h2645, which removes actual
trailing_zero_8bits bytes and possibly also works around issues in
ff_h2645_extract_rbsp(). In this case, the same issue could be
present in ff_nal_find_startcode().

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-15 19:20:06 -03:00
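
Illustration for 6bc257e292 (not part of the commit): a minimal sketch of dropping trailing zero bytes from a NALU by shrinking its size. The helper is hypothetical and is not the code added to avformat/nal.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the NALU size with trailing zero
 * bytes excluded. The data itself is not modified. */
static size_t strip_trailing_zeros(const uint8_t *data, size_t size)
{
    while (size > 0 && data[size - 1] == 0)
        size--;
    return size;
}
```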
Andreas Rheinhardt
59b119023f avcodec/apv_dsp: Remove dead 8 bit code
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-14 19:31:45 +01:00