FFmpeg

mirror of https://mirror.skon.top/https://github.com/FFmpeg/FFmpeg synced 2026-04-20 21:00:41 +08:00

Author	SHA1	Message	Date
Michael Niedermayer	f81d6479ec	tools/target_dec_fuzzer: Adjust threshold for MPC8 Fixes: Timeout Fixes: 471587345/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_MPC8_fuzzer-4824233864921088 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2026-02-23 23:15:19 +01:00
Michael Niedermayer	c8b57f0a1e	tools/target_dec_fuzzer: Adjust threshold for BFI Fixes: timeout Fixes: 471606773/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_BFI_fuzzer-6707440390569984 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2026-02-23 23:14:44 +01:00
Michael Niedermayer	4446dfb0e3	avcodec/flashsv: Check for input space before (re)allocating frame Fixes: Timeout Fixes: 471605680/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_FLASHSV2_DEC_fuzzer-6210773459468288 Fixes: 471605920/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_FLASHSV_DEC_fuzzer-6230719287590912 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2026-02-23 22:59:44 +01:00
Michael Niedermayer	40cafc25cf	avcodec/mdec: Check input space vs minimal block size Fixes: Timeout Fixes: 481006706/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_MDEC_fuzzer-6122832651419648 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2026-02-23 22:54:38 +01:00
Michael Niedermayer	73681f888d	avcodec/h264_parser: Check remaining input length in loop in scan_mmco_reset() Fixes: read of uninitialized memory Fixes: 476177761/clusterfuzz-testcase-minimized-ffmpeg_dem_H264_fuzzer-6400884824408064 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2026-02-23 22:43:28 +01:00
Niklas Haas	b21f1b6482	tests/swscale: don't pass fake object to av_opt_eval_* This is UB, as the fake object may be used for logging. Reported-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com> Fixes: `ea791a4ef1`	2026-02-23 20:55:27 +00:00
Niklas Haas	afdb683a3f	swscale: avoid UB on interlaced frames NULL+0 is UB. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-02-23 19:39:17 +00:00
Niklas Haas	d918551650	swscale/graph: switch SwsPass.output to refstruct Allows multiple passes to share a single output buffer reference. We always allocate an output buffer so that subpasses can share the same output buffer reference while still allowing that reference to implicitly point to the final output image. Sponsored-by: Sovereign Tech Fund Signed-off-by: Niklas Haas <git@haasn.dev>	2026-02-23 19:39:17 +00:00
Niklas Haas	cc346b232d	swscale/graph: store current pass input instead of global args The global args to ff_sws_graph_run() really shouldn't matter inside thread workers. If they ever do, it indicates a leaky abstraction. The only reason it was needed in the first place was because of the way the input/output buffers implicitly defaulted to the global args. However, we can solve this much more elegantly by just calculating it in ff_sws_graph_run() directly and storing the computed SwsImg inside the execution state. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-02-23 19:39:17 +00:00
Niklas Haas	1e071c8585	swscale/graph: omit memcpy() if src and dst are identical This allows already referenced planes to be skipped, in the case of e.g. only some of the output planes being successfully referenced. Also avoids what is technically UB, if the user happens to call ff_sws_graph_run() after already having ref'd an image. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-02-23 19:39:17 +00:00
Niklas Haas	b98751b13c	swscale/graph: set up palette using current input image Using the original input image here is completely wrong - the format/palette could have been set to anything else in the meantime. At best, we would want to use the original input to add_legacy_sws_pass(), but it's impossible for this to differ from the per-pass input. The only time legacy subpasses are added is when using cascaded contexts, but in this case, the only context actually reading from the palette format would be the first one. I'm not entirely sure why this code was originally written this way, but I'm reasonably confident that it's not at all necessary. Tested extensively on both FATE, the self-test, and real-world files. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-02-23 19:39:17 +00:00
Niklas Haas	0b446cdccd	swscale/graph: switch to an AVBufferRef per plane This annoyingly requires recreating some of the logic inside av_img_alloc(), because there's no good existing current helper accessible from libswscale that gives per-plane allocations like this. The new code is based off the calculations inside libavframe/bufferpool.c. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-02-23 19:39:17 +00:00
Niklas Haas	afa08f4971	swscale/graph: duplicate buffer dimensions in SwsPassBuffer When multiple passes share a buffer reference, the true buffer dimensions may be different for each pass, depending on slice alignment. So we can't rely on the pass dimensions being representative. Instead, store this information in the SwsPassBuffer itself. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-02-23 19:39:17 +00:00
Niklas Haas	fe25e54d0f	swscale/graph: move output image into separate struct I want to add more metadata to this and also turn it into a refstruct, but get the cosmetic diff out of the way first. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-02-23 19:39:17 +00:00
Niklas Haas	18060a8820	swscale/graph: simplify ff_sws_graph_run() API There's little reason not to directly take an SwsImg here; it's already an internally visible struct. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-02-23 19:39:17 +00:00
Niklas Haas	e1fd274706	swscale/graph: check output plane pointer instead of pixel format To see if the output buffers are allocated or not. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-02-23 19:39:17 +00:00
Marvin Scholz	64fafd63f0	avformat: remove HLS protocol The use of this protocol was already discouraged and warned about for years with the recommendation to use the HLS demuxer instead.	2026-02-23 20:20:20 +01:00
Niklas Haas	ea791a4ef1	swscale/tests/swscale: parse flags from string We don't actually have an SwsContext yet at this point, so just use AV_OPT_SEARCH_FAKE_OBJ. For the actual evaluation, the signature only requires that we pass a "pointer to a struct that contains an AVClass as its first member", so passing a double pointer to the class itself is sufficient.	2026-02-23 19:23:09 +01:00
Marvin Scholz	fba9fc0c6b	lavc: wmadec: limit variable scopes Moves the loop variable declarations to the actual loops, narrowing their scopes.	2026-02-23 15:29:27 +00:00
Marvin Scholz	d219be03d6	lavc: wmadec: assert channels count This should never exceed MAX_CHANNELS, else there will be several out of bounds writes.	2026-02-23 15:29:27 +00:00
Lynne	7b15039cdb	Changelog: add changelog entry for Mps212	2026-02-23 07:57:57 +01:00
Lynne	baad75cafa	aacdec_usac: add support for parsing Mps212 (MPEG surround) This commit adds the full bitstream parsing for Mps212.	2026-02-23 07:57:57 +01:00
Lynne	86977fdb6b	aacdec_tab: add Mps212 tables To be used in the following commit.	2026-02-23 07:57:57 +01:00
Lynne	a4ab4a98c4	aacdec_tab: split up tables init	2026-02-23 07:57:57 +01:00
James Almer	40e0463113	avformat/mov: free item_name on infe entry parsing failure Fixes regression since `28c330d0f3`. Signed-off-by: James Almer <jamrial@gmail.com>	2026-02-22 23:16:15 -03:00
Michael Niedermayer	7e10579f49	avcodec/exr: fix AVERROR typo Fixes: out of array read Fixes: 485866440/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_EXR_DEC_fuzzer-4520520419966976 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2026-02-23 01:44:49 +00:00
James Almer	c3aa28f23d	avformat/mov: check for EOF in more loops Signed-off-by: James Almer <jamrial@gmail.com>	2026-02-23 00:43:50 +00:00
James Almer	28c330d0f3	avformat/mov: abort if the queried item doesn't exist instead of overwriting it The check for item presence was insufficient as it would result in the last item in the array being overwritten if it existed even if the id didn't match. Fixes: Assertion ref failed at src/libavformat/mov.c:10649 Fixes: clusterfuzz-testcase-minimized-ffmpeg_dem_MOV_fuzzer-5312542695292928 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: James Almer <jamrial@gmail.com>	2026-02-23 00:43:50 +00:00
Nariman-Sayed	9bc4109b23	avformat/tls_openssl: fix memory leak in cert_from_pem_string When PEM_read_bio_X509 fails, BIO was not freed, causing memory leak. Free BIO before returning NULL to prevent resource leak.	2026-02-22 22:39:43 +00:00
Andreas Rheinhardt	53a9a34e23	avcodec/snow: Reduce sizeof(SnowContext) Each SubBand currently contains an array of 519 uint8_t[32], yet most of these are unused: For both the decoder and the encoder, at most 34 contexts are actually used: The only variable index is context+2, where context is the result of av_log2() and therefore in the 0..31 range. There are also several accesses using compile-time indices, the highest of which is 30. FATE passes with 31 contexts and maybe these are enough, but I don't know. Reducing the number to 34 reduces sizeof(SnowContext) from 2141664B to 155104B here (on x64). Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-02-22 22:05:16 +01:00
Andreas Rheinhardt	bb92009386	avcodec/snow: Only allocate emu_edge_buffer for encoder Also allocate it during init and move it to the encoder's context. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-02-22 22:05:15 +01:00
Michael Niedermayer	c7b5f1537d	CONTRIBUTING.md: Add Forgejo Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2026-02-22 04:39:22 +00:00
Lynne	13e063ceec	vulkan/ffv1: properly initialize the linecache	2026-02-22 03:39:23 +01:00
Michael Niedermayer	99515a3342	avcodec/jpeg2000htdec: Check Lcup and Lref Fixes: use of uninitialized memory Fixes: 482494999/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_JPEG2000_DEC_fuzzer-6467586186608640 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2026-02-22 02:31:06 +00:00
Andreas Rheinhardt	6c1c1720cf	avcodec/x86/vvc/dsp_init: Mark dsp init function as av_cold Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-02-22 01:05:12 +01:00
Andreas Rheinhardt	af3f8f5bd2	avcodec/x86/vvc/of: Break dependency chain Don't extract and update one word of one and the same register at a time; use separate src and dst registers, so that pextrw and bsr can be done in parallel. Also use movd instead of pinsrw for the first word. Old benchmarks: apply_bdof_8_8x16_c: 3275.2 ( 1.00x) apply_bdof_8_8x16_avx2: 487.6 ( 6.72x) apply_bdof_8_16x8_c: 3243.1 ( 1.00x) apply_bdof_8_16x8_avx2: 284.4 (11.40x) apply_bdof_8_16x16_c: 6501.8 ( 1.00x) apply_bdof_8_16x16_avx2: 570.0 (11.41x) apply_bdof_10_8x16_c: 3286.5 ( 1.00x) apply_bdof_10_8x16_avx2: 461.7 ( 7.12x) apply_bdof_10_16x8_c: 3274.5 ( 1.00x) apply_bdof_10_16x8_avx2: 271.4 (12.06x) apply_bdof_10_16x16_c: 6590.0 ( 1.00x) apply_bdof_10_16x16_avx2: 543.9 (12.12x) apply_bdof_12_8x16_c: 3307.6 ( 1.00x) apply_bdof_12_8x16_avx2: 462.2 ( 7.16x) apply_bdof_12_16x8_c: 3287.4 ( 1.00x) apply_bdof_12_16x8_avx2: 271.8 (12.10x) apply_bdof_12_16x16_c: 6465.7 ( 1.00x) apply_bdof_12_16x16_avx2: 543.8 (11.89x) New benchmarks: apply_bdof_8_8x16_c: 3255.7 ( 1.00x) apply_bdof_8_8x16_avx2: 349.3 ( 9.32x) apply_bdof_8_16x8_c: 3262.5 ( 1.00x) apply_bdof_8_16x8_avx2: 214.8 (15.19x) apply_bdof_8_16x16_c: 6471.6 ( 1.00x) apply_bdof_8_16x16_avx2: 429.8 (15.06x) apply_bdof_10_8x16_c: 3227.7 ( 1.00x) apply_bdof_10_8x16_avx2: 321.6 (10.04x) apply_bdof_10_16x8_c: 3250.2 ( 1.00x) apply_bdof_10_16x8_avx2: 201.2 (16.16x) apply_bdof_10_16x16_c: 6476.5 ( 1.00x) apply_bdof_10_16x16_avx2: 400.9 (16.16x) apply_bdof_12_8x16_c: 3230.7 ( 1.00x) apply_bdof_12_8x16_avx2: 321.8 (10.04x) apply_bdof_12_16x8_c: 3210.5 ( 1.00x) apply_bdof_12_16x8_avx2: 200.9 (15.98x) apply_bdof_12_16x16_c: 6474.5 ( 1.00x) apply_bdof_12_16x16_avx2: 400.2 (16.18x) Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-02-22 01:05:12 +01:00
Andreas Rheinhardt	19dc7b79a4	avcodec/x86/vvc/of: Unify shuffling One can use the same shuffles for the width 8 and width 16 case if one also changes the permutation in vpermd (that always follows pshufb for width 16). This also allows loading it before checking width. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-02-22 01:03:22 +01:00
Andreas Rheinhardt	8e82416434	avcodec/x86/vvc/of: Avoid unused register Avoids a push+pop. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-02-22 01:02:20 +01:00
Andreas Rheinhardt	81fb70c833	avcodec/x86/vvc/mc,dsp_init: Avoid pointless wrappers for w_avg They only add overhead (in form of another function call, sign-extending some parameters to 64bit (although the upper bits are not used at all) and rederiving the actual number of bits (from the maximum value (1<<bpp)-1)). Old benchmarks: w_avg_8_2x2_c: 16.4 ( 1.00x) w_avg_8_2x2_avx2: 12.9 ( 1.27x) w_avg_8_4x4_c: 48.0 ( 1.00x) w_avg_8_4x4_avx2: 14.9 ( 3.23x) w_avg_8_8x8_c: 168.2 ( 1.00x) w_avg_8_8x8_avx2: 22.4 ( 7.49x) w_avg_8_16x16_c: 396.5 ( 1.00x) w_avg_8_16x16_avx2: 47.9 ( 8.28x) w_avg_8_32x32_c: 1466.3 ( 1.00x) w_avg_8_32x32_avx2: 172.8 ( 8.48x) w_avg_8_64x64_c: 5629.3 ( 1.00x) w_avg_8_64x64_avx2: 678.7 ( 8.29x) w_avg_8_128x128_c: 22122.4 ( 1.00x) w_avg_8_128x128_avx2: 2743.5 ( 8.06x) w_avg_10_2x2_c: 18.7 ( 1.00x) w_avg_10_2x2_avx2: 13.1 ( 1.43x) w_avg_10_4x4_c: 50.3 ( 1.00x) w_avg_10_4x4_avx2: 15.9 ( 3.17x) w_avg_10_8x8_c: 109.3 ( 1.00x) w_avg_10_8x8_avx2: 20.6 ( 5.30x) w_avg_10_16x16_c: 395.5 ( 1.00x) w_avg_10_16x16_avx2: 44.8 ( 8.83x) w_avg_10_32x32_c: 1534.2 ( 1.00x) w_avg_10_32x32_avx2: 141.4 (10.85x) w_avg_10_64x64_c: 6003.6 ( 1.00x) w_avg_10_64x64_avx2: 557.4 (10.77x) w_avg_10_128x128_c: 23722.7 ( 1.00x) w_avg_10_128x128_avx2: 2205.0 (10.76x) w_avg_12_2x2_c: 18.6 ( 1.00x) w_avg_12_2x2_avx2: 13.1 ( 1.42x) w_avg_12_4x4_c: 52.2 ( 1.00x) w_avg_12_4x4_avx2: 16.1 ( 3.24x) w_avg_12_8x8_c: 109.2 ( 1.00x) w_avg_12_8x8_avx2: 20.6 ( 5.29x) w_avg_12_16x16_c: 396.1 ( 1.00x) w_avg_12_16x16_avx2: 45.0 ( 8.81x) w_avg_12_32x32_c: 1532.6 ( 1.00x) w_avg_12_32x32_avx2: 142.1 (10.79x) w_avg_12_64x64_c: 6002.2 ( 1.00x) w_avg_12_64x64_avx2: 557.3 (10.77x) w_avg_12_128x128_c: 23748.7 ( 1.00x) w_avg_12_128x128_avx2: 2206.4 (10.76x) New benchmarks: w_avg_8_2x2_c: 16.0 ( 1.00x) w_avg_8_2x2_avx2: 9.3 ( 1.71x) w_avg_8_4x4_c: 48.4 ( 1.00x) w_avg_8_4x4_avx2: 12.4 ( 3.91x) w_avg_8_8x8_c: 168.7 ( 1.00x) w_avg_8_8x8_avx2: 21.1 ( 8.00x) w_avg_8_16x16_c: 394.5 ( 1.00x) w_avg_8_16x16_avx2: 46.2 ( 
8.54x) w_avg_8_32x32_c: 1456.3 ( 1.00x) w_avg_8_32x32_avx2: 171.8 ( 8.48x) w_avg_8_64x64_c: 5636.2 ( 1.00x) w_avg_8_64x64_avx2: 676.9 ( 8.33x) w_avg_8_128x128_c: 22129.1 ( 1.00x) w_avg_8_128x128_avx2: 2734.3 ( 8.09x) w_avg_10_2x2_c: 18.7 ( 1.00x) w_avg_10_2x2_avx2: 10.3 ( 1.82x) w_avg_10_4x4_c: 50.8 ( 1.00x) w_avg_10_4x4_avx2: 13.4 ( 3.79x) w_avg_10_8x8_c: 109.7 ( 1.00x) w_avg_10_8x8_avx2: 20.4 ( 5.38x) w_avg_10_16x16_c: 395.2 ( 1.00x) w_avg_10_16x16_avx2: 41.7 ( 9.48x) w_avg_10_32x32_c: 1535.6 ( 1.00x) w_avg_10_32x32_avx2: 137.9 (11.13x) w_avg_10_64x64_c: 6002.1 ( 1.00x) w_avg_10_64x64_avx2: 548.5 (10.94x) w_avg_10_128x128_c: 23742.7 ( 1.00x) w_avg_10_128x128_avx2: 2179.8 (10.89x) w_avg_12_2x2_c: 18.9 ( 1.00x) w_avg_12_2x2_avx2: 10.3 ( 1.84x) w_avg_12_4x4_c: 52.4 ( 1.00x) w_avg_12_4x4_avx2: 13.4 ( 3.91x) w_avg_12_8x8_c: 109.2 ( 1.00x) w_avg_12_8x8_avx2: 20.3 ( 5.39x) w_avg_12_16x16_c: 396.3 ( 1.00x) w_avg_12_16x16_avx2: 41.7 ( 9.51x) w_avg_12_32x32_c: 1532.6 ( 1.00x) w_avg_12_32x32_avx2: 138.6 (11.06x) w_avg_12_64x64_c: 5996.7 ( 1.00x) w_avg_12_64x64_avx2: 549.6 (10.91x) w_avg_12_128x128_c: 23738.0 ( 1.00x) w_avg_12_128x128_avx2: 2177.2 (10.90x) Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-02-22 01:01:27 +01:00
Andreas Rheinhardt	ea78402e9c	avcodec/x86/vvc/mc,dsp_init: Avoid pointless wrappers for avg Up until now, there were two averaging assembly functions, one for eight bit content and one for <=16 bit content; there are also three C-wrappers around these functions, for 8, 10 and 12 bpp. These wrappers simply forward the maximum permissible value (i.e. (1<<bpp)-1) and promote some integer values to ptrdiff_t. Yet these wrappers are absolutely useless: The assembly functions rederive the bpp from the maximum and only the integer part of the promoted ptrdiff_t values is ever used. Of course, these wrappers also entail an additional call (not a tail call, because the additional maximum parameter is passed on the stack). Remove the wrappers and add per-bpp assembly functions instead. Given that the only difference between 10 and 12 bits are some constants in registers, the main part of these functions can be shared (given that this code uses a jumptable, it can even be done without adding any additional jump). 
Old benchmarks: avg_8_2x2_c: 11.4 ( 1.00x) avg_8_2x2_avx2: 7.9 ( 1.44x) avg_8_4x4_c: 30.7 ( 1.00x) avg_8_4x4_avx2: 10.4 ( 2.95x) avg_8_8x8_c: 134.5 ( 1.00x) avg_8_8x8_avx2: 16.6 ( 8.12x) avg_8_16x16_c: 255.6 ( 1.00x) avg_8_16x16_avx2: 28.2 ( 9.07x) avg_8_32x32_c: 897.7 ( 1.00x) avg_8_32x32_avx2: 83.9 (10.70x) avg_8_64x64_c: 3320.0 ( 1.00x) avg_8_64x64_avx2: 321.1 (10.34x) avg_8_128x128_c: 12981.8 ( 1.00x) avg_8_128x128_avx2: 1480.1 ( 8.77x) avg_10_2x2_c: 12.0 ( 1.00x) avg_10_2x2_avx2: 8.4 ( 1.43x) avg_10_4x4_c: 34.9 ( 1.00x) avg_10_4x4_avx2: 9.8 ( 3.56x) avg_10_8x8_c: 76.8 ( 1.00x) avg_10_8x8_avx2: 15.1 ( 5.08x) avg_10_16x16_c: 256.6 ( 1.00x) avg_10_16x16_avx2: 25.1 (10.20x) avg_10_32x32_c: 932.9 ( 1.00x) avg_10_32x32_avx2: 73.4 (12.72x) avg_10_64x64_c: 3517.9 ( 1.00x) avg_10_64x64_avx2: 414.8 ( 8.48x) avg_10_128x128_c: 13695.3 ( 1.00x) avg_10_128x128_avx2: 1648.1 ( 8.31x) avg_12_2x2_c: 13.1 ( 1.00x) avg_12_2x2_avx2: 8.6 ( 1.53x) avg_12_4x4_c: 35.4 ( 1.00x) avg_12_4x4_avx2: 10.1 ( 3.49x) avg_12_8x8_c: 76.6 ( 1.00x) avg_12_8x8_avx2: 16.7 ( 4.60x) avg_12_16x16_c: 256.6 ( 1.00x) avg_12_16x16_avx2: 25.5 (10.07x) avg_12_32x32_c: 933.2 ( 1.00x) avg_12_32x32_avx2: 75.7 (12.34x) avg_12_64x64_c: 3519.1 ( 1.00x) avg_12_64x64_avx2: 416.8 ( 8.44x) avg_12_128x128_c: 13695.1 ( 1.00x) avg_12_128x128_avx2: 1651.6 ( 8.29x) New benchmarks: avg_8_2x2_c: 11.5 ( 1.00x) avg_8_2x2_avx2: 6.0 ( 1.91x) avg_8_4x4_c: 29.7 ( 1.00x) avg_8_4x4_avx2: 8.0 ( 3.72x) avg_8_8x8_c: 131.4 ( 1.00x) avg_8_8x8_avx2: 12.2 (10.74x) avg_8_16x16_c: 254.3 ( 1.00x) avg_8_16x16_avx2: 24.8 (10.25x) avg_8_32x32_c: 897.7 ( 1.00x) avg_8_32x32_avx2: 77.8 (11.54x) avg_8_64x64_c: 3321.3 ( 1.00x) avg_8_64x64_avx2: 318.7 (10.42x) avg_8_128x128_c: 12988.4 ( 1.00x) avg_8_128x128_avx2: 1430.1 ( 9.08x) avg_10_2x2_c: 12.1 ( 1.00x) avg_10_2x2_avx2: 5.7 ( 2.13x) avg_10_4x4_c: 35.0 ( 1.00x) avg_10_4x4_avx2: 9.0 ( 3.88x) avg_10_8x8_c: 77.2 ( 1.00x) avg_10_8x8_avx2: 12.4 ( 6.24x) avg_10_16x16_c: 256.2 ( 1.00x) avg_10_16x16_avx2: 
24.3 (10.56x) avg_10_32x32_c: 932.9 ( 1.00x) avg_10_32x32_avx2: 71.9 (12.97x) avg_10_64x64_c: 3516.8 ( 1.00x) avg_10_64x64_avx2: 414.7 ( 8.48x) avg_10_128x128_c: 13693.7 ( 1.00x) avg_10_128x128_avx2: 1609.3 ( 8.51x) avg_12_2x2_c: 14.1 ( 1.00x) avg_12_2x2_avx2: 5.7 ( 2.48x) avg_12_4x4_c: 35.8 ( 1.00x) avg_12_4x4_avx2: 9.0 ( 3.96x) avg_12_8x8_c: 76.9 ( 1.00x) avg_12_8x8_avx2: 12.4 ( 6.22x) avg_12_16x16_c: 256.5 ( 1.00x) avg_12_16x16_avx2: 24.4 (10.50x) avg_12_32x32_c: 934.1 ( 1.00x) avg_12_32x32_avx2: 72.0 (12.97x) avg_12_64x64_c: 3518.2 ( 1.00x) avg_12_64x64_avx2: 414.8 ( 8.48x) avg_12_128x128_c: 13689.5 ( 1.00x) avg_12_128x128_avx2: 1611.1 ( 8.50x) Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-02-22 00:58:33 +01:00
Andreas Rheinhardt	5a60b3f1a6	avcodec/x86/vvc/mc: Remove always-false branches The C versions of the average and weighted average functions contain "FFMAX(3, 15 - BIT_DEPTH)" and the code here followed this; yet it is only instantiated for bit depths 8, 10 and 12, for which the above is just 15-BIT_DEPTH. So the comparisons are unnecessary. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-02-22 00:57:56 +01:00
Andreas Rheinhardt	59f8ff4c18	avcodec/x86/vvc/mc: Remove unused constants Also avoid overaligning .rodata. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-02-22 00:57:56 +01:00
Andreas Rheinhardt	eabf52e787	avcodec/x86/vvc/mc: Avoid unused work The high quadword of these registers is zero for width 2. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-02-22 00:57:56 +01:00
Andreas Rheinhardt	9317fb2b2e	avcodec/x86/vvc/mc: Avoid ymm registers where possible Widths 2 and 4 fit into xmm registers. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-02-22 00:57:56 +01:00
Andreas Rheinhardt	caa0ae0cfb	avcodec/x86/vvc/mc: Avoid pextr[dq], v{insert,extract}i128 Use mov[dq], movdqu instead if the least significant parts are set (i.e. if the immediate value is 0x0). Old benchmarks: avg_8_2x2_c: 11.3 ( 1.00x) avg_8_2x2_avx2: 7.5 ( 1.50x) avg_8_4x4_c: 31.2 ( 1.00x) avg_8_4x4_avx2: 10.7 ( 2.91x) avg_8_8x8_c: 133.5 ( 1.00x) avg_8_8x8_avx2: 21.2 ( 6.30x) avg_8_16x16_c: 254.7 ( 1.00x) avg_8_16x16_avx2: 30.1 ( 8.46x) avg_8_32x32_c: 896.9 ( 1.00x) avg_8_32x32_avx2: 103.9 ( 8.63x) avg_8_64x64_c: 3320.7 ( 1.00x) avg_8_64x64_avx2: 539.4 ( 6.16x) avg_8_128x128_c: 12991.5 ( 1.00x) avg_8_128x128_avx2: 1661.3 ( 7.82x) avg_10_2x2_c: 21.3 ( 1.00x) avg_10_2x2_avx2: 8.3 ( 2.55x) avg_10_4x4_c: 34.9 ( 1.00x) avg_10_4x4_avx2: 10.6 ( 3.28x) avg_10_8x8_c: 76.3 ( 1.00x) avg_10_8x8_avx2: 20.2 ( 3.77x) avg_10_16x16_c: 255.9 ( 1.00x) avg_10_16x16_avx2: 24.1 (10.60x) avg_10_32x32_c: 932.4 ( 1.00x) avg_10_32x32_avx2: 73.3 (12.72x) avg_10_64x64_c: 3516.4 ( 1.00x) avg_10_64x64_avx2: 601.7 ( 5.84x) avg_10_128x128_c: 13690.6 ( 1.00x) avg_10_128x128_avx2: 1613.2 ( 8.49x) avg_12_2x2_c: 14.0 ( 1.00x) avg_12_2x2_avx2: 8.3 ( 1.67x) avg_12_4x4_c: 35.3 ( 1.00x) avg_12_4x4_avx2: 10.9 ( 3.26x) avg_12_8x8_c: 76.5 ( 1.00x) avg_12_8x8_avx2: 20.3 ( 3.77x) avg_12_16x16_c: 256.7 ( 1.00x) avg_12_16x16_avx2: 24.1 (10.63x) avg_12_32x32_c: 932.5 ( 1.00x) avg_12_32x32_avx2: 73.3 (12.72x) avg_12_64x64_c: 3520.5 ( 1.00x) avg_12_64x64_avx2: 602.6 ( 5.84x) avg_12_128x128_c: 13689.6 ( 1.00x) avg_12_128x128_avx2: 1613.1 ( 8.49x) w_avg_8_2x2_c: 16.7 ( 1.00x) w_avg_8_2x2_avx2: 13.4 ( 1.25x) w_avg_8_4x4_c: 44.5 ( 1.00x) w_avg_8_4x4_avx2: 15.9 ( 2.81x) w_avg_8_8x8_c: 166.1 ( 1.00x) w_avg_8_8x8_avx2: 45.7 ( 3.63x) w_avg_8_16x16_c: 392.9 ( 1.00x) w_avg_8_16x16_avx2: 57.8 ( 6.80x) w_avg_8_32x32_c: 1455.5 ( 1.00x) w_avg_8_32x32_avx2: 215.0 ( 6.77x) w_avg_8_64x64_c: 5621.8 ( 1.00x) w_avg_8_64x64_avx2: 875.2 ( 6.42x) w_avg_8_128x128_c: 22131.3 ( 1.00x) w_avg_8_128x128_avx2: 3390.1 ( 6.53x) w_avg_10_2x2_c: 
18.0 ( 1.00x) w_avg_10_2x2_avx2: 14.0 ( 1.28x) w_avg_10_4x4_c: 53.9 ( 1.00x) w_avg_10_4x4_avx2: 15.9 ( 3.40x) w_avg_10_8x8_c: 109.5 ( 1.00x) w_avg_10_8x8_avx2: 40.4 ( 2.71x) w_avg_10_16x16_c: 395.7 ( 1.00x) w_avg_10_16x16_avx2: 44.7 ( 8.86x) w_avg_10_32x32_c: 1532.7 ( 1.00x) w_avg_10_32x32_avx2: 142.4 (10.77x) w_avg_10_64x64_c: 6007.7 ( 1.00x) w_avg_10_64x64_avx2: 745.5 ( 8.06x) w_avg_10_128x128_c: 23719.7 ( 1.00x) w_avg_10_128x128_avx2: 2217.7 (10.70x) w_avg_12_2x2_c: 18.9 ( 1.00x) w_avg_12_2x2_avx2: 13.6 ( 1.38x) w_avg_12_4x4_c: 47.5 ( 1.00x) w_avg_12_4x4_avx2: 15.9 ( 2.99x) w_avg_12_8x8_c: 109.3 ( 1.00x) w_avg_12_8x8_avx2: 40.9 ( 2.67x) w_avg_12_16x16_c: 395.6 ( 1.00x) w_avg_12_16x16_avx2: 44.8 ( 8.84x) w_avg_12_32x32_c: 1531.0 ( 1.00x) w_avg_12_32x32_avx2: 141.8 (10.80x) w_avg_12_64x64_c: 6016.7 ( 1.00x) w_avg_12_64x64_avx2: 732.8 ( 8.21x) w_avg_12_128x128_c: 23762.2 ( 1.00x) w_avg_12_128x128_avx2: 2223.4 (10.69x) New benchmarks: avg_8_2x2_c: 11.3 ( 1.00x) avg_8_2x2_avx2: 7.6 ( 1.49x) avg_8_4x4_c: 31.2 ( 1.00x) avg_8_4x4_avx2: 10.8 ( 2.89x) avg_8_8x8_c: 131.6 ( 1.00x) avg_8_8x8_avx2: 15.6 ( 8.42x) avg_8_16x16_c: 255.3 ( 1.00x) avg_8_16x16_avx2: 27.9 ( 9.16x) avg_8_32x32_c: 897.9 ( 1.00x) avg_8_32x32_avx2: 81.2 (11.06x) avg_8_64x64_c: 3320.0 ( 1.00x) avg_8_64x64_avx2: 335.1 ( 9.91x) avg_8_128x128_c: 12999.1 ( 1.00x) avg_8_128x128_avx2: 1456.3 ( 8.93x) avg_10_2x2_c: 12.0 ( 1.00x) avg_10_2x2_avx2: 8.6 ( 1.40x) avg_10_4x4_c: 34.9 ( 1.00x) avg_10_4x4_avx2: 9.7 ( 3.61x) avg_10_8x8_c: 76.7 ( 1.00x) avg_10_8x8_avx2: 16.3 ( 4.69x) avg_10_16x16_c: 256.3 ( 1.00x) avg_10_16x16_avx2: 25.2 (10.18x) avg_10_32x32_c: 932.8 ( 1.00x) avg_10_32x32_avx2: 73.3 (12.72x) avg_10_64x64_c: 3518.8 ( 1.00x) avg_10_64x64_avx2: 416.8 ( 8.44x) avg_10_128x128_c: 13691.6 ( 1.00x) avg_10_128x128_avx2: 1612.9 ( 8.49x) avg_12_2x2_c: 14.1 ( 1.00x) avg_12_2x2_avx2: 8.7 ( 1.62x) avg_12_4x4_c: 35.7 ( 1.00x) avg_12_4x4_avx2: 9.7 ( 3.68x) avg_12_8x8_c: 77.0 ( 1.00x) avg_12_8x8_avx2: 16.9 ( 4.57x) 
avg_12_16x16_c: 256.2 ( 1.00x) avg_12_16x16_avx2: 25.7 ( 9.96x) avg_12_32x32_c: 933.5 ( 1.00x) avg_12_32x32_avx2: 74.0 (12.62x) avg_12_64x64_c: 3516.4 ( 1.00x) avg_12_64x64_avx2: 408.7 ( 8.60x) avg_12_128x128_c: 13691.6 ( 1.00x) avg_12_128x128_avx2: 1613.8 ( 8.48x) w_avg_8_2x2_c: 16.7 ( 1.00x) w_avg_8_2x2_avx2: 14.0 ( 1.19x) w_avg_8_4x4_c: 48.2 ( 1.00x) w_avg_8_4x4_avx2: 16.1 ( 3.00x) w_avg_8_8x8_c: 168.0 ( 1.00x) w_avg_8_8x8_avx2: 22.5 ( 7.47x) w_avg_8_16x16_c: 392.5 ( 1.00x) w_avg_8_16x16_avx2: 47.9 ( 8.19x) w_avg_8_32x32_c: 1453.7 ( 1.00x) w_avg_8_32x32_avx2: 176.1 ( 8.26x) w_avg_8_64x64_c: 5631.4 ( 1.00x) w_avg_8_64x64_avx2: 690.8 ( 8.15x) w_avg_8_128x128_c: 22139.5 ( 1.00x) w_avg_8_128x128_avx2: 2742.4 ( 8.07x) w_avg_10_2x2_c: 18.1 ( 1.00x) w_avg_10_2x2_avx2: 13.8 ( 1.31x) w_avg_10_4x4_c: 47.0 ( 1.00x) w_avg_10_4x4_avx2: 16.4 ( 2.87x) w_avg_10_8x8_c: 110.0 ( 1.00x) w_avg_10_8x8_avx2: 21.6 ( 5.09x) w_avg_10_16x16_c: 395.2 ( 1.00x) w_avg_10_16x16_avx2: 45.4 ( 8.71x) w_avg_10_32x32_c: 1533.8 ( 1.00x) w_avg_10_32x32_avx2: 142.6 (10.76x) w_avg_10_64x64_c: 6004.4 ( 1.00x) w_avg_10_64x64_avx2: 672.8 ( 8.92x) w_avg_10_128x128_c: 23748.5 ( 1.00x) w_avg_10_128x128_avx2: 2198.0 (10.80x) w_avg_12_2x2_c: 17.2 ( 1.00x) w_avg_12_2x2_avx2: 13.9 ( 1.24x) w_avg_12_4x4_c: 51.4 ( 1.00x) w_avg_12_4x4_avx2: 16.5 ( 3.11x) w_avg_12_8x8_c: 109.1 ( 1.00x) w_avg_12_8x8_avx2: 22.0 ( 4.96x) w_avg_12_16x16_c: 395.9 ( 1.00x) w_avg_12_16x16_avx2: 44.9 ( 8.81x) w_avg_12_32x32_c: 1533.5 ( 1.00x) w_avg_12_32x32_avx2: 142.3 (10.78x) w_avg_12_64x64_c: 6002.0 ( 1.00x) w_avg_12_64x64_avx2: 557.5 (10.77x) w_avg_12_128x128_c: 23749.5 ( 1.00x) w_avg_12_128x128_avx2: 2202.0 (10.79x) Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-02-22 00:57:56 +01:00
Andreas Rheinhardt	7bf9c1e3f6	avcodec/x86/vvc/mc: Avoid redundant clipping for 8bit It is already done by packuswb. Old benchmarks: avg_8_2x2_c: 11.1 ( 1.00x) avg_8_2x2_avx2: 8.6 ( 1.28x) avg_8_4x4_c: 30.0 ( 1.00x) avg_8_4x4_avx2: 10.8 ( 2.78x) avg_8_8x8_c: 132.0 ( 1.00x) avg_8_8x8_avx2: 25.7 ( 5.14x) avg_8_16x16_c: 254.6 ( 1.00x) avg_8_16x16_avx2: 33.2 ( 7.67x) avg_8_32x32_c: 897.5 ( 1.00x) avg_8_32x32_avx2: 115.6 ( 7.76x) avg_8_64x64_c: 3316.9 ( 1.00x) avg_8_64x64_avx2: 626.5 ( 5.29x) avg_8_128x128_c: 12973.6 ( 1.00x) avg_8_128x128_avx2: 1914.0 ( 6.78x) w_avg_8_2x2_c: 16.7 ( 1.00x) w_avg_8_2x2_avx2: 14.4 ( 1.16x) w_avg_8_4x4_c: 48.2 ( 1.00x) w_avg_8_4x4_avx2: 16.5 ( 2.92x) w_avg_8_8x8_c: 168.1 ( 1.00x) w_avg_8_8x8_avx2: 49.7 ( 3.38x) w_avg_8_16x16_c: 392.4 ( 1.00x) w_avg_8_16x16_avx2: 61.1 ( 6.43x) w_avg_8_32x32_c: 1455.3 ( 1.00x) w_avg_8_32x32_avx2: 224.6 ( 6.48x) w_avg_8_64x64_c: 5632.1 ( 1.00x) w_avg_8_64x64_avx2: 896.9 ( 6.28x) w_avg_8_128x128_c: 22136.3 ( 1.00x) w_avg_8_128x128_avx2: 3626.7 ( 6.10x) New benchmarks: avg_8_2x2_c: 12.3 ( 1.00x) avg_8_2x2_avx2: 8.1 ( 1.52x) avg_8_4x4_c: 30.3 ( 1.00x) avg_8_4x4_avx2: 11.3 ( 2.67x) avg_8_8x8_c: 131.8 ( 1.00x) avg_8_8x8_avx2: 21.3 ( 6.20x) avg_8_16x16_c: 255.0 ( 1.00x) avg_8_16x16_avx2: 30.6 ( 8.33x) avg_8_32x32_c: 898.5 ( 1.00x) avg_8_32x32_avx2: 104.9 ( 8.57x) avg_8_64x64_c: 3317.7 ( 1.00x) avg_8_64x64_avx2: 540.9 ( 6.13x) avg_8_128x128_c: 12986.5 ( 1.00x) avg_8_128x128_avx2: 1663.4 ( 7.81x) w_avg_8_2x2_c: 16.8 ( 1.00x) w_avg_8_2x2_avx2: 13.9 ( 1.21x) w_avg_8_4x4_c: 48.2 ( 1.00x) w_avg_8_4x4_avx2: 16.2 ( 2.98x) w_avg_8_8x8_c: 168.6 ( 1.00x) w_avg_8_8x8_avx2: 46.3 ( 3.64x) w_avg_8_16x16_c: 392.4 ( 1.00x) w_avg_8_16x16_avx2: 57.7 ( 6.80x) w_avg_8_32x32_c: 1454.6 ( 1.00x) w_avg_8_32x32_avx2: 214.6 ( 6.78x) w_avg_8_64x64_c: 5638.4 ( 1.00x) w_avg_8_64x64_avx2: 875.6 ( 6.44x) w_avg_8_128x128_c: 22133.5 ( 1.00x) w_avg_8_128x128_avx2: 3334.3 ( 6.64x) Also saves 550B of .text here. 
The improvements will likely be even better on Win64, because it avoids using two nonvolatile registers in the weighted average case. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-02-22 00:57:56 +01:00
Andreas Rheinhardt	b22b65f2f8	avformat/hlsenc: Return error upon error, fix shadowing Introduced in `65fc0db581`. Reviewed-by: Marvin Scholz <epirat07@gmail.com> Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-02-22 00:23:00 +01:00
Michael Niedermayer	c98346ffaa	avcodec/libtheoraenc: make keyframe mask unsigned and handle its larger range Fixes: left shift of 1 by 31 places cannot be represented in type 'int' Fixes: 473579864/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_LIBTHEORA_fuzzer-5835688160591872 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2026-02-21 22:43:41 +00:00
Marvin Scholz	ca011ee754	avformat: Bump version and add APIChanges entry Needed after the recent addition of the command APIs.	2026-02-21 20:03:52 +01:00
Andreas Rheinhardt	3be4545b67	avcodec/vvc/inter: Deduplicate applying averaging Reviewed-by: Frank Plowman <post@frankplowman.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-02-21 12:48:50 +01:00
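
The size figures quoted in the avcodec/snow entry (53a9a34e23) above can be cross-checked with a little arithmetic. The sketch below is only an illustration, not code from the repository: the helper names are hypothetical, and it assumes that the per-SubBand context array is the only thing that shrank and that no struct padding changed. Under those assumptions the quoted sizes imply 128 SubBand instances; that count is inferred from the numbers and is not stated in the commit message.

```python
# Figures quoted in the avcodec/snow commit message (x64 build).
OLD_CONTEXTS = 519       # contexts per SubBand before the change
NEW_CONTEXTS = 34        # contexts per SubBand after the change
CONTEXT_BYTES = 32       # each context is a uint8_t[32]
OLD_SIZEOF = 2_141_664   # sizeof(SnowContext) before, in bytes
NEW_SIZEOF = 155_104     # sizeof(SnowContext) after, in bytes


def contexts_needed(max_log2: int = 31, offset: int = 2) -> int:
    """Array length needed when the largest variable index is max_log2 + offset.

    The message says the only variable index is context+2 with context in 0..31,
    so the largest index is 33 and the array needs 34 entries.
    """
    return max_log2 + offset + 1


def implied_subbands() -> int:
    """Number of SubBand instances implied by the quoted size reduction.

    Hypothetical helper: assumes the saving comes only from the context arrays.
    """
    saved_total = OLD_SIZEOF - NEW_SIZEOF
    saved_per_band = (OLD_CONTEXTS - NEW_CONTEXTS) * CONTEXT_BYTES
    bands, remainder = divmod(saved_total, saved_per_band)
    if remainder:
        raise ValueError("quoted sizes are not a whole multiple of the per-band saving")
    return bands


if __name__ == "__main__":
    print(contexts_needed())    # 34
    print(implied_subbands())   # 128
```

The division comes out exact, which suggests the two quoted sizes are consistent with each other. The highest compile-time index mentioned in the message (30) is below 33, so 34 entries cover both kinds of access.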

122974 Commits