FFmpeg

mirror of https://mirror.skon.top/https://github.com/FFmpeg/FFmpeg synced 2026-04-20 12:50:49 +08:00

Author	SHA1	Message	Date
Zhao Zhili	a85a8e6757	configure: fix VSX remaining enabled when -mvsx is unsupported When check_cflags -mvsx fails, the && short-circuit prevents check_cc from running. Since check_cc is responsible for disabling vsx on failure, skipping it leaves vsx incorrectly enabled. Fix by removing the && so check_cc always executes. Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>	2026-04-13 11:45:36 +00:00
Andreas Rheinhardt	32678dcc88	avcodec/x86/snowdsp_init: Remove disabled SSE2 functions Disabled in `3e0f7126b5` (almost 20 years ago) and no one fixed them, so remove them. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-13 12:56:35 +02:00
Andreas Rheinhardt	bd2964e611	avcodec/x86/snowdsp_init: Use standard init pattern Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-13 12:56:01 +02:00
Andreas Rheinhardt	338dc25642	avcodec/x86/snowdsp_init: Remove MMXEXT, SSE2 inner_add_yblock versions They have been superseded by SSSE3; the SSE2 version was even disabled (and segfaults if enabled). Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-13 12:53:17 +02:00
Andreas Rheinhardt	5c830fccf4	avcodec/x86/snowdsp: Add SSSE3 inner_add_yblock Compared to the MMX version, this version benefits from wider registers and pmaddubsw. It also has fewer unnecessary loads and stores: On x64, the MMX version has 12 unnecessary GPR loads and 6 stores in each line when width is eight; for width 16, there are 17 unnecessary GPR loads and six stores per line. Even the 32bit SSSE3 version only has six loads and zero stores per line more than the x64 version. Furthermore, in contrast to the MMX version, the SSSE3 version also does not clobber the array of block pointers given to it. Benchmarks: inner_add_yblock_2_c: 29.2 ( 1.00x) inner_add_yblock_2_mmx: 32.5 ( 0.90x) inner_add_yblock_2_ssse3: 28.6 ( 1.02x) inner_add_yblock_4_c: 85.2 ( 1.00x) inner_add_yblock_4_mmx: 89.2 ( 0.96x) inner_add_yblock_4_ssse3: 84.5 ( 1.01x) inner_add_yblock_8_c: 302.0 ( 1.00x) inner_add_yblock_8_mmx: 77.0 ( 3.92x) inner_add_yblock_8_ssse3: 30.6 ( 9.85x) inner_add_yblock_16_c: 1164.7 ( 1.00x) inner_add_yblock_16_mmx: 260.4 ( 4.47x) inner_add_yblock_16_ssse3: 82.3 (14.15x) Both the MMX and SSSE3 versions leave the size 2 and 4 cases to ff_snow_inner_add_yblock_c() (but the MMX version has a prologue at the beginning that it needs to undo before the call, leading to the higher overhead for these sizes). I don't know why the SSSE3 version is marginally faster than the C version in these cases. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-13 12:51:35 +02:00
Andreas Rheinhardt	2fdccaf7d6	tests/checkasm/mpegvideo_unquantize: Fix precedence problem Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-13 12:51:35 +02:00
Andreas Rheinhardt	4f30bd6fba	tests/checkasm/llvidencdsp: Fix nonsense randomization The first loop was never entered due to a precedence problem; the second loop initialized everything, although it was not intended that way. This has been added in `56b8769a1c`. Sorry for this. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-13 12:51:34 +02:00
Andreas Rheinhardt	e0ed3fa834	tests/checkasm: Add snowdsp test Only inner_add_yblock for now. Hint: Said function uses a pointer to an array of pointers as parameter. The MMX version clobbers the array in such a way that calling the function repeatedly with the same arguments (as happens inside bench_new()) leads to buffer overflows and segfaults. Therefore CALL4 had to be overridden to restore the original pointers. This workaround will be removed soon when the MMX version is removed. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-13 12:46:24 +02:00
Andreas Rheinhardt	764e021946	avcodec/snowdata: Add explicit alignment for obmc tables This is in preparation for adding SSSE3 assembly. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-13 12:46:24 +02:00
Andreas Rheinhardt	28d0a5091a	avcodec/snow_dwt: Remove pointless forward declaration Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-13 12:46:24 +02:00
Andreas Rheinhardt	5f373872c0	avcodec/x86/snow_dwt: Avoid slice_buffer in inner_add_yblock It is unnecessary and avoids the src_y parameter; it also makes this function more ASM-friendly. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-13 12:46:24 +02:00
Andreas Rheinhardt	fd77f00a8f	avcodec/snow: Avoid always-true branch The input lines used in ff_snow_inner_add_yblock() must always be set (because their values are used). The MMX assembly always relied on this. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-13 12:46:24 +02:00
Andreas Rheinhardt	13d621cc7c	avcodec/snow: Disable dead code in ff_snow_inner_add_yblock() It is only used with add != 0 (and the assembly functions only support this case). Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-13 12:46:24 +02:00
Andreas Rheinhardt	eed0830a0c	avcodec/snowdata: Don't use 8 bits for six bits data This has been done in `561a18d3ba` in order to avoid shifts, yet this rationale no longer applies since `d593e32983`. So shift them back; this is in preparation for using these coefficients together with pmaddubsw. Hint: `561a18d3ba` also added a block guarded by "if(LOG2_OBMC_MAX == 8". I changed the condition to remove this check (i.e. kept the block) which should not change the output at all. Yet all FATE tests pass if the block is completely removed. I don't know if this block is necessary at all. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-13 12:46:24 +02:00
Andreas Rheinhardt	761b6f2359	swscale/x86/output: Remove obsolete MMXEXT function Possible now that the SSE2 function is available even when the stack is not aligned. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-13 08:46:44 +02:00
Andreas Rheinhardt	8a7c1f7fb8	swscale/x86/output: Make xmm functions usable even without aligned stack x86-32 lacks one GPR, so it needs to be read from the stack. If the stack needs to be realigned, we can no longer access the original location of one argument, so just request a bit more stack size and copy said argument at a fixed offset from the new stack. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-13 08:46:44 +02:00
Andreas Rheinhardt	0bb161fd09	swscale/x86/output: Simplify creating dither register Only the lower quadword needs to be rotated, because the register is zero-extended immediately afterwards anyway. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-13 08:46:44 +02:00
Andreas Rheinhardt	f5c5bca803	swscale/x86/scale: Remove always-false mmsize checks Forgotten in `a05f22eaf3`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-13 08:46:44 +02:00
Andreas Rheinhardt	999ccf6495	swresample/x86/{audio_convert,rematrix}: Remove remnants of MMX Forgotten in `2b94f23b06`, `4e51e48ebd` and `374b3ab03c`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-13 01:16:46 +02:00
Andreas Rheinhardt	e29c7089d2	avcodec/x86/vp8dsp_loopfilter: Remove always-true mmsize checks Forgotten in `6a551f1405`. Also fix the comment claiming that there are MMXEXT functions in this file. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-13 00:41:22 +02:00
Andreas Rheinhardt	9f560c8c1a	avcodec/x86/vp3dsp: Remove unused macros Forgotten in `a677b38298`. Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-04-13 00:41:22 +02:00
Jun Zhao	411484e8c9	lavc/videotoolbox_vp9: fix vpcC flags offset Write the 24-bit vpcC flags field at the current cursor position after the version byte. The previous code wrote to p+1 instead of p, leaving one byte uninitialized between version and flags and shifting all subsequent fields (profile, level, bitdepth, etc.) by one byte. Signed-off-by: Jun Zhao <barryjzhao@tencent.com>	2026-04-12 22:15:51 +00:00
Jun Zhao	57397a683d	lavc/videotoolboxenc: return SEI parse errors Return the actual find_sei_end() error when SEI appending fails instead of reusing the previous status code. This preserves the real parse failure for callers instead of reporting malformed SEI handling as success. Signed-off-by: Jun Zhao <barryjzhao@tencent.com>	2026-04-12 22:15:51 +00:00
Niklas Haas	b09d57c41d	avfilter/buffersrc: re-add missing overflow warning This was originally introduced by commit `05d6cc116e`. During the FFmpeg-libav split, this function was refactored by commit `7e350379f8` into av_buffersrc_add_frame(), replacing av_buffersrc_add_ref(). The new function did not include the overflow warning, despite the same being done for buffersink. Then, when commit `a05a44e205` merged the two functions back together, the libav implementation was favored over the FFmpeg implementation, silently removing the overflow warning in the process. This commit re-adds that missing warning. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-12 20:02:18 +00:00
nyanmisaka	ab7b6ef0a2	hwcontext_vulkan: fix double free when vulkan_map_to_drm fails The multiplanar image with storage_bit enabled fails to be exported to DMA-BUF on the QCOM turnip driver, thus triggering this double-free issue. ``` [Parsed_hwmap_2 @ 0xffff5c002a70] Configure hwmap vulkan -> drm_prime. [hwmap @ 0xffff5c001180] Filter input: vulkan, 1920x1080 (0). [AVHWFramesContext @ 0xffff5c004e00] Unable to export the image as a FD! free(): double free detected in tcache 2 Aborted ``` Additionally, add back an av_unused attribute. Otherwise, the compiler will complain about unused variables when CUDA is not enabled. Signed-off-by: nyanmisaka <nst799610810@gmail.com>	2026-04-12 20:50:38 +08:00
zuxy	56b97c03d4	avcodec/x86/h264_intrapred: Replace pred8x8_top_dc_8_mmxext with SSE2 More about deprecating MMX than any performance gain; nearly identical performance numbers on my Zen 4 (1.36x vs c), but llvm-mca predicts >60% perf gain on Intel CPUs newer than Skylake. Signed-off-by: Zuxy Meng <zuxy.meng@gmail.com>	2026-04-11 19:11:46 -07:00
Niklas Haas	c29465bcb6	swscale/x86/ops: use plain `ret` instruction The original intent here was probably to make the ops code agnostic to which operation is actually last in the list, but the existence of a divergence between CONTINUE and FINISH already implies that we hard-code the assumption that the final operation is a write op. So we can just massively simplify this with a call/ret pair instead of awkwardly exporting and then jumping back to the return label. This actually collapses FINISH down into just a plain RET, since the op kernels already don't set up any extra stack frame. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-11 16:30:15 +00:00
Tymur Boiko	f7ca6f7481	vulkan: fix -Wdiscarded-qualifiers warning and misleading DRM modifier log ff_vk_find_struct returns const void *, so storing it in const void *drm_create_pnext fixes the initialization warning but then dpb_hwfc->create_pnext = drm_create_pnext assigns const void * to void *, triggering the same warning at that line. The right fix is a (void *) cast at the call site, same as done for buf_pnext. Also restrict the GetPhysicalDeviceImageFormatProperties2 verbose log in try_export_flags to the DRM modifier path only: when has_mods is false the log always printed mod[0]=0x0, which is misleading since no DRM modifier is involved. Signed-off-by: Tymur Boiko <tboiko@nvidia.com>	2026-04-11 12:50:07 +00:00
Kacper Michajłow	eaadd05232	.forgejo/CODEOWNERS: add myself for hls.* Signed-off-by: Kacper Michajłow <kasper93@gmail.com>	2026-04-11 01:58:35 +02:00
Kacper Michajłow	721545a3c2	MAINTAINERS: add myself as HLS demuxer maintainer Signed-off-by: Kacper Michajłow <kasper93@gmail.com>	2026-04-11 01:58:35 +02:00
Kacper Michajłow	cc41e6a462	tests/fate/hlsenc: add hls-event-no-endlist test Signed-off-by: Kacper Michajłow <kasper93@gmail.com>	2026-04-11 01:58:34 +02:00
Kacper Michajłow	6d98a9a2e8	avformat/hls: fix seeking in EVENT playlists that start mid-stream HLS EVENT playlists (e.g. Twitch VODs) are seekable but not finished, so live_start_index causes playback to begin near the end. The first packet's DTS then becomes first_timestamp, creating a wrong mapping between timestamps and segments. Fix this by subtracting the cumulative duration of skipped segments from first_timestamp so it reflects the true start of the playlist. Also set per-stream start_time from first_timestamp so correct time is reported, reset pts_wrap_reference on seek to prevent bogus wrap arounds. Signed-off-by: Kacper Michajłow <kasper93@gmail.com>	2026-04-11 01:58:34 +02:00
Niklas Haas	ef13a29d08	avfilter/framepool: fix frame pool uninit check Fixes a memory leak caused by AV_MEDIA_TYPE_VIDEO == 0 being excluded by the !pool->type check. We can just remove the entire check because av_buffer_pool_uninit() is already safe on NULL. Fixes: `fe2691b3bb` Reported-by: Kacper Michajłow <kasper93@gmail.com> Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-10 22:02:00 +02:00
Alexandru Ardelean	e43aab67ed	avdevice/v4l2: rename 'buff_data' -> 'buf_desc' Since we've added a 'buf_data' struct, rename this to avoid any confusion about this one. Signed-off-by: Alexandru Ardelean <aardelean@deviqon.com>	2026-04-10 16:02:28 +00:00
Alexandru Ardelean	1011e4d647	avdevice/v4l2: wrap buf_start and buf_len into a struct This reduces the number of malloc() & free() calls, and structures the data for the buffers a bit more neatly. In case more per-buffer data needs to be added, having a separate struct is useful. Signed-off-by: Alexandru Ardelean <aardelean@deviqon.com>	2026-04-10 16:02:28 +00:00
Alexandru Ardelean	24adcf3a72	avdevice/v4l2: fix potential memleak when allocating device buffers In the loop which allocates the buffers for a V4L2 device, if failure occurs for a certain buffer (e.g. 3rd of 4 buffers), then the previously allocated buffers (and the buffer array) would not be free'd in the mmap_init(). This would cause a leak. This change handles the error cases of that loop to free all allocated resources, so that when mmap_init() fails nothing is leaked. Signed-off-by: Alexandru Ardelean <aardelean@deviqon.com>	2026-04-10 16:02:28 +00:00
Niklas Haas	0e983a0604	swscale: align allocated frame buffers to SwsPass hints This avoids hitting the slow memcpy fallback paths altogether, whenever swscale.c is handling plane allocation. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-10 15:12:18 +02:00
Niklas Haas	b5573a8683	swscale/ops_dispatch: cosmetic Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-10 15:12:18 +02:00
Niklas Haas	3a15990368	swscale/ops_dispatch: forward correct pass alignment As a consequence of the fact that the frame pool API doesn't let us directly access the linesize, we have to "un-translate" the over_read/write back to the nearest multiple of the pixel size. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-10 15:12:18 +02:00
Niklas Haas	5441395a48	swscale/graph: add optimal alignment/padding hints Allows the pass buffer allocator to make smarter decisions based on the actual alignment requirements of the specific pass. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-10 15:12:18 +02:00
Niklas Haas	2deca0ec19	swscale: clean up allocated frames on error Matches the semantics of sws_frame_begin(), which also cleans up any allocated buffers on error. This is an issue introduced by the commit that allowed ff_sws_graph_run() to fail in the first place. Fixes: `563cc8216b`	2026-04-10 15:12:18 +02:00
Niklas Haas	6c89a30ecd	swscale: add FFFramePool and use it for allocating planes The major consequence of this is that we start allocating buffers per plane, instead of allocating one contiguous buffer. This makes the no-op/refcopy case slightly slower, but doesn't meaningfully affect the rest: yuva444p -> yuva444p, time=157/1000 us (ref=78/1000 us), speedup=0.497x slower Overall speedup=1.016x faster, min=0.983x max=1.092x However, this is a necessary consequence of the desire to allow partial plane allocations / single plane refcopies. This slowdown also does not affect vf_scale, which already uses avfilter/framepool.c (via ff_get_video_buffer). Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-10 15:12:18 +02:00
Niklas Haas	fe2691b3bb	avfilter/framepool: stack-allocate FFFramePool Saves a pointless free/alloc cycle on reinit. For the vast majority of filter links, this is going to be allocated anyway; and on the occasions that it's not, the waste is marginal. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-10 15:12:18 +02:00
Niklas Haas	a2ca55c563	avfilter/framepool: remove unnecessary braces (style) As per the FFmpeg coding style guidelines, braces should be avoided on isolated single-line statement bodies. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-10 15:12:18 +02:00
Niklas Haas	5c4490a0a6	avfilter/framepool: fix whitespace (cosmetic) Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-10 15:12:18 +02:00
Niklas Haas	38543781cc	avfilter/framepool: move variable declarations to site of definition This is not C89 anymore. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-10 15:12:18 +02:00
Niklas Haas	6efbd99e48	avfilter/framepool: remove check for impossible condition FFALIGN(..., pool->align) = (...) & ~(pool->align - 1), so this condition equates to: ((...) & ~(align - 1) & (align - 1)), which is trivially 0. (Note that all expressions are of type `int`) Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-10 15:12:18 +02:00
Niklas Haas	0b43b8ef31	avfilter/framepool: make FFFramePool public This struct is overall pretty trivial and there is little to no internal state or invariants that need to be protected. Making it public allows e.g. libswscale to allocate buffers for individual planes directly. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-10 15:12:18 +02:00
Niklas Haas	3e99631873	avfilter/framepool: remove pointless ternary (cosmetic) Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-10 15:12:18 +02:00
Niklas Haas	53ce7265ab	avfilter/framepool: use strongly typed union of pixel/sample format Replacing the generic `int format` field. This aids in debugging, as e.g. gdb will tend to translate the strongly typed enums back into human readable names automatically. Signed-off-by: Niklas Haas <git@haasn.dev>	2026-04-10 15:12:18 +02:00
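
The reasoning in commit `6efbd99e48` (an aligned value masked with `align - 1` is always zero) can be checked with a small sketch. This is only an illustration, not the framepool code: `ALIGN_UP` is a local macro written in the same shape as FFmpeg's `FFALIGN`, and `misaligned_bits()` is a hypothetical helper introduced here. It assumes the alignment is a power of two.

```c
#include <assert.h>

/* Local macro in the same shape as FFmpeg's FFALIGN: round x up to a
 * multiple of a, where a is assumed to be a power of two. */
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Hypothetical helper: returns the low bits that a "was the result
 * really aligned?" check would inspect after aligning x. */
static int misaligned_bits(int x, int align)
{
    /* The identity only holds for power-of-two alignments. */
    assert(align > 0 && (align & (align - 1)) == 0);

    /* (... & ~(align - 1)) & (align - 1) clears and then selects the
     * same bits, so nothing can survive the second mask. */
    return ALIGN_UP(x, align) & (align - 1);
}
```

For any non-negative `x` and power-of-two `align`, `misaligned_bits(x, align)` evaluates to 0, which is why a branch testing that expression can never be taken.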


124046 Commits
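
The leak described in commit `ef13a29d08` comes from using a zero-valued enum member as an "unset" marker. A minimal sketch of that pitfall follows; every name in it (`SketchMediaType`, `SketchPool`, `uninit_buggy`, `uninit_fixed`) is hypothetical and stands in for the real framepool types, and the only assumption carried over from the commit message is that the video type has the value 0.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not the FFmpeg definitions. The video type is
 * given the value 0, as the commit message describes. */
enum SketchMediaType { SKETCH_TYPE_VIDEO = 0, SKETCH_TYPE_AUDIO = 1 };

typedef struct SketchPool {
    enum SketchMediaType type;
    void *buffer;
} SketchPool;

/* Buggy variant: "!pool->type" is meant to skip unset pools, but it is
 * also true for video pools, so their buffer is never released.
 * Returns 1 if cleanup ran, 0 if it was skipped. */
static int uninit_buggy(SketchPool *pool)
{
    assert(pool != NULL);
    if (!pool->type)
        return 0;
    free(pool->buffer);
    pool->buffer = NULL;
    return 1;
}

/* Fixed variant: no type check at all. free(NULL) is a no-op, so calling
 * this on a pool that never allocated anything is safe. */
static int uninit_fixed(SketchPool *pool)
{
    assert(pool != NULL);
    free(pool->buffer);
    pool->buffer = NULL;
    return 1;
}
```

Dropping the guard instead of rewriting it mirrors the rationale given in the commit message: the underlying release call is already safe on NULL, so the check adds nothing.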