This muxer seems to intend to support output that does
not begin at offset zero (instead of e.g. just hardcoding
nb_frames_pos to 16). But then it is possible
that avio_seek() returns values > INT_MAX even
though the part of the file written by us cannot
exceed this value. So the return value of avio_seek()
needs to be checked as a 64-bit integer and not silently
truncated to int.
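The failure mode can be sketched stand-alone (illustration only, not the actual muxer code): an int64_t offset larger than INT_MAX cannot survive a round-trip through int, so storing the seek result in an int corrupts the position.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* avio_seek() returns int64_t. If the result is stored in an int,
 * offsets beyond INT_MAX are silently corrupted. This helper detects
 * exactly that round-trip loss. */
static int offset_truncates(int64_t pos)
{
    return pos != (int64_t)(int)pos;
}
```

For example, offset_truncates((int64_t)INT_MAX + 1) is nonzero, while a small offset such as 16 round-trips safely.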
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The number of streams is always one (namely one video stream
with codec ID AV_CODEC_ID_PDV) due to the MAX_ONE_OF_EACH
and ONLY_DEFAULT_CODECS flags. Also, the generic code (init_muxer()
in mux.c) checks that video streams have proper dimensions set.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This adds a NEON-optimized function for computing 32x32 Sum of Absolute
Differences (SAD) on AArch64, addressing a gap where x86 had SSE2/AVX2
implementations but AArch64 lacked equivalent coverage.
The implementation mirrors the existing sad8 and sad16 NEON functions,
employing a 4-row unrolled loop with UABAL and UABAL2 instructions for
efficient load-compute interleaving, and four 8x16-bit accumulators to
handle the wider 32-byte rows.
Benchmarks on AWS Graviton3 (Neoverse V1, c7g.xlarge) using checkasm:
sad_32x32_0: C 146.4 cycles -> NEON 98.1 cycles (1.49x speedup)
sad_32x32_1: C 141.4 cycles -> NEON 98.9 cycles (1.43x speedup)
sad_32x32_2: C 140.7 cycles -> NEON 95.0 cycles (1.48x speedup)
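For reference, the scalar computation the NEON kernel accelerates can be sketched as follows (function name and signature are illustrative, not the actual dsp prototype):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Scalar 32x32 SAD: sum of absolute byte differences over a 32x32
 * block. The NEON version computes the same sum with UABAL/UABAL2 and
 * four 16-bit accumulators instead of this per-pixel loop. */
static unsigned sad_32x32_ref(const uint8_t *a, ptrdiff_t a_stride,
                              const uint8_t *b, ptrdiff_t b_stride)
{
    unsigned sum = 0;
    for (int y = 0; y < 32; y++) {
        for (int x = 0; x < 32; x++)
            sum += abs(a[x] - b[x]);
        a += a_stride;
        b += b_stride;
    }
    return sum;
}

/* Self-check: identical blocks give 0; flipping each byte's LSB gives
 * a per-pixel difference of exactly 1, i.e. a total of 32*32 = 1024. */
static int sad_ref_selftest(void)
{
    uint8_t a[32 * 32], b[32 * 32];
    for (int i = 0; i < 32 * 32; i++) {
        a[i] = (uint8_t)i;
        b[i] = a[i] ^ 1;
    }
    return sad_32x32_ref(a, 32, a, 32) == 0 &&
           sad_32x32_ref(a, 32, b, 32) == 32 * 32;
}
```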
Signed-off-by: Jeongkeun Kim <variety0724@gmail.com>
This incorrectly lists the libavcodec major version as 60 instead of
62. Also fix the date and commit hash while at it.
Fixes: 7faa6ee2aa ("libavformat/matroska: Support smpte 2094-50 metadata")
Signed-off-by: llyyr <llyyr.public@gmail.com>
This change should improve performance on Skylake and later
Intel CPUs (which have only half the ports for saturated adds/subs
for mmx registers compared to xmm registers): llvm-mca predicts
a 25% performance improvement on Skylake.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Due to a discrepancy between SSE2 and the C version, coefficients
for idct_put and idct_add are restricted to a range not causing
overflows.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This allows using pavgb to reduce the number of instructions used
to calculate the average; processing two rows via movhps reduces
the number of pxor and pavgb instructions even further and turned
out to be beneficial.
This patch also avoids a load as the constant used here can be easily
generated at runtime.
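A scalar model of the trick (sketch only, not the actual asm): pavgb computes the rounding average (a + b + 1) >> 1, and complementing both inputs and the result turns it into the no-rounding average that put_no_rnd_pixels needs. The only constant required is all-ones, which SIMD can generate in a register (e.g. via pcmpeqb) instead of loading it from memory.

```c
#include <assert.h>
#include <stdint.h>

/* What pavgb does per byte: rounding average. */
static uint8_t avg_rnd(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);
}

/* No-rounding average via complementing inputs and output:
 * ~((~a + ~b + 1) >> 1) == (a + b) >> 1 for 8-bit values. */
static uint8_t avg_no_rnd(uint8_t a, uint8_t b)
{
    return (uint8_t)~avg_rnd((uint8_t)~a, (uint8_t)~b);
}

/* Exhaustive check of the identity over all byte pairs. */
static int avg_identity_holds(void)
{
    for (int a = 0; a < 256; a++)
        for (int b = 0; b < 256; b++)
            if (avg_no_rnd((uint8_t)a, (uint8_t)b) != ((a + b) >> 1))
                return 0;
    return 1;
}
```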
Old benchmarks:
put_no_rnd_pixels_l2_c: 13.3 ( 1.00x)
put_no_rnd_pixels_l2_mmx: 11.6 ( 1.15x)
New benchmarks:
put_no_rnd_pixels_l2_c: 13.4 ( 1.00x)
put_no_rnd_pixels_l2_sse2: 7.5 ( 1.77x)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This allows avoiding the stack for the 8-bit simple IDCT;
for the other IDCTs, it avoids storing and restoring two
xmm registers on Win64.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This function is exported, so it has to abide by the ABI
and has therefore issued emms since commit
5b85ca5317. Yet this is
expensive, and using SSE2 instead improves performance.
Also avoid the initial zeroing and the last pointer
increment while at it.
This removes the last usage of mmx from libavutil*.
Old benchmarks:
sad_8x8_0_c: 13.2 ( 1.00x)
sad_8x8_0_mmxext: 27.8 ( 0.48x)
sad_8x8_1_c: 13.2 ( 1.00x)
sad_8x8_1_mmxext: 27.6 ( 0.48x)
sad_8x8_2_c: 13.3 ( 1.00x)
sad_8x8_2_mmxext: 27.6 ( 0.48x)
New benchmarks:
sad_8x8_0_c: 13.3 ( 1.00x)
sad_8x8_0_sse2: 11.7 ( 1.13x)
sad_8x8_1_c: 13.8 ( 1.00x)
sad_8x8_1_sse2: 11.6 ( 1.20x)
sad_8x8_2_c: 13.2 ( 1.00x)
sad_8x8_2_sse2: 11.8 ( 1.12x)
Hint: Using two psadbw or one psadbw and movhps made no difference
in the benchmarks, so I chose the latter due to its smaller code size.
*: except if lavu provides avpriv_emms for other libraries
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
We reject inputs that are significantly smaller than the smallest frame.
This check raises the minimum amount of input needed before time-consuming
computations are performed; it thus improves the computation per input
byte and reduces the potential DoS impact.
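The shape of such a gate can be sketched as follows (the threshold and names here are hypothetical placeholders, not the actual SVQ1 values):

```c
#include <assert.h>

/* Hypothetical early rejection: a packet smaller than the smallest
 * possible frame cannot decode successfully, so bail out before any
 * expensive parsing. MIN_FRAME_BYTES is an illustrative placeholder. */
#define MIN_FRAME_BYTES 64

static int packet_plausible(int pkt_size)
{
    return pkt_size >= MIN_FRAME_BYTES;
}
```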
Fixes: Timeout
Fixes: 472769364/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_SVQ1_DEC_fuzzer-5519737145851904
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Otherwise the buffer for the HDR10+ BlockAdditional would
be clobbered if both are present (the buffers can only be
reused after the ebml_writer_write() call).
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
7faa6ee2aa added support
for writing AV_PKT_DATA_DYNAMIC_HDR_SMPTE_2094_APP5,
yet forgot to update the size of the EBML element buffer.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Precompute the SILK NLSF residual weights from the stage-1 codebooks
and use the table during LPC decode. This removes the mandated
per-coefficient fixed-point weight calculation from silk_decode_lpc()
while preserving the same decoded values.
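The structure of the change can be sketched generically (all names, sizes, and the weight formula below are illustrative stand-ins, not the real SILK code): derive the weight for every codebook entry once at init time, then index the table during decode.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_ENTRIES 32  /* illustrative codebook size */
#define ORDER       16  /* illustrative LPC order */

static uint16_t weight_table[NUM_ENTRIES][ORDER];

/* Stand-in for the fixed-point weight derivation done per coefficient. */
static uint16_t compute_weight(uint8_t cb_val)
{
    return (uint16_t)(cb_val * cb_val + 1);
}

/* Run once: precompute every weight from the stage-1 codebook. */
static void init_weight_table(const uint8_t codebook[NUM_ENTRIES][ORDER])
{
    for (int i = 0; i < NUM_ENTRIES; i++)
        for (int k = 0; k < ORDER; k++)
            weight_table[i][k] = compute_weight(codebook[i][k]);
}

/* Decode-time lookup that replaces the per-coefficient computation. */
static uint16_t lookup_weight(int entry, int coeff)
{
    return weight_table[entry][coeff];
}

/* Self-check: the table reproduces the direct computation. */
static int table_selftest(void)
{
    uint8_t cb[NUM_ENTRIES][ORDER];
    for (int i = 0; i < NUM_ENTRIES; i++)
        for (int k = 0; k < ORDER; k++)
            cb[i][k] = (uint8_t)(i + k);
    init_weight_table(cb);
    return lookup_weight(3, 5) == compute_weight((uint8_t)(3 + 5));
}
```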
Instead of implicitly relying on SwsComps.unused, which contains the exact
same information. (cf. ff_sws_op_list_update_comps)
Signed-off-by: Niklas Haas <git@haasn.dev>
The implementation of AARCH64_SWS_OP_LINEAR loops over elements of this mask
to determine which *output* rows to compute. However, it is being set by this
loop to `op->comps.unused`, which is a mask of unused *input* rows. As such,
it should be looking at `next->comps.unused` instead.
This did not result in problems in practice, because none of the linear
matrices happened to trigger this case (more input columns than output rows).
Signed-off-by: Niklas Haas <git@haasn.dev>
Needed to allow us to phase out SwsComps.unused altogether.
It's worth pointing out the change in semantics; while unused tracks the
unused *input* components, the mask is defined as representing the
computed *output* components.
This is 90% the same, except for read/write, pack/unpack, and clear,
which are the only operations that can change the number of components.
Signed-off-by: Niklas Haas <git@haasn.dev>
Makes this logic a lot simpler and less brittle. We can trivially adjust
the list of required linear masks whenever it changes as a result of
future modifications.
Signed-off-by: Niklas Haas <git@haasn.dev>
Using the power of libswscale/tests/sws_ops -summarize lets us see which
kernels are actually needed by real op lists.
Note: I'm working on a separate series which will obsolete this
implementation whack-a-mole altogether, by generating a list of all
possible op kernels at compile time.
Signed-off-by: Niklas Haas <git@haasn.dev>
This is far more commonly used without an offset than with one, so having
it there prevents these special cases from actually doing much good.
Signed-off-by: Niklas Haas <git@haasn.dev>
First vector is %2, not %3. This was never triggered before because none
of the existing masks hit this exact case.
Signed-off-by: Niklas Haas <git@haasn.dev>
Since this now has an explicit mask, we can just check that directly
instead of relying on the unused comps hack/trick.
Additionally, this allows us to distinguish between fixed-value and
arbitrary-value clears by having the SwsOpEntry contain NAN values iff
they support any clear value.
Signed-off-by: Niklas Haas <git@haasn.dev>
This does come with a slight change in behavior, as we now don't print the
range information in the case that the range is only known for *unused*
components. However, in practice, that's already guaranteed by update_comps()
stripping the range info explicitly in this case.
Signed-off-by: Niklas Haas <git@haasn.dev>
Instead of implicitly excluding NAN values if ignore_den0 is set. This
gives callers more explicit control over which values to print, and in
doing so, makes sure "unintended" NaN values are properly printed as such.
Signed-off-by: Niklas Haas <git@haasn.dev>
Instead of implicitly testing for NaN values. This is mostly a straightforward
translation, but we need some slight extra boilerplate to ensure the mask
is correctly updated when e.g. commuting past a swizzle.
Signed-off-by: Niklas Haas <git@haasn.dev>
This accidentally unconditionally overwrote the entire clear mask, since
Q(n) always set the denominator to 1, resulting in all channels being
cleared instead of just the ones with nonzero denominators.
Signed-off-by: Niklas Haas <git@haasn.dev>
This currently fails completely for images smaller than 12x12, and even in
that case, the limited resolution makes these tests a bit useless.
At the risk of triggering a lot of spurious SSIM regressions for very
small sizes (due to insufficiently modelling the effects of low resolution on
the expected noise), this patch allows us to at least *run* such tests.
Incidentally, 8x8 is the smallest size that passes the SSIM check.