FFmpeg

mirror of https://mirror.skon.top/https://github.com/FFmpeg/FFmpeg synced 2026-04-23 18:35:18 +08:00

Author	SHA1	Message	Date
Andreas Rheinhardt	dbdf514c17	avcodec/x86/h264_deblock_10bit: Remove custom stack allocation code Allocate it via cglobal as usual. This makes the SSE2/AVX functions available when HAVE_ALIGNED_STACK is false; it also avoids modifying rsp unnecessarily in the deblock_h_luma_intra_10 functions on Win64. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-01-25 22:53:21 +01:00
Andreas Rheinhardt	b1140d3c98	avcodec/x86/h264_deblock: Remove obsolete macro parameters They are a remnant of the MMX functions (which processed only eight pixels at a time, so that it was called twice via a wrapper; the actual MMX function had "v8" in its name instead of simply v) which have been removed in commit `4618f36a24`. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-01-25 22:53:21 +01:00
Andreas Rheinhardt	899475326b	avcodec/x86/h264_deblock: Simplify splatting Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-01-25 22:53:21 +01:00
Andreas Rheinhardt	a22149ab3d	avcodec/x86/h264_deblock: Remove always-false branches These functions are always called with alpha and beta > 0. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-01-25 22:53:21 +01:00
Andreas Rheinhardt	982244818b	avcodec/x86/h264_deblock: Remove unused macros Forgotten in `4618f36a24`. Also remove a PASS8ROWS wrapper that seems to have been always unused. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-01-25 22:53:21 +01:00
Andreas Rheinhardt	6e65d1c945	avcodec/motion_est: Fix left shifts of negative numbers Fixes ticket #21486. Reviewed-by: James Almer <jamrial@gmail.com> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-01-25 22:46:39 +01:00
Jun Zhao	8966101fa6	lavc/hevc: add aarch64 neon for 12-bit dequant Implement NEON optimization for HEVC dequant at 12-bit depth. For 12-bit: shift = 15 - 12 - log2_size = 3 - log2_size. When shift is negative, we use shl (shift left) instead of srshr. Performance benchmark on Apple M4: ./tests/checkasm/checkasm --test=hevc_dequant --bench hevc_dequant_4x4_12_c: 9.9 ( 1.00x) hevc_dequant_4x4_12_neon: 5.7 ( 1.74x) hevc_dequant_8x8_12_c: 1.7 ( 1.00x) hevc_dequant_8x8_12_neon: 1.3 ( 1.30x) hevc_dequant_16x16_12_c: 131.1 ( 1.00x) hevc_dequant_16x16_12_neon: 7.9 (16.52x) hevc_dequant_32x32_12_c: 69.7 ( 1.00x) hevc_dequant_32x32_12_neon: 28.4 ( 2.46x) Signed-off-by: Jun Zhao <barryjzhao@tencent.com>	2026-01-25 06:55:26 +00:00
Jun Zhao	ce89d974c8	lavc/hevc: add aarch64 neon for 10-bit dequant Implement NEON optimization for HEVC dequant at 10-bit depth. For 10-bit: shift = 15 - 10 - log2_size = 5 - log2_size Performance benchmark on Apple M4: ./tests/checkasm/checkasm --test=hevc_dequant --bench hevc_dequant_4x4_10_c: 16.6 ( 1.00x) hevc_dequant_4x4_10_neon: 7.4 ( 2.23x) hevc_dequant_8x8_10_c: 39.7 ( 1.00x) hevc_dequant_8x8_10_neon: 7.5 ( 5.28x) hevc_dequant_16x16_10_c: 168.7 ( 1.00x) hevc_dequant_16x16_10_neon: 10.2 (16.56x) hevc_dequant_32x32_10_c: 1.9 ( 1.00x) hevc_dequant_32x32_10_neon: 1.9 ( 1.01x) Note: 32x32 shift=0 is identity transform (no-op), so NEON has no advantage over C which is also optimized away by the compiler. Signed-off-by: Jun Zhao <barryjzhao@tencent.com>	2026-01-25 06:55:26 +00:00
Jun Zhao	24f296c7a1	lavc/hevc: optimize dequant for shift=0 case (identity transform) The HEVC dequantization uses: shift = 15 - bit_depth - log2_size When shift equals 0, the operation becomes an identity transform: - For shift > 0: output = (input + offset) >> shift - For shift < 0: output = input << (-shift) - For shift = 0: output = input << 0 = input (no change) This occurs in the following cases: - 10-bit, 32x32 block: shift = 15 - 10 - 5 = 0 - 12-bit, 8x8 block: shift = 15 - 12 - 3 = 0 Previously, the code would still iterate through all coefficients and perform redundant read-modify-write operations even when shift=0. This patch adds an early return for shift=0, avoiding unnecessary memory operations. checkasm benchmarks on Apple M4 show: - 10-bit 32x32: 69.1 -> 1.6 cycles (43x faster) - 12-bit 8x8: 30.9 -> 1.7 cycles (18x faster) Signed-off-by: Jun Zhao <barryjzhao@tencent.com>	2026-01-25 06:55:26 +00:00
Jun Zhao	0886e50c6b	lavc/hevc: add aarch64 neon for 8-bit dequant Implement NEON optimization for HEVC dequant at 8-bit depth. The NEON implementation uses srshr (Signed Rounding Shift Right) which does both the add with offset and right shift in a single instruction. Optimization details: - 4x4 (16 coeffs): Single load-process-store sequence - 8x8 (64 coeffs): Fully unrolled, no loop overhead - 16x16 (256 coeffs): Pipelined load/compute/store to hide memory latency - 32x32 (1024 coeffs): Pipelined with all available NEON registers Performance benchmark on Apple M4: ./tests/checkasm/checkasm --test=hevc_dequant --bench hevc_dequant_4x4_8_c: 11.3 ( 1.00x) hevc_dequant_4x4_8_neon: 6.3 ( 1.78x) hevc_dequant_8x8_8_c: 33.9 ( 1.00x) hevc_dequant_8x8_8_neon: 6.6 ( 5.11x) hevc_dequant_16x16_8_c: 153.8 ( 1.00x) hevc_dequant_16x16_8_neon: 9.0 (17.02x) hevc_dequant_32x32_8_c: 78.1 ( 1.00x) hevc_dequant_32x32_8_neon: 31.9 ( 2.45x) Note on Performance Anomaly: The observation that hevc_dequant_32x32_8_c is faster than 16x16 (78.1 vs 153.8) is due to Clang auto-vectorizing only for sizes >= 32x32. Compiler: Apple clang version 17.0.0 (clang-1700.6.3.2) Signed-off-by: Jun Zhao <barryjzhao@tencent.com>	2026-01-25 06:55:26 +00:00
Zhao Zhili	1e1dde8798	avcodec/libx265: map ffmpeg log level to x265 log level Previously x265 encoder used its default log level regardless of FFmpeg's log level setting. Note the log level can be overwritten by x265-params. Fix #21462 Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>	2026-01-25 13:09:30 +08:00
Carl Eugen Hoyos	aab0c23cb8	lavc/j2kdec: Do not ignore colour association for packed formats Fixes ticket #9468. Signed-off-by: Carl Eugen Hoyos <ceffmpeg@gmail.com>	2026-01-24 20:25:05 +00:00
Christopher Degawa	a5d4c398b4	avcodec/libsvtav1: rename aq_mode for v4.0.0 Signed-off-by: Christopher Degawa <ccom@randomderp.com> Signed-off-by: James Almer <jamrial@gmail.com>	2026-01-23 23:07:18 -03:00
Lynne	8349565d52	aacsbr_template: fix SBR USAC coupling This issue hid under the radar since the codebooks between coupling modes very often result in identical bit counts regardless of the encoded data, leading to no frame-level bitstream desyncs except in rare cases. AAC Mps212 data is parsed immediately after the SBR data, where a loss of sync in SBR will result in Mps212 being wildly different.	2026-01-23 14:40:52 +01:00
Ling, Edison	a93cb79da2	avcodec/d3d12va_encode: Bug fix and refactor for motion estimation precision initialization Move motion estimation precision check from standalone `d3d12va_encode_init_motion_estimation_precision()` function into each codec's init_sequence_params() to reuse existing feature support queries. - fixes AV1 using wrong support structure (SUPPORT instead of SUPPORT1) - eliminates duplicate setup code - removes redundant CheckFeatureSupport API call - no intended functional changes other than bug fix	2026-01-23 13:25:55 +00:00
James Almer	dd2976b9e1	avcodec/mlp: don't duplicate the AV_CRC_8_EBU table Signed-off-by: James Almer <jamrial@gmail.com>	2026-01-22 17:44:46 -03:00
Hyunjun Ko	b637624046	avcodec/vulkan_av1: fix mi_col_starts and mi_row_starts units The spec says: pMiColStarts is a pointer to an array of TileCols number of unsigned integers that corresponds to MiColStarts defined in section 6.8.14 of the [AV1 Specification] And the unit of MiColStarts is MI(mode info). So is pMiRowStarts.	2026-01-21 10:42:02 +00:00
Ramiro Polla	96d8e19720	avcodec/mjpegdec: fix segfault on extern_huff and no extradata Regression since `1debadd58e`.	2026-01-21 03:26:02 +00:00
Werner Robitza	d25d133df3	avcodec/libx265: add pass and x265-stats option Add support for standard -pass and -passlogfile options, matching the behavior of libx264. Add the -x265-stats option to specify the stats filename. Update documentation. Signed-off-by: Werner Robitza <werner.robitza@gmail.com>	2026-01-20 10:10:26 +00:00
Manuel Lauss	d244d438c3	avcodec/sanm: fix BL16 c1/7 source overread Fix the required size calculation. Reported-by: Ruikai Peng <ruikai@pwno.io> Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>	2026-01-20 09:47:47 +00:00
Andreas Rheinhardt	94b7385592	avcodec/mlpenc: Mark unreachable cases as such Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-01-20 00:38:35 +00:00
Ramiro Polla	e960f0aa01	avcodec/mjpegdec: remove qscale_table field from MJpegDecodeContext This field has been unused since `759001c534`.	2026-01-19 22:42:09 +00:00
Marton Balint	387a522106	avcodec/libvpxdec: use codec capabilities to determine if external frame buffer can be used Previously we used the codec or at the time of decoding fb_priv for this, but fb_priv can be nonzero even if an external frame buffer is not set, so it's cleaner to use the capability flag directly. Also check the result of vpx_codec_set_frame_buffer_functions. Signed-off-by: Marton Balint <cus@passwd.hu>	2026-01-19 21:32:00 +00:00
Marton Balint	a2688827f4	avcodec/libvpxdec: cache the decoder interface This saves us some #ifdefry. Signed-off-by: Marton Balint <cus@passwd.hu>	2026-01-19 21:32:00 +00:00
Marton Balint	a6069092af	avcodec/libvpxenc: log the error message from the correct encoder It is possible that the error happens with the alpha encoder, not the normal one, so let's always pass the affected encoder to the logging function. Signed-off-by: Marton Balint <cus@passwd.hu>	2026-01-19 21:32:00 +00:00
Michael Niedermayer	fc8a614f3d	avcodec/omx: Check extradata size and nFilledLen No testcase; it's unknown if this is a real issue. Reported-by: Peter Teoh <htmldeveloper@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2026-01-19 20:47:22 +00:00
Michael Niedermayer	09ec2b397a	avcodec/exr: use av_realloc_array() Related to: #YWH-PGM40646-33 See: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/21347 Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2026-01-19 20:41:04 +00:00
Lynne	3ccafa5906	ffv1_vulkan: generate a CRC table during runtime Since the recent CRC changes, get_table returns arch-dependent tables.	2026-01-19 16:37:17 +01:00
Lynne	e3a96a69cb	vulkan_dpx: remove host image upload path The main reason this was written was due to Nvidia. Nvidia always has a fickle upload path, and seemed to have a shortcut for the host image upload path. This seems to have been patched out of recent driver versions. This upload path relies on the driver keeping the same layout, down to the stride for the images. Which is an assumption that's not portable. Rather than relying on this fickle upload path, what we'd like when we want pure bandwidth is to decouple uploads to a separate queue, and let the GPU pull the data from RAM via uploads. It'll be slower with a single-threaded decoder, but currently all of our compute-based decoders and the decoders that sit underneath them support frame threading.	2026-01-19 16:37:17 +01:00
Lynne	713e3c4f91	vulkan_decode: do not align single-plane images to subsampling Unlike multiplane images, single-plane images do not need to be aligned to chroma width. Saves a bit of memory.	2026-01-19 16:37:16 +01:00
Lynne	8dcf02ac63	vulkan: remove IS_WITHIN macro This is the more correct GLSL solution.	2026-01-19 16:37:15 +01:00
Araz Iusubov	850436a517	avcodec/amfenc: fix async_depth deadlock with lookahead AMF encoders may deadlock when lookahead > async_depth. Automatically adjust async_depth to lookahead + 1 to prevent hangs.	2026-01-19 15:36:37 +00:00
Gyan Doshi	43dbc011fa	avcodec/bsf/setts: add option prescale When prescale is enabled, time fields are converted to the output timebase before expression evaluation. This allows option specification even if the input timebase is unknown.	2026-01-19 16:51:47 +05:30
Gyan Doshi	1ccd2f6243	avcodec/bsf/setts: rescale TS when changing TB The setts bsf has an option to change TB. However the filter only changed the TB and did not rescale the ts and duration, so it effectively and silently stretched or squeezed the stream. The pts, dts and duration are now rescaled to maintain temporal fidelity.	2026-01-19 16:51:31 +05:30
Zhao Zhili	8f9700bff0	avcodec/d3d12va_encode_h264: simplify deblock option to bool type Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>	2026-01-19 09:14:06 +00:00
Andreas Rheinhardt	063684efec	avcodec/mlp: Don't use internals of CRC API ff_mlp_restart_checksum() used the (undocumented) layout of the CRC tables and therefore broke on x86 when the clmul implementation added in `dc03cffe9c` is used. This commit fixes this (and issue #21506). Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>	2026-01-19 00:30:20 +01:00
Timo Rothenpieler	7379539685	avcodec/prores_raw: use av_popcount instead of limited lookup table The calculation can yield bigger values than the lookup table provides. So just use actual popcount instead. Fixes #21503	2026-01-18 18:52:55 +01:00
James Almer	685ceebd42	avcodec/vc1dec: fix memory leak on error Regression since `8a1c2779a0`. Fixes CID 732271. Signed-off-by: James Almer <jamrial@gmail.com>	2026-01-17 17:56:06 -03:00
averne	3829f4ba6a	vulkan/prores: reduce push constants size The VK spec only mandates 128B, and some platforms don't actually implement more. This moves the quantization matrices to the per-frame buffer.	2026-01-17 17:33:31 +00:00
James Almer	f8e39f6c73	avcodec/hevc/ps: add missing check for profile tier level count Fixes issue #21488. Signed-off-by: James Almer <jamrial@gmail.com>	2026-01-17 12:37:47 -03:00
James Almer	f311969c03	avcodec/qdm2: propagate error values in the entire decoder And add missing error value checks. Fixes the rest of issue #21476. Signed-off-by: James Almer <jamrial@gmail.com>	2026-01-17 12:03:51 -03:00
James Almer	1ffcd07400	avcodec/mimic: check return value of init_get_bits() Fixes part of issue #21476. Signed-off-by: James Almer <jamrial@gmail.com>	2026-01-17 12:02:31 -03:00
James Almer	bb29b51876	avcodec/vc1dec: don't overwrite error codes returned by init_get_bits8() Done by mistake in `8a1c2779a0`. Signed-off-by: James Almer <jamrial@gmail.com>	2026-01-17 11:01:21 -03:00
Ling, Edison	c3d3377fe1	avcodec/d3d12va_encode: Add H264 deblock filter parameter support add parameter `deblock` for users to explicitly enable/disable deblocking filter in d3d12 H264 encoding usage: -deblock enable or -deblock 1 -deblock disable or -deblock 0 -deblock auto or -deblock -1 sample command line: ``` .\ffmpeg.exe -hwaccel d3d12va -hwaccel_output_format d3d12 -i input.mp4 -c:v h264_d3d12va -deblock enable -y output.mp4 ```	2026-01-16 07:03:37 +00:00
Ruikai Peng	be82aef7cc	lavc/aacdec_usac: fix CPE channel index in ff_aac_usac_reset_state() Fix a simple index bug in ff_aac_usac_reset_state() that writes past the end of ChannelElement.ch[2] for CPE. ff_aac_usac_reset_state() loops over channels with j < ch, but incorrectly takes &che->ch[ch]. For CPE (ch == 2) this becomes che->ch[2], which is one past the end of ChannelElement.ch[2], and the subsequent memset() causes an intra-object out-of-bounds write. Index the channel element with the loop variable (j).	2026-01-15 19:32:52 +00:00
James Almer	8a1c2779a0	avcodec/vc1dec: check return values of all init_get_bits() calls And replace them with init_get_bits8, to prevent integer overflows on huge values. Fixes issue #21463. Signed-off-by: James Almer <jamrial@gmail.com>	2026-01-15 16:07:46 -03:00
Jun Zhao	b326b3a08d	lavc/av1_parser: Extract SAR from render_size Extract the Sample Aspect Ratio (SAR) from render_width_minus_1 and render_height_minus_1 in the sequence header. The AV1 specification defines the render dimensions, which can be used in conjunction with the coded dimensions to determine the pixel aspect ratio. This ensures consistent aspect ratio handling for AV1 streams encapsulated in containers like MP4 or MKV, as observed in the updated FATE tests where SAR changes from 0/1 to 1/1. Signed-off-by: Jun Zhao <barryjzhao@tencent.com>	2026-01-14 23:56:39 +00:00
Lynne	e51c549f6e	vulkan/dpx: drop using the nontemporal extension It's rarely respected by implementations, it's fairly new (1 year old), and it has a scuffed define (neither glslc nor glslang enable the "GL_EXT_nontemporal_keyword" define if it's enabled, unlike all other extensions).	2026-01-14 16:13:22 +01:00
Lynne	f2a55af9a4	vulkan_dpx: switch to compile-time SPIR-V generation	2026-01-12 17:28:43 +01:00
Lynne	0f4667fc11	vulkan_prores_raw: clean up and optimize	2026-01-12 17:28:42 +01:00
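
The HEVC dequant commits above (`24f296c7a1` and the NEON follow-ups) all rest on the same rule: shift = 15 - bit_depth - log2_size, with a rounding right shift for positive values, a left shift for negative values, and an early return when the shift is 0. The following C is a minimal sketch of that rule only, not FFmpeg's actual code: the name `dequant_sketch` and its signature are hypothetical, and the right shift of negative values is assumed to be arithmetic, as it is on common compilers.

```c
#include <stdint.h>

/*
 * Hypothetical illustration of the shift rule quoted in the commit messages.
 * coeffs holds a square block of (1 << log2_size) x (1 << log2_size) values.
 */
static void dequant_sketch(int16_t *coeffs, int log2_size, int bit_depth)
{
    int shift = 15 - bit_depth - log2_size;
    int n     = 1 << (2 * log2_size);   /* number of coefficients in the block */

    /* Identity transform (e.g. 10-bit 32x32, 12-bit 8x8): nothing to do. */
    if (shift == 0)
        return;

    if (shift > 0) {
        /* Rounding right shift: add half of the divisor, then shift. */
        int offset = 1 << (shift - 1);
        for (int i = 0; i < n; i++)
            coeffs[i] = (int16_t)((coeffs[i] + offset) >> shift);
    } else {
        /* Negative shift means scaling up; multiply rather than
         * left-shifting a possibly negative value. */
        for (int i = 0; i < n; i++)
            coeffs[i] = (int16_t)(coeffs[i] * (1 << -shift));
    }
}
```

For an 8-bit 4x4 block the shift is 15 - 8 - 2 = 5, so a coefficient of 48 becomes (48 + 16) >> 5 = 2. For a 12-bit 16x16 block the shift is -1, so each coefficient is doubled.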

53414 Commits