FFmpeg

mirror of https://mirror.skon.top/https://github.com/FFmpeg/FFmpeg synced 2026-04-20 21:00:41 +08:00

Files

Jun Zhao 89c21b5ab7 lavc/hevc: add aarch64 NEON for Planar prediction

Add NEON-optimized implementation for HEVC intra Planar prediction at
8-bit depth, supporting all block sizes (4x4 to 32x32).

Planar prediction implements bilinear interpolation using an incremental
base update: base_{y+1}[x] = base_y[x] - (top[x] - left[N]), reducing
per-row computation from 4 multiply-adds to 1 subtract + 1 multiply.
Uses rshrn for rounded narrowing shifts, eliminating manual rounding
bias. All left[y] values are broadcast in the NEON domain, avoiding
GP-to-NEON transfers.
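
The incremental update described above can be checked in scalar C. The following is a minimal sketch, not the NEON code from this commit: it assumes the standard HEVC Planar formula, with `top[N]` as the top-right and `left[N]` as the bottom-left neighbour, and the function names `planar_direct`, `planar_incremental`, and `planar_forms_match` are hypothetical. The rounding add of N followed by the shift stands in for what `rshrn` does in one instruction.

```c
#include <stdint.h>
#include <string.h>

/* Direct form: four multiply-adds per sample (8-bit, N = 1 << log2_n).
 * top and left each hold N + 1 samples. */
void planar_direct(uint8_t *dst, int stride, const uint8_t *top,
                   const uint8_t *left, int log2_n)
{
    int n = 1 << log2_n;
    for (int y = 0; y < n; y++)
        for (int x = 0; x < n; x++)
            dst[y * stride + x] =
                (uint8_t)(((n - 1 - x) * left[y] + (x + 1) * top[n] +
                           (n - 1 - y) * top[x] + (y + 1) * left[n] + n)
                          >> (log2_n + 1));
}

/* Incremental form: base_y[x] collects every term that does not depend on
 * left[y]; stepping to the next row subtracts (top[x] - left[N]). */
void planar_incremental(uint8_t *dst, int stride, const uint8_t *top,
                        const uint8_t *left, int log2_n)
{
    int n = 1 << log2_n;
    int base[32], dec[32];

    for (int x = 0; x < n; x++) {
        base[x] = (x + 1) * top[n] + (n - 1) * top[x] + left[n]; /* base_0 */
        dec[x]  = top[x] - left[n];                              /* per-row step */
    }
    for (int y = 0; y < n; y++) {
        for (int x = 0; x < n; x++) {
            /* one multiply per sample, then a rounded shift by log2(N) + 1 */
            int sum = base[x] + (n - 1 - x) * left[y];
            dst[y * stride + x] = (uint8_t)((sum + n) >> (log2_n + 1));
            base[x] -= dec[x];                                   /* one subtract */
        }
    }
}

/* Returns 1 when both forms agree on an arbitrary deterministic test pattern. */
int planar_forms_match(int log2_n)
{
    int n = 1 << log2_n;
    uint8_t top[33], left[33], a[32 * 32], b[32 * 32];

    for (int i = 0; i <= n; i++) {
        top[i]  = (uint8_t)(i * 37 + 11);
        left[i] = (uint8_t)(200 - i * 5);
    }
    planar_direct(a, n, top, left, log2_n);
    planar_incremental(b, n, top, left, log2_n);
    return memcmp(a, b, (size_t)n * n) == 0;
}
```

Calling `planar_forms_match()` with `log2_n` from 2 to 5 covers the 4x4 to 32x32 block sizes listed in this commit. The sketch keeps `base` in plain `int`, so it says nothing about the lane widths or register allocation the assembly uses.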

4x4 interleaves row computations across 4 rows to break dependencies.
16x16 uses v19-v22 for persistent base/decrement vectors, avoiding
callee-saved register spills. 32x32 processes 8 rows per loop iteration
(4 iterations total) to reduce code size while maintaining full NEON
utilization.

Speedup over C on Apple M4 (checkasm --bench):

    4x4: 2.25x    8x8: 6.40x    16x16: 9.72x    32x32: 3.21x

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>

2026-03-30 14:32:10 +00:00

h26x

aarch64: Add PAC sign/validation of the link register

2026-03-20 13:16:06 +02:00

vvc

aarch64/vvc: Optimisations of put_chroma_v() functions for 10/12-bit

2026-03-27 13:42:50 +00:00

aacencdsp_init.c

…

aacencdsp_neon.S

…

aacpsdsp_init_aarch64.c

…

aacpsdsp_neon.S

…

ac3dsp_init_aarch64.c

…

ac3dsp_neon.S

avcodec/aarch64/ac3dsp_neon.S: Optimize ac3_sum_square_butterfly_int32_neon

2025-03-02 01:17:53 +02:00

cabac.h

…

fdct.h

…

fdctdsp_init_aarch64.c

…

fdctdsp_neon.S

…

fmtconvert_init.c

…

fmtconvert_neon.S

…

h264chroma_init_aarch64.c

…

h264cmc_neon.S

…

h264dsp_init_aarch64.c

avcodec/h264dsp: Remove redundant h264 from H264DSPCtx member names

2026-01-25 22:53:25 +01:00

h264dsp_neon.S

…

h264idct_neon.S

…

h264pred_init.c

aarch64/h264pred: disable inefficient functions

2026-02-04 09:06:37 +00:00

h264pred_neon.S

lavc/aarch64: Fix addp overflow in ff_pred16x16_plane_neon_10

2025-10-24 15:32:35 +00:00

h264qpel_init_aarch64.c

…

h264qpel_neon.S

…

hevcdsp_deblock_neon.S

aarch64: hevcdsp: Make returns match the call site

2026-03-17 20:37:53 +00:00

hevcdsp_dequant_neon.S

lavc/hevc: add aarch64 neon for 12-bit dequant

2026-01-25 06:55:26 +00:00

hevcdsp_idct_neon.S

aarch64/hevcdsp_idct_neon: Add implementation for idct dc 12

2025-03-04 17:01:58 +08:00

hevcdsp_init_aarch64.c

lavc/hevc: reorder aarch64 NEON pel function assignments

2026-03-13 21:43:37 +00:00

hevcpred_init_aarch64.c

lavc/hevc: add aarch64 NEON for Planar prediction

2026-03-30 14:32:10 +00:00

hevcpred_neon.S

lavc/hevc: add aarch64 NEON for Planar prediction

2026-03-30 14:32:10 +00:00

hpeldsp_init_aarch64.c

…

hpeldsp_neon.S

aarch64/hpeldsp_neon: fix out-of-bounds read

2026-01-04 03:22:55 +00:00

huffyuvdsp_init_aarch64.c

libavcodec/huffyuvdsp: Add NEON optimization for the add_int16 function

2026-03-04 22:31:19 +00:00

huffyuvdsp_neon.S

libavcodec/huffyuvdsp: Add NEON optimization for the add_int16 function

2026-03-04 22:31:19 +00:00

idct.h

…

idctdsp_init_aarch64.c

…

idctdsp_neon.S

…

Makefile

lavc/hevc: add aarch64 NEON for DC prediction

2026-03-30 14:32:10 +00:00

me_cmp_init_aarch64.c

avcodec/mpegvideoenc: Add MPVEncContext

2025-03-26 04:08:33 +01:00

me_cmp_neon.S

aarch64: Add PAC sign/validation of the link register

2026-03-20 13:16:06 +02:00

mpegaudiodsp_init.c

…

mpegaudiodsp_neon.S

…

mpegvideoencdsp_init.c

…

mpegvideoencdsp_neon.S

…

neon.S

…

neontest.c

…

opusdsp_init.c

…

opusdsp_neon.S

…

pixblockdsp_init_aarch64.c

avcodec/pixblockdsp: be consistent about restrict use in ff_{get,diff}_pixels

2025-10-25 01:01:15 +02:00

pixblockdsp_neon.S

…

pngdsp_init.c

avcodec/aarch64: add pngdsp

2026-02-04 12:05:35 +08:00

pngdsp_neon.S

avcodec/aarch64: add pngdsp

2026-02-04 12:05:35 +08:00

rv40dsp_init_aarch64.c

…

sbrdsp_init_aarch64.c

…

sbrdsp_neon.S

…

simple_idct_neon.S

…

synth_filter_init.c

…

synth_filter_neon.S

…

vc1dsp_init_aarch64.c

…

vc1dsp_neon.S

…

videodsp_init.c

…

videodsp.S

…

vorbisdsp_init.c

…

vorbisdsp_neon.S

…

vp8dsp_init_aarch64.c

…

vp8dsp_neon.S

…

vp8dsp.h

…

vp9dsp_init_10bpp_aarch64.c

…

vp9dsp_init_12bpp_aarch64.c

…

vp9dsp_init_16bpp_aarch64_template.c

…

vp9dsp_init_aarch64.c

…

vp9dsp_init.h

…

vp9itxfm_16bpp_neon.S

…

vp9itxfm_neon.S

…

vp9lpf_16bpp_neon.S

…

vp9lpf_neon.S

…

vp9mc_16bpp_neon.S

…

vp9mc_aarch64.S

…

vp9mc_neon.S

…