FFmpeg/libavcodec/aarch64
Jun Zhao 8966101fa6 lavc/hevc: add aarch64 neon for 12-bit dequant
Implement NEON optimization for HEVC dequant at 12-bit depth.

For 12-bit: shift = 15 - 12 - log2_size = 3 - log2_size. When shift
is negative (log2_size > 3, i.e. 16x16 and 32x32), we use shl (shift
left) by -shift instead of srshr (signed rounding shift right).
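
The shift rule above can be sketched in scalar C as follows. This is an illustration only, not the NEON code from this commit and not a copy of the C reference in libavcodec: the function names dequant_shift_12bit and dequant_12bit_ref are hypothetical, and the handling of shift == 0 (left shift by zero, i.e. no change) is an assumption.

```c
#include <stdint.h>

/* Hypothetical helper: shift amount for 12-bit content, as stated in the
 * commit message: shift = 15 - 12 - log2_size = 3 - log2_size. */
static int dequant_shift_12bit(int log2_size)
{
    return 15 - 12 - log2_size;
}

/* Hypothetical scalar sketch of 12-bit dequant over a square block of
 * (1 << log2_size) x (1 << log2_size) coefficients stored contiguously. */
static void dequant_12bit_ref(int16_t *coeffs, int log2_size)
{
    int shift = dequant_shift_12bit(log2_size);
    int n     = 1 << (2 * log2_size);   /* total coefficient count */

    if (shift > 0) {
        /* Rounding right shift; this is what srshr does per lane.
         * Assumes >> on negative values is an arithmetic shift. */
        int offset = 1 << (shift - 1);
        for (int i = 0; i < n; i++)
            coeffs[i] = (int16_t)((coeffs[i] + offset) >> shift);
    } else {
        /* Zero or negative shift: shift left by -shift (the shl case).
         * Going through uint16_t avoids left-shifting a negative int;
         * the result wraps to 16 bits on common compilers. */
        for (int i = 0; i < n; i++)
            coeffs[i] = (int16_t)((uint16_t)coeffs[i] << -shift);
    }
}
```

With this sketch, a 4x4 block (log2_size = 2) takes the rounding right-shift path with shift = 1, while 16x16 and 32x32 blocks take the left-shift path with shift = -1 and -2 respectively.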

Performance benchmark on Apple M4:
./tests/checkasm/checkasm --test=hevc_dequant --bench
hevc_dequant_4x4_12_c:                                   9.9 ( 1.00x)
hevc_dequant_4x4_12_neon:                                5.7 ( 1.74x)

hevc_dequant_8x8_12_c:                                   1.7 ( 1.00x)
hevc_dequant_8x8_12_neon:                                1.3 ( 1.30x)

hevc_dequant_16x16_12_c:                               131.1 ( 1.00x)
hevc_dequant_16x16_12_neon:                              7.9 (16.52x)

hevc_dequant_32x32_12_c:                                69.7 ( 1.00x)
hevc_dequant_32x32_12_neon:                             28.4 ( 2.46x)

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-01-25 06:55:26 +00:00