FFmpeg/libavcodec
Andreas Rheinhardt af3f8f5bd2 avcodec/x86/vvc/of: Break dependency chain
Don't extract one word from a register and write the result back
into that same register, one word at a time; use separate src and
dst registers, so that pextrw and bsr can be done in parallel.
Also use movd instead of pinsrw for the first word.
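
The data flow described above can be modelled in a few lines of Python. This is only a sketch, not the assembly from the commit: the function names, the 8-word register width, the nonzero inputs and the unit instruction costs are all assumptions made for illustration, and the cost model ignores execution-port limits.

```python
# Toy model of a per-word log2 (pextrw -> bsr -> pinsrw) over one vector register.
# All names and the unit latencies below are hypothetical, chosen for illustration.

def bsr(x):
    """Index of the highest set bit, like x86 bsr (input assumed nonzero)."""
    assert x > 0
    return x.bit_length() - 1

def log2_in_place(reg):
    """Old pattern: read a word from reg, then write the result back into reg."""
    reg = list(reg)
    for i in range(len(reg)):
        w = reg[i]        # pextrw: reads reg, which the previous pinsrw just wrote
        reg[i] = bsr(w)   # pinsrw: writes reg again
    return reg

def log2_src_dst(src):
    """New pattern: read only from src, write only to dst."""
    dst = [bsr(src[0])] + [0] * (len(src) - 1)  # movd: sets word 0, zeroes the rest
    for i in range(1, len(src)):
        dst[i] = bsr(src[i])                    # pextrw reads src, pinsrw writes dst
    return dst

def critical_path_in_place(n, extract=1, scan=1, insert=1):
    # Each extract waits for the previous insert, so the whole sequence is serial.
    return n * (extract + scan + insert)

def critical_path_src_dst(n, extract=1, scan=1, insert=1):
    # Extracts and scans depend only on src and can overlap;
    # only the n writes into dst remain a chain.
    return extract + scan + n * insert

words = [1, 2, 3, 4, 255, 256, 1023, 32768]
print(log2_in_place(words))
print(log2_src_dst(words))
print(critical_path_in_place(8), critical_path_src_dst(8))
```

Both functions return the same values; only the dependency structure differs. Under the assumed unit costs the serial chain shrinks from 24 to 10 steps for 8 words, which is the kind of effect the benchmarks below measure on real hardware.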

Old benchmarks:
apply_bdof_8_8x16_c:                                  3275.2 ( 1.00x)
apply_bdof_8_8x16_avx2:                                487.6 ( 6.72x)
apply_bdof_8_16x8_c:                                  3243.1 ( 1.00x)
apply_bdof_8_16x8_avx2:                                284.4 (11.40x)
apply_bdof_8_16x16_c:                                 6501.8 ( 1.00x)
apply_bdof_8_16x16_avx2:                               570.0 (11.41x)
apply_bdof_10_8x16_c:                                 3286.5 ( 1.00x)
apply_bdof_10_8x16_avx2:                               461.7 ( 7.12x)
apply_bdof_10_16x8_c:                                 3274.5 ( 1.00x)
apply_bdof_10_16x8_avx2:                               271.4 (12.06x)
apply_bdof_10_16x16_c:                                6590.0 ( 1.00x)
apply_bdof_10_16x16_avx2:                              543.9 (12.12x)
apply_bdof_12_8x16_c:                                 3307.6 ( 1.00x)
apply_bdof_12_8x16_avx2:                               462.2 ( 7.16x)
apply_bdof_12_16x8_c:                                 3287.4 ( 1.00x)
apply_bdof_12_16x8_avx2:                               271.8 (12.10x)
apply_bdof_12_16x16_c:                                6465.7 ( 1.00x)
apply_bdof_12_16x16_avx2:                              543.8 (11.89x)

New benchmarks:
apply_bdof_8_8x16_c:                                  3255.7 ( 1.00x)
apply_bdof_8_8x16_avx2:                                349.3 ( 9.32x)
apply_bdof_8_16x8_c:                                  3262.5 ( 1.00x)
apply_bdof_8_16x8_avx2:                                214.8 (15.19x)
apply_bdof_8_16x16_c:                                 6471.6 ( 1.00x)
apply_bdof_8_16x16_avx2:                               429.8 (15.06x)
apply_bdof_10_8x16_c:                                 3227.7 ( 1.00x)
apply_bdof_10_8x16_avx2:                               321.6 (10.04x)
apply_bdof_10_16x8_c:                                 3250.2 ( 1.00x)
apply_bdof_10_16x8_avx2:                               201.2 (16.16x)
apply_bdof_10_16x16_c:                                6476.5 ( 1.00x)
apply_bdof_10_16x16_avx2:                              400.9 (16.16x)
apply_bdof_12_8x16_c:                                 3230.7 ( 1.00x)
apply_bdof_12_8x16_avx2:                               321.8 (10.04x)
apply_bdof_12_16x8_c:                                 3210.5 ( 1.00x)
apply_bdof_12_16x8_avx2:                               200.9 (15.98x)
apply_bdof_12_16x16_c:                                6474.5 ( 1.00x)
apply_bdof_12_16x16_avx2:                              400.2 (16.18x)
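
To compare the two runs directly, the AVX2 timings above can be divided old by new. The snippet below is a small sketch that copies the avx2 figures from both tables; the dictionary layout and variable names are assumptions, and the unit is whatever the benchmark tool reported.

```python
# AVX2 timings copied from the "Old benchmarks" and "New benchmarks" tables.
# Keys are "<bit depth>_<block size>"; the naming is ours, not the tool's.
old = {
    "8_8x16": 487.6, "8_16x8": 284.4, "8_16x16": 570.0,
    "10_8x16": 461.7, "10_16x8": 271.4, "10_16x16": 543.9,
    "12_8x16": 462.2, "12_16x8": 271.8, "12_16x16": 543.8,
}
new = {
    "8_8x16": 349.3, "8_16x8": 214.8, "8_16x16": 429.8,
    "10_8x16": 321.6, "10_16x8": 201.2, "10_16x16": 400.9,
    "12_8x16": 321.8, "12_16x8": 200.9, "12_16x16": 400.2,
}

# Relative speedup of the new AVX2 code over the old AVX2 code.
speedups = {name: old[name] / new[name] for name in old}

for name, s in speedups.items():
    print(f"apply_bdof_{name}_avx2: {old[name]:6.1f} -> {new[name]:6.1f}  ({s:.2f}x)")
```

With these figures every configuration improves by a factor between roughly 1.32 and 1.44, with the 8x16 blocks gaining the most.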

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-02-22 01:05:12 +01:00