Andreas Rheinhardt af3f8f5bd2 avcodec/x86/vvc/of: Break dependency chain
Don't extract and update one word of one and the same register
at a time; use separate src and dst registers, so that pextrw
and bsr can be done in parallel. Also use movd instead of pinsrw
for the first word.

Old benchmarks:
apply_bdof_8_8x16_c:                                  3275.2 ( 1.00x)
apply_bdof_8_8x16_avx2:                                487.6 ( 6.72x)
apply_bdof_8_16x8_c:                                  3243.1 ( 1.00x)
apply_bdof_8_16x8_avx2:                                284.4 (11.40x)
apply_bdof_8_16x16_c:                                 6501.8 ( 1.00x)
apply_bdof_8_16x16_avx2:                               570.0 (11.41x)
apply_bdof_10_8x16_c:                                 3286.5 ( 1.00x)
apply_bdof_10_8x16_avx2:                               461.7 ( 7.12x)
apply_bdof_10_16x8_c:                                 3274.5 ( 1.00x)
apply_bdof_10_16x8_avx2:                               271.4 (12.06x)
apply_bdof_10_16x16_c:                                6590.0 ( 1.00x)
apply_bdof_10_16x16_avx2:                              543.9 (12.12x)
apply_bdof_12_8x16_c:                                 3307.6 ( 1.00x)
apply_bdof_12_8x16_avx2:                               462.2 ( 7.16x)
apply_bdof_12_16x8_c:                                 3287.4 ( 1.00x)
apply_bdof_12_16x8_avx2:                               271.8 (12.10x)
apply_bdof_12_16x16_c:                                6465.7 ( 1.00x)
apply_bdof_12_16x16_avx2:                              543.8 (11.89x)

New benchmarks:
apply_bdof_8_8x16_c:                                  3255.7 ( 1.00x)
apply_bdof_8_8x16_avx2:                                349.3 ( 9.32x)
apply_bdof_8_16x8_c:                                  3262.5 ( 1.00x)
apply_bdof_8_16x8_avx2:                                214.8 (15.19x)
apply_bdof_8_16x16_c:                                 6471.6 ( 1.00x)
apply_bdof_8_16x16_avx2:                               429.8 (15.06x)
apply_bdof_10_8x16_c:                                 3227.7 ( 1.00x)
apply_bdof_10_8x16_avx2:                               321.6 (10.04x)
apply_bdof_10_16x8_c:                                 3250.2 ( 1.00x)
apply_bdof_10_16x8_avx2:                               201.2 (16.16x)
apply_bdof_10_16x16_c:                                6476.5 ( 1.00x)
apply_bdof_10_16x16_avx2:                              400.9 (16.16x)
apply_bdof_12_8x16_c:                                 3230.7 ( 1.00x)
apply_bdof_12_8x16_avx2:                               321.8 (10.04x)
apply_bdof_12_16x8_c:                                 3210.5 ( 1.00x)
apply_bdof_12_16x8_avx2:                               200.9 (15.98x)
apply_bdof_12_16x16_c:                                6474.5 ( 1.00x)
apply_bdof_12_16x16_avx2:                              400.2 (16.18x)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-02-22 01:05:12 +01:00
2026-01-12 17:28:41 +01:00
2025-08-08 21:51:15 +00:00
2023-03-01 21:59:10 +01:00
2025-06-23 14:48:40 +02:00
2025-08-03 13:48:47 +02:00
2025-05-07 15:35:47 +02:00
2025-08-14 08:42:29 -04:00

FFmpeg README

FFmpeg is a collection of libraries and tools to process multimedia content such as audio, video, subtitles and related metadata.

Libraries

  • libavcodec provides implementation of a wider range of codecs.
  • libavformat implements streaming protocols, container formats and basic I/O access.
  • libavutil includes hashers, decompressors and miscellaneous utility functions.
  • libavfilter provides means to alter decoded audio and video through a directed graph of connected filters.
  • libavdevice provides an abstraction to access capture and playback devices.
  • libswresample implements audio mixing and resampling routines.
  • libswscale implements color conversion and scaling routines.

Tools

  • ffmpeg is a command line toolbox to manipulate, convert and stream multimedia content.
  • ffplay is a minimalistic multimedia player.
  • ffprobe is a simple analysis tool to inspect multimedia content.
  • Additional small tools such as aviocat, ismindex and qt-faststart.

Documentation

The offline documentation is available in the doc/ directory.

The online documentation is available in the main website and in the wiki.

Examples

Coding examples are available in the doc/examples directory.

License

FFmpeg codebase is mainly LGPL-licensed with optional components licensed under GPL. Please refer to the LICENSE file for detailed information.

Contributing

Patches should be submitted to the ffmpeg-devel mailing list using git format-patch or git send-email. Github pull requests should be avoided because they are not part of our review process and will be ignored.

Description
No description provided
Readme 581 MiB
Languages
C 89.4%
Assembly 8.3%
Makefile 1.3%
C++ 0.3%
GLSL 0.3%
Other 0.2%