mirror of
https://mirror.skon.top/https://github.com/FFmpeg/FFmpeg
synced 2026-04-23 02:11:14 +08:00
110d0cdc9d1ec414a658f841a3fbefbf6f796d61
Code mostly inspired by vp8's MC, however:
- its MMX2 horizontal filter is worse because it can't take advantage of
the coefficient redundancy
- that same coefficient redundancy allows better code for non-SSSE3 versions
Benchmark (rounded to tens of unit):
V8x8 H8x8 2D8x8 V16x16 H16x16 2D16x16
C 445 358 985 1785 1559 3280
MMX* 219 271 478 714 929 1443
SSE2 131 158 294 425 515 892
SSSE3 120 122 248 387 390 763
End result is overall around a 15% speedup for SSSE3 version (on 6 sequences);
all loop filter functions now take around 55% of decoding time, while luma MC
dsp functions are around 6%, chroma ones are 1.3% and biweight around 2.3%.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Libav README ------------ 1) Documentation ---------------- * Read the documentation in the doc/ directory. 2) Licensing ------------ * See the LICENSE file.
Description
Languages
C
89.4%
Assembly
8.3%
Makefile
1.3%
C++
0.3%
GLSL
0.3%
Other
0.2%