Commit Graph

2261 Commits

Author SHA1 Message Date
Michael Niedermayer
b3f9eac35a swscale/output: Fix integer overflow in yuv2gbrp_full_X_c()
Fixes: signed integer overflow: 1966895953 + 210305024 cannot be represented in type 'int'
Fixes: 391921975/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-5916798905548800

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit ce538ef97a)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-05-16 18:59:58 +02:00
Michael Niedermayer
d67d0175db swscale/output: Fix undefined overflow in yuv2rgba64_full_X_c_template()
Fixes: signed integer overflow: -1082982400 + -1195645138 cannot be represented in type 'int'
Fixes: 376136843/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-4791844321427456

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit 56faee21c1)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-05-16 03:03:48 +02:00
Michael Niedermayer
201f2c5912 swscale/slice: clear allocated memory in alloc_lines()
Fixes: use of uninitialized memory in hScale16To15_c()
Fixes: 373924007/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-5841199968092160

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit aeec39f3c1)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-05-16 03:03:43 +02:00
Michael Niedermayer
de74e8ee6f swscale/output: Fix undefined integer overflow in yuv2rgba64_2_c_template()
Fixes: signed integer overflow: -1082982400 + -1083218484 cannot be represented in type 'int'
Fixes: 70657/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-6707819712675840

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit bd80c97391)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-05-16 03:03:36 +02:00
Michael Niedermayer
b24fff0e60 swscale/swscale: Use unsigned operation to avoid undefined behavior
I have not checked that the constant is correct, this just fixes the undefined behavior

Fixes: signed integer overflow: -646656 * 3517 cannot be represented in type 'int
Fixes: 70559/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-5209368631508992

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit 44c5641ae8)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-05-16 03:03:36 +02:00
Michael Niedermayer
82953b7570 swscale/output: Fix integer overflows in yuv2rgba64_X_c_template
Fixes: signed integer overflow: -1082982400 + -1068681048 cannot be represented in type 'int'
Fixes: 69995/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-6285740271534080

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit bcab9789ef)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-22 10:42:24 +02:00
Michael Niedermayer
77c7c10755 swscale/output: Avoid undefined overflow in yuv2rgb_write_full()
Fixes: signed integer overflow: -140140 * 16525 cannot be represented in type 'int'
Fixes: 68859/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-4516387130245120

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit c221c7422f)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-21 17:36:40 +02:00
Michael Niedermayer
709fae3a49 swscale/output: alpha can become negative after scaling, use multiply
Fixes: left shift of negative value -3245
Fixes: 69047/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-6571511551950848

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit 9e6c5b6e86)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-21 17:36:07 +02:00
Michael Niedermayer
97411b1790 swscale/yuv2rgb: Use 64bit for brightness computation
This will not overflow for normal values
Fixes: CID1500280 Unintentional integer overflow

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit bfc22f364d)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-06-13 18:30:08 +02:00
Michael Niedermayer
c3078364af swscale/output: Fix integer overflow in yuv2rgba64_full_1_c_template()
Fixes: signed integer overflow: -1082982400 + -1079364728 cannot be represented in type 'int'
Fixes: 67910/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-5329011971522560
The input is 9bit in 16bit, the fuzzer fills all 16bit thus generating "invalid" input
No overflow should happen with valid input.

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit 1330a73cca)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-06-13 11:23:11 +02:00
Michael Niedermayer
658d282659 swscale/output: Fix integer overflow in yuv2rgba64_1_c_template
Fixes: signed integer overflow: -831176 * 9539 cannot be represented in type 'int'
Fixes: 67869/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-5117342091640832

The input is 9bit in 16bit, the fuzzer fills all 16bit thus generating "invalid" input
No overflow should happen with valid input.

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit a56559e688)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-06-13 11:23:10 +02:00
Michael Niedermayer
c0bb8e0e62 swscale/utils: Fix xInc overflow
Fixes: signed integer overflow: 2 * 1073741824 cannot be represented in type 'int'
Fixes: 67802/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-6249515855183872

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit 1a9eda65d0)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-04-14 22:12:50 +02:00
Michael Niedermayer
2c2dbde45e libswscale/utils: Fix bayer to yuvj
Fixes: out of array access.

Earlier code assumes that a unscaled bayer to yuvj420 converter exists
but the later code then skips yuvj420

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit e9cc9e492f)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-04-14 22:12:43 +02:00
Michael Niedermayer
22eee37d23 swscale/swscale: Check srcSliceH for bayer
Fixes: Assertion srcSliceH > 1 failed at libswscale/swscale_unscaled.c:1359
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit 64098d0cd8)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-04-14 22:12:43 +02:00
Michael Niedermayer
84ed2a2b5a swscale/utils: Allocate more dithererror
Fixes: out of array read
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit 18f26f8a2f)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-04-14 22:12:43 +02:00
Michael Niedermayer
a604063ede swscale/input: Use more unsigned intermediates
Same principle as previous commit, with sufficiently huge rgb2yuv table
values this produces wrong results and undefined behavior.
The unsigned produces the same incorrect results. That is probably
ok as these cases with huge values seem not to occur in any real
use case.

Fixes: signed integer overflow
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit ba209e3d51)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-04-21 01:55:09 +02:00
Michael Niedermayer
87df8385b8 swscale/output: Bias 16bps output calculations to improve non overflowing range
Fixes: integer overflow
Fixes: ./ffmpeg   -f rawvideo -video_size 66x64 -pixel_format yuva420p10le   -i ~/videos/overflow_input_w66h64.yuva420p10le   -filter_complex "scale=flags=bicubic+full_chroma_int+full_chroma_inp+bitexact+accurate_rnd:in_color_matrix=bt2020:out_color_matrix=bt2020:in_range=full:out_range=full,format=rgba64[out]"   -pixel_format rgba64 -map '[out]'   -y overflow_w66h64.png

Found-by: Drew Dunne <asdunne@google.com>
Tested-by: Drew Dunne <asdunne@google.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit 0f0afc7fb5)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2023-04-21 01:55:08 +02:00
Martin Storsjö
9d5450b514 swscale: aarch64: Fix yuv2rgb with negative strides
Treat the 32 bit stride registers as signed.

Alternatively, we could make the stride arguments ptrdiff_t instead
of int, and changing all of the assembly to operate on these
registers with their full 64 bit width, but that's a much larger
and more intrusive change (and risks missing some operation, which
would clamp the intermediates to 32 bit still).

Fixes: https://trac.ffmpeg.org/ticket/9985

Signed-off-by: Martin Storsjö <martin@martin.st>
(cherry picked from commit cb803a0072)
Signed-off-by: Martin Storsjö <martin@martin.st>
2022-11-04 14:32:34 +02:00
Michael Niedermayer
ff87b7bd2f swscale/alphablend: Fix slice handling
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit 06d6726588)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-10-06 13:59:34 +02:00
Michael Niedermayer
d3f9206997 swscale/slice: Fix wrong return on error
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit 7874d40f10)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-10-06 13:54:16 +02:00
Michael Niedermayer
b72df5e492 swscale/slice: Check slice for allocation failure
Fixes: null pointer dereference
Fixes: alloc_slice.mp4

Found-by: Rafael Dutra <rafael.dutra@cispa.de>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit 997f9cfc12)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2021-10-06 13:54:16 +02:00
Andreas Rheinhardt
4b75d960b6 swscale/utils: Fix invalid left shifts of negative numbers
Affected the FATE-tests vsynth_lena-dv-411, vsynth1-dv-411,
vsynth2-dv-411 and hevc-paramchange-yuv420p.yuv420p10.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit e2646e23be)
2020-05-20 03:44:07 +02:00
Andreas Rheinhardt
b694403ef9 swscale/x86/swscale: Fix undefined left shifts of negative numbers
This affected many FATE-tests: The number of failing tests went down
from 663 to 344. (Both numbers exclude tests that failed because of
unaligned accesses in code that is inside #if HAVE_FAST_UNALIGNED.)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit 736c7c20e7)
2020-05-20 03:42:42 +02:00
Michael Niedermayer
01628af26d swscale/yuv2rgb: Fix vertical dither offset with slices
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit be3c29e379)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2020-05-19 17:17:36 +02:00
Michael Niedermayer
c3b5c1423e swscale/output: Fix integer overflow in yuv2rgb_write_full() with out of range input
Fixes: signed integer overflow: 1169365504 + 981452800 cannot be represented in type 'int'
Fixes: ticket8293

Found-by: Suhwan
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit e057e83a4f)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2020-05-19 17:17:36 +02:00
Michael Niedermayer
824c773263 swscale/output: Fix integer overflow in alpha computation in yuv2gbrp16_full_X_c()
Fixes: signed integer overflow: 524280 * 4432 cannot be represented in type 'int'
Fixes: ticket8322

Found-by: Suhwan
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit 49ba1879ad)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2020-05-19 17:17:36 +02:00
Michael Niedermayer
9430ad3e21 swscale/input: Fix several invalid shifts related to rgb2yuv constants
Fixes: Invalid shifts
Fixes: #8140
Fixes: #8146

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit d48e510124)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2020-05-19 17:17:35 +02:00
Michael Niedermayer
ea7a818c95 swscale/output: Fix several invalid shifts in yuv2rgb_full_1_c_template()
Fixes: Invalid shifts
Fixes: #8320

Reviewed-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit 7b7f97532b)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2020-05-19 17:17:35 +02:00
Michael Niedermayer
8a9c9711cf swscale/swscale: Fix several invalid shifts related to vChrDrop
Fixes: Invalid shifts
Fixes: #8166
Fixes: filter-crop_scale_vflip FATE-test

Reviewed-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit a6ca22c118)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2020-05-19 17:17:35 +02:00
Michael Niedermayer
22db337a40 Bump minor versions to separate 4.2 from master
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-07-21 18:36:18 +02:00
Michael Niedermayer
9d269301f0 swscale/tests/swscale: Lengthen pixfmt name buffer to 21 bytes
Some formats use longer names than 12.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-05-13 13:39:49 +02:00
Adam Richter
b8ed493061 libswcale: Fix possible string overflow in test.
In libswcale/tests/swcale.c, the function fileTest() calls sscanf in
an argument of "%12s" on character srcStr[] and dstStr[], which are
only 12 bytes.  So, if the input string is 12 characters, a
terminating null byte can be written past the end of these arrays.

This bug was found by cppcheck.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-05-13 13:39:40 +02:00
Philip Langdale
4fa4f1d7a9 swscale: Add test for isSemiPlanarYUV to pixdesc_query
Lauri had asked me what the semi planar formats were and that reminded
me that we could add it to pixdesc_query so we know exactly what the
list is.
2019-05-12 07:51:02 -07:00
Philip Langdale
cd48318035 swscale: Add support for NV24 and NV42
The implementation is pretty straight-forward. Most of the existing
NV12 codepaths work regardless of subsampling and are re-used as is.
Where necessary I wrote the slightly different NV24 versions.

Finally, the one thing that confused me for a long time was the
asm specific x86 path that did an explicit exclusion check for NV12.
I replaced that with a semi-planar check and also updated the
equivalent PPC code, which Lauri kindly checked.
2019-05-12 07:51:02 -07:00
Lauri Kasanen
e25bddf5fc swscale/ppc: Shorten power8 tests via a var 2019-05-07 10:08:16 +03:00
Lauri Kasanen
a2a16206aa swscale/ppc: VSX-optimize hScale16To*
./ffmpeg -loop 1 -s 1200x1440 -i tux16.png \
    -s 2400x720 -f rawvideo -y -vframes 5 -pix_fmt yuv420p16le -nostats test.raw

./ffmpeg -loop 1 -s 1200x1440 -i tux16.png \
    -s 2400x720 -f rawvideo -y -vframes 5 -pix_fmt yuv420p -nostats test.raw

32-bit mul, power8 only

2x speedup for hScale8To19_vsx (x86 SSE2 is 2.37):
  30896 UNITS in hscale,    8192 runs,      0 skips
  63956 UNITS in hscale,    8192 runs,      0 skips

2.06 for hScale16To15_vsx:
  30531 UNITS in hscale,    8192 runs,      0 skips
  63161 UNITS in hscale,    8192 runs,      0 skips
2019-05-07 10:08:16 +03:00
Lauri Kasanen
3437111f17 swscale/ppc: Indent 2019-05-07 10:08:16 +03:00
Lauri Kasanen
9456adc223 swscale/ppc: VSX-optimize hScale8To19
./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 \
    -s 2400x720 -f rawvideo -y -vframes 5 -pix_fmt yuv420p16le -nostats test.raw

2.26 speedup (x86 SSE2 is 2.32):
  23772 UNITS in hscale,    4096 runs,      0 skips
  53862 UNITS in hscale,    4096 runs,      0 skips
2019-05-07 10:08:16 +03:00
Lauri Kasanen
d0e4d0429e swscale/ppc: VSX-optimize hscale_fast
./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 -sws_flags fast_bilinear \
        -s 2400x720 -f rawvideo -vframes 5 -pix_fmt abgr -nostats test.raw

4.27 speedup for hyscale_fast:
  24796 UNITS in hyscale_fast,    4096 runs,      0 skips
   5797 UNITS in hyscale_fast,    4096 runs,      0 skips

4.48 speedup for hcscale_fast:
  19911 UNITS in hcscale_fast,    4095 runs,      1 skips
   4437 UNITS in hcscale_fast,    4096 runs,      0 skips
2019-04-30 14:41:28 +03:00
Lauri Kasanen
ce92ee4b4f swscale/ppc: VSX-optimize non-full-chroma yuv2rgb_2
./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 -sws_flags fast_bilinear \
        -s 1200x720 -f null -vframes 100 -pix_fmt $i -nostats \
        -cpuflags 0 -v error -

32-bit mul, power8 only.

~2x speedup:

rgb24
  24431 UNITS in yuv2packed2,   16384 runs,      0 skips
  13783 UNITS in yuv2packed2,   16383 runs,      1 skips
bgr24
  24396 UNITS in yuv2packed2,   16384 runs,      0 skips
  14059 UNITS in yuv2packed2,   16384 runs,      0 skips
rgba
  26815 UNITS in yuv2packed2,   16383 runs,      1 skips
  12797 UNITS in yuv2packed2,   16383 runs,      1 skips
bgra
  27060 UNITS in yuv2packed2,   16384 runs,      0 skips
  13138 UNITS in yuv2packed2,   16384 runs,      0 skips
argb
  26998 UNITS in yuv2packed2,   16384 runs,      0 skips
  12728 UNITS in yuv2packed2,   16381 runs,      3 skips
bgra
  26651 UNITS in yuv2packed2,   16384 runs,      0 skips
  13124 UNITS in yuv2packed2,   16384 runs,      0 skips

This is a low speedup, but the x86 mmx version also gets only ~2x. The mmx version
is also heavily inaccurate, while the vsx version has high accuracy.
2019-04-11 09:08:51 +03:00
Lauri Kasanen
8607e29fa3 swscale/ppc: VSX-optimize yuv2rgb_full_X
./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 \
                -s 1200x720 -f null -vframes 100 -pix_fmt $i -nostats \
                -cpuflags 0 -v error -

32-bit mul, power8 only.

~6.4x speedup:

rgb24
 214278 UNITS in yuv2packedX,   16384 runs,      0 skips
  33249 UNITS in yuv2packedX,   16384 runs,      0 skips
bgr24
 214616 UNITS in yuv2packedX,   16384 runs,      0 skips
  33233 UNITS in yuv2packedX,   16384 runs,      0 skips
rgba
 214517 UNITS in yuv2packedX,   16384 runs,      0 skips
  33271 UNITS in yuv2packedX,   16384 runs,      0 skips
bgra
 214973 UNITS in yuv2packedX,   16384 runs,      0 skips
  33397 UNITS in yuv2packedX,   16384 runs,      0 skips
argb
 214613 UNITS in yuv2packedX,   16384 runs,      0 skips
  33310 UNITS in yuv2packedX,   16384 runs,      0 skips
bgra
 214637 UNITS in yuv2packedX,   16384 runs,      0 skips
  33330 UNITS in yuv2packedX,   16384 runs,      0 skips
2019-04-07 09:20:34 +03:00
Lauri Kasanen
3256e949be swscale/ppc: VSX-optimize yuv2rgb_full_2
./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 -sws_flags area \
            -s 1200x720 -f null -vframes 100 -pix_fmt $i -nostats \
            -cpuflags 0 -v error -

32-bit mul, power8 only.

~4x speedup:

rgb24
  52763 UNITS in yuv2packed2,   16384 runs,      0 skips
  13453 UNITS in yuv2packed2,   16384 runs,      0 skips
bgr24
  53144 UNITS in yuv2packed2,   16384 runs,      0 skips
  13616 UNITS in yuv2packed2,   16384 runs,      0 skips
rgba
  52796 UNITS in yuv2packed2,   16384 runs,      0 skips
  12904 UNITS in yuv2packed2,   16384 runs,      0 skips
bgra
  52732 UNITS in yuv2packed2,   16384 runs,      0 skips
  13262 UNITS in yuv2packed2,   16384 runs,      0 skips
argb
  52661 UNITS in yuv2packed2,   16384 runs,      0 skips
  12879 UNITS in yuv2packed2,   16384 runs,      0 skips
bgra
  52662 UNITS in yuv2packed2,   16384 runs,      0 skips
  12932 UNITS in yuv2packed2,   16384 runs,      0 skips
2019-04-07 09:20:33 +03:00
Lauri Kasanen
50e672bc54 swscale/ppc: VSX-optimize non-full-chroma yuv2rgb_1
./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 -sws_flags fast_bilinear \
        -s 1200x1440 -f null -vframes 100 -pix_fmt $i -nostats \
        -cpuflags 0 -v error -

32-bit mul, power8 only.

1.8-2.3x speedup:

rgb24
  18192 UNITS in yuv2packed1,   32767 runs,      1 skips
   9983 UNITS in yuv2packed1,   32760 runs,      8 skips
bgr24
  18665 UNITS in yuv2packed1,   32766 runs,      2 skips
   9925 UNITS in yuv2packed1,   32763 runs,      5 skips
rgba
  20239 UNITS in yuv2packed1,   32767 runs,      1 skips
   8794 UNITS in yuv2packed1,   32759 runs,      9 skips
bgra
  20354 UNITS in yuv2packed1,   32768 runs,      0 skips
   8770 UNITS in yuv2packed1,   32761 runs,      7 skips
argb
  20185 UNITS in yuv2packed1,   32768 runs,      0 skips
   8761 UNITS in yuv2packed1,   32761 runs,      7 skips
bgra
  20360 UNITS in yuv2packed1,   32766 runs,      2 skips
   8759 UNITS in yuv2packed1,   32764 runs,      4 skips

This is a low speedup, but the x86 mmx version also gets only ~2x. The mmx version
is also heavily inaccurate, while the vsx version has high accuracy.
2019-04-07 09:20:31 +03:00
Lauri Kasanen
7adce3e64c swscale/ppc: VSX-optimize yuv2422_X
./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 \
          -s 1200x720 -f null -vframes 100 -pix_fmt $i -nostats \
          -cpuflags 0 -v error -

7.2x speedup:

yuyv422
 126354 UNITS in yuv2packedX,   16384 runs,      0 skips
  16383 UNITS in yuv2packedX,   16382 runs,      2 skips
yvyu422
 117669 UNITS in yuv2packedX,   16384 runs,      0 skips
  16271 UNITS in yuv2packedX,   16379 runs,      5 skips
uyvy422
 117310 UNITS in yuv2packedX,   16384 runs,      0 skips
  16226 UNITS in yuv2packedX,   16382 runs,      2 skips
2019-03-31 12:41:34 +03:00
Lauri Kasanen
9a2db4dc61 swscale/ppc: VSX-optimize yuv2422_2
./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 -sws_flags area \
                -s 1200x720 -f null -vframes 100 -pix_fmt $i -nostats \
                -cpuflags 0 -v error -

5.1x speedup:

yuyv422
  19339 UNITS in yuv2packed2,   16384 runs,      0 skips
   3718 UNITS in yuv2packed2,   16383 runs,      1 skips
yvyu422
  19438 UNITS in yuv2packed2,   16384 runs,      0 skips
   3800 UNITS in yuv2packed2,   16380 runs,      4 skips
uyvy422
  19128 UNITS in yuv2packed2,   16384 runs,      0 skips
   3721 UNITS in yuv2packed2,   16380 runs,      4 skips
2019-03-31 12:41:33 +03:00
Lauri Kasanen
a6a31ca3d9 swscale/ppc: VSX-optimize yuv2422_1
./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 \
            -s 1200x1440 -f null -vframes 100 -pix_fmt $i -nostats \
            -cpuflags 0 -v error -

15.3x speedup:

yuyv422
  14513 UNITS in yuv2packed1,   32768 runs,      0 skips
    949 UNITS in yuv2packed1,   32767 runs,      1 skips
yvyu422
  14516 UNITS in yuv2packed1,   32767 runs,      1 skips
    943 UNITS in yuv2packed1,   32767 runs,      1 skips
uyvy422
  14530 UNITS in yuv2packed1,   32767 runs,      1 skips
    941 UNITS in yuv2packed1,   32766 runs,      2 skips
2019-03-31 12:41:32 +03:00
Michael Niedermayer
8865ae959b swscale/swscale_unscaled: Fix chroma slice height
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-03-28 22:47:32 +01:00
Dong, Jerry
c47fada298 swscale/swscale_unscaled: fixed the issue that when width/height is not 2-multiple, transition of nv12 to u/v planes is not completed.
Signed-off-by: Dong, Jerry <jerry.dong@intel.com>
Signed-off-by: Decai Lin <decai.lin@intel.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-03-28 20:28:43 +01:00
Lauri Kasanen
681957b88d swscale/ppc: VSX-optimize yuv2rgb_full
./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 \
        -s 1200x1440 -f null -vframes 100 -pix_fmt $i -nostats \
        -cpuflags 0 -v error -

This uses 32-bit mul, so POWER8 only.

The following output formats get about 4.5x speedup:

rgb24
  39980 UNITS in yuv2packed1,   32768 runs,      0 skips
   8774 UNITS in yuv2packed1,   32768 runs,      0 skips
bgr24
  40069 UNITS in yuv2packed1,   32768 runs,      0 skips
   8772 UNITS in yuv2packed1,   32766 runs,      2 skips
rgba
  39759 UNITS in yuv2packed1,   32768 runs,      0 skips
   8681 UNITS in yuv2packed1,   32767 runs,      1 skips
bgra
  39729 UNITS in yuv2packed1,   32768 runs,      0 skips
   8696 UNITS in yuv2packed1,   32766 runs,      2 skips
argb
  39766 UNITS in yuv2packed1,   32768 runs,      0 skips
   8672 UNITS in yuv2packed1,   32766 runs,      2 skips
bgra
  39784 UNITS in yuv2packed1,   32768 runs,      0 skips
   8659 UNITS in yuv2packed1,   32767 runs,      1 skips
2019-03-27 09:05:08 +02:00
Lauri Kasanen
81a4719d8e swscale: Remove duplicated code
In this function, the exact same clamping happens both in the if and unconditionally.
2019-03-27 09:00:06 +02:00