730 Commits

Author SHA1 Message Date
Andreas Rheinhardt
fc8c6d4665 swscale/swscale: Remove ineffective check
If any of the dstStrides is not aligned mod 16, the warning
above this one will be triggered, setting stride_unaligned_warned,
so that the following check for stride_unaligned_warned will
always be false.

Reviewed-by: Niklas Haas <ffmpeg@haasn.dev>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-14 15:22:49 +02:00
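The reasoning in fc8c6d4665 can be illustrated with a minimal C sketch. Only the flag name stride_unaligned_warned comes from the commit message; the struct, the function and the two counters are hypothetical and exist purely to make the control flow observable. This is not the swscale code itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant bit of scaler state. */
typedef struct WarnState {
    int stride_unaligned_warned; /* set once the first warning fires */
    int first_warnings;          /* how often the first check fired */
    int second_hits;             /* how often the removed check would fire */
} WarnState;

static void check_strides(WarnState *s, const ptrdiff_t *dst_stride, int n_planes)
{
    assert(n_planes >= 0 && n_planes <= 4);

    int unaligned = 0;
    for (int i = 0; i < n_planes; i++)
        unaligned |= (dst_stride[i] % 16) != 0;

    /* First check: warn once and remember that we did. */
    if (unaligned && !s->stride_unaligned_warned) {
        s->stride_unaligned_warned = 1;
        s->first_warnings++;
    }

    /* Second check: needs an unaligned stride AND a clear flag, but the
     * first check has just set the flag in exactly that situation. */
    if (unaligned && !s->stride_unaligned_warned)
        s->second_hits++;
}
```

However often check_strides() is called, second_hits stays at zero, which is why the second check could be dropped.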
Niklas Haas
0e983a0604 swscale: align allocated frame buffers to SwsPass hints
This avoids hitting the slow memcpy fallback paths altogether, whenever
swscale.c is handling plane allocation.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-10 15:12:18 +02:00
Niklas Haas
2deca0ec19 swscale: clean up allocated frames on error
Matches the semantics of sws_frame_begin(), which also cleans up any
allocated buffers on error.

This is an issue introduced by the commit that allowed ff_sws_graph_run()
to fail in the first place.

Fixes: 563cc8216b
2026-04-10 15:12:18 +02:00
Niklas Haas
6c89a30ecd swscale: add FFFramePool and use it for allocating planes
The major consequence of this is that we start allocating buffers per plane,
instead of allocating one contiguous buffer. This makes the no-op/refcopy
case slightly slower, but doesn't meaningfully affect the rest:

yuva444p -> yuva444p, time=157/1000 us (ref=78/1000 us), speedup=0.497x slower
Overall speedup=1.016x faster, min=0.983x max=1.092x

However, this is a necessary consequence of the desire to allow partial plane
allocations / single plane refcopies. This slowdown also does not affect
vf_scale, which already uses avfilter/framepool.c (via ff_get_video_buffer).

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-10 15:12:18 +02:00
Niklas Haas
563cc8216b swscale/graph: allow setup() to return an error code
Useful for a handful of reasons, including Vulkan (which depends on external
device resources), but also a change I want to make to the tail handling.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-12 21:02:48 +00:00
Niklas Haas
9b7439c31b swscale: don't pointlessly loop over NULL buffers
This array is defined as contiguous.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 11:25:58 +01:00
Niklas Haas
dd75b6b57c swscale: add sanity clear on AVFrame *dst
Before allocating/referencing buffers, make sure these fields are in a
defined state.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 11:25:58 +01:00
Niklas Haas
76c60b192d swscale: restructure sws_scale_frame() slightly
Results in IMHO slightly more readable code flow, and will be useful in an
upcoming commit (that adds logic to ref individual planes).

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 11:25:58 +01:00
Niklas Haas
47f89ea88b swscale: explicitly track if a context is "legacy" or not
The legacy API is defined by sws_init_context(), sws_scale() etc., whereas
the "modern" API is defined by just using sws_scale_frame() without prior
init call.

This int allows us to cleanly distinguish the type of context, paving the
way for some minor refactoring.

As an immediate benefit, we now gain a bunch of explicit error checks to
ensure the API is used correctly (i.e. sws_scale() not called before
sws_init_context()).

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-06 19:06:33 +01:00
Niklas Haas
5b39be1f0a swscale: fix build on --disable-unstable
By excluding the Vulkan makefile entirely when --disable-unstable is passed.
This also correctly avoids compiling e.g. unused GLSL compilers.

Fixes: #22295
See-Also: #22366

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-04 11:53:10 +01:00
Niklas Haas
4d7b1c3685 swscale/graph: move frame->field init logic to SwsGraph
And have ff_sws_graph_run() just take a bare AVFrame. This will help with
an upcoming change, aside from being a bit friendlier towards API users
in general.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-02-27 16:18:34 +00:00
Lynne
362414afba swscale: add support for processing hardware frames
Sponsored-by: Sovereign Tech Fund
2026-02-26 14:10:22 +01:00
Lynne
c911295f09 swscale: forward original frame pointers to ops.c backend
Sponsored-by: Sovereign Tech Fund
2026-02-26 14:10:21 +01:00
Niklas Haas
afdb683a3f swscale: avoid UB on interlaced frames
NULL+0 is UB.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-02-23 19:39:17 +00:00
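As a rough illustration of the issue named in afdb683a3f: in C, pointer arithmetic on a null pointer is undefined even when the offset is zero, so absent planes have to be skipped rather than offset. The helper names below are hypothetical and the field-selection logic is an assumption about what interlaced handling might look like, not the actual swscale code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: advance a plane pointer by `offset` bytes, leaving
 * absent (NULL) planes untouched so no arithmetic is ever done on NULL. */
static uint8_t *offset_plane(uint8_t *plane, ptrdiff_t offset)
{
    return plane ? plane + offset : NULL;
}

/* Select the second field of an interlaced image by skipping one line
 * in every plane that is actually present. */
static void select_bottom_field(uint8_t *data[4], const ptrdiff_t linesize[4])
{
    assert(data && linesize);
    for (int i = 0; i < 4; i++)
        data[i] = offset_plane(data[i], linesize[i]);
}
```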
Niklas Haas
18060a8820 swscale/graph: simplify ff_sws_graph_run() API
There's little reason not to directly take an SwsImg here; it's already an
internally visible struct.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-02-23 19:39:17 +00:00
rcombs
d1962881ae swscale: use configured YUV matrix with paletted-RGB inputs
This replaces hardcoded BT601.
2025-12-16 01:24:40 +00:00
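To see why the choice of matrix matters for d1962881ae, the following sketch computes the luma of a single palette entry under two sets of weights. The BT.601 and BT.709 luma coefficients are the standard published values; the struct and function names are hypothetical, and the code only illustrates the arithmetic, not how swscale applies its configured matrix.

```c
#include <assert.h>

/* Luma weights (Kr, Kb) for two common matrices; Kg = 1 - Kr - Kb. */
typedef struct LumaCoeffs { double kr, kb; } LumaCoeffs;

static const LumaCoeffs BT601 = { 0.299,  0.114  };
static const LumaCoeffs BT709 = { 0.2126, 0.0722 };

/* Full-range 8-bit luma of one palette entry, rounded to nearest. */
static int palette_luma(LumaCoeffs c, int r, int g, int b)
{
    assert(r >= 0 && r <= 255 && g >= 0 && g <= 255 && b >= 0 && b <= 255);
    double kg = 1.0 - c.kr - c.kb;
    return (int)(c.kr * r + kg * g + c.kb * b + 0.5);
}
```

A pure green entry (0, 255, 0) yields luma 150 with the BT.601 weights but 182 with the BT.709 weights, so a hardcoded matrix produces visibly different output whenever the configured one differs.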
Arpad Panyik
1f30ff30fb swscale: Add AArch64 Neon path for xyz12Torgb48 LE
Add optimized Neon code path for the little endian case of the
xyz12Torgb48 function. The innermost loop processes the data in 4x2
pixel blocks using software gathers with the matrix multiplication
and clipping done by Neon.

Relative runtime of micro benchmarks after this patch on some
Cortex and Neoverse CPU cores:

 xyz12le_rgb48le    X1      X3      X4    X925      V2
 16x4_neon:       2.55x   4.34x   3.84x   3.31x   3.22x
 32x4_neon:       2.39x   3.63x   3.22x   3.35x   3.29x
 64x4_neon:       2.37x   3.31x   2.91x   3.33x   3.27x
 128x4_neon:      2.34x   3.28x   2.91x   3.35x   3.24x
 256x4_neon:      2.30x   3.17x   2.91x   3.32x   3.10x
 512x4_neon:      2.26x   3.10x   2.91x   3.30x   3.07x
 1024x4_neon:     2.26x   3.07x   2.96x   3.30x   3.05x
 1920x4_neon:     2.26x   3.06x   2.93x   3.28x   3.04x

 xyz12le_rgb48le   A76     A78    A715    A720    A725
 16x4_neon:       2.33x   2.28x   2.53x   3.33x   3.19x
 32x4_neon:       2.35x   2.18x   2.45x   3.23x   3.24x
 64x4_neon:       2.35x   2.16x   2.42x   3.15x   3.21x
 128x4_neon:      2.35x   2.13x   2.39x   3.00x   3.09x
 256x4_neon:      2.36x   2.12x   2.35x   2.85x   2.99x
 512x4_neon:      2.35x   2.14x   2.35x   2.78x   2.95x
 1024x4_neon:     2.31x   2.09x   2.33x   2.80x   2.91x
 1920x4_neon:     2.30x   2.07x   2.32x   2.81x   2.94x

 xyz12le_rgb48le   A55    A510    A520
 16x4_neon:       2.09x   1.92x   2.36x
 32x4_neon:       2.05x   1.89x   2.38x
 64x4_neon:       2.02x   1.77x   2.35x
 128x4_neon:      1.96x   1.74x   2.25x
 256x4_neon:      1.90x   1.72x   2.19x
 512x4_neon:      1.83x   1.75x   2.16x
 1024x4_neon:     1.83x   1.62x   2.15x
 1920x4_neon:     1.82x   1.60x   2.15x

Signed-off-by: Arpad Panyik <Arpad.Panyik@arm.com>
2025-12-05 10:28:18 +00:00
Arpad Panyik
ef651b84ce swscale: Refactor XYZ+RGB state and add function hooks
Prepare for xyz12Torgb48 architecture-specific optimizations in
subsequent patches by:
 - Grouping XYZ+RGB gamma LUTs and 3x3 matrices into SwsColorXform
   (ctx->xyz2rgb and ctx->rgb2xyz), replacing scattered fields.
 - Dropping the unused last matrix column, giving the same or smaller
   SwsInternal size.
 - Renaming ff_xyz12Torgb48 and ff_rgb48Toxyz12 and routing calls via
   the new per-context function pointer (ctx->xyz12Torgb48 and
   ctx->rgb48Toxyz12) in graph.c and swscale.c.
 - Adding ff_sws_init_xyzdsp and invoking it in swscale init paths
   (normal and unscaled).
 - Making fill_xyztables public to ease its setup later in checkasm.

These modifications do not introduce any functional changes.

Signed-off-by: Arpad Panyik <Arpad.Panyik@arm.com>
2025-12-05 10:28:18 +00:00
Ramiro Polla
4bee010844 swscale/range_convert: fix truncation bias in range conversion
384fe39623 introduced a regression in the
range conversion offset calculation, resulting in a slight green tint
in full-range RGB to YUV conversions of grayscale values.

The offset being calculated was not taking into consideration a bias
needed for correctly rounding the result from the multiplication stage,
leading to a truncated value.

Fixes issue #11646.
2025-11-06 20:36:08 +00:00
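The kind of truncation bias described in 4bee010844 can be shown with a small fixed-point sketch. It assumes a plain 8-bit full-range to limited-range mapping (16 + x * 219 / 255) with 14 fractional bits; the function names are hypothetical and the real code operates on different intermediates, so this only demonstrates the principle of folding a rounding term into the offset.

```c
#include <assert.h>
#include <stdint.h>

#define FRAC_BITS 14

/* Scale factor 219/255 in fixed point, rounded to nearest. */
static int32_t range_coeff(void)
{
    return (219 * (1 << FRAC_BITS) + 127) / 255;
}

/* Offset without a rounding bias: the final shift truncates. */
static int to_limited_truncating(int x)
{
    assert(x >= 0 && x <= 255);
    int32_t offset = 16 * (1 << FRAC_BITS);
    return (x * range_coeff() + offset) >> FRAC_BITS;
}

/* Offset with half an output step folded in: the final shift rounds. */
static int to_limited_rounding(int x)
{
    assert(x >= 0 && x <= 255);
    int32_t offset = 16 * (1 << FRAC_BITS) + (1 << (FRAC_BITS - 1));
    return (x * range_coeff() + offset) >> FRAC_BITS;
}
```

For x = 128 the exact result is about 125.93; the truncating variant returns 125 while the rounding variant returns 126. A systematic one-step error of this kind on neutral values is the sort of offset that can show up as a colour tint.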
Michael Niedermayer
d16a058dbc swscale/swscale: Do not crash on floats
Fixes: shift exponent 32 is too large for 32-bit type 'unsigned int'
Fixes: division by zero
Fixes: 391981061/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-6691017763389440
Fixes: 392929028/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-5142088307507200

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2025-04-10 03:01:32 +02:00
Niklas Haas
8ab40ca984 swscale: fix gray -> grayf32 SIGFPE
swscale internals don't distinguish between 16-bit and higher bit depth
output formats when it comes to the choice of intermediate
representation.

Clamping this value both prevents a SIGFPE and also aligns the check
with reality.
2025-03-17 11:40:05 +01:00
James Almer
e20ee9f9ae swscale/swscale: don't reject scaling when color parameters are not supported but conversion is not required
Values in csp, prim, trc, etc, are irrelevant if there's no conversion needed.

Reviewed-by: Niklas Haas <ffmpeg@haasn.xyz>
Signed-off-by: James Almer <jamrial@gmail.com>
2025-01-22 12:15:18 -03:00
James Almer
abdc20727c swscale/swscale: combine the input/output checks in sws_frame_setup()
Cosmetic change in preparation for the next commit.

Signed-off-by: James Almer <jamrial@gmail.com>
2025-01-22 12:14:57 -03:00
Andreas Rheinhardt
4973bb661e swscale/cms,graph,lut3d: Use ff_-prefix, don't export internal functions
Symbols with the sws_* prefix are exported.

Reviewed-by: Alexander Strasser <eclipse7@gmx.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-01-12 15:41:39 +01:00
Niklas Haas
af6d52eec6 swscale: use 16-bit intermediate precision for RGB/XYZ conversion
The current logic uses 12-bit linear light math, which is woefully insufficient
and leads to nasty posterization artifacts. This patch simply switches the
internal logic to 16-bit precision.

This raises the memory requirement of these tables from 32 kB to 272 kB.

All relevant FATE tests updated for improved accuracy.

Fixes: #4829
Signed-off-by: Niklas Haas <git@haasn.dev>
Sponsored-by: Sovereign Tech Fund
2024-12-26 20:31:36 +01:00
Niklas Haas
8cf2d97280 swscale/unscaled: add pal8 -> gbr(a)p special converter
Fixes: #9520
Signed-off-by: Niklas Haas <git@haasn.dev>
Sponsored-by: Sovereign Tech Fund
2024-12-26 19:29:18 +01:00
Niklas Haas
253b8977c0 swscale: remove primaries/trc change warning
This is now supported when using the new API.
2024-12-23 12:33:43 +01:00
Niklas Haas
a8d01dff9a swscale/utils: add HDR metadata to SwsFormat
Only add the condensed values that we actually care about. Group them into
a new struct to make it easier to discard or replace this metadata.

Define a special comparison function that does not choke on undefined/unknown
metadata.
2024-12-23 12:33:43 +01:00
Ramiro Polla
384fe39623 swscale/range_convert: fix mpeg ranges in yuv range conversion for non-8-bit pixel formats
There is an issue with the constants used in YUV to YUV range conversion,
where the upper bound is not respected when converting to mpeg range.

With this commit, the constants are calculated at runtime, depending on
the bit depth. This approach also allows us to more easily understand how
the constants are derived.

For bit depths <= 14, the number of fixed point bits has been set to 14
for all conversions, to simplify the code.
For bit depths > 14, the number of fixed point bits has been raised and
set to 18, to allow for the conversion to be accurate enough for the mpeg
range to be respected.

The convert functions now take the conversion constants (coeff and offset)
as function arguments.
For bit depths <= 14, coeff is unsigned 16-bit and offset is 32-bit.
For bit depths > 14, coeff is unsigned 32-bit and offset is 64-bit.

x86_64:
chrRangeFromJpeg8_1920_c:    2127.4   2125.0  (1.00x)
chrRangeFromJpeg16_1920_c:   2325.2   2127.2  (1.09x)
chrRangeToJpeg8_1920_c:      3166.9   3168.7  (1.00x)
chrRangeToJpeg16_1920_c:     2152.4   3164.8  (0.68x)
lumRangeFromJpeg8_1920_c:    1263.0   1302.5  (0.97x)
lumRangeFromJpeg16_1920_c:   1080.5   1299.2  (0.83x)
lumRangeToJpeg8_1920_c:      1886.8   2112.2  (0.89x)
lumRangeToJpeg16_1920_c:     1077.0   1906.5  (0.56x)

aarch64 A55:
chrRangeFromJpeg8_1920_c:   28835.2  28835.6  (1.00x)
chrRangeFromJpeg16_1920_c:  28839.8  32680.8  (0.88x)
chrRangeToJpeg8_1920_c:     23074.7  23075.4  (1.00x)
chrRangeToJpeg16_1920_c:    17318.9  24996.0  (0.69x)
lumRangeFromJpeg8_1920_c:   15389.7  15384.5  (1.00x)
lumRangeFromJpeg16_1920_c:  15388.2  17306.7  (0.89x)
lumRangeToJpeg8_1920_c:     19227.8  19226.6  (1.00x)
lumRangeToJpeg16_1920_c:    15387.0  21146.3  (0.73x)

aarch64 A76:
chrRangeFromJpeg8_1920_c:    6324.4   6268.1  (1.01x)
chrRangeFromJpeg16_1920_c:   6339.9  11521.5  (0.55x)
chrRangeToJpeg8_1920_c:      9656.0   9612.8  (1.00x)
chrRangeToJpeg16_1920_c:     6340.4  11651.8  (0.54x)
lumRangeFromJpeg8_1920_c:    4422.0   4420.8  (1.00x)
lumRangeFromJpeg16_1920_c:   4420.9   5762.0  (0.77x)
lumRangeToJpeg8_1920_c:      5949.1   5977.5  (1.00x)
lumRangeToJpeg16_1920_c:     4446.8   5946.2  (0.75x)

NOTE: all simd optimizations for range_convert have been disabled.
      they will be re-enabled when they are fixed for each architecture.

NOTE2: the same issue still exists in rgb2yuv conversions, which is not
       addressed in this commit.
2024-12-05 21:10:29 +01:00
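A minimal sketch of deriving the constants at runtime, as 384fe39623 describes, might look as follows. It assumes luma conversion directly on pixel values from full range [0, 2^depth - 1] to mpeg range [16, 235] scaled to the bit depth; the 14 and 18 fractional bits follow the commit message, while the struct, the function names and the use of 64-bit integers throughout are simplifying assumptions rather than the actual implementation.

```c
#include <assert.h>
#include <stdint.h>

typedef struct RangeConsts {
    int     frac_bits; /* number of fixed-point fraction bits */
    int64_t coeff;
    int64_t offset;
} RangeConsts;

/* Derive full-range -> mpeg-range luma constants for a given bit depth. */
static RangeConsts luma_to_mpeg_consts(int bit_depth)
{
    assert(bit_depth >= 8 && bit_depth <= 16);
    RangeConsts c;
    int64_t in_max  = ((int64_t)1 << bit_depth) - 1;   /* full-range peak  */
    int64_t out_min = (int64_t)16  << (bit_depth - 8); /* mpeg black level */
    int64_t out_max = (int64_t)235 << (bit_depth - 8); /* mpeg white level */

    c.frac_bits = bit_depth <= 14 ? 14 : 18;
    /* round((out_max - out_min) / in_max * 2^frac_bits) */
    c.coeff  = (((out_max - out_min) << c.frac_bits) + in_max / 2) / in_max;
    /* black level plus half a step, so that the final shift rounds */
    c.offset = (out_min << c.frac_bits) + ((int64_t)1 << (c.frac_bits - 1));
    return c;
}

/* Apply the constants to one non-negative sample. */
static int64_t luma_to_mpeg(RangeConsts c, int64_t x)
{
    return (x * c.coeff + c.offset) >> c.frac_bits;
}
```

With these constants the full-range peak maps exactly onto the mpeg white level (for example 1023 to 940 at 10 bits and 65535 to 60160 at 16 bits), which is the upper-bound property the commit message is concerned with.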
Ramiro Polla
2d1358a84d swscale/range_convert: saturate output instead of limiting input
For bit depths <= 14, the result is saturated to 15 bits.
For bit depths > 14, the result is saturated to 19 bits.

x86_64:
chrRangeFromJpeg8_1920_c:    2126.5   2127.4  (1.00x)
chrRangeFromJpeg16_1920_c:   2331.4   2325.2  (1.00x)
chrRangeToJpeg8_1920_c:      3163.0   3166.9  (1.00x)
chrRangeToJpeg16_1920_c:     3163.7   2152.4  (1.47x)
lumRangeFromJpeg8_1920_c:    1262.2   1263.0  (1.00x)
lumRangeFromJpeg16_1920_c:   1079.5   1080.5  (1.00x)
lumRangeToJpeg8_1920_c:      1860.5   1886.8  (0.99x)
lumRangeToJpeg16_1920_c:     1910.2   1077.0  (1.77x)

aarch64 A55:
chrRangeFromJpeg8_1920_c:   28836.2  28835.2  (1.00x)
chrRangeFromJpeg16_1920_c:  28840.1  28839.8  (1.00x)
chrRangeToJpeg8_1920_c:     44196.2  23074.7  (1.92x)
chrRangeToJpeg16_1920_c:    36527.3  17318.9  (2.11x)
lumRangeFromJpeg8_1920_c:   15388.5  15389.7  (1.00x)
lumRangeFromJpeg16_1920_c:  15389.3  15388.2  (1.00x)
lumRangeToJpeg8_1920_c:     23069.7  19227.8  (1.20x)
lumRangeToJpeg16_1920_c:    19227.8  15387.0  (1.25x)

aarch64 A76:
chrRangeFromJpeg8_1920_c:    6334.7   6324.4  (1.00x)
chrRangeFromJpeg16_1920_c:   6336.0   6339.9  (1.00x)
chrRangeToJpeg8_1920_c:     11474.5   9656.0  (1.19x)
chrRangeToJpeg16_1920_c:     9640.5   6340.4  (1.52x)
lumRangeFromJpeg8_1920_c:    4453.2   4422.0  (1.01x)
lumRangeFromJpeg16_1920_c:   4414.2   4420.9  (1.00x)
lumRangeToJpeg8_1920_c:      6645.0   5949.1  (1.12x)
lumRangeToJpeg16_1920_c:     6005.2   4446.8  (1.35x)

NOTE: all simd optimizations for range_convert have been disabled
      except for x86, which already had the same behaviour.
      they will be re-enabled when they are fixed for each architecture.
2024-12-05 21:10:29 +01:00
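The difference between limiting the input and saturating the output, as in 2d1358a84d, can be sketched for a 15-bit intermediate. The constants below are illustrative assumptions chosen so that the arithmetic stays within a 32-bit int; the function names are hypothetical and the real conversion functions take their constants as arguments.

```c
#include <assert.h>

/* Illustrative constants only; assumed for this sketch. */
#define COEFF      19077
#define OFFSET     39057361
#define FRAC_BITS  14
#define IN_LIMIT   30189            /* largest input whose result fits 15 bits */
#define OUT_MAX    ((1 << 15) - 1)

/* Variant 1: clamp the input so the result cannot exceed 15 bits. */
static int expand_limit_input(int x)
{
    assert(x >= 2048 && x <= OUT_MAX);
    int v = x < IN_LIMIT ? x : IN_LIMIT;
    return (v * COEFF - OFFSET) >> FRAC_BITS;
}

/* Variant 2: compute freely, then saturate the output to 15 bits. */
static int expand_saturate_output(int x)
{
    assert(x >= 2048 && x <= OUT_MAX);
    int v = (x * COEFF - OFFSET) >> FRAC_BITS;
    return v > OUT_MAX ? OUT_MAX : v;
}
```

For these fixed constants both variants agree on every input in the asserted range. The practical difference is where the clamp sits: saturating the output does not need an input limit that is tied to one particular pair of constants.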
Niklas Haas
2a091d4f2e swscale: introduce new, dynamic scaling API
As part of a larger, ongoing effort to modernize and partially rewrite
libswscale, it was decided and generally agreed upon to introduce a new
public API for libswscale. This API is designed to be less stateful, more
explicitly defined, and considerably easier to use than the existing one.

Most of the API work has already been accomplished in the previous commits;
this commit merely introduces the ability to use sws_scale_frame()
dynamically, without prior sws_init_context() calls. Instead, the new API
takes frame properties from the frames themselves, and the implementation is
based on the new SwsGraph API, which we simply reinitialize as needed.

This high-level wrapper also recreates the logic that used to live inside
vf_scale for scaling interlaced frames, enabling it to be reused more easily
by end users.

Finally, this function is designed to simply copy refs directly when nothing
needs to be done, substantially improving throughput of the noop fast path.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-11-25 11:03:50 +01:00
Niklas Haas
6a91a165fd swscale: eliminate redundant SwsInternal accesses
This is a purely cosmetic commit aimed at replacing accesses to
SwsInternal.opts by direct access to SwsContext wherever convenient.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-11-25 10:59:52 +01:00
Niklas Haas
2d077f9acd swscale/internal: group user-facing options together
This is a preliminary step to separating these into a new struct. This
commit contains no functional changes; it is a pure search-and-replace.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-11-21 12:49:56 +01:00
James Almer
5f5421ec66 swscale/swscale: prevent integer overflow in chrRangeToJpeg16_c
Same as it's done in lumRangeToJpeg16_c(). Plenty of allowed input values can
overflow here.

Fixes: src/libswscale/swscale.c:198:47: runtime error: signed integer overflow: 475328 * 4663 cannot be represented in type 'int'
Signed-off-by: James Almer <jamrial@gmail.com>
2024-11-02 15:01:31 -03:00
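The overflow reported for 5f5421ec66 is easy to reproduce on paper: 475328 * 4663 = 2216454464, which exceeds INT_MAX (2147483647) when int is 32 bits wide. The sketch below checks the product in 64-bit arithmetic and shows one generic way to keep a multiply in range by clamping its input first. Both function names are hypothetical, the sketch assumes a 32-bit int, and the clamp limit used by the actual fix is not reproduced here.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Returns 1 if a * b is representable in an int, checked in 64-bit math. */
static int product_fits_int(int a, int b)
{
    int64_t wide = (int64_t)a * b;
    return wide >= INT_MIN && wide <= INT_MAX;
}

/* Hypothetical guard: clamp the input so that the multiply stays in range. */
static int clamped_mul(int x, int coeff)
{
    assert(coeff > 0);
    int limit = INT_MAX / coeff; /* largest x with x * coeff <= INT_MAX */
    if (x >  limit) x =  limit;
    if (x < -limit) x = -limit;
    return x * coeff;
}
```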
Ramiro Polla
8b30daedf7 swscale/range_convert: indent after previous commit
2024-10-27 13:20:56 +01:00
Ramiro Polla
f7ee0195df swscale/range_convert: drop redundant conditionals from arch-specific init functions
These conditions are already checked for in the main init function.
2024-10-27 13:20:56 +01:00
Ramiro Polla
7728b3357d swscale/range_convert: call arch-specific init functions from main init function
This commit also fixes the issue that the call to ff_sws_init_range_convert()
from sws_init_swscale() was not setting up the arch-specific optimizations.
2024-10-27 13:20:56 +01:00
Niklas Haas
67adb30322 swscale: rename SwsContext to SwsInternal
And preserve the public SwsContext as separate name. The motivation here
is that I want to turn SwsContext into a public struct, while keeping the
internal implementation hidden. Additionally, I also want to be able to
use multiple internal implementations, e.g. for GPU devices.

This commit does not include any functional changes. For the most part, it is
a simple rename. The only complications arise from the public facing API
functions, which preserve their current type (and hence require an additional
unwrapping step internally), and the checkasm test framework, which directly
accesses SwsInternal.

For consistency, the affected functions that need to maintain a distinction
have generally been changed to refer to the SwsContext as *sws, and the
SwsInternal as *c.

In an upcoming commit, I will provide a backing definition for the public
SwsContext, and update `sws_internal()` to dereference the internal struct
instead of merely casting it.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-10-24 22:50:00 +02:00
Niklas Haas
f1f54d2f82 swscale/x86: use dedicated int for self-modifying MMX dstW
I want to pull options out of SwsInternal, so we need to make this field
a dedicated int that gets updated as appropriate in ff_swscale().

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-10-23 23:12:23 +02:00
Niklas Haas
ec9985b54f swscale/internal: constify and expose ff_swscale()
Used as an intermediate entry point for the new swscale context. The extra
constification is a consistency measure, as I want to move the memcpy of
stride and plane pointers to the functions that actually need to mutate them.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-10-09 13:18:08 +02:00
Niklas Haas
403a20b2e6 swscale/rgb2xyz: expose these functions internally
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-10-09 13:17:17 +02:00
Niklas Haas
775de8c19d swscale/rgb2xyz: follow convention on image pointers and strides
Instead of taking an int16_t pointer and a stride in halfwords, follow the
usual convention of treating all planes and strides as byte-addressed.

This does not have any immediate effect but makes these functions more
reusable without unintended "gotchas".

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-10-09 13:14:57 +02:00
Niklas Haas
9d8f5141cf swscale/rgb2xyz: add explicit width parameter
This fixes an 11-year-old bug in the rgb2xyz functions, when used with a
negative stride. The current loop bounds turned it into a no-op.

Additionally, this increases performance on highly cropped images, whose
stride may be substantially higher than the effective width.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-10-09 13:14:57 +02:00
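To illustrate why an explicit width matters in 9d8f5141cf, here is a sketch of a row loop over byte-addressed planes of 16-bit samples. The function names and the inversion operation are hypothetical, and the second function is only an assumed reconstruction of how a stride-derived loop bound can degenerate; it is not the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Process a plane of 16-bit samples. `plane` and `stride` are byte-addressed;
 * `stride` may be negative for bottom-up images. Returns samples touched. */
static int invert_plane16(uint8_t *plane, ptrdiff_t stride, int w, int h)
{
    assert(w >= 0 && h >= 0 && stride % 2 == 0);
    int touched = 0;
    for (int y = 0; y < h; y++) {
        uint16_t *row = (uint16_t *)(plane + y * stride);
        /* The loop bound is the explicit width, never derived from stride. */
        for (int x = 0; x < w; x++) {
            row[x] = (uint16_t)~row[x];
            touched++;
        }
    }
    return touched;
}

/* Hypothetical buggy variant: derives the per-row count from the stride. */
static int invert_plane16_stride_bound(uint8_t *plane, ptrdiff_t stride, int h)
{
    int touched = 0;
    for (int y = 0; y < h; y++) {
        uint16_t *row = (uint16_t *)(plane + y * stride);
        for (ptrdiff_t x = 0; x < stride / 2; x++) { /* never true if stride < 0 */
            row[x] = (uint16_t)~row[x];
            touched++;
        }
    }
    return touched;
}
```

With a negative stride the stride-bound variant touches no samples at all, while the explicit-width variant processes exactly w * h samples. The explicit width also avoids processing padding when the stride is much larger than the visible width, as with heavily cropped images.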
Niklas Haas
ea228fc415 swscale/rgb2xyz: minor style fixes
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-10-09 13:14:57 +02:00
Niklas Haas
73b3344edd swscale/input: parametrize ff_sws_init_input_funcs() pointers
Following the precedent set by ff_sws_init_output_funcs().

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-10-07 19:51:34 +02:00
Niklas Haas
286bdc9cdc swscale/internal: turn cascaded_tmp into an array
Slightly more convenient to access from the new wrapping code.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-10-07 19:51:34 +02:00
Niklas Haas
61369484f6 swscale/internal: expose ff_update_palette() internally
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-10-07 19:51:34 +02:00
Michael Niedermayer
44c5641ae8 swscale/swscale: Use unsigned operation to avoid undefined behavior
I have not checked that the constant is correct; this just fixes the undefined behavior.

Fixes: signed integer overflow: -646656 * 3517 cannot be represented in type 'int'
Fixes: 70559/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-5209368631508992

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-09-19 00:10:38 +02:00
Michael Niedermayer
66b60bae68 swscale/swscale: Use ptrdiff_t for linesize computations
This is unlikely to make a difference

Fixes: CID1591896 Unintentional integer overflow
Fixes: CID1591901 Unintentional integer overflow

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-07 23:36:30 +02:00
Rémi Denis-Courmont
79dfdac4db sws/input: R-V V rgb24ToY & bgr24ToY
T-Head C908:
rgb24_to_y_8_c:            2.0
rgb24_to_y_8_rvv_i32:      2.7
rgb24_to_y_128_c:         26.2
rgb24_to_y_128_rvv_i32:    9.2
rgb24_to_y_1080_c:       219.5
rgb24_to_y_1080_rvv_i32:  76.2
rgb24_to_y_1280_c:       276.2
rgb24_to_y_1280_rvv_i32:  89.7
rgb24_to_y_1920_c:       389.7
rgb24_to_y_1920_rvv_i32: 134.2

SpacemiT X60:
rgb24_to_y_8_c:            1.7
rgb24_to_y_8_rvv_i32:      2.2
rgb24_to_y_128_c:         23.2
rgb24_to_y_128_rvv_i32:    4.2
rgb24_to_y_1080_c:       195.0
rgb24_to_y_1080_rvv_i32:  33.7
rgb24_to_y_1280_c:       231.0
rgb24_to_y_1280_rvv_i32:  40.0
rgb24_to_y_1920_c:       346.2
rgb24_to_y_1920_rvv_i32:  59.7
2024-06-08 18:30:43 +03:00