37 Commits

Author SHA1 Message Date
Niklas Haas
7a71a01a1b swscale/ops: nuke SwsComps.unused
Finally, remove the last relic of this accursed design mistake.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:25:17 +02:00
Niklas Haas
af2674645f swscale/ops: drop offset from SWS_MASK_ALPHA
This is far more commonly used without an offset than with, so having it there
prevents these special cases from actually doing much good.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:24:55 +02:00
Niklas Haas
cf2d40f65d swscale/ops: add explicit clear mask to SwsClearOp
Instead of implicitly testing for NaN values. This is mostly a straightforward
translation, but we need some slight extra boilerplate to ensure the mask
is correctly updated when e.g. commuting past a swizzle.
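
As a rough illustration of the mask bookkeeping mentioned above, here is a
minimal sketch of moving a clear from before a swizzle to after it. The
`Swizzle` and `ClearOp` structs and the function name are hypothetical
stand-ins, not the actual libswscale definitions:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; NOT the real SwsSwizzleOp / SwsClearOp. */
typedef struct Swizzle { uint8_t idx[4]; } Swizzle; /* out[i] = in[idx[i]] */
typedef struct ClearOp {
    uint8_t mask;     /* bit i set => component i is overwritten */
    int     value[4]; /* only meaningful where the mask bit is set */
} ClearOp;

/* Given [clear, swizzle], return the clear for the order [swizzle, clear]. */
static ClearOp clear_after_swizzle(ClearOp before, Swizzle sw)
{
    ClearOp after = { 0 };
    for (int i = 0; i < 4; i++) {
        int src = sw.idx[i];
        assert(src < 4);
        if (before.mask & (1u << src)) {
            /* output i used to read a cleared input, so clear it here */
            after.mask    |= 1u << i;
            after.value[i] = before.value[src];
        }
    }
    return after;
}
```

Both the mask bits and the per-component values have to follow the
permutation; an implicit NaN test would get this for free, which is
presumably where the extra boilerplate comes from.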

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:23:36 +02:00
Niklas Haas
4020607f0a swscale/ops: add SwsCompMask and related helpers
This new type will be used over the following commits to simplify the
codebase.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:23:36 +02:00
Niklas Haas
85bef2c2bc swscale/ops: split SwsConst up into op-specific structs
It was a bit clunky, lacked semantic contextual information, and made it
harder to reason about the effects of extending this struct. There should be
zero runtime overhead, since this is already a big union.

I made the changes in this commit by hand, but due to the length and noise
level of the commit, I used Opus 4.6 to verify that I did not accidentally
introduce any bugs or typos.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-02 11:48:15 +00:00
Niklas Haas
7989fd973a swscale/ops: add min/max to SwsDitherOp
This gives more accurate information to the range tracker.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 12:10:38 +02:00
Niklas Haas
048ca3b367 swscale/ops_optimizer: check COMP_GARBAGE instead of next->comps.unused
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 09:39:09 +00:00
Niklas Haas
13388c0cac swscale/ops: test for SWS_COMP_GARBAGE instead of next->comps.unused
When printing/describing operations.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 09:39:09 +00:00
Niklas Haas
c0cc7f341a swscale/ops: simplify SwsOpList.order_src/dst
Just define these directly as integer arrays; there's really no point in
having them re-use SwsSwizzleOp; the only place this was ever even remotely
relevant was in the no-op check, which any decent compiler should already
be capable of optimizing into a single 32-bit comparison.
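
For illustration, a no-op check on a plain integer array could look roughly
like this. The function name is made up for this sketch; only the idea of
comparing four bytes against the identity order comes from the text above:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the plane order is the identity. */
static int order_is_identity(const uint8_t order[4])
{
    static const uint8_t identity[4] = { 0, 1, 2, 3 };
    assert(order);
    /* four bytes, so a compiler is free to emit one 32-bit compare */
    return memcmp(order, identity, sizeof(identity)) == 0;
}
```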

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 09:39:09 +00:00
Niklas Haas
bf09910292 swscale/ops: add filter kernel to SwsReadWriteOp
This allows reads to directly embed filter kernels, since in practice a filter
needs to be combined with a read anyways. To accomplish this, we define filter
ops as their semantic high-level operation types, and then have the optimizer
fuse them with the corresponding read/write ops (where possible).

Ultimately, something like this will be needed anyways for subsampled formats,
and doing it here is just incredibly clean and beneficial compared to each
of the several alternative designs I explored.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:13 +01:00
Niklas Haas
63140bff5e swscale/ops: define SWS_OP_FILTER_H/V
This commit merely adds the definitions. The implementations will follow.

It may seem a bit impractical to have these filter ops given that they
break the usual 1:1 association between operation inputs and outputs, but
the design path I chose will have these filter "pseudo-ops" end up migrating
towards the read/write for CPU implementations (which don't benefit from
any ability to hide the intermediate memory internally the way e.g. a fused
Vulkan compute shader might).

What we gain from this design, on the other hand, is considerably cleaner
high-level code, which doesn't need to concern itself with low-level
execution details at all, and can just freely insert these ops wherever
it needs to. The dispatch layer will take care of actually executing these
by implicitly splitting apart subpasses.

To handle out-of-range values and so on, the filters by necessity have to
also convert the pixel range. I have settled on using floating point types
as the canonical intermediate format - not only does this save us from having
to define e.g. I32 as a new intermediate format, but it also allows these
operations to chain naturally into SWS_OP_DITHER, which will basically
always be needed after a filter pass anyways.

The one exception here is for point sampling, which would rather preserve
the input type. I'll worry about this optimization at a later point in time.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:13 +01:00
Niklas Haas
4395e8f3a2 swscale/ops: add helper function to enumerate over all op lists
This moves the logic from tests/sws_ops into the library itself, where it
can be reused by e.g. the aarch64 asmgen backend to iterate over all possible
operation types it can expect to see.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 16:48:13 +00:00
Niklas Haas
f62c837eb6 swscale/ops: move op-formatting code to helper function
Annoyingly, access to order_src/dst requires access to the SwsOpList, so
we have to append that data after the fact.

Maybe this is another incremental tick in favor of `SwsReadWriteOp` in the
ever-present question in my head of whether the plane order should go there
or into SwsOpList.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 16:48:13 +00:00
Niklas Haas
e7c84a8e6a swscale/ops_dispatch: infer destination format from SwsOpList
This is now redundant.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-12 21:02:48 +00:00
Niklas Haas
b5db7c7354 swscale/ops_dispatch: have ff_sws_compile_pass() take ownership of ops
More useful than just allowing it to "modify" the ops; in practice this means
the contents will be undefined anyways - might as well have this function
take care of freeing it afterwards as well.

Will make things simpler with regards to subpass splitting.
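
A minimal sketch of the ownership pattern described here, assuming the op
list is handed over through a pointer-to-pointer. `OpList` and
`compile_pass()` are hypothetical names; the real ff_sws_compile_pass()
signature may differ:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for SwsOpList. */
typedef struct OpList { int num_ops; } OpList;

/* Takes ownership of *pops: the caller's pointer is cleared, and the list
 * is freed here on every path. */
static int compile_pass(OpList **pops)
{
    OpList *ops;
    assert(pops);
    ops   = *pops;
    *pops = NULL;
    if (!ops)
        return -1;
    /* ... a real implementation would optimize and compile ops here ... */
    free(ops);
    return 0;
}
```

Clearing the caller's pointer makes accidental reuse of the (now undefined)
contents fail loudly instead of silently.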

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-12 21:02:48 +00:00
Niklas Haas
1addde59f9 swscale/ops: add ff_sws_op_type_name
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 11:25:57 +01:00
Niklas Haas
1d16161a8b swscale/ops: use SwsCompFlags typedef instead of plain int
This improves the debugging experience. These are all internal structs so
there is no need to worry about ABI stability as a result of adding flags.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 11:25:57 +01:00
Niklas Haas
b4bcb00cd3 swscale/ops: add and use ff_sws_op_list_input/output()
Makes various pieces of code that expect to get a SWS_OP_READ more robust,
and also allows us to generalize to more input op types in the
future (in particular, I am looking ahead towards filter ops).

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-05 23:34:56 +00:00
Niklas Haas
096e20b4b8 swscale/ops: reorder fields in SwsOpList
For some reason, this avoids triggering a compiler bug in gcc-15 on PowerPC
that started showing up with commit da47951bd7.
2026-02-26 18:08:49 +00:00
Niklas Haas
841ca7a2cb swscale/format: pass SwsFormat by ref instead of by value where possible
The one exception is in adapt_colors(), which mutates these structs on
its own stack anyways.
2026-02-26 18:08:49 +00:00
Niklas Haas
ef4a597ad8 swscale/ops: allow excluding components from SWS_OP_DITHER
We often need to dither only a subset of the components. Previously this
was not possible, but we can just use the special value -1 for this.

The main motivating factor is actually the fact that "unnecessary" dither ops
would otherwise frequently prevent plane splitting, since e.g. a copied
alpha plane has to come along for the ride through the whole F32/dither
pipeline.

Additionally, it somewhat simplifies implementations.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-02-26 13:09:14 +00:00
Ramiro Polla
c7c8c31302 swscale/tests/sws_ops: print range values in the output
This gives more information about each operation and helps catch issues
earlier on.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-02-24 19:27:51 +01:00
Niklas Haas
da47951bd7 swscale/ops: lift read op metadata to SwsOpList
Instead of awkwardly preserving these from the `SwsOp` itself. This
interpretation lessens the risk of bugs as a result of changing the plane
swizzle mask without updating the corresponding components.

After this commit, the plane swizzle mask is automatically taken into
account; i.e. the src_comps mask is always interpreted as if the read op
was in-order (unswizzled).

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-02-19 19:44:46 +00:00
Niklas Haas
998cffb432 swscale/ops: add input/output plane swizzle mask to SwsOpList
This can be used to have the execution code directly swizzle the plane
pointers, instead of swizzling the data via SWS_OP_SWIZZLE. This can be used
to, for example, extract a subset of the input/output planes for partial
processing of split graphs (e.g. subsampled chroma, or independent alpha),
or just to skip an SWS_OP_SWIZZLE operation.
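
To illustrate the idea, swizzling the plane pointers instead of the pixel
data amounts to something like the following. The function name and the
four-entry order array are assumptions made for this sketch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pick plane pointers according to a swizzle mask,
 * so that no per-pixel SWS_OP_SWIZZLE is needed. */
static void swizzle_planes(uint8_t *out[4], uint8_t *const in[4],
                           const uint8_t order[4])
{
    for (int i = 0; i < 4; i++) {
        assert(order[i] < 4);
        out[i] = in[order[i]];
    }
}
```

The cost is four pointer assignments per pass rather than work per pixel.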

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-02-19 19:44:46 +00:00
Niklas Haas
e96332cb65 swscale/ops: add ff_sws_op_list_is_noop()
And use it in ff_sws_compile_pass() instead of hard-coding the check there.
This check will become more sophisticated in the following commits.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-02-19 19:44:46 +00:00
Niklas Haas
abb1524138 Revert "swscale/ops: clarify SwsOpList.src/dst semantics"
This reverts commit c94e8afe5d.

These are now actually purely informational.
2025-12-24 16:37:22 +00:00
Niklas Haas
5f1be98f62 swscale/ops: add SWS_COMP_SWAPPED
This flag keeps track of whether a pixel is currently byte-swapped or
not. Not needed by current backends, but informative and useful for
catching potential endianness errors.

Updates a lot of FATE tests with a cosmetic diff like this:

 rgb24 -> gray16be:
   [ u8 XXXX -> +++X] SWS_OP_READ         : 3 elem(s) packed >> 0
   [ u8 ...X -> +++X] SWS_OP_CONVERT      : u8 -> f32
   [f32 ...X -> .++X] SWS_OP_LINEAR       : dot3 [...]
   [f32 .XXX -> +++X] SWS_OP_CONVERT      : f32 -> u16
-  [u16 .XXX -> +++X] SWS_OP_SWAP_BYTES
-  [u16 .XXX -> +++X] SWS_OP_WRITE        : 1 elem(s) planar >> 0
-    (X = unused, + = exact, 0 = zero)
+  [u16 .XXX -> zzzX] SWS_OP_SWAP_BYTES
+  [u16 .XXX -> zzzX] SWS_OP_WRITE        : 1 elem(s) planar >> 0
+    (X = unused, z = byteswapped, + = exact, 0 = zero)

(The choice of `z` to represent swapped integers is arbitrary, but I think
it's visually evocative and distinct from the other symbols.)
2025-12-24 16:37:22 +00:00
Niklas Haas
9586b81373 swscale/ops: don't strip SwsComps from SWS_OP_READ
The current behavior of assuming the value range implicitly on SWS_OP_READ
has a number of serious drawbacks and shortcomings:

- It ignores the effects of SWS_OP_RSHIFT, such as for p010 and related
  MSB-aligned formats. (This is actually a bug.)

- It adds a needless dependency on the "purely informative" src/dst fields
  inside SwsOpList.

- It is difficult to reason about when acted upon by SWS_OP_SWAP_BYTES, and
  the existing hack of simply ignoring SWAP_BYTES on the value range is not
  a very good solution here.

Instead, we need a more principled way for the op list generating code
to communicate extra metadata about the operations read to the optimizer.

I think the simplest way of doing this is to allow the SwsComps field attached
to SWS_OP_READ to carry additional, user-provided information about the values
read.

This requires changing ff_sws_op_list_update_comps() slightly to not completely
overwrite SwsComps on SWS_OP_READ, but instead merge the implicit information
with the explicitly provided information.
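
As a small worked example of the p010 case from the list above: if the read
carries an explicit range, a right shift can simply be applied to it. The
`Range` type and the helper are hypothetical, and the numbers assume 10 bits
stored MSB-aligned in a 16-bit word:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical value range attached to a component. */
typedef struct Range { uint32_t min, max; } Range;

/* Propagate a range through a right shift by `shift` bits. */
static Range range_rshift(Range r, int shift)
{
    assert(shift >= 0 && shift < 32);
    r.min >>= shift;
    r.max >>= shift;
    return r;
}
```

With a read range of 0..0xFFC0 and a shift of 6, the result is 0..1023,
rather than whatever an implicit full-range assumption would give.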
2025-12-24 16:37:22 +00:00
Niklas Haas
fafd72ef04 swscale/ops_internal: fix ff_sws_pack_op_decode()
This function was assuming that the bits are MSB-aligned, but they are
LSB-aligned both in practice and in the actual backend.

Also update the documentation of SwsPackOp to make this clearer.

Fixes an incorrect omission of a clamp after decoding e.g. rgb4, since
the max value range was incorrectly determined as 0 as a result of unpacking
the MSB bits instead of the LSB bits:

 bgr4 -> gray:
   [ u8 XXXX -> +XXX] SWS_OP_READ         : 1 elem(s) packed >> 1
   [ u8 .XXX -> +++X] SWS_OP_UNPACK       : {1 2 1 0}
   [ u8 ...X -> +++X] SWS_OP_SWIZZLE      : 2103
   [ u8 ...X -> +++X] SWS_OP_CONVERT      : u8 -> f32
   [f32 ...X -> .++X] SWS_OP_LINEAR       : dot3 [...]
   [f32 .XXX -> .++X] SWS_OP_DITHER       : 16x16 matrix + {0 3 2 5}
+  [f32 .XXX -> .++X] SWS_OP_MIN          : x <= {255 _ _ _}
   [f32 .XXX -> +++X] SWS_OP_CONVERT      : f32 -> u8
   [ u8 .XXX -> +++X] SWS_OP_WRITE        : 1 elem(s) planar >> 0
     (X = unused, + = exact, 0 = zero)
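
For reference, a minimal sketch of what LSB-aligned unpacking means for the
{1 2 1 0} pattern above. The helper names are hypothetical, and the
assumption that the first component sits in the highest of the used bits is
mine, not taken from the SwsPackOp documentation:

```c
#include <assert.h>
#include <stdint.h>

/* Unpack up to four bit fields that together occupy the LOW bits of x. */
static void unpack_lsb(unsigned out[4], uint8_t x, const uint8_t pattern[4])
{
    int shift = 0;
    for (int i = 0; i < 4; i++)
        shift += pattern[i];          /* total number of used bits */
    assert(shift <= 8);
    for (int i = 0; i < 4; i++) {
        shift -= pattern[i];
        out[i] = (x >> shift) & ((1u << pattern[i]) - 1);
    }
}

/* Largest value a field of the given width can hold. */
static unsigned unpack_max(uint8_t bits)
{
    return (1u << bits) - 1;
}
```

Reading the same fields from the top of the byte would only ever see zero
bits for a 4-bit value, which matches the bogus maximum of 0 described above.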
2025-12-22 20:14:31 +00:00
Niklas Haas
7505264b6a swscale/ops: update comment on SWS_COMP_EXACT
That the integer is "in-range" is implied by the min/max range tracking,
not the flag itself.
2025-12-20 13:52:45 +00:00
Niklas Haas
1d0fd7fabf swscale/ops: categorize ops by type compatibility
This is a more useful grouping than the previous, somewhat arbitrary one.
2025-12-20 13:52:45 +00:00
Niklas Haas
960cf3015e swscale/ops: add explicit row offset to SwsDitherOp
To improve decorrelation between components, we offset the dither matrix
slightly for each component. This is currently done by adding a hard-coded
offset of {0, 3, 2, 5} to each of the four components, respectively.

However, this represents a serious challenge when re-ordering SwsDitherOp
past a swizzle, or when splitting an SwsOpList into multiple sub-operations
(e.g. for decoupling luma from subsampled chroma when they are independent).

To fix this on a fundamental level, we have to keep track of the offset per
channel as part of the SwsDitherOp metadata, and respect those values at
runtime.
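
A rough sketch of what respecting a per-component row offset at runtime could
look like. This uses a standard 4x4 Bayer matrix rather than the 16x16 matrix
used in practice, and the `DitherOp` struct and `y_offset` field name are
assumptions for illustration only:

```c
#include <assert.h>
#include <stdint.h>

enum { DITHER_SIZE = 4 };

/* Standard 4x4 Bayer matrix, used here only as a small example. */
static const uint8_t bayer4[DITHER_SIZE][DITHER_SIZE] = {
    {  0,  8,  2, 10 },
    { 12,  4, 14,  6 },
    {  3, 11,  1,  9 },
    { 15,  7, 13,  5 },
};

/* Hypothetical stand-in for the metadata carried by SwsDitherOp. */
typedef struct DitherOp { uint8_t y_offset[4]; } DitherOp;

/* Look up the dither value for component `comp` at pixel (x, y). */
static int dither_value(const DitherOp *op, int comp, int x, int y)
{
    assert(comp >= 0 && comp < 4 && x >= 0 && y >= 0);
    int row = (y + op->y_offset[comp]) % DITHER_SIZE;
    return bayer4[row][x % DITHER_SIZE];
}
```

Since the offset travels with the op, a swizzle or a split only has to
permute or subset `y_offset` instead of relying on the component index.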

This commit merely adds the metadata; the update to the underlying backends
will come in a follow-up commit. The FATE change is merely due to the
added offsets in the op list print-out.
2025-12-15 14:31:58 +00:00
Niklas Haas
c94e8afe5d swscale/ops: clarify SwsOpList.src/dst semantics
Turns out these are not, in fact, purely informative; the optimizer
can take them into account. This should be documented properly.

I tried to think of a way to avoid needing this in the optimizer, but any
way I could think of would require shoving this to SwsReadWriteOp, which I
am particularly unwilling to do.
2025-12-08 20:09:37 +00:00
Martin Storsjö
3cc1dc3358 swscale: Remove the unused ff_sws_pixel_type_to_uint
This function uses ff_sws_pixel_type_size to switch on the
size of the provided type. However, ff_sws_pixel_type_size returns
a size in bytes (from sizeof()), not a size in bits. Therefore,
this would previously never return the right thing but always
hit the av_unreachable() below.
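
The bytes-versus-bits mix-up can be illustrated with a hypothetical
reconstruction of the pattern (this is not the removed function, just a
sketch of the same mistake next to a bit-width computation that works):

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Buggy pattern: `size` comes from sizeof() and is therefore in bytes,
 * but the cases are written as if it were in bits. */
static int bits_from_size_buggy(size_t size)
{
    switch (size) {
    case  8: return  8;
    case 16: return 16;
    case 32: return 32;
    }
    return -1; /* stands in for av_unreachable() */
}

/* Converting bytes to bits explicitly avoids the confusion. */
static int bits_from_size(size_t size)
{
    assert(size > 0);
    return (int) (size * CHAR_BIT);
}
```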

As the function is entirely unused, just remove it.

This fixes compilation with MSVC 2026 18.0 when targeting ARM64,
which previously hit an internal compiler error [1].

[1] https://developercommunity.visualstudio.com/t/Internal-Compiler-Error-targeting-ARM64-/10962922
2025-11-21 21:07:34 +00:00
Niklas Haas
db2bc11a97 swscale/ops: add dispatch layer
This handles the low-level execution of an op list, and integration into
the SwsGraph infrastructure. To handle frames with insufficient padding in
the stride (or a width smaller than one block size), we use a fallback loop
that pads the last column of pixels using `memcpy` into an appropriately
sized buffer.
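
The fallback loop can be sketched roughly as follows, assuming a kernel that
always touches a full block of pixels. The block size, the kernel and the
function names are all hypothetical; only the `memcpy` into a padded buffer
reflects the description above:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { BLOCK = 8 };

/* Hypothetical kernel: always reads and writes exactly BLOCK pixels. */
static void kernel_invert(uint8_t *dst, const uint8_t *src)
{
    for (int i = 0; i < BLOCK; i++)
        dst[i] = 255 - src[i];
}

static void process_row(uint8_t *dst, const uint8_t *src, int width)
{
    int x = 0;
    assert(width >= 0);

    /* Fast path: whole blocks directly on the frame data. */
    for (; x + BLOCK <= width; x += BLOCK)
        kernel_invert(dst + x, src + x);

    /* Fallback: copy the partial last block into padded scratch buffers,
     * so the kernel never reads or writes past the end of the row. */
    if (x < width) {
        uint8_t in[BLOCK] = { 0 }, out[BLOCK];
        int tail = width - x;
        memcpy(in, src + x, tail);
        kernel_invert(out, in);
        memcpy(dst + x, out, tail);
    }
}
```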
2025-09-01 19:28:36 +02:00
Niklas Haas
ea9ca3ff35 swscale/optimizer: add high-level ops optimizer
This is responsible for taking a "naive" ops list and optimizing it
as much as possible. Also includes a small analyzer that generates component
metadata for use by the optimizer.
2025-09-01 19:28:36 +02:00
Niklas Haas
16e191c8ef swscale/ops: introduce new low level framework
See docs/swscale-v2.txt for an in-depth introduction to the new approach.

This commit merely introduces the ops definitions and boilerplate functions.
The subsequent commits will flesh out the underlying implementation.
2025-09-01 19:28:36 +02:00