This is far more commonly used without an offset than with one; having the
offset there prevents these special cases from doing much good.
Signed-off-by: Niklas Haas <git@haasn.dev>
Instead of implicitly testing for NaN values. This is mostly a straightforward
translation, but we need some slight extra boilerplate to ensure the mask
is correctly updated when e.g. commuting past a swizzle.
Signed-off-by: Niklas Haas <git@haasn.dev>
It was a bit clunky, lacked semantic contextual information, and made it
harder to reason about the effects of extending this struct. There should be
zero runtime overhead, since this is already a big union.
I made the changes in this commit by hand, but due to the length and noise
level of the commit, I used Opus 4.6 to verify that I did not accidentally
introduce any bugs or typos.
Signed-off-by: Niklas Haas <git@haasn.dev>
Just define these directly as integer arrays; there's really no point in
having them re-use SwsSwizzleOp; the only place this was ever even remotely
relevant was in the no-op check, which any decent compiler should already
be capable of optimizing into a single 32-bit comparison.
Signed-off-by: Niklas Haas <git@haasn.dev>
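A minimal sketch of the kind of check the commit refers to, using hypothetical
names (`ExampleSwizzle`, `example_swizzle_is_noop` are illustrative, not the
actual swscale API): a swizzle stored as four plain byte indices, with the
no-op test written as a byte comparison that a compiler can collapse into a
single 32-bit compare.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical: a swizzle pattern stored as four plain integer indices,
 * as the commit suggests, rather than re-using SwsSwizzleOp. */
typedef struct ExampleSwizzle {
    uint8_t order[4]; /* destination component <- source component */
} ExampleSwizzle;

/* Compare against the identity pattern {0,1,2,3}; a four-byte memcmp is
 * something any decent compiler can turn into one 32-bit comparison. */
static int example_swizzle_is_noop(const ExampleSwizzle *s)
{
    static const uint8_t identity[4] = { 0, 1, 2, 3 };
    return memcmp(s->order, identity, sizeof(identity)) == 0;
}
```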
This allows reads to directly embed filter kernels. This is because, in
practice, a filter needs to be combined with a read anyways. To accomplish
this, we define filter ops as their semantic high-level operation types, and
then have the optimizer fuse them with the corresponding read/write ops
(where possible).
Ultimately, something like this will be needed anyways for subsampled formats,
and doing it here is just incredibly clean and beneficial compared to each
of the several alternative designs I explored.
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
This commit merely adds the definitions. The implementations will follow.
It may seem a bit impractical to have these filter ops given that they
break the usual 1:1 association between operation inputs and outputs, but
the design path I chose will have these filter "pseudo-ops" end up migrating
towards the read/write ops for CPU implementations, which don't benefit from
any ability to hide the intermediate memory internally the way e.g. a fused
Vulkan compute shader might.
What we gain from this design, on the other hand, is considerably cleaner
high-level code, which doesn't need to concern itself with low-level
execution details at all, and can just freely insert these ops wherever
it needs to. The dispatch layer will take care of actually executing these
by implicitly splitting apart subpasses.
To handle out-of-range values and so on, the filters by necessity have to
also convert the pixel range. I have settled on using floating point types
as the canonical intermediate format - not only does this save us from having
to define e.g. I32 as a new intermediate format, but it also allows these
operations to chain naturally into SWS_OP_DITHER, which will basically
always be needed after a filter pass anyways.
The one exception here is for point sampling, which would rather preserve
the input type. I'll worry about this optimization at a later point in time.
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
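To illustrate why f32 chains naturally into a dither pass, here is a hedged
sketch under assumed semantics (the function names are hypothetical, and the
clamping/truncation details are illustrative, not the actual backend code):
the filter output stays in float, the dither offset is added in float, and
only the final conversion truncates back to the integer type.

```c
#include <stdint.h>

/* Hypothetical sketch: with f32 as the canonical intermediate, a filter's
 * range conversion composes naturally with a subsequent dither step. */
static float example_u8_to_f32(uint8_t x)
{
    return (float) x; /* value range preserved, just re-typed */
}

static uint8_t example_dither_to_u8(float x, float dither)
{
    /* add the dither offset before truncating back to the integer type */
    float v = x + dither;
    if (v < 0.0f)
        v = 0.0f;
    if (v > 255.0f)
        v = 255.0f;
    return (uint8_t) v; /* C float-to-int conversion truncates toward zero */
}
```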
This moves the logic from tests/sws_ops into the library itself, where it
can be reused by e.g. the aarch64 asmgen backend to iterate over all possible
operation types it can expect to see.
Signed-off-by: Niklas Haas <git@haasn.dev>
Annoyingly, access to order_src/dst requires access to the SwsOpList, so
we have to append that data after the fact.
Maybe this is another incremental tick in favor of `SwsReadWriteOp` in the
ever-present question in my head of whether the plane order should go there
or into SwsOpList.
Signed-off-by: Niklas Haas <git@haasn.dev>
More useful than just allowing it to "modify" the ops; in practice,
modification would leave the contents undefined anyways, so we might as well
have this function take care of freeing the list afterwards.
Will make things simpler with regards to subpass splitting.
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
This improves the debugging experience. These are all internal structs so
there is no need to worry about ABI stability as a result of adding flags.
Signed-off-by: Niklas Haas <git@haasn.dev>
Makes various pieces of code that expect to get a SWS_OP_READ more robust,
and also allows us to generalize by introducing more input op types in the
future (in particular, I am looking ahead towards filter ops).
Signed-off-by: Niklas Haas <git@haasn.dev>
We often need to dither only a subset of the components. Previously this
was not possible; now we can simply use the special value -1 for this purpose.
The main motivating factor is actually the fact that "unnecessary" dither ops
would otherwise frequently prevent plane splitting, since e.g. a copied
alpha plane has to come along for the ride through the whole F32/dither
pipeline.
Additionally, it somewhat simplifies implementations.
Signed-off-by: Niklas Haas <git@haasn.dev>
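A small sketch of the idea, with hypothetical names and an assumed
representation (a per-component index into the dither matrix, where -1 marks
a component to pass through untouched, e.g. a copied alpha plane):

```c
/* Hypothetical sketch of a per-component dither selector: the special
 * value -1 means "do not dither this component". */
static float example_dither_component(float value, int matrix_idx,
                                      const float *matrix)
{
    if (matrix_idx < 0) /* -1: leave this component untouched */
        return value;
    return value + matrix[matrix_idx];
}
```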
This gives more information about each operation and helps catch issues
earlier on.
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
Instead of awkwardly preserving these from the `SwsOp` itself. This
interpretation lessens the risk of bugs as a result of changing the plane
swizzle mask without updating the corresponding components.
After this commit, the plane swizzle mask is automatically taken into
account; i.e. the src_comps mask is always interpreted as if the read op
was in-order (unswizzled).
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
This can be used to have the execution code directly swizzle the plane
pointers, instead of swizzling the data via SWS_OP_SWIZZLE. This makes it
possible, for example, to extract a subset of the input/output planes for
partial processing of split graphs (e.g. subsampled chroma, or independent
alpha), or simply to skip an SWS_OP_SWIZZLE operation.
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
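The plane-pointer variant of a swizzle can be sketched as follows (hedged:
`example_swizzle_planes` and its signature are illustrative, not the actual
dispatch code); the point is that only four pointers move, no pixel data:

```c
#include <stdint.h>

/* Hypothetical sketch: instead of moving pixel data with SWS_OP_SWIZZLE,
 * re-order the plane pointers themselves before dispatch. */
static void example_swizzle_planes(uint8_t *dst[4], uint8_t *src[4],
                                   const int order[4])
{
    for (int i = 0; i < 4; i++)
        dst[i] = src[order[i]];
}
```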
And use it in ff_sws_compile_pass() instead of hard-coding the check there.
This check will become more sophisticated in the following commits.
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
The current behavior of assuming the value range implicitly on SWS_OP_READ
has a number of serious drawbacks and shortcomings:
- It ignores the effects of SWS_OP_RSHIFT, such as for p010 and related
MSB-aligned formats. (This is actually a bug.)
- It adds a needless dependency on the "purely informative" src/dst fields
inside SwsOpList.
- It is difficult to reason about when acted upon by SWS_OP_SWAP_BYTES, and
the existing hack of simply ignoring SWAP_BYTES on the value range is not
a very good solution here.
Instead, we need a more principled way for the op list generating code
to communicate extra metadata about the operations read to the optimizer.
I think the simplest way of doing this is to allow the SwsComps field attached
to SWS_OP_READ to carry additional, user-provided information about the values
read.
This requires changing ff_sws_op_list_update_comps() slightly to not completely
overwrite SwsComps on SWS_OP_READ, but instead merge the implicit information
with the explicitly provided one.
This function was assuming that the bits are MSB-aligned, but they are
LSB-aligned both in practice and in the actual backend.
Also update the documentation of SwsPackOp to make this clearer.
Fixes an incorrect omission of a clamp after decoding e.g. rgb4, since
the max value range was incorrectly determined as 0 as a result of unpacking
the MSB bits instead of the LSB bits:
bgr4 -> gray:
[ u8 XXXX -> +XXX] SWS_OP_READ : 1 elem(s) packed >> 1
[ u8 .XXX -> +++X] SWS_OP_UNPACK : {1 2 1 0}
[ u8 ...X -> +++X] SWS_OP_SWIZZLE : 2103
[ u8 ...X -> +++X] SWS_OP_CONVERT : u8 -> f32
[f32 ...X -> .++X] SWS_OP_LINEAR : dot3 [...]
[f32 .XXX -> .++X] SWS_OP_DITHER : 16x16 matrix + {0 3 2 5}
+ [f32 .XXX -> .++X] SWS_OP_MIN : x <= {255 _ _ _}
[f32 .XXX -> +++X] SWS_OP_CONVERT : f32 -> u8
[ u8 .XXX -> +++X] SWS_OP_WRITE : 1 elem(s) planar >> 0
(X = unused, + = exact, 0 = zero)
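The range computation implied by the fix can be sketched in one line (the
function name is hypothetical): with LSB-aligned packing, the maximum value of
a `bits`-wide packed component is simply 2^bits - 1, independent of the
container depth, so e.g. a 3-bit rgb4 component has a max of 7 rather than
the 0 that the MSB-aligned interpretation produced.

```c
/* Hypothetical sketch: for an LSB-aligned packed component occupying
 * `bits` bits, every one of the low `bits` bits may be set, so the
 * maximum representable value is (2^bits - 1). */
static unsigned example_packed_max(int bits)
{
    return (1u << bits) - 1;
}
```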
To improve decorrelation between components, we offset the dither matrix
slightly for each component. This is currently done by adding a hard-coded
offset of {0, 3, 2, 5} to each of the four components, respectively.
However, this represents a serious challenge when re-ordering SwsDitherOp
past a swizzle, or when splitting an SwsOpList into multiple sub-operations
(e.g. for decoupling luma from subsampled chroma when they are independent).
To fix this on a fundamental level, we have to keep track of the offset per
channel as part of the SwsDitherOp metadata, and respect those values at
runtime.
This commit merely adds the metadata; the update to the underlying backends
will come in a follow-up commit. The FATE change is merely due to the
added offsets in the op list print-out.
It turns out these are not, in fact, purely informative: the optimizer
can take them into account. This should be documented properly.
I tried to think of a way to avoid needing this in the optimizer, but any
way I could think of would require shoving this to SwsReadWriteOp, which I
am particularly unwilling to do.
This function uses ff_sws_pixel_type_size to switch on the
size of the provided type. However, ff_sws_pixel_type_size returns
a size in bytes (from sizeof()), not a size in bits. Therefore,
this would previously never return the right thing but always
hit the av_unreachable() below.
As the function is entirely unused, just remove it.
This fixes compilation with MSVC 2026 18.0 when targeting ARM64,
which previously hit an internal compiler error [1].
[1] https://developercommunity.visualstudio.com/t/Internal-Compiler-Error-targeting-ARM64-/10962922
This handles the low-level execution of an op list, and integration into
the SwsGraph infrastructure. To handle frames with insufficient padding in
the stride (or a width smaller than one block size), we use a fallback loop
that pads the last column of pixels using `memcpy` into an appropriately
sized buffer.
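The fallback described above can be sketched roughly as follows (hedged: the
function name and the choice of zero-padding are assumptions for illustration;
the actual code may replicate the last pixel instead):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch of the padding fallback: when the remaining row data
 * is smaller than one block, copy the tail into a block-sized scratch
 * buffer so the kernel can always consume a full block safely. */
static void example_copy_padded(uint8_t *block, size_t block_size,
                                const uint8_t *row, size_t remaining)
{
    memcpy(block, row, remaining);
    /* pad the rest of the block; zeroing is the simplest choice here,
     * repeating the last pixel would also be plausible */
    memset(block + remaining, 0, block_size - remaining);
}
```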
This is responsible for taking a "naive" ops list and optimizing it
as much as possible. Also includes a small analyzer that generates component
metadata for use by the optimizer.
See docs/swscale-v2.txt for an in-depth introduction to the new approach.
This commit merely introduces the ops definitions and boilerplate functions.
The subsequent commits will flesh out the underlying implementation.