10 Commits

Author SHA1 Message Date
Niklas Haas
df4fe85ae3 swscale/ops_chain: replace SwsOpEntry.unused by SwsCompMask
Needed to allow us to phase out SwsComps.unused altogether.

It's worth pointing out the change in semantics: while unused tracks the
unused *input* components, the mask is defined as representing the
computed *output* components.

This is 90% the same, except for read/write, pack/unpack, and clear, which
are the only operations that can change the number of components.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:25:10 +02:00
Kacper Michajłow
1092852406 swscale/ops: remove type from continuation functions
The glue code doesn't care about types, so long as the functions are
chained correctly. Let's not pretend there is any type safety there, as
the function pointers were cast anyway from unrelated types.
In particular, some f32 and u32 functions are shared.

This fixes errors like so:
src/libswscale/ops_tmpl_int.c:471:1: runtime error: call to function linear_diagoff3_f32 through pointer to incorrect function type 'void (*)(struct SwsOpIter *, const struct SwsOpImpl *, unsigned int *, unsigned int *, unsigned int *, unsigned int *)'
libswscale/ops_tmpl_float.c:208: note: linear_diagoff3_f32 defined here

Fixes: #22332
2026-04-13 23:28:30 +00:00
Kacper Michajłow
9a2a0557ad swscale/ops: remove optimize attribute from op functions
It was added to force auto-vectorization on GCC builds. Since then, auto-
vectorization has been enabled for the whole code base in 1464930696.

According to the GCC documentation, the optimize attribute should be used
for debugging purposes only; it is not suitable for production code.

In particular, it is unclear whether the attribute is actually applied, as
it is lost when the function is inlined, so its usage is quite fragile.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-04-13 23:28:30 +00:00
Niklas Haas
e787f75ec8 swscale/ops_backend: add support for SWS_OP_FILTER_V
These could be implemented as a special case of DECL_READ(), but the
amount of extra noise that entails is not worth it, especially given the
extra setup/free code that needs to be used here.

I've decided that, for now, the canonical implementation shall convert the
weights to floating point before doing the actual scaling. This is not a huge
efficiency loss (since the result will be 32-bit anyway, and mulps/addps are
1-cycle ops), so the main downside comes from the single extra float conversion
on the input pixels.

In theory, we may revisit this later if it turns out that using e.g. pmaddwd
is a win even for vertical scaling, but for now, this works and is a simple
starting point. Vertical scaling also tends to happen after horizontal scaling,
at which point the input will be F32 already to begin with.

For smaller types/kernels (e.g. U8 input with a reasonably sized kernel),
the result here is exact either way, since the resulting 8+14 bit sum fits
exactly into float.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas
fce3deaa3b swscale/ops_backend: add SwsOpExec to SwsOpIter
Needed for the scaling kernel, which accesses line strides.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas
00d1f41b2e swscale/ops_backend: avoid UB (null pointer arithmetic)
Just use uintptr_t; it accomplishes the exact same thing while being
defined behavior.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-24 13:20:59 +00:00
Niklas Haas
ef114cedef swscale/ops_chain: refactor setup() signature
This is basically a cosmetic commit that groups all of the parameters to
setup() into a single struct, as well as the return type. This gives the
immediate benefit of freeing up 8 bytes per op table entry, though the
main motivation will come in the following commits.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-18 09:09:44 +00:00
Niklas Haas
e729f49645 swscale/ops_backend: allocate block storage up-front
Instead of in each read() function. Not only is this slightly faster, due
to promoting more tail calls, but it also allows us to have operation chains
that don't start with a read.

Also simplifies the implementations.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-02-19 19:44:46 +00:00
Kacper Michajłow
1294ab5db1 swscale/ops_tmpl_int: remove unused arguments from wrap read decl
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2025-09-13 19:12:44 +02:00
Niklas Haas
5aef513fb4 swscale/ops_backend: add reference backend based on C templates
This will serve as a reference for the SIMD backends to come. That said,
with auto-vectorization enabled, the performance of this is not atrocious.
It easily beats the old C code and sometimes even the old SIMD.

In theory, we could dramatically speed it up by using GCC vectors instead
of arrays, but the performance gains from that are too dependent on exact
GCC versions and flags, so in practice it's not a substitute for a SIMD
implementation.
2025-09-01 19:28:36 +02:00