This allows distinguishing between different types of failure, e.g.
AVERROR(EINVAL) on invalid pass dimensions.
Signed-off-by: Niklas Haas <git@haasn.dev>
The code was evidently designed at one point in time to support "direct"
execution (not via a thread pool) for num_threads == 1, but this was never
implemented.
As a side benefit, reduces context creation overhead in single threaded
mode (relevant e.g. inside the libswscale self test), due to not needing to
spawn and destroy several thousand worker threads.
Co-authored-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Niklas Haas <git@haasn.dev>
AVFrame just really doesn't have the semantics we want. However, there a
tangible benefit to having SwsFrame act as a carbon copy of a (subset of)
AVFrame.
This partially reverts commit 67f3627267.
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
This has now become fully redundant with AVFrame, especially since the
existence of SwsPassBuffer. Delete it, simplifying a lot of things and
avoiding reinventing the wheel everywhere.
Also generally reduces overhead, since there is less redundant copying
going on.
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
And have ff_sws_graph_run() just take a bare AVFrame. This will help with
an upcoming change, aside from being a bit friendlier towards API users
in general.
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
This commit replaces the AVBufferRef inside SwsPassBuffer by an AVFrame, in
anticipation of the SwsImg removal.
Incidentally, we could also now just use av_frame_get_buffer() here, but
at the cost of breaking the 1:1 relationship between planes and buffers,
which is required for per-plane refcopies.
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
This function was originally written to support the use case of e.g.
partially allocated planes that implicitly reference the original input
image, but I've decided that this is stupid and doesn't currently work
anyways.
Plus, I have plans to kill SwsImg, so we need to simplify this mess.
Signed-off-by: Niklas Haas <git@haasn.dev>
This gives more information about each operation and helps catch issues
earlier on.
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
Allows multiple passes to share a single output buffer reference. We always
allocate an output buffer so that subpasses can share the same output buffer
reference while still allowing that reference to implicitly point to the
final output image.
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
The global args to ff_sws_graph_run() really shouldn't matter inside thread
workers. If they ever do, it indicates a leaky abstraction. The only reason
it was needed in the first place was because of the way the input/output
buffers implicitly defaulted to the global args.
However, we can solve this much more elegantly by just calculating it in
ff_sws_graph_run() directly and storing the computed SwsImg inside the
execution state.
Signed-off-by: Niklas Haas <git@haasn.dev>
This allows already referenced planes to be skipped, in the case of e.g.
only some of the output planes being sucessfully referenced. Also avoids
what is technically UB, if the user happens to call ff_sws_graph_run() after
already having ref'd an image.
Signed-off-by: Niklas Haas <git@haasn.dev>
Using the original input image here is completely wrong - the format/palette
could have been set to anything else in the meantime. At best, we would want to
use the original input to add_legacy_sws_pass(), but it's impossible for this
to differ from the per-pass input. The only time legacy subpasses are added
is when using cascaded contexts, but in this case, the only context actually
reading from the palette format would be the first one.
I'm not entirely sure why this code was originally written this way, but
I'm reasonably confident that it's not at all necessary. Tested extensively
on both FATE, the self-test, and real-world files.
Signed-off-by: Niklas Haas <git@haasn.dev>
This annoyingly requires recreating some of the logic inside av_img_alloc(),
because there's no good existing current helper accessible from libswscale
that gives per-plane allocations like this.
The new code is based off the calculations inside libavframe/bufferpool.c.
Signed-off-by: Niklas Haas <git@haasn.dev>
When multiple passes share a buffer reference, the true buffer dimensions
may be different for each pass, depending on slice alignment. So we can't
rely on the pass dimensions being representative.
Instead, store this information in the SwsPassBuffer itself.
Signed-off-by: Niklas Haas <git@haasn.dev>
I want to add more metadata to this and also turn it into a refstruct,
but get the cosmetic diff out of the way first.
Signed-off-by: Niklas Haas <git@haasn.dev>
And use it in ff_sws_compile_pass() instead of hard-coding the check there.
This check will become more sophisticated in the following commits.
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
Prepare for xyz12Torgb48 architecture-specific optimizations in
subsequent patches by:
- Grouping XYZ+RGB gamma LUTs and 3x3 matrices into SwsColorXform
(ctx->xyz2rgb and ctx->rgb2xyz), replacing scattered fields.
- Dropping the unused last matrix column giving the same or smaller
SwsInternal size.
- Renaming ff_xyz12Torgb48 and ff_rgb48Toxyz12 and routing calls via
the new per-context function pointer (ctx->xyz12Torgb48 and
ctx->rgb48Toxyz12) in graph.c and swscale.c.
- Adding ff_sws_init_xyzdsp and invoking it in swscale init paths
(normal and unscaled).
- Making fill_xyztables public to ease its setup later in checkasm.
These modifications do not introduce any functional changes.
Signed-off-by: Arpad Panyik <Arpad.Panyik@arm.com>
This behavior had no real justification and was just incredibly confusing,
since the in/out pointers passet to setup() did not match those passed to
run(), all for what is arguably an exception anyways (the palette setup).
If this function returns an error after ff_sws_graph_add_pass() has been
called, and the pass->free callback is therefore already set up to free the
context, the graph will end up freed twice: once by the pass->free callback
(during ff_sws_graph_free()), and once before that by failure path of the
caller (e.g. add_legacy_sws_pass(), or init_legacy_subpass() itself for
cascaded contexts.)
The solution is to redefine the ownership of SwsGraph to pass clearly from
the caller of add_legacy_sws_pass() to init_legacy_subpass(), which can then
deal with appropriately freeing the context conditional on whether or not the
pass was already registered in the pass list.
Reported-by: 김영민 <kunshim@naver.com>
Signed-off-by: Niklas Haas <git@haasn.dev>
The current loop only works if the input and output have the same number
of planes. However, with the new scaling logic, we can also optimize into a
noop the case where the input has extra unneeded planes.
For the memcpy fallback to work in these cases we have to instead check if
the *output* pointer is set, rather than the input pointer.
utils.c is getting quite crowded, and I need a new place to dump a lot of
format handling code for the ongoing rewrite. Rather than bloating this file
even more, start splitting format handling helpers off into a new file.
This also renames the existing utils.h header, which didn't really contain
anything except the SwsFormat definition anyway (the prototypes for what should
have been in utils.h are all still in the legacy swscale_internal.h).
This leverages the previously introduced color management subsystem in order
to adapt between transfer functions and color spaces, as well as for HDR tone
mapping.
Take special care to handle grayscale formats without a colorspace
gracefully.
This setting can be used to infuence the type of tone and gamut mapping used
internally when color space conversions are required. As discussed at VDD'24,
the default was set to relative colorimetric clipping, which is approximately
associative, surjective and idempotent. As such, it roundtrips well, although
it is strictly speaking not associative on out-of-gamut colors.
This interface has been designed from the ground up to serve as a new
framework for dispatching various scaling operations at a high level. This
will eventually replace the old ad-hoc system of using cascaded contexts,
as well as allowing us to plug in more dynamic scaling passes requiring
intermediate steps, such as colorspace conversions, etc.
The starter implementation merely piggybacks off the existing sws_init() and
sws_scale(), functions, though it does bring the immediate improvement of
splitting up cascaded functions and pre/post conversion functions into
separate filter passes, which allows them to e.g. be executed in parallel
even when the main scaler is required to be single threaded. Additionally,
a dedicated (multi-threaded) noop memcpy pass substantially improves
throughput of that fast path.
Follow-up commits will eventually expand this to move all of the scaling
decision logic into the graph init function, and also eliminate some of the
current special cases.
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>