16 Commits

Author SHA1 Message Date
Andreas Rheinhardt
2611874a50 avcodec/cbrt_tablegen: Deduplicate common code
Namely the part that creates a temporary LUT.

Reviewed-by: Zhao Zhili <quinkblack@foxmail.com>
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-09-04 10:15:09 +02:00
Andreas Rheinhardt
9bacefc41e avcodec/cbrt_tablegen: Avoid LUT only used once
This can be done by reusing the destination array to store
a temporary LUT (with only half the amount of elements, but
double the element size). This relies on certain assumptions
about sizes, but they are always fulfilled for systems supported
by us (sizeof(double) == 8 is needed/guaranteed since commit
3383a53e7d). Furthermore,
sizeof(uint32_t) is always >= four because CHAR_BIT is eight
(in fact, sizeof(uint32_t) is four, because the exact width
types don't have any padding).

Reviewed-by: Zhao Zhili <quinkblack@foxmail.com>
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-09-04 10:15:03 +02:00
Andreas Rheinhardt
cd337c46d5 avcodec/cbrt_tablegen: Reduce size of LUT only used once
ff_cbrt_tableinit{,_fixed}() uses a LUT of doubles of the same size
as the actual LUT to be initialized (8192 elems). Said LUT is only
used to initialize another LUT, but because it is static, the dirty
memory (64KiB) is not released. It is also too large to be put on
the stack.

This commit mitigates this: We only use a LUT for the powers of
odd values, thereby halving its size. The generated LUT stays unchanged.

Reviewed-by: Zhao Zhili <quinkblack@foxmail.com>
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-09-04 10:14:53 +02:00
Andreas Rheinhardt
77e5e61f1a avcodec/cbrt_tablegen: Remove always-false branch
Each ff_cbrt_tableinit*() is called at most once, making this
check always-false.

Reviewed-by: Zhao Zhili <quinkblack@foxmail.com>
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2025-09-04 10:12:25 +02:00
Reimar Döffinger
7c93f2c0b9 Move cbrt tables to separate cbrt_data(_fixed).c files.
Allows sharing and reusing the data between different files.

Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
2016-03-13 18:15:57 +01:00
Ganesh Ajjanagadde
07a11ebcab lavc/cbrt_tablegen: speed up tablegen
This exploits an approach based on the sieve of Eratosthenes, a popular
method for generating prime numbers.

Tables are identical to previous ones.

Tested with FATE with/without --enable-hardcoded-tables.

Sample benchmark (Haswell, GNU/Linux+gcc):
prev:
7860100 decicycles in cbrt_tableinit,       1 runs,      0 skips
7777490 decicycles in cbrt_tableinit,       2 runs,      0 skips
[...]
7582339 decicycles in cbrt_tableinit,     256 runs,      0 skips
7563556 decicycles in cbrt_tableinit,     512 runs,      0 skips

new:
2099480 decicycles in cbrt_tableinit,       1 runs,      0 skips
2044470 decicycles in cbrt_tableinit,       2 runs,      0 skips
[...]
1796544 decicycles in cbrt_tableinit,     256 runs,      0 skips
1791631 decicycles in cbrt_tableinit,     512 runs,      0 skips

Both small and large run count given as this is called once so small run
count may give a better picture, small numbers are fairly consistent,
and there is a consistent downward trend from small to large runs,
at which point it stabilizes to a new value.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2016-01-11 17:20:38 -05:00
Ganesh Ajjanagadde
2f5075f551 avcodec/cbrt_tablegen: speed up dynamic table creation
On systems having cbrt, there is no reason to use the slow pow function.

Sample benchmark (x86-64, Haswell, GNU/Linux):
new:
5124920 decicycles in cbrt_tableinit,       1 runs,      0 skips

old:
12321680 decicycles in cbrt_tableinit,       1 runs,      0 skips

Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
2015-12-01 19:05:19 -05:00
Nedeljko Babic
a9d986c2ce avcodec: Minor macro polishing
Use macros from aac_defines.h for adding suffixes
 instead of local macros.

Signed-off-by: Nedeljko Babic <nedeljko.babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-07-22 23:23:29 +02:00
Jovan Zelincevic
08be74ac81 libavcodec: Implementation of AAC_fixed_decoder (LC-module) [2/4]
Add fixed point implementation of functions for generating tables

Signed-off-by: Nedeljko Babic <nedeljko.babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-07-09 14:41:19 +02:00
Reimar Döffinger
03bf457241 Add av_cold to table generation functions.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
2014-08-31 10:33:02 +02:00
Michael Niedermayer
59352a07d8 avcodec: improve precission for cbrtf() emulation
cbrtf() took floats but it represented 1/3 exactly
and even if not more precission should be better in theory
for the table generation

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-15 21:03:03 +02:00
Derek Buitenhuis
008014b5e7 tablegen: Don't use cbrtf in host tools
You cannot count on them being present on all systems, and you
cannot include libm.h in a host tool, so just hard code baseline
implementations.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-15 16:52:07 +01:00
Michael Niedermayer
bf8bb94322 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  ffmpeg: get rid of the -vglobal option.
  dct32: Add AVX implementation of 32-point DCT
  dct32: Change pass 6 permutation to allow for AVX implementation
  dct32: port SSE 32-point DCT to YASM
  multiple inclusion guard cleanup
  avio: document buffer must created with av_malloc() and friends
  avio: check AVIOContext malloc failure
  swscale: point out an alternative to sws_getContext
  svq3: Do initialization after parsing the extradata
  add changelog entries for 0.7_beta2
  mp3lame: add #include required for AV_RB32 macro.

Conflicts:
	Changelog
	libavcodec/svq3.c
	libavcodec/x86/dct32_sse.c
	libavfilter/vsrc_buffer.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-22 04:53:19 +02:00
Diego Biurrun
153382e1b6 multiple inclusion guard cleanup
Add missing multiple inclusion guards; clean up #endif comments;
add missing library prefixes; keep guard names consistent.
2011-05-21 13:48:10 +02:00
Mans Rullgard
2912e87a6c Replace FFmpeg with Libav in licence headers
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-19 13:33:20 +00:00
Reimar Döffinger
c26bce1070 Allow hard-coding of the 32kB cubic-root table for AAC.
Originally committed as revision 22527 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-14 19:59:47 +00:00