Previously, the padding was silence, which was a problem when streaming since
you would sample a little bit of this silence between each buffer.
We still need a means to get padding data for the right hand side, but this
patch makes the resampler output more correct.
This time it's using real math from a real whitepaper instead of my previous
amateur, fast-but-low-quality attempt. The new resampler does "bandlimited
interpolation," as described here: https://ccrma.stanford.edu/~jos/resample/
The output appears to sound cleaner, especially at high frequencies, and of
course works with non-power-of-two rate conversions.
There are some obvious optimizations to be done to this still, and there is
other fallout: this doesn't resample a buffer in-place, the 2-channels-Sint16
fast path is gone because this resampler does a _lot_ of floating point math.
There is a nasty hack to make it work with SDL_AudioCVT.
It's possible these issues are solvable, but they aren't solved as of yet.
Still, I hope this effort is slouching in the right direction.
Simon Hug
This issue actually raises the question if this API change (requirement of initialized audio subsystem) is breaking backwards compatibility. I don't see the documentation saying it is needed in 2.0.5.
"Major changes, roughly in order of appearance:
- Use float math everywhere, instead of promoting to double and casting back
all the time.
- Conserve sound energy when downmixing any channel into two other channels.
- Add a QuadToStereo filter. (The previous technique of reusing StereoToMono
never worked, since it assumed an incorrect channel layout for 4.0.)
- Add a 71to51 filter. This removes just under half of the cases the previous
code would silently break in.
- Add a QuadTo51 filter. More silent breakage fixed.
- Add a 51to71 filter, removing another almost-half of the silently broken
cases.
- Add 8 to the list of values SDL_SupportedChannelCount will accept.
- Change SDL_BuildAudioCVT's channel-related logic to handle every case, and
to actually fail if it fails instead of silently corrupting sound data and/or
crashing down the road."
(Note that SDL doesn't otherwise support 7.1 audio yet, but hopefully it will
soon and the 7.1 converters are an important piece of that. --ryan.)
Fixes Bugzilla #3727.
Simon Hug
There's a chance that an audio conversion from many channels to a few can use more than 9 audio filters. SDL_AudioCVT has 10 SDL_AudioFilter pointers of which one has to be the terminating NULL pointer. The SDL code has no checks for this limit. If it overflows there can be stack or heap corruption or a call to 0xa.
Attached patch adds a function that checks for this limit and throws an error if it is reached. Also adds some documentation.
Test parameters that trigger this issue:
AUDIO_U16MSB with 224 channels at 46359 Hz
V
AUDIO_S16MSB with 6 channels at 27463 Hz
The fuzzer program I uploaded in bug 3667 has more of them.
This defaults to the internal SDL resampler, since that's the likely default
without a system-wide install of libsamplerate, but those that need more can
tweak this.
This currently favors libsamplerate over the fast path (quality over speed),
but I'm not sure that's the correct approach, as there may be surprising
changes in performance metrics depending on what packages are available on
a user's system. That being said, currently, the only thing with access to
SDL_AudioStream is an SDL audio device's thread, and it might be mostly idle
otherwise, so maybe this is generally good.
Turns out that iterating from 0 to channels-1 was a serious performance hit!
These cases now tend to match or beat the original audio resampler's speed!
This allows us to avoid an extra copy, allocate less memory and reduce cache
pressure. On the downside: we have to do a lot of tapdancing to resample the
buffer in reverse when the output is growing.
It's expensive and (hopefully) unnecessary. If this becomes an overflow
problem, we could multiply both values by 0.5f before adding them, but let's
see if we can get by without the extra multiplication first.
We never seem to overflow the source buffer now; this might have been a
leftover from a bug that was covered by Vitaly's fixes?
Removing this conditional makes the resampler 10-20% faster. Left an
assert in there for debug builds, in case this still happens.
Removed some needless things ("len / sizeof (Uint8)"), and made sure the
int32 -> float code uses doubles to avoid working with large integer values
in a 32-bit float.
Lukasz Biel
Tried to compile SDL2 using newest version of VS.
Got:
SDL_audiocvt.obj : error LNK2019: unresolved external symbol memcpy referenced in function SDL_ResampleCVT
1>E:\Users\dotPo\Lib\SDL\VisualC\x64\Release\SDL2.dll : fatal error LNK1120: 1 unresolved externals
whole compilation process: http://pastebin.com/eWDAvBce
Steps to reproduce:
clone http://hg.libsdl.org/SDL using tortoise hg,
open SDL\VisualC\SDL.sln,
when promted if should retarget solution click ok,
select release x64 build type,
Build/Build Solution
attempt 2, using Visual Studio cmake support:
open folder SDL\
select release x64 build type,
run CMake\Build CMakeLists.txt
build fails
When switched to debug build type, buils succeeds in both cases.
VS 2017 is still beta.
It causes audio pops if you're converting in chunks (and needs to
allocate/initialize/free on each convert). We'll either adjust this interface
when we break ABI for 2.1 to make this usable, or publish the SDL_AudioStream
API for those that want a streaming solution.
In the meantime, the "simple" resampler produces "good enough" audio without
pops and doesn't have to be initialized, so that'll do for now on the
SDL_AudioCVT interface.
There was a draft of this where it did audio conversion into the final buffer,
if there was enough room available past what you asked for, but that interface
got removed, so the parameters didn't make sense (and we were using the
wrong one in any case, too!).
This no longer uses a script to generate code for every possible type
conversion or resampler. This caused a bloat in binary size and and compile
times. Now we use a handful of more generic functions and assume staying in
the CPU cache is the most important thing anyhow.
This shrinks the size of the final build (in this case: macOS X amd64, -Os to
optimize for size) by 15%. When compiling on a single core, build times drop
by about 15% too (although the previous cost was largely hidden by multicore
builds).