Saturday, December 08, 2007

Apple formats

Aften finally supports AIFF and CAFF input. Next up is AIFF-C. Also, for full CAFF support, I need to include channel layout detection and/or conversion.

Saturday, November 10, 2007

multi-mono input

I've been working on multi-mono input for Aften over the last 2 days. I'm finally happy with it, but I need to do some more testing before committing it.

I decided to implement it within the pcm reader rather than in libaften. It's a bit slower since it has to interleave the samples, then libaften turns around and deinterleaves them. It makes things simpler though. I have thought of implementing planar pcm decoding and planar aften encoding. This would probably make things faster when using multi-mono input because it would avoid both the interleaving and deinterleaving.
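To illustrate the extra work described above, here is a minimal sketch of the interleave step a reader would do for multi-mono input, and the deinterleave step an encoder would then undo. The function names and the 16-bit sample type are assumptions for illustration, not Aften's actual code.

```c
#include <stddef.h>
#include <stdint.h>

/* Interleave `channels` mono buffers of `frames` samples each into one
 * buffer laid out as ch0[0], ch1[0], ..., ch0[1], ch1[1], ...
 * (hypothetical helper; 16-bit samples assumed) */
void interleave_s16(const int16_t *const *planar, int channels,
                    size_t frames, int16_t *out)
{
    for (size_t i = 0; i < frames; i++)
        for (int ch = 0; ch < channels; ch++)
            out[i * channels + ch] = planar[ch][i];
}

/* The inverse: split an interleaved buffer back into one buffer per
 * channel. A planar code path would make both passes unnecessary. */
void deinterleave_s16(const int16_t *in, int channels,
                      size_t frames, int16_t *const *planar)
{
    for (size_t i = 0; i < frames; i++)
        for (int ch = 0; ch < channels; ch++)
            planar[ch][i] = in[i * channels + ch];
}
```

Each pass touches every sample once, so the overhead is two extra copies of the input per frame.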

Sunday, September 09, 2007

Aften 0.0.8

I finally released version 0.0.8. I've been meaning to for a while now, but things kept getting in the way. I have lots of plans for new stuff...now I just need the time to make them happen. The top of my list is to get coupling working 100%. Then I want to implement simple E-AC3 support.

Sunday, July 08, 2007

raw pcm input

I have restructured the audio input framework for the Aften commandline program. There is now a separate directory (and static lib) for processing audio input. The major reason for this change was to support raw pcm input. The new framework will also make it easier to add support for other audio formats.

This change adds 3 new commandline options:

[-raw_fmt X]  Raw audio input sample format (default: s16_le)
              One of the pre-defined sample formats:
              u8, s16_le, s16_be, s20_le, s20_be, s24_le, s24_be,
              s32_le, s32_be, float_le, float_be, double_le,
              double_be
[-raw_sr #]   Raw audio input sample rate (default: 48000)
[-raw_ch #]   Raw audio input channels (default: 2)
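As a rough illustration of what the sample format names mean, the sketch below decodes three of them (u8, s16_le, s24_be) from raw bytes into normalized floating-point values. The function names are hypothetical and this is not the reader code from Aften; it only shows the byte order and sign handling each name implies.

```c
#include <stdint.h>

/* u8: unsigned 8-bit, with 128 as the zero level. */
double decode_u8(const uint8_t *p)
{
    return (p[0] - 128) / 128.0;
}

/* s16_le: signed 16-bit, least significant byte first. */
double decode_s16_le(const uint8_t *p)
{
    int32_t v = p[0] | (p[1] << 8);
    if (v & 0x8000)          /* sign-extend from 16 bits */
        v -= 0x10000;
    return v / 32768.0;
}

/* s24_be: signed 24-bit, most significant byte first. */
double decode_s24_be(const uint8_t *p)
{
    int32_t v = ((int32_t)p[0] << 16) | (p[1] << 8) | p[2];
    if (v & 0x800000)        /* sign-extend from 24 bits */
        v -= 0x1000000;
    return v / 8388608.0;
}
```

Because raw input has no header, the format, sample rate, and channel count all have to come from the options above rather than from the file.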

Sunday, March 04, 2007

thoughts on bandwidth and coupling

While it's fresh on my mind, I want to get some ideas down in print regarding adaptive coupling and variable bandwidth.

First of all, I just added a completely reworked variable bandwidth algorithm. Basically, it works by doing exponent encoding at full bandwidth, then doing 1 bit allocation pass at q=240 (snroffset=0). The results of these 2 steps are used to estimate the bits required at each bandwidth setting, so a good guess can be made as to which bandwidth code will give quality near q=240.
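The selection step can be sketched roughly as follows. This is a simplified illustration, not the committed algorithm: it assumes the single pass yields a per-bin bit cost array (`bin_bits`, a hypothetical name), and it picks the highest bandwidth code whose estimated total still fits in the frame. The mapping from bandwidth code to end bin (code * 3 + 73, codes 0 to 60) follows the AC-3 specification.

```c
#define MAX_BW_CODE 60

/* AC-3 maps a channel bandwidth code to the end mantissa bin. */
static int bw_code_to_end_bin(int code)
{
    return code * 3 + 73;
}

/* Return the highest bandwidth code whose estimated size fits.
 * bin_bits:   estimated bits per frequency bin (at least 253 entries),
 *             taken from one allocation pass at the target quality
 * fixed_bits: bits that do not depend on bandwidth (headers etc.)
 * frame_bits: total bits available in the frame
 * Falls back to code 0 if even the lowest bandwidth does not fit. */
int choose_bw_code(const int *bin_bits, int fixed_bits, int frame_bits)
{
    int total = fixed_bits;
    int best = 0;
    int bin = 0;

    for (int code = 0; code <= MAX_BW_CODE; code++) {
        int end = bw_code_to_end_bin(code);
        for (; bin < end; bin++)   /* running sum, each bin added once */
            total += bin_bits[bin];
        if (total > frame_bits)
            break;
        best = code;
    }
    return best;
}
```

The point of the approach is that the expensive steps run once; scanning the candidate bandwidths afterwards is only a cumulative sum.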

I want to extend this idea to adaptive coupling. Instead of continuous adjustment like in variable bandwidth mode, I want to have a few bandwidth threshold points. The first would be a high-frequency threshold, probably around 16kHz. Above that frequency, it would work like the variable bandwidth mode: if encoding at the threshold bandwidth at q=240 fits in the frame, the bandwidth would be increased incrementally until it no longer fits, and no coupling would be used. If the frame does not fit at the threshold, a light coupling strategy would be tried. If that still doesn't give adequate quality, then a heavy coupling strategy would be tried. If the quality is still too low, then no coupling would be used and normal variable bandwidth mode would be used instead.

Of course, I may change all this later, but I wanted to do some rambling while I still have the idea pretty well-formed in my head.

Saturday, February 24, 2007

recent changes

Here is an overview of some of the changes since the release of version 0.06. It's a more detailed list than what's in the Changelog, but it's simpler than having to go through the 90 SVN commit messages.
  • fixed piped input for Windows
  • several updates to the public API, including splitting the public header
  • Prakash added C++ bindings
  • utility functions for acmod and lfe were fixed, cleaned up, and simplified
  • fixed dual-mono encoding
  • added a completely new bit allocation search method, which was a collaboration between Prakash and myself. The advantage of the new method is that it gives the same results no matter what the search start value is.
  • version info is now accessible in API and displayed on commandline
  • parallel encoding by Prakash. This is the most significant change since the last release.
  • fixed bit exactness/uninitialized memory problem in exponent encoding
  • added option to remove start-of-frame padding
  • split out the commandline help in aften to a separate file to make it simpler and easier to maintain
  • a number of excellent speed-ups by Prakash in the exponent functions
  • added an optional fast exponent strategy mode, which uses a fixed strategy. The quality is not quite as good, but it is significantly faster.
  • fixed alt bitstream syntax encoding

Friday, January 26, 2007

behind the scenes

I thought I would take a minute to discuss some of the improvements I'm currently working on.

First, I have completed a basic implementation of channel coupling. The AC-3 format has the capability to merge all or selected channels in certain frequency ranges. Only a small amount of side information is transmitted for individual channel differences in those ranges. The format allows for very detailed control of this coupling on a per-block and per-channel level. The implementation I've done at this point is only a very simple starting point using a couple of fixed strategies based on samples from commercial DVDs. The plan is to eventually have Aften dynamically adjust the strategy based on the compression needed for each frame.

Second, most of the work has been completed on a psychoacoustic model. AC-3 actually has its own simple psychoacoustic model built into the specification, but also provides the capability to adjust the results of the built-in model. This allows a more sophisticated model to be run in parallel in order to generate more accurate bit allocation results. At this point, I have simply adapted the MPEG Psychoacoustic Model I, as implemented in the twolame mp2 encoder, to the integer log-power measurement and critical band structure used in the AC-3 bit allocation process. The information generated by this model is used to select the fast gain codes which are used in the AC-3 bit allocation. I am now working on using the results for delta bit allocation (adjustments to the masking threshold) as well.

The inclusion of these features will probably not be done for at least a month or two, but I will post patches of works-in-progress against the upcoming version 0.06 to Aften-devel for testing purposes.

Monday, January 08, 2007

New exponent strategy decision

I've implemented and committed a new algorithm for the selection of exponent strategies. The old method was from FFmpeg. It compared the difference between exponents from one block to the next and selected exponent strategy for each block using a threshold. The new method selects between 5 predefined sets of strategies for the whole frame. All 5 use the same number of exponent bits. The selection is done by comparing the original exponents to each set of encoded exponents and choosing the set with the smallest deviation from the original.
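The selection step of the new method can be sketched like this. It is only an illustration under stated assumptions: the names are hypothetical, the exponents for the whole frame are treated as one flat array, and "deviation" is taken to be the sum of absolute differences, which may not be the exact measure used in Aften.

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_SETS 5

/* Pick the strategy set whose encoded exponents deviate least from
 * the originals.
 * orig:    original exponents for the whole frame
 * encoded: encoded exponents for each of the NUM_SETS candidate sets
 * count:   number of exponents in each array
 * Ties go to the lowest-numbered set. */
int choose_exponent_set(const unsigned char *orig,
                        const unsigned char *const encoded[NUM_SETS],
                        size_t count)
{
    int best = 0;
    long best_err = LONG_MAX;

    for (int s = 0; s < NUM_SETS; s++) {
        long err = 0;
        for (size_t i = 0; i < count; i++)
            err += abs((int)orig[i] - (int)encoded[s][i]);
        if (err < best_err) {
            best_err = err;
            best = s;
        }
    }
    return best;
}
```

Since all 5 sets cost the same number of exponent bits, the comparison only has to look at accuracy, not at size.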