I've been thinking about how to improve the quality of Aften. The first thing, which I vow to get finished soon, is channel coupling, but there are other things I want to try out too.
The exponent strategy decision in Aften is somewhat simple. In AC-3, exponents can take up anywhere from about 100 to 2500 bits per channel, per frame. In Aften, the set of 6 exponent strategies for each channel is chosen from a list that usually gives somewhere around 400-500 bits/channel. In some cases bits are clearly wasted on the mantissas and could be put to better use increasing the accuracy of the exponents. My first thought is to create separate exponent strategy set tables to group together sets that use about the same number of bits. The table corresponding to approximately the current choices would be the default, but Aften could choose other tables in response to the results of bit allocation.
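To make the table idea a bit more concrete, here is a rough sketch of how the exponent bits for one full-bandwidth channel could be counted and how strategy sets could be bucketed by cost. It assumes the usual AC-3 grouping (7 bits per group, a group covering 3, 6 or 12 coefficients for D15, D25 and D45, plus a 4-bit absolute exponent per block that sends new exponents) and ignores coupling and LFE channels. The function names and the bucket size are made up for illustration; this is not Aften code.

```python
# Sketch only: count exponent bits for one full-bandwidth channel and
# bucket strategy sets by cost. Names and bucket size are hypothetical.
GROUP_SPAN = {"D15": 3, "D25": 6, "D45": 12}  # coefficients per 7-bit group


def exponent_groups(strategy, end_bin):
    """Number of 7-bit exponent groups in one block (0 when reusing)."""
    if strategy == "REUSE":
        return 0
    span = GROUP_SPAN[strategy]
    # truncate((end_bin - 1) / 3), ((end_bin - 1 + 3) / 6), ((end_bin - 1 + 9) / 12)
    return (end_bin - 1 + (span - 3)) // span


def exponent_bits(strategy_set, end_bin):
    """Total exponent bits for a per-block list of strategies."""
    total = 0
    for strategy in strategy_set:
        if strategy != "REUSE":
            # 4-bit absolute exponent plus 7 bits per group
            total += 4 + 7 * exponent_groups(strategy, end_bin)
    return total


def group_by_bits(strategy_sets, end_bin, bucket=100):
    """Group strategy sets whose bit cost falls in the same bucket."""
    tables = {}
    for strategy_set in strategy_sets:
        key = exponent_bits(strategy_set, end_bin) // bucket
        tables.setdefault(key, []).append(strategy_set)
    return tables
```

Each key of the returned dictionary would then play the role of one table, and the encoder could step to a neighbouring key when bit allocation shows bits to spare.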
For high bitrate encoding, average quality of 280 to 300 is fairly common, with values in some sections averaging 340 to 350. I would experiment with various thresholds for switching from adding bits to mantissas to adding bits to exponents. My first inclination would be to make it around 280. It could even be done incrementally, with certain quality levels triggering a switch to the next highest exponent table.
For low bitrate encoding, I would have to be very careful about taking away bits from exponents. I'll need to do some listening tests and some other types of objective measurements to determine how reducing exponent accuracy affects perceived quality. The numerical "quality" setting in Aften is only directly related to mantissa accuracy.
I have also briefly experimented with weighting the error measurement used to select exponent strategies. I tried weighting the error values by frequency bin based on the hearing threshold table. This gave more weight to error in the more important bandwidths. I have not been able to hear a difference though. Some PEAQ tests have shown about a 0.02 increase in the objective difference grade (ODG) across several samples, so it is at least unlikely to hurt quality. But it's slower, so I'm holding off on committing it until I can do more tests.
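As a small illustration of the weighting idea, the sketch below computes a per-bin weighted error between original and encoded exponents. Whether the real measurement uses squared or absolute differences, and how the weights are derived from the hearing threshold table, are assumptions here; the weights are simply passed in by the caller.

```python
# Sketch only: weighted exponent error. The squared difference and the
# caller-supplied weights are assumptions, not Aften's actual measure.
def weighted_exponent_error(original, encoded, weights):
    """Sum of squared exponent differences, scaled per frequency bin."""
    if not (len(original) == len(encoded) == len(weights)):
        raise ValueError("all inputs must have one entry per frequency bin")
    return sum(w * (o - e) ** 2 for o, e, w in zip(original, encoded, weights))
```

Bins where hearing is more sensitive would get larger weights, so an exponent mismatch there costs more than the same mismatch elsewhere.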
Wednesday, April 29, 2009
Saturday, October 18, 2008
Exponent Strategy
I've been playing around with exponent strategies in Alsophila and came up with what I think is a decent way of selecting them. It is also the first Alsophila feature I have ported back to Aften. The expstr_fast (-fba) option is now expstr_search (-exps) and ranges from 1 to 32.
The previous way of doing this was to select from 5 preset exponent strategy sets for each channel in a frame. The pre-defined sets were originally chosen pretty much by guessing. The new way selects from the 32 sets in the E-AC-3 specification (table E2.14). I sorted them in order of the most commonly used sets by testing various source material and analyzing which sets were used. The new search size option determines how far down this sorted list the encoder searches in order to find the best match.
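The search size option can be pictured with a short sketch: only the first N entries of the popularity-sorted list are evaluated, and the one with the lowest error wins. The `error_of` callback and the function name are placeholders for illustration, not the actual Aften functions.

```python
# Sketch only: limit the search to the first `search_size` entries of a
# list that is already sorted from most to least commonly used.
def search_strategy_sets(ranked_sets, error_of, search_size):
    """Return the lowest-error set among the first `search_size` entries."""
    if not 1 <= search_size <= len(ranked_sets):
        raise ValueError("search_size must be between 1 and the list length")
    candidates = ranked_sets[:search_size]
    # min() keeps the earliest candidate on ties, so common sets win draws.
    return min(candidates, key=error_of)
```

A search size of 1 always picks the most common set without comparing anything, while a search size equal to the list length tries every set.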
Friday, October 10, 2008
Wow, it has been quite a while since my last post. Aften has been somewhat on the back burner in favor of other projects this year. But I have at least been working on AC-3 and E-AC-3, so much of the work I'm doing will help improve Aften in the future.
I have started to do some more work with Alsophila, which is sort of my Aften sandbox project. Channel coupling has been working for a while now, but it's very simplistic. I'm revamping the structure of the encoder to allow a much better CBR mode with adaptive coupling and bandwidth. Once I'm happy with that, I'll probably start work on E-AC-3 encoding.
I really didn't like splitting out a new project, but the code structure of Aften was starting to get overwhelming and was making experimentation with new features somewhat difficult. When I get CBR+coupling working well, I'll try to port the changes back to Aften.
Saturday, December 08, 2007
Apple formats
Saturday, November 10, 2007
multi-mono input
I've been working on multi-mono input for Aften over the last 2 days. I'm finally happy with it, but I need to do some more testing before committing it.
I decided to implement it within the pcm reader rather than in libaften. It's a bit slower since it has to interleave the samples, then libaften turns around and deinterleaves them. It makes things simpler though. I have thought of implementing planar pcm decoding and planar aften encoding. This would probably make things faster when using multi-mono input because it would avoid both the interleaving and deinterleaving.
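The round trip described above looks roughly like this. It is only a sketch using plain Python lists; the function names are made up and the real reader works on sample buffers in C.

```python
# Sketch only: what the reader does (interleave) and what the encoder
# then undoes (deinterleave). Function names are hypothetical.
def interleave(channels):
    """Merge per-channel sample lists into one frame-ordered list."""
    return [sample for frame in zip(*channels) for sample in frame]


def deinterleave(samples, num_channels):
    """Split a frame-ordered sample list back into per-channel lists."""
    return [samples[ch::num_channels] for ch in range(num_channels)]
```

A planar path would hand the per-channel lists straight to the encoder and skip both steps.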
Sunday, September 09, 2007
Aften 0.0.8
I finally released version 0.0.8. I've been meaning to for a while now, but things kept getting in the way. I have lots of plans for new stuff...now I just need the time to make them happen. The top of my list is to get coupling working 100%. Then I want to implement simple E-AC3 support.
Sunday, July 08, 2007
raw pcm input
I have restructured the audio input framework for the Aften commandline program. There is now a separate directory (and static lib) for processing audio input. The major reason for this change was to support raw pcm input. The new framework will also make it easier to add support for other audio formats.
With this change are 3 new commandline options.
[-raw_fmt X] Raw audio input sample format (default: s16_le)
One of the pre-defined sample formats:
u8, s16_le, s16_be, s20_le, s20_be, s24_le, s24_be,
s32_le, s32_be, float_le, float_be, double_le,
double_be
[-raw_sr #] Raw audio input sample rate (default: 48000)
[-raw_ch #] Raw audio input channels (default: 2)
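To show what a few of these format names mean in practice, here is a small sketch that turns raw bytes into floats for u8, 16-bit and 24-bit samples. The scaling to the range [-1.0, 1.0) and the function name are assumptions for illustration; the actual pcm reader may scale and buffer differently.

```python
# Sketch only: decode a few of the raw sample formats into floats.
# The [-1.0, 1.0) scaling is an assumption, not taken from Aften.
import struct


def decode_raw(data, fmt="s16_le"):
    """Convert raw PCM bytes to a list of floats."""
    if fmt == "u8":
        # Unsigned 8-bit audio is centred on 128.
        return [(b - 128) / 128.0 for b in data]
    if fmt in ("s16_le", "s16_be"):
        order = "<" if fmt.endswith("_le") else ">"
        count = len(data) // 2
        values = struct.unpack(f"{order}{count}h", data[:count * 2])
        return [v / 32768.0 for v in values]
    if fmt in ("s24_le", "s24_be"):
        order = "little" if fmt.endswith("_le") else "big"
        usable = len(data) - len(data) % 3
        return [
            int.from_bytes(data[i:i + 3], order, signed=True) / 8388608.0
            for i in range(0, usable, 3)
        ]
    raise ValueError(f"format not covered by this sketch: {fmt}")
```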
Sunday, March 04, 2007
thoughts on bandwidth and coupling
While it's fresh on my mind, I want to get some ideas down in print regarding adaptive coupling and variable bandwidth.
First of all, I just added a completely reworked variable bandwidth algorithm. Basically, it works by doing exponent encoding at full bandwidth, then doing 1 bit allocation pass at q=240 (snroffset=0). The data from these 2 are used to estimate the bits required at each bandwidth setting. So a good guess can be made as to which bandwidth code can be chosen to give quality near 240.
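A simplified picture of the estimate: once a per-bin bit cost is known from the full-bandwidth pass, a running total gives the cost at every possible cutoff, and the largest bandwidth code that fits the budget can be picked. The per-bin cost list and the function names below are hypothetical; the mapping from bandwidth code to end bin follows the AC-3 rule end = (code + 12) * 3 + 37 with codes 0 to 60.

```python
# Sketch only: pick the widest bandwidth code whose estimated cost fits.
# `bits_per_bin` stands in for data gathered from one exponent pass and
# one bit allocation pass at full bandwidth; it is not an Aften structure.
from itertools import accumulate


def end_bin_for_bwcode(bwcode):
    """Number of coded frequency bins for an AC-3 bandwidth code."""
    return (bwcode + 12) * 3 + 37


def choose_bwcode(bits_per_bin, budget, max_code=60):
    """Largest bandwidth code whose estimated bits fit in `budget`."""
    running = list(accumulate(bits_per_bin))
    best = 0  # fall back to the narrowest bandwidth if nothing fits
    for code in range(max_code + 1):
        end_bin = end_bin_for_bwcode(code)
        if end_bin > len(running):
            break
        if running[end_bin - 1] <= budget:
            best = code
    return best
```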
I want to extend this idea to adaptive coupling. Instead of continuous adjustment like in variable bandwidth mode, I want to have a few bandwidth threshold points. The first would be a high-frequency threshold, probably around 16kHz. Above those frequencies, it would work like the variable bandwidth mode. If encoding at the threshold value at q=240 will fit in the frame, the bandwidth would be increased incrementally until it does not fit, and no coupling would be used. Below that threshold, a light coupling strategy would be tried. If that still doesn't give adequate quality, then a heavy coupling strategy would be tried. If the quality is still too low, then no coupling would be used and normal variable bandwidth mode would be used instead.
Of course, I may change all this later, but I wanted to do some rambling while I still have the idea pretty well-formed in my head.
Saturday, February 24, 2007
recent changes
Here is an overview of some of the changes since the release of version 0.06. It's a more detailed list than what's in the Changelog, but it's simpler than having to go through the 90 SVN commit messages.
- fixed piped input for Windows
- several updates to the public API, including splitting the public header
- Prakash added C++ bindings
- utility functions for acmod and lfe were fixed, cleaned-up, and simplified
- fixed dual-mono encoding
- added a completely new bit allocation search method, which was a collaboration between Prakash and myself. The advantage of the new method is that it gives the same results no matter what the search start value is.
- version info is now accessible in API and displayed on commandline
- parallel encoding by Prakash. This is the most significant change since the last release.
- fixed bit exactness/uninitialized memory problem in exponent encoding
- added option to remove start-of-frame padding
- split out the commandline help in aften to separate files to make it simpler and easier to maintain
- a number of excellent speed-ups by Prakash in the exponent functions
- added optional fast exponent strategy option which uses a fixed strategy, which is not quite as good quality, but is significantly faster
- fixed alt bitstream syntax encoding
Friday, January 26, 2007
behind the scenes
I thought I would take a minute to discuss some of the improvements I'm currently working on.
First, I have completed a basic implementation of channel coupling. The AC-3 format has the capability to merge all or selected channels in certain frequency ranges. Only a small amount of side information is transmitted for individual channel differences in those ranges. The format allows for very detailed control of this coupling on a per-block and per-channel level. The implementation I've done at this point is only a very simple starting point using a couple of fixed strategies based on samples from commercial DVDs. The plan is to eventually have Aften dynamically adjust the strategy based on the compression needed for each frame.
Second, most of the work has been completed on a psychoacoustic model. AC-3 actually has its own simple psychoacoustic model built-in to the specification, but also provides the capability to adjust the results of the built-in model. This allows a more sophisticated model to be run in parallel in order to generate more accurate bit allocation results. At this point, I have simply adapted the MPEG Psychoacoustic Model I, as implemented in the twolame mp2 encoder, to the integer log-power measurement and critical band structure used in the AC-3 bit allocation process. The information generated by this model is used to select the fast gain codes which are used in the AC-3 bit allocation. I am now working on trying to utilize the results for delta bit allocation (adjustments to the masking threshold) as well.
The inclusion of these features will probably not be done for at least a month or two, but I will post patches of works-in-progress against the upcoming version 0.06 to Aften-devel for testing purposes.
Monday, January 08, 2007
New exponent strategy decision
I've implemented and committed a new algorithm for the selection of exponent strategies. The old method was from FFmpeg. It compared the difference between exponents from one block to the next and selected exponent strategy for each block using a threshold. The new method selects between 5 predefined sets of strategies for the whole frame. All 5 use the same number of exponent bits. The selection is done by comparing the original exponents to each set of encoded exponents and choosing the set with the smallest deviation from the original.
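Here is a stripped-down sketch of the comparison step. Each candidate set is reduced to a tuple of flags saying which blocks send new exponents, and blocks that reuse exponents share the per-bin minimum over their run. That sharing rule, the absolute-difference measure and all names are assumptions for illustration; the sketch also ignores D25/D45 grouping and the limits of differential coding. Candidates should have the same number of new-exponent blocks so they cost the same bits.

```python
# Sketch only: pick the candidate whose encoded exponents deviate least
# from the originals. Sharing rule and error measure are assumptions.
def encode_exponents(blocks, new_flags):
    """Blocks that reuse exponents share the per-bin minimum of their run."""
    encoded = [None] * len(blocks)
    start = 0
    for i in range(1, len(blocks) + 1):
        if i == len(blocks) or new_flags[i]:
            shared = [min(column) for column in zip(*blocks[start:i])]
            for j in range(start, i):
                encoded[j] = shared
            start = i
    return encoded


def deviation(blocks, encoded):
    """Sum of absolute differences between original and encoded exponents."""
    return sum(
        abs(o - e)
        for orig_block, enc_block in zip(blocks, encoded)
        for o, e in zip(orig_block, enc_block)
    )


def pick_set(blocks, candidate_sets):
    """Return the candidate flag tuple with the smallest deviation."""
    return min(
        candidate_sets,
        key=lambda flags: deviation(blocks, encode_exponents(blocks, flags)),
    )
```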
Monday, October 23, 2006
dynamic range compression
I finally bit the bullet today and finished implementing DRC. There are 6 profiles, as defined by Dolby: Film Standard, Film Light, Music Standard, Music Light, Speech, and None.
example using Film Standard:
aften -dnorm 27 -dynrng 0 test.wav test.ac3
example using Music Light:
aften -dnorm 17 -dynrng 3 test.wav test.ac3
The results have not been extensively tested yet, so until then, consider this option experimental. It seems to work ok for the most part, but I have had a couple strange results.
Sunday, October 22, 2006
build system and other stuff
It's been a while since I've posted. Most of the recent changes have not been made by me, but by Prakash Punnoor, who has nearly completely converted the build system to CMake. The configure shell script is still usable, but once all the features are converted it will likely be removed altogether or stripped down to a simpler form and just kept for compatibility's sake.
Also, Prakash has imported an SSE-optimized version of the libvorbis MDCT. The new build system, along with asm run-time detection, has made this possible. The effect is a 5 to 10% increase in encoding speed.
Thursday, September 14, 2006
wavrms changes
I have done some modification to the wavrms utility program which can be used to give a dialnorm setting recommendation.
- enabled piped input for Windows
- added an option to specify a time range to analyze
- simplified the calculation
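For readers unfamiliar with dialnorm, the general idea can be sketched as follows: measure the RMS level of the audio in dB relative to full scale and express it as a value from 1 to 31, where 27 means -27 dB. This is only a guess at the kind of calculation involved, with made-up names; it is not the actual wavrms code, which may measure or round differently.

```python
# Sketch only: turn an RMS level into a dialnorm-style value (1..31).
# This is an assumed calculation, not the wavrms implementation.
import math


def recommend_dialnorm(samples):
    """`samples` are floats in [-1.0, 1.0]; returns an integer 1..31."""
    if not samples:
        raise ValueError("no samples to analyze")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return 31  # silence: use the quietest setting
    rms_db = 10 * math.log10(mean_square)
    return max(1, min(31, round(-rms_db)))
```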
Monday, September 11, 2006
Thank you Xiphophorus
I have substituted the FFmpeg MDCT implementation with the one from libvorbis, which is faster. The only side-effect is that there are 2 files now (mdct.c, mdct.h) which fall under the Xiph BSD-style license rather than the LGPL like the rest of Aften.
The AC-3 short-block MDCT is not a standard MDCT, so I had to modify it a bit to get it to work properly. The result is a small slow-down (but still faster than the previous MDCT) which I hope to fix before too long.
Friday, September 08, 2006
new optional speed-up
I've added a new encoding option which will sacrifice some accuracy in the bit allocation by skipping the binary search. This gives about a 4% to 5% speed-up. The commandline option for Aften is "-fba 1" (stands for fast bit allocation).
update & small speed increase
It's been several days since my last update. I've mostly been trying to iron-out the dynamic range compression. In the meantime, I've applied a couple minor changes to dsp.c which together give about a 3%-4% speed-up on my system.
Saturday, September 02, 2006
new cbr search method
I implemented a new search method for CBR bit allocation which gives about 6% to 10% faster encoding than with the old method.
old method:
- estimate snr based on last frame
- decrement snr in large steps until data fits in frame
- increment snr in medium steps while data still fits in frame
- increment snr in small steps while data still fits in frame
new method:
- estimate snr based on last frame
- narrow the search range by doing a weighted increment or decrement of snr based on how close it is to optimal, sliding a 16-wide window up or down until it overlaps the optimal value
- do a binary search for the optimal snr value
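The window-then-binary-search idea can be sketched like this. The sketch moves the window in fixed steps of 16 rather than the weighted steps described above, assumes the bit count never decreases as snr rises, and uses the combined snr offset range 0 to 1023. The `bits_needed` callback and all names are placeholders for illustration.

```python
# Sketch only: slide a 16-wide window until it brackets the optimum,
# then binary search inside it. Fixed steps replace the weighted steps.
def find_snr(bits_needed, frame_bits, guess, window=16,
             lo_limit=0, hi_limit=1023):
    """Largest snr whose data fits in the frame, or None if none fits."""
    lo = max(lo_limit, min(guess, hi_limit - window))
    hi = lo + window
    # Slide down while even the bottom of the window does not fit.
    while bits_needed(lo) > frame_bits:
        if lo == lo_limit:
            return None
        hi = lo
        lo = max(lo_limit, lo - window)
    # Slide up while the top of the window still fits.
    while bits_needed(hi) <= frame_bits:
        if hi == hi_limit:
            return hi
        lo = hi
        hi = min(hi_limit, hi + window)
    # Now lo fits and hi does not: binary search between them.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if bits_needed(mid) <= frame_bits:
            lo = mid
        else:
            hi = mid
    return lo
```

A good starting guess from the previous frame means the window rarely has to move far, so most of the work is the handful of steps in the final binary search.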
Monday, August 21, 2006
version 0.05
Version 0.05 has been released. The speed increase since version 0.04 really warranted another release. There have also been a few bug fixes.
http://sourceforge.net/projects/aften/
Saturday, August 19, 2006
doubles or floats
I decided to switch the default configuration for the floating-point type. Now using floats is default, and "--enable-double" enables use of doubles.
Friday, August 18, 2006
floats or doubles
There is now a configure switch to enable use of floats internally instead of doubles. This is less accurate, but faster. To use floats, just add "--enable-float" when running configure. Otherwise, Aften will use doubles.
Saturday, August 12, 2006
Sample Format choices
I have modified libaften and the Aften wav reader to be able to use various sample formats. Aften still uses doubles internally, but the input can be any one of the most common pcm sample format types.
It wasn't absolutely necessary for me to add this functionality to the wav reader, but I did it so that wav.c can be easily reused for other purposes.
Friday, August 11, 2006
more functional separation
I moved all of the exponent processing functions from a52enc.c to exponent.c. This was in preparation for a new exponent strategy decision algorithm, which I am working on now.
Monday, August 07, 2006
bit allocation speedup
Thanks to the help of Prakash Punnoor (prakash@punnoor.de), the bit allocation in Aften is much faster. On my system, there was a 22% speedup for CBR and 36% speedup for VBR!!
Sunday, August 06, 2006
Daily builds
I have set up daily builds from SVN. Every day at 4:50am EST my computer will get the latest SVN version of Aften and build a source package and a Windows binary package. They will be located at:
http://jbr.homelinux.org/aften/daily/
Saturday, August 05, 2006
Aften 0.04 released
I have set up a Sourceforge project for Aften, where I have released version 0.04.
Benefits of putting Aften on Sourceforge:
- SVN access to the latest development changes.
- Organized file releases
- Mailing lists (currently, there is only an svnlog mailing list)
- More publicity
- Opportunity for other developers to join the project
- Web hosting (the new Aften webpage is http://aften.sourceforge.net)
Friday, August 04, 2006
Default Bitrate
I made a few minor changes regarding bitrate. First, CBR mode is now the default instead of VBR. The default CBR bitrate is chosen based on the number of full-bandwidth channels.
1: 96 kbps
2: 192 kbps
3: 256 kbps
4: 384 kbps
5: 448 kbps
I modified the Aften commandline to accept bitrate in kbps instead of bps (e.g. '-b 192' instead of '-b 192000').
Now you can use the '-b' option along with the '-q' option. This will use VBR mode and use the given bitrate as a maximum bitrate.
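The rules above boil down to a small lookup plus a mode decision, sketched here. The table values come from the list above; the function name and the returned dictionary are made up for illustration and do not mirror the real option parser.

```python
# Sketch only: default CBR bitrate by channel count, and how '-b' and
# '-q' combine. Function name and return shape are hypothetical.
DEFAULT_CBR_KBPS = {1: 96, 2: 192, 3: 256, 4: 384, 5: 448}


def resolve_rate(full_bw_channels, bitrate_kbps=None, quality=None):
    """Decide between CBR and VBR from the given options."""
    if quality is not None:
        # '-q' selects VBR; '-b', when also given, caps the bitrate.
        return {"mode": "vbr", "quality": quality, "max_kbps": bitrate_kbps}
    if bitrate_kbps is None:
        bitrate_kbps = DEFAULT_CBR_KBPS[full_bw_channels]
    return {"mode": "cbr", "kbps": bitrate_kbps}
```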
Wednesday, August 02, 2006
libaften
I finally created a libaften! If there are no problems with the new configure/build system, I will release version 0.04 soon.
I decided not to make the filter library separate. It just made things too complicated. So now there is the base directory, aften, libaften, and util. Currently only a static library is built, but I will eventually add support for compiling a dynamic library/dll.
Tuesday, August 01, 2006
Bandwidth stuff
I have done some reworking of the bandwidth calculation. Now there are 2 adaptive modes. The default is now a fixed adaptive bandwidth, which is chosen at the start of encoding based on either the quality or bitrate setting. The other is the same variable bandwidth as before, but with a different formula. I did this mainly to make the bandwidth filter work better. There is a distinct quality difference in the high frequencies when using the fixed adaptive bandwidth coupled with the bandwidth pre-encoding filter.
commandline option changes:
'-bwfilter 1' = enable bandwidth filter
'-w -1' = fixed adaptive bandwidth (default)
'-w -2' = variable adaptive bandwidth (cannot use bandwidth filter w/ this mode)
Monday, July 31, 2006
filters
Added commandline options for the DC high-pass filter and the LFE low-pass filter.
The commandline switches are '-dcfilter <0|1>' and '-lfefilter <0|1>'.
short-term plans
Here is a small list of short-term development plans. I hope to get all this done in the next week or so. Once these are done, I will release a new version 0.04.
- separate the CRC code to its own file
- add optional DC high-pass and LFE low-pass filters
- completely redo the variable bandwidth system
- add an optional bandwidth low-pass filter
- overhaul the configuration system
- separate aften frontend, libaften, libfilter, and utils
Filter Library
I created a separate filter library, which will be used for all input audio filters. Currently it is only used for the transient-detection high-pass filter. Also, I changed Aften to use doubles instead of floats for pretty much everything.
The configuration system might be a little unstable on non-Linux systems right now. I am planning on doing some updates soon.
Friday, July 28, 2006
Small xbsi change
Added 2 commandline options, '-xbsi1' and '-xbsi2'. These must be used if the user wants the extended bit stream info to be written to the AC-3 file. Even if options in the extended bit stream info are specified on the commandline, they will not be encoded without explicitly adding the corresponding xbsi option. This was done to simplify the option parsing.
Header Reorganization
As one step forward in creating a libaften, I have reorganized the structs to provide more complete separation of aften.h as a public header from the other private headers.
Alternate Bit Stream Syntax
I have added 8 new commandline options to Aften which correspond to the parameters in the Alternate Bit Stream Syntax, which is defined by Annex C of the A/52A spec. These are metadata parameters which affect interpretation by the decoder and/or receiver. Run "aften -h" to view the new options.
Sunday, July 23, 2006
channel map changes
In order to support broken/incorrect/non-standard wav files, I have made a couple changes. The first was to detect use of either back-left/back-right or side-left/side-right as the stereo surround channels. The second was to allow the user to specify the audio coding mode and presence of the LFE channel on the commandline (e.g. "-acmod 7 -lfe 1"). Only the number of channels must match the specified settings.
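The channel-count check mentioned at the end can be sketched as below. The table lists the number of full-bandwidth channels for each AC-3 audio coding mode (0 is dual mono, 7 is 3 front plus 2 surround). The function name is made up, and the real check in Aften may be structured differently.

```python
# Sketch only: verify that the file's channel count matches the
# requested acmod and lfe settings. The function name is hypothetical.
FULL_BW_CHANNELS = {0: 2, 1: 1, 2: 2, 3: 3, 4: 3, 5: 4, 6: 4, 7: 5}


def channels_match(wav_channels, acmod, lfe):
    """True when the wav channel count equals acmod channels plus LFE."""
    expected = FULL_BW_CHANNELS[acmod] + (1 if lfe else 0)
    return wav_channels == expected
```

With "-acmod 7 -lfe 1" the input would need 6 channels; the same 6-channel file with "-lfe 0" would be rejected.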
Saturday, July 22, 2006
First post
Aftenblog is open. This is a development blog for Aften, an AC3 encoder. I created it so I can have a space to post program updates and general thoughts regarding Aften and the AC3 format.