Jim Leonard
2004-04-09 15:14:36 UTC
Hi, I'm working on a project that requires compression of 8-bit audio PCM data
by at least 2:1, and I'm trying to gather information on various methods so
that I can implement the best one for this project. I'm having a hard time
coming up with information and was hoping I could get some advice.
The project I'm working on is getting an original IBM PC/XT (a 4.77MHz 8088
with 640K of RAM) to display multimedia. The video portion I already have
worked out, but in order to deal with the incredibly slow hard drive I need to
compress the audio by at least 2:1 as well. The audio data to be compressed
is unsigned 8-bit PCM mono at either 22050 or 44100Hz.
For reference, I have already come up with a low-cycle routine; it's an
adaptive DPCM scheme where the differentials are represented by signed 4-bit
indexes pointing to a value in a table. I use varying frame sizes and lookup
tables per frame to minimize error; since the indexes are 4 bits and the
tables are only 16 bytes per frame, it achieves close to 2:1 compression on
8-bit data. I based the routine around using the 8088 opcode XLAT for speed;
XLAT replaces a source value with one from a table in memory in only six
cycles.
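To make the scheme above concrete, here is a minimal reference sketch of the decode step in Python (not the 8088 assembly itself). Several details are assumptions for illustration and are not stated in the post: the index is treated as a plain 0..15 offset into the frame's 16-entry table of signed deltas, the high nibble is decoded first, and the output is clamped to 0..255 rather than allowed to wrap. The names `decode_frame` and `frame_ratio` are hypothetical.

```python
def decode_frame(table, packed, prev):
    """Decode one frame of table-indexed 4-bit DPCM to unsigned 8-bit PCM.

    table  : 16 signed deltas for this frame (assumed layout)
    packed : bytes holding two 4-bit indexes each (high nibble first, assumed)
    prev   : last decoded sample before this frame, 0..255
    """
    out = bytearray()
    for byte in packed:
        for index in (byte >> 4, byte & 0x0F):
            # The table lookup here is the step XLAT would perform on the 8088.
            prev = max(0, min(255, prev + table[index]))  # clamp (assumed)
            out.append(prev)
    return bytes(out), prev


def frame_ratio(samples_per_frame, table_bytes=16):
    """Compression ratio for one frame: raw bytes / (packed nibbles + table)."""
    return samples_per_frame / (samples_per_frame / 2 + table_bytes)
```

With this accounting, a 1024-sample frame costs 512 bytes of indexes plus 16 bytes of table, so `frame_ratio(1024)` is about 1.94; shorter frames pay proportionally more for the table, which is why the result is "close to" 2:1 rather than exactly 2:1.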
But I can't help thinking there is a better algorithm for either smaller
sizes, or better quality at the same size. Because of the low CPU decode
requirements, there aren't very many methods to choose from. Here's a list of
what I've researched; I would very much appreciate additional suggestions or
comments on the list below:
- Apple ACE/MACE: I've tried to get information on Apple's ACE/MACE
algorithms (ACE achieves 3:1 and MACE achieves 6:1 or 8:1) but all I found is
decompression code that does not hint at the compression method.
- IMA ADPCM: This only outputs 16-bit signed data, so I'd have to shift the
output down to 8-bit ranges and invert the sign. I don't think that would
hurt the quality *too* much, but I haven't implemented it yet to find out.
Also, decompression speed on an 8088 is unknown. (And on a similar note,
are there any 8-bit implementations of IMA ADPCM?)
- 1-bit "halftoning" system as referenced in Patent #5,095,509: I can't grok
the legalese in the patent sufficiently to understand the process; it
appears to work by oversampling to 8x the sampling rate and dithering to
1-bit, like halftoning in printing, but I can't quite wrap my head around
this concept... any explanations out there?
- Creative's own 4/3/2-bit ADPCM: Sound Blaster cards can play back 4/3/2-bit
compressed audio in hardware, but other than their own VOC Editor, I can't
find any software that can create these files, nor anything that explains
the algorithm involved.
- John Ratcliff's ACOMP system: In an old Dr. Dobb's article, John Ratcliff
outlined a system similar to my own, except that he uses multiple bit depth
indexes and variable multipliers applied to the deltas referred to by the
indexes. He claims it achieves between 1.5:1 and 3:1 for music data, but I
haven't implemented it yet to verify the quality at those ratios. Also, I
think a mistake he made in his implementation is using linear tables, when
logarithmic tables would have been a better choice. Has anyone implemented
the ACOMP scheme for themselves?
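For the IMA ADPCM item in the list above, the "shift down to 8-bit and invert the sign" step can be sketched as follows. This is only an illustration of that conversion, not of IMA ADPCM decoding itself; the function name `s16_to_u8` is hypothetical, and plain truncation (no rounding or dithering) is an assumption.

```python
def s16_to_u8(sample):
    """Map a signed 16-bit sample (-32768..32767) to unsigned 8-bit (0..255).

    Keeps the high byte and flips its sign bit, which is the same as
    adding 128 to the signed high byte.
    """
    return ((sample >> 8) + 128) & 0xFF


def convert_block(samples):
    """Convert a sequence of signed 16-bit samples to unsigned 8-bit bytes."""
    return bytes(s16_to_u8(s) for s in samples)
```

On an 8088 this would amount to taking the high byte of each decoded word and XORing it with 80h, so the conversion itself should be cheap compared with the ADPCM decode.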
Any advice as to the best one? Are there additional schemes with simple
decompression that I may be overlooking?