Compression benchmarks
Page 2 of 2

Author:  tepples [ Sat Jan 02, 2016 10:51 am ]
Post subject:  Re: Compression benchmarks

Sik wrote:
tepples wrote:
I find a unary code, in which each bit of a control word can be 0 for a single literal symbol or 1 for part of the previous run, to be fairly efficient on real data and easy to decode to a multiple of 8 bytes. The LZSS in the Allegro 4 library and in the Game Boy Advance and Nintendo DS BIOS uses a variant of this unary code, where each bit of the control word is one value for a literal and the other value for a reference to previous data.

I don't think that qualifies under the definition of unary code you linked, though... (the tokens are always 0 or 1, not a string of bits)

That's because unary coding and run-length encoding are inverses of each other. From an RLE point of view, PB8's repeat coding is a unary code.

0: Run of 1 byte, then change byte
10: Run of 2 bytes, then change byte
110: Run of 3 bytes, then change byte
1110: Run of 4 bytes, then change byte
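
To make that duality concrete, here's a minimal Python sketch (the function names are mine, and this is not PB8's actual bitstream layout) that encodes each run exactly as in the table above: n-1 one-bits, then a zero-bit, with the run's byte stored on the side:

```python
def unary_encode_runs(data):
    """Encode byte runs as unary: a run of n copies of a byte
    becomes (n-1) '1' bits followed by a '0' bit, plus the byte."""
    bits, syms = [], []
    i = 0
    while i < len(data):
        n = 1
        while i + n < len(data) and data[i + n] == data[i]:
            n += 1
        bits.extend([1] * (n - 1) + [0])  # unary code for the run length
        syms.append(data[i])
        i += n
    return bits, syms

def unary_decode_runs(bits, syms):
    """Inverse: for each byte, count '1' bits until a '0' ends the run."""
    out = []
    it = iter(bits)
    for b in syms:
        n = 1
        while next(it) == 1:
            n += 1
        out.extend([b] * n)
    return bytes(out)
```

Run from an RLE point of view, `b"aaabcc"` comes out as the bit string 110 0 10 with symbols a, b, c, matching the table.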

Also what you've described applies to pretty much every LZ77-based scheme out there

It models the probability of a back-reference as 50%, with the other 50% for a single literal byte, with no dependence on the previous runs. This is equivalent to runs of back-references and runs of literals each having a geometric distribution to their lengths. Unary codes and other Golomb codes are ideal for geometric distributions, with the Golomb code's M parameter depending on the distribution's p parameter. Optimally, M should be close to -log(2)/log(1 - p), with the unary code corresponding to M = 1 or p = .5. Most schemes branded as "RLE", on the other hand, treat all run lengths up to 32 or 128 or whatever as equally likely.
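
A quick sanity check of that formula in Python (`golomb_m` is my own helper, not from any library):

```python
import math

def golomb_m(p):
    """Near-optimal Golomb parameter M for run lengths drawn from a
    geometric distribution, where p is the probability that a run
    ends at any given symbol.  M = 1 recovers the plain unary code."""
    return max(1, round(-math.log(2) / math.log(1 - p)))
```

For p = .5 this gives M = 1, the unary code; rarer run endings (say p = .1, i.e. long runs are common) push M up to 7.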

Another option for coding run lengths is the exp-Golomb code. It replaces the unary code for the quotient with the Elias gamma code, resulting in longer codes for values 1 and 3 (runs of length 2 or 4) in exchange for shorter codes for longer runs. Exp-Golomb is ideal for power-law (Zipf) distributions.
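
A minimal sketch of the order-0 exp-Golomb code (my own helper, returning the code as a bit string):

```python
def exp_golomb(n):
    """Order-0 exp-Golomb code for a non-negative integer n:
    the Elias gamma code of n + 1, i.e. write n + 1 in binary and
    prefix it with (bit length - 1) zero bits."""
    b = bin(n + 1)[2:]            # binary representation of n + 1
    return "0" * (len(b) - 1) + b
```

Value 1 comes out as "010" (3 bits vs unary's "10") and value 3 as "00100" (5 bits vs "1110"), but a run of length 32 (value 31) takes only 11 bits where unary would take 32.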

(LZSS's particularity is the use of a ring buffer instead of the already decoded data, allowing for more room to reuse strings as repeated strings don't go into the buffer).

Repeated strings do go into the ring buffer, at least in Allegro 4 LZSS (which is equivalent to GBA LZSS with different bit ordering). The use of a ring buffer is to allow encoding or decoding without keeping the entire uncompressed stream in RAM at once. The difference between LZSS and the format described in Ziv and Lempel's 1977 paper is that original LZ77 assumed that all back references would be followed by exactly one literal and potentially have zero length, while LZSS allows back-references to follow one another directly.
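
To make the literal/back-reference split concrete, here's a simplified Python decoder over pre-parsed tokens. This is not the actual GBA/Allegro bit layout (the token format here is invented for illustration), just the scheme described above; the byte-by-byte copy also shows why repeated strings do land back in the decoded window:

```python
def lzss_decode(tokens):
    """Decode a simplified LZSS token stream: each token is either
    ('lit', byte) or ('ref', distance, length), mirroring the one
    control bit per token.  Back-references copy from already-decoded
    output, so repeats become part of the window too."""
    out = bytearray()
    for t in tokens:
        if t[0] == 'lit':
            out.append(t[1])
        else:
            _, dist, length = t
            for _ in range(length):   # byte-by-byte copy allows overlap
                out.append(out[-dist])
    return bytes(out)
```

Note that a length longer than the distance is legal and repeats the data, e.g. two literals "ab" followed by a distance-2, length-4 reference decode to "ababab".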

Author:  calima [ Sun Jan 03, 2016 3:17 am ]
Post subject:  Re: Compression benchmarks

Guys, please keep compression theory in a separate topic. This topic's intention was to have data on existing open-source compression algorithms' performance on the NES, not to argue what is the theoretical best.

Author:  TOUKO [ Sun Jan 03, 2016 9:25 am ]
Post subject:  Re: Compression benchmarks

If you are interested in an LZ4 decompressor, you can find a 6502 implementation here:

Author:  calima [ Thu Jan 21, 2016 5:18 am ]
Post subject:  Re: Compression benchmarks

Added lz5hc results. It's a new variant that performs worse than LZO in both metrics, but it's liberally licensed.

Author:  calima [ Tue Jun 06, 2017 11:41 am ]
Post subject:  Re: Compression benchmarks

I've made an asm version of lz4 that will probably soon be a part of cc65. Its speed has been posted on the first page, ~7x faster than the C version.

I also made a version that unpacks to VRAM. Rough speeds for a full 1024-byte nametable:
- rle takes 0.5 frames
- uncompressed takes 1.3 frames
- lz4 takes 2.8 frames

It's slower, but far from being an issue, as unpacking a nametable happens during a black screen anyway. The compression tends to be better than the Shiru version of RLE.

Author:  Oziphantom [ Tue Jun 06, 2017 10:53 pm ]
Post subject:  Re: Compression benchmarks

Can you post your "samples", the things you compress to make the tests? I have some other methods I would like to compare.

Author:  calima [ Wed Jun 07, 2017 1:36 am ]
Post subject:  Re: Compression benchmarks

As mentioned, the base tests were done with the first 768 bytes of the GPL ( ... /srcdata.h). I can't post the nametable I tested, as it's a part of an upcoming MegaCat game.

Author:  Oziphantom [ Wed Jun 07, 2017 3:57 am ]
Post subject:  Re: Compression benchmarks

Algo         | Ratio  | Speed | License, comments
exomizer mem | 0.5807 | 9.4   | Free to use/modify; needs about 162 bytes of temp RAM, 4 of which need to be ZP

Speed is based on 143492 cycles for 768 bytes, extrapolated to estimated NES speeds. It assumes a self-modifying Next Byte routine outside of ZP, with all variables stored in ZP. Using an indirect vector would save RAM but cost cycles; a ZP-based get-next-byte routine would gain more speed. Data was packed in reverse, with literals.
exomizer's 'mem' mode is 446 bytes, the 'sfx' self-extracting program is 696, and the RAW decoder used for mem is 189 bytes, with literal support.

Author:  calima [ Wed Jun 07, 2017 10:10 am ]
Post subject:  Re: Compression benchmarks

Added, thanks.
