nesdev.com
http://forums.nesdev.com/

Compression benchmarks
http://forums.nesdev.com/viewtopic.php?f=2&t=13671
Page 1 of 2

Author:  calima [ Sun Dec 27, 2015 8:01 am ]
Post subject:  Compression benchmarks

Since nobody apparently had benchmarked the usual algos on the NES, I ran some. Feel free to copy to the wiki if you want.

Code:
Algo         | Ratio | Speed | License, comments
-------------------------------------------
zlib         | 0.527 | 6.4   | zlib, requires 0.7kb RAM -> not usable without extra RAM
exomizer mem | 0.581 | 9.4   | Free to use/modify; needs about 162 bytes of temp RAM, 4 of which need to be ZP
lzo          | 0.666 | 2.8   | GPL, no RAM required
lz5hc        | 0.688 | 2.5   | BSD, no RAM required
fastlz       | 0.725 | 3.9   | MIT, no RAM required
lz4hc        | 0.731 | 5.8   | BSD, no RAM required - optimized asm is 39.5kb/s


Zlib is the asm-optimized version included in cc65; the others were compiled C, and it shows in the speed. Speed is decompression in kb/s. The test data was the first 768 bytes of the GPL, i.e. English text, comparable in compressibility to the level data you'd usually use.

I started with the assumption that LZO would be the best of the no-RAM algos, and indeed it was. Its speed was a surprise, however: on desktops it is extremely fast, but here it's less than half the speed of zlib. Being GPL it is not usable for many closed projects, though commercial licenses are available.

I might try others if I find some; only LZO advertises being low-RAM.

edit: Added lz5hc.

Author:  dougeff [ Sun Dec 27, 2015 10:22 am ]
Post subject:  Re: Compression benchmarks

Forgive my ignorance, but are these different methods of compressing in-game graphics?

I'm only familiar with RLE, not included here.

Do you have any links to NES code which actually use the compression methods discussed here?

Author:  tokumaru [ Sun Dec 27, 2015 11:58 am ]
Post subject:  Re: Compression benchmarks

dougeff wrote:
Forgive my ignorance, but are these different methods of compressing in-game graphics?

Due to the small amount of RAM on the NES, graphics are usually the only thing compressed with traditional techniques. Some do use an extra 8KB of WRAM to store level maps, and these could also benefit from these traditional compression algorithms.

Quote:
I'm only familiar with RLE, not included here.

RLE is, like, the most basic compression scheme ever, but even it can be implemented in a lot of different ways that'll slightly affect the final compression ratios. Still, all of the other compression schemes mentioned here should outperform RLE by a reasonable margin, especially when compressing text.
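
To make the idea concrete, here is a minimal byte-oriented RLE codec in Python, a sketch of just one of the many possible variants (a count byte followed by a value byte); practical NES schemes usually add a literal-run mode so incompressible data doesn't double in size:

```python
def rle_encode(data: bytes) -> bytes:
    """Naive RLE: emit (count, value) pairs, runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes([run, data[i]])
        i += run
    return bytes(out)

def rle_decode(data: bytes) -> bytes:
    """Expand each (count, value) pair back into `count` copies of `value`."""
    out = bytearray()
    for i in range(0, len(data), 2):
        count, value = data[i], data[i + 1]
        out += bytes([value]) * count
    return bytes(out)
```

Note that on runless data this exact variant doubles the size, which is one reason the schemes benchmarked above beat it handily on text.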

Quote:
Do you have any links to NES code which actually use the compression methods discussed here?

I believe there are 6502 implementations of decompression routines for some of these schemes, but if you want to use them for graphics you must modify all the code that deals with the output stream to use $2006/$2007 for reads and writes.

Author:  JRoatch [ Sun Dec 27, 2015 1:06 pm ]
Post subject:  Re: Compression benchmarks

lz4hc seems to be the best of the four tested (considering RAM first, then time), but would still be painfully slow. 5.8 kb/s is only about 12 bytes per NTSC field, assuming 100% CPU usage.
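
For what it's worth, that figure checks out if the benchmark's "kb/s" is read as kilobits per second (an assumption on my part; the benchmark post doesn't specify the unit):

```python
# Assuming 5.8 kb/s means kilobits per second, and ~60 NTSC fields per second:
bytes_per_second = 5.8 * 1000 / 8       # = 725 bytes/s
bytes_per_field = bytes_per_second / 60
print(round(bytes_per_field, 1))        # prints 12.1
```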

Author:  Bregalad [ Sun Dec 27, 2015 1:11 pm ]
Post subject:  Re: Compression benchmarks

I'm not sure what the data you're showing is, but it almost certainly does not make sense in a "general purpose" way. A good compression algorithm exploits properties of the data it compresses, so it really depends on the data. This is why I wrote my own tool that does statistical analysis using many different low-resource algorithms.

The NES has 2kb RAM so an algorithm that requires 0.7kb of RAM is certainly usable without extra RAM.

The ROM usage is certainly more important than RAM usage. If the decompression algorithm is large, it kills off the savings you get from compressing data.

Author:  calima [ Sun Dec 27, 2015 1:24 pm ]
Post subject:  Re: Compression benchmarks

dougeff wrote:
Forgive my ignorance, but are these different methods of compressing in-game graphics?

Do you have any links to NES code which actually use the compression methods discussed here?

They can be used for graphics with CHR RAM, but I intend to use them on level data and such. No public examples that I know of, and for the decompression I used the latest release of each.

43110 wrote:
lz4hc seems to be the best of the four tested (considering RAM first, then time), but would still be painfully slow. 5.8 kb/s is only about 12 bytes per NTSC field, assuming 100% CPU usage.


When using it to compress the level data, which must be under 2kb, that means a sub-0.5s level loading time, which is quite good. Loading can happen on a black screen, so all CPU time can be used.

Bregalad wrote:
The NES has 2kb RAM so an algorithm that requires 0.7kb of RAM is certainly usable without extra RAM.


There usually isn't that much free. In a game I did previously, there was less than 200 bytes of free BSS, so zlib couldn't be used. To keep that much free, you pretty much have to design for it from the start.

Quote:
The ROM usage is certainly more important than RAM usage. If the decompression algorithm is large, it kills off the savings you get from compressing data.


Yes. However, level data may total 10kb or more; compression that cuts that in half can cost a kb or two of decompressor code and still be a big win.

Author:  Dwedit [ Sun Dec 27, 2015 4:15 pm ]
Post subject:  Re: Compression benchmarks

Did anyone ever make a 6502 version of APLIB? I've made ARM and Z80 versions.

Author:  tokumaru [ Sun Dec 27, 2015 6:08 pm ]
Post subject:  Re: Compression benchmarks

Bregalad wrote:
The NES has 2kb RAM so an algorithm that requires 0.7kb of RAM is certainly usable without extra RAM.

Maybe between levels, or other times when you're not keeping track of an entire virtual world. I seriously doubt that most NES games will have 0.7KB of free RAM during gameplay.

Author:  JRoatch [ Sun Dec 27, 2015 6:51 pm ]
Post subject:  Re: Compression benchmarks

Dwedit wrote:
Did anyone ever make a 6502 version of APLIB? I've made ARM and Z80 versions.

The version mic_ wrote in 2010 is still available, and was successfully used for the NES version of Streemerz.

Author:  calima [ Mon Dec 28, 2015 2:06 am ]
Post subject:  Re: Compression benchmarks

Too bad aplib's compressor is closed source.

Author:  TOUKO [ Mon Dec 28, 2015 3:58 am ]
Post subject:  Re: Compression benchmarks

You have the LZ4 one:
http://www.brutaldeluxe.fr/products/crossdevtools/lz4/

It's for the 65816 CPU, but a 6502 version exists somewhere.

It's a fast decompressor (real-time oriented), but of course it doesn't have the best ratio.

Author:  Dwedit [ Mon Dec 28, 2015 8:46 am ]
Post subject:  Re: Compression benchmarks

I don't know if there's a good random-access compression scheme (LZ78-style) out there. Zelda Oracles used something like that for the dungeon maps.

Author:  tepples [ Mon Dec 28, 2015 8:54 am ]
Post subject:  Re: Compression benchmarks

The Legend of Zelda: Oracle of * also had the Game Boy Color's 32K of RAM (4K fixed, the rest in switchable 4K banks) and 16K of VRAM, which is easily randomly accessible while rendering is off, to work with. For random access with little RAM, such as on an NES without extra RAM on the cartridge, you need to use a static dictionary in ROM.

For text, try Huffword or byte pair, with pointers to each document (e.g. each page of text or each line of dialogue). I used byte pair for my robotfindskitten port, and I was going to use Huffword for an e-book reader until the project ran into serious emulator bugs.
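
As a sketch of the byte-pair idea (the general technique, not the actual robotfindskitten code): repeatedly replace the most frequent adjacent byte pair with an unused byte value, store the pair table in ROM, and expand with a tiny stack at decode time. In Python:

```python
from collections import Counter

def bpe_compress(data: bytes, free_codes) -> tuple:
    """Greedy byte-pair encoding: each code in `free_codes` (byte values
    that never occur in the input, e.g. 128..255 for ASCII text) comes
    to stand for one pair of adjacent bytes."""
    data = bytearray(data)
    table = {}
    for code in free_codes:
        pairs = Counter(zip(data, data[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < 2:
            break                       # no pair worth replacing
        table[code] = (a, b)
        out = bytearray()
        i = 0
        while i < len(data):
            if i + 1 < len(data) and data[i] == a and data[i + 1] == b:
                out.append(code)
                i += 2
            else:
                out.append(data[i])
                i += 1
        data = out
    return bytes(data), table

def bpe_expand(data: bytes, table: dict) -> bytes:
    """Decode by expanding codes; pairs may nest, hence the small stack."""
    out = bytearray()
    stack = list(reversed(data))
    while stack:
        byte = stack.pop()
        if byte in table:
            a, b = table[byte]
            stack += [b, a]             # push pair so `a` comes out first
        else:
            out.append(byte)
    return bytes(out)
```

Because decoding is just a ROM table lookup plus a few bytes of stack, it fits a tight RAM budget, and each document can be pointed to individually since no history crosses document boundaries.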

For map data, there's either multi-layer metatiles (Mega Man; Blaster Master; Sonic the Hedgehog) or object coding (Super Mario Bros. series). The Legend of Zelda for NES is known for using metatiles that are a column of the screen.

Another trick I like to use that generalizes RLE in a static-dictionary-ish way is based on a Markov chain. For each symbol, store the most common symbol that follows it, and then treat a "run" as a set of symbols each followed by its most common next symbol. (RLE is the special case of this where each symbol's most common next symbol is itself.) Background maps in Haunted: Halloween '85 for NES are stored this way with vertical runs. Level maps in Wrecking Ball Boy for Pygame are horizontal rows of tiles that are then Markov-expanded vertically.
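
Under my reading of that description, a sketch of the Markov-chain run scheme looks like this (the names are illustrative, not from any shipped code):

```python
from collections import Counter, defaultdict

def build_successor_table(data: bytes) -> list:
    """For each byte value, find the byte that most often follows it.
    This 256-entry table would live in ROM as the static 'dictionary'."""
    follows = defaultdict(Counter)
    for a, b in zip(data, data[1:]):
        follows[a][b] += 1
    return [follows[v].most_common(1)[0][0] if follows[v] else v
            for v in range(256)]

def markov_rle_decode(pairs, table) -> bytes:
    """Each (literal, run) pair emits the literal, then `run` more bytes,
    each predicted from the previous output byte via the table.
    Plain RLE is the special case where table[v] == v for every v."""
    out = bytearray()
    for literal, run in pairs:
        out.append(literal)
        for _ in range(run):
            out.append(table[out[-1]])
    return bytes(out)
```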

It's also useful to consider how you are encoding run lengths. The best method rate-wise depends on the distribution of run lengths and literal lengths; the distribution implied by popular RLE schemes such as PackBits may not be optimal. And on the NES, you often have to break VRAM uploads into fixed-size packets. I like to use a unary code, in which each bit of a control word can be 0 for a single literal symbol or 1 for part of previous run, to be fairly efficient on real data and easy to decode to a multiple of 8 bytes. LZSS in the Allegro 4 library and Game Boy Advance and Nintendo DS BIOS uses a variant of this unary code, where each bit of the control word is one way for a literal and the other way for a reference to previous data.
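
A sketch of that unary control-word decoder as I understand the description (the framing here is my own, not Allegro's or the GBA BIOS's exact bitstream):

```python
def unary_rle_decode(data: bytes) -> bytes:
    """Each control byte governs the next 8 output bytes (MSB first):
    a 0 bit copies one literal byte from the stream, a 1 bit repeats
    the previous output byte (i.e. continues the run).  The first bit
    of the stream must therefore be a literal.  Output length is always
    a multiple of 8, which suits fixed-size VRAM upload packets."""
    out = bytearray()
    pos = 0
    while pos < len(data):
        control = data[pos]
        pos += 1
        for bit in range(7, -1, -1):
            if control & (1 << bit):
                out.append(out[-1])       # part of the previous run
            else:
                out.append(data[pos])     # single literal symbol
                pos += 1
    return bytes(out)
```

For example, "ABBBBBBC" packs into the control byte 0b00111110 plus the three literals A, B, C.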

Author:  zzo38 [ Mon Dec 28, 2015 9:49 pm ]
Post subject:  Re: Compression benchmarks

It is true that RLE is another format, and there is also huffed RLE (which I have used for map data in one game). As tepples mentions, there are many other good ones too. Another thing you can do is decompress data from CHR ROM into RAM; if everything is read sequentially, and may even have parts of variable length, this can speed things up even more, but you would have to disable rendering in order to use it properly, so it might be used while loading a level or something like that (if you have small enough level sizes).

The kind of data you may compress might include:
  • Graphics
  • Map
  • Text
  • Music

Author:  Sik [ Sat Jan 02, 2016 9:34 am ]
Post subject:  Re: Compression benchmarks

tepples wrote:
I like to use a unary code, in which each bit of a control word can be 0 for a single literal symbol or 1 for part of previous run, to be fairly efficient on real data and easy to decode to a multiple of 8 bytes. LZSS in the Allegro 4 library and Game Boy Advance and Nintendo DS BIOS uses a variant of this unary code, where each bit of the control word is one way for a literal and the other way for a reference to previous data.

I don't think that qualifies under the definition of unary code you linked, though... (the tokens are always 0 or 1, not a string of bits). Also, what you've described applies to pretty much every LZ77-based scheme out there (LZSS's particularity is the use of a ring buffer instead of the already-decoded data, allowing more room to reuse strings, since repeated strings don't go into the buffer).
