Compression benchmarks

calima

Compression benchmarks

Post by calima »

Since apparently nobody had benchmarked the usual algos on the NES, I ran some benchmarks myself. Feel free to copy this to the wiki if you want.

Code:

Algo         | Ratio | Speed (KB/s) | License, comments
-------------|-------|--------------|--------------------------------------------
zlib         | 0.527 | 6.4          | zlib license; needs ~0.7 KB of RAM -> not usable without extra RAM
exomizer mem | 0.581 | 9.4          | Free to use/modify; needs about 162 bytes of temp RAM, 4 of which must be in ZP
lzo          | 0.666 | 2.8          | GPL; no RAM required
lz5hc        | 0.688 | 2.5          | BSD; no RAM required
fastlz       | 0.725 | 3.9          | MIT; no RAM required
lz4hc        | 0.731 | 5.8          | BSD; no RAM required; optimized asm reaches 39.5 KB/s
zlib is the asm-optimized version included in cc65; the others were compiled C, and it shows in the speed. Speed is decompression throughput in KB/s, and Ratio is compressed size divided by original size (lower is better). The test data was the first 768 bytes of the GPL, i.e. English text, comparable in compressibility to typical level data.
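
As an illustration (not the exact benchmark harness), here is roughly what a call into one of the no-RAM C decompressors looks like, using FastLZ's stock entry point. The buffer names and the extern compressed blob are made-up placeholders.

Code:

/* Sketch of invoking FastLZ's decompressor from cc65-compiled C.
   compressed_level/compressed_len are hypothetical symbols that a
   ROM build step would provide. */
#include "fastlz.h"

#define LEVEL_MAX 768

extern const unsigned char compressed_level[];
extern const unsigned int compressed_len;

static unsigned char level[LEVEL_MAX];

void load_level(void)
{
    /* fastlz_decompress returns the decompressed size, or 0 on error */
    fastlz_decompress(compressed_level, (int)compressed_len,
                      level, LEVEL_MAX);
}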

I started with the assumption that LZO would be the best of the no-RAM algos, and indeed it was. Its speed was a surprise, however: on desktops it is extremely fast, but here it runs at less than half the speed of zlib. Being GPL, it is not usable for many closed projects, though commercial licenses are available.

I might try others if I find some; LZO is the only one that advertises low RAM usage.

edit: Added lz5hc.
dougeff

Re: Compression benchmarks

Post by dougeff »

Forgive my ignorance, but are these different methods of compressing in-game graphics?

I'm only familiar with RLE, not included here.

Do you have any links to NES code that actually uses the compression methods discussed here?
tokumaru

Re: Compression benchmarks

Post by tokumaru »

dougeff wrote:Forgive my ignorance, but are these different methods of compressing in-game graphics?
Due to the small amount of RAM on the NES, graphics are usually the only thing compressed with traditional techniques. Some games do use an extra 8 KB of WRAM to store level maps, and those maps could also benefit from traditional compression algorithms.
I'm only familiar with RLE, not included here.
RLE is, like, the most basic compression scheme ever, but even it can be implemented in many different ways that slightly affect the final compression ratio. Still, all of the other compression schemes mentioned here should outperform RLE by a reasonable margin, especially when compressing text.
Do you have any links to NES code that actually uses the compression methods discussed here?
I believe there are 6502 implementations of decompression routines for some of these schemes, but if you want to use them for graphics, you must modify all the code that deals with the output stream to use $2006/$2007 for reads and writes.
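
A minimal sketch of what that redirection looks like in C, with hypothetical helper names (rendering must be disabled while this runs):

Code:

/* Redirecting a decompressor's output stream to VRAM through the
   standard PPU ports $2006/$2007. */
#define PPU_ADDR (*(volatile unsigned char *)0x2006)
#define PPU_DATA (*(volatile unsigned char *)0x2007)

static void vram_seek(unsigned int addr)
{
    PPU_ADDR = (unsigned char)(addr >> 8);  /* high byte first */
    PPU_ADDR = (unsigned char)addr;
}

/* Literal byte: replaces "*out++ = b" in the original routine. */
static void vram_put(unsigned char b)
{
    PPU_DATA = b;  /* writing $2007 auto-increments the VRAM address */
}

/* LZ match: re-reads previously written output through $2007. Two
   seeks plus a dummy read per byte is a big part of why VRAM-targeted
   LZ decompression is slow. */
static void vram_copy_match(unsigned int dst, unsigned int src,
                            unsigned char len)
{
    unsigned char b;
    while (len--) {
        vram_seek(src++);
        b = PPU_DATA;  /* dummy read primes the internal read buffer */
        b = PPU_DATA;  /* the actual byte at src */
        vram_seek(dst++);
        PPU_DATA = b;
    }
}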
JRoatch
Formerly 43110

Re: Compression benchmarks

Post by JRoatch »

lz4hc seems to be the best of the no-RAM options tested (considering RAM first, then time), but it would still be painfully slow: 5.8 KB/s is only about 100 bytes per NTSC field, assuming 100% CPU usage.
Bregalad

Re: Compression benchmarks

Post by Bregalad »

I'm not sure what the data you're showing actually is, but it almost certainly doesn't mean much in a "general purpose" way. A good compression algorithm exploits properties of the data it compresses, so the results really depend on the data. This is why I wrote my own tool that does statistical analysis using many different low-resource algorithms.

The NES has 2 KB of RAM, so an algorithm that requires 0.7 KB of RAM is certainly usable without extra RAM.

The ROM usage is certainly more important than RAM usage. If the decompression routine is large, it kills off the savings you get by compressing the data.
calima

Re: Compression benchmarks

Post by calima »

dougeff wrote:Forgive my ignorance, but are these different methods of compressing in-game graphics?

Do you have any links to NES code that actually uses the compression methods discussed here?
They can be used for graphics with CHR RAM, but I intend to use them on level data and such. No public examples that I know of, and for the decompression code I used the latest release of each.
43110 wrote:lz4hc seems to be the best of the no-RAM options tested (considering RAM first, then time), but it would still be painfully slow: 5.8 KB/s is only about 100 bytes per NTSC field, assuming 100% CPU usage.
When using it to compress level data, which must stay under 2 KB, that means a sub-0.5 s level loading time (2 KB at 5.8 KB/s is about 0.35 s), which is quite good. Loading can happen on a black screen, so all CPU time can be used.
Bregalad wrote:The NES has 2 KB of RAM, so an algorithm that requires 0.7 KB of RAM is certainly usable without extra RAM.
There usually isn't that much free. In a game I did previously, there was less than 200 bytes of free BSS, so zlib couldn't have been used. To keep that much free, you pretty much have to design for it from the start.
The ROM usage is certainly more important than RAM usage. If the decompression routine is large, it kills off the savings you get by compressing the data.
Yes. However, level data may total 10 KB or more; a compressor that cuts that in half can cost a KB or two of ROM and still be a big win.
Dwedit

Re: Compression benchmarks

Post by Dwedit »

Did anyone ever make a 6502 version of aPLib? I've made ARM and Z80 versions.
tokumaru

Re: Compression benchmarks

Post by tokumaru »

Bregalad wrote:The NES has 2kb RAM so an algorithm that requires 0.7kb of RAM is certainly usable without extra RAM.
Maybe between levels, or at other times when you're not keeping track of an entire virtual world. I seriously doubt that most NES games have 0.7 KB of free RAM during gameplay.
JRoatch
Formerly 43110

Re: Compression benchmarks

Post by JRoatch »

Dwedit wrote:Did anyone ever make a 6502 version of aPLib? I've made ARM and Z80 versions.
The version mic_ wrote in 2010 is still available, and it was successfully used for the NES version of Streemerz.
calima

Re: Compression benchmarks

Post by calima »

Too bad aPLib's compressor is closed source.
TOUKO

Re: Compression benchmarks

Post by TOUKO »

You have the LZ4 one:
http://www.brutaldeluxe.fr/products/crossdevtools/lz4/

It's for the 65816 CPU, but a 6502 version exists somewhere.

It's a fast decompressor (real-time oriented), but of course it doesn't have the best ratio.
Dwedit

Re: Compression benchmarks

Post by Dwedit »

I don't know if there's a good random-access compression scheme (LZ78-style) out there. The Zelda Oracle games used something like that for their dungeon maps.
tepples

Re: Compression benchmarks

Post by tepples »

The Legend of Zelda: Oracle of * also had the Game Boy Color's 32K of RAM (a fixed 4K bank plus seven switchable 4K banks) and 16K of VRAM to work with; the VRAM is easily randomly accessible while rendering is off. For random access with little RAM, such as on an NES without extra RAM on the cartridge, you need to use a static dictionary in ROM.

For text, try Huffword or byte pair, with pointers to each document (e.g. each page of text or each line of dialogue). I used byte pair for my robotfindskitten port, and I was going to use Huffword for an e-book reader until the project ran into serious emulator bugs.
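
A byte pair decoder is tiny; here is a sketch of the usual expansion loop. The pair table layout and the 0x80 split are illustrative assumptions, not the robotfindskitten format.

Code:

/* Byte pair expansion: codes >= 0x80 index a table of two-byte pairs
   in ROM and are expanded recursively with a small explicit stack. */
extern const unsigned char pair_table[128][2];

void bytepair_decode(const unsigned char *src, unsigned int len,
                     void (*emit)(unsigned char))
{
    unsigned char stack[16];  /* depth is bounded by the encoder */
    unsigned char sp, c;

    while (len--) {
        sp = 0;
        stack[sp++] = *src++;
        while (sp) {
            c = stack[--sp];
            if (c < 0x80) {
                emit(c);  /* plain byte passes through */
            } else {
                /* push the second half first so the first half pops next */
                stack[sp++] = pair_table[c - 0x80][1];
                stack[sp++] = pair_table[c - 0x80][0];
            }
        }
    }
}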

For map data, there's either multi-layer metatiles (Mega Man; Blaster Master; Sonic the Hedgehog) or object coding (the Super Mario Bros. series). The Legend of Zelda for NES is known for using metatiles that each span a full column of the screen.
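
To make the metatile idea concrete, here is a sketch of expanding one row of 2x2 metatiles into name table bytes; the four-table layout is an assumption for illustration, not any particular game's format.

Code:

/* 2x2 metatile expansion: each map byte indexes four tile tables in
   ROM (a struct-of-arrays layout that suits 6502 indexed addressing). */
extern const unsigned char mt_topleft[256], mt_topright[256],
                           mt_botleft[256], mt_botright[256];

void expand_metatile_row(const unsigned char *map, unsigned char w,
                         unsigned char *nt_top, unsigned char *nt_bot)
{
    unsigned char x, m;
    for (x = 0; x < w; ++x) {
        m = map[x];
        nt_top[2 * x]     = mt_topleft[m];
        nt_top[2 * x + 1] = mt_topright[m];
        nt_bot[2 * x]     = mt_botleft[m];
        nt_bot[2 * x + 1] = mt_botright[m];
    }
}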

Another trick I like, which generalizes RLE in a static-dictionary-ish way, is based on a Markov chain. For each symbol, store the most common symbol that follows it, and then treat a "run" as a series of symbols, each followed by its most common successor. (RLE is the special case of this where each symbol's most common successor is itself.) Background maps in Haunted: Halloween '85 for NES are stored this way with vertical runs, and level maps in Wrecking Ball Boy for Pygame are horizontal rows of tiles that are then Markov-expanded vertically.
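
A sketch of a decoder for this scheme; the (symbol, extra run length) stream format is an assumption for illustration, not the actual Haunted: Halloween '85 format.

Code:

/* Markov-chain run expansion. most_common_next[s] holds the most
   common symbol that follows s; plain RLE is the special case where
   most_common_next[s] == s. */
extern const unsigned char most_common_next[256];  /* in ROM */

unsigned int markov_rle_decode(const unsigned char *src,
                               unsigned int pairs, unsigned char *dst)
{
    unsigned int out = 0;
    unsigned char sym, run;

    while (pairs--) {
        sym = *src++;
        run = *src++;  /* predicted symbols following the literal */
        dst[out++] = sym;
        while (run--) {
            sym = most_common_next[sym];  /* follow the chain */
            dst[out++] = sym;
        }
    }
    return out;
}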

It's also useful to consider how you encode run lengths. The best method rate-wise depends on the distribution of run lengths and literal lengths, and the distribution implied by popular RLE schemes such as PackBits may not be optimal. On the NES, you also often have to break VRAM uploads into fixed-size packets. I like to use a unary code, in which each bit of a control word is 0 for a single literal symbol or 1 for one more symbol of the previous run; it is fairly efficient on real data and easy to decode to a multiple of 8 bytes. The LZSS in the Allegro 4 library and in the Game Boy Advance and Nintendo DS BIOS uses a variant of this unary code, where each bit of the control word is one value for a literal and the other for a reference to previous data.
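
A sketch of that control-word decoding loop, reusing the Markov table from the sketch above to stand in for "part of previous run" (plain repeat-the-last-byte RLE falls out when the table maps each symbol to itself):

Code:

/* Unary-coded control bytes: each bit is 0 for a literal from the
   stream, 1 for one more predicted symbol of the current run, so each
   control byte decodes to exactly 8 output bytes. */
extern const unsigned char most_common_next[256];  /* see sketch above */

unsigned int unary_rle_decode(const unsigned char *src,
                              unsigned int n_ctrl, unsigned char *dst)
{
    unsigned int out = 0;
    unsigned char ctrl, bit;
    unsigned char sym = 0;  /* last output symbol */

    while (n_ctrl--) {
        ctrl = *src++;
        for (bit = 0; bit < 8; ++bit) {
            if (ctrl & 0x80)
                sym = most_common_next[sym];  /* 1: continue the run */
            else
                sym = *src++;                 /* 0: literal symbol */
            dst[out++] = sym;
            ctrl <<= 1;
        }
    }
    return out;  /* always a multiple of 8 */
}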
zzo38

Re: Compression benchmarks

Post by zzo38 »

It is true that RLE is another format, and there is also huffed RLE (which I have used for map data in one game). As tepples mentions, there are many other good ones too. Another thing you can do is decompress data straight out of CHR ROM into RAM; if everything is read sequentially, and parts may even have variable length, this may speed it up even more, but you would have to disable rendering to use it properly, so it might be used while loading a level or something like that (if your levels are small enough).
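
A sketch of that sequential CHR ROM read, with hypothetical helper names; rendering must stay off while the stream is in use.

Code:

/* Streaming bytes out of CHR ROM through the PPU data port. Reading
   $2007 auto-increments the address, so variable-length records need
   no pointer bookkeeping on the CPU side. */
#define PPU_ADDR (*(volatile unsigned char *)0x2006)
#define PPU_DATA (*(volatile unsigned char *)0x2007)

static void chr_stream_open(unsigned int addr)
{
    unsigned char dummy;
    PPU_ADDR = (unsigned char)(addr >> 8);
    PPU_ADDR = (unsigned char)addr;
    dummy = PPU_DATA;  /* throwaway read primes the $2007 buffer */
    (void)dummy;
}

static unsigned char chr_stream_next(void)
{
    return PPU_DATA;  /* each read advances to the next CHR byte */
}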

The kind of data you may compress might include:
  • Graphics
  • Map
  • Text
  • Music
Sik

Re: Compression benchmarks

Post by Sik »

tepples wrote:I like to use a unary code, in which each bit of a control word is 0 for a single literal symbol or 1 for one more symbol of the previous run; it is fairly efficient on real data and easy to decode to a multiple of 8 bytes. The LZSS in the Allegro 4 library and in the Game Boy Advance and Nintendo DS BIOS uses a variant of this unary code, where each bit of the control word is one value for a literal and the other for a reference to previous data.
I don't think that qualifies under the definition of unary code you linked, though (the tokens are always 0 or 1, not a string of bits). Also, what you've described applies to pretty much every LZ77-based scheme out there. LZSS's particularity is the use of a ring buffer instead of the already-decoded data, allowing more room to reuse strings, since repeated strings don't go into the buffer.