Tile encoder/decoder

Discuss technical or other issues relating to programming the Nintendo Entertainment System, Famicom, or compatible systems. See the NESdev wiki for more information.

Moderator: Moderators

tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Post by tepples »

tokumaru wrote:The bad thing about making all of this configurable would making things even slower.
Not if it's decided at compile time.
Have you looked into the frequency distribution of the block lengths? I guess I will when I get a chance.
Just a little bit out of curiosity, but I don't know if we can do anything useful with it.
Currently, you're encoding the length of each block as a unary number: the number of 0s plus the following 1 gives the length. But unary is optimal for only one distribution: the geometric distribution with p=1/2. For different values of p, different Golomb codes become optimal. But if your distribution follows a power law, the universal codes become optimal.
User avatar
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

tepples wrote:Not if it's decided at compile time.
Could be, but it's easy to think of programs where both methods would be desirable. A programmer may want to load tiles while something is displayed on screen some times, or even during gameplay, but he will probably want tiles to load as quickly as possible when he's not displaying anything.
Currently, you're encoding the length of each block as a unary number: the number of 0s plus the following 1 gives the length.
I see what you mean. Depending on the lengths it might be better to use a constant number of bits (or a variable number of them that favors shorter lengths) to represent them.

From what I've seen, blocks tend to be quite short (less than 5 or 6 tiles maybe), because the number of colors that follow each color grows pretty quickly with typical tiles, and there would be no advantage in using this format if all colors could be followed by 3 colors. There are a few exceptions, like when a long series of tiles use only 2 colors for example, in which case it's more compact to have a long block.
User avatar
Disch
Posts: 1848
Joined: Wed Nov 10, 2004 6:47 pm

Post by Disch »

I'm having more troubles with that 4th tile. I'm guessing I'm still building the tables wrong:

Code: Select all

1110  Mode[3]=3:1,0,2
1110  Mode[2]=3:1,0,3
1111  Mode[1]=3:3,0,2
1111  Mode[0]=3:3,1,2
0 - New Row
11  -- clr[0]=3
1  -- clr[1]=3
1  -- clr[2]=3
1  -- clr[3]=3
1  -- clr[4]=3
1  -- clr[5]=3
01  -- clr[6]=1
01  -- clr[7]=3
__ ROW 0:  33333313
The row should be 33333300 -- but I'm not seeing how that is supposed to happen.
User avatar
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

Disch wrote:

Code: Select all

1110  Mode[3]=3:1,0,2
1110  Mode[2]=3:1,0,3
1111  Mode[1]=3:3,0,2
1111  Mode[0]=3:3,1,2
0 - New Row
11  -- clr[0]=3
1  -- clr[1]=3
1  -- clr[2]=3
1  -- clr[3]=3
1  -- clr[4]=3
1  -- clr[5]=3
01  -- clr[6]=1
01  -- clr[7]=3
__ ROW 0:  33333313
The row should be 33333300 -- but I'm not seeing how that is supposed to happen.
You interpreted the last few bits incorrectly. When row 6 starts, a 0 indicates it's not a repeat, so you need to read a new color. Since the previous color (3) can be followed by 3 others, the possible codes for the new pixel are 0, 10 or 11. After you read the 1, for some reason you stopped and picked the wrong color. You should have read another bit in order to know whether to use the second possibility (code 10) or the third (code 11).

Since the next bit is 0, you have to use the second possibility, which is color 0 (Mode[3]=3:1,0,2). Then the next bit is one, indicating that you should repeat color 0, which results in 33333300.

EDIT: Are you by any chance using the codes tepples mentioned in that other thread? I'm using different codes. Whenever there are 2 possibilities I use 0 and 1, and whenever there are 3 possibilities I use 0, 10 and 11.
User avatar
Disch
Posts: 1848
Joined: Wed Nov 10, 2004 6:47 pm

Post by Disch »

Yeah I was using tepples' thing as a reference.

So lemme get this straight:

1 = repeat clr
00 = table B
010 = table C
011 = table D

is that right??


EDIT:

success! After that change it works.

yay

Now I can work on a compressor (even though I'm super late. If only I had more than 2 hours a day to work on this).

I am confident I'll be able to achieve optimal compression, though ;D
Last edited by Disch on Thu Dec 24, 2009 12:13 am, edited 1 time in total.
User avatar
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

Correct.
I am confident I'll be able to achieve optimal compression, though ;D
Good luck!
User avatar
Memblers
Site Admin
Posts: 4044
Joined: Mon Sep 20, 2004 6:04 am
Location: Indianapolis
Contact:

Post by Memblers »

tokumaru wrote:OK, I'm considering modifying the decoder to work with rendering enabled or disabled, and the choice is left to the programmer. When decompressing with rendering enabled, the maximum number of tiles you want to decode per frame must be informed. Does that sound good?
That would be really cool, because there is nothing else available that does that. Paired with double-buffering, that could allow for some interesting animation possibilities by freeing up room for many more frames. Perhaps not in action games, but almost anywhere else.
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Post by tepples »

There are 9 instances of ReadBit. You could make it half a macro and half a subroutine to save a 12-cycle jsr/rts pair for 7 of the 8 bits in a compressed byte. For a 4 KiB segment compressed to 2500 bytes, this saves 7 frames at the cost of 36 bytes. Given how much you're already saving over the original Codemasters code, this might be worth it.

FROM the following:

Code: Select all

+ReadBit:

	;read a bit from the stream and return it in the carry flag
	asl BitBuffer
	bne +Return
+	lda (InputStream), y
	iny
	bne +
	inc InputStream+1
+	rol
	sta BitBuffer

+Return:

	;return
	rts
TO the following: (I'm not too familiar with asm6, so this is a sort of ca65/asm6 amalgamation)

Code: Select all

.macro ReadBit
	.local Return
	asl BitBuffer
	bne +Return
	jsr +ReadByte
+Return
.endmacro

+ReadByte:
	lda (InputStream), y
	iny
	bne +
	inc InputStream+1
+	rol
	sta BitBuffer
	rts
On several other forums I used to be active on (pocketheaven, gbadev), there was seldom a new library release thread without constructive criticism about its license. From readme.txt:
Feel free to use this in your projects if you'd like. You can also modify the compressor or the decompressor if you think you can improve them in any way. I just ask that you let me know if you make any improvements to these programs.
I take it that was supposed to be a request, much like the "giftware" license of the Allegro library. Or is it a requirement? It's OK if it's a requirement as long as one can fulfill it by just publishing the source code of the modified compressor or decompressor; in that case, it acts like a weak copyleft like the Classpath license, MPL, or LGPL.

(License discussion continues in another topic.)
User avatar
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

I wonder if Disch was able to make a better encoder than mine.

Disch, are you planning on changing anything about the format or are you just gonna improve the block-finding logic?

If anyone has any ideas for improvements to the format itself, it would be better if we decided on a final format right away, so that we can improve encoders and decoders without breaking compatibility with the format.

I made few changes to the original format, because I knew they would for sure save some extra bits, but if anyone thinks more radical changes (such as specifying block lengths with something else than unary codes) would be improve the compression ratios, we should discuss them.
User avatar
Disch
Posts: 1848
Joined: Wed Nov 10, 2004 6:47 pm

Post by Disch »

I've been unmotivated this weekend, so I didn't work much.

I don't plan on changing the format, no.
tomaitheous
Posts: 592
Joined: Thu Aug 28, 2008 1:17 am
Contact:

Post by tomaitheous »

tepples wrote:But then LZ77 works a lot better on platforms that have random access to VRAM (like Game Boy with rendering off), a work RAM big enough to hold a fairly large history (like Game Boy with rendering on or Super NES), or both (like GBA or PC). Methods with a static dictionary (e.g. Huffman, word encoding) and methods with a small state (e.g. RLE, CMM) work better when you have to decompress directly to a parallel port.
Actually, LZSS works just fine with port based destination when you setup the sliding window as a ring buffer. I do just this for PCE because the of the ram requirements of hucards (and even CD projects were it's all ram environment). I even have a decoder that will decode packed pixel to planar every so many bytes on the fly along with the ring buffer and port destination. 4bit packed pixel tiles compress much better than planar (composite planar like the snes), but not sure how much this applies to the NES planar format.

Pucrunch is what I use, with a smaller LZ window. The ring buffer is all handled on the decompressor side.
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Post by tepples »

tomaitheous wrote:
tepples wrote:But then LZ77 works a lot better on platforms that have [...] a work RAM big enough to hold a fairly large history[/b] (like Game Boy with rendering on or Super NES)
Actually, LZSS works just fine with port based destination when you setup the sliding window as a ring buffer.
The ring buffer would have to sit in "a work RAM big enough to hold a fairly large history". The NES has 2 KiB, and one-fourth of that is already spoken for (VRAM transfer buffer and stack at $0100-$01FF, and OAM transfer buffer at $0200-$02FF). Or do LZ77-based codecs still provide good compression even with a 256-byte window?
4bit packed pixel tiles compress much better than planar (composite planar like the snes), but not sure how much this applies to the NES planar format.
Have you tried compressing 2-bit packed pixel tiles? That's what CMM does. Even on PCE and SNES, 2-bit tiles often show up in things like fonts.
tomaitheous
Posts: 592
Joined: Thu Aug 28, 2008 1:17 am
Contact:

Post by tomaitheous »

tepples wrote:
tomaitheous wrote:
tepples wrote:But then LZ77 works a lot better on platforms that have [...] a work RAM big enough to hold a fairly large history[/b] (like Game Boy with rendering on or Super NES)
Actually, LZSS works just fine with port based destination when you setup the sliding window as a ring buffer.
The ring buffer would have to sit in "a work RAM big enough to hold a fairly large history". The NES has 2 KiB, and one-fourth of that is already spoken for (VRAM transfer buffer and stack at $0100-$01FF, and OAM transfer buffer at $0200-$02FF). Or do LZ77-based codecs still provide good compression even with a 256-byte window?
I found doing a 512 byte window with Pucrunch to be not much different in size that say 1024 or even 8k LZ window. But then again, Pucrunch is more than just LZSS. LZ77 is too inefficient. Or do you mean LZSS (which is a slight modification to LZ77)? But also, I doubt I would do a NES project that didn't use the extra ram @ $6000.


4bit packed pixel tiles compress much better than planar (composite planar like the snes), but not sure how much this applies to the NES planar format.
Have you tried compressing 2-bit packed pixel tiles? That's what CMM does. Even on PCE and SNES, 2-bit tiles often show up in things like fonts.
Haven't tried it yet. For font stuff, I tend to do VWF setups. So the font stays in rom until I need to decompress a letter (usually RLZ compression in native planar format) to send off to vram.
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Post by tepples »

tomaitheous wrote:Or do you mean LZSS (which is a slight modification to LZ77)?
By "LZ77" I refer to the entire LZ77 family of codecs based on (start, length) references to history, including LZSS, LZO, Pucrunch, Deflate, and the like, not just the first proposal that strictly alternated matches and literal symbols. For example, the GBA and DS BIOS calls that decompress a format equivalent to LZSS have "LZ77" in the name. But if an LZSS format with a 512-byte history works, great.
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Post by tepples »

BUMP: I finally got a chance to try this out.

To get the program to work on Ubuntu Karmic, I needed to add this at the top:

Code: Select all

#include <sys/stat.h>
long filelength(int fileno) {
  struct stat st;
  if (fstat(fileno, &st) < 0) return -1;
  return st.st_size;
}
And there were still warnings:

Code: Select all

compress.c: In function ‘readTiles’:
compress.c:85: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result
compress.c: In function ‘adjustLengths’:
compress.c:414: warning: ‘bestLength0’ may be used uninitialized in this function
compress.c:414: warning: ‘bestLength1’ may be used uninitialized in this function
"Please inform the source and destination files." isn't so clear. A help message more along the conventions of well-known developer tools would be "usage: compress INFILE.chr OUTFILE.cmm"

But I did manage to compress the 8192 bytes of my current project's CHR to 5337 bytes to give me an idea of what I could expect.
Post Reply