It is currently Sat Dec 16, 2017 11:25 pm

All times are UTC - 7 hours





Post new topic Reply to topic  [ 84 posts ]  Go to page Previous  1, 2, 3, 4, 5, 6  Next
Author Message
 Post subject:
PostPosted: Tue Dec 22, 2009 12:13 am 
Offline
User avatar

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10169
Location: Rio de Janeiro - Brazil
OK, it appears to be working great now! The compressor has been updated (download from the same location as before), and now it compresses better than Codemasters' compressor did (at least for the few tilesets I tried).

I did something tepples suggested me a few days ago: messing with the lengths of the blocks gradually, making them longer or shorter, merging blocks, things like that, and it worked great. After adjusting all the lengths a few times (the program only stops when the compression ratio stops improving), a pretty decent compression ratio is achieved. Compare it against other compressors/formats and draw your own conclusions.

I'm not sure if the compression ratio can be improved any further, but my guess is it can. I believe this is beyond my level of intelligence though, so I probably won't try it. I'm just glad I was able to make it slightly better than the scheme I was basing myself on.

Anyway, I hope that there are no bugs, but if you find anything just let me know. I hope this is useful for you guys that need some extra CHR compression. I'll probably use it in my projects from now own too.


Top
 Profile  
 
 Post subject:
PostPosted: Tue Dec 22, 2009 2:33 am 
Offline
User avatar

Joined: Fri Nov 19, 2004 7:35 pm
Posts: 3969
I just tested it on the last real 8192 bank of CHR in Zelda 2. The output size was 5163 bytes, and it beat LZMA (5371 bytes).

_________________
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!


Top
 Profile  
 
 Post subject:
PostPosted: Tue Dec 22, 2009 6:57 pm 
Offline
User avatar

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10169
Location: Rio de Janeiro - Brazil
Dwedit wrote:
The output size was 5163 bytes, and it beat LZMA (5371 bytes).

Yeah, that's expected, since LZMA is designed to work on arbitrary data and doesn't know the input consists of several bitmaps with dimensions of 8x8 pixels and color depth of 2 bpp.

The Codemasters codec is the best one I've seen used in NES games (I didn't check every single game, obviously). It seems that most other developers favored simplicity and settled for more modest compression ratios. I admire Codemasters for going this far in order to make better use of the ROM space they had. You know, this was not their first compression scheme, in older games they used a simpler one, and I think it's admirable that they continued to research better methods instead of sitting on their asses and just using what they already had.

Anyway, I'm not sure where to go from here. I'm more than satisfied with the compression ratio now, and I doubt I can improve it any further. I guess I'll try to make the decoder faster, even though I don't think speed is a big issue here... 2 seconds (I didn't time it, I'm just guessing) isn't such a long time to wait for the tiles to load. I don't want to sacrifice space though, so I'll only consider making it faster if I can do it without making it larger.


Top
 Profile  
 
 Post subject:
PostPosted: Tue Dec 22, 2009 9:34 pm 
Offline

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19353
Location: NE Indiana, USA (NTSC)
tokumaru wrote:
I guess I'll try to make the decoder faster, even though I don't think speed is a big issue here... 2 seconds (I didn't time it, I'm just guessing) isn't such a long time to wait for the tiles to load.

Two seconds to load 8 KiB? Four tiles a frame? Sounds like a Micronics game. Has anyone else watched the pattern tables during Ikari Warriors' opening sequence? Perhaps part of favoring simplicity comes from favoring speed.


Top
 Profile  
 
 Post subject:
PostPosted: Tue Dec 22, 2009 10:38 pm 
Offline
User avatar

Joined: Wed Nov 10, 2004 6:47 pm
Posts: 1845
My decompressor decoded the first 3 tiles of smb.bin correctly, but then it started screwing up.

After doing a bit of logging I found this to be my problem:

tokumaru wrote:
It is possible to repeat the first row of the first tile of a block, and a row filled with color 0 is used in this case.


I'm going to go out on limb here and say either:

- That's wrong
or
- That's not how smb.bin was encoded

The 4th tile in smb.bin starts a new block, and the first 4 rows of the 4th tilee are repeat rows. But if you look at SMB.chr the top 4 rows of the 4th tile are not zero filled.

Of course it could be that I'm misinterpretting.

Here's my ugly log that shows how I'm decoding. My apologies for the crudeness. Apparently I can't copy/paste from the terminal and photobucket seems to have blurred my png for some retarded reason:

http://i7.photobucket.com/albums/y282/D ... beelog.png

The row 6 and 7 at the top are the last 2 rows of the 3rd tile (they are decoded correctly)

Following that is the 4th tile. As you can see, after defining the tables, it opens with '1111' which is 4 zero'd rows.


Can you shed some light on this, tokumaru? I must be doing something wrong, right?

I'll upload my source if you want to peek, although it's pretty ugly.


Top
 Profile  
 
 Post subject:
PostPosted: Tue Dec 22, 2009 11:48 pm 
Offline
User avatar

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10169
Location: Rio de Janeiro - Brazil
Disch wrote:
tokumaru wrote:
It is possible to repeat the first row of the first tile of a block, and a row filled with color 0 is used in this case.


I'm going to go out on limb here and say either:

- That's wrong

I checked my decoder, and right after setting up all the tables it clears the row "buffer" (faking a row with all zeroes) and reads a bit to decide whether or not the next row is a repeat.

Quote:
or
- That's not how smb.bin was encoded

I just debugged the ROM that uses my decoder and I got the following statistics for the block that starts on the 4th tile:

color 3 is followed by 3 colors: 1, 0 and 2;
color 2 is followed by 2 colors: 0 and 3;
color 1 is followed by 3 colors: 3, 0 and 2;
color 0 is followed by 1 color: 2;

This is from the file encoded with the latest version of the encoder, but I just looked at your log and it seems you are using the file encoded with the old version. Well, if you switch to the new version the above is what you should get (there's a block starting at the 4th tile too, it just appears to have a different length).

Quote:
Following that is the 4th tile. As you can see, after defining the tables, it opens with '1111' which is 4 zero'd rows.

From your log it seems that you are decoding the tables wrong, which would completely screw the meaning of future bits. It seems to me that you are not reading a color from the file in case mode 3 is selected. You might have assumed a color is not necessary because it's obvious what the 3 colors are, but you do need to be given a color so that you know which of the 3 possibilities is the most frequent one, because it gets a shorter code than the others.

In mode 3, when pixels are drawn, the first possibility is given code 0, and the remaining two get codes 10 and 11. We want to make sure the shortest code goes to the most common color. The result wouldn't be as compact if we simply used the 3 possibilities in the same order every time.

Is that the problem? I'm almost certain it is, because I just checked and the first block never uses mode 3, which must be why you are getting it right.

EDIT: In case it isn't clear, the building of the tables works like this:

mode 0:
.no color is specified;

mode 1:
.the specified color goes to table B;

mode 2:
.the specified color isn't stored in any tables;
.the first color that isn't the specified color or the current one goes to table B;
.the second color that isn't the specified color or the current one goes to table C;

mode 3:
.the specified color goes to table B;
.the first color that isn't the specified color or the current one goes to table C;
.the second color that isn't the specified color or the current one goes to table D;

The only case where a color isn't specified is in mode 0.


Last edited by tokumaru on Wed Dec 23, 2009 12:30 am, edited 1 time in total.

Top
 Profile  
 
 Post subject:
PostPosted: Wed Dec 23, 2009 12:11 am 
Offline
User avatar

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10169
Location: Rio de Janeiro - Brazil
tepples wrote:
Two seconds to load 8 KiB? Four tiles a frame?

I didn't time it. Just run the ROM yourself with FCEUXD's tile viewer open and you can have an idea of how long it takes to decode the first half of SMB's tileset (it's about as long as it takes for them to be displayed, although some of the time is spent setting up the name and attribute tables and other kinds of initialization). FXEUXD seems to lag a bit when it starts though, so I can't properly time it unless I use the NMI to count how many frames are needed until all the tiles are done.

Quote:
Has anyone else watched the pattern tables during Ikari Warriors' opening sequence?

Just did. It's kinda slow, I think my decoder is faster.

Quote:
Perhaps part of favoring simplicity comes from favoring speed.

I don't think speed is that important in this case. 1 or 2 seconds isn't enough time for a player to wonder if the game is still running, it's the standard time of a black screen between a fade out and a fade in, not a big deal.

I could probably make the decoder much faster if I read bits using inlined code rather than a subroutine (think about it: 12 cycles of jsr/rts for every bit adds up to almost 17 frames in a 5KB file). I could also code different sections for each mode rather than check the mode a few times to make decisions. However, that would at least double the size of the decoder, and I don't want it's size to be a big overhead.

Also, in many cases simplicity doesn't represent speed. There are many cases where the straightforward/simple solution is slow, and you have to think of tricks to make it all faster.


Top
 Profile  
 
 Post subject:
PostPosted: Wed Dec 23, 2009 6:11 am 
Offline

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19353
Location: NE Indiana, USA (NTSC)
tokumaru wrote:
I don't think speed is that important in this case. 1 or 2 seconds isn't enough time for a player to wonder if the game is still running, it's the standard time of a black screen between a fade out and a fade in, not a big deal.

Which means the map decoder has to run that much faster if it caches much of the decompressed map in RAM.


Top
 Profile  
 
 Post subject:
PostPosted: Wed Dec 23, 2009 7:39 am 
Offline
User avatar

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10169
Location: Rio de Janeiro - Brazil
OK tepples, I already told you I am thinking of ways to speed it up. You are welcome to contribute with any ideas you have.

Also, Map decoders, when realistically complex, should be able to do their job much faster than a tile decoder, seeing as they have to generate around 1KB of data, not 8KB.

In my game at least, I plan on decoding the tiles for a level while a black screen is displayed, but once that's done I'll display a title card for the level and draw the first screen column by column like when inside the game, so the time it takes to do it will not even count as "black screen time", there will be something to look at and music playing so it will not look frozen.

I don't remember anyone bitching at load times in NES games, not even about Ikari Warriors.


Top
 Profile  
 
 Post subject:
PostPosted: Wed Dec 23, 2009 8:14 am 
Offline
User avatar

Joined: Fri Nov 19, 2004 7:35 pm
Posts: 3969
I hated the load time in Excitebike :), whenever you use the Load or Save command.

_________________
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!


Top
 Profile  
 
 Post subject:
PostPosted: Wed Dec 23, 2009 9:08 am 
Offline
User avatar

Joined: Wed Nov 10, 2004 6:47 pm
Posts: 1845
tokumaru wrote:
Is that the problem? I'm almost certain it is, because I just checked and the first block never uses mode 3, which must be why you are getting it right.


Yeah that's gotta be it. For some reason I skipped over that part. ^^

Thanks a million!


Top
 Profile  
 
 Post subject:
PostPosted: Wed Dec 23, 2009 2:16 pm 
Offline
Site Admin
User avatar

Joined: Mon Sep 20, 2004 6:04 am
Posts: 3488
Location: Indianapolis
It's really great, I probably will use this for almost all of my CHR data. Only time I would want something faster is when the system first comes on, I might use RLE there because I hate how some games on cart don't start up right away.

I found one CHR file that was 1,200 bytes smaller with LZ77, but not surprisingly it was because most of the tiles were duplicated on the 2nd 4kB page. But yeah on that same file there was a 319 byte improvement between the last 2 versions of this compression, and that's more memory than the decompression code takes - that's really cool.


Dwedit wrote:
I hated the load time in Excitebike :), whenever you use the Load or Save command.

Yeah me too, it takes forever to access the NES tapedrive. :P


Top
 Profile  
 
 Post subject:
PostPosted: Wed Dec 23, 2009 2:30 pm 
Offline

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19353
Location: NE Indiana, USA (NTSC)
tokumaru wrote:
I don't remember anyone bitching at load times in NES games, not even about Ikari Warriors.

This is probably as close to bitching as an assume-good-faith environment allows:
In GDRI's article about Micronics, contributors wrote:
The Famicom/NES games listed below are characterized by a distinct lack of coding and design skill and are often plagued by problems such as jerky scrolling, sprite tearing, and load times.

In Wikipedia's article about Micronics, citing the GDRI article, contributors wrote:
The games also have long delays between changing screens, since the games transferred graphics to video memory at a much slower rate than usual.

I can accept four or five tiles per frame, as long as it can decompress to a transfer buffer in RAM (which I've been putting at $0100-$019F lately) so that I can 1. show a title card while stuff loads (even if only "WORLD 1-1" or "MARIO START!") and 2. swap tilesets within a level. Otherwise, I might have to rethink the design.

So you took out the first block's "load tables" bit. Have you looked into the frequency distribution of the block lengths? I guess I will when I get a chance.


Top
 Profile  
 
 Post subject:
PostPosted: Wed Dec 23, 2009 3:04 pm 
Offline
User avatar

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10169
Location: Rio de Janeiro - Brazil
tepples wrote:
I can accept four or five tiles per frame, as long as it can decompress to a transfer buffer in RAM (which I've been putting at $0100-$019F lately) so that I can 1. show a title card while stuff loads (even if only "WORLD 1-1" or "MARIO START!")

Well, when a tile is completed the X register is free, so I guess it could be used to write the tile to a buffer instead of directly to VRAM. I could also provide an additional function to be called during VBlank that transfers the tiles that are ready to VRAM.

The bad thing about making all of this configurable would making things even slower. I could, whenever a row is completed, decide between buffering it or writing it to VRAM, based on a parameter supplied by the programmer (could even be the carry flag: 0 = buffer, 1 = VRAM).

Then the programmer would be responsible for calling the routine that uploads the tiles to VRAM and frees space in the buffer.

Quote:
and 2. swap tilesets within a level. Otherwise, I might have to rethink the design.

I don't plan on using this during gameplay because it would be very slow. You'd have to limit the decoder to 1 or 2 tiles per frame in order to not slow everything else down. So in my game I plan on using uncompressed graphics for anything that is loaded during gameplay.

Do you think I should have the decoder accept another parameter specifying the maximum number of tiles it can decompress each time it's called? I would need 2 entry points, much like NSFs have a INIT and PLAY addresses. The equivalent of the PLAY address would have to be called every frame and would be responsible for decoding at most the specified number of tiles until all tiles were done.

Quote:
So you took out the first block's "load tables" bit.

Yeah. Not a huge saving by any means, but that bit was meaningless, so I took it out. I considered not allowing the repetition of the first row of the first tile of a block as well, but I think I should check how often the first tiles of blocks start with zeroed out rows.

Quote:
Have you looked into the frequency distribution of the block lengths? I guess I will when I get a chance.

Just a little bit out of curiosity, but I don't know if we can do anything useful with it.


Top
 Profile  
 
 Post subject:
PostPosted: Wed Dec 23, 2009 4:50 pm 
Offline
User avatar

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10169
Location: Rio de Janeiro - Brazil
OK, I'm considering modifying the decoder to work with rendering enabled or disabled, and the choice is left to the programmer. When decompressing with rendering enabled, the maximum number of tiles you want to decode per frame must be informed. Does that sound good?


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 84 posts ]  Go to page Previous  1, 2, 3, 4, 5, 6  Next

All times are UTC - 7 hours


Who is online

Users browsing this forum: No registered users and 3 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
Powered by phpBB® Forum Software © phpBB Group