Tile encoder/decoder

Discuss technical or other issues relating to programming the Nintendo Entertainment System, Famicom, or compatible systems.


thefox
Posts: 3141
Joined: Mon Jan 03, 2005 10:36 am
Location: Tampere, Finland

Re:

Post by thefox » Tue Aug 28, 2012 7:31 am

tokumaru wrote:I didn't time it. Just run the ROM yourself with FCEUXD's tile viewer open and you can get an idea of how long it takes to decode the first half of SMB's tileset (it's about as long as it takes for the tiles to be displayed, although some of the time is spent setting up the name and attribute tables and other kinds of initialization). FCEUXD seems to lag a bit when it starts, though, so I can't properly time it unless I use the NMI to count how many frames are needed until all the tiles are done.
I timed it using NintendulatorDX (while checking my options for tile compression in Streemerz). The "jsr Decompress" takes 1,194,448 cycles to decompress that 4 KB worth of SMB tileset, which is approximately 40 frames (668 ms) on an NTSC NES.
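That figure can be sanity-checked against standard NTSC timing. The clock rate and cycles-per-frame constants below are the usual published 2A03 values, not taken from the post, so treat this as a back-of-the-envelope check:

```python
# Back-of-the-envelope check of the measurement above.
# Assumed constants: standard NTSC NES (2A03) timing values.
NTSC_CPU_HZ = 1_789_773        # CPU clock rate, NTSC
CYCLES_PER_FRAME = 29_780.5    # CPU cycles per NTSC frame

cycles = 1_194_448             # measured cost of "jsr Decompress"
frames = cycles / CYCLES_PER_FRAME
millis = cycles / NTSC_CPU_HZ * 1000

print(f"{frames:.1f} frames, {millis:.0f} ms")   # about 40 frames / 667 ms
```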
Download STREEMERZ for NES from fauxgame.com! — Some other stuff I've done: fo.aspekt.fi

sonder
Posts: 116
Joined: Wed Jun 26, 2013 12:35 pm
Location: Baltimore

Re: Tile encoder/decoder

Post by sonder » Sat Aug 17, 2013 12:45 pm

Hey tokumaru, thanks for providing this. Those are some impressive ratios.
sonder

sonder
Posts: 116
Joined: Wed Jun 26, 2013 12:35 pm
Location: Baltimore

Re:

Post by sonder » Sat Aug 17, 2013 12:51 pm

tokumaru wrote:Well, I don't know how other people usually handle their graphics, but I usually don't have a single CHR file. I have separate files for each character, item, font, and so on. Depending on which tiles will be used at the same time, I merge the files to create blocks of tiles that I can compress.

I don't know if implementing a feature like this is worth the trouble. If you have a large CHR file but want to encode it to separate blocks, doesn't that mean that there is a reason you think they shouldn't be packed together, which means it makes sense to have them in separate CHR files to begin with?

About the pointers, I wouldn't make that automatic because I like to have options. You can incbin a compressed block anywhere in your source under a label and you have its address. However, you might want to handle the blocks differently: you may want a bank index along with each pointer (in case you have graphics scattered across different program banks), or you might not need a table at all, if you have just a couple of graphic blocks and the code to read from a small table would be larger than using immediate values.

I don't like to work with tools that are too specific and limit what you can do with them. As it is now, I don't think it's hard at all to include the files by hand and build your own table of pointers. Like I said before (and you agreed! :wink:) I don't care too much for user friendliness, and this qualifies as usable IMO. When I'm programming I just want the tools to be usable; the main project is the game, and that's where I'll put my effort in order to make a polished product.
I think that the codec should support arbitrary numbers of tiles (if it doesn't already; correct me if I'm wrong). In a CHR-RAM project, stuff is going to be loaded in pieces, not entire banks.

tepples
Posts: 22224
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Tile encoder/decoder

Post by tepples » Sat Aug 17, 2013 1:12 pm

Then make "separate files for each character, item, font, and so on" like tokumaru recommended, and load multiple files into CHR RAM before the level starts.

sonder
Posts: 116
Joined: Wed Jun 26, 2013 12:35 pm
Location: Baltimore

Re: Tile encoder/decoder

Post by sonder » Sat Aug 17, 2013 8:41 pm

tepples wrote:Then make "separate files for each character, item, font, and so on" like tokumaru recommended, and load multiple files into CHR RAM before the level starts.
Yeah, that's how I would do it. I guess I was just confused; for some reason I got the idea that the compressor expected exactly 256 tiles.

Ti_
Posts: 36
Joined: Sat Aug 03, 2013 3:08 pm
Location: Russia

Re: Tile encoder/decoder

Post by Ti_ » Fri Sep 05, 2014 6:28 am

My test of tokumaru compression on tile archives from Battletoads:
a table comparing unpacked, originally packed, and tokumaru-packed sizes.
Attachments
tokumaru_packer_test.xls
(9 KiB) Downloaded 407 times
My profile on RHDN.

tepples
Posts: 22224
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Tile encoder/decoder

Post by tepples » Fri Sep 05, 2014 7:06 am

And the same thing in tab-separated values for people who happen not to have Microsoft Office installed:
Attachments
tokumaru_packer_test.txt
(1.85 KiB) Downloaded 379 times

Ti_
Posts: 36
Joined: Sat Aug 03, 2013 3:08 pm
Location: Russia

Re: Tile encoder/decoder

Post by Ti_ » Fri Sep 05, 2014 7:12 am

tepples wrote:And the same thing in tab-separated values for people who happen not to have Microsoft Office installed:
I used Spread32; it's only 1 MB in size. :beer:

Bisqwit
Posts: 248
Joined: Fri Oct 14, 2011 1:09 am

Re: Tile encoder/decoder

Post by Bisqwit » Sun May 01, 2016 8:50 am

I rewrote a compressor/decompressor for this scheme from scratch.
Incidentally, it also compresses marginally better than Tokumaru's tool (thanks to a better algorithm for finding block splitting points). The compression format is the same, though.

It can be downloaded (C++ source) at: http://iki.fi/bisqwit/src/tokumaru-source.zip
Precompiled version for x86_64 Windows: http://iki.fi/bisqwit/src/tokumaru-w64.exe
Precompiled version for i386 Windows: http://iki.fi/bisqwit/src/tokumaru-w32.exe


It can be downloaded at: http://bisqwit.iki.fi/source/tokumaru.html
Last edited by Bisqwit on Sun May 01, 2016 4:24 pm, edited 2 times in total.

tepples
Posts: 22224
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Tile encoder/decoder

Post by tepples » Sun May 01, 2016 9:33 am

The C++ source link is 404 Not Found, and it is not listed on the page one level up. But through "access the directory list directly" I found it: tokumaru, with no extension.
The origin of this software must not be misrepresented
zlib license

What sort of speedup does use of threads typically give? Because in a large build, it might be wiser to default to one thread and use job-level parallelism, where make -j3 calls multiple tokumaru instances at once (along with multiple converters for other things).

And things like "%u tiles (%u bytes) decompressed into %u bytes\n" typically go to std::cerr, not std::cout, so that they don't mix with the output even when quiet mode is disabled.

Bisqwit
Posts: 248
Joined: Fri Oct 14, 2011 1:09 am

Re: Tile encoder/decoder

Post by Bisqwit » Sun May 01, 2016 9:42 am

The source link has been fixed; I forgot to actually click submit on my edit to the post. Sorry about that.
Threads give a significant speedup when running at the highest optimization levels, since the process is highly parallelizable. Threads can be explicitly disabled with a command-line option.

As for messages, I typically follow a guideline where errors go to stderr and everything else goes to stdout. The exception is when the actual compressed output goes to stdout; then everything else goes to stderr. This version does not currently support compressing to stdout.

Yes, zlib license. Unless there is some other point to your quote? I dislike contextless quotations and the implications they carry. If you are implying that I misrepresented the author of the original software: my program is written entirely from scratch, based on the algorithm documentation I collected yesterday, and even then I provide a credited link to the code that I read. It is neither a modification nor a port of other software. This is original software, even though what it seeks to accomplish is the same as Tokumaru's program.

Drag
Posts: 1332
Joined: Mon Sep 27, 2004 2:57 pm

Re: Tile encoder/decoder

Post by Drag » Sun May 01, 2016 11:48 am

How do you arrive at an algorithm that achieves the optimal split point or just optimal ways to apply compression in general? Is it a greedy algorithm? Are multiple arbitrary permutations performed and then compared?

tokumaru
Posts: 11933
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Tile encoder/decoder

Post by tokumaru » Sun May 01, 2016 12:13 pm

IIRC, testing all possible length combinations of 2 consecutive blocks worked really well for me.
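That pairwise search can be sketched roughly as follows. This is only an illustration, not tokumaru's actual code; `cost(lo, hi)` is a stand-in for the real encoder's bit count for the block of tiles `lo..hi-1`, since the actual cost depends on the tile data:

```python
# Hypothetical sketch of the idea above: greedily pick each block length
# by testing every length combination for the next TWO consecutive
# blocks, then committing only the first block of the cheapest pair.
def split_blocks(n_tiles, cost, max_len=256):
    splits, pos = [], 0
    while pos < n_tiles:
        best = None
        for len1 in range(1, min(max_len, n_tiles - pos) + 1):
            mid = pos + len1
            if mid == n_tiles:                       # last block: no pair
                pair = cost(pos, mid)
            else:
                rest = min(max_len, n_tiles - mid)
                pair = cost(pos, mid) + min(
                    cost(mid, mid + len2) for len2 in range(1, rest + 1))
            if best is None or pair < best[0]:
                best = (pair, len1)
        splits.append(best[1])
        pos += best[1]
    return splits
```

For example, with a toy cost of a fixed header plus a quadratic payload, `split_blocks(8, lambda a, b: 4 + (b - a) ** 2)` returns block lengths that together cover all 8 tiles.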

tepples
Posts: 22224
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Tile encoder/decoder

Post by tepples » Sun May 01, 2016 12:38 pm

Bisqwit wrote:Yes, zlib license. Unless there is some other point in your quote? I so dislike contextless quotations.
Just summarizing for the benefit of readers before they click through. I've seen a lot of projects on GitHub, for example, that make it hard to find the license info, with no top-level LICENSE or COPYING file and no license notice at the bottom of README.

Bisqwit
Posts: 248
Joined: Fri Oct 14, 2011 1:09 am

Re: Tile encoder/decoder

Post by Bisqwit » Sun May 01, 2016 2:30 pm

Drag wrote:How do you arrive at an algorithm that achieves the optimal split point or just optimal ways to apply compression in general? Is it a greedy algorithm? Are multiple arbitrary permutations performed and then compared?
The way I do it here is that I build a directed acyclic graph of all possible encode lengths at all possible offsets, and then use Dijkstra's algorithm with a priority queue to find the shortest path through that graph. The length of each arc is the number of bits generated for that particular encoding choice.
This works very well since the splits are independent: When a block is split, no data from the previous block is carried over to the next block. It is also efficient because of the high granularity of block splits (every 16 bytes), and high likelihood of short splits being most efficient.
The compression level is just a cap on the encode-lengths to test. Without a cap, the performance would be O(n^2). With a cap, it becomes O(n*cap). In practice even short caps work really well.
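Because every arc in that graph points strictly forward, the search is short to write down. This is not Bisqwit's code, just a sketch under the same assumptions; `cost(i, j)` again stands in for the encoder's bit count for tiles `i..j-1`:

```python
import heapq

# Sketch of the search described above: nodes are tile offsets, an arc
# (i -> i + length) weighs cost(i, i + length) bits, and the cap bounds
# the block lengths tried, giving O(n * cap) arcs overall.
def shortest_split(n_tiles, cost, cap):
    best = [float("inf")] * (n_tiles + 1)   # cheapest bits to reach offset
    prev = [0] * (n_tiles + 1)              # back-pointers to recover splits
    best[0] = 0
    heap = [(0, 0)]                         # (bits so far, offset)
    while heap:
        bits, i = heapq.heappop(heap)
        if bits > best[i] or i == n_tiles:
            continue                        # stale entry, or goal reached
        for length in range(1, min(cap, n_tiles - i) + 1):
            j, w = i + length, cost(i, i + length)
            if bits + w < best[j]:
                best[j] = bits + w
                prev[j] = i
                heapq.heappush(heap, (best[j], j))
    # walk the back-pointers to list the chosen block lengths
    splits, j = [], n_tiles
    while j > 0:
        splits.append(j - prev[j])
        j = prev[j]
    return best[n_tiles], list(reversed(splits))
```

With a toy cost of 4 header bits plus a quadratic payload, `shortest_split(8, lambda a, b: 4 + (b - a) ** 2, 8)` finds four length-2 blocks for 32 bits total, which is the true optimum for that cost.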

By contrast, this approach would not work well with e.g. deflate, because the number of plausible block lengths is several orders of magnitude higher.
