PostPosted: Mon Apr 30, 2018 6:08 pm 
Formerly 43110

Joined: Wed Feb 05, 2014 7:01 am
Posts: 379
Location: us-east
Donut is the sequel to my previous codec Bagel.

(2019-02-15 Edit)
New versions with a slightly tweaked format compared to last year's:
6502 decompressor for ca65 via GitHub.
Compressors written in C and/or Python are also available there.

A written narrative, a dedicated webpage, and Windows exe builds may appear in the next year or so.


Last edited by JRoatch on Fri Feb 15, 2019 3:42 pm, edited 1 time in total.

PostPosted: Wed Nov 28, 2018 10:08 am 

Joined: Thu Mar 31, 2016 11:15 am
Posts: 505
The up-to-date version is on GitHub, for anyone looking:

https://github.com/jroatch/donut-nes


PostPosted: Wed Feb 06, 2019 6:39 pm 

Joined: Fri Nov 24, 2017 2:40 pm
Posts: 144
I'm really fascinated by this. So far I've been using lz4 for all of my compression needs. Simple and ubiquitous. On the other hand, it doesn't do a great job compressing character tables compared to name/attribute data. My thoughts about how to improve the situation haven't gone much further than "uh... preprocess with XOR maybe?"
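To spell out what I mean by that, here's a sketch of my own guess at an XOR prefilter (nothing lz4 or Donut actually does, and the helper names are hypothetical): XOR each 16-byte tile against the previous tile, so repeated or near-identical tiles collapse into runs of zeros that an LZ-style compressor matches easily. Undoing it after decompression is the same XOR run in the other direction.

Code:
/* Hypothetical XOR prefilter over raw CHR data: each 16-byte tile is XORed
 * against the previous tile, so identical tiles become all zeros.  Undo it
 * after decompression by running the same XOR forwards. */
#include <stddef.h>
#include <stdint.h>

void xor_prefilter(uint8_t *chr, size_t len) {
    size_t i;
    for (i = len; i-- > 16; )       /* backwards, so earlier bytes stay intact */
        chr[i] ^= chr[i - 16];
}

void xor_unfilter(uint8_t *chr, size_t len) {
    size_t i;
    for (i = 16; i < len; ++i)      /* forwards, rebuilding each tile from the last */
        chr[i] ^= chr[i - 16];
}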

Donut gives slightly better compression than straight lz4, using only ~20 bytes more code. If I had 3 full character tables, it would even pay off to have both compressors for different types of data. Additionally, lz4 really requires turning off the PPU while decompressing. Donut, on the other hand, seems to work with a small CPU-side buffer and looks relatively easy to put in a coroutine and pause for the NMI when the buffer gets full. Maybe it's even fast enough to stream a couple of tiles per frame?


PostPosted: Wed Feb 06, 2019 7:33 pm 
Formerly 43110

Joined: Wed Feb 05, 2014 7:01 am
Posts: 379
Location: us-east
Thanks,
Working with some form of coroutines was part of my thinking when designing this (and the codec it descended from).

But in practice there is one complication I've found with its current design: the requirement to retain the buffer for the "Reuse/Duplicate previous block" feature. In my usage I found that it mostly gets used to code a sequence of solid-color blocks, since a solid-color block takes 2 bytes but a duplicate block takes only 1 byte.

I'm contemplating making a breaking change where I get rid of "Duplicate blocks" and make solid-color blocks only 1 byte. That would remove the requirement to save the 64-byte buffer, making it easier for further routines to transform the uncompressed data.
(It would make the C encoder easier to program as well.)
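
To illustrate the trade-off with a made-up header layout (the byte values below are not Donut's actual encoding), the dependency looks roughly like this: a 1-byte "duplicate previous block" code forces the previous 64-byte output to stick around between calls, while a 1-byte solid-color code would need nothing but the header itself.

Code:
/* Illustrative only: made-up header values, not Donut's real bit layout. */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 64

static uint8_t prev_block[BLOCK_SIZE];   /* must persist between calls for dupes */

/* Decode one block into out[]; returns the number of input bytes consumed. */
size_t decode_block(const uint8_t *in, uint8_t out[BLOCK_SIZE]) {
    size_t used;
    switch (in[0]) {
    case 0x00:   /* duplicate previous block: 1 byte, but needs prev_block kept */
        memcpy(out, prev_block, BLOCK_SIZE);
        used = 1;
        break;
    case 0x01:   /* solid-color block: 2 bytes today; it could be 1 byte if the
                    fill value were folded into the header, as proposed above */
        memset(out, in[1], BLOCK_SIZE);
        used = 2;
        break;
    default:     /* raw block: header plus 64 literal bytes */
        memcpy(out, in + 1, BLOCK_SIZE);
        used = 1 + BLOCK_SIZE;
        break;
    }
    memcpy(prev_block, out, BLOCK_SIZE);
    return used;
}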

I'm not sure about using it directly in vblank, as each (non-duplicate) 64-byte block takes 1293~7247 cycles.
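For context, an NTSC vblank is only about 20 scanlines × 341 dots ÷ 3 ≈ 2273 CPU cycles, so even a best-case block eats most of the budget and a worst-case one doesn't come close to fitting. A quick back-of-the-envelope check:

Code:
/* Rough NTSC vblank budget vs. the per-block cycle range quoted above. */
#include <stdio.h>

int main(void) {
    const int vblank_cycles = 20 * 341 / 3;   /* ~2273 CPU cycles on NTSC */
    printf("vblank budget               : %d cycles\n", vblank_cycles);
    printf("best-case block (1293) fits : %s\n", 1293 <= vblank_cycles ? "yes" : "no");
    printf("worst-case block (7247) fits: %s\n", 7247 <= vblank_cycles ? "yes" : "no");
    return 0;
}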

------
I'm also glad to hear about the viability of lz4 on the NES.


PostPosted: Wed Feb 06, 2019 7:39 pm 

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 21212
Location: NE Indiana, USA (NTSC)
Duplicates were important for the Action 53 data set. Games with CHR ROM often have duplicate "X" tiles to mark unused tiles, and there was an intent to interleave different 4K or 8K CHR banks in case a game uses $2000 bit 4 (background pattern table base address) or CNROM to animate a subset of the tiles.

Would it be better to make a format option to allow or disallow dupes?


PostPosted: Wed Feb 06, 2019 7:52 pm 
Formerly 43110

Joined: Wed Feb 05, 2014 7:01 am
Posts: 379
Location: us-east
And that was precisely the reason I held on to it for as long as I did.

But even the way I'm doing it now (comparing a match of 64 bytes) is subpar compared to pb53 (matches of 16 bytes).
I can do better, and I have plenty of codec header space to do it with (all header bytes >= 0xc0).

And even if I don't come up with better compression schemes, I believe I have the option of just turning those X tiles into solid black ones.

Edit: didn't see the second half of your post

I have some ideas about how to handle duplicates between pages, and I think taking out the duplicate-block feature in the core decoder will help with how that will work.


PostPosted: Wed Feb 06, 2019 8:20 pm 

Joined: Fri Nov 24, 2017 2:40 pm
Posts: 144
JRoatch wrote:
the requirement to retain the buffer for the "Reuse/Duplicate previous block" feature


Not an issue if you reserve the buffer bytes exclusively for Donut though, right?

JRoatch wrote:
I'm not sure about using it directly in vblank as each (non duplicate) 64 byte block takes 1293~7247 cycles


Nah. I mean fill the buffer, yield, copy it to the PPU during NMI, then resume during the next frame. It would require modifications, but the source isn't that complicated. This is basically impossible with lz4 without extra work RAM.
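
Roughly the shape I'm picturing, as a cc65-flavored sketch (donut_fill_block() is a hypothetical stand-in for a buffer-oriented version of the decompressor, and the handshake flag is my own invention, not anything in the actual source):

Code:
/* Main loop decodes one 64-byte block into a buffer, then waits; the NMI
 * handler copies the finished buffer to the PPU during vblank and releases it. */
#include <stdint.h>

#define PPU_ADDR (*(volatile uint8_t *)0x2006U)
#define PPU_DATA (*(volatile uint8_t *)0x2007U)

extern void donut_fill_block(uint8_t *dst);   /* hypothetical: decode 64 bytes */

static uint8_t block_buf[64];
static volatile uint8_t block_ready;          /* 1 = buffer waiting for vblank */
static uint16_t chr_dest;                     /* next PPU address to write     */

/* Main-loop side: produce one block, then yield until the NMI consumes it. */
void stream_one_block(void) {
    donut_fill_block(block_buf);
    block_ready = 1;
    while (block_ready) { }                   /* or go do other per-frame work */
}

/* NMI side: if a block is waiting, push it to CHR RAM during vblank. */
void nmi_copy_block(void) {
    uint8_t i;
    if (!block_ready) return;
    PPU_ADDR = (uint8_t)(chr_dest >> 8);
    PPU_ADDR = (uint8_t)chr_dest;
    for (i = 0; i < 64; ++i) PPU_DATA = block_buf[i];
    chr_dest += 64;
    block_ready = 0;
}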

JRoatch wrote:
I'm contemplating making a breaking change


I'm sort of curious about that. Donut isn't really an archival format. Why worry too much about breaking changes? Speaking personally, since my tools are a build dependency, I keep them in the same repository as my project (or in a git submodule, etc).


PostPosted: Wed Feb 06, 2019 8:35 pm 
Formerly 43110

Joined: Wed Feb 05, 2014 7:01 am
Posts: 379
Location: us-east
slembcke wrote:
Donut isn't really an archival format. Why worry too much about breaking changes?

True.
It's from a mental habit I picked up from reading too many data format documents.


PostPosted: Wed Feb 06, 2019 8:40 pm 

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 21212
Location: NE Indiana, USA (NTSC)
slembcke wrote:
JRoatch wrote:
the requirement to retain the buffer for the "Reuse/Duplicate previous block" feature

Not an issue if you reserve the buffer bytes exclusively for Donut though right?

It's 64 out of 2048 bytes that the rest of your program can't use while a decompression is running in the background.


PostPosted: Wed Feb 06, 2019 9:07 pm 

Joined: Fri Nov 24, 2017 2:40 pm
Posts: 144
Sure, but you could say that about anything that uses RAM. If streaming to CHR RAM is important, 64 bytes of RAM seems like a reasonable cost. It's not like it's being forced onto anybody. At this point I simply think it would be interesting if it could be done. :p


PostPosted: Fri Feb 15, 2019 4:58 pm 
Formerly 43110

Joined: Wed Feb 05, 2014 7:01 am
Posts: 379
Location: us-east
Anyway, I went ahead and made some breaking changes to the basic block format, and edited the first post to point to the new locations.

I also managed to complete the C version of the encoder, and I really like the 1200x speedup compared to the Python version.

