It is currently Sat Sep 22, 2018 6:22 pm

All times are UTC - 7 hours





Post new topic Reply to topic  [ 28 posts ]  Go to page Previous  1, 2
Author Message
PostPosted: Tue Jun 07, 2016 8:53 pm 

Joined: Sat Jul 12, 2014 3:04 pm
Posts: 958
Thought: apply economics! Calculate the opportunity cost of each merge by subtracting, from its savings, the savings of the next-best merge for each of the two "ends" involved, then pick whichever merge has the best value after opportunity cost is counted.

Compare with the simple-greedy.
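As a sketch of that idea, assuming plain strings and suffix/prefix overlaps as the "savings" (all names here are illustrative, not anyone's actual code):

```cpp
#include <algorithm>
#include <climits>
#include <string>
#include <utility>
#include <vector>

// Longest k such that the last k bytes of a equal the first k bytes of b.
static int overlap(const std::string& a, const std::string& b) {
    int maxk = (int)std::min(a.size(), b.size());
    for (int k = maxk; k > 0; --k)
        if (a.compare(a.size() - k, k, b, 0, k) == 0)
            return k;
    return 0;
}

// Pick the merge (tail of s[i] onto head of s[j]) whose savings minus
// opportunity cost is best. The opportunity cost is the best alternative
// use of i's tail plus the best alternative use of j's head.
static std::pair<int, int> pickMerge(const std::vector<std::string>& s) {
    int n = (int)s.size();
    std::vector<std::vector<int>> ov(n, std::vector<int>(n, 0));
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            if (i != j) ov[i][j] = overlap(s[i], s[j]);

    int bestI = -1, bestJ = -1, bestValue = INT_MIN;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            if (i == j) continue;
            int altTail = 0, altHead = 0;  // next-best merges for each end
            for (int k = 0; k < n; ++k) {
                if (k != i && k != j) {
                    altTail = std::max(altTail, ov[i][k]);
                    altHead = std::max(altHead, ov[k][j]);
                }
            }
            int value = ov[i][j] - altTail - altHead;
            if (value > bestValue) { bestValue = value; bestI = i; bestJ = j; }
        }
    return {bestI, bestJ};
}
```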


PostPosted: Tue Jun 07, 2016 9:06 pm 

Joined: Sun Apr 13, 2008 11:12 am
Posts: 7540
Location: Seattle
... I think that sounds like A* search?


PostPosted: Wed Jun 08, 2016 1:27 am 

Joined: Tue Oct 06, 2015 10:16 am
Posts: 796
tokumaru wrote:
Quote:
if you did it in sets of say 12 strings

Which 12 strings, though? Those that are involved in the longest overlaps? What kind of selection would make me lose fewer opportunities?


Going with the Monte Carlo approach, any random 12 strings. Pick, say, 10 random sets, brute-force the best result for each, and keep the best overall. The program can also run the original greedy algorithm on the side and, at the end, pick whichever result is better.

If randomness is undesirable, this is the kind of selection a neural network could guess reasonably well, if you know how to set one up.
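A minimal sketch of the random-sample idea, assuming plain strings; one caveat is that this brute force only tries end-to-end merge orders and doesn't handle one string being wholly contained in another:

```cpp
#include <algorithm>
#include <random>
#include <string>
#include <vector>

// Longest k such that the last k bytes of a equal the first k bytes of b.
static int overlap(const std::string& a, const std::string& b) {
    int maxk = (int)std::min(a.size(), b.size());
    for (int k = maxk; k > 0; --k)
        if (a.compare(a.size() - k, k, b, 0, k) == 0)
            return k;
    return 0;
}

// Best superstring of a small set, found by trying every merge order.
static std::string bruteForce(std::vector<std::string> v) {
    std::sort(v.begin(), v.end());  // next_permutation needs sorted input
    std::string best;
    do {
        std::string s = v.empty() ? "" : v[0];
        for (size_t i = 1; i < v.size(); ++i)
            s += v[i].substr(overlap(s, v[i]));
        if (best.empty() || s.size() < best.size()) best = s;
    } while (std::next_permutation(v.begin(), v.end()));
    return best;
}

// Draw `tries` random samples of up to `k` strings, keep the best result.
static std::string monteCarlo(std::vector<std::string> v, size_t k, int tries) {
    std::mt19937 rng(12345);  // fixed seed just for reproducibility
    std::string best;
    for (int t = 0; t < tries; ++t) {
        std::shuffle(v.begin(), v.end(), rng);
        std::vector<std::string> sample(
            v.begin(), v.begin() + std::min(k, v.size()));
        std::string s = bruteForce(sample);
        if (best.empty() || s.size() < best.size()) best = s;
    }
    return best;
}
```

With 12 strings per sample that's 12! orders per set, so in practice you'd cap the sample size lower or memoize (Held-Karp style) rather than enumerate raw permutations.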


PostPosted: Wed Jun 08, 2016 8:13 am 

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10814
Location: Rio de Janeiro - Brazil
I did in fact think of considering the next merge before committing the current merge.


PostPosted: Wed Jun 08, 2016 8:32 am 

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10814
Location: Rio de Janeiro - Brazil
Here's an idea: instead of remembering the longest overlap for each block, remember the longest overlap on each end of each block, so that before merging 2 blocks I can calculate how much I'd be losing from merges I'd no longer be able to do (i.e. add up all the merges that would have used the ends being connected). So instead of merging the longest overlap at each step, I'd merge the longest overlap that ruins the fewest alternative merges.

EDIT: Sounds like what Myask suggested.


PostPosted: Wed Jun 08, 2016 11:26 pm 
Formerly 65024U

Joined: Sat Mar 27, 2010 12:57 pm
Posts: 2263
For each byte of data, why not keep a mirroring byte that counts how many things use that byte? Match against all the bytes you could merge with, and choose the candidate with the fewest relying entries.


PostPosted: Thu Jun 09, 2016 5:47 pm 

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10814
Location: Rio de Janeiro - Brazil
Kasumi, what logic are you using to merge two blocks? I'm actually having trouble coming up with something decent! Allowing small blocks to overlap not only the edges of large blocks, but also be entirely contained by them, is something I didn't anticipate, and everything I'm coming up with is proving to be excruciatingly slow! :oops:


PostPosted: Thu Jun 09, 2016 6:32 pm 

Joined: Wed Apr 02, 2008 2:09 pm
Posts: 1244
Err... I wrote this a while ago, so I dunno. *checks*

I've got a "blockmaster" class, a "block" class, and a "subblock" class.
A blockmaster contains a pointer to a pointer to a list of blocks.
A block contains a name, a size, a pointer to the data, and a pointer to a pointer to a list of subblocks.
A subblock contains a name and its offset from the block's data.

To start, I add every block to the blockmaster. The blockmaster has three functions called merge, testmerge, and mergeall.

This is what testmerge looks like (Forgive my glorious lack of code style RE: everythinglowercase):
Code:
// Slide block1 across block2, trying every relative offset, and return
// the size of the largest region where the overlapping parts match.
int metblockmaster::testmerge(metblock block1, metblock block2){
   int bytessaved = 0;

   if(block1.size <= 0 || block2.size <= 0){
      return bytessaved;
   }

   // Offsets run from block1 hanging entirely off the left of block2
   // to hanging entirely off the right.
   int startingoffset = -block1.size+1;
   int range = block1.size+block2.size-1;

   int bestoffset = -1;

   for(int x = startingoffset; x < startingoffset+range; x++){

      int tempbytessaved = 0;
      for(int y = 0; y < block1.size; y++){
         // Only compare where the blocks actually overlap; bytes of
         // block1 outside block2 are neither counted nor a mismatch.
         if(y+x >= 0 && y+x < block2.size){
            if(block1.data[y] == block2.data[y+x]){
               tempbytessaved++;
            }else{
               // Any mismatch in the overlapping region kills this offset.
               tempbytessaved = 0;
               break;
            }
         }
      }

      if(tempbytessaved > bytessaved){
         bytessaved = tempbytessaved;
         bestoffset = x; // merge() uses this to do the actual combining
      }

   }

   return bytessaved;
}

I feel like I did it the most obvious way, so this may be what you're thinking of that's excruciatingly slow, I dunno. I actually see a few ways to improve the speed here.

I use testmerge to look for the best savings (since it's non-destructive). Merge does basically the same thing, except it merges the data using the best offset, adds one block as a subblock of the other (storing its offset from the main data and retaining its name, and also moving all subblocks from one block to the other), and removes that block from the blockmaster.

I use whether the best offset was positive or negative to decide which block becomes the subblock. (If the positively offset one becomes the subblock, the negatively offset one doesn't need its own subblocks' offsets updated.) It also means that to create the merged string, I can start by copying all of the negative block's data and then append the new block's data.

I can probably post merge too if you'd like, but it's... long, and this explanation is probably better than the code.

_________________
https://kasumi.itch.io/indivisible


PostPosted: Thu Jun 09, 2016 6:53 pm 

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10814
Location: Rio de Janeiro - Brazil
This is exactly what I asked for. So, this is basically sliding one block over the other, from one end to the other, and comparing the overlapping parts element by element, right? I did think of something like this, but haven't tried this approach yet. Thanks for sharing the code!


PostPosted: Thu Jun 09, 2016 7:12 pm 

Joined: Wed Apr 02, 2008 2:09 pm
Posts: 1244
Correct. Whether a block is contained within another or overlaps at either end effectively doesn't matter, because of the if statement with the range checks. (Outside the overlapping region the count is not updated, but it also isn't reset.)

You could break early once y+x >= block2.size, or do a few other things to speed it up, but it handles my 746 blocks fast enough that I'm not really waiting on it. If you do find a novel way to do it, still share, of course.



PostPosted: Fri Jun 10, 2016 3:44 am 

Joined: Thu Aug 20, 2015 3:09 am
Posts: 396
I'm not entirely sure it's applicable to this particular problem, but the Knuth-Morris-Pratt algorithm should reduce the O(mn) comparisons to O(m+n), if you can work the range check into it.
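For the end-to-end overlap case, the standard trick is to run the KMP prefix (failure) function over b + separator + a. A sketch, not wired into the block classes above, and assuming the separator byte never occurs in the data:

```cpp
#include <string>
#include <vector>

// KMP prefix function applied to b + sep + a: the final value is the
// length of the longest suffix of a that is also a prefix of b, in
// O(m+n) time instead of the O(mn) sliding comparison.
static int kmpOverlap(const std::string& a, const std::string& b) {
    std::string s = b + '\x01' + a;  // separator must not occur in the data
    std::vector<int> pi(s.size(), 0);
    for (size_t i = 1; i < s.size(); ++i) {
        int k = pi[i - 1];
        while (k > 0 && s[i] != s[k]) k = pi[k - 1];
        if (s[i] == s[k]) ++k;
        pi[i] = k;
    }
    return pi.back();
}
```

This only finds suffix/prefix overlaps, though; the containment case (one block entirely inside another) would need a separate substring search.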


PostPosted: Fri Jun 10, 2016 2:26 pm 

Joined: Thu Mar 31, 2016 11:15 am
Posts: 359
Here's a compressor in C++ using ant colony optimization to solve TSP: http://pastebin.com/raw/geE75wfn

And some test results:
Code:
  # |   Un |   NN |  Ant
500 | 8000 | 5452 | 5336
200 | 3200 | 2378 | 2304
 50 |  800 |  659 |  630

Where
  # = number of strips
 Un = uncompressed size
 NN = nearest neighbor optimized size
Ant = ant colony optimized size

I would probably use nearest neighbor (or some other greedy algorithm) once the number of strips is >500.
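For reference, nearest neighbor on this problem amounts to: keep appending the unused strip that overlaps the current tail the most. A sketch with plain strings (illustrative names, not pubby's actual code):

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Longest k such that the last k bytes of a equal the first k bytes of b.
static int overlap(const std::string& a, const std::string& b) {
    int maxk = (int)std::min(a.size(), b.size());
    for (int k = maxk; k > 0; --k)
        if (a.compare(a.size() - k, k, b, 0, k) == 0)
            return k;
    return 0;
}

// Start from the first strip and repeatedly append the unused strip
// with the largest overlap against the current tail.
static std::string nearestNeighbor(std::vector<std::string> strips) {
    if (strips.empty()) return "";
    std::string out = strips.front();
    strips.erase(strips.begin());
    while (!strips.empty()) {
        size_t best = 0;
        int bestOv = -1;
        for (size_t i = 0; i < strips.size(); ++i) {
            int ov = overlap(out, strips[i]);
            if (ov > bestOv) { bestOv = ov; best = i; }
        }
        out += strips[best].substr(bestOv);  // append the non-overlapping part
        strips.erase(strips.begin() + best);
    }
    return out;
}
```

It's O(n²) overlap computations, which is why it stays practical past the point where the ant colony search gets expensive.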


PostPosted: Sun Jun 19, 2016 6:50 pm 

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10814
Location: Rio de Janeiro - Brazil
pubby wrote:
Here's a compressor in C++ using ant colony optimization to solve TSP

Sorry I didn't say this before, but thanks for giving this a try. I'm still trying to fully understand the whole ant colony thing!

Anyway, I'll probably not be using this technique to compress level maps in my game anymore. The results were OK, but not as good as I expected.

My idea was to break up each screen (an area of 16x16 blocks) into 16 strips of 16 blocks, and then compress all the strips into their shortest common supersequence. In my tests, only about 70% of the strips were unique, and the supersequence thing worked reasonably well even though it was just the greedy algorithm, reducing the space required by the unique strips by practically 50%, so only about 35% of the strip data remained. That sounds good, but when you factor in the space taken by the screens themselves (16 pointers, 32 bytes, per screen), the results aren't so hot.

Another thing that didn't work as expected was supporting both vertical and horizontal strips. Screens would be encoded one way or the other depending on how many unique strips they introduced... this did reduce the number of unique strips a bit, but they didn't overlap as well, for obvious reasons, so the final supersequence ended up being longer than if only horizontal strips had been used.

I'll probably revert to my old level format, which uses multiple levels of metatiles and is similar to quad-tree compression. For one, the compression algorithm is much more straightforward, and the ratios are better too, as long as I design the levels with enough grid-aligned redundancy. The main problem with that was being limited to 256 blocks of each size (256x256, 128x128, 64x64, 32x32, 16x16). Using pointers instead of indices in order to support more than 256 entities would not only double the space required for the entities, but also significantly slow down the decoding routine.

I do expect to stay under 256 blocks most of the time, but I'd like to have the possibility of breaking this mark should the need arise. Oh well, maybe I'll come up with another way to expand this technique.


Powered by phpBB® Forum Software © phpBB Group