It is currently Sun Mar 26, 2017 9:36 pm

All times are UTC - 7 hours





Post new topic Reply to topic  [ 6 posts ] 
Author Message
PostPosted: Fri Mar 03, 2017 8:34 am 
Offline
User avatar

Joined: Sun Nov 09, 2008 9:18 pm
Posts: 797
Location: Pennsylvania, USA
The wiki says: "This, every frame is exactly 312*341 = 106396 pixels = 33247.5 cycles long."

How many actual instructions does that amount to, approximately? (I suppose it could vary depending on the code running, but on average?) In GGVm's cpu core, I arbitrarily chose 10,000 based on roughly 3 cycles per instruction to approximate how much time the cpu gets per frame. (and it works really well) But wondering if it can be lower, perhaps.


Top
 Profile  
 
PostPosted: Fri Mar 03, 2017 9:12 am 
Offline
User avatar

Joined: Fri Nov 19, 2004 7:35 pm
Posts: 3791
Every frame is 262*341 pixels large on NTSC, divide by 3, and you get 29780 CPU cycles. Instructions are either 2, 3, 4, or 5 cycles long depending on which instruction it is. Math instructions that operate on registers tend to be 3 cycles long, and instructions that access non-zeropage memory tend to be 4 cycles long.

So 4 cycles is a better guess than 3 cycles, or 3.5 if you want to be silly about rough averages. 3.5 cycles gives you 8508 cycles instructions per frame.


Edit: Just did the number crunching using real data. For NES Contra, the average cycles per instruction is 3.2.

_________________
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!


Top
 Profile  
 
PostPosted: Fri Mar 03, 2017 12:27 pm 
Offline

Joined: Sun Apr 13, 2008 11:12 am
Posts: 5417
Location: Seattle
I'd check profiling data before you assume that a ≈10% reduction in effective processor speed will significantly change performance...

Idle loop specification/detection will obviate this guesswork anyway.


Top
 Profile  
 
PostPosted: Fri Mar 03, 2017 12:54 pm 
Offline
User avatar

Joined: Sun Nov 09, 2008 9:18 pm
Posts: 797
Location: Pennsylvania, USA
lidnariq wrote:
I'd check profiling data before you assume that a ≈10% reduction in effective processor speed will significantly change performance...

Idle loop specification/detection will obviate this guesswork anyway.


Good point. I actually suddenly realized this afternoon that a feature I had added to ggvm called an "nmi safety functor" which has game-specific knowledge about where the nmi wait loops are could be re-used for this purpose, especially now that I'm executing the cpu synchronously with the render thread. Once the pc hits the range specified by the nmi saftey functor I know it's waiting for the next nmi and can stop executing. There may be a way to detect the nmi wait situation but as ggvm requires game-specific knowledge anyway due to not being a real emulator...no biggie.


Top
 Profile  
 
PostPosted: Fri Mar 03, 2017 4:24 pm 
Offline

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 17984
Location: NE Indiana, USA (NTSC)
Repeated reads of the same RAM location in a tight loop usually mean waiting for the NMI handler to change that location. My games, for example, usually spin on a variable in zero page like this:
Code:
  lda nmis
:
  cmp nmis
  beq :-


But then most of my pre-Curse of Possum Hollow games break your rule that video memory can be updated only in the NMI handler, as their NMI handler is the trivial one:
Code:
nmi_handler:
  inc nmis
  rti


One of my motivations behind Popslide was a desire to get away from this structure. Then the task remains to port existing projects to Popslide, which I admit might be quite a task for projects that use the proportional font routine.


Top
 Profile  
 
PostPosted: Fri Mar 03, 2017 4:32 pm 
Offline
User avatar

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 4906
Location: Canada
I don't really like the idea of counting instructions, personally. There's a lot of variation possible there. (Does your emulation count cycles at all, or is that a foreign concept to this emulator?)

I think you could make a practical simple idle test with reasonably low overhead. Something like:

1. Whenever you hit a branch or jump instruction, store the resulting location branched to.
2. If equal to the last branch location, increment a counter.
3. When the counter reaches some threshold, start doing progressive tests.

The baseline complexity is just little bit of extra work in every branch, and you don't have to do any more complicated tests until you've reached some threshold conditions, and if you detect the idle loop once you can just skip the test when you hit it again on subsequent frames.

Here's a quick code sketch of the idea: (not intended as a fully working solution, just an illustration of the concept)

Code:
uint16 last_idle;
uint16 lash_branch;
int loop_count;
uint8 idle_a, idle_x, idle_y;
uint32 idle_hash;
bool idle_test;

void branch() // call after any branch or jump instruction
{
   if (register.PC == last_branch) // if in a loop
   {
      if (register.PC == last_idle) // we already detected this loop last frame, skip the retest
      {
         idle(); // stop emulating PC until next NMI or IRQ or sprite-0 etc.
         return;
      }
      
      if (!idle_test) return; // already ruled this out as an idle loop
      
      ++loop_count;
      switch (loop_count)
      {
         default:
            break;
         case 32: // some arbitrary threshold to begin idle testing
            idle_a = register.A;
            idle_x = register.X;
            idle_y = register.Y;
            break;
         case 33: // test #1
            if (register.A != idle_a || register.X != idle.x || register.Y != idle_y) // probably not an idle loop e.g. if index registers are changing
               idle_test = false;
            else
               idle_hash = hash_ram_state(); // most expensive test: try to detect any changes in RAM for the next loop
            break;
         case 34: // test #2
            if (idle_hash == hash_ram_state())
            {
               last_idle = register.PC; // remember this idle point to skip detection next time
               idle();
               return;
            }
            else
               idle_test = false;
            break;
      }
   }
   else // branched to a new location, not currently in a loop
   {
      loop_count = 0;
      idle_test = true;
      last_branch = register.PC;
   }
}

There's always ways to make cases that this can't detect, but you don't need to solve the general purpose case. You can tweak your detector to work with the kind of cases you expect, and tell your users how they should form their idle loops. (An idle loop with several branches obviously wouldn't work, for example, but you could extend the idea to "last 8 branches", etc. if you wanted that feature.)


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 6 posts ] 

All times are UTC - 7 hours


Who is online

Users browsing this forum: Jarhmander and 4 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
Powered by phpBB® Forum Software © phpBB Group