All times are UTC - 7 hours





PostPosted: Thu May 16, 2019 5:54 pm 
Offline

Joined: Thu Apr 18, 2019 9:13 am
Posts: 160
lidnariq wrote:
supercat wrote:
the only arrangements that can't be resolved easily are those that would have less than 1552 [cycles] at the end after everything else is accounted for
Er, that's my point. The threshold is 1428 cycles, not 1554. Because 1552 is achievable. Admittedly it's comparatively expensive, because that bit period of 214 cycles means busy-waiting for almost two scanlines in the subsequent IRQ, but it's still achievable.

I suppose, given that you have this level of precision, sometimes one might prefer to use IRQs where p1=p2 to skip busy-waiting to save on CPU time, and only use the more precise ones near the end of the frame to achieve PPU synchronization.


The minimum duration for a typical ISR is going to be 54 or 72 cycles because of the need to perform the second rate write after the first reload. A typical raster interrupt would need some time padding to meet that, and could thus allow quite a bit of room for adjustment as to when time-critical stores are taken without having to increase its total execution time. The only thing that needs to be a specific number is the total time for a frame; everything else can be compensated at little or no extra cost.


PostPosted: Thu May 16, 2019 6:54 pm 
Offline

Joined: Sun Apr 13, 2008 11:12 am
Posts: 8533
Location: Seattle
supercat wrote:
The minimum duration for a typical ISR is going to be 54 or 72 cycles because of the need to perform the second rate write after the first reload.
The point I'm trying to get at is that usually one would want to avoid using a large value for "p2" (the second value written in the ISR) because it would impose a problematically huge overhead cost to the following ISR. But, if that next ISR has "p1" (the first value written in the ISR) the same as "p2", the wait for the second write can be skipped, and the second ISR has minimal overhead.

I'm only talking about the "middle" ISRs, the ones that work together to get the desired IRQ at the right time. The ones that actually are used for raster effects have an entirely different cost calculation.

I'm just not certain how often this would come up. So far all I've done is write a naïve depth-first search, which is distinctly the wrong approach for finding an optimal set of values to write to $4010.


PostPosted: Thu May 16, 2019 7:34 pm 
Offline

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 21560
Location: NE Indiana, USA (NTSC)
The loss of one CPU cycle for each 6 frames (one dot per two frames) while rendering is turned off when decompressing a new scene into the pattern tables and nametables might prove problematic. How do you solve that while keeping NMI off?
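As a rough check of the drift figure quoted above, the arithmetic can be sketched as follows. This assumes the usual NTSC numbers (341 dots per scanline, 262 scanlines per frame, 3 dots per CPU cycle, and one dot skipped on every other frame while rendering is enabled); the function and constant names are illustrative only.

```python
from fractions import Fraction

# Assumed NTSC timing constants (not stated in the thread itself).
DOTS_PER_SCANLINE = 341
SCANLINES_PER_FRAME = 262
DOTS_PER_CPU_CYCLE = 3


def cpu_cycles_per_frame(rendering_enabled):
    """Average CPU cycles per frame, as an exact fraction."""
    dots = Fraction(DOTS_PER_SCANLINE * SCANLINES_PER_FRAME)
    if rendering_enabled:
        # One dot is skipped on alternate frames: half a dot per frame on average.
        dots -= Fraction(1, 2)
    return dots / DOTS_PER_CPU_CYCLE


# Extra CPU cycles per frame that elapse when rendering is off.
drift_per_frame = cpu_cycles_per_frame(False) - cpu_cycles_per_frame(True)
print(drift_per_frame)  # 1/6, i.e. one CPU cycle every six frames
```

Under the same assumptions, two rendered frames come to 59561 CPU cycles on average.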



PostPosted: Thu May 16, 2019 9:11 pm 
Offline

Joined: Thu Apr 18, 2019 9:13 am
Posts: 160
tepples wrote:
The loss of one CPU cycle for each 6 frames (one dot per two frames) while rendering is turned off when decompressing a new scene into the pattern tables and nametables might prove problematic. How do you solve that while keeping NMI off?


The code has two types of frames--long and short. Presently it simply uses a counter to generate one long frame for each short frames, but it could just as well add 3 to a counter for every short frame which had video enabled, subtract 9 for every long frame which has video enabled, show long frames when the counter is positive and short frames when it is negative. If it did that, then code could compensate for blanked frames by ensuring that rendering was always disabled for a multiple of two frames, and add an extra count for each pair of frames.
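A minimal sketch of the counter scheme described above, for frames with video enabled only. The post does not say what happens when the counter is exactly zero; this sketch assumes a short frame is shown in that case, and the function name and step parameters are hypothetical.

```python
def schedule_frames(n_frames, short_step=3, long_step=9, counter=0):
    """Return a list of "short"/"long" frame choices driven by a drift counter.

    Assumption: a counter of exactly zero selects a short frame.
    """
    frames = []
    for _ in range(n_frames):
        if counter > 0:
            # Counter is positive: show a long frame and subtract 9.
            frames.append("long")
            counter -= long_step
        else:
            # Counter is zero or negative: show a short frame and add 3.
            frames.append("short")
            counter += short_step
    return frames


print(schedule_frames(8))
```

With these step sizes and that zero-case assumption, the schedule settles into one long frame followed by three short ones. Compensating for blanked frames would then amount to adding the extra count to `counter` before resuming.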

Alternatively, one could use raster splits to blank most of the screen without blanking the part that controls that weird half-dot drop. This may require subdividing the task of putting data into CHRAM into chunks that can fit within the various blanking intervals, but on the flip side would allow the game to show the player something while it was preparing the next level.

I think the simplest approach is probably to have the IRQ use the sequence of ten IRQs of length 428*8, eight of length 380*8, one of length 54*8, and one of length 54*7+190 (59560 cycles) and add 7 to the counter for every time it repeats the sequence. Provided the screen isn't blanked for more than about a quarter second, the code would generate enough long frames to regain sync. If blanking times could be long, one could make the IRQ use a sequence of length 59562 and subtract 5 from the counter any time the counter is positive, but I think that wouldn't be necessary if the screen is only blanked for a few frames.
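The quoted total can be checked with a few lines of arithmetic. This sketch takes the longest bit period to be 428 cycles, the slowest NTSC DMC rate, which is the value that reproduces the 59560-cycle figure; the names are illustrative only.

```python
# (number of IRQs, length of each IRQ in CPU cycles)
IRQ_SEQUENCE = [
    (10, 428 * 8),       # ten IRQs at the slowest rate
    (8, 380 * 8),        # eight IRQs at the next rate
    (1, 54 * 8),         # one IRQ at the fastest rate
    (1, 54 * 7 + 190),   # one IRQ with a different final bit period
]


def sequence_total(sequence):
    """Total CPU cycles covered by a list of (count, length) entries."""
    return sum(count * length for count, length in sequence)


def sequence_irq_count(sequence):
    """Number of IRQs in the sequence."""
    return sum(count for count, _ in sequence)


print(sequence_total(IRQ_SEQUENCE), sequence_irq_count(IRQ_SEQUENCE))
```

That gives 20 IRQs covering 59560 cycles.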


PostPosted: Thu May 16, 2019 9:25 pm 
Offline

Joined: Thu Apr 18, 2019 9:13 am
Posts: 160
lidnariq wrote:
I'm just not certain how often this would come up. So far all I've done is write a naïve depth-first search, which is distinctly the wrong approach for finding an optimal set of values to write to $4010.


My approach was simply to keep a list of how best to compute each value, and then go through the array and for each value, fill in all of the higher values that can be computed from it in a way that's shorter than any way yet found for them.
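A rough sketch of that table-filling idea, not the actual code used. It assumes each IRQ lasts 7*p1 + p2 cycles (matching the "54*7+190" form used earlier in the thread), uses the NTSC DMC period table, measures "shorter" as fewer IRQs, and ignores any restriction on how large p2 may be. All names are hypothetical.

```python
NTSC_DMC_PERIODS = [428, 380, 340, 320, 286, 254, 226, 214,
                    190, 160, 142, 128, 106, 84, 72, 54]


def irq_lengths(periods=NTSC_DMC_PERIODS):
    """Map each single-IRQ length to one (p1, p2) pair that produces it."""
    lengths = {}
    for p1 in periods:
        for p2 in periods:
            # Assumed model: seven bits at p1, final bit at p2.
            lengths.setdefault(7 * p1 + p2, (p1, p2))
    return lengths


def fewest_irqs(target, lengths=None):
    """Return a shortest list of (p1, p2) pairs totalling target, or None."""
    if lengths is None:
        lengths = irq_lengths()
    # best[v] = (IRQ count, previous value, pair used), or None if unreachable.
    best = [None] * (target + 1)
    best[0] = (0, None, None)
    for value in range(target + 1):
        if best[value] is None:
            continue
        count = best[value][0]
        # Fill in every higher value reachable from this one in fewer IRQs.
        for step, pair in lengths.items():
            higher = value + step
            if higher > target:
                continue
            if best[higher] is None or count + 1 < best[higher][0]:
                best[higher] = (count + 1, value, pair)
    if best[target] is None:
        return None
    # Walk the back-pointers to recover the sequence of writes.
    plan = []
    value = target
    while value != 0:
        _, previous, pair = best[value]
        plan.append(pair)
        value = previous
    plan.reverse()
    return plan


print(fewest_irqs(428 * 8 + 54 * 7 + 190))
```

Because every step is positive and values are visited in ascending order, each entry is final by the time it is used to fill in higher ones.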

To support the cases where p1 and p2 are both large, but the next step has p1 and p2 equal, one should keep a list of the best way to compute each value with the last p2 being small, and the best way if the last p2 doesn't need to be small. The one with p2 small could add values where p1==p2 and where p1!=p2, while those where p2 is big could only add in those where p1==p2.

In any case, unless one cares about IRQ durations during blanking, I don't think any of the "tricky" values are needed.


PostPosted: Fri May 17, 2019 7:45 am 
Offline

Joined: Thu Apr 18, 2019 9:13 am
Posts: 160
tepples wrote:
The loss of one CPU cycle for each 6 frames (one dot per two frames) while rendering is turned off when decompressing a new scene into the pattern tables and nametables might prove problematic. How do you solve that while keeping NMI off?


If you want to try to actively maintain sync, that can be done by using the sequence "LDA $2002 / BIT $2002" timed so that vsync should hit just after the load and nudge the frame timing if it doesn't. The problem I see with that is that unless one can control what instructions the main line is running, the 7 cycle uncertainty of when the IRQ or NMI will get processed would lead to considerable timing uncertainty. Although I haven't figured out the best way to establish initial sync, it should be possible to synchronize things within a few frames so that the timing of DMC events would be known precisely even though the timing of individual interrupts would have 7 cycles of jitter. If one tries to take measurements of whether one is ahead or behind within an interrupt whose timing isn't certain, that timing uncertainty would be added to one's measurements.

Perhaps the right approach would be to actively synchronize the DMC and display whenever the game is paused, but otherwise just run with the DMC. It's possible to align the CPU with the DMC, but it's expensive. The cost wouldn't matter during pause, however, and if the raster gets out of sync a player could fix the problem by pausing and unpausing the game.

