Has demo code, the routines to call, and documentation on how to calculate exactly when a write will occur. With this, the random latency of NMI is eliminated, so that you are always a known number of CPU cycles into the current frame. Code outside NMI doesn't need to be cycle-timed. This will allow more accurate raster effects in demos and games.
I also wrote a detailed wiki page about how it achieves this synchronization.
Now, I'm still trying to understand exactly how the technique works, but how does this behave on the real console (I'm having some problems that prevent me from using my PowerPak right now, so I can't test it myself)? In Nintendulator it jitters quite a bit, and in Nestopia it jitters a little less (looks like a single pixel to me). Can you really make something always happen at the exact same time every frame (i.e. the demos in the archive don't jitter at all)?
What we need now is for people to try this out with their favorite raster technique, including ones they could never get to work due to timing, and see what is possible.
In a nutshell, we want NMI to occur at the beginning of a frame, rather than delayed until the end of the current instruction. This technique is able to determine how long the interrupted instruction was, and make a compensating delay so that it is as if NMI occurred at the beginning of the frame, without any delay. Note that the posted version can only handle interrupted instructions of two and three cycles, but that I have been able to make it work even when interrupting longer instructions.
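The compensation idea can be illustrated with a toy model. This is a simplified sketch under stated assumptions, not the posted routine: it treats the NMI latency as the number of cycles left in the interrupted instruction, ignores the fixed cost of the interrupt sequence, and does not show how the handler measures the latency. The names `MAX_LATENCY`, `latency` and `compensating_delay` are made up for the illustration.

```python
# Toy model (assumption): the NMI is requested somewhere inside an
# instruction and is only taken once that instruction has finished.
MAX_LATENCY = 3  # longest interrupted instruction the posted version handles


def latency(instr_len, cycles_done):
    """Cycles between the NMI request and the end of the interrupted instruction."""
    return instr_len - cycles_done


def compensating_delay(lat):
    """Extra cycles to burn so that latency + delay always equals MAX_LATENCY."""
    return MAX_LATENCY - lat


def total_delays():
    """Collect latency + compensation for every case in the toy model."""
    totals = set()
    for instr_len in (2, 3):                  # two- and three-cycle instructions
        for cycles_done in range(instr_len):  # where in the instruction NMI hits
            lat = latency(instr_len, cycles_done)
            totals.add(lat + compensating_delay(lat))
    return totals
```

In this model every case ends up with the same total, which is the sense in which the handler behaves as if NMI had no variable delay.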
You're a hero, blargg, congratulations! I was just going to ask you how development of the NTSC version was going, because I hadn't had news for a while, and I was afraid something had turned out not to be doable.
Tokumaru, you're probably missing how much of an improvement what blargg did is over the common synchronization method. Under normal circumstances, the "best" possible margin for a PPU write was about a dozen pixels (by having fully cycle-timed code run from an NMI that interrupts an endless "jmp here" loop). Now the best possible margin is 1 pixel on NTSC and 3 pixels on PAL (so I'll call it 3 pixels, because it's the worst case that counts if you want to do things that work on both NTSC and PAL systems).
So 3 pixels fits perfectly in a tile if you're lucky. This allows you to do:
- Mid-scanline CHR-bankswitching on the BG *WITHOUT* any tiles that are shared between all banks
- Scroll writes mid-scanline *WITHOUT* any blank area. Vertical or horizontal screen splits become perfectly possible (as long as the bottom isn't used, to leave free CPU time before the next VBlank)
- Nametable switching mid-scanline *WITHOUT* any tiles that are shared between both tables
- Do funky clipping effects with $2001 WITHOUT having any shaking horizontally.
- Palette rewrites during HBlank could become much less of a nightmare to get glitch free
- The accuracy of emulators can be tested to a higher degree: because it's possible to write to registers at a very predictable pixel, if an effect takes place too early or too late the result will be visible, and inaccuracies that would otherwise never have been noticed can be corrected.
True, I'll dig up my old "Window" demo that did palette writes during HBlank, but with glitches, and see if I can improve it.
Also, some research about $2007 reads during the frame should be done. Nobody has any clue of what effects those have, and to be honest it's about time some serious info about that would be published.
I'd also say some research about $2003 and $2004 should be done, but blargg nearly went mad last time, so we could just forget about those two and stick to the good old "just sta $4014 and don't ask any other questions".
I plan on trying this in Nestopia tomorrow to be sure it emulates it correctly. Until then, don't assume it's handling it correctly.
And as Bregalad mentioned, this might be possible using $2004 reads. I've done some tests, and made some progress, but still have some issues. If I could get it working, it'd be really awesome. It'd allow synchronizing without having to cycle-time the NMI handler, and it'd be able to handle NMI during any instruction. It's just that $2004 has lots of obscure things, and I'm sure it'll be somewhat different on PAL, unlike this technique, which is exactly the same, other than timing.
I've had techniques for synchronizing this well in the past for emulator test ROMs, but they required cycle-timed code. The breakthrough here is getting the same accuracy from an NMI, without the code outside NMI needing to be cycle-timed, and with only ten instructions of synchronization code in NMI (no loops either).
Because it really is a headache to me. If the number of cycles N is low enough, I go with M = (N-2)/5 and K = (N-2)%5, then do something like this (by hand, not using macros, although it might work with a macro):
Code:
    .if (K=1)
      M = M-1
    .endif
      ldx #M
    - dex
      bne -
    .if (K=1)
      nop
      nop
      nop       ;6 more cycles
    .elseif (K=2)
      nop       ;2 more cycles
    .elseif (K=3)
      lda $ff   ;3 more cycles
    .elseif (K=4)
      nop
      nop       ;4 more cycles
    .endif
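The hand calculation above can be checked with a short script. The sketch below is an illustration, not the poster's tool: it assumes the standard 6502 timings (`ldx #imm` 2 cycles, `dex` 2, `bne` 3 when taken and 2 when not, no page crossing, `nop` 2, `lda` zero page 3) and the function names are invented. With those timings, `ldx #M` plus the loop costs 5M+1 cycles, so this sketch splits N-1 into M and K; if the surrounding code already accounts for an extra cycle, adjust the offset accordingly.

```python
# Padding emitted after the loop for each remainder K (same choices as the post).
PAD = {0: [], 1: ["nop", "nop", "nop"], 2: ["nop"], 3: ["lda $ff"], 4: ["nop", "nop"]}
CYCLES = {"nop": 2, "lda $ff": 3}


def delay_plan(n):
    """Return (m, padding) so that the ldx/dex/bne loop plus padding takes n cycles."""
    m, k = divmod(n - 1, 5)
    if k == 1:
        m -= 1                      # trade one loop pass (5 cycles) for 6 cycles of nops
    if not 1 <= m <= 255:           # ldx #0 would loop 256 times, so m must be 1..255
        raise ValueError(f"{n} cycles cannot be built with this scheme")
    return m, PAD[k]


def plan_cycles(m, padding):
    """Cycle count of: ldx #m / (dex, bne) x m / padding, under the assumed timings."""
    loop = 2 + 5 * m - 1            # the final bne is not taken, saving one cycle
    return loop + sum(CYCLES[instr] for instr in padding)
```

For example, `delay_plan(12)` gives one loop pass followed by three `nop`s, and `plan_cycles` confirms the total. A delay of 7 cycles is rejected because it would need M = 0.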
That's not what I meant. I meant that, by synchronizing with your method, we could read $2004 and $2007 at very predictable times (instead of approximate ones) and reverse engineer more precisely how reads from those registers work.
And speaking of that, Loopy's firefly demo does come to mind as one of the use cases for this: http://home.comcast.net/~olimar/NES/firefly.zip
The demo is MMC5 and uses scanline interrupts for the fly effect, so I don't know how much jitter there is on a real NES... could anyone tell me? I imagine it could be improved quite a bit by this sync code, and it would be cool to see a more compatible NROM version rather than MMC5.
Blargg's solution seems to help with raster effects. In my project, while scrolling up or down, I have to skip a few scanlines to "connect" the content together. This is because the map format used in the original game is not NES-compatible.
I had to use the MMC3 IRQ counter to find the right line and skip a few of them. The only problem is that there was jitter. Because of that, I had to add some nops to fix the scanline at the right position. But somehow I was never able to get the right scanline after this operation.
Would this new finding help in that case?
I don't think so, because resetting the scroll is a fairly quick operation and the time window provided by an HBlank should be more than enough to do it properly. An error of 7 or 8 cycles shouldn't matter in your case.
If you can't properly reset the scroll to the desired location there must be something else wrong. Maybe the IRQs are not behaving as expected and firing at different times in the scanline or something.
I have recently made my game reset the scroll during HBlank and it works flawlessly. You just have to write to $2006 and $2005 (if you want to change the fine Y scroll; if you don't, only $2006 will do the trick) as described in loopy's document and make sure that the second $2006 write falls within HBlank (and also the $2005 write that changes the X scroll, because changes to the fine X scroll appear immediately). HBlank lasts nearly 28 CPU cycles, so making sure that the 4 or 8 cycles of 1 or 2 PPU writes fall within that time shouldn't be such a precise operation.
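As a rough aid for the writes described above, the sketch below computes the values for the commonly documented four-write sequence ($2006, $2005, $2005, $2006) that sets both the coarse and fine scroll mid-frame. The function name and the choice of this particular sequence are assumptions for illustration; the post itself only points to loopy's document.

```python
def split_scroll_writes(x, y, nametable):
    """Return [(register, value), ...] for a mid-frame scroll change.

    x: 0-255, y: 0-239, nametable: 0-3. The last two writes (X scroll and the
    second $2006 write) are the ones that should land inside HBlank.
    """
    if not (0 <= x <= 255 and 0 <= y <= 239 and 0 <= nametable <= 3):
        raise ValueError("scroll position out of range")
    return [
        (0x2006, nametable << 2),                          # nametable select
        (0x2005, y),                                       # fine and coarse Y
        (0x2005, x),                                       # fine X takes effect immediately
        (0x2006, (((y & 0xF8) << 2) & 0xFF) | (x >> 3)),   # low address byte, 8-bit wrap
    ]
```

Precomputing these four bytes before the scanline of interest keeps the time-critical part down to the loads and `sta` writes.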
Now, I don't remember when exactly in the scanline MMC3 IRQs fire, but if it's too close to the start of HBlank there might not be enough time to reset the scroll for the next scanline. If that's the case, it would be better to have the IRQ fire 1 scanline early and waste some time before changing the scroll on the following scanline, a change that will only take effect on the one after that.
I'm hoping it'll allow new mid-frame scroll effects that weren't possible before. At least with NTSC, the jitter is very low.