Just released version 0.2.0
This version is mostly accuracy/timing fixes and a lot of debugger tool improvements (+ DSP support).
Congratulations! Looks great :D
Many thanks to everybody in this thread for helping out! I think I've had more help fixing emulation bugs on Mesen-S in 3 months than I have on Mesen in 3 years :p
It's seriously long overdue that we had a fresh face in SNES emulation. I hope you'll stick around long-term and that we'll make lots of new hardware discoveries together ^-^
The more we figure out about the hardware, and the more quality open-source implementations we have, the easier it'll get for more new emudevs to enter SNES emulation and quickly get up to speed, as is the case with the NES scene currently.
I dream of the day we have a drop-in PPU core similar to blargg's DSP that has the accuracy we have in NES emulators. I'll seriously consider the SNES well-preserved and be able to retire feeling content once we reach that point.
and maybe try to implement the same kind of CPU overclocking that NES emulators use and see how that works out.
I'm interested in this as well. The obvious trick of adding more Vblank scanlines would bias the timing (frame rate) and thus require us to clock the CPU faster to compensate, which may interfere with timed raster effects as with Air Strike Patrol and lots of buggy games that overshoot blanking periods. Only making the CPU faster during Vblank lines is likely the best strategy, right?
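To make the "only faster during Vblank" idea concrete, here's a rough sketch of a per-scanline CPU clock budget. Everything in it is an assumption for illustration: the names (`cpuBudgetForLine`, `vblankMultiplier`) are made up, and the constants are simplified NTSC numbers that ignore short/long lines and overscan. It is not taken from any emulator.

```cpp
#include <cassert>
#include <cstdint>

// Simplified NTSC numbers (assumptions for this sketch; real hardware has
// short/long lines and an overscan mode that moves the start of Vblank).
constexpr uint32_t kClocksPerLine = 1364;  // master clocks per scanline
constexpr uint32_t kLinesPerFrame = 262;
constexpr uint32_t kVblankStart   = 225;   // first Vblank line, overscan off

inline bool inVblank(uint32_t line) {
  return line >= kVblankStart && line < kLinesPerFrame;
}

// Master clocks the CPU is allowed to run during one scanline.
// Visible lines keep the stock budget so raster effects stay aligned;
// Vblank lines get the budget multiplied (hypothetical overclock knob).
inline uint32_t cpuBudgetForLine(uint32_t line, uint32_t vblankMultiplier) {
  assert(vblankMultiplier >= 1);
  return inVblank(line) ? kClocksPerLine * vblankMultiplier : kClocksPerLine;
}

// Total CPU clocks per frame; the frame length seen by video/audio
// is unchanged because no scanlines are added.
inline uint64_t cpuBudgetForFrame(uint32_t vblankMultiplier) {
  uint64_t total = 0;
  for (uint32_t line = 0; line < kLinesPerFrame; ++line) {
    total += cpuBudgetForLine(line, vblankMultiplier);
  }
  return total;
}
```

With a multiplier of 2, the CPU gets 37 extra lines' worth of clocks per frame while every visible line keeps its normal budget. Games that overshoot Vblank would still run at stock speed once they spill into visible lines, which is the point.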
This makes it sound like the most painful part of the SA-1 is integrating it with the rest of the system in terms of bus accesses, etc.?
The most painful for performance, yeah. Snes9X came up with an approximation that's not too bad if you want to focus on performance. But if you're going for all-out accuracy (^_^) then a different approach will be needed.
This is the gold standard test ROM for it: https://github.com/VitorVilela7/SnesSpeedTest
The design of the SA1 is ingenious and evil: the CPU cannot be stalled because the SNES CPU has no concept of external wait states (/DTACK on the Genesis, for instance.) So instead, the SA1 detects when the SA1 CPU tries to access ROM, BWRAM, or IRAM while the SNES CPU is accessing it, and will insert wait states into the SA1 CPU. The obvious million-dollar question is, what happens if the SA1 is in the middle of reading from one of those when the CPU comes in to try? As far as I can tell, the answer is it just lets it finish doing the read and somehow there's enough headroom to let everything still work.
The way I do it is for every CPU read/write to set a "MAR" (memory address register) variable to hold the current state of the address bus pins. This also has to be done for DMA/HDMA accesses. It took a whole lot of trial and error to get the best results in Vitor's test ROM, but I still don't have it perfect.
https://github.com/byuu/higan/blob/mast ... ry.cpp#L11
It doesn't seem? to invalidate the address bus pins on idle cycles (but much more likely, the reason is that idle cycles tend to set the address bus to the current program counter, which is still going to stall out the SA1 because it doesn't seem to recognize it's an idle cycle.)
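A stripped-down sketch of the MAR idea, for anyone following along. This is not the higan code: the type and function names are hypothetical, address decoding is left to the caller, and the penalty value is a placeholder rather than a measured number.

```cpp
#include <cassert>
#include <cstdint>

// Memory the SNES CPU and the SA1 CPU can fight over.
enum class Region { None, ROM, BWRAM, IRAM };

// Placeholder wait-state count; the real penalty is not established here.
constexpr unsigned kConflictPenalty = 1;

struct SharedBus {
  // Region currently selected by the SNES-side address bus (the "MAR").
  Region snesTarget = Region::None;

  // Call on every SNES CPU read/write, and on DMA/HDMA accesses too.
  void snesAccess(Region region) { snesTarget = region; }

  // Idle cycles do not clear the MAR in this model: the address bus tends
  // to hold the program counter, so the region containing PC still counts.
  void snesIdle(Region regionOfPc) { snesTarget = regionOfPc; }

  // Cycles an SA1 access takes, including wait states on a conflict.
  unsigned sa1Access(Region region, unsigned baseCycles) const {
    assert(baseCycles > 0);
    const bool conflict = region != Region::None && region == snesTarget;
    return baseCycles + (conflict ? kConflictPenalty : 0);
  }
};
```

The SA1 side only pays when it touches the same region the SNES side has on its address pins, so code running from IRAM while the SNES CPU fetches from ROM is unaffected in this model.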
https://github.com/byuu/higan/blob/mast ... ng.cpp#L30
This one's going to suck to emulate. Past logic analyzer traces done on SNES refresh show it actually looks like 5-cycle read + 3-cycle idle, repeated five times for 40 cycles total. Just having that pulse run for 40 clocks gave me less precise timing matches than breaking it into five sections. I rounded to 6-2 x5 because everything else steps by 2, and even forcing 5-3 at a good speed hit (having to step the IRQ counter at 21.4MHz instead of 10.7MHz) didn't improve the results any. It turns out that the SA1 carts connect the DRAM refresh pin to the SA1 CPU, and it really does affect the timing.
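Here's a tiny sketch of that segmented refresh, just to show the difference between the 5-3 and 6-2 patterns. The struct and function names are made up for illustration, and it only models whether the refresh is in its read phase at a given clock offset, nothing about how the SA1 reacts.

```cpp
#include <cassert>

// One refresh = `repeats` segments of `active` read clocks + `idle` clocks.
struct RefreshPattern {
  unsigned active;
  unsigned idle;
  unsigned repeats;
};

constexpr RefreshPattern kTraced{5, 3, 5};   // what the logic analyzer shows
constexpr RefreshPattern kRounded{6, 2, 5};  // rounded so everything steps by 2

constexpr unsigned totalClocks(RefreshPattern p) {
  return (p.active + p.idle) * p.repeats;
}

// True if the refresh is in a read phase `offset` clocks after it starts.
inline bool refreshActive(RefreshPattern p, unsigned offset) {
  assert(offset < totalClocks(p));
  return offset % (p.active + p.idle) < p.active;
}
```

Both patterns add up to 40 clocks; they only disagree on one clock per 8-clock segment (offset 5, 13, 21, ...), which lines up with the report that forcing 5-3 didn't change the test results.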
https://github.com/byuu/bsnes/blob/mast ... rom.cpp#L2
To avoid destroying performance completely, I added a setting to bypass the memory address stall and synchronizations on ROM, BWRAM, IRAM accesses. This of course is not accurate and fails Vitor's test, but it also kind of acts like an SA1 overclock so ... call it a feature! ^-^;;
Once you've implemented this let me know, and then I'll dive into the nitty gritty stuff I haven't fully been able to figure out yet :3
I might try to do the SA-1 first (rather than the Super FX)
I went the same route. It's a good strategy even if the SA1 ends up being more complex as a whole.
But for now, I'm taking a step back from this for a few days - I need a break!
It's well earned. No rush, see you when you're back.
Side note, I just fixed my own bug in Donkey Kong Country 2, it appears the game is messing with sprite registers mid-scanline? Didn't dig too deeply, but I had to revert my tile/item caching so that one line renders from the previous line's cache. Might be worth taking a look with Mesen-S just to make sure the sprite timing changes didn't affect it.