Mesen Debugger - Feedback/Feature Requests? (2018 edition)

Post by Bananmos » Sun May 03, 2020 7:19 am

Was already planning on eventually doing something like this for the CHR viewer (to allow it to display PRG ROM as tiles), makes sense to have something similar for the sprite viewer, too.

So, here you go! (next appveyor build will have it)
Thanks a lot! Just used this new feature to easily track down an annoying bug that triggered when trying to use a metasprite with all 64 sprites. :)

I did run into a few minor issues however:
* The feature that makes the sprite viewer automatically show the pattern table for sprites as set by $2000 is generally convenient. But my sprite pattern table is at $1000, and I often find myself setting breakpoints in code where $2000 has been temporarily set to zero to disable NMIs. With no way to override this setting, I am left with the wrong tiles showing until I make the debugger stop somewhere else in my code.

* The 2x setting in the PPU viewer looks pretty useful as a means to zoom. But for all the views that stack two pixel images, it ends up being too tall for a 1080p display. OTOH, the widgets that end up on the sides are unaffected by the scaling. Have you considered re-arranging the UI to stack the pixel images horizontally and have the widgets on the bottom? It would solve this problem with 1080p screens for everything but the 4x nametable viewer (but with vertical mirroring as I'm using, the lower one is redundant to me anyway and would best be removed completely in 2x mode).

* Sometimes, I accidentally open a new PPU Viewer window despite already having one open, and this appears to cause Mesen to hang and require a force kill with the Windows task manager.

None of these issues are deal-breakers by any means - just figured I would mention them :)

And this is unrelated to the PPU Viewer, but despite having set the .dbg files to auto-import, Mesen doesn't seem to actually do this. I have to re-load the .dbg file manually every time I do a new build and fire up Mesen.
Do you know what's going wrong here?

The ability to show CHR from PRG-ROM sounds like a neat feature! Ideally, the format would be semi-configurable, as CHR data in PRG-ROM often tends to be stored differently from the native format to make unrolled loops more efficient. For example, a common pattern is to have each row / plane in a separate array, to avoid having to increment / decrement the index registers for every byte copied:
WriteScrambledTile:
    lda tileDataP0R0,x   ; plane 0, row 0 of tile X
    sta $2007
    [...]
    lda tileDataP0R7,x   ; plane 0, row 7
    sta $2007
    lda tileDataP1R0,x   ; plane 1, row 0
    sta $2007
    [...]
    lda tileDataP1R7,x   ; plane 1, row 7
    sta $2007
    rts
But I realise this configurability might be difficult to achieve in practice... and could probably only work if the separated arrays are 256-byte-aligned.
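
To make the layout above concrete, here is a minimal Python sketch of converting between the native tile format and the per-plane/per-row arrays. It assumes the native NES format stores each 8x8 tile as 16 bytes (8 rows of plane 0 followed by 8 rows of plane 1). The function names are hypothetical and this is not how Mesen or any particular game stores its data.

```python
def scramble(chr_data: bytes) -> list[bytes]:
    """Split native CHR tiles into 16 arrays, one per plane/row.

    arrays[k] plays the role of tileDataP{k // 8}R{k % 8}: it holds
    byte k of every tile, indexed by tile number.
    """
    if len(chr_data) % 16 != 0:
        raise ValueError("CHR data must be a whole number of 16-byte tiles")
    return [chr_data[k::16] for k in range(16)]


def unscramble(arrays: list[bytes]) -> bytes:
    """Rebuild native 16-byte tiles from the 16 per-plane/row arrays."""
    tile_count = len(arrays[0])
    out = bytearray()
    for tile in range(tile_count):
        for k in range(16):
            out.append(arrays[k][tile])
    return bytes(out)
```

A viewer that wanted to display such data would need something like `unscramble` plus the base address of each array, which is where the configurability gets awkward.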

Finally, I noticed you just added support for OAM corruption in Mesen. I know it's still a beta feature, but that's really useful! I spent quite a bit of time tweaking my NMI code to stay within the safe hblank region, and have been careful not to touch it since. The event viewer already helps a lot with this, but having the OAM corruption emulated is great to see the effect of it. As expected, the latest forced blanking code I'm prototyping right now causes a flickerfest when turning the feature on... which is a good reminder to myself why it's still prototype code... :P

Post by Sour » Sun May 03, 2020 12:25 pm

I'll add a dropdown/toggle to pick between auto/8px/16px sprites when I have a chance.
Bananmos wrote:
Sun May 03, 2020 7:19 am
* The 2x setting in the PPU viewer looks pretty useful as a means to zoom. But for all the views that stack two pixel images, it ends up being too tall for a 1080p display.
The PPU viewer in general is a bit annoying to deal with, layout-wise. And the 2x zoom option is essentially a patch to allow at least some level of zooming, but as you can tell, it's not ideal.
Ideally, I would probably split the PPU viewer into separate windows (so they are not all forced to be the same size, which limits what I can do quite a bit in terms of layout), remove the 2x zoom options, and instead implement zoom like I did in Mesen-S' viewers (which I ported back to Mesen's event viewer a while ago). Basically, you can zoom freely with ctrl-+/- or ctrl+mouse wheel, and then scroll around the picture with click+drag, which in my opinion works a lot better, is less restrictive, and is much more flexible in terms of layout. Changing all this isn't what I would call trivial, though.
* Sometimes, I accidentally open a new PPU Viewer window despite already having one open, and this appears to cause Mesen to hang and requiring a force kill with the Windows task manager.
Having too many windows open (and having them set to refresh at a high FPS) can sometimes cause the Windows message queue to get filled with draw requests faster than the application can process them.
I've added some logic over on Mesen-S to try to limit the odds of this ever causing a lockup, though (by reducing the refresh speed based on roughly how long the refreshes are taking.) I'll try to copy that fix over to Mesen soon and let you know.
And this is unrelated to the PPU Viewer, but despite having set the .dbg files to auto-import, Mesen doesn't seem to actually do this. I have to re-load the .dbg file manually every time I do a new build and fire up Mesen.
Is the .dbg file called "myrom.dbg" (note the lack of .nes in the name here) for a rom called "myrom.nes"? Is it in the same folder as the "myrom.nes" file?
Finally, I noticed you just added support for OAM corruption in Mesen. I know it's still a beta feature, but that's really useful!
Glad it's already useful to someone! Like you said, this is very much a work in progress and still doesn't quite react exactly like the hardware does for all scenarios. Are you testing on NTSC? Or PAL? I don't think we've tested/confirmed that this behavior occurs for PAL consoles, yet.
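
For build scripts, the .dbg naming rule described above can be checked with a few lines of Python. This is only a sketch of the rule as stated in this post (same folder, ".nes" replaced by ".dbg"); the helper names are hypothetical and this is not Mesen's own lookup code.

```python
from pathlib import PurePath


def expected_dbg_name(rom_path: str) -> PurePath:
    """Replace the ROM's final extension with .dbg, keeping the folder."""
    return PurePath(rom_path).with_suffix(".dbg")


def would_auto_import(rom_path: str, dbg_path: str) -> bool:
    """True if dbg_path is where the rule above says the file should be."""
    return PurePath(dbg_path) == expected_dbg_name(rom_path)
```

With this, a file named "myrom.nes.dbg" is rejected, while "myrom.dbg" next to "myrom.nes" is accepted.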

Post by Bananmos » Mon May 04, 2020 5:52 am

Sour wrote:
Sun May 03, 2020 12:25 pm

Is the .dbg file called "myrom.dbg" (note the lack of .nes in the name here) for a rom called "myrom.nes"? Is it in the same folder as the "myrom.nes" file?
Doh! It was named myrom.nes.dbg... I've updated my build batch script and it seems to work fine now :)
Glad it's already useful to someone! Like you said, this is very much a work in progress and still doesn't quite react exactly like the hardware does for all scenarios. Are you testing on NTSC? Or PAL? I don't think we've tested/confirmed that this behavior occurs for PAL consoles, yet.
It was quite a few years since I did this tweaking to avoid OAM corruption, but it was on an NTSC console. Even longer ago, I primarily used a PAL console, and I'm pretty sure the issue doesn't exist on PAL, as I have no recollection of it ever being a problem.

I think the main gap in PAL emulation is still that DMC cycle stealing is not correctly emulated. If there's ever a suitable test ROM for it, I'd be happy to run it on my PAL console.

The test ROM would likely need to be a bit more elaborate than my old demo I posted before, where I was just relying on tweaking delay loops with visual feedback, and all this was way before Blargg's work on syncing the NMI to rendering.

Post by Bananmos » Sat May 09, 2020 10:04 am

So I'm trying to use the new super-useful assert feature with modulo to check my sprite page index is always divisible by 4, but can't get it to work.

* Doing "assert(Y % 4 == 0)" just plain doesn't do anything. And when I try to type "%" in the debugger, it won't even accept the condition - so I guess the modulo operator is not supported yet?

* Doing the common work-around of a bitwise-and with "assert(Y & 3 == 0)" does seem to trigger the assert in the debugger... but even for values that are clearly multiples of 4. So not sure what's going on here?

Would be great to have some way of doing modulo on my asserted values... :)

Post by Sour » Sat May 09, 2020 10:25 am

Ah, I probably broke the modulo operator back when I added support for binary values (e.g "x == %00010001"). Might have to change the operator for one of them, but I'll try and see if it's simple to keep them both in with the '%' operator.
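
One way a parser can keep both meanings of '%' is to decide by position: after an operand it is the modulo operator, anywhere else it is a binary-literal prefix. The Python sketch below illustrates that idea only. The token set and the function name are hypothetical, and the thread does not say how Mesen actually resolves the ambiguity.

```python
import re

# Hypothetical token set: '==', identifiers, decimal numbers, single-char operators.
TOKEN = re.compile(r"\s*(==|[A-Za-z_]\w*|\d+|[%()&|+\-*/<>])")
BINARY_DIGITS = re.compile(r"[01]+")


def percent_roles(expr: str) -> list[str]:
    """Label each '%' in expr as 'modulo' or 'binary-prefix'."""
    roles = []
    prev_is_operand = False  # True right after an identifier, number, or ')'
    expr = expr.rstrip()
    pos = 0
    while pos < len(expr):
        match = TOKEN.match(expr, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        token = match.group(1)
        pos = match.end()
        if token == "%":
            if prev_is_operand:
                roles.append("modulo")
                prev_is_operand = False
            else:
                # A prefix must be followed directly by binary digits.
                digits = BINARY_DIGITS.match(expr, pos)
                if digits is None:
                    raise ValueError("'%' prefix without binary digits")
                pos = digits.end()
                roles.append("binary-prefix")
                prev_is_operand = True
        elif token == ")" or token[0].isalnum() or token[0] == "_":
            prev_is_operand = True
        else:
            prev_is_operand = False
    return roles
```

For example, `percent_roles("Y % 4 == 0")` reports a modulo, while `percent_roles("x == %00010001")` reports a binary prefix.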

I think the assert(Y & 3 == 0) part is an order of operation problem? It's doing 3 == 0 -> false/0, and then Y & 0, which is always 0, so the assert always triggers. "(Y & 3) == 0" should work, though.
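
The two evaluation orders can be compared directly. This Python sketch spells out the grouping described above with explicit parentheses (Python itself binds & tighter than ==, so the grouping has to be written out); the function names are hypothetical.

```python
def assert_as_parsed(y: int) -> bool:
    """Grouping described above for "Y & 3 == 0": Y & (3 == 0)."""
    return bool(y & int(3 == 0))  # 3 == 0 is false (0), so this is Y & 0


def assert_with_parens(y: int) -> bool:
    """Intended check: (Y & 3) == 0, true when Y is a multiple of 4."""
    return (y & 3) == 0


# Values of Y for which each form of the assert would trigger (condition false).
triggers_as_parsed = [y for y in range(256) if not assert_as_parsed(y)]
triggers_with_parens = [y for y in range(256) if not assert_with_parens(y)]
```

Here `triggers_as_parsed` contains every value from 0 to 255, which matches the assert firing even for multiples of 4, while `triggers_with_parens` contains only the values that are not multiples of 4.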

Post by Bananmos » Sun May 10, 2020 9:05 am

Sour wrote:
Sat May 09, 2020 10:25 am
Ah, I probably broke the modulo operator back when I added support for binary values (e.g "x == %00010001"). Might have to change the operator for one of them, but I'll try and see if it's simple to keep them both in with the '%' operator.
Ah, I see. No worries about using a different operator if it makes things simpler. I'm happy to use something like "Y MOD 4", and keeping the widespread % prefix for 6502 binary numbers is probably more important than the aesthetics of modulo.
Sour wrote:
Sat May 09, 2020 10:25 am
I think the assert(Y & 3 == 0) part is an order of operation problem? It's doing 3 == 0 -> false/0, and then Y & 0, which is always 0, so the assert always triggers. "(Y & 3) == 0" should work, though.
Ah, the old ordering gotcha! Shall use parentheses more carefully in the future :)

Post by za909 » Sat May 16, 2020 10:39 am

Hi, I'd like to make a suggestion. During my DPCM-PCM endeavors I really could've used some of the still hidden stats of the APU, namely the 8-bit DPCM buffer state, which if I am not mistaken is implemented as a bit shifter, so there is no separate bits-remaining counter. Seeing the state of this would've made determining timings a bit easier, since I had no idea when the buffer was going to run out of bits and start an IRQ/DMC DMA read.

Post by Sour » Tue May 19, 2020 7:43 pm

Bananmos wrote:
Sun May 10, 2020 9:05 am
Ah, I see. No worries about using a different operator if it makes things simpler. I'm happy to use something like "Y MOD 4", and keeping the widespread % prefix for 6502 binary numbers is probably more important than the aesthetics of modulo
The modulo operator should be fixed as of the latest commit/appveyor builds (both this and binary notation still use %) - let me know if you still have issues with it.
za909 wrote:
Sat May 16, 2020 10:39 am
During my DPCM-PCM endeavors I really could've used some of the still hidden stats of the APU, namely the 8-bit DPCM buffer state, which if I am not mistaken is implemented as a bit shifter, so there is no separate bits remaining counter.
Thanks for the suggestion. At this point, I'm likely to eventually scrap the APU viewer and replace it with a much more versatile tool like the register viewer that exists in Mesen-S, which would easily allow me to show a lot more state (for the APU or other components, e.g maybe some specific common mappers, etc.), without it also being a nightmare in terms of UI/layout (it doesn't help that Mono on Linux sizes everything a bit differently, which makes this more painful still). I'll keep this in mind for when I do replace the APU viewer, though.

Post by Bananmos » Sat Jun 27, 2020 5:38 pm

The modulo operator should be fixed as of the latest commit/appveyor builds (both this and binary notation still use %) - let me know if you still have issues with it.
Sorry for late reply... modulo operator works great for me now!

But there's a few other things that have been annoying me lately in the otherwise-awesome debugger:

1) It appears I cannot get local labels to work in Mesen's watch window. ca65 local labels use a @ prefix, and it looks like Mesen doesn't recognise these as they always come out as <invalid expression>. This means missing out on a lot of the potential of the variables watch window, as I need to keep entering and remembering the numerical address of variables instead.

2) Another problem with ca65 local vars is that the Mesen debugger's disassembly will frequently show the wrong version of a local variable. i.e., if you have one subroutineA declaring "@foo = TEMP" and subroutineB declaring "@bar = TEMP", then subroutineB may end up showing an "sta @foo" in the disassembly where the source had "sta @bar". If it's not fixable, then I would actually prefer for the disassembly to just show the raw numerical values, as seeing the wrong variable name is quite distracting / confusing.

3) The reverse-debugging feature is often a life-saver, but I can't seem to get it to work reliably. Sometimes I'll click it and it'll backstep an instruction just as I expect. But other times, the click will take me as far back as the reset time. Do you know what might be causing this intermittent bug?

4) Speaking of the reversible debugging, it would also be awesome if there was a simple way to backtrack more than a single instruction, just like the forward debugging can run one scanline or one frame. The assert feature has been of great use to me, but I often need to backtrack for a long time to get to the root cause of the assert, because I can't find a way to execute backwards on a coarser granularity.

Lastly, just want to say a big thanks for all the improvements done so far to the best NES debugger there ever was! A lot of the more complicated coding I've done recently would have taken several days more of debugging effort without them :)

Post by Bananmos » Sun Jun 28, 2020 11:53 am

Oh, a while back you also asked me about how well the OAM corruption feature worked. I've just tried using it today, and had positive but still-kind-of-mixed results with it.

My sprite scaling code had introduced a regression in OAM corruption, and before fixing it the results on my NTSC NES and Mesen matched up pretty much perfectly. Have a look at this video which has my HiDef-NES and HDTV in the top part, with Mesen running on my laptop at the bottom:

https://photos.app.goo.gl/kVaU2bv8f469hNEJ8

I then used Mesen's event viewer to re-organise the writes to $2001, and quickly ended up with a version that worked in Mesen - but was still buggy on the real HW:

https://photos.app.goo.gl/wbYao2n5jYfbajfr9

Finally, after some more tweaking to have all the $2001 writes happen around dot 330 (which appears to only corrupt sprites 2-5), I got it looking flawless on both the HW and Mesen:

https://photos.app.goo.gl/C6pyoxe2DquSLpcW8

So looks like the OAM corruption emulation in Mesen might still need some tweaks. Nevertheless, it's already a good sanity check :)

Post by Fiskbit » Sun Jun 28, 2020 2:56 pm

OAM corruption emulation in Mesen is known to be incomplete. We believe it's roughly accurate for the case where you turn rendering off mid-frame and turn it back on in vblank, but if you turn it off mid-frame somewhere around dots 65 to 256 and then you turn it back on mid-frame, there is additional corruption on some CPU/PPU alignments that is not understood well enough to emulate. I can't tell in your video exactly what's going on in the event viewer, so I'm not sure what the specifics of your case are. If you're not turning rendering on mid-frame, I'd be very interested in understanding why Mesen differs from hardware.

Edit: Looking closer at the videos, I see some tiles flashing lower on the screen which leads me to believe that you were indeed hitting the 65-256 issue; the emulated issue simply copies the contents of OAM row 0 into the victim row, while this unemulated issue causes some kind of data corruption of the victim row that results in tiles moving to unique positions. We were discussing this on the nesdev Discord not too long ago and I worked out that performing the write to disable rendering anywhere in the dot range 318-340 (as reported in the Mesen event viewer) should be safe regardless of when you reenable rendering. While this range hasn't been thoroughly tested, I didn't see any evidence of corruption there in my oam_flicker_test_reenable test ROM. I'm a little surprised that you're saying you see sprite corruption when disabling around dot 330, but since it sounds like it's only corrupting rows 1 and 2, perhaps you have enough variance in your timing that sometimes your write is landing as late as dot 3.

Post by Bananmos » Sun Jun 28, 2020 3:45 pm

Fiskbit wrote:
Sun Jun 28, 2020 2:56 pm
OAM corruption emulation in Mesen is known to be incomplete. We believe it's roughly accurate for the case where you turn rendering off mid-frame and turn it back on in vblank, but if you turn it off mid-frame somewhere around dots 65 to 256 and then you turn it back on mid-frame, there is additional corruption on some CPU/PPU alignments that is not understood well enough to emulate. I can't tell in your video exactly what's going on in the event viewer, so I'm not sure what the specifics of your case are. If you're not turning rendering on mid-frame, I'd be very interested in understanding why Mesen differs from hardware.

Edit: Looking closer at the videos, I see some tiles flashing lower on the screen which leads me to believe that you were indeed hitting the 65-256 issue; the emulated issue simply copies the contents of OAM row 0 into the victim row, while this unemulated issue causes some kind of data corruption of the victim row that results in tiles moving to unique positions. We were discussing this on the nesdev Discord not too long ago and I worked out that performing the write to disable rendering anywhere in the dot range 318-340 (as reported in the Mesen event viewer) should be safe regardless of when you reenable rendering. While this range hasn't been thoroughly tested, I didn't see any evidence of corruption there in my oam_flicker_test_reenable test ROM. I'm a little surprised that you're saying you see sprite corruption when disabling around dot 330, but since it sounds like it's only corrupting rows 1 and 2, perhaps you have enough variance in your timing that sometimes your write is landing as late as dot 3.
Exactly where can I find the oam_flicker_test_reenable ROM?

I'm equally surprised to hear the claim that there's a 100% safe area to enable / disable rendering, because when I tried this a few years back and determined that 330 was the sweet spot, there was still corruption on sprites 2-3 that I couldn't get rid of by any means, so I just determined that was an ok sacrifice for using forced blanking to get more VRAM bandwidth :)

This time around, it seems my code was a little bit more off and sprites 2-5 were affected - but I wasn't trying as hard as I simply wanted to get rid of the flicker in this test ROM.

If there is indeed a 100% safe enable/disable location to show all 64 sprites at arbitrary scanlines, I'd be really keen to use the technique! While sacrificing a few sprites isn't the end of the world it's always felt a bit inconvenient, especially as the issue doesn't even exist on PAL machines.

Post by Fiskbit » Sun Jun 28, 2020 4:21 pm

You can find oam_flicker_test.nes and oam_flicker_test_reenable.nes in this and the following post. Assuming the safe region is indeed safe, a problem with it is that it's only just barely large enough to fit the variance in timing if you use IRQs, so you have to be very precise. Micro Machines is an example of a game that turns rendering off and on mid-frame in the safe region without encountering any corruption that I'm aware of.

I haven't done any testing on PAL, so this is the first I've heard that the problem doesn't occur on PAL PPUs, but I'm not surprised. Also, this issue was actually introduced in later NTSC PPUs, though I haven't found yet exactly where it was introduced. I know E-0 and later do have it, and A and B do not. My guess is that it was introduced in either E-0 or D-0, and that D and earlier don't have it.
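
To put numbers on how tight the window is, here is a small Python sketch. It uses the dot range 318-340 quoted earlier in the thread and the NTSC ratio of 3 PPU dots per CPU cycle. The jitter value passed in is an assumption to be replaced with whatever spread you actually observe in the event viewer, and the function names are hypothetical.

```python
NTSC_DOTS_PER_CPU_CYCLE = 3
SAFE_FIRST_DOT = 318  # range quoted in the thread, not thoroughly tested
SAFE_LAST_DOT = 340


def window_cpu_cycles(first_dot: int = SAFE_FIRST_DOT,
                      last_dot: int = SAFE_LAST_DOT) -> float:
    """Width of the safe window, expressed in CPU cycles."""
    return (last_dot - first_dot + 1) / NTSC_DOTS_PER_CPU_CYCLE


def write_stays_safe(earliest_dot: int, jitter_cycles: int) -> bool:
    """True if a write landing at earliest_dot, or up to jitter_cycles
    CPU cycles later, never leaves the safe window on that scanline."""
    latest_dot = earliest_dot + jitter_cycles * NTSC_DOTS_PER_CPU_CYCLE
    return SAFE_FIRST_DOT <= earliest_dot and latest_dot <= SAFE_LAST_DOT
```

The window is 23 dots, or a little under 8 CPU cycles, so with an assumed jitter of 7 cycles the earliest write has to land almost exactly at dot 318 to stay inside it.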

Post by Sour » Tue Jun 30, 2020 1:33 pm

Bananmos wrote:
Sat Jun 27, 2020 5:38 pm
But there's a few other things that have been annoying me lately in the otherwise-awesome debugger:
I think 1) and 2) are probably the same issue here? As far as I can tell, labels that start with a @ work properly in the watch window. However, Mesen cannot properly deal with multiple labels pointing to the same address, so in your @foo = TEMP and @bar = TEMP example, it will only keep either one of the labels (e.g @foo) and use that one everywhere, making it impossible to use @bar in the watch, and making all occurrences of @bar in the disassembly show up as @foo.

There is no easy way around these issues - fixing them would require Mesen labels to have a concept of label scope (e.g like CA65 has), which would require a lot of effort/changes (I tried implementing this once in the past and quickly realized it required far too much effort). The watch window/expressions/etc. also have no direct knowledge of CA65's symbols, so they rely on the (more limited) labels that are created based on the CA65 symbols. Potential "solutions" might be:
-Disabling the import of labels completely in the integration settings - this would remove all labels from the disassembly though, which isn't great either.
-Concatenating all label names that point to the same address together - this could very quickly become unusable for frequently re-used memory addresses, and isn't very intuitive.
-Using source view and just ignoring the disassembly view altogether - but this still leaves issues with not being able to use the watch window 100% reliably. Fixing this would require making the watch aware of the ca65 symbols, which in turn would require the C++ core to know about them, since the watch relies on expressions, which are evaluated in the C++ core (CA65 integration is entirely a UI-side feature at the moment, and I would prefer to keep it this way if possible).
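
The one-label-per-address limitation and the concatenation option above can be sketched in a few lines of Python. This is an illustration only: which alias wins, the "/" separator, and the function names are all assumptions, not Mesen's actual behavior.

```python
def one_label_per_address(symbols: list[tuple[str, int]]) -> dict[int, str]:
    """Keep a single label per address; here the first symbol seen wins."""
    labels: dict[int, str] = {}
    for name, address in symbols:
        labels.setdefault(address, name)
    return labels


def concatenated_labels(symbols: list[tuple[str, int]]) -> dict[int, str]:
    """Join every alias of an address into one combined label."""
    aliases: dict[int, list[str]] = {}
    for name, address in symbols:
        aliases.setdefault(address, []).append(name)
    return {address: "/".join(names) for address, names in aliases.items()}


# Hypothetical example: TEMP at $10, aliased by two local symbols.
symbols = [("@foo", 0x10), ("@bar", 0x10)]
```

With the first function, `@bar` disappears entirely, so it can neither be watched nor shown in the disassembly. With the second, the combined name grows with every alias of a frequently reused address.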

As for 3/4, step back is definitely not perfect - I've tried and fixed issues with it several times, but more always get reported. Basically it loads the most recent save state stored in the rewind cache, and then replays the input for however many frames needed, until it reaches the instruction that ran before the instruction that was about to run. In the past there have been issues when used in the first ~30 frames after a reset/power cycle, or after loading a state, for example, but I think those should mostly be fixed, although I think Fiskbit reported something with regards to that again, not too long ago. I'll try to take a look, but unless I can find the conditions to reproduce it, it might be somewhat tricky to fix.
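
The load-and-replay approach described above can be shown with a toy model in Python. Everything here is a simplification under stated assumptions: the "emulator" is a counter plus an accumulator, a frame is a fixed number of instructions, and the target instruction is found by instruction count. None of it is Mesen's actual rewind code.

```python
from dataclasses import dataclass

INSTRUCTIONS_PER_FRAME = 4  # toy value; real frames are far longer


@dataclass(frozen=True)
class State:
    instr: int  # instructions executed so far
    acc: int    # toy machine state that depends on controller input


def step(state: State, pad: int) -> State:
    """Execute one toy instruction using the input of the current frame."""
    return State(state.instr + 1, (state.acc + pad) & 0xFF)


def step_back(save_states: list[State], inputs: list[int],
              current: State) -> State:
    """Return the state one instruction before `current` by loading the
    most recent save state and replaying the recorded input."""
    target = current.instr - 1
    candidates = [s for s in save_states if s.instr <= target]
    if not candidates:
        raise ValueError("no save state old enough to step back from")
    state = max(candidates, key=lambda s: s.instr)
    while state.instr < target:
        frame = state.instr // INSTRUCTIONS_PER_FRAME
        state = step(state, inputs[frame])
    return state
```

The sketch also shows why the feature is fragile: if the save state that gets picked is much older than expected, or the replay does not reproduce the original run exactly, the result lands somewhere other than one instruction back.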

Step back is a feature that originally took me an hour to get to work for the most part, and then forced me to waste an additional 10-20 hours on debugging issues with it :P (and there are more issues, beyond this - e.g stepping back will screw up the call stack window, for example)

So with all that said, 3) is basically why I'm not too keen on doing 4), in the sense that adding more features to an already somewhat buggy feature isn't great, heh. Other people have also asked to be able to rewind to the previous breakpoint, for example, and that's also not too trivial for other reasons. The current way the step back works makes it limited to stepping back anywhere between 0 instructions and 30 full frames, but never more, due to its reliance on the rewinder's code, which is another thing that might need to be fixed before adding any more functionality to it.

Post by tepples » Wed Jul 01, 2020 2:20 pm

Sour wrote:
Tue Jun 30, 2020 1:33 pm
fixing them would require Mesen labels to have a concept of label scope (e.g like CA65 has), which would require a lot of effort/changes (I tried implementing this once in the past and quickly realized it required far too much effort)
Compare Game Boy, where BGB can recognize which scope a label is under. For example, in 144p Test Suite, it gives the name memcpy.loop for a cheap local label .loop belonging to the label memcpy. Under ca65, you could display memcpy@chrloop if it's defined as a cheap local label. Or, if defined in a .proc (as seen in the tech demo of the ca83 macro pack), you'd get memcpy::loop.
Sour wrote:
Tue Jun 30, 2020 1:33 pm
The watch window/expression/etc also have no direct knowledge of CA65's symbols, so they rely on the (more limited) labels that are created based on the CA65 symbols.
Per the ca65 Users Guide, a symbol defined with := is marked as a label.