Native 6502 code inducing debugger/emulator functionality

koitsu
Posts: 4201
Joined: Sun Sep 19, 2004 9:28 pm
Location: A world gone mad

Native 6502 code inducing debugger/emulator functionality

Post by koitsu »

Original thread/post that sparked me to create this one: viewtopic.php?p=184854#p184854 (please see my quoted text there for some context)
rainwarrior wrote:"STA $2001" is using an "actual opcode". ;P I don't really understand the advantage of BRK here (it saves 3 bytes but requires an IRQ handler?).
No, it's not implemented like that. I should've been more precise in what I was trying to convey, so I will try my best to here.

BRK as I described, only with said feature in the emulator turned on (default=off), would not act like BRK normally would. Instead, it would act as a communication point (think: trigger) for the emulator to do something -- again, PURELY for debugging or development purposes. The second byte would act as a control byte, and the opcode/operand would not cost any cycles (unless there was some reason it absolutely had to in the emulation core, but there's a legit reason it shouldn't). The IRQ/BRK vector would not be called/utilised when this was enabled.
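
To make the idea concrete, here is a minimal sketch of how an emulator core might special-case BRK when such a feature is switched on. Everything in it is an assumption for illustration: `ToyCPU`, `debug_brk` and `on_debug_command` are hypothetical names, only BRK and NOP are modelled, and the normal BRK path leaves out the stack pushes and flag changes.

```python
from dataclasses import dataclass, field

BRK = 0x00


@dataclass
class ToyCPU:
    """Hypothetical, heavily simplified 6502 core (BRK and NOP only)."""
    memory: bytearray
    pc: int = 0x8000
    cycles: int = 0
    debug_brk: bool = False            # the proposed feature, off by default
    log: list = field(default_factory=list)

    def on_debug_command(self, control: int) -> None:
        # Stand-in for whatever the emulator would do with the control byte.
        self.log.append(f"debug command ${control:02X} at cycle {self.cycles}")

    def step(self) -> None:
        opcode = self.memory[self.pc]
        if opcode == BRK and self.debug_brk:
            # Debug mode: read the control byte, skip both bytes,
            # charge no cycles and never touch the IRQ/BRK vector.
            control = self.memory[(self.pc + 1) & 0xFFFF]
            self.on_debug_command(control)
            self.pc = (self.pc + 2) & 0xFFFF
            return
        if opcode == BRK:
            # Normal behaviour, abbreviated: 7 cycles, jump via $FFFE/$FFFF.
            self.cycles += 7
            self.pc = self.memory[0xFFFE] | (self.memory[0xFFFF] << 8)
            return
        # Every other opcode is treated as a 1-byte, 2-cycle NOP in this toy.
        self.cycles += 2
        self.pc = (self.pc + 1) & 0xFFFF


if __name__ == "__main__":
    mem = bytearray(0x10000)
    mem[0x8000:0x8003] = bytes([0x00, 0x05, 0xEA])   # BRK $05, NOP
    cpu = ToyCPU(mem, debug_brk=True)
    cpu.step()
    print(hex(cpu.pc), cpu.cycles, cpu.log)
```

With `debug_brk` left at its default, the same bytes take the ordinary BRK path, so a ROM carrying these markers behaves as usual on an emulator without the feature enabled.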

You could implement the same thing through something like a write to some MMIO/register address that the emulator would honour. NO$SNS implements exactly this to some level: 21FCh-21FFh Nocash Debug Extension (char_out and 21mhz_timer in no$sns emu). Here's why I dislike that approach, though I see the advantages it offers (a LOT more flexibility, agreed!):

You end up spending actual time/cycles doing load/store operations that would affect the actual code/program behaviour past that point. This kind of feature isn't just about "brand new games/code being written", it's also got reverse-engineering/romhacking in mind -- where in some games you might have maybe 20 bytes of "unused space", just enough for injection of a JMP yourcode + BRK $xx + JMP originalcode (let's not get into semantics about using JSR/RTS instead etc. -- besides the point).
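
As a rough illustration of the byte budget, the sketch below assembles the two pieces of such a patch. The addresses are made up, the helper names are hypothetical, and it ignores relocating whatever original instruction the 3-byte JMP overwrites.

```python
def jmp_abs(addr: int) -> bytes:
    """JMP absolute: opcode $4C followed by a little-endian 16-bit address."""
    return bytes([0x4C, addr & 0xFF, (addr >> 8) & 0xFF])


def brk_with_control(control: int) -> bytes:
    """BRK ($00) followed by its control/signature byte."""
    return bytes([0x00, control & 0xFF])


def build_hook(resume_addr: int, control: int) -> bytes:
    """Bytes placed in the free space: debug BRK, then jump back."""
    return brk_with_control(control) + jmp_abs(resume_addr)


if __name__ == "__main__":
    FREE_SPACE = 0xFFE0      # hypothetical unused ROM area
    RESUME_AT = 0xC126       # hypothetical address to continue at
    site_patch = jmp_abs(FREE_SPACE)        # overwrites 3 bytes at the hook site
    hook = build_hook(RESUME_AT, 0x01)      # lives in the free space
    print(site_patch.hex(" "), "|", hook.hex(" "))
    print("bytes in free space:", len(hook))
```

In this sketch the hook itself needs 5 bytes of free space plus the 3-byte JMP at the hook site, which fits comfortably in a 20-byte gap.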

Using BRK to "shell out to a debugger" is something that's tried/true in the past: GSBug for the IIGS used it exactly that way. My idea is similar, but also to add functionality (through the operand byte) to make the emulator do something and not just drop to the debugger. Quoting the GSBug manual (note last line, re: monitor breaks vs. debugger breaks):
Whenever a break occurs, GSBug takes over the machine. It saves the state of the machine and allocates a 1K block. The debugger beeps to let you know that a break has occurred and displays its version number on entry. All the registers, including the stack pointer and program counter will reflect the state of the machine immediately before the break instruction was executed. The program counter will point to the break instruction, and the stack pointer will not reflect the fact that the return address and processor status were pushed onto the stack when the instruction was executed. To instruct the debugger not to trap breaks, you set monitor breaks instead of debugger breaks. See "Using Breakpoints" later in this chapter for details on how to do this.
There's no reason I picked BRK other than the fact that it's a commonly-unused opcode that has a signature byte, giving it some degree of control. You could use another opcode if you wanted, sure, but the problem on the 6502 (incl. NES -- because now I know people are doing this, shame on them ;-) ) is that people actually use the undocumented/unused opcodes for things. I suppose the NOP/KIL ones would be useful (on the 65816 two common ones were wdm ($42) which was a 2-byte NOP basically, and cop ($02) which had its own vector and was intended for a coprocessor (at least on the Apple IIGS we never had one)). Point is: I don't care what opcode, just that it needs to have some particular operand or control byte after it.

Heck, now that I'm thinking about it, maybe it could just be something like $42 but with several bytes and not just a single-byte operand. Dunno. You get the point though by now I'm sure.
rainwarrior wrote:If you're trying to time within VBlank then you're not doing something you can see in the NES visual output anyway (though you could use an oscilloscope with $4011, $4016, etc. to get a signal across in an alternate way). I was demonstrating the use of $2001 there specifically because it makes visual output on the target hardware.
Yes, and it's something I've talked about in the past being useful too. The downside is, as I described, that it's basically the "only" visual interface a programmer/developer/romhacker person might have to find out what's going on -- and $2001 affects a lot more than just R/G/B intensity, so now you might have to manage/tweak bits 7-5 vs. what bits 4-0 have (in other words: a simple lda/sta is no longer involved).
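
The extra bookkeeping amounts to a mask-and-merge against a shadow copy, since $2001 is write-only. A small sketch of the bit arithmetic, with hypothetical names (`with_emphasis`, `shadow_2001`):

```python
EMPHASIS_MASK = 0b1110_0000   # bits 7-5: colour emphasis
RENDER_MASK = 0b0001_1111     # bits 4-0: greyscale and rendering enables


def with_emphasis(shadow_2001: int, emphasis_bits: int) -> int:
    """Return a $2001 value that keeps bits 4-0 of the shadow copy
    and replaces bits 7-5 with the given 3-bit emphasis value."""
    return (shadow_2001 & RENDER_MASK) | ((emphasis_bits << 5) & EMPHASIS_MASK)


if __name__ == "__main__":
    shadow = 0x1E                                # the game's current setting
    print(hex(with_emphasis(shadow, 0b001)))     # set the lowest emphasis bit
    print(hex(with_emphasis(shadow, 0b000)))     # back to no emphasis
```

On the 6502 side this becomes a load of the shadow value, an AND, an ORA and the store, rather than a bare lda/sta pair.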

As for the oscilloscope... if you think this is practical for a homebrewer then let me know when I can book a flight to Planet Rainwarrior. ;-) We've seen repeated examples of people running out of VBlank time (and in many cases, nobody knowing how many CPU cycles are available in and out of VBlank per NTSC vs. PAL vs. Dendy) and showing up here not sure what's going on. Oscilloscope + dedicated hardware of this sort (plus the understanding of EE) really isn't feasible. Let's stay practical, yeah? There are some emulators now that have a run-until-next-scanline feature, which I think is pretty good for what most people need (I'm excluding extreme cases like blargg's colour palette demo).
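
For reference, the VBlank budget can be estimated from the scanline count and the PPU-to-CPU clock ratio. The figures below are assumptions taken from commonly cited NES timing documentation (341 dots per scanline; 20 VBlank scanlines on NTSC and Dendy, 70 on PAL; 3 PPU dots per CPU cycle on NTSC and Dendy, 3.2 on PAL), so check them before relying on the result. The estimate also ignores NMI entry overhead.

```python
from fractions import Fraction

DOTS_PER_SCANLINE = 341

# region -> (VBlank scanlines, PPU dots per CPU cycle); assumed values
REGIONS = {
    "NTSC": (20, Fraction(3)),
    "PAL": (70, Fraction(16, 5)),
    "Dendy": (20, Fraction(3)),
}


def vblank_cpu_cycles(region: str) -> Fraction:
    """CPU cycles that elapse during the VBlank scanlines of one frame."""
    lines, dots_per_cycle = REGIONS[region]
    return Fraction(lines * DOTS_PER_SCANLINE) / dots_per_cycle


if __name__ == "__main__":
    for region in REGIONS:
        print(f"{region}: about {float(vblank_cpu_cycles(region)):.1f} CPU cycles")
```

Under these assumptions NTSC and Dendy come out at roughly 2273 cycles and PAL at roughly 7459.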
rainwarrior wrote:If you want to time code within an emulator there's a lot of ways to do it. You can use breakpoints. You can trigger LUA from execution points, write instructions (including $2001), or various other triggers, and then use it to gather/process/output your statistics. There's also thefox's custom build of Nintendulator that adds profiling registers at $4020-$403F.
Breakpoints stop the emulator altogether. Clicking Run or Continue over and over gets tedious, especially when you have several breakpoints. I'm fine with debuggers, but I really prefer "printf() debugging". There are pros and cons to both.

Lua requires an emulator that has Lua integrated, *and* that the homebrewer or romhacker know Lua, *and* that they know whatever functions/details are needed to achieve said goal in the emulator. I think this provides the most extensive capabilities, no doubt about it, but (respectfully, as someone who does bits of Lua!) it's asking a lot in comparison to what I describe above -- it puts a lot of weight on both the homebrewer *and* the emulator author equally. I'm trying to keep it simple.

Maybe coming up with a list of desired features that could be achieved during active runtime through unused/undocumented opcodes (and/or their operands or subsequent data bytes) would help?
rainwarrior
Posts: 8731
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Re: Native 6502 code inducing debugger/emulator functionalit

Post by rainwarrior »

koitsu wrote:
rainwarrior wrote:If you're trying to time within VBlank then you're not doing something you can see in the NES visual output anyway (though you could use an oscilloscope with $4011, $4016, etc. to get a signal across in an alternate way). I was demonstrating the use of $2001 there specifically because it makes visual output on the target hardware.
Yes, and it's something I've talked about in the past being useful too. The downside is, as I described, that it's basically the "only" visual interface a programmer/developer/romhacker person might have to find out what's going on -- and $2001 affects a lot more than just R/G/B intensity, so now you might have to manage/tweak bits 7-5 vs. what bits 4-0 have (in other words: a simple lda/sta is no longer involved).

As for the oscilloscope... if you think this is practical for a homebrewer then let me know when I can book a flight to Planet Rainwarrior. ;-)
Let's not argue straw points here. You took my example out of context and argued against it; that example was specifically about something that is easy to do on a real NES. If you want to argue about what's easy to do in an emulator, that's a completely different situation.

I really don't appreciate you making fun of my suggestion that an oscilloscope might be the most appropriate tool for timing things within vblank on the hardware. If you wanna talk about a different context, fine, but I was responding to the context you borrowed from my original example.

...after which I mentioned several different approaches to profiling in the (very different) context of emulators in the other thread. I'll reiterate them here in case they're useful:
  • Writing the emphasis/greyscale bits of $2001 for rough/quick visual timing. Good for playtesting, and effective and easy on the real hardware.
  • Writing to $4011 (audio output) or $4016 (controller pin output), for alternative output from the real hardware if you have appropriate tools.
  • Breakpoints in FCEUX or Nintendulator. You can see the scanline/pixel, and/or cycles since last breakpoint and counter reset, very useful when you need to time one specific routine.
  • Lua response to triggers in FCEUX. This is extremely versatile: you can trigger from execution, writes, reads, or whatever else is appropriate, and at the same time you can use the scripting language to process the timing information and display it in a convenient form. You can even trigger on "harmless" writes to unused memory locations to send information that way. (I think some of blargg's tests do this for a debug text output.)
  • Use an emulator like thefox's Nintendulator DX which has some debugging extensions built in. (Not anywhere near as versatile as scripting, but sometimes more efficient.)
  • Trace logs. Take a log of everything that the CPU did for one or several frames. Maybe write a program to process this information, organize call stacks, time in/out of functions etc. There's a ton of information you can pull from these. The big drawback is that the log files get very large very quickly, so they are usually only useful in short bursts of recording.
I use any and all of these whenever they seem like the most effective tool. It's worth getting to know them all.

In my view the "killer" debugging feature is FCEUX's Lua support. If you want a "debug opcode" you can use something like "STA $FF" or "STA $4020" or something else that happens to be "harmless" for your program. You can add all sorts of debug features through it (e.g. I use it frequently for hitbox visualization, or inspecting sprite data).

Trace logs are an essential tool. There's really no substitute for these.
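
As one small example of processing a trace log, the sketch below counts how often each address was executed to find hot spots. The line format (`$ADDR: bytes  mnemonic`) is an assumed, simplified one and not necessarily what any particular emulator writes; adjust the pattern to the real log.

```python
import re
from collections import Counter

# Assumed format: each line starts with "$XXXX:" giving the program counter.
PC_PATTERN = re.compile(r"\$([0-9A-Fa-f]{4}):")

SAMPLE = """\
$8000: A2 03     LDX #$03
$8002: CA        DEX
$8003: D0 FD     BNE $8002
$8002: CA        DEX
$8003: D0 FD     BNE $8002
$8002: CA        DEX
$8003: D0 FD     BNE $8002
$8005: 60        RTS
""".splitlines()


def hot_addresses(lines, top=3):
    """Return the most frequently executed addresses as (address, count)."""
    counts = Counter()
    for line in lines:
        match = PC_PATTERN.search(line)
        if match:
            counts[int(match.group(1), 16)] += 1
    return counts.most_common(top)


if __name__ == "__main__":
    for address, count in hot_addresses(SAMPLE):
        print(f"${address:04X} executed {count} times")
```

The same loop is a natural place to track JSR/RTS pairs or cycle counters if the log includes them.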

On the other hand, I find the debugger extensions with fake opcodes / registers etc. to be the least useful, because they're generally redundant to other things (especially Lua scripting).


The biggest drawback to FCEUX's Lua scripting is only being able to run one script at once, though you can get around this generally by combining scripts.

Re: Native 6502 code inducing debugger/emulator functionalit

Post by koitsu »

I hope folks find the information in this thread useful.

Re: Native 6502 code inducing debugger/emulator functionalit

Post by rainwarrior »

In case you want to use FCEUX Lua scripting to implement a debug function that runs in response to a particular opcode, here's a very basic example:

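-- Called for every executed instruction (see registerexec below).
function debug_trigger()
	local pc = memory.getregister("pc")
	local opcode = memory.readbyte(pc)
	-- 0xEA is NOP; it stands in for the "debug opcode" here.
	if opcode == 0xEA then
		emu.print("EA triggered!\n")
	end
end

-- main

-- Hook execution across the whole 64 KB address space.
memory.registerexec(0x0000,0x10000,debug_trigger)

-- Keep the script alive, handing control back to the emulator each frame.
while (true) do
	emu.frameadvance()
end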
It's not terribly efficient, because it's running the Lua script for every single executed opcode, but it works, and if your machine is reasonably fast this lack of efficiency isn't a problem (my laptop can handle it fine).

Implementing some kind of "debug register" is much more efficient, because then the Lua script only needs to run in response to specific writes.
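
The difference in cost is easy to see in a toy model. The sketch below is not FCEUX's API; `ToyEmulator`, its hook lists and the $4020 debug register address are assumptions for illustration. It counts how many times a script callback runs under each approach.

```python
DEBUG_REG = 0x4020   # hypothetical "harmless" address used as a debug register


class ToyEmulator:
    """Hypothetical stand-in for an emulator's scripting hooks."""

    def __init__(self):
        self.exec_hooks = []      # called for every executed instruction
        self.write_hooks = {}     # address -> callbacks, called on writes only
        self.callback_runs = 0

    def run(self, trace):
        # trace: (pc, opcode, write) tuples, where write is None or an
        # (address, value) pair produced by that instruction.
        for pc, opcode, write in trace:
            for hook in self.exec_hooks:
                self.callback_runs += 1
                hook(pc, opcode)
            if write is not None:
                address, value = write
                for hook in self.write_hooks.get(address, ()):
                    self.callback_runs += 1
                    hook(address, value)


def make_trace(n_nops=1000):
    """n_nops NOPs ($EA) followed by one STA $4020 ($8D) writing $2A."""
    trace = [(0x8000 + i, 0xEA, None) for i in range(n_nops)]
    trace.append((0x8000 + n_nops, 0x8D, (DEBUG_REG, 0x2A)))
    return trace


if __name__ == "__main__":
    per_opcode = ToyEmulator()
    per_opcode.exec_hooks.append(lambda pc, opcode: None)
    per_opcode.run(make_trace())

    per_write = ToyEmulator()
    per_write.write_hooks[DEBUG_REG] = [lambda address, value: None]
    per_write.run(make_trace())

    print("exec hook callbacks:", per_opcode.callback_runs)
    print("write hook callbacks:", per_write.callback_runs)
```

In this model the exec hook fires once per instruction, while the write hook fires only for the single store to the debug register.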