I've been poking around with NES development ever since Bob Rost taught his class at CMU in '04. I've graduated from nbasic to full-out assembly, but still haven't taken on anything in assembly as grand as a full game. I've got a smaller, more humorous demo in the final stages for the Famicom's 25th anniversary this month, but I just got this little beauty presentable the other night.
It's not raster-effects-and-DMC-saw-waves technical, but it still took a handful of custom math routines. I may inline the math a little later, which would definitely improve speed.
Enjoy. I'm kinda curious what a real NES dev community thinks of it after bouncing it off a couple friends and a general emulation forum...
http://www.disgruntleddesigner.com/chri ... ngine.html
The ASM code is free for you to check out too. Hu6280 and 6502 ASM are quite similar, so even though the instructions might not be the same, the program logic is.
Ironic that last year, I couldn't find a single example of fractal code on the 6502, and now there are two!
Truth be told, after cranking this out in a week and change, my brain doesn't want to jump right back into 65xx ASM just yet. It would much rather finish up Star Ocean 3 and possibly start roughing out a new C++ game.
The block preview mode in your TG codebase is a nice idea- I might rig something similar in a later version of mine. On the whole, though, I'm certain my code is slower. Case in point, here's my 16-bit multiply:
```
mul16:
    stx mul16_xcache
    lda mul16Flag1
    ora mul16Flag2
    and #%00000010
    beq mul16_no_overflow
    sta mul16Flag2 ; what? You want I should preserve the negative flag on a junk call?
mul16_no_overflow:
    ; basic shift-and-add method
    ; keep halving mul*1 and popping bits off mul*2
    ; if the bit off mul*2 is a 1, add the remaining mul*1 to result
    lda #0
    sta lsh16Flag
    sta rsh16Flag
    sta add16Flag1
    sta add16Flag2
    sta add16Hi2
    sta add16Lo2
    lda mul16Hi1
    sta rsh16Hi
    lda mul16Lo1
    sta rsh16Lo
    lda mul16Hi2
    sta lsh16Hi
    lda mul16Lo2
    sta lsh16Lo
    clc
    jsr rsh16 ; since the highest power place in mul*2 is 1/2
    ldx #0
mul16_loop:
    jsr lsh16 ; which pops the shifted-out bit into carry
    bcc mul16_loop_no_add ; so we can act on it right away
    lda rsh16Hi
    sta add16Hi1
    lda rsh16Lo
    sta add16Lo1
    jsr add16
mul16_loop_no_add:
    jsr rsh16
    inx
    cpx #15 ; after 15 rshs, we're guaranteed to have 0 in the rsh input
    bne mul16_loop
    ; visual break to bookend the loop
    lda add16Hi2
    sta mul16Hi2
    lda add16Lo2
    sta mul16Lo2
    lda mul16Flag1
    eor mul16Flag2
    sta mul16Flag2 ; safe, since we know we can't have overflowed, so only the sign flags might be unequal, producing a negative
    ldx mul16_xcache
    rts
; for reference, the above math is done on non-2's-complement 16-bit values,
; highest place being 1/2, lowest being 1/64k, with a flag byte consisting of
; 6 unused bits followed by an overflow flag and a negative flag
```
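For anyone who wants to sanity-check the routine off-target, here is a rough Python model of the same shift-and-add scheme on sign-magnitude 0.16 fixed-point values. It's only a sketch: the names `mul16_model`, `OVERFLOW`, and `NEGATIVE` are made up for illustration, and the early return on an overflowed input is an assumption about the intended behavior, not something the listing above spells out.

```python
OVERFLOW = 0b10  # flag bit 1, matching the `and #%00000010` test in mul16
NEGATIVE = 0b01  # flag bit 0 (assumed from "an overflow flag and a negative flag")


def mul16_model(mag1, flag1, mag2, flag2):
    """Multiply two sign-magnitude 0.16 values; return (magnitude, flag)."""
    if (flag1 | flag2) & OVERFLOW:
        # Assumption: a junk (overflowed) input just propagates the overflow flag.
        return 0, OVERFLOW

    shifted = (mag1 & 0xFFFF) >> 1  # first rsh16: the top bit of mag2 is worth 1/2
    remaining = mag2 & 0xFFFF
    total = 0
    for _ in range(15):  # same trip count as the `cpx #15` loop
        carry = remaining & 0x8000  # the bit lsh16 would shift out into carry
        remaining = (remaining << 1) & 0xFFFF
        if carry:
            total += shifted  # add16 of the current partial product
        shifted >>= 1  # rsh16 for the next, less significant bit
    # With no overflow, only the sign bits can differ between the two flags.
    return total, (flag1 ^ flag2) & NEGATIVE
```

In this model 0.5 x 0.5 (`0x8000` times `0x8000`) comes out as `0x4000`, i.e. 0.25. Each partial product is truncated before it's added, so `0xFFFF` times `0xFFFF` gives `0xFFEF` rather than the `0xFFFE` you'd get from truncating the exact 32-bit product once at the end.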
I also have a nice little restraining order in there called itersPerNMI, which I've set quite low indeed for the sake of the music. Come to think of it, I should reset my counter in the NMI routine rather than the Mandelbrot loop, since that's not the only place I ever waitNMI... *changes code* ... great. Now it chugs even more. I could just dec the address rather than dey and reload y, to catch what are probably NMIs that occur just before I'd wait for an NMI, but that would cost 5 cycles in my inner loop as opposed to 2... bleh. Clearly more work is needed.
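To put the 5-versus-2 trade-off in perspective, here's a back-of-envelope Python sketch. The cycle counts for `dey` (2) and a zero-page `dec` (5) and the roughly 29780.5 CPU cycles per NTSC frame are standard figures; `BASE_LOOP_CYCLES` is a made-up placeholder, not a measurement of the actual inner loop, and the sketch ignores time spent in the NMI handler itself.

```python
NTSC_CPU_CYCLES_PER_FRAME = 29780.5  # approximate NTSC NES CPU cycles per video frame

DEY_CYCLES = 2     # dey (implied addressing)
DEC_ZP_CYCLES = 5  # dec on a zero-page address

BASE_LOOP_CYCLES = 100  # hypothetical inner-loop cost, counter update included


def iterations_per_frame(cycles_per_iteration):
    """Whole inner-loop iterations that fit in one frame's worth of CPU time."""
    return int(NTSC_CPU_CYCLES_PER_FRAME // cycles_per_iteration)


with_dey = iterations_per_frame(BASE_LOOP_CYCLES)
with_dec = iterations_per_frame(BASE_LOOP_CYCLES - DEY_CYCLES + DEC_ZP_CYCLES)
print(with_dey, with_dec)
```

With that placeholder cost, the extra 3 cycles drop the count from 297 to 289 iterations per frame. The shorter the real loop, the bigger the relative hit.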
When I allow as much frameskip as is needed to crunch out an entire tile before actively waiting, iirc it runs a good deal faster. But the music hiccoughs something fierce.
Is that an iso I see with FractalEngine? Meaning I could run it on my actual TurboDuo? Crazy talk! 8) I'll have to get by on Nestopia and good sense for mine unless I can scrounge a dev cart and/or EEPROM burner.
edit: new version uploading as I type. My iterations-per-frame counting was way off, so my wait calls were eating a lot of time. I decided to nix the whole iteration-counting deal and instead just let frameskips happen and update the music as needed. The result cuts runtime to 75% of what it used to be.
Controls are documented in the included readme(s). Each time you press a button, it starts redrawing the screen with the revised parameters (even if no revision was actually made, e.g. you try to zoom out from 1x zoom or pan off the edge of the visible space). Just be patient. Also, unless you pump up the render depth by about 10 (Start button), you won't get much more detail zoomed in than zoomed out. That's just the nature of the beast- to get more detail, you need to crunch more cycles, which means things draw slower. I set the default depth to one which would show the whole fractal at passable detail at 1x within the first full iteration of the bgm. That said, I also experienced some control glitchiness under PC Nestopia that wasn't there under OSX Nestopia, so I may need some more PC and/or hardware testing to polish this thing up.
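The depth-versus-detail trade-off is easy to see in a generic escape-time sketch. This one is in Python with ordinary floating-point complex numbers rather than the fixed-point math the ROM uses, and `escape_count` is a hypothetical name, so treat it as an illustration of the idea and not as the demo's actual loop.

```python
def escape_count(c, depth):
    """Return the iteration at which z escapes |z| > 2, or None if it
    has not escaped within `depth` iterations (so it gets drawn as 'inside')."""
    z = 0j
    for i in range(depth):
        z = z * z + c
        if abs(z) > 2.0:
            return i + 1
    return None


# A point just outside the set, near the "neck" at -0.75:
near_boundary = -0.75 + 0.1j
print(escape_count(near_boundary, 16))  # None: too shallow to tell it apart from the set
print(escape_count(near_boundary, 64))  # an iteration count: deeper render resolves it
```

Points far from the set escape in a handful of iterations at any depth. It's the ones hugging the boundary, which is exactly what you're looking at when zoomed in, that need the extra iterations and hence the extra cycles.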