All times are UTC - 7 hours





Posted: Wed Jan 11, 2017 6:15 pm

Joined: Sun Feb 07, 2016 6:16 pm
Posts: 166
thefox wrote:
There's a bug in the Visual 2C02 OAM DMA: viewtopic.php?p=169373#p169373 (it does not actually seem to always corrupt the source address to 0, unlike what I said in that post; instead it seems to depend on the value written and the hibyte of "ab": spr_addr = value_written AND hibyte(ab)).
Yea, that seems to be what DK is doing - the OAM data matches the zero page. Isn't the high byte of AB always $40 in this particular scenario?
cpow seemed to be convinced these bugs were not there originally - is there any Git/SVN repository available for the Visual 2A03 that I might be able to use to figure out when the bugs appeared?

Eugene.S wrote:
There are different versions of scanline.nes
Ah, I wasn't aware there were 2 versions, thanks for pointing that out. But the version I'm using is definitely supposed to be gray.

The contents of the palette RAM don't match what I get in Mesen for both the scanline test & DK.
My assembly skills are terrible, so I may have messed up, but I expected this to roughly fill palette RAM with $01 to $20 (mirroring aside):
Code:
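  lda #$3F
  sta $2006    ; PPUADDR high byte
  lda #$00
  sta $2006    ; PPUADDR low byte -> $3F00
  ldx #$01     ; first value to write
  ldy #$1F     ; loop counter
loop:
  stx $2007    ; write X to PPUDATA (address auto-increments)
  inx
  dey
  bne loop     ; loop until Y reaches $00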

Instead I got this:
Code:
17 08 09 0A 1B 0C 0D 0E 1F 10 11 12 13 14 15 16
17 18 19 1A 1B 1C 1D 1E 1F 00 01 02 13 04 05 06
When I look at the trace, I see the PPU setting AB to $3F01 and DB to $01 at one point, which is off by 1 (should be $3F00?), but the content written to the actual RAM doesn't match. It seems to be writing to whatever AB value the PPU is set to outside of those "writes" (typically somewhere around the 3F30-3F4F range). So I guess there might be something wrong with palette writes too (although I would have to test the palette writes on the Visual 2C02 to make sure this isn't a bug specific to my code).
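For comparison, here is a rough Python sketch of what that loop would be expected to leave in palette RAM on a correctly behaving PPU. It assumes only the usual palette mirroring ($3F10/$3F14/$3F18/$3F1C aliasing $3F00/$3F04/$3F08/$3F0C) and 6-bit palette entries; the function names are made up for illustration and are not taken from Mesen or the simulator.

```python
def palette_index(addr):
    """Map a PPU address in $3F00-$3FFF to one of the 32 palette slots."""
    index = addr & 0x1F
    # $3F10/$3F14/$3F18/$3F1C are mirrors of $3F00/$3F04/$3F08/$3F0C.
    if (index & 0x13) == 0x10:
        index &= 0x0F
    return index


def run_fill_loop(count=0x1F, first_value=0x01):
    """Model the stx $2007 / inx / dey / bne loop starting at $3F00."""
    palette = [None] * 32          # None = never written by the loop
    addr = 0x3F00
    value = first_value
    for _ in range(count):         # ldy #$1F with dey/bne runs 31 times
        palette[palette_index(addr)] = value & 0x3F  # entries are 6 bits wide
        addr += 1
        value = (value + 1) & 0xFF
    return palette


def dump(palette):
    """Return the 32 bytes as they would be read back from $3F00-$3F1F."""
    return [palette[palette_index(0x3F00 + i)] for i in range(32)]


if __name__ == "__main__":
    values = dump(run_fill_loop())
    for row in (values[:16], values[16:]):
        print(" ".join("--" if v is None else f"{v:02X}" for v in row))
```

Under those assumptions the mirrored slots end up holding the later writes ($11, $15, $19, $1D), and $3F1F is never written by the loop, so its value depends on whatever was there before.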


Posted: Wed Jan 11, 2017 7:19 pm

Joined: Wed Jun 15, 2016 11:49 am
Posts: 21
This is really cool stuff Sour! I'm already impressed at the speed this is running at and what you are getting it to do. I'll definitely be keeping track of the progress and look forward to making use of it.

Good Luck!


Posted: Wed Jan 11, 2017 7:53 pm

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3192
Location: Mountain View, CA, USA
@Sour That code looks fine, but it's not going to write to $3f1f: once dey brings Y to $00, the bne is not taken, so the loop body only runs 31 times. Using bpl instead should work: the branch keeps being taken until the negative flag in P is set, which happens when Y goes from $00 to $FF, so the loop runs a 32nd time and PPU RAM $3f1f=$20.

As for the rest of the PPU RAM values: I'm fairly certain PPU palette mirroring plays a role here (with regards to what values you end up seeing in PPU RAM). Can't really help with the "internal behaviour" aspect.
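To make the loop-count difference concrete, here is a small Python sketch that counts how many times the loop body runs for dey/bne versus dey/bpl. It only models Y and the Z/N flags as dey sets them; the function name is hypothetical.

```python
def count_loop_iterations(start_y, branch):
    """Count executions of a 'body / dey / <branch> loop' style loop."""
    y = start_y & 0xFF
    iterations = 0
    while True:
        iterations += 1              # loop body (stx $2007 / inx)
        y = (y - 1) & 0xFF           # dey wraps from $00 to $FF
        zero = (y == 0)              # Z flag after dey
        negative = (y & 0x80) != 0   # N flag after dey
        if branch == "bne":
            taken = not zero
        elif branch == "bpl":
            taken = not negative
        else:
            raise ValueError("unsupported branch: " + branch)
        if not taken:
            return iterations


if __name__ == "__main__":
    print(count_loop_iterations(0x1F, "bne"))
    print(count_loop_iterations(0x1F, "bpl"))
```

With Y starting at $1F, the bne version runs the body 31 times and the bpl version 32 times. Note that bpl only works as a counter like this while the starting Y is below $80.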


Posted: Thu Jan 12, 2017 9:29 am

Joined: Mon Jan 03, 2005 10:36 am
Posts: 2778
Location: Tampere, Finland
Sour wrote:
thefox wrote:
There's a bug in the Visual 2C02 OAM DMA: viewtopic.php?p=169373#p169373 (it does not actually seem to always corrupt the source address to 0, unlike what I said in that post; instead it seems to depend on the value written and the hibyte of "ab": spr_addr = value_written AND hibyte(ab)).
Yea, that seems to be what DK is doing - the OAM data matches the zero page. Isn't the high byte of AB always $40 in this particular scenario?
cpow seemed to be convinced these bugs were not there originally - is there any Git/SVN repository available for the Visual 2A03 that I might be able to use to figure out when the bugs appeared?

IIRC "ab" was related to the value of PC (probably due to the fact that right after the $4014 write the CPU still has time to fetch the first byte of the next instruction before RDY is deasserted by the DMA unit). You can try this by executing LDA #$FF / STA $4014 at $100. spr_addr should start as $FF and then get corrupted to $01.
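As a quick illustration of the relation quoted above (spr_addr = value_written AND hibyte(ab)), here is a Python sketch. The specific "ab" values are assumptions for the sake of the example: $0105 stands in for the address of the next opcode fetch when LDA #$FF / STA $4014 sits at $100, and $4014 stands in for the case where the high byte of "ab" is $40.

```python
def corrupted_dma_page(value_written, ab):
    """Page the buggy DMA ends up using: value_written AND hibyte(ab)."""
    return value_written & ((ab >> 8) & 0xFF)


if __name__ == "__main__":
    # Code running from page $01: $FF is corrupted to $01.
    print(f"{corrupted_dma_page(0xFF, 0x0105):02X}")
    # If hibyte(ab) were $40, a write of $02 would collapse to page $00.
    print(f"{corrupted_dma_page(0x02, 0x4014):02X}")
```

The second case is only there to show why a game writing a low page number could end up copying from the zero page if the high byte of "ab" were $40; the $02 is a hypothetical value, not something read out of DK.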

I don't think there's a repository or change history of Visual 2A03 publicly available, but with some luck Quietust might have one.

_________________
Download STREEMERZ for NES from fauxgame.com! — Some other stuff I've done: kkfos.aspekt.fi


Posted: Thu Jan 12, 2017 4:33 pm

Joined: Sun Feb 07, 2016 6:16 pm
Posts: 166
koitsu wrote:
@Sour That code looks fine, but it's not going to write to $3f1f: once dey brings Y to $00, the bne is not taken, so the loop body only runs 31 times. Using bpl instead should work: the branch keeps being taken until the negative flag in P is set, which happens when Y goes from $00 to $FF, so the loop runs a 32nd time and PPU RAM $3f1f=$20.
Thanks, that probably explains the $00 value in the palette RAM. Not quite sure why the addresses are wrong though. Maybe my test is wrong, considering DK seems to be able to write to the correct indexes (although about half of the values are incorrect in that case).

thefox wrote:
I don't think there's a repository or change history of Visual 2A03 publicly available, but with some luck Quietust might have one.
I just went ahead and checked on archive.org for both the 2A03 & 2C02 - the oldest copy was from May 1st 2013. The 2A03 had no significant change (all node definitions are unchanged as far as I could tell). The 2C02 had some changes to node definitions, apparently related to sprite position, but that wouldn't explain the DMA bug (since it's a 2A03 issue). Maybe the bug has always been there?


Posted: Thu Jan 12, 2017 5:00 pm

Joined: Sun Sep 19, 2004 10:59 pm
Posts: 1348
One somewhat important thing to consider is that the Visual 2A03 and Visual 2C02 use slightly different versions of ChipSim - I believe Visual 2A03 uses the same version as Visual 6502, but Visual 2C02 uses different logic to resolve groups of "floating" nodes (whereby it considers the area of each node to determine whether the group goes high or low) to fix $2007 writes and Sprite DRAM refreshes.

Of course, there are also several bugs in the Javascript versions of Visual 2A03/2C02 that I've simply never gotten around to fixing...

_________________
Quietust, QMT Productions
P.S. If you don't get this note, let me know and I'll write you another.


Posted: Thu Jan 12, 2017 8:34 pm

Joined: Sun Feb 07, 2016 6:16 pm
Posts: 166
Quietust wrote:
whereby it considers the area of each node to determine whether the group goes high or low
At the moment, both chips use this logic. Do you think the CPU would still run correctly with it, since it's technically just an extra level of precision on the simulation?

Also, I decided to try running some CPU test roms to see how it fares:
Code:
branch_timing_tests:
  1.Branch_Basics  -Pass
  2.Backward_Branch  -Pass
  3.Forward_Branch  -Pass

instr_misc:
  01-abs_x_wrap  -Pass
  02-branch_wrap  -Pass

instr_test-v3:
  01-implied  -Pass
  02-immediate  -Fail (both 69 ADC #n and E9 SBC #n failed)
  10-stack  -Pass  (This one took 117m half clocks to complete, probably took over an hour to run)
  11-jmp_jsr  -Pass
  12-rts  -Pass
  13-rti  -Pass
  14-brk  -Pass
  15-special  -Pass
So far it seems pretty good, but not quite perfect. I'll keep running some in the background and try to get most of the CPU-related tests done eventually (some take a long time to run, so it may take a while). At least that way we'll have an idea of what works and what doesn't. Not sure if that would help in actually finding and fixing bugs, though, and unfortunately I have very little hope of being able to fix these myself.


Posted: Fri Jan 13, 2017 6:16 pm

Joined: Sun Feb 07, 2016 6:16 pm
Posts: 166
I've been running tests for the past 10 hours or so (running 3 copies of the simulator at once). I didn't run every single test, since some take a ridiculously long time to run (a test that normally takes 1 second takes about 1 hour; one of them took about 5-6 hours to complete).

The CPU seems to be working correctly in most cases ($4014 writes aside).
ADC/SBC (and RRA/ISC which reuse their logic) are bugged (I imagine the "carry" part of the operation might not be working properly?) - this has the potential to break other tests if they are used.
The APU seems to be working as well (irq_flag failed, and dmc dma's behavior seems to be slightly incorrect)

The PPU is harder to judge: with the $4014 bug, sprite-related tests will all fail.
The palette RAM test passed, but I'm fairly sure there is something wrong with the palette in general.
The background color that gets output seems to always use $3F0F instead of $3F00 (so the lower 4 bits are inverted - incorrect wiring maybe?), among other things.

Hopefully these results can eventually be useful in trying to fix the visual 2A03/2C02 - I'm not sure there is anything I can do beyond this, though.

Edit: Also updated the download link to include the latest build (better speed, fixes, and some UI improvements)

Code:
blargg_apu_2005.07.30:
  01.len_ctr: Pass
  02.len_table: Pass
  03.irq_flag: FAIL ($06 - "Writing $00 or $80 to $4017 doesn't affect flag")
  04.clock_jitter: Pass
  05.len_timing_mode0: Pass
  06.len_timing_mode1: Pass
  07.irq_flag_timing: Pass
  08.irq_timing: Pass
  09.reset_timing: Pass
  10.len_halt_timing: Pass
  11.len_reload_timing: Pass

blargg_ppu_tests_2005.09.15b:
  palette_ram: Pass
  power_up_palette: FAIL (expected it to fail)
  sprite_ram: FAIL ($06 - "$4014 DMA copy doesn't work at all")
  vbl_clear_time: Pass
  vram_access: Pass
 
branch_timing_tests:
  1.Branch_Basics: Pass
  2.Backward_Branch: Pass
  3.Forward_Branch: Pass

cpu_interrupts_v2:
  1-cli_latency: Pass
 
dmc_dma_during_read4:
  dma_2007_read: FAIL? (Outputs: 11 22, 11 22, 11 22, 11 22, 33 44 - 4AEFDE12)
  dma_2007_write: Pass
  double_2007_read: FAIL? (Outputs: 22 33 44 55 66, 02 33 44 55 66, 31D9ED83)
  read_write_2007: Pass

instr_misc:
  01-abs_x_wrap: Pass
  02-branch_wrap: Pass
  03-dummy_reads: Pass
  04-dummy_reads_apu: Pass

instr_test-v3:
  01-implied: Pass
  02-immediate: FAIL (69 ADC, E9 SBC)
  03-zero_page: FAIL (65 ADC, E5 SBC, 67 RRA, E7 ISC)
  04-zp_xy: FAIL (75 ADC, F5 SBC, 77 RRA, F7 ISC)
  05-absolute: FAIL (6D ADC, ED SBC, 6F RRA, EF ISC)
  06-abs_xy: FAIL (7D ADC, 79 ADC, FD SBC, F9 SBC, 7F RRA, FF ISC, 7B RRA, FB ISC)
  07-ind_x: FAIL (61 ADC, E1 SBC, 63 RRA, E3 ISC)
  08-ind_y: FAIL (F1 SBC, 71 ADC, 73 RRA, F3 ISC)
  09-branches: Pass
  10-stack: Pass
  11-jmp_jsr: Pass
  12-rts: Pass
  13-rti: Pass
  14-brk: Pass
  15-special: Pass
 
oam_read: FAIL (Displays mostly stars)

ppu_sprite_hit:
  01-basics: FAIL ("Flag isn't working at all" - Most likely caused by broken $4014 writes)

ppu_sprite_overflow:
  01-basics: FAIL ("Should clear flag at end of VBL" - Not sure what is causing this)
 
read_joy3:
  count_errors_fast: FAIL (because no controller is connected - need to emulate a standard controller and try again)
 
test_apu_2:
  test_1: Pass
  test_2: FAIL (might be normal - apparently can also fail on NES based on cpu-ppu alignment)
  test_3: Pass
  test_4: Pass
  test_5: Pass
  test_6: FAIL (not sure if this is normal - test 6 was originally affected by alignment, but it sounded like it was fixed?)
  test_7: Pass
  test_8: Pass
  test_9: Pass
  test_10: Pass
  test_11: Pass

The OAM read test looked like this:
Attachment: oamread.png


Posted: Fri Jan 13, 2017 7:17 pm

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3192
Location: Mountain View, CA, USA
All the ADC/SBC operations failing is interesting. Quite possibly someone didn't implement two's complement correctly? These two opcodes are the #1 pain point, opcode-wise, for emulator authors. Just the first thing that comes to mind.


Posted: Fri Jan 13, 2017 7:52 pm

Joined: Sun Sep 19, 2004 10:59 pm
Posts: 1348
koitsu wrote:
All the ADC/SBC operations failing is interesting. Quite possibly someone didn't implement two's complement correctly? These two opcodes are the #1 pain point, opcode-wise, for emulator authors. Just the first thing that comes to mind.

There's one distinct possibility: as I originally mentioned when I released the Visual 2A03, the 6502 core I used is a direct copy of the Visual 6502 which has working decimal mode, so if the D flag somehow got set, then I would expect lots of ADC/SBC test failures.
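To show why a stray D flag would break ADC/SBC tests written for the 2A03, here is a simplified Python sketch contrasting binary and decimal addition. It is an assumption-laden model: it only handles valid BCD operands and only returns the accumulator and carry, so it does not reproduce the NMOS 6502's N/V/Z behaviour in decimal mode or its results for invalid BCD digits.

```python
def adc(a, operand, carry_in, decimal):
    """Return (result, carry_out) for ADC in binary or (simplified) BCD mode."""
    if not decimal:
        total = a + operand + carry_in
        return total & 0xFF, int(total > 0xFF)

    # Decimal mode: add the two BCD digits separately, carrying at 10.
    low = (a & 0x0F) + (operand & 0x0F) + carry_in
    high = (a >> 4) + (operand >> 4)
    if low > 9:
        low -= 10
        high += 1
    carry_out = 0
    if high > 9:
        high -= 10
        carry_out = 1
    return (high << 4) | low, carry_out


if __name__ == "__main__":
    print(adc(0x09, 0x01, 0, decimal=False))  # binary: $0A, no carry
    print(adc(0x09, 0x01, 0, decimal=True))   # BCD: $10, no carry
```

The same inputs give $0A in binary mode and $10 in decimal mode, which is the kind of mismatch a test expecting 2A03 behaviour would report as a failure.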

_________________
Quietust, QMT Productions
P.S. If you don't get this note, let me know and I'll write you another.


Posted: Fri Jan 13, 2017 11:41 pm

Joined: Sun Feb 07, 2016 6:16 pm
Posts: 166
The decimal flag isn't set at startup, and the tests don't set it anywhere in their code - so it doesn't look like that would be it.

I just spent 30+ minutes trying a lot of combinations of ADC #$xx (with and without the carry flag set) and couldn't find any that didn't set the flags as expected or gave the wrong result. I think I'll try to recompile blargg's test with only the ADC portion of the test, trace the value of A and the flags at each step, and then compare that with an emulator's trace.


Posted: Sat Jan 14, 2017 7:18 am

Joined: Mon Jan 03, 2005 10:36 am
Posts: 2778
Location: Tampere, Finland
Sour wrote:
The decimal flag isn't set at startup, and the tests don't set it anywhere in their code - so it doesn't look like that would be it.

Make sure it also doesn't get set by PLP. It really does seem like the most likely culprit in this case.

(How difficult would it be to disable the decimal mode in Visual 6502 in the same way that they disabled it in 2A03? Wasn't it something like one wire cut or a transistor removed?)

_________________
Download STREEMERZ for NES from fauxgame.com! — Some other stuff I've done: kkfos.aspekt.fi


Posted: Sat Jan 14, 2017 7:42 am

Joined: Sun Feb 07, 2016 6:16 pm
Posts: 166
thefox wrote:
Make sure it also doesn't get set by PLP. It really does seem like the most likely culprit in this case.
And it looks like you're probably correct - forgot that was even possible.
The test roms seem to be doing this at some point:
Code:
lda #$FF
sta in_p
[...]
lda in_p
pha
[...]
plp
So it's pretty likely the decimal flag is on during some of the tests (although "cld" is called at one point in the code).
I'll replace the $FF with $F7 in the rom and see if that changes anything.
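For reference, a tiny Python sketch of why $FF versus $F7 matters here: the decimal flag is bit 3 ($08) of the status byte that plp pulls. The helper name is just for illustration.

```python
# 6502 status register (P) bit masks.
FLAG_C = 0x01  # carry
FLAG_Z = 0x02  # zero
FLAG_I = 0x04  # interrupt disable
FLAG_D = 0x08  # decimal mode
FLAG_V = 0x40  # overflow
FLAG_N = 0x80  # negative


def decimal_set_after_plp(pulled_value):
    """True if pulling this byte into P with plp would set the D flag."""
    return bool(pulled_value & FLAG_D)


if __name__ == "__main__":
    print(decimal_set_after_plp(0xFF))  # True: every flag bit is set
    print(decimal_set_after_plp(0xF7))  # False: only bit 3 is cleared
```

So $F7 keeps every other flag the test sets and only leaves D clear.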

According to this, it sounds like removing transistors t1329, t3212, t2750, t2202 and t2256 would replicate the 2A03's modifications.


Posted: Sat Jan 14, 2017 4:57 pm

Joined: Sun Feb 07, 2016 6:16 pm
Posts: 166
Attachment: immediate.png

Progress? Sort of...

I've forced all 5 transistors mentioned above to be "on" at all times (this is what the link said the 2A03's modifications did) and this is what I'm getting now - the official opcodes pass, but now these unofficial ones apparently don't (I've only tried this test though, since it completes in about an hour, vs 4+ for some of the others).

I had hacked up blargg's test before trying this to make sure the instructions were always performed with the decimal flag off, and that version of the test actually gave me the same result (I think; unfortunately I did not save the result screen from that). So I guess decimal mode is actually correctly disabled by this, but why these unofficial opcodes break when decimal mode is off is another story.


Posted: Wed Jan 18, 2017 8:19 pm

Joined: Fri Dec 30, 2011 7:15 am
Posts: 40
Location: Sweden
Since the wiki says it takes 2-3 and 3-4 CPU cycles for the sequencer counter to reset, I tried to see if I could confirm that with Visual NES, and also whether half and quarter frames are delayed or not.
I'm not very good at using Visual NES, or especially at operating the APU, but here goes...

Code:
test:
lda #$0F
sta $4000 ; set volume / envelope to 15 to see when it decreases
lda #$80
sta $4017 ; restart things
jmp test


Cpu cycles, starting from the write cycle in STA 0x4017:

0: Write to 0x4017
1: Read
2: Read* (this cycle only happens depending on whether the write was on an APU cycle; probably it happens if it wasn't)
3: Read - cpu_frm_half and cpu_frm_quarter go low, cpu_sq0_envt3-0 decrements
4: Read - cpu_'frm_/tXX' resets

If I shouldn't post random visual nes testing in this topic, tell me haha...


So I guess that's 2-3 CPU cycles of delay until the half and quarter frames are clocked? The sequencer counter, if I even logged the correct thing, resets one cycle later instead of doing some kind of increment.
I'm incrementing the sequencer counter every CPU cycle, so resetting the counter instead of (or as well as) incrementing it, at the same time as the half and quarter frames, should work in that case, I think. I say "instead of / and" because resetting *and* incrementing makes my emulator pass blargg's APU jitter test, but that doesn't really mean much.
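Here is how I'd sketch that timing in Python, purely based on the cycle log above and not on any verified hardware behaviour. The assumption is that the starred read cycle is the only thing that varies, so the half/quarter frame clock lands 2 or 3 CPU cycles after the $4017 write and the sequencer reset lands one cycle after that. The function and key names are made up.

```python
def events_after_4017_write(extra_read_cycle):
    """Cycle offsets (relative to the $4017 write) guessed from the log.

    extra_read_cycle: True when the starred 'Read*' cycle is present.
    """
    half_quarter = 3 if extra_read_cycle else 2
    return {
        "half_quarter_frame": half_quarter,   # envelope is clocked here
        "sequencer_reset": half_quarter + 1,  # counter resets one cycle later
    }


if __name__ == "__main__":
    print(events_after_4017_write(extra_read_cycle=True))
    print(events_after_4017_write(extra_read_cycle=False))
```

That gives 2-3 cycles for the half/quarter frame clock and 3-4 cycles for the reset, which would line up with the two ranges the wiki mentions, assuming I logged the right nodes.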

I also tried "asl 0x4017" just because; it seems to write back 0xFF, which wasn't what I was expecting (0x80?), but it should work anyway. It doesn't toggle cpu_frm_half and cpu_frm_quarter. It seems to reset the counter based on the first write, which would make sense.

---

The download link has a typo, btw! "VisualNEs.zip"

