All times are UTC - 7 hours





Post new topic Reply to topic  [ 74 posts ]  Go to page Previous  1, 2, 3, 4, 5
Author Message
PostPosted: Wed Aug 28, 2013 3:44 pm 

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10172
Location: Rio de Janeiro - Brazil
The PPU registers are mirrored all the way from $2000 to $3FFF. This means that writing to $2008 (for example) is the same as writing to $2000, so you shouldn't check for the exact addresses in your switch. Instead, check whether the address being accessed is in the $2000-$3FFF range, then discard all bits of the address except the lowest 3, which you can use to select a register (0 to 7).


PostPosted: Wed Aug 28, 2013 10:28 pm 

Joined: Mon Jan 01, 2007 11:12 am
Posts: 203
tokumaru wrote:
The PPU registers are mirrored all the way from $2000 to $3FFF. This means that writing to $2008 (for example) is the same as writing to $2000, so you shouldn't check for the exact addresses in your switch. Instead, check whether the address being accessed is in the $2000-$3FFF range, then discard all bits of the address except the lowest 3, which you can use to select a register (0 to 7).

Code:
switch (Address & 0xE007)


PostPosted: Wed Aug 28, 2013 11:50 pm 

Joined: Mon Sep 27, 2004 8:33 am
Posts: 3715
Location: Central Texas, USA
Yep, and the cool thing is that the compiler will do all the optimizations tokumaru mentioned: masking the address, checking that it's in the $2000-$2007 range, then using either a binary search or a jump table, whichever is more efficient. Further, this kind of mask-then-compare is actually closest to how the hardware does it; the upper 3 address lines are decoded ('138?) then the lower 3 are used to select the register.
Code:
switch ( Address & 0xE007 ) {
case 0x2000: ... // handles all mirrors, e.g. 0x2008, 0x2010... 0x3F80
case 0x2001: ...
...
case 0x2007: ...
}
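
If you'd rather have the register number itself than a masked address, the same decode can be written as a small function. This is only a sketch: the name ppuRegisterIndex and the -1 return for "not a PPU register" are made up for illustration, not taken from anyone's emulator.

```cpp
#include <cstdint>

// Hypothetical helper: returns the PPU register number (0-7) for a CPU
// address in $2000-$3FFF, or -1 if the address is outside that range.
int ppuRegisterIndex(std::uint16_t address)
{
    // The top 3 address bits select the $2000-$3FFF block.
    if ((address & 0xE000) != 0x2000)
        return -1;
    // The low 3 bits pick one of the 8 registers, so all mirrors fold together.
    return address & 0x0007;
}
```

With this, $2000, $2008 and $3FF8 all come back as register 0, and $3FFF comes back as register 7.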


PostPosted: Thu Aug 29, 2013 1:20 am 

Joined: Sat Jan 03, 2009 3:28 pm
Posts: 59
Location: Oregon
Thanks for all the help, guys! I should have been using pointers from the start for mirroring; it makes it so much simpler. I rewrote all of my memory functions that have to do with RAM so that almost everything is done through pointers now.

However, a new problem has arisen. I decided to run back through blargg's CPU tests just to make sure that my new code worked correctly, and now tests past 03 give me a message that interrupts should not happen during the tests. For some reason, one BRK instruction is being executed per test. In 04-zero_page, at one point during the test the program writes $00 to $3A6. Later on in the test, the program counter ends up at $3A6 and reads the opcode from that address. Since opcode $00 is BRK, this of course makes a BRK instruction occur.

I'm not really sure if this is from me rewriting my memory functions or from the CPU. When I ran the tests before, tests 04-09 spit out garbage output, so I was told that I had passed them since no opcodes showed up.


PostPosted: Thu Aug 29, 2013 1:28 am 

Joined: Mon Sep 27, 2004 8:33 am
Posts: 3715
Location: Central Texas, USA
Binary search between the working version and this version until you find the set of code changes that broke it. Then break those changes roughly in half, if there is any independence. Repeat until you find the cause.

It'd be slightly interesting to add a check for BRK causing the vectoring rather than unwanted IRQ, and report this differently.
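
For what it's worth, the two cases can be told apart from the status byte left on the stack: BRK pushes it with bit 4 (the B flag) set, while IRQ and NMI push it with bit 4 clear. Here's a rough emulator-side sketch of that rule; the helper names are hypothetical.

```cpp
#include <cstdint>

// Bit 4 only exists in the copy of the status register pushed to the stack.
constexpr std::uint8_t FLAG_B      = 0x10;
constexpr std::uint8_t FLAG_UNUSED = 0x20; // always pushed as 1

// Hypothetical helper: the status byte as pushed when vectoring through $FFFE.
std::uint8_t pushedStatus(std::uint8_t p, bool fromBrk)
{
    std::uint8_t value =
        static_cast<std::uint8_t>((p & static_cast<std::uint8_t>(~FLAG_B)) | FLAG_UNUSED);
    if (fromBrk)
        value = static_cast<std::uint8_t>(value | FLAG_B);
    return value;
}

// Hypothetical helper: true if the stacked status byte came from a BRK.
bool cameFromBrk(std::uint8_t stackedStatus)
{
    return (stackedStatus & FLAG_B) != 0;
}
```

An interrupt handler (or a test ROM) can pull that byte back off the stack and report "BRK executed" separately from "unexpected IRQ".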


PostPosted: Thu Aug 29, 2013 4:29 pm 

Joined: Fri Dec 30, 2011 7:15 am
Posts: 43
Location: Sweden
Are you handling unofficial opcodes? Depending on what you do with them, I suspect they might be the culprit.
I tried running these tests in my emulator recently and passed some; others failed with the 'interrupts should not occur' message. My emulator fails spectacularly at unofficial opcodes, as I pretty much do nothing with them yet. I can't recall if I looked into the failure closely, but I took it as a "passed officials, crashed at unofficials".


PostPosted: Thu Aug 29, 2013 4:32 pm 

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19355
Location: NE Indiana, USA (NTSC)
Treating unofficial opcodes as one-byte NOPs runs the risk of BRKing the program if a 2- or 3-byte unofficial opcode's operand is $00.
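
To make that concrete, here is a rough sketch. The helper names are made up, and the table covers only four unofficial NOP encodings as examples; a real core needs the correct length for every opcode.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: byte length of a few unofficial NOP encodings.
// Returns 0 for opcodes this sketch does not cover.
std::size_t unofficialNopLength(std::uint8_t opcode)
{
    switch (opcode) {
    case 0x1A: return 1; // NOP (implied)
    case 0x80: return 2; // NOP #imm
    case 0x04: return 2; // NOP zp
    case 0x0C: return 3; // NOP abs
    default:   return 0;
    }
}

// Opcode the CPU would fetch next after skipping `length` bytes at `pc`.
std::uint8_t nextOpcode(const std::uint8_t* code, std::size_t pc, std::size_t length)
{
    return code[pc + length];
}
```

Given the bytes 0C 00 00 EA, skipping one byte lands on $00 (BRK), while skipping the full three bytes lands on $EA (the official NOP).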


PostPosted: Fri Aug 30, 2013 1:21 am 

Joined: Sat Jan 03, 2009 3:28 pm
Posts: 59
Location: Oregon
Hmm, I never even thought of the unofficial opcodes. They are all implemented, but I'm not sure how well they work. They all advance the program counter by the right number of bytes, but that's about all I can guarantee. Three of them fail on test 03, but that's the only test that outputs any opcodes. The other tests that fail just complain about the BRK instruction.

So I grabbed a copy of my old memory function and ran the tests again. Now the old memory function is complaining about interrupts happening as well. I've been able to figure out that at some point during the test, opcode $91 (indirect, Y STA) is executed. A = $00 at this point and so $91 writes $00 to address $3A6. Then, later on in the test, the program counter ends up at $3A6 and reads $00, which triggers the BRK instruction.

From what I just found, it looks like my accumulator isn't set to the right value when opcode $91 is executed. 05-zp_xy and 06-absolute have the same results: opcode $91 sets address $3A6 to $00, which later causes a BRK instruction.


PostPosted: Fri Aug 30, 2013 3:36 am 

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3192
Location: Mountain View, CA, USA
Dartht33bagger wrote:
So I grabbed a copy of my old memory function and ran the tests again. Now the old memory function is complaining about interrupts happening as well. I've been able to figure out that at some point during the test, opcode $91 (indirect, Y STA) is executed. A = $00 at this point and so $91 writes $00 to address $3A6. Then, later on in the test, the program counter ends up at $3A6 and reads $00, which triggers the BRK instruction.

From what I just found, it looks like my accumulator isn't set to the right value when opcode $91 is executed. 05-zp_xy and 06-absolute have the same results: opcode $91 sets address $3A6 to $00, which later causes a BRK instruction.

The description you just gave of sta ($zp),y is wrong/makes no sense. The contents of the accumulator have no bearing on the indirect, nor indexing, operations.

Please provide the code you use for opcodes $81 and $91. It's fairly easy to tell when someone has this wrong (and many people do).

I would recommend you make your emulator halt/stop/throw some indication when BRK is executed; it's usually (99% of the time) a sign that someone has something somewhere that's broken in their 6502 core, or their mapper implementation. Same goes for making your emulator halt/stop/etc. when an invalid opcode is executed. Do this for now -- do not go about implementing unofficial opcodes at this point (sorry I'm repeating myself, I know I said this before), just stop the emu and dump the code around the area that induced the error (i.e. show -10 and +10 bytes disassembled around the area which broke, and all contents of registers (PC, S, A, X, Y, P, etc.)).
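
As a starting point, something like the following would do. It's a sketch under assumptions: CpuState and its field names are hypothetical, mem is assumed to be a flat 64 KiB image, and it prints raw bytes rather than a real disassembly.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical register snapshot; field names are for illustration only.
struct CpuState {
    std::uint16_t pc;
    std::uint8_t a, x, y, s, p;
};

// Builds a text dump of the registers plus the raw bytes from PC-10 to
// PC+10. `mem` is assumed to be a flat 64 KiB image; addresses wrap.
std::string dumpAroundPc(const CpuState& cpu, const std::uint8_t* mem)
{
    char buf[64];
    std::snprintf(buf, sizeof buf,
                  "PC=%04X A=%02X X=%02X Y=%02X S=%02X P=%02X\n",
                  static_cast<unsigned>(cpu.pc), static_cast<unsigned>(cpu.a),
                  static_cast<unsigned>(cpu.x), static_cast<unsigned>(cpu.y),
                  static_cast<unsigned>(cpu.s), static_cast<unsigned>(cpu.p));
    std::string out = buf;
    for (int offset = -10; offset <= 10; ++offset) {
        const std::uint16_t addr = static_cast<std::uint16_t>(cpu.pc + offset);
        std::snprintf(buf, sizeof buf, "%04X: %02X%s\n",
                      static_cast<unsigned>(addr), static_cast<unsigned>(mem[addr]),
                      offset == 0 ? "  <-- PC" : "");
        out += buf;
    }
    return out;
}
```

Call it from the BRK handler and from the default case of the opcode switch, print the result, and stop the emulator there.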

Also, not to get pedantic or off track, but FYI the addressing modes are generally referred to as the following:

Code:
sta $12     = zero page (ex. opcode $85)
sta $12,x   = zero page indexed X (ex. opcode $95)
stx $12,y   = zero page indexed Y (ex. opcode $96)
sta $1234   = absolute (ex. opcode $8d)
sta $1234,x = absolute indexed X (ex. opcode $9d)
sta $1234,y = absolute indexed Y (ex. opcode $99)
jmp ($1234) = indirect (ex. opcode $6c)
sta ($12,x) = indexed indirect X (sometimes called "pre-indexed X") (ex. opcode $81)
sta ($12),y = indirect indexed Y (sometimes called "post-indexed Y") (ex. opcode $91)

I believe it's time you start looking at other people's 6502 emulation cores and comparing the code to yours. CPU opcode testers are not able to catch everything because they rightfully have to assume that the addressing modes and opcodes they themselves rely on are implemented correctly.


PostPosted: Fri Aug 30, 2013 1:50 pm 

Joined: Sat Jan 03, 2009 3:28 pm
Posts: 59
Location: Oregon
I looked over another core a few weeks ago and it looked to be doing the same thing that mine does. I'll try your dumping method, though, and see if I can figure this out.

Here are $81 and $91:

Code:
case 0x81:   //Indirect,X store A to memory
   temp1 = (memory->readRAM(PC, ppu) + X) & 0xFF; //Wraps around if >255
   temp2 = (memory->readRAM((temp1 + 1) & 0xFF, ppu) << 8) | memory->readRAM(temp1, ppu); //Gets address
   memory->writeRAM(temp2, A, ppu);
   cycles = 6;
   PC++;
   break;
case 0x91:   //Indirect,Y store A to memory
   temp1 = memory->readRAM(PC, ppu);         //Gets Zeropage address
   temp2 = ((memory->readRAM((temp1 + 1) & 0xFF, ppu) << 8) | memory->readRAM(temp1, ppu)) + Y; //Gets real address
   memory->writeRAM(temp2, A, ppu);
   cycles = 6;
   PC++;
   break;


PostPosted: Fri Aug 30, 2013 2:36 pm 

Joined: Mon Sep 27, 2004 8:33 am
Posts: 3715
Location: Central Texas, USA
You've got a precedence error in your 0x91 handler; + has higher precedence than |. It's a good idea to avoid mixing bitwise and arithmetic operators without parentheses, even if you've memorized the precedence levels perfectly. Below, yours is like addr1, and addr2 shows what's being done first, whereas you want addr3, with the displacement added afterwards.
Code:
   int addr1 =  (0x01 << 8) |  0xFF  + 0x01 ; // 0x0100
   int addr2 =  (0x01 << 8) | (0xFF  + 0x01); // 0x0100
   int addr3 = ((0x01 << 8) |  0xFF) + 0x01 ; // 0x0200

In this case, it seems simpler to just use all arithmetic operators:
Code:
temp2 = memory->readRAM((temp1 + 1) & 0xFF, ppu)*0x100 + memory->readRAM(temp1, ppu) + Y; //Gets real address

though I have to say, that expression is too verbose to read easily, especially how Y is pushed to the end almost out of sight. This makes the regularity clear and the calculation uncluttered:
Code:
int low  = memory->readRAM( (temp1 + 0) & 0xFF, ppu);
int high = memory->readRAM( (temp1 + 1) & 0xFF, ppu);
int addr = high*0x100 + low + Y;


PostPosted: Sat Aug 31, 2013 1:45 am 

Joined: Sat Jan 03, 2009 3:28 pm
Posts: 59
Location: Oregon
Ok. I sat down tonight and wrote functions to get addresses in the CPU instead of having each instruction get the address. Could you guys check my functions out to make sure everything is right (it should be)?

Code:
//Get address functions
const unsigned short cpu::zeroPageX(memory* memory, ppu* ppu)
{
   unsigned short temp = memory->readRAM(PC, ppu);
   temp += X;
   temp &= 0xFF;      //Wraps around
   return temp;
}

const unsigned short cpu::zeroPageY(memory* memory, ppu* ppu)
{
   unsigned short address = memory->readRAM(PC, ppu);
   address += Y;
   address &= 0xFF;
   return address;
}

const unsigned short cpu::absolute(memory* memory, ppu* ppu)
{
   unsigned short high = memory->readRAM(PC + 1, ppu) << 8;
   unsigned short low = memory->readRAM(PC, ppu);
   unsigned short address = high | low;
   return address;
}
   
const unsigned short cpu::absoluteX(memory* memory, ppu* ppu)
{
   unsigned short high = memory->readRAM(PC + 1, ppu) << 8;
   unsigned short low = memory->readRAM(PC, ppu);
   unsigned short address = high | low;
   address += X;
   return address;
}

const unsigned short cpu::absoluteY(memory* memory, ppu* ppu)
{
   unsigned short high = memory->readRAM(PC + 1, ppu) << 8;
   unsigned short low = memory->readRAM(PC, ppu);
   unsigned short address = high | low;
   address += Y;
   return address;
}

const unsigned short cpu::indirectX(memory* memory, ppu* ppu)
{
   unsigned short zeropageAddress = zeroPageX(memory, ppu);
   unsigned short low = memory->readRAM(zeropageAddress, ppu);
   zeropageAddress++;
   zeropageAddress &= 0xFF;
   unsigned short high = memory->readRAM(zeropageAddress, ppu) << 8;
   unsigned short address = high | low;
   return address;
}

const unsigned short cpu::indirectY(memory* memory, ppu* ppu)
{
   unsigned short zeropageAddress = memory->readRAM(PC, ppu);
   unsigned short low = memory->readRAM(zeropageAddress, ppu);
   zeropageAddress++;
   zeropageAddress &= 0xFF;
   unsigned short high = memory->readRAM(zeropageAddress, ppu) << 8;
   unsigned short address = high | low;
   address += Y;
   return address;
}


PostPosted: Sat Aug 31, 2013 3:03 am 

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3192
Location: Mountain View, CA, USA
Edit: I took a longer stare at your indirectX() and zeroPageX() routines, and yeah, now I get it. They were confusing me because of your mention of PC, which to me (at that point in the CPU) would still be pointing to the opcode. However somewhere else in your code you're obviously doing PC++ before handling the actual functionality of the addressing mode. In other words I really expected to see PC+1 and PC+2 being used all over.

You need to be aware of 3 things relating to addressing modes: zero page wrapping, page boundary crossing, and the JMP indirect CPU bug:

a) "Zero page wrapping": any time a ZP read/write operation happens, the effective addresses used for reading/writing need to stay within the $00xx range (hence the name zero page). This is what &= 0xff is about, but it needs to be applied only where applicable. "Zero page wrapping" does not incur a cycle penalty (keep reading).

b) Actual "page boundary crossing": any time an indexed effective address rolls over into the next page (i.e. $12ff -> $1300). You might think "why do I care about this, it's just simple 16-bit addition" -- you need to care about it because crossing a page costs an extra CPU cycle on indexed reads (indexed stores always pay that cycle, crossing or not). Right now the first-generation NES games you're testing with tend to not be very "timing-dependent" but this will matter quite a lot later, trust me. And don't forget about something like $ffff -> $0000 too (that's also considered page crossing). Your current abstraction returns only the final address, so it loses this information.

c) There's an actual CPU-level bug in the 6502 which affects jmp ($xxxx) (opcode $6c) only, where (b) above does not happen correctly. In other words, jmp ($80ff) would read the low byte of the effective 16-bit address from $80ff and the high byte from $8000 -- not $8100 like you would expect. (And no, there is no additional CPU cycle penalty in that situation since the page never gets crossed.)
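
Points (b) and (c) can be sketched in a few lines of C++. The function names are hypothetical; the idea is just that the page is the high byte of the address, and that the JMP indirect pointer increments only its low byte.

```cpp
#include <cstdint>

// True when adding the index moved the address into a different page,
// i.e. the high byte changed ($12FF -> $1300, or $FFFF -> $0000).
bool pageCrossed(std::uint16_t base, std::uint8_t index)
{
    const std::uint16_t effective = static_cast<std::uint16_t>(base + index);
    return (base & 0xFF00) != (effective & 0xFF00);
}

// Address of the high byte of the pointer used by JMP ($xxxx). The low
// byte of the pointer address wraps within its page instead of carrying.
std::uint16_t jmpIndirectHighByteAddress(std::uint16_t pointer)
{
    return static_cast<std::uint16_t>((pointer & 0xFF00) | ((pointer + 1) & 0x00FF));
}
```

An addressing-mode function could return the result of pageCrossed alongside the address, so the opcode handler can add the extra cycle where it applies.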

Remaining part of my previous (non-edited) post, which I'll keep just for folks reading this thread (whose Subject is no longer accurate):

Here's some actual 6502 code with comments (I assume you speak 6502):

Code:
lda #$fe     ; A=$fe
sta $0622    ; Store value $fe at memory location $0622 (in RAM)
lda #$22     ; A=$22
sta $4c      ; Memory location $4c ($004c) now contains value $22 (low byte of 16-bit address)
lda #$06     ; A=$06
sta $4d      ; Memory location $4d ($004d) now contains value $06 (high byte of 16-bit address)
ldx #$3a     ; X=$3a
lda ($12,x)  ; Effective address is $12+X (thus $4c)
             ; Memory location $4c ($004c) contains value $22 (low byte of 16-bit address)
             ; Memory location $4d ($004d) contains value $06 (high byte of 16-bit address)
             ; Effective 16-bit address to read from is $0622
;
; A now contains value $fe
;

The difference between this and indirect indexed Y (e.g. lda ($12),y, opcode $b1) is where/when the indexing is applied. This is why some people call indexed indirect X "pre-indexed mode", and indirect indexed Y "post-indexed" mode.

If you want me to do a little write-up like the above but for indirect indexed Y, let me know and I can.
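
For a quick comparison in the meantime, here is a rough C++ sketch of both modes side by side. It assumes a plain byte array for memory rather than your readRAM() interface, and the function names are made up for illustration.

```cpp
#include <cstdint>

// Reads a 16-bit little-endian pointer from zero page; the second byte
// wraps around to $00 when the pointer starts at $FF.
std::uint16_t readZeroPagePointer(const std::uint8_t* mem, std::uint8_t zp)
{
    const std::uint8_t low  = mem[zp];
    const std::uint8_t high = mem[static_cast<std::uint8_t>(zp + 1)];
    return static_cast<std::uint16_t>((high << 8) | low);
}

// ($zp,X): X is added to the zero-page operand before the pointer is read.
std::uint16_t indexedIndirectX(const std::uint8_t* mem, std::uint8_t operand, std::uint8_t x)
{
    return readZeroPagePointer(mem, static_cast<std::uint8_t>(operand + x));
}

// ($zp),Y: the pointer is read first, then Y is added to the 16-bit result.
std::uint16_t indirectIndexedY(const std::uint8_t* mem, std::uint8_t operand, std::uint8_t y)
{
    return static_cast<std::uint16_t>(readZeroPagePointer(mem, operand) + y);
}
```

With $4c/$4d holding $22/$06 as in the listing above, indexedIndirectX(mem, 0x12, 0x3a) gives $0622, while indirectIndexedY(mem, 0x4c, 0x10) gives $0632.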

Also a coding practise tip in passing: I would strongly suggest you use the inttypes.h typedefs for integers, i.e. uint16_t for an unsigned 16-bit integer (what you call unsigned short). They're fewer characters to type and allow for better cross-architecture support, since not all architectures (or compilers/environments for that matter) are identical.
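
A tiny illustration of why that matters for address math (a sketch; nextAddress is a made-up name). unsigned short is only guaranteed to be at least 16 bits, whereas uint16_t is exactly 16 bits wherever it exists, so it wraps at $FFFF by itself.

```cpp
#include <climits>
#include <cstdint>

// uint16_t is exactly 16 bits wherever it exists, so address arithmetic
// wraps at $FFFF without any explicit masking.
static_assert(sizeof(std::uint16_t) * CHAR_BIT == 16, "uint16_t must be 16 bits");

// Hypothetical helper: the address following `address`, wrapping at 64 KiB.
std::uint16_t nextAddress(std::uint16_t address)
{
    return static_cast<std::uint16_t>(address + 1);
}
```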

P.S. -- What compiler are you using that's letting you shove new variable declarations right in the smack dab centre of your code without making a new code block (e.g. { ... })? Awful that this is allowed. In other words, it should really look like this:

Code:
const unsigned short cpu::indirectY(memory* memory, ppu* ppu)
{
   unsigned short zeropageAddress, low, high, address;

   zeropageAddress = memory->readRAM(PC, ppu);
   low = memory->readRAM(zeropageAddress, ppu);
   zeropageAddress++;
   zeropageAddress &= 0xFF;
   high = memory->readRAM(zeropageAddress, ppu) << 8;
   address = high | low;
   address += Y;
   return address;
}


PostPosted: Sat Aug 31, 2013 5:29 am 

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19355
Location: NE Indiana, USA (NTSC)
koitsu wrote:
Also a coding practise tip in passing: I would strongly suggest you use inttypes.h typedefs for integers, i.e. uint16_t
[...]
P.S. -- What compiler are you using that's letting you shove new variable declarations right in the smack dab centre of your code without making a new code block (e.g. { ... })?

This has been allowed in C++ forever and in C since 1999. In fact, the same revision of C that added stdint.h (the standardized version of inttypes.h) added declaring variables anywhere. I don't think it's as awful as you appear to think it is, because it allows variables to be initialized with a value when they come into existence. This is especially important in C++, where declaring a variable of a non-POD type and initializing it later causes a default constructor to run at the point of declaration, followed by the class's operator= at the later assignment. Even in C, declaring late lets you give a variable a defined value as soon as you know one, which makes your program less likely to encounter undefined behavior from using a variable before it is initialized.
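
Here's a small sketch of the C++ point. Tracked is a hypothetical type that just logs which of its special member functions ran, so the two declaration styles can be compared.

```cpp
#include <string>

// Hypothetical type that records which special member functions ran.
struct Tracked {
    static std::string log;
    int value;
    Tracked() : value(0) { log += "default;"; }
    explicit Tracked(int v) : value(v) { log += "value;"; }
    Tracked& operator=(const Tracked& other)
    {
        value = other.value;
        log += "assign;";
        return *this;
    }
};
std::string Tracked::log;

// Declare first, assign later: three calls in total.
std::string declareThenAssign()
{
    Tracked::log.clear();
    Tracked t;          // default constructor
    t = Tracked(5);     // value constructor for the temporary, then operator=
    return Tracked::log;
}

// Declare at the point where the value is known: one call.
std::string declareWithValue()
{
    Tracked::log.clear();
    Tracked t(5);       // value constructor only
    return Tracked::log;
}
```

The first function logs a default construction, a second construction for the temporary, and an assignment; the second logs a single construction.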

Quote:
it should really look like this:

Code:
const unsigned short cpu::indirectY(memory* memory, ppu* ppu)
{
   unsigned short zeropageAddress, low, high, address;
   zeropageAddress = memory->readRAM(PC, ppu);

At this point, if low, high, or address were read, the behavior would be undefined, because none of them has been given a value yet.

