basic register question

jjpeerless
Posts: 5
Joined: Sat Jun 06, 2009 2:32 pm

basic register question

Post by jjpeerless » Sat Jun 06, 2009 2:42 pm

Hey all,

I am new here and new to emulator programming, but it's something I've always been interested in pursuing. I finally took the plunge after reading these forums and some docs on NES hardware, and started my CPU emulator.

I am using BombSweeper.nes for now (recommended as a simple game on these forums), but since I'm still only working on the CPU, maybe there's a better test ROM floating around.

Anyway, here is how I've been doing it:

In an infinite loop, I read an opcode. If I have defined the operation, I perform it. If it's not defined yet, I print out which opcode it was and halt execution so I can go implement it. I also print out some debug information to see what it's doing.

Finally, on to my question.

The operation I just encountered is DEX (decrement the X register by 1). My X register is defined as a uint8_t in C. Before this operation the value of the X register is 0. What is supposed to happen on a decrement here? Should my registers not be unsigned? Should this flip to 255 and set the negative flag in the status register? A little advice or information about the registers in this context would be a great help.

Thanks.

tepples
Posts: 22277
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: basic register question

Post by tepples » Sat Jun 06, 2009 3:49 pm

jjpeerless wrote: The operation I just encountered is: DEX (Decrement the X Register by 1). My X Register is defined as a uint8_t in c. Before this operation the value of the X Register is 0, what is supposed to happen on a decrement here?
X becomes $FF, flags.N is turned on (because a value >= $80 was written to a register) and flags.Z is turned off (because a value != $00 was written to a register).
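The wraparound described above can be sketched in C as follows. This is only a minimal illustration: the `Cpu` struct, its field names and `op_dex` are hypothetical, not taken from any particular emulator. An unsigned 8-bit register gives the $00 to $FF wrap for free.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical CPU state, reduced to what DEX touches. */
typedef struct {
    uint8_t x;      /* X index register */
    bool    flag_n; /* negative flag */
    bool    flag_z; /* zero flag */
} Cpu;

/* DEX: decrement X. Unsigned 8-bit arithmetic wraps $00 -> $FF. */
static void op_dex(Cpu *cpu)
{
    cpu->x = (uint8_t)(cpu->x - 1);
    cpu->flag_n = (cpu->x & 0x80) != 0; /* N mirrors bit 7 of the result */
    cpu->flag_z = (cpu->x == 0);        /* Z is set only when the result is $00 */
}
```

So the register can stay unsigned; "negative" is just a name for bit 7 of the result.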

cpow
NESICIDE developer
Posts: 1099
Joined: Mon Oct 13, 2008 7:55 pm
Location: Minneapolis, MN

Re: basic register question

Post by cpow » Sat Jun 06, 2009 4:50 pm

jjpeerless wrote: The operation I just encountered is: DEX (Decrement the X Register by 1). My X Register is defined as a uint8_t in c. Before this operation the value of the X Register is 0, what is supposed to happen on a decrement here?
I actually found the C=64 Programmer's Reference Guide to be useful for the basic instruction micro-ops (available online). There are some gaping holes in it though, specifically what exactly the oVerflow flag should be set to.
A very good demo ROM to try to see if you've implemented all your opcodes correctly (I just found out about this myself a week or so ago!) is on the Wiki.

http://nesdevwiki.org/wiki/Emulator_Tests (it is the nestest one under CPU).

From the readme.txt file:
This test program, when run on "automation", (i.e. set your program counter
to 0c000h) will perform all tests in sequence and shove the results of
the tests into locations 02h and 03h.

-----
So, if you haven't implemented the screen yet it's okay...you can just look at your memory, bytes 2 and 3, after running the program. Beware though that if you haven't implemented some instructions correctly, the program may wander off and execute code it was never meant to reach.

EDIT
Here's what it looks like on-screen:
http://nesdev.com/bbs/viewtopic.php?t=5224&highlight=

jjpeerless
Posts: 5
Joined: Sat Jun 06, 2009 2:32 pm

Post by jjpeerless » Sun Jun 07, 2009 10:31 am

Thanks for the replies, guys, they were a great help. As I continue implementing all these damn opcodes, I have another question I'd like to address before I go on. A lot of these opcodes require checking or updating the status register. I currently have it implemented as just another uint8_t, and I made set_bit(), clear_bit() and get_bit() functions that use bitwise operations to work with it. Since these things happen a lot, I was thinking it might be better to implement the status register as separate flags instead of a single byte. This would let me check, set and clear the flags directly instead of having to manipulate bits in one byte. Is this a better idea? What is the consensus on this?

Also, I was doing some reading on older posts about implementing the CPU memory, and I'm not positive I understand why my approach is bad.

I simply have a 64KB byte array: uint8_t cpu_mem[0x10000];

So if I want to read the next opcode, I can do it with cpu_mem[cpu.PC];

Take STA_ZP (store accumulator using zero page addressing).

My program counter is pointing to the byte that holds the 8-bit address where we want to store the accumulator, so my code looks like this:

cpu_mem[cpu_mem[cpu.PC++]] = cpu.ACC;

The inner cpu_mem[cpu.PC++] fetches the 8-bit address, which then indexes into memory again to reach the place where I need to store the accumulator.

Is this a really bad idea? It just makes sense to me to do it this way.


----

My last question for this post is about JSR / RTS (Jump to SubRoutine and ReTurn from Subroutine). I look at it like this:

[opcode (JSR)] [low order address] [high order address] [opcode2]

Originally my PC is pointing to the first opcode (JSR), and once I read it, I increment my PC so it's now pointing to the low-order byte of the address to jump to. Once I realize the opcode was a JSR, I am supposed to store my PC address (minus 1) on the stack. First off, why minus 1? Why not store the address of [opcode2], since that's where we want to be once we return anyway? And given my block above, which address am I supposed to store: the JSR opcode address? The one before it?

Just curious. To me the logical thing to do would be to store the address of [opcode2], where we want to be whenever we reach the RTS call.


Thanks for your input, sorry for the long post!

Disch
Posts: 1849
Joined: Wed Nov 10, 2004 6:47 pm

Post by Disch » Sun Jun 07, 2009 10:58 am

jjpeerless wrote:Is this a better idea?
Yes. Status flags only need to be packed into a single byte when they're pushed to the stack (via IRQ/NMI/PHP/BRK) or pulled back off it (PLP/RTI). Since that happens much less frequently than individual flag changes, it pays to make setting and testing individual flags as quick as possible.

EDIT: Here are some misc points:

- the B and R flags do not really exist (status bits 4 and 5). You do not need to keep track of their state. When you put status on the stack (PHP/IRQ/NMI/BRK), bit 5 (R) is always pushed as set. Bit 4 (B) is pushed as set for BRK/PHP, but is pushed as clear for IRQ/NMI.

- you do not have to have boolean true/false or 0/1 for flag state indications. You can have any nonzero value indicate one thing and zero indicate something else. Or have only the high bit indicate something. For instance if you want to shortcut setting/clearing of N, you can say that only bit 7 of your flgN member is significant. That way when you do something like LDA you can do:

Code: Select all

flgN = A;
No need for any bitwise or logic operation. Just a simple copy. Then to see if N is set you can just do this:

Code: Select all

if(flgN & 0x80)  // N is set
Z can be done similarly. Just say that flgZ==0 when Z is set, and flgZ!=0 when Z is clear. Then setting it is just as simple:

Code: Select all

flgN = flgZ = A;  // set N and Z

//  to see if Z is set
if(flgZ)  // Z is CLEAR
if(!flgZ) // Z is SET (notice it's backwards)
There are also tricks blargg came up with that you can use to combine N and Z into a single variable, since N and Z are set together so often. But I'll not mention it here because I'm rambling already. If you're really interested lemme know and I'll post it later.
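To tie the points above together, here is a rough sketch of separate flag variables plus the one place where they get packed into a byte. It assumes the standard 6502 status layout (bit 0 C, bit 1 Z, bit 2 I, bit 3 D, bit 4 B, bit 5 always set, bit 6 V, bit 7 N). The struct, `op_lda_imm` and `pack_status` are hypothetical names made up for this illustration.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical CPU state using the "lazy" N and Z representation. */
typedef struct {
    uint8_t a;
    uint8_t flg_n;  /* only bit 7 is significant */
    uint8_t flg_z;  /* zero means Z is SET, nonzero means Z is clear */
    bool    flg_c, flg_i, flg_d, flg_v;
} Cpu;

/* LDA #imm: N and Z are updated with a plain copy, no bit twiddling. */
static void op_lda_imm(Cpu *cpu, uint8_t value)
{
    cpu->a = value;
    cpu->flg_n = cpu->flg_z = value;
}

/* Build the byte that goes on the stack.
 * from_instruction is true for PHP/BRK and false for IRQ/NMI. */
static uint8_t pack_status(const Cpu *cpu, bool from_instruction)
{
    uint8_t p = 0x20;                       /* bit 5 is always pushed as set */
    if (cpu->flg_c)       p |= 0x01;
    if (cpu->flg_z == 0)  p |= 0x02;        /* note the inverted sense of flg_z */
    if (cpu->flg_i)       p |= 0x04;
    if (cpu->flg_d)       p |= 0x08;
    if (from_instruction) p |= 0x10;        /* B is set only for PHP/BRK */
    if (cpu->flg_v)       p |= 0x40;
    p |= (uint8_t)(cpu->flg_n & 0x80);      /* N is bit 7 of flg_n */
    return p;
}
```

The trade-off is that every load or arithmetic opcode gets cheaper, while PHP/BRK/IRQ/NMI pay for the packing, which is the rarer case.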
jjpeerless wrote: Also, I was doing some reading on older posts about implementing the cpu memory and I'm not positive I understand why my approach is bad.

I simply have a 64KB byte array: uint8_t cpu_mem[0x10000];

so say I want to read the next opcode I can do it with cpu_mem[cpu.PC];
This is bad because:

- It doesn't write protect PRG-ROM (writes to $8000-FFFF might change PRG, which shouldn't happen!)

- It makes bankswitching much more difficult because you have to perform bulk memory copies instead of a simple pointer change

- A lot of that space is mirrored. IE, LDA $1000 should read the same byte of memory as LDA $0000.

- A lot of that space is open bus (ie: addresses $5xxx)

- It makes it weirder to catch register reads/writes.

jjpeerless wrote: is this a really bad idea?
It's not REALLY bad, no. It's just that it will make adding future functionality more difficult. You'll end up with a lot of spaghetti code with a bunch of little hacks worked in to get a specific feature working.
jjpeerless wrote: First off, why minus 1? Why not store the address of [opcode2] since thats where we want to be once we return anyways?
Because the high byte of the PC is pushed before the second jump-to byte is fetched, so at the moment of the push the PC still points at the last byte of the JSR instruction rather than at the next opcode. I cannot explain the rationale of why the 6502 works that way -- it just does. Perhaps it was the only way to get it to work within the 6 cycles (though I doubt it).
jjpeerless wrote: Anyways, given my block above, which address am I supposed to store, the JSR opcode address? the one before it?
Assuming PC=$8000 and you hit a JSR, $80 and $02 are pushed to the stack (in that order), i.e. the address of the last byte of the JSR instruction. On RTS, you'll pull these to get $8002, then increment by 1 to get $8003 (the next opcode).

You only do this -1 business for JSR/RTS. IRQ/NMI/RTI/etc all work "normally".
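As a rough sketch of the $8000 example above: the code below uses a flat `mem` array only to keep the illustration short, and all names (`mem`, `pc`, `sp`, `op_jsr`, `op_rts`) are hypothetical. It models the visible result, not the exact cycle order.

```c
#include <stdint.h>

static uint8_t  mem[0x10000]; /* flat memory, for brevity in this sketch only */
static uint16_t pc;           /* program counter */
static uint8_t  sp = 0xFD;    /* 8-bit stack pointer */

static void    push(uint8_t v) { mem[0x0100 | sp] = v; sp--; }
static uint8_t pull(void)      { sp++; return mem[0x0100 | sp]; }

/* On entry pc points at the JSR opcode itself. */
static void op_jsr(void)
{
    uint16_t target = (uint16_t)(mem[(uint16_t)(pc + 1)] |
                                 (mem[(uint16_t)(pc + 2)] << 8));
    uint16_t ret = (uint16_t)(pc + 2);  /* address of the LAST byte of the JSR */
    push((uint8_t)(ret >> 8));          /* high byte first */
    push((uint8_t)(ret & 0xFF));        /* then low byte */
    pc = target;
}

/* RTS pulls low then high, and adds 1 to land on the next opcode. */
static void op_rts(void)
{
    uint8_t lo = pull();
    uint8_t hi = pull();
    pc = (uint16_t)((((uint16_t)hi << 8) | lo) + 1);
}
```

With a JSR at $8000, this pushes $80 then $02, and the matching RTS resumes at $8003.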
jjpeerless wrote: Just curious, to me the logical thing to do would be to store the address of [opcode2] where we want to be whenever we reach the RTS call.
That is logical, but it is not how the 6502 works.
jjpeerless wrote: sorry for the long post!
don't be =)

jjpeerless
Posts: 5
Joined: Sat Jun 06, 2009 2:32 pm

Post by jjpeerless » Sun Jun 07, 2009 11:31 am

thanks for the quick reply! few more questions for you =)
Disch wrote: This is bad because:

- It doesn't write protect PRG-ROM (writes to $8000-FFFF might change PRG, which shouldn't happen!)
Isn't it safe to assume games won't do this, and that someone would have to write a 'malicious' app to do things like that? (Though if I were going for an exact replica of the NES, I should stop things like this.)

Anyway, would a better approach be to have read_memory(16 bit address) and write_memory(addr) functions? In write_memory I could add checks on the address to make sure it's not in read-only memory. Is that the idea?

Disch wrote: - It makes bankswitching much more difficult because you have to perform bulk memory copies instead of a simple pointer change
As I have very limited knowledge of what I am doing, I was just going at this project on an as-I-go basis, and I've only been working on implementing the CPU. I am not even sure what bankswitching refers to, or why it's needed?

Disch wrote: - A lot of that space is mirrored. IE, LDA $1000 should read the same byte of memory as LDA $0000.
Yeah, I noticed that when looking at the memory map. How do people handle this mirroring? Does it make sense to store just one copy and translate addresses to it, or do people actually keep the mirrors and update all of them on writes?



Disch wrote: - It makes it weirder to catch register reads/writes.
I'm guessing you are talking about I/O registers? I haven't really thought about these things yet. :/



Disch wrote: It's not REALLY bad, no. It's just that it will make adding future functionality more difficult. You'll end up with a lot of spaghetti code with a bunch of little hacks worked in to get a specific feature working.
OK, so if I am going to restructure my memory implementation, I'd like to do it now before I get too far into this.

The only other way I could think of to implement memory would be something like this:

Allocate bytes for each section of memory, each with its own byte pointer?

uint8_t * internalRAM = malloc(0x800);
uint8_t * prgROM = malloc(size of rom);

etc etc

This just seems really messy, and I would also have to store the start addresses for each section.

What about having my 64KB byte array and just adding pointers to the start of specific sections?

like uint8_t * prg_rom_bank1 = &cpu_mem[0x8000];

I dunno, at this point I guess I can't see how this would help or be easier than my original plan.



Also, if I don't use my memory implementation, then it seems like reads/writes will be messier, requiring lots of checks on the address to find out which pointer to use, instead of just indexing right into the spot?



----

about the JSR stuff

Disch wrote: Because the high byte of the PC is pushed before the second jump-to byte is fetched
Why does the order matter? I was using a 'read_address()' function for absolute addressing, which returns the 16-bit address starting at the current PC. After this call I just manually increment the PC by 2. Should I always be going one byte at a time, with a function that fetches the byte AND increments the PC?

I want to clarify this before I go on, because these sorts of things are happening ALL the time: reading a byte, incrementing the PC, etc.

Thanks again!

Disch
Posts: 1849
Joined: Wed Nov 10, 2004 6:47 pm

Post by Disch » Sun Jun 07, 2009 12:17 pm

jjpeerless wrote: Isn't it safe to assume games won't do this,
NO!

Most mappers put their write-only regs at $8000-FFFF, so the game will write there every time it wants to swap PRG or CHR (several times per frame). Other games write to ROM deliberately as an antipiracy check (to see if the ROM is being run on a copier, which might allow overwriting ROM).

Assume games can and will do everything. Because they can. And they do.
jjpeerless wrote: Anyways, so would a better approach be to have read_memory(16 bit address) and write_memory(addr) functions? in the write_memory I could add checks to the addr to make sure its not in read-only memory? is that the idea?
Generally yeah that's the idea.

A common variant is, instead of one overall read function and one overall write function, to split it up so that you have 16 function pointers (one for each $1000 bytes of address space) and assign them to different read/write functions. For example, ReadFunc[0] and ReadFunc[1] would point to ReadRAM, ReadFunc[2] and ReadFunc[3] would point to ReadPPU, ReadFunc[8] and up would point to ReadROM, etc. This allows for easy mapper intervention as well (just change a pointer), which is nice because you'll need to catch reads and writes for various mappers.

Blargg proposed a hybrid solution where you have an overall read/write function that first deals with system RAM ($0000-1FFF) and then falls back to the 16 callbacks method if the address is outside that range. The idea is that this is beneficial because RAM reads/writes occur very frequently.
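A minimal sketch of that hybrid idea, assuming a 16-entry callback table indexed by the top four address bits. The names `cpu_read`, `read_procs`, `read_open` and `init_read_procs` are invented for this illustration, and the placeholder handler simply returns 0 instead of modelling open bus.

```c
#include <stdint.h>

typedef uint8_t (*ReadProc)(uint16_t addr);

static uint8_t  ram[0x0800];     /* 2K system RAM */
static ReadProc read_procs[16];  /* one handler per 4K of address space */

/* Placeholder handler for regions that are not hooked up yet. */
static uint8_t read_open(uint16_t addr)
{
    (void)addr;
    return 0;
}

static void init_read_procs(void)
{
    for (int i = 0; i < 16; i++)
        read_procs[i] = read_open;
}

static uint8_t cpu_read(uint16_t addr)
{
    /* Fast path: RAM and its mirrors, no indirect call needed. */
    if (addr < 0x2000)
        return ram[addr & 0x07FF];

    /* Everything else goes through the per-4K callback table. */
    return read_procs[addr >> 12](addr);
}
```

Entries 0 and 1 of the table are never reached here, which is the point: the most frequent accesses skip the function pointer entirely.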
jjpeerless wrote: As I have very limited knowledge of what I am doing, I was just going at this project on a 'as-i-go' basis,
Heh. Yeah we've all been there. It's hard to plan ahead for everything when you don't know what "everything" is. Just prepare to do a lot of rewrites, or hit a few snags/walls. Because it's bound to happen. Don't let it discourage you though!
jjpeerless wrote: I've only been working on implementing the cpu. I am not even sure what/why bankswitching refers to?
Since there's only 32K of PRG space available (addresses $8000-FFFF), games that need more than this employ bankswitching, which "swaps out" different pages of PRG, allowing the overall game to be larger. This is where you get into mappers and stuff. Right now you're probably just dealing with mapper 0 games, which have no swapping.
jjpeerless wrote: Yeah, I noticed that when looking at the memory map. How do people handle these mirrorings?

Code: Select all

// to read RAM (assume addr is between 0-0x1FFF)
v = RAM[ addr & 0x07FF ];
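In other words, only one 2K copy is stored and every access is masked down to it, so nothing has to be "updated" on a write. A tiny sketch of both directions, with hypothetical helper names `read_ram` and `write_ram`:

```c
#include <stdint.h>

static uint8_t RAM[0x0800];  /* the single real copy of system RAM */

/* Both helpers assume addr is somewhere in $0000-$1FFF. */
static uint8_t read_ram(uint16_t addr)
{
    return RAM[addr & 0x07FF];
}

static void write_ram(uint16_t addr, uint8_t value)
{
    RAM[addr & 0x07FF] = value;  /* a write through any mirror hits the same byte */
}
```

A write to $1000 is then visible at $0000, $0800 and $1800 without any extra work.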
jjpeerless wrote: Im guessing you are talking about I/O registers? Haven't really thought about these things yet. :/
Yup. This is a very important part. After all you can't make games if you can't output anything.
jjpeerless wrote: Ok so if I am going to restructure my memory implementation Id like to do it now before I get too far into this.

The only other way I could think to implement memory would be to do something like this:

Allocate bytes for each section of memory to their own byte pointers?
[snip]
this just seems really messy, would also have to store the start addresses for each section.
That is typically what is done. And it's actually much more organized (not messy) because each block of memory is clearly labelled (ie: you have RAM, ROM buffers instead of just a big "memory" buffer which could be anything or nothing). Keep in mind that when you get into bankswitching and stuff, the "big 64k clump" just isn't practical and you're much better off with separate buffers.

You might want to look at some bankswitching info before you go further. Once you see how it works you'll better understand what I'm talking about.

For an example -- let's take a simple "mapper 2"-ish example:

- PRG-ROM is 128K (0x20000 bytes)
- this is broken up into 8 "pages", each 16K (0x4000) in size.
- $C000-FFFF is mapped to the last page (page 7: 0x1C000-0x1FFFF)
- $8000-BFFF is swappable, so it can reflect any of the 8 pages in the ROM.

Basically, when the games "swaps" PRG, it's changing which page is "visible" in that slot. IE:

Code: Select all

JSR SwapToPage_0
LDA $8000    ; reads from PRG offset 0x00000
JSR SwapToPage_2
LDA $8000    ; reads from PRG offset 0x08000
For this to work with the "big 64K clump" you'd need to copy the full 16K page to your memory block every time a swap occurs. Since swaps occur repeatedly and rapidly (several times per frame) this is very wasteful.

The general approach to this is to have pointers. Like one pointer for each $1000 bytes. These pointers would then point to different areas in the main PRG buffer. That way to swap all you have to do is change where the pointer(s) point(s).
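Here is a rough sketch of the "mapper 2"-ish case described above, using two 16K page pointers rather than one per $1000 to keep it short. It also shows a write to $8000-FFFF acting as a bank select instead of modifying ROM. Everything here (`prg_init`, `prg_read`, `prg_write`, the fixed 8-page size) is an assumption made for illustration, and real mappers have details (such as bus conflicts) that are ignored.

```c
#include <stdint.h>
#include <stdlib.h>

#define PRG_PAGE_SIZE 0x4000u  /* 16K per page */
#define PRG_PAGES     8u       /* 128K total, as in the example above */

static uint8_t *prg_rom;  /* whole PRG-ROM image */
static uint8_t *prg_lo;   /* page currently visible at $8000-$BFFF */
static uint8_t *prg_hi;   /* page fixed at $C000-$FFFF (last page) */

static int prg_init(void)
{
    prg_rom = calloc(PRG_PAGES, PRG_PAGE_SIZE);
    if (!prg_rom)
        return -1;
    prg_lo = prg_rom;                                    /* page 0 */
    prg_hi = prg_rom + (PRG_PAGES - 1) * PRG_PAGE_SIZE;  /* page 7 */
    return 0;
}

/* addr is assumed to be in $8000-$FFFF. */
static uint8_t prg_read(uint16_t addr)
{
    uint8_t *page = (addr & 0x4000) ? prg_hi : prg_lo;
    return page[addr & 0x3FFF];
}

/* A write in $8000-$FFFF never touches ROM; it only selects a page. */
static void prg_write(uint16_t addr, uint8_t value)
{
    (void)addr;
    prg_lo = prg_rom + (size_t)(value % PRG_PAGES) * PRG_PAGE_SIZE;
}
```

A swap costs one pointer assignment, whereas the flat 64K array would need a 16K copy each time.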
jjpeerless wrote: Also, if I dont use my memory implementation then it seems like reads/writes will be messier, requiring lots of checks of where the address is to find out which pointers to use, instead of just indexing right into the spot?
Hence why function pointers are a common solution. They make it sort of like a jump table.

Code: Select all

v = ReadFunc[ addr>>12 ](addr);
jjpeerless wrote: Why does the order matter?
It doesn't really. All that matters is that you emulate the desired output behavior. I was just saying why the 6502 does it the way it does. Your emu need not do it that way.

It also depends on the level of accuracy you want. If you want your CPU core to be cycle accurate, then you'd want to do the reads and writes in the exact right order and have them spaced out one cycle apart.

cpow
NESICIDE developer
Posts: 1099
Joined: Mon Oct 13, 2008 7:55 pm
Location: Minneapolis, MN

Post by cpow » Sun Jun 07, 2009 12:20 pm

jjpeerless wrote:thanks for the quick reply! few more questions for you =)
I just implemented separate classes. My CPU class has internal 2KB memory. My PPU class has internal 8KB memory (nametable), 256B memory (for OAM), and 32B memory (for palettes). My ROM (cartridge) class has internal 8KB SRAM, EXRAM [still working this], and Nx16KB PRG banks and Nx8KB CHR-ROM banks. The ROM is filled with data when a .nes file is loaded. The ROM class takes care of all memory mapping by subclassing the ROM class for different mapper implementations.

Then my emulator has memory read/write functions that check the address and forward the request to the appropriate class based on address range. Start at the top and work your way down...ie.

if ( addr >= 0x8000 ) ROM::read(addr)
else if ( addr >= 0x6000 ) ROM::sram(addr)
else if ( addr >= 0x4020 ) ROM::extra(addr)
else if ( addr >= 0x4000 ) IO::read(addr)
else if ( addr >= 0x2000 ) PPU::read(addr)
else CPU::read(addr)

Internally the classes know whether the address read is a register, memory, etc and return the appropriate result or cause the appropriate behavior (such as joypad strobe in IO class on write to 0x4016).

tanoatnd
Posts: 37
Joined: Sat Apr 18, 2009 12:45 am

Post by tanoatnd » Sun Jun 07, 2009 12:41 pm

Personally, I use this trick in my CPU emulator (I have not implemented much
of the PPU yet): represent each flag as a separate variable that
holds 0 if cleared, and the corresponding PSR bit if set.
Example:

uint8_t c_flag, n_flag; /* etc. */

#define C_FLAG 0x01
..
#define N_FLAG 0x80

c_flag = 0; /* clear carry */
c_flag = C_FLAG; /* set carry */

When you need the value of the PSR, OR (|) all the variables together.
Bye,
tano

jjpeerless
Posts: 5
Joined: Sat Jun 06, 2009 2:32 pm

Post by jjpeerless » Sun Jun 07, 2009 3:10 pm

OK, so I have read and re-read your replies a bunch of times to try to grasp it all.

I am pretty sure I get the PRG bank swapping, and I understand that it would make sense to have pointers PRG_LOW_BANK and PRG_HIGH_BANK which point to whichever PRG_ROM page I need. Say there are 8 pages, as in your example, where each PRG_ROM page is 16KB: I could do PRG_LOW_BANK = &PRG_ROM[7]; PRG_HIGH_BANK = &PRG_ROM[0]; or whichever pages I need the banks to point to.

Aside from that, I don't really see why I shouldn't just lump the rest of memory into one chunk.

Also, for the read/write functions:

Say I'm doing something like:

if(addr < 0x0800)
{
read_RAM(addr);
}

if(addr < 0x2000)
{
read_RAM(map_addr_to_actual_ram_since_its_accessing_a_mirror);
}

if(addr < 0x8000)
{
read_??(addr);
}


If the CPU memory were divided into its own chunks, then the address passed into the function would be indexing into the "whole thing", but we only have a chunk from 0 to size_of_chunk. So would there also need to be some address translation so it indexes correctly?

Disch
Posts: 1849
Joined: Wed Nov 10, 2004 6:47 pm

Post by Disch » Sun Jun 07, 2009 4:29 pm

preface:

I might sound like I'm lecturing you here and/or telling you how to build your emu. I don't mean to sound too pushy, I'm just trying to share my experience with you so you don't have to go through all the trial and error hardships I did.

Remember that this is your project, and I'm only giving you input. Feel free to tell me to shove it and code the project your own way. Remember that it's all about fun and if doing it your way is more fun, then that's how you should do it! Don't let me bully you. I really don't want to!
jjpeerless wrote: I am pretty sure I get the PRG bank swapping and understand that it would make sense to have pointers to PRG_LOW_BANK and PRG_HIGH_BANK
You'll want to go finer than 16K, though. I just used 16K in my example. There's also 8K swapping on many mappers (in fact it's probably the most common). NSFs go with 4K, so if you want NSF support you'll need to go at least that low.
jjpeerless wrote: aside from that I don't really see why I shouldnt just lump the rest of memory into one chunk.
Rather than ask "why shouldn't I?" Try asking "why should I?" I already listed several reasons why you shouldn't.

Also note that $6000-7FFF may also be swappable (RAM or ROM)

What good does having a 64K chunk do you? All you really need to gut it is:

1) a system RAM buffer (2K for RAM at $0000-07FF)
2) a PRG ROM buffer (variable size, for game's PRG-ROM)
3) a PRG RAM buffer (variable size, typically 0 or 8K for on cartridge RAM / SRAM -- typically $6000-7FFF)
4) a sane way to perform reads/writes

1 is hardly anything big
2 and 3 you need anyway because of swapping issues
and 4 you should have anyway otherwise adding IO regs and mapper stuff will be a complete nightmare.
jjpeerless wrote: if the cpu memory was divided into its own chunks, then the address passed into the function would be indexing into the "whole thing" but we only have a chunk from 0-->size_of_chunk, so would there also need to be some address translation so it indexes correctly?
side note: notice that it's called an "address" and not an "index". This is because you're not really indexing anything. Different addresses get mapped to different areas. Some areas contain RAM, others ROM, others IO regs, and others nothing at all. This is the concept you and I seem to be clashing on. You want to shove everything in a big array so indexing is quick, but that's not really logical when you look at what's actually going on.

As for translating addresses, this is pretty much always a quick mask. You can think of it as the high bits of the address tell the NES where to go, and the low bits tell it what address within that area to read from. So for example address $8123 would go to PRG because the high bit is set, and the low bits indicate it wants address $0123 from within the PRG. All you need to do to extract those low bits is an AND operation.
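The $8123 example can be written out as a small worked sketch. The `Region` enum and both functions are hypothetical, and the PRG mask assumes a plain 32K window with no bankswitching.

```c
#include <stdint.h>

typedef enum { REGION_RAM, REGION_OTHER, REGION_PRG } Region;

/* High bits of the address decide where the access goes. */
static Region region_of(uint16_t addr)
{
    if (addr & 0x8000) return REGION_PRG;   /* high bit set: $8000-$FFFF */
    if (addr < 0x2000) return REGION_RAM;   /* $0000-$1FFF, mirrored RAM */
    return REGION_OTHER;                    /* registers, cartridge space, etc. */
}

/* Low bits give the offset within that area: a single AND. */
static uint16_t offset_in_region(uint16_t addr)
{
    switch (region_of(addr)) {
    case REGION_PRG: return (uint16_t)(addr & 0x7FFF); /* $8123 -> $0123 */
    case REGION_RAM: return (uint16_t)(addr & 0x07FF); /* $0923 -> $0123 */
    default:         return addr;                      /* left untranslated here */
    }
}
```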

I really think I might be able to sell you on the function pointer idea. Here's a simplistic example of how you could use function pointers to sanely handle reading -- simple bankswitching ability is included:

Code: Select all

#include <stdint.h>

typedef uint8_t  u8;   // 8-bit unsigned
typedef uint16_t u16;  // 16-bit unsigned

typedef u8 (*ReadProc)(u16);  // read function pointer typedef

u8   RAM[0x0800];  // 2K system RAM
u8*  PRGROM;       // dynamically allocated PRG ROM
u8*  PRG[0x10];    // bankswapping PRG pointers (each represents 4K)
ReadProc Rd[0x10]; // read procs (each represents 4K address space)

u8 Read_RAM(u16 a)
{
  return RAM[a & 0x07FF];  // mask to handle mirroring
}

u8 Read_Nothing(u16 a)
{
  return 0;  // junk return value
             //  technical note:  you never actually return 0
             //  closest to "nothing" you return is open bus
             //  but don't worry too much about that.
}

u8 Read_PRG(u16 a)
{
  return PRG[a >> 12][a & 0x0FFF];// right shift to get the 4K "slot"
                                  //  and mask to get index within that slot
}

//----------------------------

void WhenYouInit()
{
  // set up your function pointers to read from the desired area
  Rd[0x0] = &Read_RAM; // $0xxx = RAM
  Rd[0x1] = &Read_RAM; // $1xxx = RAM

  Rd[0x2] = &Read_Nothing; // $2xxx - $7xxx = nothing
  Rd[0x3] = &Read_Nothing; //  note $6xxx,$7xxx typically
  Rd[0x4] = &Read_Nothing; //    have PRG RAM, but that's omitted
  Rd[0x5] = &Read_Nothing; //    for this example
  Rd[0x6] = &Read_Nothing;
  Rd[0x7] = &Read_Nothing;

  Rd[0x8] = &Read_PRG; // the rest = PRG ROM
  Rd[0x9] = &Read_PRG;
  Rd[0xA] = &Read_PRG;
  Rd[0xB] = &Read_PRG;
  Rd[0xC] = &Read_PRG;
  Rd[0xD] = &Read_PRG;
  Rd[0xE] = &Read_PRG;
  Rd[0xF] = &Read_PRG;
}

//-----------------------

u8 WhenYouNeedToRead(u16 addr)
{
  // right shift the address to figure out which read function
  //  to call.  Then call it
  return Rd[addr >> 12](addr);
}

//-----------------------

void SwapPRG16K(int page)
{
  // swap to put a new page at $8000-BFFF
  // you'd probably want more generic routines than this
  //  this is just an example

  page *= 0x4000;
  PRG[0x8] = &PRGROM[page         ];
  PRG[0x9] = &PRGROM[page + 0x1000];
  PRG[0xA] = &PRGROM[page + 0x2000];
  PRG[0xB] = &PRGROM[page + 0x3000];
}
Writing could be done the same way. You just make another set of function pointers.
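A sketch of what that second set might look like, mirroring the read example. The names `WriteProc`, `Wr`, `Write_RAM`, `Write_Nothing`, `InitWriteProcs` and `CpuWrite` are made up here, and the `ignored_writes` counter exists only so the effect is observable; a real emulator would route $8000-FFFF writes to the mapper instead.

```c
#include <stdint.h>

typedef void (*WriteProc)(uint16_t addr, uint8_t value);

static uint8_t   RAM[0x0800];     /* 2K system RAM */
static WriteProc Wr[0x10];        /* write procs (each covers 4K) */
static unsigned  ignored_writes;  /* illustration only */

static void Write_RAM(uint16_t a, uint8_t v)
{
    RAM[a & 0x07FF] = v;  /* mask to handle mirroring */
}

/* Used for anything not writable, e.g. PRG-ROM: the data is left untouched. */
static void Write_Nothing(uint16_t a, uint8_t v)
{
    (void)a;
    (void)v;
    ignored_writes++;
}

static void InitWriteProcs(void)
{
    for (int i = 0; i < 0x10; i++)
        Wr[i] = Write_Nothing;
    Wr[0x0] = Write_RAM;  /* $0xxx = RAM */
    Wr[0x1] = Write_RAM;  /* $1xxx = RAM (mirrors) */
}

static void CpuWrite(uint16_t addr, uint8_t value)
{
    Wr[addr >> 12](addr, value);
}
```

Write protection then falls out of the table: ROM slots simply never get a handler that stores anything.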

jjpeerless
Posts: 5
Joined: Sat Jun 06, 2009 2:32 pm

Post by jjpeerless » Sun Jun 07, 2009 5:22 pm

Hey,

Thanks for the simple example, should clear up a lot of the confusion I was having. My only question regarding your code is this:

Code: Select all

u8*  PRG[0x10];    // bankswapping PRG pointers (each represents 4K) 
If each pointer points to 4KB of PRG, and there are 16 pointers, that is 64KB worth of PRG pointers, when the NES is supposed to only have room for 32KB of addressable PRG at a time, from 0x8000 to 0xFFFF?


Anyway, I am going to go ahead and work on getting all the opcodes implemented first, using dummy read_memory/write_memory placeholders in them. That should take a while given how many there are.

Once I finish all of them, I'll implement a memory access system similar to the one in your sample and see where I'm at.

Quick extra question, though, regarding the stack pointer. The stack addresses range from 0x0100 to 0x01FF. In the opcode TXS we set the stack pointer to the value stored in the X register. If the X register is 8 bits, then it can't possibly hold a valid stack address? This opcode seems strange to me anyway. Shouldn't the stack pointer just be initialized to 0x01FF and decremented/incremented whenever something is pushed or popped? Do we just do something like SP = (0x0100 | X) and make the SP a 16-bit address?

Anyways, thanks again for the help.

tepples
Posts: 22277
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples » Sun Jun 07, 2009 5:34 pm

If you know what LDA $0100,X means, then stack accesses are like LDA $0100,SP. That's how an 8-bit stack pointer can point between $0100 and $01FF.

Disch
Posts: 1849
Joined: Wed Nov 10, 2004 6:47 pm

Post by Disch » Sun Jun 07, 2009 5:35 pm

jjpeerless wrote: if each pointer points to 4KB of PRG_RAM, and there are 16 pointers, this is 64KB of PRG_RAM pointers when the nes is supposed to only have room for 32KB of addressable PRG_RAM at a time? From 0x8000 to 0xFFFF?
Some mappers put it at $6000-FFFF, so you'd need at least 10. But yeah you don't need all of them. Having 16 pointers just makes the computation easier (don't have to subtract or anything)
jjpeerless wrote: Quick extra question though regarding the stack pointer. The stack address range from 0x0100 to 0x01FF. In the opcode TXS we set the stack pointer to the value stored in X register. If the X register is 8 bit then it can't possibly address a valid stack address?
[snip]
Do we just do something like SP = (0x0100 | X)? and make the SP a 16bit address?
Stack pointer is only 8 bits. When you do a push/pop operation it accesses ($0100 | SP).
jjpeerless wrote: This opcode seems strange to me anyways, shouldnt the stack pointer just be initialized to 0x01FF and decremented/incremented whenever something is push'd or pop'd?
It's important to be able to set all regs to a known state. Games will TXS once at program startup to ensure that the stack is where they want it to be (some games move the stack down to $0180 so they can use $01Fx for other things). Other games have multiple stacks and use TXS/TSX to switch between them.
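A small sketch of an 8-bit stack pointer in C, assuming the stack lives in the 2K system RAM array. The names are hypothetical. TXS copies X into the stack pointer, and every push or pull forms the real address as $0100 | SP.

```c
#include <stdint.h>

static uint8_t RAM[0x0800];  /* system RAM; the stack page is $0100-$01FF */
static uint8_t SP;           /* stack pointer is only 8 bits */
static uint8_t X;            /* X index register */

/* Push stores at $0100 | SP, then decrements (wrapping within the page). */
static void push(uint8_t v)
{
    RAM[0x0100 | SP] = v;
    SP--;
}

/* Pull increments first, then reads from $0100 | SP. */
static uint8_t pull(void)
{
    SP++;
    return RAM[0x0100 | SP];
}

static void op_txs(void) { SP = X; }  /* TXS does not change any flags */
static void op_tsx(void) { X = SP; }  /* TSX also updates N and Z (omitted here) */
```

Because SP is a uint8_t, pushing at $00 wraps to $FF, so the stack can never leave page 1.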

jjpeerless wrote: thanks again for the help.
np
