All times are UTC - 7 hours



Post new topic Reply to topic  [ 31 posts ]  Go to page Previous  1, 2, 3  Next
PostPosted: Mon Jul 18, 2016 11:23 pm 

Joined: Fri Jul 15, 2016 9:47 pm
Posts: 13
Also, are all integers signed? I've got a "getWord" function, but I wasn't sure if that should return a signed or unsigned value, or if there are times when it's both.


PostPosted: Mon Jul 18, 2016 11:26 pm 

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3192
Location: Mountain View, CA, USA
hatfarm wrote:
Yeah, but the manual implies that sometimes it takes a single cycle to do that, and sometimes it takes multiple cycles (at least in the case of a word vs a byte). If it's a 4 byte instruction, how many cycles is that going to take?

Where in the manual is this "implied"? The CPU cycle counts are defined clearly and are static, barring the conditionals that might cause them to take more time. If you want T-phase tear-downs of each addressing mode, that's available as well, but you do not need to worry about that level of granularity when emulating the CPU. Honest.

If we're talking about memory access times etc. then that's a different subject and one I can't really talk about (the other hardware guys here can).


PostPosted: Mon Jul 18, 2016 11:33 pm 

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3192
Location: Mountain View, CA, USA
hatfarm wrote:
Also, are all integers signed? I've got a "getWord" function, but I wasn't sure if that should return a signed or unsigned value, or if there are times when it's both.

When you say "are all integers signed", I need to know what you're talking about *specifically*. This question is sort of loaded, in the sense that it sounds like something someone familiar with higher level languages (particularly C) would ask. I don't mean that in a judgy way either!

Are we talking about branch instructions? If so, the operand is treated as a signed number. Otherwise no, most things are unsigned.

However, many instructions (especially loads) record whether the MSB of the resulting or modified value is set; this is reflected in the CPU flag n (negative). The z (zero) flag is another commonly modified one and indicates whether the result was 0. Whether the underlying 65816 program *chooses* to make use of the n flag is up to the programmer. In other words: values are just values. Which CPU flags an instruction modifies is documented per-opcode/per-addressing-mode.
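As a rough illustration of "values are just values", here is a minimal sketch of how an emulator might keep everything unsigned and only interpret the sign where an instruction calls for it. The `Flags` struct and the helper names (`setNZ8`, `setNZ16`, `branchTarget`) are made up for this example and do not come from the manual or from any existing emulator.

```cpp
#include <cstdint>

// Hypothetical flag container; the names are illustrative only.
struct Flags {
  bool n = false;  // negative: copy of the result's most significant bit
  bool z = false;  // zero: set when the result is exactly 0
};

// Update n and z after an 8-bit result (register in 8-bit mode).
inline void setNZ8(Flags& f, uint8_t result) {
  f.n = (result & 0x80) != 0;
  f.z = (result == 0);
}

// Update n and z after a 16-bit result (register in 16-bit mode).
inline void setNZ16(Flags& f, uint16_t result) {
  f.n = (result & 0x8000) != 0;
  f.z = (result == 0);
}

// Short branches: the one-byte operand is a signed displacement relative to
// the address of the next instruction. The sum wraps within 16 bits.
inline uint16_t branchTarget(uint16_t nextPc, uint8_t operand) {
  return static_cast<uint16_t>(nextPc + static_cast<int8_t>(operand));
}
```

With this shape, a "getWord"-style fetch can simply return an unsigned 16-bit value, and the cast to a signed type happens only at the one place (the branch displacement) that needs it.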

Do you have the time to sit down and read the manual (not skim it)? If you do, I think it'd be worthwhile. The WDC manual (originally from Ron Lichty and David Eyes) actually reads fairly easily a lot of the time, meaning it's not necessarily "hard" reading material. It goes over 6502 and 65c02 as well, so you can have a good understanding of the 8-bit CPUs it was based on (and that information applies to emulation mode as well).


PostPosted: Tue Jul 19, 2016 12:05 am 

Joined: Fri Jul 15, 2016 9:47 pm
Posts: 13
koitsu wrote:
Where in the manual is this "implied"? The CPU cycle counts are defined clearly and are static, barring the conditionals that might cause them to take more time. If you want T-phase tear-downs of each addressing mode, that's available as well, but you do not need to worry about that level of granularity when emulating the CPU. Honest.

If we're talking about memory access times etc. then that's a different subject and one I can't really talk about (the other hardware guys here can).

I do mean memory access times. I actually would be interested in T-phase teardowns of these things, but would probably just think they were cool and move on. I'm not interested in super hardcore perfect reproduction, I just want to make sure I'm not missing anything brutal that will cause my times to slip.

I can't remember the exact line, but I think it said that during a memory read (LDA/STA), it'd read/write 2 bytes, but there were other times when it would read a single byte, and those take the same amount of time.


PostPosted: Tue Jul 19, 2016 12:09 am 

Joined: Fri Jul 04, 2014 9:31 pm
Posts: 818
koitsu wrote:
2. Most branch instructions cost 3 CPU cycles unconditionally.

Huh? No, it's only unconditional branches that always take 3 cycles. Conditional branches only take 3 cycles if they're taken; otherwise they take 2. (Plus the page boundary thing in emulation mode, of course.)


PostPosted: Tue Jul 19, 2016 12:14 am 

Joined: Fri Jul 15, 2016 9:47 pm
Posts: 13
koitsu wrote:
When you say "are all integers signed", I need to know what you're talking about *specifically*. This question is sort of loaded, in the sense that it sounds like something someone familiar with higher level languages (particularly C) would ask. I don't mean that in a judgy way either!

Are we talking about branch instructions? If so, the operand is treated as a signed number. Otherwise no, most things are unsigned.

However, many instructions (especially loads) record whether the MSB of the resulting or modified value is set; this is reflected in the CPU flag n (negative). The z (zero) flag is another commonly modified one and indicates whether the result was 0. Whether the underlying 65816 program *chooses* to make use of the n flag is up to the programmer. In other words: values are just values. Which CPU flags an instruction modifies is documented per-opcode/per-addressing-mode.

Do you have the time to sit down and read the manual (not skim it)? If you do, I think it'd be worthwhile. The WDC manual (originally from Ron Lichty and David Eyes) actually reads fairly easily a lot of the time, meaning it's not necessarily "hard" reading material. It goes over 6502 and 65c02 as well, so you can have a good understanding of the 8-bit CPUs it was based on (and that information applies to emulation mode as well).


I'm definitely a higher level dude (C, Java, Python, JavaScript, all recently), but I do understand low level details. I worked on a team that determined whether IO signals were valid or not coming from sensors, and we looked at timing diagrams and such quite often. I did write Linux device drivers for what was essentially the NES controller in my OS course a few years back (I was old in school), but I haven't done much assembly work since then. I also designed/wrote VHDL for a RISC microprocessor, so I have the background to understand this stuff, I'm just a bit rusty.

I don't really remember dealing with multibyte data (and that was x86, not 65816), so that's why I'd like to know. In what instances are we getting a signed integer (8 or 16 bit) and in what instances is it not? I have to select the right datatypes when I'm parsing this, so I want to make sure I do so.

I've been skimming the manual (basically implementing an instruction at a time and reading what I can from the manual). I understand the flags and stuff, they're basically the same as other processors I've worked with. I didn't actually see that bit about the BRA instruction's offset being signed, which would be a big deal if I had screwed it up. However, I've mostly been working on this stuff late at night, so my reading comprehension could be impaired a bit by my sleepiness.

Anyway, thank you so much for your helpful (and quick!) responses!


PostPosted: Tue Jul 19, 2016 12:30 am 

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3192
Location: Mountain View, CA, USA
hatfarm wrote:
I do mean memory access times.

Okay, how those fit into the picture is something folks like lidnariq or byuu or others would have to comment on. Purely from a 65816 programmer's perspective, stuff like that has never been a "big" focus of mine. I care exclusively about CPU cycle counts and not, say, how many memory clocks/cycles a read from some memory bus takes. Does it matter? Yes, but nobody has ever been able to explain it to me in a way that fully clicked. For example, I can tell you that sure, 3.58MHz memory access is faster than 2.68MHz, and sure, you should try to benefit from that (I think it's 120ns vs. 200ns memory access times? I'm thinking of ROMs here). Likewise, I can tell you a 65816 on a 10MHz clock/crystal definitely runs faster than a 2MHz one (I know this first-hand from having an Apple IIGS accelerator card :-) ). But that's somewhat anecdotal. The hardware bits are what I tend to shy away from.

hatfarm wrote:
I actually would be interested in T-phase teardowns of these things, but would probably just think they were cool and move on.

Sometimes the teardowns help in understanding "how" the CPU does something, but generally speaking it isn't very helpful for emulation (IMO). The references I'm thinking of are here (for 6502 exclusively) and here (for 65816 exclusively).

hatfarm wrote:
I can't remember the exact line, but I think it said that during a memory read (LDA/STA), it'd read/write 2 bytes, but there were other times when it would read a single byte, and those take the same amount of time.

I'd need a reference to this quote/concern. It sounds to me like it's talking about register size, because the 65816 lets you change the size (8-bit vs. 16-bit) of the accumulator and the X/Y index registers dynamically at run-time. A 16-bit access takes one extra cycle (the data bus is only 8 bits wide, so the second byte needs its own cycle), but it's consistent.
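To show what "consistent" means in practice, here is a small sketch that models only the register-width penalty for lda in two addressing modes. The enum and function names are invented for the example, and every other penalty (direct page alignment, page crossing, memory speed) is deliberately left out.

```cpp
// Hypothetical helper: cycle count for lda, modelling ONLY the extra cycle
// caused by a 16-bit accumulator. Other penalties are intentionally ignored.
enum class AddrMode { Immediate, Absolute };

// accumulator8Bit mirrors the m flag: true means the accumulator is 8 bits.
inline unsigned ldaCycles(AddrMode mode, bool accumulator8Bit) {
  // Base counts: lda #const takes 2 cycles, lda addr takes 4.
  unsigned cycles = (mode == AddrMode::Immediate) ? 2 : 4;
  // A 16-bit accumulator needs one more cycle for the second data byte.
  if (!accumulator8Bit) cycles += 1;
  return cycles;
}
```

The same "+1 when the register is 16 bits wide" pattern shows up across the opcode tables, which is why it can be handled in one place instead of per instruction.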

Random tip in passing: the two opcodes you're going to have the biggest problem with are probably adc and sbc. On 65816 there are some others that'll cause grief too, but these two remain the biggest stumbling points for emulator authors, given the two's complement arithmetic and how the overflow flag fits into the picture. They've been discussed heavily here. I always refer people to this thread (the post from blargg has the easiest proper implementation).
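For reference, a minimal sketch of 8-bit, binary-mode adc and sbc using the commonly cited overflow formula. This is a condensed illustration, not a copy of blargg's post; the `AdcResult` struct and function names are invented here, and decimal mode (d flag set) is not handled at all.

```cpp
#include <cstdint>

// Hypothetical result bundle: the new accumulator value plus affected flags.
struct AdcResult {
  uint8_t value;
  bool c, v, n, z;
};

// 8-bit binary-mode ADC. Decimal mode is NOT handled in this sketch.
inline AdcResult adc8(uint8_t a, uint8_t operand, bool carryIn) {
  // Do the addition in a wider unsigned type so the carry is visible.
  unsigned sum = unsigned(a) + operand + (carryIn ? 1u : 0u);
  uint8_t result = static_cast<uint8_t>(sum);
  AdcResult r;
  r.value = result;
  r.c = sum > 0xFF;
  // Overflow: both inputs share a sign bit and the result's sign differs.
  r.v = ((~(a ^ operand)) & (a ^ result) & 0x80) != 0;
  r.n = (result & 0x80) != 0;
  r.z = (result == 0);
  return r;
}

// In binary mode, SBC behaves like ADC with the operand's bits inverted.
inline AdcResult sbc8(uint8_t a, uint8_t operand, bool carryIn) {
  return adc8(a, static_cast<uint8_t>(~operand), carryIn);
}
```

For example, `adc8(0x7F, 0x01, false)` yields 0x80 with v and n set and c clear, which is the classic "positive plus positive gave a negative" overflow case.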


PostPosted: Tue Jul 19, 2016 12:33 am 

Joined: Fri Jul 15, 2016 9:47 pm
Posts: 13
Thank you again! So glad someone pointed me here!


PostPosted: Tue Jul 19, 2016 12:40 am 

Joined: Fri Jul 15, 2016 9:47 pm
Posts: 13
Also, I already had to handle the SBC logic (through CPX), and I think I got it right, based on what's in that post. However, definitely not as elegantly written as Blargg's.


PostPosted: Tue Jul 19, 2016 12:42 am 

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3192
Location: Mountain View, CA, USA
93143 wrote:
koitsu wrote:
2. Most branch instructions cost 3 CPU cycles unconditionally.

Huh? No, it's only unconditional branches that always take 3 cycles. Conditional branches only take 3 cycles if they're taken; otherwise they take 2. (Plus the page boundary thing in emulation mode, of course.)

Sorry, you're correct. I could explain what I meant by my statement, but all it'd do is add further confusion to the thread. So I'll correct myself and clarify:

Branch instructions which have conditionals (ex. bcc, bcs, beq, bmi, bne, bpl, bvc, bvs) take 2 cycles by default. If the conditional proves true (branch is taken), there's an additional 1 cycle penalty. There's an additional 1 cycle penalty on top of that if in emulation mode, and the branch is taken, and the effective address crosses a page boundary.

The bra instruction takes 3 cycles (it is always taken), and there's an additional 1 cycle penalty on top of that if in emulation mode and the effective address crosses a page boundary.

The brl instruction takes 4 cycles.
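The rules above can be sketched as a single helper. The names (`BranchKind`, `branchCycles`) are invented for illustration, and one detail is an assumption on my part: the page-boundary check compares the address of the instruction following the branch with the branch target.

```cpp
#include <cstdint>

// Hypothetical classification of the branch opcodes discussed above.
enum class BranchKind { Conditional, Bra, Brl };

// pcAfter: address of the instruction following the branch.
// target:  address the branch would jump to.
// Assumption: a page crossing means pcAfter and target differ in their
// high byte.
inline unsigned branchCycles(BranchKind kind, bool taken, bool emulationMode,
                             uint16_t pcAfter, uint16_t target) {
  if (kind == BranchKind::Brl) return 4;      // brl: fixed cost
  if (kind == BranchKind::Bra) taken = true;  // bra is always taken
  unsigned cycles = 2;                        // base cost of a short branch
  if (taken) {
    cycles += 1;                              // penalty for taking the branch
    bool crossesPage = (pcAfter & 0xFF00) != (target & 0xFF00);
    if (emulationMode && crossesPage) cycles += 1;  // emulation mode only
  }
  return cycles;
}
```

A not-taken bne costs 2, a taken one costs 3, and only in emulation mode can a taken short branch reach 4.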


PostPosted: Tue Jul 19, 2016 12:50 am 

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3192
Location: Mountain View, CA, USA
hatfarm wrote:
Also, I already had to handle the SBC logic (through CPX), and I think I got it right, based on what's in that post. However, definitely not as elegantly written as Blargg's.

cpx doesn't affect the v flag, though; adc/sbc do. You probably got the c flag right though. The WDC manual actually goes over all this too (chapter 9) if you want a long step-by-step walkthrough.
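A minimal sketch of that difference for an 8-bit index register: the compare performs the subtraction for flag purposes only and leaves v alone. The `Flags` struct and `cpx8` name are invented for this example.

```cpp
#include <cstdint>

// Hypothetical flag container; only the flags relevant here are listed.
struct Flags {
  bool c = false, z = false, n = false, v = false;
};

// 8-bit CPX: computes x - operand, updates c, z and n, discards the result.
// The v flag is intentionally left untouched.
inline void cpx8(Flags& f, uint8_t x, uint8_t operand) {
  uint8_t diff = static_cast<uint8_t>(x - operand);
  f.c = (x >= operand);        // carry set means no borrow was needed
  f.z = (x == operand);
  f.n = (diff & 0x80) != 0;    // sign bit of the discarded difference
}
```

If an existing sbc routine is reused for compares, the thing to double-check is that it is called in a way that does not write v (and does not store the result back into the register).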


PostPosted: Tue Jul 19, 2016 7:48 am 

Joined: Sun Jan 26, 2014 9:31 am
Posts: 266
hatfarm wrote:
So glad someone pointed me here!


You're welcome :D

I knew the fine folks here would be able to help you. I'm glad this discussion came up too, because I might dabble with 65816 ASM eventually (via the Super Game Boy). I'll be bookmarking this for reference.


PostPosted: Tue Jul 19, 2016 5:44 pm 

Joined: Mon Nov 10, 2008 3:09 pm
Posts: 431
hatfarm wrote:
Also, are all integers signed? I've got a "getWord" function, but I wasn't sure if that should return a signed or unsigned value, or if there are times when it's both.


When you're writing a CPU emulator, you should use unsigned integers almost everywhere, and explicitly test the sign bit (e.g. flags.n = (bool)(result & 0x8000)) or cast to a signed type on an as-needed basis (regs.pc += (int8_t)operand). The reason is that in C and C++, signed integer overflow is undefined behaviour. Code like the following:

Code:
void add(int16_t operand) {
  int16_t result = regs.a + operand; // regs.a is an int16_t
  flags.n = (result < 0);
  // compute other flags
  regs.a = result;
}


simply cannot be guaranteed to give the result you expect for flags.n (or even for result). (int16_t)32767 + (int16_t)32767 does not necessarily equal (int16_t)(-2) in C.
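For contrast, a minimal sketch of the same operation written with unsigned types, where wrap-around is well defined and the sign bit is tested explicitly. The `Regs` and `Flags` structs and the `add16` name are stand-ins invented for this example, and the carry-in and overflow flag are left out to keep it short.

```cpp
#include <cstdint>

// Hypothetical register and flag containers for the illustration.
struct Regs {
  uint16_t a = 0;  // accumulator kept as an unsigned 16-bit value
};
struct Flags {
  bool n = false, z = false, c = false;
};

// 16-bit add without carry-in; overflow (v) is omitted from this sketch.
inline void add16(Regs& regs, Flags& flags, uint16_t operand) {
  // Widen first so the carry out of bit 15 is still visible.
  uint32_t sum = uint32_t(regs.a) + operand;
  uint16_t result = static_cast<uint16_t>(sum);  // well-defined truncation
  flags.c = sum > 0xFFFF;
  flags.n = (result & 0x8000) != 0;  // test the sign bit explicitly
  flags.z = (result == 0);
  regs.a = result;
}
```

Adding 32767 to 32767 this way leaves 0xFFFE in the accumulator with n set, on every conforming compiler.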


PostPosted: Sun Dec 18, 2016 5:55 pm 

Joined: Mon Jan 23, 2006 7:47 am
Posts: 79
byuu wrote:
Here is how you compute the speed (6,8,12 clocks) of any memory address on the SNES:

Code:
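unsigned CPU::speed(unsigned addr) const {
  // Banks 40-7f and c0-ff (any offset), plus offsets 8000-ffff in every bank:
  // banks 80-ff run at romSpeed, everything below bank 80 at 8 clocks.
  if(addr & 0x408000) return (addr & 0x800000) ? romSpeed : 8;
  // Offsets 0000-1fff and 6000-7fff of banks 00-3f and 80-bf: 8 clocks.
  if((addr + 0x6000) & 0x4000) return 8;
  // Offsets 2000-3fff and 4200-5fff: 6 clocks.
  if((addr - 0x4000) & 0x7e00) return 6;
  // What remains is 4000-41ff: 12 clocks.
  return 12;
}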


Where romSpeed is 6 when $420d.d0=1, and 8 when $420d.d0=0.

I'm pretty sure the region 80-BF:8000..FFFF should be included in the romSpeed calculation, but the "& 0x408000" prevents it? Unless it really is always slow...


Last edited by creaothceann on Mon Dec 19, 2016 8:10 pm, edited 1 time in total.

PostPosted: Sun Dec 18, 2016 6:06 pm 

Joined: Mon Mar 27, 2006 5:23 pm
Posts: 1339
> I'm pretty sure the region 80-BF:8000..FFFF should be included in the romSpeed calculation, but the "& 0x408000" prevents it?

Nope. Either bit allows the condition to pass.

if(addr & 0x408000) is true for:
00-ff:8000-ffff
40-7f:0000-ffff
c0-ff:0000-ffff

The next ternary inside, addr & 0x800000, narrows the range again to:
80-ff:8000-ffff
c0-ff:0000-ffff

The failure condition there captures both ROM regions that must be slow, and WRAM:
00-3f:8000-ffff
40-7d:0000-ffff
7e-7f:0000-ffff
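The ranges above can be spot-checked with a free-function restatement of the routine quoted earlier in the thread. The name `accessSpeed` and passing `romSpeed` as a parameter are choices made for this sketch only; the bit tests themselves are unchanged.

```cpp
#include <cstdint>

// Free-function restatement of the quoted speed routine, for experimenting.
// addr is a 24-bit SNES address (bank in bits 16-23); romSpeed is 6 or 8.
inline unsigned accessSpeed(uint32_t addr, unsigned romSpeed) {
  // Bit 22 set (banks 40-7f, c0-ff) or bit 15 set (offsets 8000-ffff):
  // banks 80-ff get romSpeed, everything below bank 80 gets 8 clocks.
  if (addr & 0x408000u) return (addr & 0x800000u) ? romSpeed : 8;
  // Offsets 0000-1fff and 6000-7fff: 8 clocks.
  if ((addr + 0x6000u) & 0x4000u) return 8;
  // Offsets 2000-3fff and 4200-5fff: 6 clocks.
  if ((addr - 0x4000u) & 0x7e00u) return 6;
  // Offsets 4000-41ff: 12 clocks.
  return 12;
}
```

Calling `accessSpeed(0x808000, 6)` returns 6, so 80-bf:8000-ffff does follow romSpeed, while `accessSpeed(0x008000, 6)` stays at 8.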

