
Re: Is a Game Boy faster than an NES?

Posted: Tue Apr 03, 2018 8:45 am
by Oziphantom
I ran the test: if an IRQ is triggered while interrupts are disabled (DI), the Z80 will jump into the IRQ handler once they are re-enabled with EI.

Re: Is a Game Boy faster than an NES?

Posted: Thu Apr 05, 2018 3:20 pm
by tepples
ISSOtm in the GBDev Discord server had an idea to split the actor struct into two, pointing HL at one half and DE at the other. This allows actor movement code to have two "last offset" values at the cost of register DE.

One way to split it up is to put things relevant to display (and possibly movement) in one struct and things relevant only to movement in the other. Things relevant to display include pixel/screen position, current frame, and current direction. Things relevant only to movement include subpixel position, velocity, health, time until next frame, etc.

In a platformer, a display struct might begin with
Y pixel position
X pixel position
X 256 pixels position
Actor type
Current frame
Facing direction
VRAM offset of actor's tiles

and the movement part of the struct might begin with
Y pixel velocity
Y subpixel velocity
Y subpixel position
X pixel velocity
X subpixel velocity
X subpixel position
Time to next frame
Health
Height of received damage

With this arrangement, a game can point DE at the display part and HL at the movement part, call "add velocity to position" twice, once for Y and once for X, and then update the X 256 pixels position based on the velocity sign and the carry from adding velocity.
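
A minimal sketch of that "add velocity to position" step, modeled in Python rather than SM83 assembly so it is easy to check. It assumes a signed 8.8 fixed-point velocity and a position made of subpixel, pixel, and "256 pixels" bytes. The function name and argument order are hypothetical, not taken from any actual game.

```python
def add_velocity(pos_hi, pos_pixel, pos_sub, vel_pixel, vel_sub):
    """Add a signed 8.8 velocity to a 24-bit position, one byte at a time.

    All arguments are unsigned bytes (0-255). vel_pixel carries the sign
    in bit 7 (two's complement). Returns (pos_hi, pos_pixel, pos_sub).
    """
    # Low byte: subpixel position + subpixel velocity (like ADD)
    total = pos_sub + vel_sub
    pos_sub, carry = total & 0xFF, total >> 8

    # Middle byte: pixel position + pixel velocity + carry (like ADC)
    total = pos_pixel + vel_pixel + carry
    pos_pixel, carry = total & 0xFF, total >> 8

    # High byte: sign-extend the velocity. A negative velocity adds $FF,
    # a non-negative one adds $00; the carry is added in both cases.
    sign_ext = 0xFF if vel_pixel & 0x80 else 0x00
    pos_hi = (pos_hi + sign_ext + carry) & 0xFF
    return pos_hi, pos_pixel, pos_sub
```

For example, an actor at X = 255.5 moving right at 1 pixel per frame crosses into the next 256-pixel page: add_velocity(0x00, 0xFF, 0x80, 0x01, 0x00) gives (0x01, 0x00, 0x80). The Y axis would use only the first two steps, since the display struct above has no "Y 256 pixels" byte.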

Re: Is a Game Boy faster than an NES?

Posted: Tue Apr 17, 2018 7:41 pm
by tepples
The Game Boy analog of Popslide is this, which takes 5 bytes of program code and 9 machine cycles per pair of copied bytes:

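Code:

rept MAXCOPYLEN/2
  pop bc        ; 3 cycles: fetch two bytes from the buffer via SP
  ld a,c        ; 1 cycle
  ld [hl+],a    ; 2 cycles: write low byte, advance HL
  ld a,b        ; 1 cycle
  ld [hl+],a    ; 2 cycles: write high byte, advance HL
endr

Or, as seen in a part of first-generation Pokémon nicknamed "vcopy.asm" by reverse engineers, with the same timing characteristics: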

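Code:

rept MAXCOPYLEN/2
  pop de        ; 3 cycles: fetch two bytes from the buffer via SP
  ld [hl],e     ; 2 cycles: write low byte
  inc l         ; 1 cycle: advances L only, so stay within one 256-byte page
  ld [hl],d     ; 2 cycles: write high byte
  inc l         ; 1 cycle
endr

This kind of copy is helped on Game Boy (4.5 cycles per byte instead of 8 on NES) by a few things: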
  • Stack pointer is bigger, allowing multiple buffers if needed.
  • The stack is full (points to top used element) rather than empty (points to first unused element). This means the cycle penalty for pre-modification is taken on push rather than on pop.
  • pop loads two bytes.
  • Pointer register HL means not having to repeat the VRAM address in program code for each copied byte.
Thus the theoretical count of bytes that can be copied from the stack to VRAM during one vblank minus one OAM DMA is about 216 for both.

NES: (113.667*20 - 545)/8
GB: (114*10-168)/4.5
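
The arithmetic behind those two figures can be checked with a short Python sketch. The cycle counts for the Game Boy loop are the standard SM83 timings. The NES loop is assumed to be a pla / sta $2007 pair, which matches the 8 cycles per byte quoted above. The scanline and OAM DMA numbers are taken straight from the two formulas.

```python
# Game Boy: one unrolled iteration copies two bytes (machine cycles).
gb_pair = [("pop bc", 3), ("ld a,c", 1), ("ld [hl+],a", 2),
           ("ld a,b", 1), ("ld [hl+],a", 2)]
gb_cycles_per_byte = sum(cycles for _, cycles in gb_pair) / 2

# NES: one unrolled iteration copies one byte (CPU cycles).
# Assumption: the loop body is pla followed by sta $2007.
nes_byte = [("pla", 4), ("sta $2007", 4)]
nes_cycles_per_byte = sum(cycles for _, cycles in nes_byte)

def vblank_bytes(cycles_per_line, lines, oam_dma_cycles, cycles_per_byte):
    """Bytes that fit in vblank after paying for one OAM DMA."""
    return (cycles_per_line * lines - oam_dma_cycles) / cycles_per_byte

nes_bytes = vblank_bytes(113.667, 20, 545, nes_cycles_per_byte)
gb_bytes = vblank_bytes(114, 10, 168, gb_cycles_per_byte)
print(round(nes_bytes, 1), round(gb_bytes, 1))  # prints "216.0 216.0"
```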

Re: Is a Game Boy faster than an NES?

Posted: Sat Apr 21, 2018 7:16 pm
by tepples
In this post about ARM vs. 68000, lidnariq pointed out "Timeline of instructions per second" on Wikipedia. The article attempts to give values normalized to the performance of a 5 MHz VAX-11/780 on the Dhrystone benchmark.
6502: 0.430 MIPS at 1.000 MHz
Intel 8080: 0.290 MIPS at 2.000 MHz
Zilog Z80: 0.580 MIPS at 4.000 MHz
Thus both the Intel 8080 and the Zilog Z80 are said to have 0.145 instructions per T-state (0.290/2 and 0.580/4). Because the Game Boy processor is architecturally very similar to those two, I'd imagine it doesn't differ much.

6502 in NES: 0.770 MIPS at 1.790 MHz
Z80 in CV, MSX, and SMS: 0.519 MIPS at 3.580 MHz (2:1 clock)
SGB CPU: 0.623 MIPS at 4.295 MHz (2.4:1 clock)
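
A quick Python sketch of how those scaled figures fall out of the reference values. It assumes performance scales linearly with clock rate, and that the SGB CPU behaves like a Z80 per clock, which is the guess made above rather than a measured result.

```python
# (MIPS, MHz) pairs as quoted from the Wikipedia timeline
reference = {
    "6502": (0.430, 1.000),
    "8080": (0.290, 2.000),
    "Z80": (0.580, 4.000),
}

def mips_per_mhz(cpu):
    """Reference MIPS divided by reference clock."""
    mips, mhz = reference[cpu]
    return mips / mhz

def scaled_mips(cpu, mhz):
    """Scale the reference figure linearly to another clock rate."""
    return mips_per_mhz(cpu) * mhz

nes = scaled_mips("6502", 1.790)   # 6502 in NES
sms = scaled_mips("Z80", 3.580)    # Z80 in CV, MSX, and SMS
sgb = scaled_mips("Z80", 4.295)    # SGB CPU, assumed Z80-like per clock
print(f"{nes:.3f} {sms:.3f} {sgb:.3f}")
```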

Re: Is a Game Boy faster than an NES?

Posted: Wed Jul 18, 2018 10:45 am
by psycopathicteen
Oziphantom wrote:Just when you think you find something .. denied!

The Z80 is such a 95% cpu... it almost does things in a nice way, but it just lacks that 1 opcode....

I've been asking in Z80 circles and basically I just get silence. There is no good way. The best I've been given is "align to 256 boundaries" and Use Tables..
I've also been pointing out to others how to use their CPU faster ..sigh...

Seems there are no silver bullets nor horizons. But the Stack move trick Really does slay a 6502 ;) Just the main computer I'm doing it on only has a 2 MHz Z80 which makes it only slightly faster, on a 3.5 or 4 MHz Z80 it would slay...

I feel the same way about the SPC700. It's like Sony intended it to do nothing but change notes.

Re: Is a Game Boy faster than an NES?

Posted: Wed Jul 18, 2018 11:15 am
by adam_smasher
i mean it was designed primarily to change notes

Re: Is a Game Boy faster than an NES?

Posted: Fri Jan 04, 2019 7:00 pm
by ISSOtm
Here's a rather tame but interesting note to add in favor of the GB:
The OAM DMA can be executed mid-frame. This causes the PPU to read $FF bytes from OAM for the duration of the transfer, causing up to 2 scanlines of blank sprites(*), but the BG is still shown.
Prehistorik Man uses this to refresh OAM mid-frame and increase the number of on-screen sprites (did you know? The trunks of the palm trees are actually made from two sprites to circumvent OBJ priority issues). This parallels Super Mario Kart's OAM refresh for the same purpose.

The way this is relevant to this discussion is, you can essentially perform OAM DMA outside of VBlank without many drawbacks. In particular, side-scrollers with a status bar at the top could use this time to refresh OAM, since sprites aren't supposed to show up there anyway. Or, more generally, it could be assumed that 1) this will only occur once in a while, resulting in minor flicker, and 2) few objects will be located near the top of the screen, so the flicker is likely to go unnoticed. Also, if using GBC double-speed, the time taken is halved, which reportedly allows fitting the entire DMA into an HBlank period.

This allows my game to squeeze a few more bytes out of its popslide copy.


(*) If the OAM DMA is started during Mode 3, it might occur while the PPU is accessing a sprite's tile ID or attributes, which would cause it to glitch out. Starting OAM DMA during Mode 2 appears to be perfectly safe, but affects the initial OAM search, so sprites may only show up on scanlines 0-7 / 0-15 with specific timing.

Re: Is a Game Boy faster than an NES?

Posted: Thu Mar 28, 2019 8:08 am
by tepples
One place where Game Boy trounces NES is nibble swapping without needing 256 bytes of fixed bank space for a lookup table. SM83 repurposes the slot of an undocumented Z80 CB-prefixed instruction as swap, which swaps the nibbles of a register in 2 cycles, while 6502 takes 16 in either of two ways:

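Code:

; Option A (16 cycles, 1 byte of RAM)
  asl a
  rol a
  rol a
  rol a       ; A = old bits 3210, 0, old bits 765; C = old bit 4
  sta tmp1
  and #$0F    ; Keep old bits 765 in A210
  adc tmp1    ; Adding doubles A210 into A321; C fills A0

; Option B (-1 byte RAM, +2 bytes ROM)
  cmp #$80    ; Copy A7 into C
  rol a       ; Rotate left by 1 within 8 bits
  cmp #$80
  rol a
  cmp #$80
  rol a
  cmp #$80
  rol a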
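
Both options can be checked exhaustively with a small Python model of the instructions involved. This is only a sketch: it models the accumulator and carry flag and nothing else, and tmp1 from Option A becomes a local variable.

```python
def asl(a):
    """ASL A: shift left, old bit 7 goes to carry. Returns (a, carry)."""
    return (a << 1) & 0xFF, a >> 7

def rol(a, c):
    """ROL A: rotate left through carry. Returns (a, carry)."""
    return ((a << 1) | c) & 0xFF, a >> 7

def adc(a, operand, c):
    """ADC in binary mode. Returns (a, carry)."""
    total = a + operand + c
    return total & 0xFF, total >> 8

def swap_option_a(a):
    a, c = asl(a)
    a, c = rol(a, c)
    a, c = rol(a, c)
    a, c = rol(a, c)
    tmp1 = a                  # sta tmp1
    a &= 0x0F                 # and #$0F (carry is unchanged)
    a, c = adc(a, tmp1, c)    # adc tmp1
    return a

def swap_option_b(a):
    for _ in range(4):
        c = 1 if a >= 0x80 else 0   # cmp #$80 copies bit 7 into carry
        a, c = rol(a, c)
    return a

def swap_reference(a):
    """What the SM83 swap instruction computes."""
    return ((a << 4) | (a >> 4)) & 0xFF

mismatches = [a for a in range(256)
              if swap_option_a(a) != swap_reference(a)
              or swap_option_b(a) != swap_reference(a)]
print(mismatches)  # prints "[]"
```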

Re: Is a Game Boy faster than an NES?

Posted: Thu Mar 28, 2019 11:46 pm
by Rahsennor
David Galloway wrote:

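Code:

  ASL  A       ; C = old A7
  ADC  #$80    ; A0 = old A7; C = old A6 (carry out of bit 7)
  ROL  A       ; A is now rotated left by 2
  ASL  A
  ADC  #$80
  ROL  A       ; A is now rotated left by 4: nibbles swapped

12 cycles, 8 bytes, no RAM.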
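
A Python sketch that checks this sequence for every input byte. It models only the accumulator and carry, with ADC in binary mode (the NES CPU has no decimal mode). The function name is made up for illustration.

```python
def swap_galloway(a):
    """Model ASL / ADC #$80 / ROL twice; each pass rotates A left by 2."""
    for _ in range(2):
        a, c = (a << 1) & 0xFF, a >> 7          # ASL A
        total = a + 0x80 + c                    # ADC #$80
        a, c = total & 0xFF, total >> 8
        a, c = ((a << 1) | c) & 0xFF, a >> 7    # ROL A
    return a

# Cycle count: three 2-cycle instructions, executed twice
cycles = 2 * (2 + 2 + 2)

failures = [a for a in range(256)
            if swap_galloway(a) != ((a << 4) | (a >> 4)) & 0xFF]
print(cycles, failures)  # prints "12 []"
```

The ADC #$80 step works because adding $80 pushes the old bit 6 (now in bit 7) out into carry while the incoming carry drops the old bit 7 into bit 0. The flipped bit 7 left behind is then discarded by the ROL.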

Re: Is a Game Boy faster than an NES?

Posted: Fri Mar 29, 2019 5:03 am
by rainwarrior
That's really neat. Kinda synthesizing a barrel shift 2 bits at a time. Rad!

Re: Is a Game Boy faster than an NES?

Posted: Fri Mar 29, 2019 1:51 pm
by tokumaru
Neat trick!