Is a Game Boy faster than an NES?

Re: Is a Game Boy faster than an NES?

Post by Oziphantom »

I did the test: if the IRQ event was triggered while interrupts were disabled with DI, the Z80 jumps into the IRQ handler as soon as they are re-enabled with EI.

Post by tepples »

ISSOtm in the GBDev Discord server had an idea to split the actor struct into two, pointing HL at one half and DE at the other. This allows actor movement code to have two "last offset" values at the cost of register DE.

One way to split it up is to put things relevant to display (and possibly movement) in one struct and things relevant only to movement in the other. Things relevant to display include pixel/screen position, current frame, and current direction. Things relevant only to movement include subpixel position, velocity, health, time until next frame, etc.

In a platformer, a display struct might begin with
  • Y pixel position
  • X pixel position
  • X 256 pixels position
  • Actor type
  • Current frame
  • Facing direction
  • VRAM offset of actor's tiles

and the movement part of the struct might begin with
  • Y pixel velocity
  • Y subpixel velocity
  • Y subpixel position
  • X pixel velocity
  • X subpixel velocity
  • X subpixel position
  • Time to next frame
  • Health
  • Height of received damage

With this arrangement, a game can point DE at the display part and HL at the movement part, call "add velocity to position" twice, once for Y and once for X, and then update the X 256 pixels position based on the velocity sign and the carry from adding velocity.
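
As a rough illustration of the "add velocity to position" step and the carry-based update of the X 256 pixels position, here is a Python model of the arithmetic. The dictionary keys (`x_hi`, `vx_sub`, and so on) and both function names are made up for this sketch, and it assumes 8.8 fixed-point values with two's-complement velocities; it is not the actual Game Boy routine.

```python
def add_velocity(pixel, subpixel, vel_pixel, vel_subpixel):
    """Add an 8.8 fixed-point velocity to an 8.8 position.

    All arguments are bytes (0-255); negative velocities are two's complement.
    Returns (new_pixel, new_subpixel, carry_out).
    """
    low = subpixel + vel_subpixel            # subpixel bytes first
    high = pixel + vel_pixel + (low >> 8)    # then pixel bytes plus the carry
    return high & 0xFF, low & 0xFF, high >> 8


def move_actor(display, movement):
    """Apply one frame of velocity to a split actor (hypothetical field names)."""
    display["y"], movement["y_sub"], _ = add_velocity(
        display["y"], movement["y_sub"], movement["vy"], movement["vy_sub"])
    display["x"], movement["x_sub"], carry = add_velocity(
        display["x"], movement["x_sub"], movement["vx"], movement["vx_sub"])
    moving_left = bool(movement["vx"] & 0x80)  # sign bit of the X velocity
    if carry and not moving_left:
        display["x_hi"] = (display["x_hi"] + 1) & 0xFF  # wrapped past pixel 255
    elif moving_left and not carry:
        display["x_hi"] = (display["x_hi"] - 1) & 0xFF  # wrapped below pixel 0


display = {"y": 100, "x": 250, "x_hi": 0}
movement = {"vy": 0, "vy_sub": 0x40, "y_sub": 0,
            "vx": 8, "vx_sub": 0, "x_sub": 0x80}
move_actor(display, movement)  # x wraps from 250 to 2 and x_hi becomes 1
```

With a negative velocity the carry flag is set whenever the position did not underflow, which is why the 256-pixel byte is decremented only when moving left without a carry.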

Post by tepples »

The Game Boy analog of Popslide is this, which takes 5 bytes of program code and 9 machine cycles per pair of copied bytes:

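rept MAXCOPYLEN/2
  pop bc        ; 3 cycles: fetch the next two source bytes from the stack
  ld a,c        ; 1 cycle
  ld [hl+],a    ; 2 cycles: store the low byte and advance HL
  ld a,b        ; 1 cycle
  ld [hl+],a    ; 2 cycles: store the high byte and advance HL
endr

Or, as seen in a part of first-generation Pokémon nicknamed "vcopy.asm" by reverse engineers, with the same timing characteristics: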

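rept MAXCOPYLEN/2
  pop de        ; 3 cycles
  ld [hl],e     ; 2 cycles
  inc l         ; 1 cycle: 8-bit increment, so the destination must not cross a 256-byte page
  ld [hl],d     ; 2 cycles
  inc l         ; 1 cycle
endr

On Game Boy the copy is cheaper (4.5 cycles per byte instead of 8) thanks to a few things: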
  • The stack pointer is 16 bits wide instead of 8, allowing multiple buffers if needed.
  • The stack is full (points to top used element) rather than empty (points to first unused element). This means the cycle penalty for pre-modification is taken on push rather than on pop.
  • pop loads two bytes at once.
  • The pointer register HL means not having to repeat the VRAM address in program code for each copied byte.
Even so, the Game Boy's vblank lasts about half as many cycles, so the theoretical count of bytes that can be copied from the stack to VRAM during one vblank, minus one OAM DMA, is about 216 on both.

NES: (113.667*20 - 545)/8
GB: (114*10-168)/4.5
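
To make the arithmetic behind those two formulas explicit, here is a small Python sketch. The per-instruction machine-cycle counts are the standard SM83 timings; the variable and function names are invented for this illustration, and the 545 and 168 cycle OAM DMA allowances are simply taken from the formulas above.

```python
# Machine cycles for the SM83 instructions used in the two unrolled loops.
SM83_CYCLES = {
    "pop rr": 3,
    "ld r,r": 1,
    "ld [hl+],a": 2,
    "ld [hl],r": 2,
    "inc r": 1,
}

POPSLIDE_PAIR = ["pop rr", "ld r,r", "ld [hl+],a", "ld r,r", "ld [hl+],a"]
VCOPY_PAIR = ["pop rr", "ld [hl],r", "inc r", "ld [hl],r", "inc r"]


def cycles_per_byte(instructions, bytes_copied=2):
    """Total cycles of one unrolled iteration divided by the bytes it copies."""
    return sum(SM83_CYCLES[i] for i in instructions) / bytes_copied


def bytes_per_vblank(cycles_per_line, vblank_lines, oam_dma_cycles, per_byte):
    """Bytes that fit in vblank after reserving time for one OAM DMA."""
    return (cycles_per_line * vblank_lines - oam_dma_cycles) / per_byte


gb_cost = cycles_per_byte(POPSLIDE_PAIR)             # 9 cycles / 2 bytes
nes_budget = bytes_per_vblank(113.667, 20, 545, 8)   # 8 cycles per byte on NES
gb_budget = bytes_per_vblank(114, 10, 168, gb_cost)
print(f"NES: {nes_budget:.2f} bytes, GB: {gb_budget:.2f} bytes")
```

Both loops come out at 9 cycles per pair, so the vcopy variant gives the same budget.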

Post by tepples »

In this post about ARM vs. 68000, lidnariq pointed out "Timeline of instructions per second" on Wikipedia. The article attempts to give values normalized to the performance of a 5 MHz VAX-11/780 on the Dhrystone benchmark.
6502: 0.430 MIPS at 1.000 MHz
Intel 8080: 0.290 MIPS at 2.000 MHz
Zilog Z80: 0.580 MIPS at 4.000 MHz
Thus both the Intel 8080 and the Zilog Z80 are said to execute 0.145 instructions per T-state. Since the Game Boy processor is architecturally very similar to those two, I'd imagine it doesn't differ much.

Scaled to the clock rates of actual consoles, that gives:
6502 in NES: 0.770 MIPS at 1.790 MHz
Z80 in CV, MSX, and SMS: 0.519 MIPS at 3.580 MHz (2:1 clock)
SGB CPU: 0.623 MIPS at 4.295 MHz (2.4:1 clock)
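
The scaling is plain proportionality, sketched here in Python. The table and function names are invented for the illustration, and treating the SGB CPU as having the Z80's per-clock figure is the assumption made in the post, not a measured value.

```python
# (MIPS, clock in MHz) as quoted from the Wikipedia timeline.
REFERENCE = {
    "6502": (0.430, 1.000),
    "Intel 8080": (0.290, 2.000),
    "Zilog Z80": (0.580, 4.000),
}


def mips_per_mhz(cpu):
    """Instructions per clock cycle implied by the reference entry."""
    mips, mhz = REFERENCE[cpu]
    return mips / mhz


def scaled_mips(cpu, clock_mhz):
    """Linearly scale the reference figure to another clock rate."""
    return mips_per_mhz(cpu) * clock_mhz


for label, cpu, clock in [
    ("6502 in NES", "6502", 1.790),
    ("Z80 in CV, MSX, and SMS", "Zilog Z80", 3.580),
    ("SGB CPU (assumed Z80-like)", "Zilog Z80", 4.295),
]:
    print(f"{label}: {scaled_mips(cpu, clock):.3f} MIPS at {clock:.3f} MHz")
```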

Post by psycopathicteen »

Oziphantom wrote:
Just when you think you find something... denied!

The Z80 is such a 95% CPU... it almost does things in a nice way, but it just lacks that one opcode...

I've been asking in Z80 circles and basically I just get silence. There is no good way. The best I've been given is "align to 256-byte boundaries" and "use tables".
I've also been pointing out to others how to use their CPU faster... sigh...

Seems there are no silver bullets nor horizons. But the stack move trick really does slay a 6502 ;) It's just that the main computer I'm doing it on only has a 2 MHz Z80, which makes it only slightly faster; on a 3.5 or 4 MHz Z80 it would slay...

I feel the same way about the SPC700. It's like Sony intended it to do nothing but change notes.

Post by adam_smasher »

i mean it was designed primarily to change notes

Post by ISSOtm »

Here's a rather tame but interesting note to add in favor of the GB:
The OAM DMA can be executed mid-frame. This causes the PPU to read $FF bytes from OAM for the duration of the transfer, causing up to 2 scanlines of blank sprites(*), but the BG is still shown.
Prehistorik Man uses this to refresh OAM mid-frame and increase the number of on-screen sprites (did you know? The trunks of the palm trees are actually made from two sprites to circumvent OBJ priority issues). You can compare this to Super Mario Kart's OAM refresh for the same purpose.

This is relevant to the discussion because you can essentially perform OAM DMA outside of VBlank without many drawbacks. In particular, side-scrollers with a status bar at the top could use this time to refresh OAM, since sprites aren't supposed to show up there anyways. Or, generally, it could be assumed that 1) this will only occur once in a while, resulting in minor flicker, and 2) few objects will be located near the top of the screen, so the flicker has a good chance of going unnoticed. Also, if using GBC double-speed, the time taken is halved, which reportedly allows fitting the entire DMA into a HBlank period.

This allows my game to squeeze a few more bytes out of its popslide copy.


(*) If the OAM DMA is started during Mode 3, it might occur while the PPU is accessing a sprite's tile ID or attributes, which would cause it to glitch out. Starting OAM DMA during Mode 2 appears to be perfectly safe, but affects the initial OAM search, so sprites may only show up on scanlines 0-7 / 0-15 with specific timing.

Post by tepples »

One place where Game Boy trounces NES is nibble swapping without needing 256 bytes of fixed bank space for a lookup table. The SM83 repurposes an undocumented Z80 CB-prefixed opcode as swap, which swaps the nibbles of a byte in 2 cycles, while the 6502 takes 16 in either of two ways:

; Option A
asl a
rol a
rol a
rol a     ; A now holds bits 3210 0765 and C holds bit 4
sta tmp1
and #$0F  ; Keep only 0000 0765
adc tmp1  ; Shift A210 and C into A3210

; Option B (-1 byte RAM, +2 bytes ROM)
cmp #$80  ; Copy A7 into C
rol a
cmp #$80
rol a
cmp #$80
rol a
cmp #$80
rol a
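
As a sanity check, both sequences can be modeled in Python and compared against a plain nibble swap for every input byte. The helper names are invented for this sketch, and the model only tracks the accumulator and carry flag, which is all these instructions depend on.

```python
def asl(a):
    """ASL A: shift left, bit 7 goes to carry."""
    return (a << 1) & 0xFF, a >> 7


def rol(a, c):
    """ROL A: rotate left through carry."""
    return ((a << 1) | c) & 0xFF, a >> 7


def adc(a, m, c):
    """ADC in binary mode: add memory and carry."""
    total = a + m + c
    return total & 0xFF, total >> 8


def swap_option_a(a):
    a, c = asl(a)
    for _ in range(3):
        a, c = rol(a, c)
    tmp1 = a                  # sta tmp1 (flags unchanged)
    a &= 0x0F                 # and #$0F (carry unchanged)
    a, c = adc(a, tmp1, c)    # adc tmp1
    return a


def swap_option_b(a):
    for _ in range(4):
        c = 1 if a >= 0x80 else 0   # cmp #$80 copies bit 7 into carry
        a, c = rol(a, c)
    return a


def swap_nibbles(a):
    """Reference result: exchange the high and low nibbles."""
    return ((a << 4) | (a >> 4)) & 0xFF


mismatches = [a for a in range(256)
              if swap_option_a(a) != swap_nibbles(a)
              or swap_option_b(a) != swap_nibbles(a)]
print("mismatches:", mismatches)
```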

Post by Rahsennor »

David Galloway wrote:

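ASL  A      ; C = old bit 7, bit 0 = 0
ADC  #$80   ; bit 0 = old bit 7, C = old bit 6
ROL  A      ; A is now rotated left by 2
ASL  A      ; Repeat for a total rotate of 4
ADC  #$80
ROL  A

12 cycles, 8 bytes, no RAM.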
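
To see why this works, note that adding $80 plus the carry drops the old bit 7 into the cleared bit 0 while overflowing the old bit 6 out into the carry, so each ASL/ADC/ROL triple is an 8-bit rotate left by two. The following Python model checks that claim for all 256 inputs; the function names are invented for this sketch, and it assumes binary-mode ADC, which is all the NES's 6502 offers.

```python
def rotate_left_2(a):
    """Model ASL A / ADC #$80 / ROL A on an 8-bit accumulator."""
    c = a >> 7
    a = (a << 1) & 0xFF          # ASL A: C = old bit 7, bit 0 = 0
    total = a + 0x80 + c         # ADC #$80: bit 0 = old bit 7
    a, c = total & 0xFF, total >> 8   # carry out = old bit 6
    a = ((a << 1) | c) & 0xFF    # ROL A: old bit 6 enters bit 0
    return a


def swap_galloway(a):
    """Two rotate-by-two steps make a full nibble swap."""
    return rotate_left_2(rotate_left_2(a))


def rotl8(a, n):
    """Reference 8-bit rotate left by n bits."""
    return ((a << n) | (a >> (8 - n))) & 0xFF


all_ok = all(rotate_left_2(a) == rotl8(a, 2) and swap_galloway(a) == rotl8(a, 4)
             for a in range(256))
print("rotate trick verified:", all_ok)
```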

Post by rainwarrior »

That's really neat. Kinda synthesizing a barrel shift 2 bits at a time. Rad!

Post by tokumaru »

Neat trick!