All times are UTC - 7 hours





[ 26 posts ]

Posted: Tue Apr 03, 2018 8:45 am
Joined: Tue Feb 07, 2017 2:03 am
Posts: 711
I did the test: if an IRQ event is triggered while interrupts are disabled (DI), the Z80 will launch into the IRQ handler as soon as interrupts are re-enabled with EI.


Posted: Thu Apr 05, 2018 3:20 pm
Joined: Sun Sep 19, 2004 11:12 pm
Posts: 21322
Location: NE Indiana, USA (NTSC)
ISSOtm in the GBDev Discord server had an idea to split the actor struct into two, pointing HL at one half and DE at the other. This allows actor movement code to have two "last offset" values at the cost of register DE.

One way to split it up is to put things relevant to display (and possibly movement) in one struct and things relevant only to movement in the other. Things relevant to display include pixel/screen position, current frame, and current direction. Things relevant only to movement include subpixel position, velocity, health, time until next frame, etc.

In a platformer, a display struct might begin with
Y pixel position
X pixel position
X 256 pixels position
Actor type
Current frame
Facing direction
VRAM offset of actor's tiles

and the movement part of the struct might begin with
Y pixel velocity
Y subpixel velocity
Y subpixel position
X pixel velocity
X subpixel velocity
X subpixel position
Time to next frame
Health
Height of received damage

With this arrangement, a game can point DE at the display part and HL at the movement part, call "add velocity to position" twice, once for Y and once for X, and then update the X 256 pixels position based on the velocity sign and the carry from adding velocity.
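
A rough Python model of that "add velocity to position" routine may make the pointer walk easier to follow. It is only a sketch under some assumptions of mine: memory is a bytearray, `hl` and `de` are plain indexes standing in for the registers, velocity is signed 8.8 fixed point with the pixel byte first (matching the field order listed above), and the names `add_velocity` and `move_actor` are made up for illustration.

```python
def add_velocity(mem, hl, de):
    """Add one axis of velocity to position.

    hl -> pixel velocity, subpixel velocity, subpixel position (movement struct)
    de -> pixel position (display struct)
    Returns the advanced pointers, the carry out of the pixel add,
    and whether the velocity was negative.
    """
    vel_pixels = mem[hl]         # signed, two's complement
    vel_subpixels = mem[hl + 1]
    sub_sum = mem[hl + 2] + vel_subpixels
    mem[hl + 2] = sub_sum & 0xFF
    pix_sum = mem[de] + vel_pixels + (sub_sum >> 8)
    mem[de] = pix_sum & 0xFF
    return hl + 3, de + 1, pix_sum >> 8, vel_pixels >= 0x80


def move_actor(mem, display, movement):
    """Move one actor: Y first, then X, then fix up the X 256-pixel byte."""
    hl, de = movement, display
    hl, de, _, _ = add_velocity(mem, hl, de)              # Y axis
    hl, de, carry, negative = add_velocity(mem, hl, de)   # X axis
    # de now points at the "X 256 pixels position" byte.
    if not negative and carry:
        mem[de] = (mem[de] + 1) & 0xFF   # moved right across a 256-pixel boundary
    elif negative and not carry:
        mem[de] = (mem[de] - 1) & 0xFF   # moved left across a 256-pixel boundary
```

Because each struct is read strictly front to back, both pointers only ever advance, which is the point of keeping two "last offset" values.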

_________________
Pin Eight | Twitter | GitHub | Patreon


Posted: Tue Apr 17, 2018 7:41 pm
Joined: Sun Sep 19, 2004 11:12 pm
Posts: 21322
Location: NE Indiana, USA (NTSC)
The Game Boy analog of Popslide is this, which takes 5 bytes of program code and 9 machine cycles per pair of copied bytes:
Code:
rept MAXCOPYLEN/2
pop bc
ld a,c
ld [hl+],a
ld a,b
ld [hl+],a
endr


Or, as seen in a part of first-generation Pokémon nicknamed "vcopy.asm" by reverse engineers, with the same timing characteristics:
Code:
rept MAXCOPYLEN/2
pop de
ld [hl],e
inc l
ld [hl],d
inc l
endr
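
To double-check the "5 bytes, 9 machine cycles" figure for both loop bodies, here is a small Python tally. The per-instruction sizes and cycle counts are the usual SM83 values as far as I know; the `SM83_COST` table and `tally` helper are names invented for this sketch.

```python
# (size in bytes, machine cycles) for each instruction used in the loop bodies.
SM83_COST = {
    "pop bc": (1, 3),
    "pop de": (1, 3),
    "ld a,c": (1, 1),
    "ld a,b": (1, 1),
    "ld [hl+],a": (1, 2),
    "ld [hl],e": (1, 2),
    "ld [hl],d": (1, 2),
    "inc l": (1, 1),
}


def tally(body):
    """Return (total bytes, total machine cycles) for a list of instructions."""
    size = sum(SM83_COST[insn][0] for insn in body)
    cycles = sum(SM83_COST[insn][1] for insn in body)
    return size, cycles


POPSLIDE = ["pop bc", "ld a,c", "ld [hl+],a", "ld a,b", "ld [hl+],a"]
VCOPY = ["pop de", "ld [hl],e", "inc l", "ld [hl],d", "inc l"]

print(tally(POPSLIDE), tally(VCOPY))
```

One difference the tally does not show: `inc l` only increments the low byte of HL, so the second variant presumably relies on the destination not crossing a 256-byte boundary within a run.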


The technique is faster on Game Boy (4.5 cycles per byte instead of 8 on NES) thanks to a few things:

  • Stack pointer is 16 bits wide instead of 8, allowing multiple buffers if needed.
  • The stack is full (points to top used element) rather than empty (points to first unused element). This means the cycle penalty for pre-modification is taken on push rather than on pop.
  • pop loads two bytes.
  • Pointer register HL means not having to repeat the VRAM address in program code for each copied byte.

Thus the theoretical count of bytes that can be copied from the stack to VRAM during one vblank minus one OAM DMA is about 216 for both.

NES: (113.667*20 - 545)/8 ≈ 216.0
GB: (114*10-168)/4.5 = 216
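
The same arithmetic as a small Python sketch, in case anyone wants to plug in other numbers. The parameter names are my reading of the terms in the two formulas (cycles per scanline, scanlines of vblank, cycles spent on OAM DMA, cycles per copied byte); `bytes_per_vblank` is a made-up helper name.

```python
def bytes_per_vblank(cycles_per_line, vblank_lines, oam_dma_cycles, cycles_per_byte):
    """Bytes copyable in one vblank after subtracting one OAM DMA."""
    return (cycles_per_line * vblank_lines - oam_dma_cycles) / cycles_per_byte


nes = bytes_per_vblank(113.667, 20, 545, 8)
gb = bytes_per_vblank(114, 10, 168, 4.5)
print(f"NES: {nes:.1f} bytes, GB: {gb:.1f} bytes")
```

The Game Boy has half as many vblank scanlines to work with, and the faster copy loop roughly makes up the difference.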

Posted: Sat Apr 21, 2018 7:16 pm
Joined: Sun Sep 19, 2004 11:12 pm
Posts: 21322
Location: NE Indiana, USA (NTSC)
In this post about ARM vs. 68000, lidnariq pointed out "Timeline of instructions per second" on Wikipedia. The article attempts to give values normalized to the performance of a 5 MHz VAX-11/780 on the Dhrystone benchmark.

6502: 0.430 MIPS at 1.000 MHz
Intel 8080: 0.290 MIPS at 2.000 MHz
Zilog Z80: 0.580 MIPS at 4.000 MHz


Thus both the Intel 8080 and the Zilog Z80 are said to execute 0.145 instructions per T-state. Because the LR35902 is architecturally very similar to those two, I'd imagine it doesn't differ much. Scaling the per-cycle figures to the clock rates of actual consoles:

6502 in NES: 0.770 MIPS at 1.790 MHz
Z80 in CV, MSX, and SMS: 0.519 MIPS at 3.580 MHz (2:1 clock)
LR35902 in SGB: 0.623 MIPS at 4.295 MHz (2.4:1 clock)
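
For reference, the scaling behind those three console figures can be written out as a short Python sketch. It assumes, as the post does, that the LR35902 matches the Z80's per-cycle rate; `PER_MHZ` and `scaled_mips` are illustrative names.

```python
# Dhrystone-normalized MIPS per MHz, from the Wikipedia figures quoted above.
PER_MHZ = {
    "6502": 0.430 / 1.000,
    "Z80": 0.580 / 4.000,   # also used for the LR35902, as an assumption
}


def scaled_mips(family, clock_mhz):
    """Scale a family's per-MHz rate linearly to a given clock."""
    return PER_MHZ[family] * clock_mhz


SYSTEMS = [
    ("6502 in NES", "6502", 1.790),
    ("Z80 in CV, MSX, and SMS", "Z80", 3.580),
    ("LR35902 in SGB", "Z80", 4.295),
]

for name, family, mhz in SYSTEMS:
    print(f"{name}: {scaled_mips(family, mhz):.3f} MIPS at {mhz:.3f} MHz")
```

Linear scaling ignores wait states and memory contention, so these are rough upper-level comparisons rather than measurements.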

Posted: Wed Jul 18, 2018 10:45 am
Joined: Wed May 19, 2010 6:12 pm
Posts: 2841
Oziphantom wrote:
Just when you think you find something .. denied!

The Z80 is such a 95% cpu... it almost does things in a nice way, but it just lacks that 1 opcode....

I've been asking in Z80 circles and basically I just get silence. There is no good way. The best I've been given is "align to 256-byte boundaries" and "use tables".
I've also been pointing out to others how to use their CPU faster... sigh...

Seems there are no silver bullets nor horizons. But the stack move trick really does slay a 6502 ;) It's just that the main computer I'm doing it on only has a 2 MHz Z80, which makes it only slightly faster; on a 3.5 or 4 MHz Z80 it would slay...


I feel the same way about the SPC700. It's like Sony intended it to do nothing but change notes.


Posted: Wed Jul 18, 2018 11:15 am
Joined: Sun Mar 27, 2011 10:49 am
Posts: 270
Location: Seattle
I mean, it was designed primarily to change notes.


Posted: Fri Jan 04, 2019 7:00 pm
Joined: Fri Jan 04, 2019 5:31 pm
Posts: 32
Location: France, right of a pile of consoles
Here's a rather tame but interesting note to add in favor of the GB:
The OAM DMA can be executed mid-frame. This causes the PPU to read $FF bytes from OAM for the duration of the transfer, causing up to 2 scanlines of blank sprites(*), but the BG is still shown.
Prehistorik Man uses this to refresh OAM mid-frame and increase the number of on-screen sprites (did you know? The trunks of the palm trees are actually made from two sprites to circumvent OBJ priority issues). This parallels Super Mario Kart's OAM refresh, which is done for the same purpose.

This is relevant to the discussion because you can essentially perform OAM DMA outside of VBlank without many drawbacks. In particular, side-scrollers with a status bar at the top could use that time to refresh OAM, since sprites aren't supposed to show up there anyway. More generally, it could be assumed that 1) this will only occur once in a while, resulting in minor flicker, and 2) few objects will be located near the top of the screen, so the flicker is likely to go unnoticed. Also, in GBC double-speed mode, the time taken is halved, which reportedly allows fitting the entire DMA into one HBlank period.

This allows my game to squeeze a few more bytes out of its popslide copy.


(*) If the OAM DMA is started during Mode 3, it might occur while the PPU is accessing a sprite's tile ID or attributes, which would cause it to glitch out. Starting OAM DMA during Mode 2 appears to be perfectly safe, but affects the initial OAM search, so sprites may only show up on scanlines 0-7 / 0-15 with specific timing.
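
To put rough numbers on this, here is a Python sketch. The 114 cycles per scanline, the 168-cycle DMA cost and the 4.5 cycles per byte come from the popslide estimate earlier in the thread; the 160 machine cycles for the transfer itself is an assumption based on the commonly documented length, and both function names are made up.

```python
LINE_CYCLES = 114       # machine cycles per scanline at normal speed (from the thread)
VBLANK_LINES = 10       # scanlines of vblank (from the thread)
OAM_DMA_CYCLES = 160    # assumed length of the transfer itself, in CPU machine cycles
OAM_DMA_COST = 168      # DMA cost including overhead, as used earlier in the thread
CYCLES_PER_BYTE = 4.5   # popslide copy rate


def dma_scanlines(double_speed=False):
    """How many scanlines the transfer spans."""
    # In double-speed mode the CPU gets twice as many machine cycles per scanline.
    cycles_per_line = LINE_CYCLES * 2 if double_speed else LINE_CYCLES
    return OAM_DMA_CYCLES / cycles_per_line


def popslide_budget(dma_in_vblank):
    """Bytes copyable per vblank, with or without OAM DMA inside vblank."""
    cycles = LINE_CYCLES * VBLANK_LINES - (OAM_DMA_COST if dma_in_vblank else 0)
    return cycles / CYCLES_PER_BYTE


print(f"DMA spans {dma_scanlines():.2f} lines, {dma_scanlines(True):.2f} in double speed")
print(f"popslide: {popslide_budget(True):.0f} -> {popslide_budget(False):.0f} bytes")
```

Under these assumptions, moving the DMA out of vblank frees up a bit over 37 bytes of copy per frame.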

_________________
The French Lord of Laziness (and a huge Legend of Zelda fan)
https://github.com/ISSOtm
ASMu is laifu <3


Posted: Thu Mar 28, 2019 8:08 am
Joined: Sun Sep 19, 2004 11:12 pm
Posts: 21322
Location: NE Indiana, USA (NTSC)
One place where Game Boy trounces NES is nibble swapping without needing 256 bytes of fixed bank space for a lookup table. SM83 repurposes the opcode slot of an undocumented Z80 CB-prefix instruction as swap, which exchanges the nibbles of a register in 2 cycles, while 6502 takes 16 in either of two ways:
Code:
; Option A
asl a
rol a
rol a
rol a     ; A = old bits 3210, 0, 765; C = old bit 4
sta tmp1
and #$0F  ; Shift A210 and C into A3210
adc tmp1

; Option B (-1 byte RAM, +2 bytes ROM)
cmp #$80  ; Copy A7 into C
rol a
cmp #$80
rol a
cmp #$80
rol a
cmp #$80
rol a
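
Both options can be checked exhaustively with a tiny Python model of the instructions involved. This is a sketch only: it models just the accumulator and carry flag, assumes binary (non-decimal) mode, and the helper names are mine.

```python
def asl(a):
    """ASL A: shift left, bit 7 goes to carry."""
    return (a << 1) & 0xFF, a >> 7


def rol(a, c):
    """ROL A: rotate left through carry."""
    return ((a << 1) | c) & 0xFF, a >> 7


def adc(a, c, operand):
    """ADC in binary mode: add operand plus carry."""
    total = a + operand + c
    return total & 0xFF, total >> 8


def cmp_imm(a, operand):
    """CMP #imm: A is unchanged, carry is set when A >= operand."""
    return a, int(a >= operand)


def option_a(a):
    a, c = asl(a)
    for _ in range(3):
        a, c = rol(a, c)
    tmp1 = a
    a &= 0x0F                 # AND leaves the carry flag alone
    a, c = adc(a, c, tmp1)
    return a


def option_b(a):
    for _ in range(4):
        a, c = cmp_imm(a, 0x80)   # copy bit 7 into carry
        a, c = rol(a, c)          # so ROL acts as a true 8-bit rotate
    return a


def swap_nibbles(a):
    return ((a << 4) | (a >> 4)) & 0xFF


ok_a = all(option_a(x) == swap_nibbles(x) for x in range(256))
ok_b = all(option_b(x) == swap_nibbles(x) for x in range(256))
print(ok_a, ok_b)
```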

Posted: Thu Mar 28, 2019 11:46 pm
Joined: Thu Aug 20, 2015 3:09 am
Posts: 451
Code:
ASL  A
ADC  #$80
ROL  A
ASL  A
ADC  #$80
ROL  A

12 cycles, 8 bytes, no RAM.
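
Why this works, as far as I can tell: after ASL, bit 0 is clear and the old bit 7 is in carry, so ADC #$80 drops that carry into bit 0, flips bit 7, and leaves the pre-flip bit 7 in carry for ROL to pick up. Each ASL/ADC/ROL group is therefore a rotate left by two. Here is a Python sketch that checks it for all 256 inputs; it models only A and the carry flag, assumes binary mode, and uses helper names of my own.

```python
def asl(a):
    """ASL A: shift left, bit 7 goes to carry."""
    return (a << 1) & 0xFF, a >> 7


def rol(a, c):
    """ROL A: rotate left through carry."""
    return ((a << 1) | c) & 0xFF, a >> 7


def adc(a, c, operand):
    """ADC in binary mode: add operand plus carry."""
    total = a + operand + c
    return total & 0xFF, total >> 8


def rotate_left_2(a):
    """One ASL / ADC #$80 / ROL group."""
    a, c = asl(a)
    a, c = adc(a, c, 0x80)
    a, c = rol(a, c)
    return a


def swap_12_cycles(a):
    """The full six-instruction sequence: two rotate-by-two groups."""
    return rotate_left_2(rotate_left_2(a))


rot_ok = all(rotate_left_2(x) == ((x << 2) | (x >> 6)) & 0xFF for x in range(256))
swap_ok = all(swap_12_cycles(x) == ((x << 4) | (x >> 4)) & 0xFF for x in range(256))
print(rot_ok, swap_ok)
```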


Posted: Fri Mar 29, 2019 5:03 am
Joined: Sun Jan 22, 2012 12:03 pm
Posts: 7412
Location: Canada
That's really neat. Kinda synthesizing a barrel shift 2 bits at a time. Rad!


Posted: Fri Mar 29, 2019 1:51 pm
Joined: Sat Feb 12, 2005 9:43 pm
Posts: 11296
Location: Rio de Janeiro - Brazil
Neat trick!

