All times are UTC - 7 hours





Posted: Tue Apr 03, 2018 8:45 am

Joined: Tue Feb 07, 2017 2:03 am
Posts: 510
I did the test: if an IRQ event is triggered while interrupts are disabled with DI, the Z80 launches into the IRQ handler as soon as interrupts are re-enabled with EI.


Posted: Thu Apr 05, 2018 3:20 pm

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 20405
Location: NE Indiana, USA (NTSC)
ISSOtm in the GBDev Discord server had an idea to split the actor struct into two, pointing HL at one half and DE at the other. This allows actor movement code to have two "last offset" values at the cost of register DE.

One way to split it up is to put things relevant to display (and possibly also movement) in one struct and things relevant only to movement in the other. Things relevant to display include pixel/screen position, current frame, and current direction. Things relevant only to movement include subpixel position, velocity, health, time until next frame, etc.

In a platformer, a display struct might begin with
Y pixel position
X pixel position
X 256 pixels position
Actor type
Current frame
Facing direction
VRAM offset of actor's tiles

and the movement part of the struct might begin with
Y pixel velocity
Y subpixel velocity
Y subpixel position
X pixel velocity
X subpixel velocity
X subpixel position
Time to next frame
Health
Height of received damage

With this arrangement, a game can point DE at the display part and HL at the movement part, call "add velocity to position" twice, once for Y and once for X, and then update the X 256 pixels position based on the velocity sign and the carry from adding velocity.
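
As a rough illustration of that last step, here is a model in Python rather than SM83 assembly. The dictionary field names, the 8.8 fixed-point layout (a signed pixel byte plus a subpixel byte for velocity), and the helper names `add_velocity` and `move_actor` are assumptions made for this sketch, not part of ISSOtm's actual design.

```python
def add_velocity(sub, px, vel_sub, vel_px):
    """Add a two-byte velocity to a two-byte position, one byte at a time.

    All arguments are 0..255. Returns (new_sub, new_px, carry_out), where
    carry_out is the carry from the pixel-byte addition.
    """
    total = sub + vel_sub
    new_sub, carry = total & 0xFF, total >> 8
    total = px + vel_px + carry
    return new_sub, total & 0xFF, total >> 8


def move_actor(display, movement):
    """Apply velocity on both axes; 'display' and 'movement' stand in for
    the halves of the struct addressed through DE and HL."""
    # Y axis: no 256-pixel byte in this layout, so the carry is dropped.
    movement["y_sub"], display["y_px"], _ = add_velocity(
        movement["y_sub"], display["y_px"],
        movement["y_vel_sub"], movement["y_vel_px"])
    # X axis: keep the carry for the 256-pixel byte.
    movement["x_sub"], display["x_px"], carry = add_velocity(
        movement["x_sub"], display["x_px"],
        movement["x_vel_sub"], movement["x_vel_px"])
    # Sign-extend the velocity: a negative velocity adds $FF, else $00.
    sign_ext = 0xFF if movement["x_vel_px"] & 0x80 else 0x00
    display["x_page"] = (display["x_page"] + sign_ext + carry) & 0xFF


display = {"y_px": 0x40, "x_px": 0xFE, "x_page": 1}
movement = {"y_vel_px": 0, "y_vel_sub": 0, "y_sub": 0,
            "x_vel_px": 1, "x_vel_sub": 0x80, "x_sub": 0x80}
move_actor(display, movement)  # X moves from 510.5 to 512.0 pixels
print(display["x_page"], display["x_px"], movement["x_sub"])  # 2 0 0
```

Combining the sign extension with the carry means a leftward move that stays inside the same 256-pixel page leaves the page byte unchanged ($FF plus carry 1 wraps to zero), while one that crosses the boundary decrements it.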


Posted: Tue Apr 17, 2018 7:41 pm

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 20405
Location: NE Indiana, USA (NTSC)
The Game Boy analog of Popslide is this, which takes 5 bytes of program code and 9 machine cycles per pair of copied bytes:
Code:
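rept MAXCOPYLEN/2
  pop bc        ; 3 cycles: fetch two buffered bytes (C = first, B = second)
  ld a,c        ; 1 cycle
  ld [hl+],a    ; 2 cycles: write first byte, advance HL
  ld a,b        ; 1 cycle
  ld [hl+],a    ; 2 cycles: write second byte, advance HL
endr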


Or, as seen in a part of first-generation Pokémon nicknamed "vcopy.asm" by reverse engineers, with the same timing characteristics:
Code:
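rept MAXCOPYLEN/2
  pop de        ; 3 cycles: fetch two buffered bytes (E = first, D = second)
  ld [hl],e     ; 2 cycles: write first byte
  inc l         ; 1 cycle: bumps only the low byte, so the run must not cross a 256-byte boundary
  ld [hl],d     ; 2 cycles: write second byte
  inc l         ; 1 cycle
endr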
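
To double-check the "5 bytes, 9 machine cycles" figure for both loop bodies, here is a small tally in Python. The per-instruction sizes and machine-cycle counts are the usual SM83 values as I understand them. The table keys and the names `COST`, `POPSLIDE`, `VCOPY`, and `tally` are made up for this sketch.

```python
# (size in bytes, machine cycles) for each instruction form used above
COST = {
    "pop rr":     (1, 3),
    "ld a,r":     (1, 1),
    "ld [hl+],a": (1, 2),
    "ld [hl],r":  (1, 2),
    "inc r":      (1, 1),
}

POPSLIDE = ["pop rr", "ld a,r", "ld [hl+],a", "ld a,r", "ld [hl+],a"]
VCOPY = ["pop rr", "ld [hl],r", "inc r", "ld [hl],r", "inc r"]


def tally(body):
    """Return (bytes of code, machine cycles, cycles per copied byte)
    for one loop body that copies two bytes."""
    size = sum(COST[insn][0] for insn in body)
    cycles = sum(COST[insn][1] for insn in body)
    return size, cycles, cycles / 2


print(tally(POPSLIDE))  # (5, 9, 4.5)
print(tally(VCOPY))     # (5, 9, 4.5)
```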


The technique is faster on Game Boy (4.5 cycles per byte instead of 8 on the NES) thanks to a few things:

  • The stack pointer is 16 bits wide rather than 8, allowing multiple buffers if needed.
  • The stack is full (points to top used element) rather than empty (points to first unused element). This means the cycle penalty for pre-modification is taken on push rather than on pop.
  • pop loads two bytes.
  • Pointer register HL means not having to repeat the VRAM address in program code for each copied byte.

Thus the theoretical count of bytes that can be copied from the stack to VRAM during one vblank, after subtracting the time for one OAM DMA, is about 216 on both platforms.

NES: (113.667*20 - 545)/8
GB: (114*10-168)/4.5
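
Those two expressions can be evaluated directly. The sketch below is in Python. The parameter names reflect my reading of each term (CPU cycles per scanline, vblank scanlines, cycles spent on OAM DMA, cycles per copied byte) and are assumptions, as is the function name `copy_budget`.

```python
def copy_budget(cycles_per_line, vblank_lines, oam_dma_cycles, cycles_per_byte):
    """Bytes that fit in one vblank after setting aside time for one OAM DMA."""
    usable_cycles = cycles_per_line * vblank_lines - oam_dma_cycles
    return usable_cycles / cycles_per_byte


nes = copy_budget(113.667, 20, 545, 8)
gb = copy_budget(114, 10, 168, 4.5)
print(f"NES: {nes:.2f} bytes, GB: {gb:.2f} bytes")  # NES: 216.04 bytes, GB: 216.00 bytes
```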


Posted: Sat Apr 21, 2018 7:16 pm

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 20405
Location: NE Indiana, USA (NTSC)
In this post about ARM vs. 68000, lidnariq pointed out "Timeline of instructions per second" on Wikipedia. The article attempts to give values normalized to the performance of a 5 MHz VAX-11/780 on the Dhrystone benchmark.

6502: 0.430 MIPS at 1.000 MHz
Intel 8080: 0.290 MIPS at 2.000 MHz
Zilog Z80: 0.580 MIPS at 4.000 MHz


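Thus both the Intel 8080 and the Zilog Z80 are said to execute 0.145 instructions per T-state. Because the LR35902 is architecturally very similar to those two, I'd imagine it doesn't differ much. Scaling those figures linearly by clock rate gives rough estimates for the CPUs as clocked in consoles: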

6502 in NES: 0.770 MIPS at 1.790 MHz
Z80 in CV, MSX, and SMS: 0.519 MIPS at 3.580 MHz (2:1 clock)
LR35902 in SGB: 0.623 MIPS at 4.295 MHz (2.4:1 clock)
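
The console figures follow from linear scaling by clock rate, which the sketch below reproduces in Python. Treating the LR35902 as a Z80 for this purpose is the guess stated above, not a measurement. The names `REFERENCE`, `mips_per_mhz`, and `scaled_mips` are made up for the example.

```python
# (MIPS, clock in MHz) reference points quoted from the Wikipedia article
REFERENCE = {
    "6502": (0.430, 1.000),
    "8080": (0.290, 2.000),
    "Z80":  (0.580, 4.000),
}


def mips_per_mhz(cpu):
    """Instructions per clock cycle implied by the reference point."""
    mips, mhz = REFERENCE[cpu]
    return mips / mhz


def scaled_mips(cpu, mhz):
    """Estimate MIPS at another clock rate, assuming linear scaling."""
    return round(mips_per_mhz(cpu) * mhz, 3)


print(scaled_mips("6502", 1.790))  # 6502 in NES
print(scaled_mips("Z80", 3.580))   # Z80 in CV, MSX, and SMS
print(scaled_mips("Z80", 4.295))   # LR35902 in SGB, modeled as a Z80
```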


Posted: Wed Jul 18, 2018 10:45 am

Joined: Wed May 19, 2010 6:12 pm
Posts: 2728
Oziphantom wrote:
Just when you think you find something .. denied!

The Z80 is such a 95% cpu... it almost does things in a nice way, but it just lacks that 1 opcode....

I've been asking in Z80 circles and basically I just get silence. There is no good way. The best I've been given is "align to 256 boundaries" and Use Tables..
I've also been pointing out to others how to use their CPU faster ..sigh...

Seems there are no silver bullets nor horizons. But the Stack move trick Really does slay a 6502 ;) Just the main computer I'm doing it on only has a 2 MHz Z80 which makes it only slightly faster, on a 3.5 or 4 MHz Z80 it would slay...


I feel the same way about the SPC700. It's like Sony intended it to do nothing but change notes.


Posted: Wed Jul 18, 2018 11:15 am

Joined: Sun Mar 27, 2011 10:49 am
Posts: 259
Location: Seattle
i mean it was designed primarily to change notes

