All times are UTC - 7 hours





 Post subject: Is this info correct?
Posted: Wed Sep 21, 2016 7:52 am

Joined: Wed Feb 17, 2010 5:42 pm
Posts: 359
Location: Denine's Devil Mansion
I was looking at the good practices on the wiki when I spotted the "Use Jump tables with RTS instruction instead of JMP indirect instruction" entry, which says:
Quote:
Savings : 4 bytes, 1 cycle.

Now, isn't that wrong? 4 bytes is correct, but there is no saving in cycles.
In fact, the alternate piece of code is 1 cycle slower, not faster. If that's the case, this optimisation should be in the "Optimise code size at the expense of cycles" part of the page.

Sorry if this is in the wrong subforum.


Posted: Wed Sep 21, 2016 8:24 am

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19343
Location: NE Indiana, USA (NTSC)
To report problems with the text of a single article, I recommend editing the article's talk page. I've replied there.


Posted: Wed Sep 21, 2016 9:10 am

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3192
Location: Mountain View, CA, USA
Assuming all the variables are absolute addresses (i.e. not in ZP):
Code:
  ldx JumpEntry        ; aeXXXX   / 4 cycles
  lda PointerTableL,X  ; bdXXXX   / 4 cycles (5 if crosses page)
  sta Temp             ; 8dXXXX   / 4 cycles
  lda PointerTableH,X  ; bdXXXX   / 4 cycles (5 if crosses page)
  sta Temp+1           ; 8dXXXX   / 4 cycles
  jmp [Temp]           ; 6cXXXX   / 5 cycles
                       ; ===================
                       ; 18 bytes / 25 to 27 cycles

Code:
  ldx JumpEntry        ; aeXXXX   / 4 cycles
  lda PointerTableH,X  ; bdXXXX   / 4 cycles (5 if crosses page)
  pha                  ; 48       / 3 cycles
  lda PointerTableL,X  ; bdXXXX   / 4 cycles (5 if crosses page)
  pha                  ; 48       / 3 cycles
  rts                  ; 60       / 6 cycles
                       ; ===================
                       ; 12 bytes / 24 to 26 cycles

The situation changes if Temp is in ZP. The odds of JumpEntry or PointerTable{H,L} being in ZP are extremely low given the entire point of the routine/goal (this is all going to be in ROM :-) ).
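
For comparison, here is a sketch of the first routine under the assumption that Temp sits in ZP while JumpEntry and the tables stay at absolute addresses. Only the two sta instructions change (opcode 85, 2 bytes, 3 cycles each); the counts follow the standard 6502 opcode table:
Code:
  ldx JumpEntry        ; aeXXXX   / 4 cycles
  lda PointerTableL,X  ; bdXXXX   / 4 cycles (5 if crosses page)
  sta Temp             ; 85XX     / 3 cycles (Temp assumed in ZP)
  lda PointerTableH,X  ; bdXXXX   / 4 cycles (5 if crosses page)
  sta Temp+1           ; 85XX     / 3 cycles (Temp assumed in ZP)
  jmp [Temp]           ; 6cXXXX   / 5 cycles
                       ; ===================
                       ; 16 bytes / 23 to 25 cycles

Under that assumption the RTS version (12 bytes / 24 to 26 cycles) is 4 bytes smaller but 1 cycle slower, which matches the 4-byte figure quoted from the wiki.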


Posted: Wed Sep 21, 2016 9:40 am

Joined: Wed Feb 17, 2010 5:42 pm
Posts: 359
Location: Denine's Devil Mansion
@tepples
I'm sorry. Next time, I'll do it there.

@koitsu
Ahh, I see. I always have temps in ZP, and didn't think about having these in non-ZP.

Sorry for the commotion; I'll use the talk page next time.


Posted: Wed Sep 21, 2016 10:36 am

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5898
Location: Canada
!

Ha ha wow I feel like I've been lied to. All this time I've been using RTS for jump tables because I thought it was universally better, even though I hate building tables with the -1 (feels like obfuscation).

I'm surprised though; I thought I'd counted out the difference at some point, but I guess I really hadn't.
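
For context on the -1: RTS pulls an address from the stack and adds 1 before continuing, so each table entry has to hold the target address minus 1. A minimal sketch in ca65-style syntax (an assumption; other assemblers spell the low/high byte operators differently), with DoIdle and DoWalk as hypothetical handler labels:
Code:
PointerTableL:
  .byte <(DoIdle-1), <(DoWalk-1)   ; low bytes of (target - 1)
PointerTableH:
  .byte >(DoIdle-1), >(DoWalk-1)   ; high bytes of (target - 1)

A table meant for the JMP indirect version would store the plain addresses instead, e.g. <DoIdle and >DoIdle.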


Posted: Wed Sep 21, 2016 11:12 am

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19343
Location: NE Indiana, USA (NTSC)
You may have counted out the cycle difference, except you assumed the jump target variable was outside zero page.

Less likely, you were comparing the RTS trick against moving one of your arrays out of zero page to make room for the jump target variables for main, NMI, and/or IRQ.


Posted: Wed Sep 21, 2016 11:28 am

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5898
Location: Canada
No, it was none of those things.


Posted: Wed Sep 21, 2016 12:25 pm

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10164
Location: Rio de Janeiro - Brazil
I tend not to use the RTS calling trick by default, since I already noticed it's 1 cycle slower, but I do use it when there's something to gain elsewhere. I've used it for calling VRAM update routines, for example, so that each routine could directly call the next one as fast as possible, due to the addresses being set up beforehand (I don't use this method for VRAM updates anymore though). Another case where calling with RTS makes a lot of sense is when the address was obtained from the stack after a JSR, since it's already "-1 adjusted".
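
As a rough illustration of that last case: JSR pushes the address of its own last byte (the return address minus 1), high byte first, so a value pulled from the stack can be pushed back later and reached with RTS without any adjustment. A sketch with hypothetical names (Yield, Resume, ResumeL, ResumeH), assuming the code that calls Yield was itself entered with a JSR:
Code:
Yield:                 ; entered with jsr Yield
  pla                  ; low byte of (return address - 1)
  sta ResumeL
  pla                  ; high byte of (return address - 1)
  sta ResumeH
  rts                  ; own return address is gone, so this returns to the caller's caller

Resume:                ; entered with jsr Resume
  lda ResumeH
  pha                  ; high byte first, the same order JSR uses
  lda ResumeL
  pha
  rts                  ; pulls the saved address, adds 1, continues right after the jsr Yield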

