All times are UTC - 7 hours





Posted: Wed Oct 18, 2017 9:16 pm
Joined: Sun Apr 13, 2008 11:12 am
Posts: 6535
Location: Seattle
VRC1, Bisqwit's MMC4 subset (doesn't use tile switching), and NINA-001 seem to all be reasonable ways to get 4+4 CHR banking with PRG banking, for less overhead than MMC1.

Other than the complication of possibly needing to handle interrupting MMC1 writes, I'm not clear the extra 4×(4+2)=24 cycles per bankswitch is particularly worth caring about, though.


Posted: Wed Oct 18, 2017 9:42 pm
Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10166
Location: Rio de Janeiro - Brazil
lidnariq wrote:
I'm not clear the extra 4×(4+2)=24 cycles per bankswitch is particularly worth caring about, though.

Depends on how often you switch banks. If you do it 20 times or more per frame, it really starts to add up! 20 × 24 = 480 cycles, which is about 1.8% of the CPU time in the 240 scanlines of picture. And it should actually be a little slower than that, because if you switch banks in the NMI for playing music, for example, the bankswitching routine used in the main thread has to do some additional steps to verify whether it was interrupted.
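For anyone who wants to plug in their own numbers, here is a small sketch of that arithmetic in Python. It assumes NTSC timing (341 PPU dots per scanline, 3 dots per CPU cycle) and takes the 24-cycle figure from the quote above. The function name and the count of 20 switches are only illustrative.

```python
from fractions import Fraction

# NTSC assumption: 341 PPU dots per scanline, 3 PPU dots per CPU cycle.
CPU_CYCLES_PER_SCANLINE = Fraction(341, 3)
VISIBLE_SCANLINES = 240

# Extra cost quoted above for MMC1's serial writes: 4 x (4 + 2) cycles.
EXTRA_CYCLES_PER_SWITCH = 4 * (4 + 2)


def bankswitch_overhead(switches_per_frame):
    """Return (extra cycles, percent of the visible-picture CPU time)."""
    extra = switches_per_frame * EXTRA_CYCLES_PER_SWITCH
    visible = CPU_CYCLES_PER_SCANLINE * VISIBLE_SCANLINES
    return extra, float(extra / visible * 100)


extra, percent = bankswitch_overhead(20)
print(extra, round(percent, 2))
```

The sketch leaves out the extra interrupt-detection steps mentioned above, so a real routine would come out somewhat higher.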


Posted: Wed Oct 25, 2017 2:18 pm
Joined: Thu Apr 23, 2009 11:21 pm
Posts: 807
Location: cypress, texas
unregistered, on top of page 92 wrote:
edit2 20171006: thought it would be good to say that the code I mentioned "posted above" was reduced to
Code:
.rept 256
  .db <$
.endr
because each of the first 15 banks in our game always starts at $8000
Just wanted to say that this code was reduced by one more character:
Code:
.rept 256
  .dl $
.endr
That works with the asm6 assembler. :) edit: It fills a page of ROM with the low byte of each address, so the byte at $8000+n holds n. This implements tokumaru's idea without using any assembly instructions to build the table: ldy $8000, x acts as txy and ldx $8000, y acts as tyx. Each takes one more byte than txa tay or tya tax, but tokumaru says his idea doesn't use the accumulator, which may save space when the accumulator would otherwise have to be saved and restored. After rewriting the sections of code where this might be used, txa tay turned out to be better for me, so I haven't been able to use this yet. :mrgreen: :D
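To see why the table works as a register transfer, here is a rough model in Python rather than assembly. It assumes the table starts on a page boundary at $8000, as described above. The names BASE, table, and ldy_abs_x are made up for the illustration.

```python
BASE = 0x8000  # assumed table address; it must sit on a page boundary

# What 256 repetitions of ".dl $" emit: the low byte of each address.
table = [(BASE + offset) & 0xFF for offset in range(256)]


def ldy_abs_x(x):
    """Model ldy BASE,x: Y receives the table byte at BASE + X."""
    return table[x]


# Every X value comes back unchanged, so the load behaves like a "txy".
print(all(ldy_abs_x(x) == x for x in range(256)))
```

If the table did not start on a page boundary, entry n would hold a value other than n and the trick would stop working.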


Posted: Thu Oct 26, 2017 10:33 am
Joined: Thu Apr 23, 2009 11:21 pm
Posts: 807
Location: cypress, texas
tokumaru, on page 43, wrote:
Setting the scroll should ALWAYS be the very last thing in your VBlank handler.
This is not true for our game. :) The end of my VBlank handler looked something like this:
Code:
jsr update_vram

;modify the flag
inc FrameReady

SkipUpdates:
jsr FamiToneUpdate
;"Setting the scroll should ALWAYS be the very last thing in your VBlank handler." -tokumaru pg 43
lda needscroll
beq +end
lda FORWARD_scroll
 beq +
jsr scroll_screen
jmp +end
+ jsr scroll_screen_left

+end ;return from the NMI (vblank)
pla
tax
pla
tay
pla
rti
Now, after being blessed with fixing a lot of the problems we found, I turned on the music and the screen started being lowered two rows (16 pixels) whenever draw_our_column was called. That reminded me of tepples' guidance that t gets clobbered every time $2006 and $2007 are written to, so I should be sure to write $2000 and $2005 afterwards to fix t (his post is linked at the top of page 91). After checking, it was clear that $2000 and $2005 were being written, inside one of the scroll_screen routines, on every frame in which $2006 and $2007 were written somewhere inside update_vram, so it seemed that jsr FamiToneUpdate was taking too long. So, after finding this earlier advice from tokumaru, also on page 43, I tried moving jsr FamiToneUpdate to a spot right after +end, because it doesn't include PPU operations. The game works, and the screen no longer drops 16 pixels after draw_our_column is called inside update_vram while the music is playing!! :mrgreen: :D

edit: tokumaru, I understand now that there was a hidden reason behind your words that are quoted above... posted this to help others who read your quoted statement. :) I really respect you tokumaru; thank you so much for all of your fantastic help and ideas! :D

edit3: changed a "to" to "too" and changed text referring to page 92 to refer to page 91. Two mistakes corrected :)


Last edited by unregistered on Fri Oct 27, 2017 8:42 am, edited 2 times in total.

Posted: Thu Oct 26, 2017 2:05 pm
Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10166
Location: Rio de Janeiro - Brazil
Correcting myself: setting the scroll should be the last PPU-related thing in your vblank handler. And you must keep track of how much time your vblank handler takes, because the scroll has to be set before the vertical blank ends. After setting the scroll, you're free to do things that do not affect rendering, such as playing music, reading the controllers, and whatever else you need to do at 60/50Hz.

If you know what you're doing, you can do a number of PPU operations outside of vblank, but if you're not sure, it's better to ask.
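As a rough way to reason about that time budget, here is a sketch in Python. It assumes NTSC, where vblank lasts about 20 scanlines (roughly 2273 CPU cycles), and it ignores NMI entry overhead. The cycle costs below are invented for illustration and are not measurements of any real routine.

```python
VBLANK_CYCLES = 20 * 341 // 3  # about 2273 CPU cycles on NTSC (assumption)


def scroll_set_in_time(tasks):
    """tasks: ordered (name, cycles, touches_ppu) tuples for the NMI handler.

    Returns True if the last PPU-related task finishes before vblank ends.
    """
    elapsed = 0
    last_ppu_done = 0
    for _name, cycles, touches_ppu in tasks:
        elapsed += cycles
        if touches_ppu:
            last_ppu_done = elapsed
    return last_ppu_done <= VBLANK_CYCLES


# Made-up cycle costs, for illustration only.
update_vram = ("update_vram", 1500, True)
music = ("FamiToneUpdate", 1000, False)
scroll = ("scroll_screen", 60, True)

print(scroll_set_in_time([update_vram, music, scroll]))  # music first
print(scroll_set_in_time([update_vram, scroll, music]))  # scroll first
```

With these invented costs, the first ordering sets the scroll too late and the second fits, which matches the symptom and the fix described in the previous post.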


Posted: Fri Nov 17, 2017 2:44 pm
Joined: Thu Apr 23, 2009 11:21 pm
Posts: 807
Location: cypress, texas
^Thanks tokumaru!! :D

tokumaru, your page of address low bytes has helped me save two cycles and a zero-page variable! :mrgreen: :D

in some code like this:
Code:
  ;sty ejectLRvalue
  ;code affected by y here
  lda altoX
  ldx FORWARD
  beq +zero
    clc
    adc $8000, y;will add whatever value is in y :) ;was adc ejectLRvalue
    ;rest of addition to 16bit value altoX here
    jmp DrawSprite
  +zero
    sec
    sbc $8000, y ;was sbc ejectLRvalue
    ;rest of subtraction from 16bit value altoX here
DrawSprite:

adc $8000, y takes 4 cycles and is 3 bytes
adc ejectLRvalue took 3 cycles and was 2 bytes
but being able to comment out the sty ejectLRvalue saved 3 cycles and 2 bytes! :) So the code is exactly the same size after the changes (the adc and the sbc each grew by one byte, and the two bytes of sty are gone) and it is 2 cycles faster, because it will never run both sides of the branch. :mrgreen: :D
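The bookkeeping can be checked with a few lines of Python. The cycle and byte counts are the standard 6502 figures for zero-page and absolute,Y addressing, assuming no page crossing on the indexed read (which holds here because the table starts at $8000). The variable names are only for illustration.

```python
# (cycles, bytes) for the 6502 instructions involved.
STY_ZP = (3, 2)      # sty ejectLRvalue
ADC_ZP = (3, 2)      # adc ejectLRvalue (sbc zero page costs the same)
ADC_ABS_Y = (4, 3)   # adc $8000,y with no page crossing (sbc is the same)

# Only one side of the branch runs, so cycles count one adc/sbc,
# while bytes count both the adc and the sbc in ROM.
old_cycles = STY_ZP[0] + ADC_ZP[0]
new_cycles = ADC_ABS_Y[0]
old_bytes = STY_ZP[1] + 2 * ADC_ZP[1]
new_bytes = 2 * ADC_ABS_Y[1]

print(old_cycles - new_cycles, old_bytes - new_bytes)
```

The first number printed is the cycles saved per pass and the second is the change in code size.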


Posted: Fri Dec 08, 2017 11:16 am
Joined: Thu Apr 23, 2009 11:21 pm
Posts: 807
Location: cypress, texas
Code:
0C1FC A6 60             ldx FORWARD
0C1FE 4C 01 C2          jmp +
0C201              .pad $c201
0C201 F0 0F           + beq +zero
You know how branches start with the following address and then add the second byte to reach the address branched to? Like, the beq +zero will branch to 0C212, so it takes the next address, 0C203, adds 0F, and reaches 0C212. Does jmp act the same way? No, I think, because why would the 6502 waste time thinking about the next address? Just set the PC to the address in bytes three and two. If so, then our game would be frustrated, haha :) , because it wouldn't really jump anywhere.

But, the jmp, being 3 cycles, is faster than two nops, 4 cycles. For me, it is important that beq +zero not branch to a different page. And, if bytes are ever removed from before 0C1FC, the code would still work fine. :)

edit.


Posted: Fri Dec 08, 2017 6:35 pm
Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10166
Location: Rio de Janeiro - Brazil
unregistered wrote:
You know how branches start with the following address and then add the second byte to reach the address branched to?

That's called relative addressing. Since most conditional jumps don't need to go very far, the 6502 was designed to make these jumps smaller and faster, by encoding the target address as an 8-bit displacement rather than a 16-bit absolute address. The downside of that is that when you do have to conditionally jump to an address that's far away, you have to branch (testing for the opposite condition) over a JMP instruction.

Quote:
Does jmp act the same way? No, I think, because why would the 6502 waste time thinking about the next address? Just set the PC to the address in bytes three and two.

JMP uses absolute addressing, so the actual destination address is the instruction's operand.
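The two addressing modes can be compared with a short Python model, using the bytes from the listing in the previous post. The function names are hypothetical. The arithmetic follows the 6502 rules: a signed 8-bit offset is added to the address after the branch, and JMP takes a little-endian 16-bit operand.

```python
def branch_target(address, offset_byte):
    """Relative addressing: next instruction's address plus a signed offset."""
    offset = offset_byte - 0x100 if offset_byte & 0x80 else offset_byte
    return (address + 2 + offset) & 0xFFFF


def jmp_target(low_byte, high_byte):
    """Absolute addressing: the operand is the target, low byte first."""
    return (high_byte << 8) | low_byte


print(hex(branch_target(0xC201, 0x0F)))  # beq at $C201 with operand $0F
print(hex(jmp_target(0x01, 0xC2)))       # jmp encoded as 4C 01 C2
```

The branch lands on $C212, and the jmp lands on $C201 no matter where the jmp itself is located.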

Quote:
For me, it is important that beq +zero not branch to a different page.

People normally use macros to ensure that no unwanted page-crossing happens... In ca65 it's trivial to write a macro that checks whether a branch instruction's address is in the same page as its destination address, and throws an error/warning if it isn't.

I personally tend to put code with this kind of sensitive timing near the start of the bank, where it's easier to predict where things will end up, and there are fewer things changing and shifting addresses.
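The check such a macro performs can be sketched in Python. This is not ca65 syntax, only the logic. It compares against the address following the 2-byte branch, since that is the value the 6502 adds the offset to. A macro could just as well be stricter and compare the branch's own address. The function names are made up for the illustration.

```python
def branch_crosses_page(branch_address, target_address):
    """True if a taken branch would pay the extra page-crossing cycle.

    The comparison uses the address of the instruction after the 2-byte
    branch, which is what the 6502 adds the offset to.
    """
    next_instruction = (branch_address + 2) & 0xFFFF
    return (next_instruction >> 8) != (target_address >> 8)


def check_branch(branch_address, target_address):
    """Raise an error, as an assembler macro might, on a page crossing."""
    if branch_crosses_page(branch_address, target_address):
        raise ValueError(
            f"branch at ${branch_address:04X} crosses a page "
            f"to reach ${target_address:04X}"
        )


check_branch(0xC201, 0xC212)  # the beq from the listing: same page, passes
print(branch_crosses_page(0xC2F0, 0xC305))
```

Running a check like this on every timing-sensitive branch at build time catches a crossing as soon as earlier code grows or shrinks.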


Posted: Mon Dec 11, 2017 2:11 pm
Joined: Thu Apr 23, 2009 11:21 pm
Posts: 807
Location: cypress, texas
Relative and Absolute addressing... thank you tokumaru, I forgot about those terms. :oops: A macro that gives an error if the branch crosses a page? That's interesting; thanks for introducing me to that! :) The assembler always gives an error if the code before a .pad overflows the pad address... so that has been working well for me. :)

Putting branches at the start of pages is a great idea, that's what I try to do too; this branch is in the middle of my main loop, so it just needs to be moved to the next page. My knowledge of code that needs to have perfect timing is pretty vague. It really helps sometimes when branches are moved or changed to prevent page-crossings, so all of my branches are imprisoned within their respective pages. After doing that for a while it's not hard, or painful, and is just part of the journey. :)

