All times are UTC - 7 hours



Posted: Wed Apr 12, 2017 11:46 pm
Joined: Tue Feb 07, 2017 2:03 am
Posts: 262
psycopathicteen wrote:
Nicole wrote:
If you're smart about optimizing the stuff that really has to be, it can certainly save development time.


That's what I've always figured, it just bugs me when people think you can't write code that is both optimized and maintainable under time constraints. I've seen programmers who thought this:

Code:
sep #$20
ror $01
ror $00
ror $01
ror $00
rep #$20
lda $00


was more maintainable than this:

Code:
rep #$20
lda $00
ror
ror
sta $00


simply because optimizations are "risky".

I feel your pain. That is not an optimisation, that is just how you do it.


Posted: Thu Apr 13, 2017 8:15 am
Joined: Wed May 19, 2010 6:12 pm
Posts: 2431
Another thing that annoys me is when someone dismisses an optimization because "it will only cost 2% of a frame" but they do the same thing 50 times a frame, resulting in slowdown.


Posted: Thu Apr 13, 2017 10:25 am
Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5899
Location: Canada
psycopathicteen wrote:
That's what I've always figured, it just bugs me when people think you can't write code that is both optimized and maintainable under time constraints. I've seen programmers who thought this:
{contrived example}
was more maintainable than this:
{contrived example}
simply because optimizations are "risky".

This is a big fat straw-man. I can believe you found the former example in your disassembly of some shipped code. I don't believe you have any insight as to why the programmer wrote it that way.

There are certainly reasons to write slower code in service of maintainability, but this example isn't one of them. You need context to make such a justification. Five lines of assembly code is not a context; a hundred thousand line program that needs to ship by Tuesday might be.

You can't just act like someone deliberately considered those two alternative pieces of code and chose the former. There are a million ways code gets edited, mangled, etc. during production, where everything is constantly changing. I think it's preposterous that you propose this was the result of an argument for maintainability.

psycopathicteen wrote:
Another thing that annoys me is when someone dismisses an optimization because "it will only cost 2% of a frame" but they do the same thing 50 times a frame, resulting in slowdown.

Except the example you gave is 0.02% of a frame, not 2%, and you'd have to do the same thing 5000 times, not 50, and that's a real difference.
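For scale, here is a rough tally of the two snippets being compared. Treat it as a sketch under stated assumptions: the cycle counts are nominal 65816 figures with the direct page register page-aligned, every CPU cycle is counted as 6 master clocks, and the frame length (262 scanlines of 1364 master clocks, NTSC) ignores DRAM refresh and slower memory regions. The list and constant names are made up for illustration.

```python
# Nominal cycle costs per instruction (assumes D low byte = $00).
EIGHT_BIT_VERSION = [
    ("sep #$20", 3),
    ("ror $01", 5), ("ror $00", 5),   # 8-bit read-modify-write on direct page
    ("ror $01", 5), ("ror $00", 5),
    ("rep #$20", 3),
    ("lda $00", 4),                   # 16-bit load from direct page
]
SIXTEEN_BIT_VERSION = [
    ("rep #$20", 3),
    ("lda $00", 4),
    ("ror", 2), ("ror", 2),           # accumulator rotates
    ("sta $00", 4),
]

# Assumption: NTSC frame, every CPU cycle costs 6 master clocks.
MASTER_CLOCKS_PER_FRAME = 262 * 1364
CPU_CYCLES_PER_FRAME = MASTER_CLOCKS_PER_FRAME / 6


def total_cycles(listing):
    """Sum the nominal cycle cost of every instruction in a listing."""
    return sum(cycles for _, cycles in listing)


saved = total_cycles(EIGHT_BIT_VERSION) - total_cycles(SIXTEEN_BIT_VERSION)
share_percent = 100 * saved / CPU_CYCLES_PER_FRAME
repeats_to_fill_frame = CPU_CYCLES_PER_FRAME / saved

print(f"cycles saved per use: {saved}")
print(f"share of one frame:   {share_percent:.3f}%")
print(f"uses to fill a frame: {repeats_to_fill_frame:.0f}")
```

Under those assumptions the difference is 15 cycles, or roughly 0.025% of a frame, which is the same order of magnitude as the 0.02% figure above.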

You're not representing the argument fairly here. I don't dismiss something because "it will only cost 2% of a frame", I was suggesting that optimization should be approached by profiling and working from the top down (example prior discussion), and that finding and fixing a thousand tiny pin pricks might not make the change you've hoped for.

What I really object to is you calling programmers or other labourers stupid or lazy or bad at their job for having written some inefficient code in one place or another. They succeeded at making a game that you liked so much that you're disassembling it 25 years later with tools that are like a microscope compared to their magnifying glass.

There is such a thing as doing a bad job, but making examples of extreme minutiae out of context isn't a good argument for it.


Posted: Thu Apr 13, 2017 1:30 pm
Joined: Wed May 19, 2010 6:12 pm
Posts: 2431
It's not like I'm saying "why don't they unroll every loop in the game?". If I was trying to fix a thousand nitpicks I would never have gotten this far with my homebrew.


Posted: Thu Apr 13, 2017 2:04 pm
Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10166
Location: Rio de Janeiro - Brazil
But you're making assumptions about how the code came to be the way it is. You can't say that the cause for a specific instance of weird/slow code is ignorance, laziness or whatever, because you weren't there. We're seeing this code from a completely different point of view than the people who wrote it.

I for example am not proud of every single line of code I've written when coding professionally, because getting things done on time was more important than writing the best possible code for every little aspect of a program. The exact same happens with commercial games: there's always someone on your back expecting results, so there's hardly any time to look back on stuff that's already working to make improvements.

Now, you appear to be dead set on creating the most optimal SNES game ever, and from the forum topics I remember reading, you've rewritten your sprite system a few hundred times already. Has that gotten you any closer to actually finishing a game? Things are different when coding is a hobby. One could even argue that your hobby is actually optimizing code, not making games. The same thing happens to me: I've rewritten the same systems so many times in search of the best possible solutions it isn't even funny, and while that has been a good exercise in its own right, because it's a hobby of mine, it hasn't gotten me any closer to shipping a finished product. This is a completely different context from the one in which the games you're debugging were created, and it's unfair of you to make some of the comparisons you do.

By all means, debug the hell out of them and point out all the weirdness and slowness you can find, that actually helps other coders make better choices in their own projects, but try not to make assumptions about why other programmers did what they did, it just makes you sound pretentious.


Posted: Thu Apr 13, 2017 2:06 pm
Joined: Mon Jul 14, 2008 4:02 pm
Posts: 85
Unrelated, but I rejoice a little every time I see posts by rainwarrior.
Not only because of his avatar picture, but also because they contain
so much truth, sharp analysis and wisdom without ever being
insulting, arrogant or condescending.
My hat's off to you!


Posted: Thu Apr 13, 2017 2:18 pm
Joined: Wed May 19, 2010 6:12 pm
Posts: 2431
My code example didn't really get my point across, but if I waste 60 cycles doing something, and that routine gets called 100 times a frame, that's 6,000 cycles, or roughly 10% of the CPU time, and the 60 cycles wasted in the routine might include the above code.
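The arithmetic can be checked with a few lines. This is a sketch, not a measurement: it assumes an NTSC frame of 262 scanlines at 1364 master clocks each, ignores DRAM refresh, and treats every CPU cycle as costing either 6 master clocks (fast) or 8 (slow). The function name `frame_share` is hypothetical.

```python
# Assumption: NTSC frame length in master clocks, DRAM refresh ignored.
MASTER_CLOCKS_PER_FRAME = 262 * 1364


def frame_share(wasted_cycles, calls_per_frame, master_clocks_per_cycle):
    """Percentage of one frame spent on cycles that could have been avoided."""
    cycles_per_frame = MASTER_CLOCKS_PER_FRAME / master_clocks_per_cycle
    return 100 * wasted_cycles * calls_per_frame / cycles_per_frame


fast = frame_share(60, 100, 6)   # every cycle at 6 master clocks
slow = frame_share(60, 100, 8)   # every cycle at 8 master clocks

print(f"all-fast cycles: {fast:.1f}% of a frame")
print(f"all-slow cycles: {slow:.1f}% of a frame")
```

With all-fast cycles this comes to about 10% of a frame; with all-slow cycles it is closer to 13%, so the real figure depends on where the code and data live.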


Quote:
you've rewritten your sprite system a few hundred times already. Has that gotten you any closer to actually finishing a game?


Well, actually it did, because I have fewer limitations to work with than I did back in 2010. I no longer have to manually squeeze sprites into VRAM.


Posted: Fri Apr 14, 2017 9:44 am
Joined: Mon Sep 15, 2014 4:35 pm
Posts: 3162
Location: Nacogdoches, Texas
tokumaru wrote:
you've rewritten your sprite system a few hundred times already

Wow. That's even more than I have! :lol: (I'm actually in the process of rebuilding it again; I had broken routines into a million specialized subroutines that were faster, but I just couldn't keep up with it all, and then I started having to chain beq/bne to bra a bunch, which often cancelled out whatever speed increase there was. I've also redone my metasprite routine to accept metasprite data outside of bank 0. I had direct page acting as an index register, as at that point I didn't know direct page accesses are a cycle slower if it isn't set to a multiple of 256.)

psycopathicteen wrote:
I no longer have to manually squeeze sprites into VRAM.

It's because of me, right? :wink:


Posted: Fri Apr 14, 2017 10:40 am
Joined: Wed May 19, 2010 6:12 pm
Posts: 2431
Espozo wrote:
I'm actually in the process of rebuilding it again; I had broken routines into a million specialized subroutines that were faster, but I just couldn't keep up with it all, and then I started having to chain beq/bne to bra a bunch, which often cancelled out whatever speed increase there was. I've also redone my metasprite routine to accept metasprite data outside of bank 0. I had direct page acting as an index register, as at that point I didn't know direct page accesses are a cycle slower if it isn't set to a multiple of 256.


So is this what's going on:

Code:
beq +
jmp over_routine
+;
(routine)
over_routine:
(another_routine)
rts


Then you're going to do this?

Code:
bne +
jsr routine
+;
(another_routine)
rts

(routine)
rts


Posted: Sat Apr 15, 2017 8:59 am
Joined: Mon Sep 15, 2014 4:35 pm
Posts: 3162
Location: Nacogdoches, Texas
Well, what I was referring to specifically was the problem I was having with the VRAM engine. In order to create the linked list, it needs to know which slot was used previously. To do that, it looks at the entry in the tile request table right before the current one. I had it find all the 32x32s for a metasprite, then all the 16x16s, as you wouldn't need to do anything to the number you were indexing by. The thing is, I had a 32x32 tile request table and a 16x16 tile request table, which means you need a different subroutine for every situation: one for starting 32x32, one for starting 16x16 (if the metasprite had no 32x32 sprites), one for 32x32 to 16x16, one for continuing 32x32, and one for continuing 16x16. That's 5 different "groups" of code. The alternative is to also store the results for the previous slot as regular variables in RAM as well as in the appropriate table. I started feeling like an idiot when I thought about how the speed of the code really wouldn't change much, as one absolute x indexed load is 4 cycles, while a direct page store and load is 6 cycles combined (going by 8-bit accumulator timings). However, with all the extra branches I had to do, the speed probably evened out, while the code was probably over twice as large. I really wasn't thinking... :lol:
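To put numbers on that comparison, here is a small model of the nominal 65816 timings involved. It is a sketch: it only covers `lda abs,x`, `lda dp` and `sta dp`, counts CPU cycles rather than master clocks, and the function names are invented for this example.

```python
def lda_abs_x(a16, x16=False, page_cross=False):
    """Nominal cycles for lda abs,x."""
    cycles = 4
    if a16:                   # 16-bit accumulator reads one more byte
        cycles += 1
    if x16 or page_cross:     # 16-bit index, or index crosses a page boundary
        cycles += 1
    return cycles


def dp_access(a16, d_low_nonzero=False):
    """Nominal cycles for lda dp or sta dp (both share the same cost)."""
    cycles = 3
    if a16:
        cycles += 1
    if d_low_nonzero:         # direct page register not a multiple of 256
        cycles += 1
    return cycles


for a16 in (False, True):
    width = 16 if a16 else 8
    table_load = lda_abs_x(a16)
    store_and_load = 2 * dp_access(a16)
    print(f"{width}-bit A: lda abs,x = {table_load}, sta dp + lda dp = {store_and_load}")
```

With an 8-bit accumulator this gives the 4 versus 6 cycles quoted above; with a 16-bit accumulator it becomes 5 versus 8, and a misaligned direct page register adds one more cycle to each direct page access.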


Posted: Sat Apr 15, 2017 10:38 am
Joined: Wed May 19, 2010 6:12 pm
Posts: 2431
How I did it was something like this:

Code:
jsr find_vram_slot
store initial slot number

loop:
store x/y/attributes on the linked list
branch to end if done

store slot number
jsr find_vram_slot
branch to loop

end:
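
One way to read that outline, sketched in Python rather than assembly so it can be run anywhere. Everything here is an assumption made for illustration: the free list, the `slot_table` layout, the `next` field and the function name are not taken from the actual engine, and the DMA setup that the real routine performs is left out.

```python
from collections import deque


def build_sprite_chain(parts, free_slots, slot_table):
    """Give each metasprite part a VRAM slot and link the slots together.

    Returns the number of the first slot in the chain.
    """
    if not parts:
        raise ValueError("a metasprite needs at least one part")
    slot = free_slots.popleft()            # jsr find_vram_slot
    first_slot = slot                      # store initial slot number
    for index, (x, y, attributes) in enumerate(parts):
        # store x/y/attributes on the linked list
        slot_table[slot] = {"x": x, "y": y, "attr": attributes, "next": None}
        if index == len(parts) - 1:        # branch to end if done
            break
        previous = slot                    # store slot number
        slot = free_slots.popleft()        # jsr find_vram_slot
        slot_table[previous]["next"] = slot
    return first_slot


free = deque([5, 2, 9, 7])                 # hypothetical free VRAM slots
table = {}
head = build_sprite_chain([(0, 0, 0x30), (16, 0, 0x30), (0, 16, 0x30)], free, table)
print(head, table)
```

Because the previous slot number is kept in a plain variable, the loop body is the same for the first part and for every later one, whatever the sprite size.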


Posted: Sat Apr 15, 2017 11:38 am
Joined: Mon Sep 15, 2014 4:35 pm
Posts: 3162
Location: Nacogdoches, Texas
Does something as short as "find_vram_slot" really need to be its own routine?


Posted: Sat Apr 15, 2017 12:20 pm
Joined: Wed May 19, 2010 6:12 pm
Posts: 2431
My routine also sets up DMA.


Posted: Sat Apr 15, 2017 12:26 pm
Joined: Mon Sep 15, 2014 4:35 pm
Posts: 3162
Location: Nacogdoches, Texas
What do you mean by "sets up DMA"?

This is my sprite tile uploading routine. I don't know how you could do any less of this in VBLANK and still be able to do the DMA transfer; all this is really doing is writing to the different DMA registers.

Code:
.proc tile_uploader
  sep #$10
  rep #$20
  lda #$4300
  tcd
  lda #$1801         ;Set DMA mode (word, normal increment) and destination register (VRAM write register)
  sta $00
  sta $10
  sta $20
  sta $30
  ldy #$80
  sty a:$2115
  lda a:TileRequestCounter16x16
  beq tile_uploader_32x32
  ldx #$00

tile_uploader_16x16_loop:
  lda #$0040
  sta $05
  sta $15

;16x16 Top Half
  lda a:TileRequest16x16LoWordTable,x
  sta $02
  clc
  adc #$0040
  sta $12

  lda a:TileRequest16x16BankByteTable,x
  tay
  sty $04
  sty $14

  lda a:TileRequest16x16VramAddressTable,x
  sta a:$2116

  ldy #$01      ;Initiate DMA transfer (channel 0)
  sty a:$420B

;16x16 Bottom Half 
  clc
  adc #$0100
  sta a:$2116

  ldy #$02      ;Initiate DMA transfer (channel 1)
  sty a:$420B

  inx
  inx
  beq tile_uploader_done
  cpx a:TileRequestCounter16x16
  bne tile_uploader_16x16_loop



tile_uploader_32x32:
  lda a:TileRequestCounter32x32
  beq tile_uploader_done
  ldx #$00

tile_uploader_32x32_loop:
  lda #$0080
  sta $05
  sta $15
  sta $25
  sta $35

;32x32 Top Part
  lda a:TileRequest32x32LoWordTable,x
  sta $02
  clc
  adc #$0080
  sta $12
  adc #$0080
  sta $22
  adc #$0080
  sta $32

  lda a:TileRequest32x32BankByteTable,x
  tay
  sty $04
  sty $14
  sty $24
  sty $34

  lda a:TileRequest32x32VramAddressTable,x
  sta a:$2116

  ldy #$01      ;Initiate DMA transfer (channel 0)
  sty a:$420B

;32x32 Upper Middle Part
  clc
  adc #$0100
  sta a:$2116

  ldy #$02      ;Initiate DMA transfer (channel 1)
  sty a:$420B

;32x32 Lower Middle Part
  adc #$0100
  sta a:$2116

  ldy #$04      ;Initiate DMA transfer (channel 2)
  sty a:$420B

;32x32 Bottom Part
  adc #$0100
  sta a:$2116

  ldy #$08      ;Initiate DMA transfer (channel 3)
  sty a:$420B

  inx
  inx
  cpx #$40
  beq tile_uploader_done
  cpx a:TileRequestCounter32x32
  bne tile_uploader_32x32_loop

tile_uploader_done:
  lda #$0000
  tcd
  stz TileRequestCounter16x16
  stz TileRequestCounter32x32
  rts
.endproc
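
For anyone following the address arithmetic, this Python sketch lists the transfers the routine above sets up for a single tile request. It mirrors the constants in the listing ($40 or $80 bytes per row of tiles, the source stepping by the same amount, the VRAM word address stepping by $0100, and one DMA channel bit per row). The function name and tuple layout are invented for the example, and the sample addresses are arbitrary.

```python
VRAM_ROW_STRIDE = 0x100   # word-address step between tile rows, as in the listing


def dma_plan(source, vram, sprite_size):
    """Return (channel_bit, source_address, vram_word_address, byte_count) per row."""
    rows, bytes_per_row = {16: (2, 0x40), 32: (4, 0x80)}[sprite_size]
    plan = []
    for row in range(rows):
        plan.append((
            1 << row,                                   # value written to $420B
            (source + row * bytes_per_row) & 0xFFFF,    # 16-bit source address
            (vram + row * VRAM_ROW_STRIDE) & 0xFFFF,    # value written to $2116
            bytes_per_row,                              # DMA size register
        ))
    return plan


for entry in dma_plan(0x8000, 0x6000, 32):
    print(tuple(hex(value) for value in entry))
```

A 16x16 request comes out as two transfers and a 32x32 request as four, which is why the 32x32 loop uses channels 0 through 3.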


Posted: Sat Apr 15, 2017 12:41 pm
Joined: Wed May 19, 2010 6:12 pm
Posts: 2431
By setting up DMA, I mean building the DMA table.

You know, I must say, the reason I dislike licensed developers is not that they wrote inefficient code, but that a lot of them acted like know-it-alls in interviews. Treasure's lead programmers didn't say they couldn't get the SNES to run Gunstar Heroes because of time constraints; they said it was because "the SNES's CPU can't handle the action, period", which I know is complete bullshit. Plus, Konami jumped on the bandwagon and made "Contra Hard Corps" on the Genesis, which they designed to be as CPU efficient as possible from the get-go (even under time constraints), unlike its SNES counterparts, where they just threw together code from NES games or converted ASM from 68000-based arcade games line by line, and only did a couple of nitpick optimizations at the end.

