VRAM copy in WRAM

gnarlyWarlock
Posts: 18
Joined: Fri Jul 24, 2015 1:30 pm

Hello. While making my game, I realized I had run out of VRAM space, as it was taken up by all the main character sprite variations (standing, walking, etc.). It only made sense to keep a working copy in WRAM that would be uploaded at VBlank. However, I'm fairly sure I simply ran out of time: the larger the pool of data I tried to copy, the harder my game would lag. DMA exists for OAM, but there's no such thing for VRAM (I don't get it, but w/e).

After all my actual game calculations though, I'd go into a loop waiting for VBlank. In that loop, I'd attempt to send data in Mode 0 (Hblank). It seemed the game would run at a fine speed, however, I'd get my sprite to wobble, not collide properly and the screen would sometimes not scroll (which made some sense, since part of the frame on screen was adhering to calculations from the previous frame, whilst the other to the current calculations).

Thus, I got no clue on what to try next. Any tips?? Thanks
adam_smasher
Posts: 271
Joined: Sun Mar 27, 2011 10:49 am
Location: Victoria, BC

Unfortunately there's probably not an especially good answer for you, only some hard truths: vblank is short, the GB is slow, graphics are big. If you want to do anything interesting, your vblank routine needs to be as absolutely tight as possible (almost definitely in assembly and not C), and there's a very real limit to the possible. If you post some code we might be able to make some suggestions.

That said, a couple of things about your post caught my eye:

1) why keep a copy of the sprite data in WRAM? Are you decompressing it? WRAM is no faster than cart ROM, and definitely less plentiful.

2) In general, your main game loop should update an internal state of the game and your vblank should copy that information to the screen; inefficiencies in the latter shouldn't cause logic bugs in the former. When you talk about things not colliding properly and such, it doesn't quite sound like you're doing that, which suggests to me that maybe your vblank handler and the interaction it has with the rest of your game isn't necessarily thought out well enough.
gnarlyWarlock

Code: Select all

	
@@VBlank:
	@WaitLoop:
		ldh A, ($44)	;load LY, the LCDC Y-coordinate (current scanline)
		cp $91			;LY = $91 (145) falls inside mode 1 (V-blank, lines 144-153)
		jr nz, @WaitLoop
		
		push AF
		
		;;;check if we are in the initialization routine; if we are, simply return. Otherwise, do the proper VRAM updates
		;;;;check for starter coordinates in WRAM - if bits 0, 1, 2 are set at $FF40 (LCDC) we know that this is after initialization.
		ldh A, ($40)
		and $07
		cp $07
		jr nz, @finishVBlank
		
		ld A, $C0			;SET A TO OAM START
		call @@SpriteOAM	;Do sprite updates, the OAM table at the bottom of WRAM (don't have to call the macro, the subroutine's already in HRAM)
		
		;load VRAM copy from WRAM...........
		ld BC, $20
		ld DE, MainSpr_NL_WRAM
		ld HL, $8000
		call @@CpyVRAMData	;again, reusing this subroutine
		
		ld A, 1
		ldh (vblank), A
		
		pop AF
		reti
		
		@finishVBlank: ;used at initialization
		pop AF
		ret
		
The routine called inside is this:

Code: Select all

@@CpyVRAMData:
	@VRAMDataLdLoop:
		ld A, (DE)
		inc DE
		ld (HL+), A
		dec BC
		ld A, C
		or B
		jr nz, @VRAMDataLdLoop
	ret		
The reason I keep a copy of VRAM in WRAM is that I want to be able to update my sprite tiles dynamically, without wasting space in VRAM. You see, I'm using 8x16 mode, and my sprite is 16x16, so technically, every sprite "frame" (standing looking upwards, standing sideways, etc.) takes up 4 tiles. SO, I can only have 32 tiles in VRAM at once. I have 20 frames of just my sprite in different poses, and 10 others are of their weapon in different animation frames. SO, I have no space for anything else in VRAM. The way I was trying to do it is: set the main sprite to point at tiles 0 and 2, ALWAYS, and then every cycle, I'd upload the new sprite stance into WRAM from ROM, and then at VBlank transfer it to VRAM, so instead of taking up 20*4 = 80 TILES in VRAM, he'd always only take 4. In order to do that, I need to have a copy of VRAM in WRAM that I can alter at any point, and then upload it in VBlank. This would be no problem on the SNES, since there's DMA for VRAM. Not in the GB - \_(o_O")_/

PS: this is the "original" version. At the moment I've modified the first few lines to this:

Code: Select all

ld BC, $400
ld DE, MainSpr_NL_WRAM
ld HL, $8000
@WaitLoop:
ldh A, ($41) 
and ($03) ;this checks if the mode it's in is Mode 0, or HBlank. So while waiting for VBlank, I can still copy data...this works but not really
jr nz, @doneHBlankCopy
ld A, C
or B
jr z, @doneHBlankCopy
ld A, (DE)
inc DE
ld (HL+), A
dec BC
@doneHBlankCopy:	;label spelled to match the two jr instructions above
ldh A, ($44)	;load LY, the LCDC Y-coordinate (current scanline)
cp $91			;LY = $91 (145) falls inside mode 1 (V-blank, lines 144-153)
jr nz, @WaitLoop
adam_smasher

I'd strongly advise against trying to transfer during hblank - on the GB, almost all of your processing time happens while the screen's drawing. If you're spinwaiting for hblank, you're wasting almost all of your CPU time.

Doing the math, aside from the OAM DMA transfer, your vblank is taking around...1200 cycles, roughly? Which is still only about a quarter of the actual length (4560 cycles) - you should be pretty comfortable. And now that I think about it, if you ran out of vblank time the result wouldn't be lag in your game - it would be corrupt graphics, as you'd be trying to write to VRAM while the screen's drawing, VRAM's locked and writes are blocked. That means that it's actually your main game loop that's taking too long. What are you doing in there?

Anyway, if you're keen, here are some nitpicks I have with your vblank handler:

First of all, why do you push/pop AF? Your wait loop is trashing them anyway.

Code: Select all

      ;;;check if we are in the initialization routine; if we are, simply return. Otherwise, do the proper VRAM updates
      ;;;;check for starter coordinates in WRAM - if bits 0, 1, 2 are set at $FF40 (LCDC) we know that this is after initialization.
      ldh A, ($40)
      and $07
      cp $07
      jr nz, @finishVBlank
Why are you waiting for vblank during initialization? Why do you have the screen on during initialization? If the screen's off, vblank shouldn't fire, although I'm not sure whether LY updates regardless. I use the method here to wait for vblank, which doesn't involve spinning on LY, and which I recommend. At any rate, if you check this BEFORE you wait for vblank, you can return right away, meaning that you don't need to do this test in vblank proper, saving you 36 cycles.
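
For illustration only, a minimal sketch of a HALT-based wait (not necessarily the code behind the link). It assumes interrupts are enabled with ei, that the VBlank interrupt is enabled in $FFFF, and that the VBlank handler stores 1 to the HRAM flag vblank, as the handler posted earlier does. WaitForVBlank is a made-up name.

```asm
;Hypothetical helper. Assumes: ei has been executed, bit 0 of $FFFF (IE) is set,
;and the VBlank handler writes 1 to the HRAM flag "vblank".
WaitForVBlank:
	xor A
	ldh (vblank), A		;clear the flag before sleeping
	@wait:
		halt				;CPU sleeps until any enabled interrupt fires
		nop					;padding after halt (some assemblers add this themselves)
		ldh A, (vblank)
		or A
		jr z, @wait			;woken by some other interrupt, sleep again
	ret
```

The main loop would call this once per frame after finishing its game logic, so the CPU sleeps instead of polling LY.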

Code: Select all

      ld BC, $20
...
   @VRAMDataLdLoop:
      ld A, (DE)
      inc DE
      ld (HL+), A
      dec BC
      ld A, C
      or B
      jr nz, @VRAMDataLdLoop
Why are you using all 16 bits of BC if you're only copying 32 bytes? If you do this instead:

Code: Select all

      ld B, $20
...
   @VRAMDataLdLoop:
      ld A, (DE)
      inc DE
      ld (HL+), A
      dec B
      jr nz, @VRAMDataLdLoop
...you save 4 + (12 * 32) = 388 cycles ($20 is 32 iterations).

EDIT: Also, a bit more food for thought. If you make sure that your source data in VRAMDataLdLoop never crosses a 256-byte boundary, you can change that inc DE to inc E, saving you an additional 4 cycles per iteration. Again, that's probably not necessary here, but thinking like that is always good with the GB.
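
As a rough sketch of that last idea (the routine name is made up, and it is up to the caller to place the source buffer so that the copied range stays inside one 256-byte page):

```asm
;Sketch only. Assumes the B source bytes starting at DE all lie within one
;256-byte page, so E never wraps and D never has to change.
;In: DE = source, HL = VRAM destination, B = byte count (1-255)
CpyVRAMDataShort:			;hypothetical name
	@loop:
		ld A, (DE)			; 8 cycles
		inc E				; 4 cycles (inc DE would cost 8)
		ld (HL+), A			; 8 cycles
		dec B				; 4 cycles
		jr nz, @loop		;12 cycles when taken, 8 on the final pass
	ret
```

That works out to 36 cycles per byte, against 52 per byte for the original loop with the 16-bit BC counter.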
gnarlyWarlock

Thanks for the reply. However...

I DO transfer during HBlank, which you did condone. That brings me to my next point - I can't use HALT, as H-Blank isn't treated as an interrupt. That loop looks for both V-Blank and H-Blank, so I don't see the point of using halt.

But thanks for the tip - I don't see corrupt graphics per se, however, bear in mind that's cause my main sprite tiles are occupying the first 4 tiles. Maybe it runs out of time whilst it's on the 143rd tile or something, and that's not being used?

The push and pop of AF was totally useless though LOL definitely cut those out.

Also, I was trying to transfer a new copy occupying the entire VRAM memory - that's 800 bytes. That definitely crosses the 256 byte boundary thus I use BC as a counter.

In my game loop, I first calculate any changes in the main sprite's location, just x/y coordinates. Afterwards I check for collision (where I do some weird calculations, but it's not too hefty). Then, IF my character is attacking, I handle the attack animation (I have a variable for whether he's in the attack animation; if so, input processing is skipped entirely), which isn't too hefty either. That's it. I seriously doubt it's anywhere NEAR 66000 cycles.
adam_smasher

Naw naw naw, I meant don't transfer during hblank. It's extremely short - only around 200 cycles - so you'll barely increase the number of bytes transferred, and you're burning through the time you should be spending on your game logic waiting for it. In fact, by my math, with all the overhead in your loop, you're only going to get one or two bytes per hblank in (there's also a race condition, in that you might detect that you're in hblank, but by the time you actually get to write your byte you're out of it and the write will break).

I'm still not exactly sure what you're trying to do - are you drawing your weapon tile data over top of the player tile data in WRAM? Why not use separate sprites for the player and the weapon? Then you could just blast tile data from ROM to VRAM directly. That's substantially less than 800 bytes to upload per vblank (aside: "a new copy occupying the entire VRAM memory - that's 800 bytes"; the entire VRAM is actually 8K (8192), and the tile data specifically is 6K) and would probably make your game loop nicer too.
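
A minimal sketch of what copying straight from ROM could look like, under some assumptions: the frame's tile data is uncompressed and sits in a ROM bank that is mapped in during vblank, and the main loop has stored its address in a 2-byte WRAM variable. Both wFramePtr and UploadPlayerFrame are made-up names.

```asm
;Sketch only. Copies one 16x16 frame (4 tiles * 16 bytes = 64 bytes) from ROM
;to tiles 0-3. wFramePtr is a hypothetical 2-byte WRAM variable, low byte first,
;that the main loop sets to the address of the frame to show.
UploadPlayerFrame:			;hypothetical name
	ld HL, wFramePtr
	ld A, (HL+)
	ld E, A
	ld D, (HL)				;DE = address of the frame's tile data in ROM
	ld HL, $8000			;VRAM destination: tiles 0-3
	ld B, 64				;4 tiles * 16 bytes
	@loop:
		ld A, (DE)
		inc DE
		ld (HL+), A
		dec B
		jr nz, @loop		;about 40 cycles per byte
	ret
```

At roughly 40 cycles per byte that is about 2560 cycles for the whole frame, so it should fit inside the 4560-cycle vblank with room left over, and the main loop only has to update a pointer instead of copying tiles into WRAM.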
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

adam_smasher wrote:Why not use separate sprites for the player and the weapon?
Probably so as not to exceed 10 sprites per scanline, 40 sprites per scene, and 2048 bytes of sprite VRAM.
gnarlyWarlock

adam_smasher wrote:Naw naw naw, I meant don't transfer during hblank. It's extremely short - only around 200 cycles - so you'll barely increase the number of bytes transferred, and you're burning through the time you should be spending on your game logic waiting for it. In fact, by my math, with all the overhead in your loop, you're only going to get one or two bytes per hblank in (there's also a race condition, in that you might detect that you're in hblank, but by the time you actually get to write your byte you're out of it and the write will break).

I'm still not exactly sure what you're trying to do - are you drawing your weapon tile data over top of the player tile data in WRAM? Why not use separate sprites for the player and the weapon? Then you could just blast tile data from ROM to VRAM directly. That's substantially less than 800 bytes to upload per vblank (aside: "a new copy occupying the entire VRAM memory - that's 800 bytes"; the entire VRAM is actually 8K (8192), and the tile data specifically is 6K) and would probably make your game loop nicer too.
Lol sorry I misread. Alright, I'll take out the H-Blank spin; still though, my game would engage in it only if it's done with the calculations. I'll go through it with the debugger tonight and see how many times it steps through that H/V-Blank loop, cause, right, if it's done with calculations and there's time before V-Blank, surely it will copy a few bytes?

Also, I meant $800 bytes, my bad. Which in decimal, as tepples said above, would be 2048 bytes (I wish it was 8k). $8000-$8800 (at least in my implementation) is sprite tile data. I have it set in $FF40 that my background tiles start at $8800+ so that area is irrelevant in terms of sprite manipulation.