All times are UTC - 7 hours

Posted: Mon Mar 11, 2019 4:53 pm
Joined: Sun Jan 31, 2016 9:55 pm
Posts: 54
I want to stretch the picture 2x in the vertical direction. Assuming I can write during hblank, I figured it would work to set the scroll every other scanline:

scanline 0, set y scroll to 0
0
1, set to 1
1
2, set to 2
2
...

I could do a 2006, 2005, 2005, 2006 series of writes for each of those, but I'm hoping it's not necessary.
Some details:
- My x scroll is always 0
- The nametable number is 0

The best I've found is:
Code:
lda #0
sta 2006
sta 2005
lda #y
sta 2005
lda #(y & $f8) << 2
sta 2006


I noticed that y scroll is incremented only if rendering is enabled. Is it possible to disable rendering just before the end of the scanline, then re-enable it? Alternately, can a shorter series of writes set y in this way?


Posted: Mon Mar 11, 2019 5:02 pm
Joined: Sun Apr 13, 2008 11:12 am
Posts: 8626
Location: Seattle
A 2006/2006 write will preserve fine X, allow setting coarse X and coarse Y, and will allow setting the bottom two bits of fine Y and clear the uppermost one. So ... if it were useful to switch between 6/6 and 6/5/5/6 you could, but oh well.
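
To see why a $2006/$2006 pair cannot reach every fine Y value, here is a minimal Python sketch of how the two writes land in the PPU's 15-bit address register. It assumes the commonly documented bit layout (fine Y in bits 12-14, nametable in bits 10-11, coarse Y in bits 5-9, coarse X in bits 0-4); the function names are made up for illustration.

```python
def t_after_2006_pair(high, low):
    """Return the 15-bit address after a $2006/$2006 write pair."""
    t = (high & 0x3F) << 8   # first write: bits 8-13, bit 14 is forced to 0
    t |= low & 0xFF          # second write: bits 0-7
    return t


def decode(t):
    """Split a 15-bit PPU address into its scroll fields."""
    return {
        "coarse_x": t & 0x1F,
        "coarse_y": (t >> 5) & 0x1F,
        "nametable": (t >> 10) & 0x03,
        "fine_y": (t >> 12) & 0x07,
    }


# Even with every bit set in both writes, fine Y tops out at 3.
print(decode(t_after_2006_pair(0xFF, 0xFF)))
```

Because bit 14 is always cleared, only rows with fine Y 0 to 3 are reachable this way.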

If you're having problems with this effect, check your exact timing using Mesen's "Event Viewer".

russellsprouts wrote:
Is it possible to disable rendering just before the end of the scanline, then re-enable it?
Only if you aren't using sprites. Also, what would that achieve?


Posted: Mon Mar 11, 2019 6:07 pm
Joined: Sat Feb 12, 2005 9:43 pm
Posts: 11416
Location: Rio de Janeiro - Brazil
russellsprouts wrote:
I could do a 2006, 2005, 2005, 2006 series of writes for each of those, but I'm hoping it's not necessary.

The $2006/5/5/6 sequence is necessary because of the fine Y scroll. Since you're doubling every pixel (i.e. covering all fine Y values), you'll definitely need to use this sequence.

Quote:
Code:
lda #0
sta 2006
sta 2005
lda #y
sta 2005
lda #(y & $f8) << 2
sta 2006

Keep in mind that $2006 and $2005 share the toggle that defines whether a write is the first or the second, so the values for each write should actually be the following:

$2006: high byte of VRAM address;
$2005: Y scroll;
$2005: X scroll;
$2006: low byte of VRAM address;

Notice how X and Y are apparently swapped, because the first $2006 write changed the toggle to "second write", so the following $2005 write is expecting the Y scroll, not X. Then the toggle goes back to "first write", and you can write the X scroll to $2005, and finalize with the low byte of the VRAM address, which would be in the format YYYXXXXX. Your code is mostly correct, you should just swap the $2005 writes (Y goes first).
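
The write order described above can be checked with a small model. This is only a sketch in Python, assuming the commonly documented behaviour of the shared write toggle and the t/v registers; the class and function names are invented for this example.

```python
class PpuScroll:
    """Tiny model of the PPU's t, v, fine X and shared write toggle (w)."""

    def __init__(self):
        self.t = 0
        self.v = 0
        self.fine_x = 0
        self.w = 0

    def write_2005(self, d):
        if self.w == 0:
            # first write: X scroll (coarse X into t, fine X kept separately)
            self.t = (self.t & ~0x001F) | (d >> 3)
            self.fine_x = d & 0x07
        else:
            # second write: Y scroll (fine Y into bits 12-14, coarse Y into 5-9)
            self.t = (self.t & ~0x73E0) | ((d & 0x07) << 12) | ((d & 0xF8) << 2)
        self.w ^= 1

    def write_2006(self, d):
        if self.w == 0:
            # first write: high byte, bit 14 cleared
            self.t = (self.t & 0x00FF) | ((d & 0x3F) << 8)
        else:
            # second write: low byte, then t is copied to v
            self.t = (self.t & 0x7F00) | (d & 0xFF)
            self.v = self.t
        self.w ^= 1


def split_scroll(ppu, x, y, nametable=0):
    """Issue the $2006/5/5/6 sequence for scroll position (x, y)."""
    ppu.write_2006(nametable << 2)                         # nametable select
    ppu.write_2005(y)                                      # toggle is "second": Y
    ppu.write_2005(x)                                      # toggle is "first": X
    ppu.write_2006((((y & 0xF8) << 2) | (x >> 3)) & 0xFF)  # YYYXXXXX


ppu = PpuScroll()
split_scroll(ppu, 0, 77)
print("fine Y:", (ppu.v >> 12) & 7, "coarse Y:", (ppu.v >> 5) & 0x1F)
```

With x fixed at 0, the last byte reduces to the `(y & $f8) << 2` value from the original code.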

Quote:
Is it possible to disable rendering just before the end of the scanline, then re-enable it?

Disabling and re-enabling rendering mid-frame comes with lots of annoyances (it kills sprites, there will be jitter if there's picture near the place where rendering is turned off/on, etc.), but this wouldn't work for your purposes anyway, because then the X scroll wouldn't be reset for the next scanline (this happens around the same time as Y is incremented).

Your best bet is to go with the $2006/5/5/6 sequence. It doesn't interfere with the rendering process nearly as much as turning rendering off/on, and since your X scroll is 0, you can do the first 3 writes before the scanline ends, and only the final $2006 write has to fall within hblank. You basically have to make sure that one cycle (the last cycle of a write instruction is when the write actually takes place) lands in a ~20 cycle window (the part of hblank when it's safe to change the scroll)... Not difficult to time at all, even considering the latency of responding to the NMI (which introduces up to 7 cycles of jitter).

If you time this right (be sure to count scanlines as if they took 114, 114, 113 cycles, to approximate the real length of 113.66666 cycles without accumulating error that'd throw off the timing of the lower writes - if you plan to support PAL, you'll need extra logic to approximate the scanline length of 106.5625 cycles), you can do this completely glitch-free.


Posted: Mon Mar 11, 2019 8:27 pm
Joined: Sat Feb 12, 2005 9:43 pm
Posts: 11416
Location: Rio de Janeiro - Brazil
Well, I was kinda bored and thought that coding this would be a fun exercise, so here's what I came up with (completely untested, so proceed with caution if you decide to use this for anything!).

I wanted to do something that supported NTSC and PAL, and since the fractional part of the scanline length is different for each region, we need a variable to remember what that fraction of a cycle is, depending on the region of the console:

Code:
   ;initialize the fraction according to the region
   ;(if we scale the fraction up by 256, a byte overflow will
   ;indicate when the error becomes larger than a whole unit)
   lda #170 ;an NTSC scanline is 113.666 cycles: 0.666 * 256 = 170
   ;lda #144 ;a PAL scanline is 106.5625 cycles: 0.5625 * 256 = 144
   sta Fraction

Here I just hardcoded the NTSC value and commented the PAL value out, but ideally you'd detect the region during startup (as in this wiki article) and select the correct value dynamically.
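
For anyone who wants to check the scaled fractions, here is a rough Python sketch of the accumulator idea: add the fraction to an 8-bit error each scanline and spend one extra cycle whenever it overflows. The function name and the 240-line count are only assumptions for the example.

```python
def scanline_lengths(whole, fraction, count):
    """Cycle length of each timed scanline: `whole` cycles, plus one
    extra cycle whenever the 8-bit error accumulator overflows."""
    error = 0
    lengths = []
    for _ in range(count):
        error += fraction
        extra = error >> 8      # 1 on overflow, otherwise 0
        error &= 0xFF
        lengths.append(whole + extra)
    return lengths


ntsc = scanline_lengths(113, 170, 240)
pal = scanline_lengths(106, 144, 240)
print(ntsc[:6])
print(sum(ntsc), 240 * 341 / 3)    # accumulated cycles vs. the real NTSC length
print(sum(pal), 240 * 341 / 3.2)   # accumulated cycles vs. the real PAL length
```

The PAL value is exact, since 0.5625 * 256 is a whole number. The NTSC value of 170 is rounded down from about 170.67, so over 240 lines the total ends up roughly one cycle short, which should still be well inside the hblank window.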

Then comes the actual code that runs during rendering. You must make sure that this code is consistently timed to run at the beginning of rendering every frame. You can do this by using a constant-timed NMI handler (usually not very easy!), or by waiting for the sprite 0 hit or sprite overflow flags to be cleared (which happens at the end of vblank - if you arranged for them to get set in the previous frame, of course), among other options. All variables are in ZP for faster access, and the code should be aligned to a memory page so that branches won't suffer the penalty of crossing pages, as that'd screw up the timing. The loop is timed so it takes 113 cycles on NTSC and 106 cycles on PAL, plus an extra cycle whenever the fractional error overflows, keeping the scanline counting consistent all throughout the frame.

Code:
   ;initialize all variables for the frame
   lda #$00
   sta ScrollY
   sta Error ;accumulated error of fractional time
   sta Increment ;bit 0 of this decides whether the scroll is incremented
   lda #$78 ;end after 120 scanlines have been doubled
   sta EndScanline

   ;ADD TIMED CODE HERE TO ALIGN THE CODE BELOW TO THE FIRST HBLANK! THIS TIME MAY
   ;NEED TO BE DIFFERENT FOR PAL AND NTSC, DEPENDING ON THE POINT OF REFERENCE.

DoScanline:

   ;wait according to the region (8 cycles if PAL, 15 cycles if NTSC)
   ;(an NTSC scanline is 7 whole cycles longer than a PAL
   ;scanline, and this difference is handled here)
   lda Fraction ;3
   cmp #170 ;2
   bcc HandleFraction ;2(NTSC)|3(PAL)
   clc ;2
   clc ;2
   clc ;2
   clc ;2

HandleFraction:

   ;compensate for the fractional error (8 or 9 cycles)
   ;(an overflow means that the fractional error has accumulated into
   ;a whole cycle, so an extra cycle must be wasted in that case)
   adc Error ;3
   sta Error ;3
   bcs KillTime ;2(no overflow)|3(overflow)

KillTime:

   ;kill some time (41 cycles)
   ;(the actual work we need to do doesn't take a
   ;whole scanline, so we need to pad the time)
   ldx #$08 ;2 (8 iterations: 2 + 8*5 - 1 = 41 cycles)
Wait:
   dex ;2
   bne Wait ;3

   ;set the scroll (25 cycles)
   stx $2006 ;4
   lda ScrollY ;3
   sta $2005 ;4
   and #%11111000 ;2
   asl ;2
   asl ;2
   stx $2005 ;4
   sta $2006 ;4

   ;increment Y every other scanline (18 cycles)
   ;(by toggling the lower bit of the increment every scanline,
   ;we alternate between adding 0 and adding 1 to the scroll)
   lda Increment ;3
   and #%00000001 ;2
   clc ;2
   adc ScrollY ;3
   sta ScrollY ;3
   inc Increment ;5

   ;decide whether to do another scanline (6 cycles)
   cmp EndScanline ;3
   bne DoScanline ;3

BTW, this will consume 100% of the CPU time, so it will not be possible to do any gameplay while this effect is on, unless the whole logic is so simple that it fits in vblank (I wouldn't expect anything more complex than pong to be possible).
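
Since the code is untested, a quick tally of the cycle counts in the comments is worth having. The sketch below is just Python bookkeeping using the standard 6502 timings quoted in the comments (zero-page variables, no page crossings); the function names are invented here.

```python
def wait_loop_cycles(iterations):
    """ldx #imm (2), then dex (2) + bne taken (3) per pass,
    minus 1 because the final bne is not taken."""
    return 2 + iterations * 5 - 1


def loop_cycles(region, overflow, wait_iterations):
    """Total cycles for one pass through DoScanline."""
    region_wait = 15 if region == "NTSC" else 8   # lda/cmp/bcc plus four clc on NTSC
    fraction = 9 if overflow else 8               # adc/sta/bcs
    set_scroll = 4 + 3 + 4 + 2 + 2 + 2 + 4 + 4    # the $2006/5/5/6 block
    increment = 3 + 2 + 2 + 3 + 3 + 5             # lda/and/clc/adc/sta/inc
    loop_end = 3 + 3                              # cmp/bne taken
    return (region_wait + fraction + wait_loop_cycles(wait_iterations)
            + set_scroll + increment + loop_end)


for n in (7, 8):
    print(n, wait_loop_cycles(n),
          loop_cycles("NTSC", False, n), loop_cycles("PAL", False, n))
```

By this count the wait loop needs eight passes to reach the 41 cycles budgeted in the comment; seven passes give 36, which would leave each scanline 5 cycles short of the 113/106 target.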


Posted: Tue Mar 12, 2019 12:20 am
Joined: Fri Nov 12, 2004 2:49 pm
Posts: 7744
Location: Chexbres, VD, Switzerland
What about
Code:
ldy #y
lda #0
sta $2005
sty $2005
sta $2005
lda #(y & $f8) << 2
sta $2006

The first $2006 write is largely unnecessary and need not be computed if you can discard it. That being said, I didn't understand your exact problem. If you're disabling rendering, nothing will be displayed, so you shouldn't care about the scroll value, right?


Last edited by Bregalad on Tue Mar 12, 2019 7:17 am, edited 1 time in total.

Posted: Tue Mar 12, 2019 4:38 am
Joined: Sat Feb 12, 2005 9:43 pm
Posts: 11416
Location: Rio de Janeiro - Brazil
Bregalad wrote:
The first $2006 write is largely unnecessary and need not be computed if you can discard it.

Actually, it serves the purpose of selecting the name table. Since the name table remains constant, you could indeed skip that write if you set the name table beforehand, but I don't see any advantage if the scroll change itself is not any faster, since you still have 4 register writes. You're just creating the need to set the name table elsewhere, wasting time and space. Also, I don't know why you're writing Y before X...

Quote:
That being said, I didn't understand your exact problem. If you're disabling rendering, nothing will be displayed, so you shouldn't care about the scroll value, right?

I don't think he is disabling rendering, he just assumed he could briefly disable rendering before the end of the scanline in order to skip the automatic Y increment, thus doubling scanlines, but things aren't that simple.


Posted: Tue Mar 12, 2019 7:20 am
Joined: Fri Nov 12, 2004 2:49 pm
Posts: 7744
Location: Chexbres, VD, Switzerland
tokumaru wrote:
I don't see any advantage if the scroll change itself is not any faster, since you still have 4 register writes. You're just creating the need to set the name table elsewhere, wasting time and space. Also, I don't know why you're writing Y before X...

Well, he'd need to write to $2000 during VBlank anyway to set the sprite size and the pattern tables for BG and sprites, and to enable the next VBlank NMI, so... I don't think that's wasting any time or space.
As for the order, I just copied his wrong order (from the OP), but I have now edited my post so that it's correct.

Quote:
I don't think he is disabling rendering, he just assumed he could briefly disable rendering before the end of the scanline in order to skip the automatic Y increment, thus doubling scanlines, but things aren't that simple.

Ah OK, I didn't understand that part. Well, I guess it's not possible to do so, but what happens when you try? Even if it worked, the timing would be extremely sensitive.


Posted: Tue Mar 12, 2019 10:35 am
Joined: Sat Feb 12, 2005 9:43 pm
Posts: 11416
Location: Rio de Janeiro - Brazil
Bregalad wrote:
Well, he'd need to write to $2000 during VBlank anyway to set sprite size, and pattern tables for BG and Sprites and to enable the next VBlank so...

OK, it would be good practice to have a $2000 write during vblank anyway. [Crossed out: Another advantage of doing it your way (i.e. not resetting the NT bits every scanline) is that you are able to display images that cross the name table boundary (the picture can start on either NT and cross to the other one).]

Quote:
Ah OK, I didn't understand that part. Well, I guess it's not possible to do so, but what happens when you try? Even if it worked, the timing would be extremely sensitive.

I think Y won't be incremented, but X won't be reset either. Maybe that won't matter if horizontal mirroring is used? Or maybe it matters anyway because the PPU fetches 34 tiles per scanline and not 32? I don't know. Things would definitely look jittery near the end of the scanline if you had anything but color 0 there. I just don't think this is a good idea.

EDIT: crossed out an erroneous assumption.


Last edited by tokumaru on Tue Mar 12, 2019 1:31 pm, edited 1 time in total.

Posted: Tue Mar 12, 2019 1:07 pm
Joined: Sun Jan 31, 2016 9:55 pm
Posts: 54
Yeah, it looks like 2006 2005 2005 2006 is the best option.

It's not particularly important for my use case that the scanlines be exactly in order. I could use 2x2006 writes and ignore the contents of scanlines I can't set (ones that look like xxxxx1xx)

0 0 1 1 2 2 3 3 8 8 9 9 10 10 11 11 16 16 17 17 18 18 19 19 24 24...

But that would also require writing 3 scanlines in a row for the transitions from 3 to 8, for example:

2, set to 2
2
3, set to 3
3, set to 8
8, set to 8
8
9, set to 9
etc.

I probably wouldn't end up saving any time that way, with the overhead of (for example) setting an MMC3 interrupt to trigger the next writes.
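
The row pattern above can be reproduced with a few lines of Python. This is only an illustration: it treats a write as happening in the hblank just before the scanline it affects, and the helper names are made up.

```python
def reachable_rows(height=240):
    """Source rows whose fine Y has bit 2 clear, i.e. the rows
    that a $2006/$2006 pair is able to select."""
    return [y for y in range(height) if not y & 0x04]


def doubled(rows):
    """Show every source row on two consecutive scanlines."""
    return [y for y in rows for _ in range(2)]


def writes_needed(sequence):
    """Display scanlines where the wanted row differs from what the
    PPU's automatic Y increment would have produced."""
    return [i for i in range(1, len(sequence))
            if sequence[i] != sequence[i - 1] + 1]


rows = reachable_rows()
print(doubled(rows)[:24])
print(writes_needed(doubled(rows)[:12]))
```

The second list contains three consecutive scanlines around the jump from row 3 to row 8, which is the "3 scanlines in a row" case described above.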


Posted: Tue Mar 12, 2019 1:30 pm
Joined: Sat Feb 12, 2005 9:43 pm
Posts: 11416
Location: Rio de Janeiro - Brazil
russellsprouts wrote:
Yeah, it looks like 2006 2005 2005 2006 is the best option.

Like Bregalad pointed out, using $2005/5/5/6 would work just as well. I wrongly assumed that it would allow crossing a NT boundary, but on the final $2006 write the original NT index is copied from t to v, so you're restricted to a single NT anyway, unless the first $2006 write uses a dynamic value rather than a constant 0. This is probably not important though, so either sequence is fine for your purpose as long as the NT is set *somewhere*.

Quote:
I could use 2x2006 writes and ignore the contents of scanlines I can't set (ones that look like xxxxx1xx)

You could, but IMO it sounds like way more trouble to manage a NT with gaps (especially if the NT contents are dynamic, as opposed to pre-computed) than just doing a little bit shifting for that final $2006 write.

Quote:
I probably wouldn't end up saving any time that way, with the overhead of (for example) setting an MMC3 interrupt to trigger the next writes.

If saving time is important to you (i.e. you need to run game logic along with this), then you absolutely need mapper IRQs instead of timed code. But even then, firing IRQs every other scanline will eat a significant portion of the CPU time, considering the time it takes to get in and out of the IRQ handler, backing up and restoring registers, setting up the next IRQ, changing the scroll, calculating the new values for next time, and so on... You'll probably end up with only 50% or so of the CPU time available for game logic.


Posted: Tue Mar 12, 2019 3:01 pm
Joined: Sun Jan 31, 2016 9:55 pm
Posts: 54
My idea was to make a game with a monochrome bitmap display, but make each pixel of the bitmap be 2x2 hardware pixels, to decrease memory requirements, cut down on artifacts, and let me update more of the screen at once. I would use the palette trick to make each tile cover 8x8 hardware pixels on the left of the screen, and 8x8 pixels on the right, and use a raster effect to vertically stretch the image to get double the area as well. The display would be something like 240x224=53760 hardware pixels, but each 16 byte tile would cover 256 hardware pixels on screen, so it would require less than one pattern table to fill the screen.

But it seems like it would take up too much of the frame time to accomplish the raster effect, so probably it would be better to just double the writes to CHR-RAM. Each time I get one byte of pixel values, I would do sta $2007 twice to write it to two rows of the tile. I would have to use both pattern tables then, and swap midscreen.

Side note, has anyone considered doing 4x4 virtual pixels? Maybe it's too limiting to make a game in a 64x60 resolution, but you could get a 64x60 display with 2-bit color, and each 4x4 virtual pixel area could be a different palette. It would be like a zoomed in view of the NES with fewer attribute restrictions. In this case, the pattern table would be a fixed set of 256 tiles -- each combination of 4 colors for the 4 pixels. The nametable would be the bitmap data. Maybe I should try this before doing 2x2 virtual pixels.
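
As a rough illustration of the 4x4 idea, the fixed pattern table could be generated like this. It's a Python sketch under an assumed index encoding (two bits per virtual pixel, top-left in the lowest bits), which is a hypothetical choice and not anything the hardware dictates; the function name is made up too.

```python
def tile_chr(index):
    """16 bytes of CHR data for one tile made of four 4x4 virtual pixels.

    Assumed (hypothetical) encoding of the tile index:
    bits 0-1 top-left, 2-3 top-right, 4-5 bottom-left, 6-7 bottom-right.
    """
    tl, tr, bl, br = ((index >> shift) & 3 for shift in (0, 2, 4, 6))
    data = []
    for bit in (0, 1):                      # bitplane 0, then bitplane 1
        for row in range(8):
            left, right = (tl, tr) if row < 4 else (bl, br)
            byte = 0
            if (left >> bit) & 1:
                byte |= 0xF0                # left four hardware pixels
            if (right >> bit) & 1:
                byte |= 0x0F                # right four hardware pixels
            data.append(byte)
    return bytes(data)


pattern_table = b"".join(tile_chr(i) for i in range(256))
print(len(pattern_table))
```

With an encoding like this the nametable byte is the bitmap data itself, so drawing a 2x2 block of virtual pixels is a single nametable write.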


Posted: Tue Mar 12, 2019 3:32 pm
Joined: Sun Apr 13, 2008 11:12 am
Posts: 8626
Location: Seattle
russellsprouts wrote:
Side note, has anyone considered doing 4x4 virtual pixels? Maybe it's too limiting to make a game in a 64x60 resolution, but you could get a 64x60 display with 2-bit color, and each 4x4 virtual pixel area could be a different palette.
The Apple 2, TMS9918 (e.g. MSX, SG-1000, ColecoVision), and Intellivision each supported a video mode that worked like this. I don't think it was ever widely used. (sample)

More recently, pubby's F-FF homebrew (n.b. preview is deceptive) is a racer that ran in this logical mode.


Posted: Tue Mar 12, 2019 3:54 pm
Joined: Sat Feb 12, 2005 9:43 pm
Posts: 11416
Location: Rio de Janeiro - Brazil
If your goal is to have a 128x120 monochrome mode on the NES, a better alternative would be to use predefined tiles with 4x2 patterns repeated on the top and bottom halves of 256 tiles. Then, using vertical mirroring (name tables side by side), draw one row of tiles on the left name table, then the next row on the second name table, and keep alternating name tables all the way down.

The advantage is that you only need to change the scroll every 8 scanlines, as opposed to every other scanline, and the scroll change itself is much simpler: just one write to $2000 to select the alternate name table, no need to mess with $2005/6.

The disadvantage is that you can't scroll horizontally, but apparently you don't need that anyway. Updating the whole screen would take a lot of time though, because it's a total of 1920 bytes. This means your game has to work with incremental updates, or you may need to use 4-screen/no mirroring in order to double buffer screens. You may also need forced blanking to update more tiles per frame.
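
Here's one way the 256 predefined tiles could be generated, as a minimal Python sketch. The bit order (high nibble for the upper virtual row, low nibble for the lower one, leftmost pixel in the highest bit) is an assumption for the example, as are the function names.

```python
def widen(nibble):
    """Double each of four bits horizontally: %abcd -> %aabbccdd."""
    out = 0
    for i in range(4):
        if (nibble >> (3 - i)) & 1:
            out |= 0b11 << (6 - 2 * i)
    return out


def mono_tile_chr(index):
    """16 bytes of CHR data for a tile holding a 4x2 virtual-pixel pattern,
    repeated on the top and bottom halves (bitplane 1 left empty)."""
    upper = widen(index >> 4)       # upper virtual row, 2 scanlines tall
    lower = widen(index & 0x0F)     # lower virtual row, 2 scanlines tall
    half = [upper, upper, lower, lower]
    return bytes(half + half + [0] * 8)


pattern_table = b"".join(mono_tile_chr(i) for i in range(256))
print(mono_tile_chr(0b10000000).hex())
print(32 * 60)   # tiles needed for a 128x120 virtual screen
```

The last line is where the 1920 bytes come from: 32 tiles across by 60 half-height rows.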


Posted: Tue Mar 12, 2019 4:41 pm
Joined: Sun Jan 31, 2016 9:55 pm
Posts: 54
That makes a lot more sense, thanks. I was trying to figure out how to use fixed pattern tables, but I didn't think about this method.

So each tile would look like the attached image. Call the tile here %10000000, because that's the pattern of the 8 virtual pixels on the top half. The pattern is repeated below the line.

So first we need to set y scroll to 4, so that the nametable swaps take place in the middle of the tiles instead of on the edge. Then swap the nametables like this:

Scanlines 0-7: nametable 0
Scanlines 8-15: nametable 1
Scanlines 16-23: nametable 0
...

You're right that it's 1920 bytes to update the screen, but that's the theoretical minimum for a full bitmap screen anyway.


Attachments:
tile-idea.png [ 214 Bytes ]

Posted: Tue Mar 12, 2019 5:30 pm
Joined: Sat Feb 12, 2005 9:43 pm
Posts: 11416
Location: Rio de Janeiro - Brazil
Yup, that's exactly it. One thing I didn't explain well is that if you alternate name tables every 2 rows, instead of every row, you can change the scroll every 8 pixels, because you can render the screen like this:

*set scroll to (0, 4)*
NT0, tile row 0, bottom pattern;
NT0, tile row 1, top pattern;
*$2000 write to select right NT*
NT1, tile row 1, bottom pattern;
NT1, tile row 2, top pattern;
*$2000 write to select left NT*
NT0, tile row 2, bottom pattern;
NT0, tile row 3, top pattern;
etc.

EDIT: just noticed you got this part right too, even though I didn't explain it properly before.

Anyway, this is perfectly doable with MMC3 IRQs without wasting much CPU time. And the timing can be very loose too, you have the entire scanline to do that $2000 write, because the NT bit will only take effect when the PPU copies t to v at the end of the scanline.
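
To double-check the layout, this little Python sketch maps a display scanline to the name table, tile row and tile half it shows. It assumes the scroll of (0, 4) and a name table switch every 8 scanlines as listed above, and only covers scanlines that stay within the 30 tile rows; the function name is made up.

```python
def source_for_scanline(s, scroll_y=4):
    """Return (name table, tile row, tile half) shown on display scanline s."""
    nametable = (s // 8) % 2     # the $2000 write flips the NT every 8 scanlines
    y = s + scroll_y             # row inside the selected name table
    tile_row = y // 8
    half = "top" if y % 8 < 4 else "bottom"
    return nametable, tile_row, half


for s in range(0, 24, 4):
    print(s, source_for_scanline(s))
```

Each name table ends up contributing the bottom half of one tile row followed by the top half of the next, so a full 8-scanline band comes from a single NT.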


Last edited by tokumaru on Tue Mar 12, 2019 5:42 pm, edited 1 time in total.
