Posted: Sat Aug 18, 2018 6:38 pm

Joined: Sun Dec 18, 2016 1:11 pm
Posts: 24
So how does the PPU actually convert sprites and background into pixels? Does it directly loop through each individual pixel on each line, then loop through tiles/sprites to calculate the final pixel that should end up in that spot? Or does it loop through each tile or sprite and draw a row of pixels all at once (assuming there isn't a pixel there already)? Both sound like very intensive processes, especially for such old hardware; there are just so many pixels (and layers!). Not to mention the
Code:
byte or word of VRAM data -> bits -> pixel value + palette -> final color
conversion, which sounds time-consuming as well considering the sheer number of pixels on screen. So how is everything blitted onto the screen so efficiently?

Also, clipping: transparent pixels. Is it as simple as
Code:
if (pixel_value != 0) draw_pixel() else continue
? Retro Game Mechanics Explained mentions clipping behavior here (without going into any real detail); it seems to decide which sprite pixels to render and which to "clip away" in a convoluted way.

Finally, compositing (semi-related to above) is something I'm wondering about as well. According to the video I linked above, the various types of graphics are rendered separately. Are objects and background layers all rendered and stored into their own personal "buffers" before compositing? If so, do the "high-priority" bg tiles and sprites have their own separate buffers from low-priority, or are they separated from lower-priority tiles in a different way?

It is also not clear to me whether the compositing process is done per-pixel one after the other, or if each scanline is rendered in its entirety before being composited together.

Can someone help me understand, at a slightly lower level than the video I linked, how the "internal logic" of clipping, drawing, and compositing works, and how it is so efficient?


Posted: Sat Aug 18, 2018 6:56 pm

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 20876
Location: NE Indiana, USA (NTSC)
The S-PPU is in fact too big to fit on one chip using the standard cell process Ricoh had in 1990. So during hblank time, the sprite tiles are retrieved from video memory and composited to a 256x1-pixel buffer in one chip. During draw time, it feeds a stream of sprite pixels to the other PPU and clears the buffer while scanning OAM for sprites to be retrieved next hblank. This other PPU reads tilemap and CHR for four backgrounds, decodes them on the fly through four shift registers, and feeds them and the sprite stream pixel-by-pixel into a 5-input priority encoder. (For comparison, the NES's sprite unit is eight shift registers and an 8-input priority encoder, followed by a 2-input priority encoder to combine them with the background.) This priority encoder is double-pumped, meaning it can return two results per pixel, to be displayed either in the left and right halves of a pixel (hi-res backgrounds) or blended using addition, average, or subtraction.
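
As a rough software analogy for the per-pixel priority selection described above, here is a minimal sketch. The function name `pick_pixel`, the layer labels, and the fixed front-to-back order are all hypothetical; the real chip does this in combinational logic, not a loop, and the actual order depends on the BG mode and the per-tile and per-sprite priority bits.

```python
# Hypothetical model of a priority encoder: given the candidate pixels
# for one screen position, pick the first non-transparent one.
# Palette index 0 is treated as "transparent" for every layer.

def pick_pixel(candidates, backdrop=0):
    """candidates: list of (layer_name, palette_index), front-most first.
    Returns the (layer_name, palette_index) of the winning layer, or
    ("backdrop", backdrop) if every candidate is transparent."""
    for name, index in candidates:
        if index != 0:
            return name, index
    return "backdrop", backdrop

# One pixel: the sprite is transparent here and BG1 is opaque, so BG1 wins.
pixel = pick_pixel([("OBJ", 0), ("BG1", 5), ("BG2", 3), ("BG3", 0), ("BG4", 0)])
print(pixel)  # ('BG1', 5)
```

In hardware all five inputs are examined at once, which is why the number of layers does not slow the per-pixel decision down.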

Sega instead chose to keep its Genesis VDP in one chip, which limited the palette memory it could address, and pushed more advanced features like texture compression and affine mapping out to a video coprocessor that was delayed until the Sega CD.


Posted: Sun Aug 19, 2018 1:29 am

Joined: Fri Feb 16, 2018 5:52 am
Posts: 29
Location: Ukraine
tepples wrote:
This priority encoder is double-pumped, meaning it can return two results per pixel

Does Mode 7 also use double-pumping to get the two multiplication results, or does it use two separate multipliers?

Thanks, very helpful information.


Posted: Sun Aug 19, 2018 5:17 am

Joined: Mon Jan 23, 2006 7:47 am
Posts: 145
ittyBittyByte wrote:
So how does the PPU actually convert sprites and background into pixels? Does it directly loop through each individual pixel on each line, then loop through tiles/sprites to calculate the final pixel that should end up in that spot? Or does it loop through each tile or sprite, draw a row of pixels all at once (assuming there isn't a pixel there already)? Both of those sound like they'd be a very intensive process, especially for such old hardware; there's just so many pixels (and layers!). Not to even mention the byte or word of VRAM data -> bits -> pixel value + palette -> final color conversion, which sounds time-consuming as well considering the sheer amount of pixels on screen. So how is everything blitted onto the screen so efficiently?
There is no framebuffer holding the rendered picture. Each pair of the 512 pixels per line is rendered and output on-the-fly as the electron beam scans over the screen.

Background tiles are loaded from VRAM a few tiles in advance. The sprites that are visible on a line are determined outside of HBLANK and, as tepples said, loaded and rendered into a line buffer (ordered by their index, not their priority) during HBLANK.
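
To illustrate what "ordered by their index, not their priority" means for the sprite line buffer, here is a small sketch. The data layout (`x`, `pixels`, `prio`) and the function `render_sprite_line` are made up for illustration, and the rule that the lower-index sprite keeps a contested pixel is an assumption of this sketch, not something spelled out above.

```python
LINE_WIDTH = 256  # one scanline of sprite pixels

def render_sprite_line(sprites):
    """sprites: list of dicts with 'x' (left edge), 'pixels' (palette
    indices for this line, 0 = transparent) and 'prio' (priority bits),
    already sorted by sprite index.
    Returns LINE_WIDTH entries, each None or (palette_index, prio)."""
    buffer = [None] * LINE_WIDTH
    for sprite in sprites:
        for offset, index in enumerate(sprite["pixels"]):
            x = sprite["x"] + offset
            if not 0 <= x < LINE_WIDTH:
                continue            # off-screen part of the sprite
            if index == 0:
                continue            # transparent pixel, leave the slot alone
            if buffer[x] is None:   # assumed rule: earlier (lower-index) sprite keeps the slot
                buffer[x] = (index, sprite["prio"])
    return buffer

line = render_sprite_line([
    {"x": 10, "pixels": [1, 0, 1], "prio": 0},   # sprite 0
    {"x": 11, "pixels": [2, 2, 2], "prio": 3},   # sprite 1
])
print(line[10:14])  # [(1, 0), (2, 3), (1, 0), (2, 3)]
```

At x = 12 sprite 0 keeps the pixel even though sprite 1 has higher priority bits; those bits only matter later, when the sprite stream is combined with the backgrounds.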

ittyBittyByte wrote:
Also, clipping: transparent pixels. Is it as simple as if (pixel_value != 0) draw_pixel() else continue? Retro Game Mechanics Explained mentions clipping behavior here (without going into any real detail), it seems to decide which sprite pixels to render and which to "clip away" in a convoluted way.
Clipping is not transparency. Transparency means that the output of a background layer renderer itself controls whether its result makes it into the final output. Clipping, on the other hand, makes the pixel output black; it basically controls whether the colors loaded from CGRAM are blocked or not.
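
The difference can be sketched in a few lines. Everything here is a simplified, hypothetical model: `output_color`, the two-layer setup, and the three-entry `cgram` list are invented for illustration, and the real window and color math logic has more inputs than a single `clip` flag.

```python
BLACK = 0x0000

def output_color(front_index, behind_index, cgram, clip):
    """Hypothetical final stage for one pixel.
    Transparency: a front palette index of 0 lets the layer behind through.
    Clipping: the color looked up in CGRAM is blocked and black is output,
    no matter which layer was selected."""
    index = front_index if front_index != 0 else behind_index
    color = cgram[index]
    return BLACK if clip else color

cgram = [0x0000, 0x001F, 0x03E0]  # made-up 15-bit BGR colors: black, red, green
print(hex(output_color(0, 2, cgram, clip=False)))  # 0x3e0
print(hex(output_color(1, 2, cgram, clip=True)))   # 0x0
```

Transparency changes which palette index is chosen; clipping acts after the lookup and only changes what reaches the screen.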

ittyBittyByte wrote:
how is it so efficient?
The amount of data to process is much less than what is usual today: the graphics are mostly just indices into a palette. Transparency is just a palette index of zero, which is very easy to check. Tiles are composed of bitplanes (the 0th bits of all pixels in a row are stored together in one byte, then the 1st bits, the 2nd bits, ...), so for the programmer, expanding a 3bpp tile from ROM to a 4bpp tile in VRAM is as easy as adding one byte per 8-pixel row.
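
Here is a sketch of that 3bpp to 4bpp expansion and of decoding one row of bitplanes into palette indices. The exact byte layout is an assumption of this sketch: 16 bytes of interleaved planes 0 and 1 (two bytes per row), followed by plane 2 at one byte per row for the 3bpp tile, or interleaved planes 2 and 3 for the 4bpp tile. The function names are hypothetical.

```python
def expand_3bpp_to_4bpp(tile):
    """tile: 24 bytes (assumed layout: 16 bytes of planes 0/1 interleaved
    per row, then 8 bytes of plane 2). Returns a 32-byte 4bpp tile."""
    if len(tile) != 24:
        raise ValueError("expected a 24-byte 3bpp tile")
    out = bytearray(tile[:16])          # planes 0 and 1 are copied unchanged
    for plane2 in tile[16:]:
        out += bytes([plane2, 0])       # add one zero byte per row as plane 3
    return bytes(out)

def decode_row_4bpp(tile, row):
    """Return the 8 palette indices of one row of a 32-byte 4bpp tile."""
    planes = [tile[row * 2], tile[row * 2 + 1],
              tile[16 + row * 2], tile[16 + row * 2 + 1]]
    pixels = []
    for bit in range(7, -1, -1):        # bit 7 is the leftmost pixel
        index = 0
        for p, byte in enumerate(planes):
            index |= ((byte >> bit) & 1) << p
        pixels.append(index)
    return pixels

# Row 0: plane 0 = 0x80, plane 1 = 0x40, plane 2 = 0xC0; all other rows empty.
tile3 = bytes([0x80, 0x40] + [0] * 14 + [0xC0] + [0] * 7)
tile4 = expand_3bpp_to_4bpp(tile3)
print(len(tile4), decode_row_4bpp(tile4, 0))  # 32 [5, 6, 0, 0, 0, 0, 0, 0]
```

The inner loop is exactly the kind of work a shift register does for free in hardware: each plane byte is shifted out one bit per pixel, and the four bits that fall out together form the palette index.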

The graphics system is basically just advancing counters and shifting groups of bits around. When speed is important, things are done in parallel by duplicated hardware. For example the 4 background layer pixels for the current screen position are most likely calculated all at once.

Remember that the hardware is basically just one big circuit. The bits of the data stored in hardware registers are hooked up directly to switches (transistors) that open and close certain lines, directing the power flowing from VCC to ground while the system has some time (one of the two clock phases of a master clock cycle) to stabilize the voltages. This is what the BG pixel combiner could look like:

EDIT: incorrect drawing removed


Attachments:
BG combiner (v2).pdf [78.4 KiB]
Downloaded 103 times