OAM cycling on hypothetical 15-sprite PPU with X as priority

tepples
Posts: 22705
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

psycopathicteen wrote:You would only be allowed one copy of an object at once unless you duplicate code.
Not necessarily. You could, for example, statically allocate up to 6 OAM indices to each of eight actor slots (16-63) and 1 to each of fifteen bullet slots (1-15), leaving sprite 0 open for split use. I'm pretty sure Balloon Fight does something like this.

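  ; Calculate this actor's starting index into OAM
  ; (6 sprites * 4 bytes = 24 bytes per actor, starting at sprite 16)
  lda cur_actor
  asl a
  adc cur_actor ; A = cur_actor*3 (carry is clear after ASL)
  asl a
  asl a
  asl a  ; A = cur_actor*24 (at most 168, so carry stays clear)
  adc #64  ; skip sprites 0-15
  tay

  ; Alternate method: table lookup
  ldx cur_actor
  ldy actor_to_oam_index,x

actor_to_oam_index:
  .byte 64, 88, 112, 136, 160, 184, 208, 232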
The advantage of a 1:1 mapping between actor slots and OAM indices is you can save cycles by working more directly with shadow OAM in response to things that rarely change:
  • Only having to change the attribute in shadow OAM when facing direction changes
  • Only having to change the tile number when the object changes to the next cel
  • Only having to change the position when the object moves, especially in non-scrolling games. It'd even be possible to handle camera movement for stationary objects by adding the displacement since last frame to the coordinates of all sprites 1-63.
I was just curious at what point this simplification to save CPU time outweighs the benefits of cycling, and whether that point varies with a PPU's coverage capability and priority policy.
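The 1:1 mapping and the change-only updates described above can be modeled roughly as follows. This is a Python sketch, not engine code: the function names are made up for illustration, the layout (sprite 0 reserved, bullets in 1-15, eight actor slots of 6 sprites from 16 up) is the one from the post, and off-screen wraparound is not handled.

```python
# Model of shadow OAM with a fixed actor-to-OAM mapping (illustrative names).
SPRITE_BYTES = 4            # Y, tile, attribute, X
SPRITES_PER_ACTOR = 6
FIRST_ACTOR_SPRITE = 16

shadow_oam = bytearray([0xFF, 0, 0, 0] * 64)  # Y = $FF keeps a sprite hidden


def actor_oam_offset(actor):
    """Byte offset of an actor slot's first sprite (matches the lookup table)."""
    return (FIRST_ACTOR_SPRITE + actor * SPRITES_PER_ACTOR) * SPRITE_BYTES


def bullet_oam_offset(bullet):
    """Bullet slots 0-14 map to sprites 1-15."""
    return (1 + bullet) * SPRITE_BYTES


def set_facing(actor, flip_h):
    """Touch only the attribute bytes when the facing direction changes."""
    base = actor_oam_offset(actor)
    for i in range(SPRITES_PER_ACTOR):
        attr = base + i * SPRITE_BYTES + 2
        if flip_h:
            shadow_oam[attr] |= 0x40   # bit 6 = horizontal flip
        else:
            shadow_oam[attr] &= 0xBF


def shift_all(dx):
    """Add a displacement to the X byte of sprites 1-63 (sprite 0 untouched).

    Pass the negated camera movement; values simply wrap modulo 256 here.
    """
    for sprite in range(1, 64):
        x = sprite * SPRITE_BYTES + 3
        shadow_oam[x] = (shadow_oam[x] + dx) & 0xFF
```

Nothing here is rebuilt per frame; each routine only runs when the thing it tracks actually changes, which is where the cycle savings would come from.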
rainwarrior
Posts: 8731
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Post by rainwarrior »

TBH I think the CPU cycles question is more of a red herring.

People don't implement OAM cycling as an intentional tradeoff between performance and the ability to flicker. The cycling is considered a necessity for its visual functionality. Performance is just collateral damage.

People who skip OAM cycling aren't doing so to save cycles. They skip it because static OAM is simpler to implement. The games you find that do this aren't generally high-performance games.

Burger Time is an example of a commercial NES game that doesn't do it:
https://www.youtube.com/watch?v=TcPXTwXKkSE

Why doesn't it do it? Well, there are only 6 enemies allowed at once, and they're all 1 tile wide. This leaves 2 tiles for the chef. The falling buns are sprites but given low priority and allowed to drop out, and rightly so, because they are the least important for gameplay (always below the player, quickly returning to being a nametable detail after falling). None of this has anything to do with performance; it was written this way because the game's needs were simple, and implementing static OAM is also simple (and has advantageous priority control).
Oziphantom
Posts: 1565
Joined: Tue Feb 07, 2017 2:03 am

Post by Oziphantom »

I think there are a few techniques getting lumped under one banner, which is causing multiple parties to get confused and talk past each other ;)

Having Entity type X = RAM address Y to me is Static Allocation. It means you only ever have one active at a time, and allocation/update is super fast. So you might have clones in the Entity type table to handle having more than one of the same object. Mario Kart Battle, for example: 4 players and 3 balloons, so fixed alloc it; why bother doing anything else ;)

Having Spawn Entity X @ Sprite Y to me is Fixed Allocation. This is when you walk through your data structure in the level editor and work out where each entity will have its sprites. For example, if I were porting Super Mario Bros. to the C64, I would use this method to spawn enemies, since you can only move forward (the screen never scrolls back), so the trigger order and the number of entities on screen are mostly fixed and known at all points. That lets me handle things that will walk off versus something like a Hammer Bro that is fixed to an area but needs 2 extra sprites for hammers. So I can just put the sprite number in the level data: zero sorting, zero hunting for allocation, and my level editor tool can check and flag instances where I "sprite out". It eats some RAM though, as the level data is now larger.
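As a sketch of the kind of check a level editor tool could run for fixed allocation, here is a small Python model. Everything in it is an assumption for illustration: the `Spawn` record, the idea of a "live" camera range per entity, and the default of 8 sprites (the C64's hardware sprite count).

```python
from collections import namedtuple

# Hypothetical spawn record: the level data carries the first sprite number.
# x_start/x_end is the assumed range of camera positions where the entity is live.
Spawn = namedtuple("Spawn", "name first_sprite count x_start x_end")


def find_sprite_outs(spawns, total_sprites=8):
    """Return (name, other) pairs that would share a sprite while both are live.

    A spawn that runs past the end of the sprite table is reported as (name, None).
    """
    problems = []
    for i, a in enumerate(spawns):
        if a.first_sprite + a.count > total_sprites:
            problems.append((a.name, None))
        for b in spawns[i + 1:]:
            live_together = a.x_start < b.x_end and b.x_start < a.x_end
            share_sprite = (a.first_sprite < b.first_sprite + b.count
                            and b.first_sprite < a.first_sprite + a.count)
            if live_together and share_sprite:
                problems.append((a.name, b.name))
    return problems


level = [
    Spawn("goomba", 0, 1, 0, 300),
    Spawn("hammer_bro", 1, 3, 200, 500),  # 1 body + 2 hammers
    Spawn("koopa", 2, 1, 250, 400),       # overlaps hammer_bro's sprites
]
```

Running `find_sprite_outs(level)` flags the hammer_bro/koopa pair, so the conflict is caught at edit time and the game itself never has to search for a free sprite.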

Having an Entity assigned to a "slot" by the code at allocation time is Bucket Allocation. The idea is that you divide your sprites into groups of, for example, 6 sprites. Then when you spawn an entity you request the next free group, or the next free run of groups if you need n. This speeds up allocation and hunting, and helps combat fragmentation, since when you remove one you can copy the last one into the empty bucket. But it potentially wastes space: if you only need 4 sprites you still alloc 6, so you can get the situation where there are enough free sprites but no free buckets. You might have different buckets or pools, so bullets have their own allocation, as do small entities, large entities, etc.
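A minimal Python sketch of bucket allocation as described, assuming 6-sprite buckets carved out of sprites 16-63. The class and method names are hypothetical, and the compaction step (copying the last bucket into a freed one) is left out.

```python
class BucketAllocator:
    """Hands out fixed-size groups of sprites (buckets), not single sprites."""

    def __init__(self, first_sprite=16, sprite_count=48, bucket_size=6):
        self.first_sprite = first_sprite
        self.bucket_size = bucket_size
        self.owner = [None] * (sprite_count // bucket_size)

    def alloc(self, entity, sprites_needed):
        """Claim a free run of buckets; return its first sprite index or None."""
        # Round up to whole buckets (at least one).
        n = max(1, -(-sprites_needed // self.bucket_size))
        run = 0
        for i, who in enumerate(self.owner):
            run = run + 1 if who is None else 0
            if run == n:
                start = i - n + 1
                for j in range(start, i + 1):
                    self.owner[j] = entity
                return self.first_sprite + start * self.bucket_size
        return None

    def free(self, entity):
        """Release every bucket held by this entity."""
        self.owner = [None if who == entity else who for who in self.owner]
```

The waste shows up quickly: eight entities that each need 4 sprites use only 32 of the 48 sprites, yet a ninth request fails because every bucket is taken.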

There are other special cases like Ring Buffer allocation. For example, Squid Jump uses a ring buffer allocator, as the appearance and disappearance are in a fixed order, so I either add at the head or the tail and then remove at the head or the tail (if you collect a pickup, I just hide the sprite but keep it in the buffer until it is culled as it goes off screen).
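For the ring buffer case, a rough Python model might look like this. It is not taken from Squid Jump; the class name and methods are invented, and a collected pickup would simply stay in its slot (hidden) until it is popped at one end.

```python
class SpriteRing:
    """Ring buffer of sprite slots; entities enter and leave only at the ends."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.head = 0    # index of the oldest live entry
        self.count = 0

    def push_tail(self, entity):
        """Add at the tail; return the slot index, or None when full."""
        if self.count == len(self.slots):
            return None
        idx = (self.head + self.count) % len(self.slots)
        self.slots[idx] = entity
        self.count += 1
        return idx

    def push_head(self, entity):
        """Add at the head; return the slot index, or None when full."""
        if self.count == len(self.slots):
            return None
        self.head = (self.head - 1) % len(self.slots)
        self.slots[self.head] = entity
        self.count += 1
        return self.head

    def pop_head(self):
        """Remove and return the entry at the head, or None when empty."""
        if self.count == 0:
            return None
        entity = self.slots[self.head]
        self.slots[self.head] = None
        self.head = (self.head + 1) % len(self.slots)
        self.count -= 1
        return entity

    def pop_tail(self):
        """Remove and return the entry at the tail, or None when empty."""
        if self.count == 0:
            return None
        idx = (self.head + self.count - 1) % len(self.slots)
        entity = self.slots[idx]
        self.slots[idx] = None
        self.count -= 1
        return entity
```

Allocation and release are both constant time, with no searching, which works only because things appear and disappear in a fixed order.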

As always there are hybrid methods ;) Bullets can be ring buffer while main entities are bucket and special-case enemies are static alloc, etc.

As rainwarrior says, sprite cycling is a separate problem, done because you need it, or not if you don't. You can alloc your OAM one way as per above but not copy it to OAM RAM in that order. Adding an extra step to shuffle the OAM before DMA eats more clocks, but then not having each entity look up where and in what order its sprites are every frame might save you more in the long run. Depends on what your game needs. For example, you could store the OAM striped, so 64 X, 64 Y, 64 Tile, 64 Attribute; this way an entity can modify each of them with a single ,x and no +4 maths to get to the next. Then your code uses an LFSR to step through all 64 in a random order as you copy the striped data to OAM format. Or you use an LFSR to change the "bucket" order when you copy, etc.
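Here is a hedged Python sketch of the striped (one array per field) layout being interleaved into OAM order with an LFSR. Two assumptions are mine, not from the post: a 6-bit Galois LFSR with tap mask 0x30, which cycles through 1..63 and never reaches 0, and keeping sprite 0 fixed in slot 0 so only sprites 1-63 get reordered.

```python
def lfsr6_step(state):
    """One step of a 6-bit Galois LFSR (mask 0x30); visits 1..63, period 63."""
    lsb = state & 1
    state >>= 1
    if lsb:
        state ^= 0x30
    return state


def build_oam(xs, ys, tiles, attrs, seed):
    """Interleave four 64-entry arrays into a 256-byte OAM image.

    Sprite 0 stays in slot 0. Slots 1-63 are filled by walking the LFSR
    from `seed` (1..63), so changing the seed each frame rotates the order.
    """
    order = [0]
    state = seed
    for _ in range(63):
        order.append(state)
        state = lfsr6_step(state)
    oam = bytearray(256)
    for slot, src in enumerate(order):
        base = slot * 4
        oam[base:base + 4] = bytes((ys[src], tiles[src], attrs[src], xs[src]))
    return oam
```

Calling `seed = lfsr6_step(seed)` once per frame would start the walk at a different sprite each time. Note that this gives the same cyclic sequence rotated, not a fresh permutation every frame.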
psycopathicteen
Posts: 3140
Joined: Wed May 19, 2010 6:12 pm

Post by psycopathicteen »

The way I've been doing it is having a routine setting up what order objects are drawn into OAM, and then drawing the sprites in that order. I have 8 priority levels, and within each priority level objects alternate between forward and reverse drawing order.
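A small Python model of that ordering scheme, under one assumption that the post leaves open: the forward/reverse alternation happens per frame, so whichever objects lose out to overflow change from one frame to the next. The function name and the (priority, name) tuples are illustrative only.

```python
def draw_order(objects, frame):
    """Return object names in the order they would be written to OAM.

    objects: list of (priority, name) with priority 0 (drawn first) to 7.
    Within a priority level the order is reversed on odd frames.
    """
    order = []
    for level in range(8):
        group = [name for prio, name in objects if prio == level]
        if frame & 1:
            group.reverse()
        order.extend(group)
    return order


objs = [(1, "a"), (0, "b"), (1, "c"), (0, "d")]
```

With this input, frame 0 gives b, d, a, c and frame 1 gives d, b, c, a: priority levels never mix, but the order inside each level flips.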

If I want to add a pseudo-3D level, I would need to think of a more sophisticated priority system though.
psycopathicteen
Posts: 3140
Joined: Wed May 19, 2010 6:12 pm

Post by psycopathicteen »

Here's a question: is there any correlation between slowdown in games and OAM allocation?
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

If by allocation you mean selecting which slots to use for which sprites, then I don't think it contributes significantly to slowdowns. Processing the metasprite entries themselves might take a good amount of time, as the NES isn't particularly good at bulk data processing. Also, having many sprites on screen usually means that there are many active objects, which will definitely contribute to slowdowns.
psycopathicteen
Posts: 3140
Joined: Wed May 19, 2010 6:12 pm

Post by psycopathicteen »

I never knew OAM allocation was even a thing outside of SMW hacking. VRAM allocation makes more sense because, unlike OAM, you can't update the whole of VRAM in one vblank. If the SNES had VRAM access during the entire frame, I think I would've just DMAed almost everything onscreen every frame, like what I do with the OAM, except for maybe bullets and explosion frames.
Oziphantom
Posts: 1565
Joined: Tue Feb 07, 2017 2:03 am

Post by Oziphantom »

You probably allocate VRAM about 4 times in a game, if that. It's pretty static: screens here, chars there, sprites there. If you change modes then you need to reallocate it, but that is about it. Palettes, on the other hand, probably get moved around a bit. The Sprite tiles you either keep fixed or you have "slots" that you can copy data into, and then frames are copied over the top of the previous frames.
OAM is a constantly changing, highly volatile resource that needs constant management to ensure you can alloc resources as you need for a given frame.

Although you are thinking SNES, this is the NES section, where the VRAM is in ROM and hence very statically allocated ;)
psycopathicteen
Posts: 3140
Joined: Wed May 19, 2010 6:12 pm

Post by psycopathicteen »

Oziphantom wrote:The Sprite tiles you either keep fixed or you have "slots" that you can copy data into, and then frames are copied over the top of the previous frames.
Which can get pretty complicated.
Oziphantom wrote:OAM is constantly changing...
Which is why I rebuild OAM every frame.
tepples
Posts: 22705
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

Oziphantom wrote:Although you are thinking SNES, this is the NES section, where the VRAM is in ROM and hence very statically allocated ;)
Try Haunted: Halloween '85 once. Run its demo with the PPU viewer open and marvel at how it double-buffers enemy cels.