Line plotting engine

hcs
Posts: 31
Joined: Mon Nov 27, 2006 11:34 pm
Location: NYC

Post by hcs »

I've been working on a little line plotting engine for a few days, after leaving it abandoned since August. Here's the current test version: http://hcs64.com/files/vectest3.nes

The square intentionally writes over the triangle; I'm trying to demonstrate that it actually overwrites it every frame. The triangle can be redrawn by pressing select. If you press start you can manually rotate the square with the D-pad. It hits around 5 or 6 FPS with this scene.

12x21 tiles of CHR-RAM are used for each frame. Each tile is used twice: the screen is split so that the left half displays only the low bitplane and the right half only the high bitplane, which is accomplished simply with attribute tables and palettes. This gives a 192x168 pixel area where we can render arbitrary 1-bit images.
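To make the layout concrete, here is a rough Python sketch of how a pixel in the 192x168 area could map to a byte and bit in CHR-RAM. The row-major tile numbering, the pattern table starting at address 0, and the name `pixel_to_chr` are assumptions for illustration; the engine's real layout may differ.

```python
# Sketch only: map a pixel in the 192x168 area to a CHR-RAM byte and bit.
# Assumptions (not from the post): tiles are numbered row-major across the
# 12-tile-wide half, and the pattern table starts at CHR address 0.

TILES_WIDE = 12   # tiles per row within one half of the screen
TILES_HIGH = 21   # tile rows
TILE_BYTES = 16   # 8 bytes of low bitplane followed by 8 bytes of high bitplane


def pixel_to_chr(x, y):
    """Return (chr_address, bit_mask) for pixel (x, y)."""
    if not (0 <= x < 2 * TILES_WIDE * 8 and 0 <= y < TILES_HIGH * 8):
        raise ValueError("pixel outside the 192x168 area")
    plane = 0 if x < TILES_WIDE * 8 else 1    # left half: low plane, right half: high plane
    local_x = x - plane * TILES_WIDE * 8      # x position within that half
    tile = (y // 8) * TILES_WIDE + (local_x // 8)
    address = tile * TILE_BYTES + plane * 8 + (y % 8)
    mask = 0x80 >> (local_x % 8)              # leftmost pixel of a tile row is bit 7
    return address, mask
```

With this numbering, the pixel at (96, 0) lands in the same tile as (0, 0), just in the high bitplane (address 8 instead of 0).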

The main trouble is accessing CHR-RAM. As it can only be safely read and written during NMI, I came up with a scheme to allow the main code to generate a "display list" which would be run during the next NMI. This is written to a ring buffer, which at present takes up almost half of RAM.
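As a rough illustration of the ring buffer idea, the Python sketch below has the main code push command bytes and the NMI side pop them. The post does not describe the actual display list format (later posts suggest it is code-based), so the byte-oriented `RingBuffer` class and its method names are assumptions.

```python
# Sketch only: a byte ring buffer shared between a producer (main code)
# and a consumer (NMI handler). The command format is hypothetical.

class RingBuffer:
    def __init__(self, size):
        self.data = bytearray(size)
        self.size = size
        self.head = 0   # next index the main code writes to
        self.tail = 0   # next index the NMI handler reads from

    def free(self):
        # One slot stays empty so that head == tail always means "empty".
        return (self.tail - self.head - 1) % self.size

    def push(self, command):
        """Append a whole command, or refuse if it does not fit."""
        if len(command) > self.free():
            return False
        for byte in command:
            self.data[self.head] = byte
            self.head = (self.head + 1) % self.size
        return True

    def pop(self, count):
        """Consume up to count bytes, as the NMI side would."""
        available = (self.head - self.tail) % self.size
        out = bytearray()
        for _ in range(min(count, available)):
            out.append(self.data[self.tail])
            self.tail = (self.tail + 1) % self.size
        return bytes(out)
```

When `push` returns False the main code has to wait for the next NMI to drain the buffer, which is where a small buffer costs speed.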

I made a decision early on not to use WRAM, so it is not possible to store the bitmap in CPU-accessible RAM for buffering purposes. Thus the NMI routine must:
- set the VRAM address
- read the currently set pixels
- perform the set or clear operation
- store this somewhere temporarily (since VRAM is accessed serially)
- set the address back to the start
- write the stored data back
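The steps above can be sketched in Python against a toy model of the PPU data port. `FakePPU` and `update_tile_rows` are hypothetical names; the model only captures the single auto-incrementing address and the buffered (one read late) behavior of VRAM reads, not timing.

```python
# Sketch only: read-modify-write through a serial VRAM port.

class FakePPU:
    """Toy model: one address register, auto-increment by 1, buffered reads."""

    def __init__(self):
        self.vram = bytearray(0x2000)
        self.addr = 0
        self.read_buffer = 0

    def set_address(self, addr):
        self.addr = addr % len(self.vram)

    def read(self):
        value = self.read_buffer                 # returns the previous fetch
        self.read_buffer = self.vram[self.addr]
        self.addr = (self.addr + 1) % len(self.vram)
        return value

    def write(self, value):
        self.vram[self.addr] = value & 0xFF
        self.addr = (self.addr + 1) % len(self.vram)


def update_tile_rows(ppu, addr, set_masks, clear_masks):
    """Set and clear pixels in consecutive tile rows starting at addr."""
    ppu.set_address(addr)                        # set the VRAM address
    ppu.read()                                   # discard the stale buffered byte
    temp = [ppu.read() for _ in set_masks]       # read the currently set pixels
    temp = [(b | s) & ~c & 0xFF                  # perform the set or clear operation
            for b, s, c in zip(temp, set_masks, clear_masks)]
    ppu.set_address(addr)                        # set the address back to the start
    for b in temp:                               # write the stored data back
        ppu.write(b)
```

The `temp` list stands in for the temporary storage the NMI routine needs because the same port cannot be read and written at one address without resetting it.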

There are quite a few optimizations on this theme. For instance, if we are updating a dirty tile we can update bits and perform the necessary copy simultaneously. I have a system that tracks, for each tile, whether it was updated on the even or odd frame and how many primitives currently occupy that tile, so that when the last prim is removed I can do a fast clear of the tile.
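A minimal Python sketch of the per-tile bookkeeping, assuming one counter and one parity flag per tile. The names `TileTracker`, `add_prim` and `remove_prim` are made up here, and how the engine actually uses the even/odd flag is not spelled out in the post, so the sketch only records it.

```python
# Sketch only: per-tile primitive counts with a fast-clear decision.

class TileState:
    def __init__(self):
        self.prim_count = 0   # primitives currently occupying this tile
        self.parity = None    # 0 = last updated on an even frame, 1 = odd


class TileTracker:
    def __init__(self, tile_count):
        self.tiles = [TileState() for _ in range(tile_count)]

    def add_prim(self, tile, frame):
        state = self.tiles[tile]
        state.prim_count += 1
        state.parity = frame & 1
        return "update"       # pixels must be merged into the tile

    def remove_prim(self, tile, frame):
        state = self.tiles[tile]
        if state.prim_count == 0:
            raise ValueError("tile has no primitives to remove")
        state.prim_count -= 1
        state.parity = frame & 1
        # Nothing left in the tile: skip the read-modify-write and just zero it.
        return "fast_clear" if state.prim_count == 0 else "update"
```

The fast clear matters because zeroing a tile needs only writes, while erasing one primitive from a shared tile needs the full read, modify, write sequence.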

My end goal for the moment is to get a bit more speed (already 30% faster than two days ago) and do something in the style of the old Windows Mystify screensaver.

Source for NESHLA is at https://github.com/hcs64/Nestify/

---
Current ROM http://hcs64.com/files/nestify7.nes
Last edited by hcs on Fri Jan 27, 2012 12:35 am, edited 4 times in total.
tepples
Posts: 22705
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

Wireframe with dirty tiles? Sounds like Ian Bell's tank demo.

Mystify screensaver? Sounds like Qix.
hcs
Posts: 31
Joined: Mon Nov 27, 2006 11:34 pm
Location: NYC

Post by hcs »

Cool, I didn't realize there was source available for the Tank Demo. I'll hold off on looking at it for a while, but it should be fun to see what I've missed.
hcs
Posts: 31
Joined: Mon Nov 27, 2006 11:34 pm
Location: NYC

Post by hcs »

Very little progress with speed, but I do have the full Mystify effect running:
http://hcs64.com/files/nestify1.nes

Pretty awful speed, gets under 1 FPS regularly. But I did what I set out to do lo those months ago.
thefox
Posts: 3134
Joined: Mon Jan 03, 2005 10:36 am
Location: 🇫🇮

Post by thefox »

Not bad for not using WRAM (pretty sure Tank Demo does).
hcs
Posts: 31
Joined: Mon Nov 27, 2006 11:34 pm
Location: NYC

Post by hcs »

Nope; though Elite uses WRAM, the Tank Demo does not. It also doesn't persist stuff across frames, though, which is where a lot of the unavoidable CHR-RAM access comes from.
tepples
Posts: 22705
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

You technically don't need to persist stuff across frames if you just redraw the four polygons from scratch every frame.
hcs
Posts: 31
Joined: Mon Nov 27, 2006 11:34 pm
Location: NYC

Post by hcs »

The "lines erase each other" behavior is part of the effect I'm going for.

If I wanted non-interfering lines, I'd have to be a lot smarter than I am now to redraw without the last 3 polygons still having to pull from CHR-RAM to merge blocks. So unless I can draw the 4 lines of each side simultaneously (which is worth looking into for other uses), I don't get much advantage, and I'd additionally need to redraw these 16 long lines each frame. Real wireframe has the advantage of fewer intersections (and near-intersections), and the majority (corners) are completely predictable, so when drawing you can hold onto the corner between lines. That is something I ought to do as well, but corners are just a small part of the performance issues here.

Which is not to say that I shouldn't do all this stuff much faster than it is currently done.
Kreese
Posts: 65
Joined: Sat Sep 22, 2007 3:42 pm

Post by Kreese »

Cool. Do you actually plot dots, or is it tile swapping?

I want to see some line vector cube, hidden surface! Add some sprites and music too. :)
tepples
Posts: 22705
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

What the tank demo does is store a copy of the nametable in RAM and then use that for allocating tiles in each frame. Before plotting each pixel, it looks in the RAM copy of the nametable to check whether the tiles are already allocated, and if not, allocates the next numbered tile and writes it back to the nametable. Then during what appears to be forced blanking, it rewrites the nametable and all used tiles using some sort of double buffering technique.
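The allocation step described above might look roughly like this in Python. This is a sketch of the idea as described in this post, not code taken from the tank demo; the `TileAllocator` class, the use of tile 0 as a shared blank tile, and the 256-tile limit are assumptions.

```python
# Sketch only: allocate tiles on demand through a RAM copy of the nametable.

EMPTY = 0  # assumption: tile 0 is a shared blank tile that is never drawn into


class TileAllocator:
    def __init__(self, width=32, height=30):
        self.width = width
        self.height = height
        self.new_frame()

    def new_frame(self):
        """Start a frame with every cell pointing at the blank tile."""
        self.nametable = [EMPTY] * (self.width * self.height)  # RAM copy
        self.next_tile = 1

    def tile_for_pixel(self, x, y):
        """Return the tile that holds pixel (x, y), allocating if needed."""
        cell = (y // 8) * self.width + (x // 8)
        if self.nametable[cell] == EMPTY:
            if self.next_tile > 255:
                raise RuntimeError("out of tiles for this frame")
            self.nametable[cell] = self.next_tile  # written back to the RAM copy
            self.next_tile += 1
        return self.nametable[cell]
```

Only cells that a line actually passes through consume a tile, so a sparse wireframe scene uses far fewer tiles than a full static bitmap.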
hcs
Posts: 31
Joined: Mon Nov 27, 2006 11:34 pm
Location: NYC

Post by hcs »

Kreese wrote: Cool. Do you actually plot dots, or is it tile swapping?
I want to see some line vector cube, hidden surface! Add some sprites and music too. :)

It's all naive Bresenham dot drawing, which is again part of the problem. I'd like to flesh it out more fully (some music would be fun), but it's already running way too slow at the moment. I don't think sprites would mix well with the ridiculous framerate, either. I considered a cube but it just didn't seem that exciting.
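For reference, a textbook integer Bresenham line in Python is shown below. It is the standard all-octant formulation, not the engine's NESHLA routine, and the function name `bresenham` is just for illustration.

```python
# Sketch only: standard integer Bresenham, one plotted dot per step.

def bresenham(x0, y0, x1, y1):
    """Return the list of (x, y) pixels on the line, endpoints included."""
    points = []
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy                 # combined error term for both axes
    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:              # step in x
            err += dy
            x0 += sx
        if e2 <= dx:              # step in y
            err += dx
            y0 += sy
    return points
```

Plotting dot by dot is what makes this "naive" here: each dot can mean a separate CHR-RAM update unless dots that fall in the same tile row are batched.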
tepples wrote: What the tank demo does is store a copy of the nametable in RAM and then use that for allocating tiles in each frame. Before plotting each pixel, it looks in the RAM copy of the nametable to check whether the tiles are already allocated, and if not, allocates the next numbered tile and writes it back to the nametable. Then during what appears to be forced blanking, it rewrites the nametable and all used tiles using some sort of double buffering technique.

I implemented it with a totally static nametable on the assumption that it would just be more trouble to reallocate everything. Maybe I need to reconsider that.

My current scheme is to ship tiles out as soon as possible, but that is counterproductive when I just need to read them in again. It may make sense to ditch the code-based ring buffer for a tile- (or line-)based LRU, and have the eviction and writing (until the end of the frame) handled directly during vblank. This would make more efficient use of memory, but at the cost of more precious vblank cycles. I don't like the potential for thrashing, though it can't be much worse than what I have now, and when the working set is under the cache size it would be much better.
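A tile-based LRU along those lines could be sketched in Python as follows. The `TileCache` class and its `read_tile`/`write_tile` callbacks are hypothetical; on the NES the read and write-back would have to be queued for vblank rather than called immediately as they are here.

```python
# Sketch only: a small LRU cache of tiles held in CPU RAM.
from collections import OrderedDict


class TileCache:
    def __init__(self, capacity, read_tile, write_tile):
        self.capacity = capacity
        self.read_tile = read_tile      # fetch a tile's bytes from CHR-RAM
        self.write_tile = write_tile    # store a tile's bytes to CHR-RAM
        self.entries = OrderedDict()    # tile number -> bytearray, oldest first

    def get(self, tile):
        """Return the cached, editable copy of a tile, loading it if needed."""
        if tile in self.entries:
            self.entries.move_to_end(tile)          # mark as most recently used
            return self.entries[tile]
        if len(self.entries) >= self.capacity:
            old_tile, old_data = self.entries.popitem(last=False)
            self.write_tile(old_tile, old_data)     # evict least recently used
        data = bytearray(self.read_tile(tile))
        self.entries[tile] = data
        return data

    def flush(self):
        """Write every cached tile back, e.g. at the end of a frame."""
        for tile, data in self.entries.items():
            self.write_tile(tile, data)
        self.entries.clear()
```

The thrashing case is a scene that touches more tiles per frame than `capacity`, where every `get` evicts a tile that is about to be needed again.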

Maybe I should go further in the direction of cycle counting and allow the line drawing to happen opportunistically in vblank? So many choices...
hcs
Posts: 31
Joined: Mon Nov 27, 2006 11:34 pm
Location: NYC

Post by hcs »

Got another 10% or so out of it with a simplified dlist buffering strategy. This way, whatever is available when the NMI hits gets used, and a BRK is used to easily tell how far it was able to execute. Still a shambling monstrosity.

http://hcs64.com/files/nestify2.nes

Interestingly, it is actually slower (4%) on the rotating square benchmark. This is probably due to it wasting time once the vblank has filled up, whereas it used to be able to run ahead and start filling the next vblank. It does generally run faster on Mystify; more testing shows it is 13% faster. It seems like I need to bring back the old multiple buffering while still allowing an incomplete dlist to run.

---

All around faster with a more flexible buffering method.
http://hcs64.com/files/nestify3.nes
hcs
Posts: 31
Joined: Mon Nov 27, 2006 11:34 pm
Location: NYC

Post by hcs »

Just when I was about to give up, I fixed some ugly bugs (should always use bcc/bcs for unsigned compares), which helped both stability and speed; it's almost tolerable now. I also made some modifications to the scheme I was using for opportunistic processing, and turned off the PPU while updating, to be safe and not screw everything up if I wind up a few cycles over.
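To show why bcc/bcs matter for unsigned compares, the Python sketch below models the flags that a 6502 CMP sets and compares branching on carry with branching on the negative flag. The post does not say which branch was originally used, so BMI is only an assumed example of the kind of mistake involved, and the function names are made up.

```python
# Sketch only: flags produced by 6502 CMP for 8-bit operands.

def cmp_flags(a, m):
    """Return (carry, negative, zero) as set by CMP with A = a and memory = m."""
    diff = (a - m) & 0xFF
    carry = a >= m                  # C is set when A >= M, treating both as unsigned
    negative = bool(diff & 0x80)    # N is just bit 7 of the 8-bit difference
    zero = diff == 0
    return carry, negative, zero


def less_than_bcc(a, m):
    """'A < M' decided the way BCC would after CMP (carry clear)."""
    return not cmp_flags(a, m)[0]


def less_than_bmi(a, m):
    """'A < M' decided the way BMI would after CMP (negative set)."""
    return cmp_flags(a, m)[1]
```

For A = 10 and M = 200 the carry test correctly reports "less than", while the negative-flag test does not, because the 8-bit difference wraps around to 66.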

http://hcs64.com/files/nestify4.nes
Bregalad
Posts: 8055
Joined: Fri Nov 12, 2004 2:49 pm
Location: Divonne-les-bains, France

Post by Bregalad »

Nice,
what are the differences between this and the "Tank Demo by Ian Bell" that already exists on the NESdev main page? I think your demo "only" updates the pattern table while Ian Bell's updates both the pattern table and the name table, but I'm not sure how they made it that fast.
tepples
Posts: 22705
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

The secret is that the tank demo's tiles are double buffered, and any blank tile is never written to VRAM.