
All times are UTC - 7 hours





 Post subject: Line plotting engine
Posted: Thu Dec 29, 2011 11:20 am

Joined: Mon Nov 27, 2006 11:34 pm
Posts: 31
Location: NYC
I've been working on a little line plotting engine for the past few days, after it had sat abandoned since August. Here's the current test version: http://hcs64.com/files/vectest3.nes

The square intentionally draws over the triangle; I'm trying to demonstrate that it really is overwriting it every frame. The triangle can be redrawn by pressing Select. If you press Start you can manually rotate the square with the D-pad. It hits around 5 or 6 FPS with this scene.

12x21 tiles of CHR-RAM are used for each frame. Each tile is used twice: the screen is split so that the left half displays only the low bitplane and the right half only the high bitplane, which is accomplished simply with attribute tables and palettes. This gives a 192x168 pixel area (24x21 tiles on screen) where we can render arbitrary 1-bit images.
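To make the bitplane trick concrete, here is a rough C model of it (not taken from the Nestify source). An NES tile stores 2 bits per pixel as two 8-byte planes, and the palette decides which of the four pixel values look "lit". The palette tables and the names `Tile`, `PAL_LEFT`, `PAL_RIGHT` and `pixel_lit` are made up for illustration.

```c
#include <stdint.h>

/* An NES tile is 16 bytes: 8 rows of the low bitplane, then 8 rows of the high one. */
typedef struct { uint8_t lo[8]; uint8_t hi[8]; } Tile;

/* Hypothetical palettes, indexed by the 2-bit pixel value: 1 = lit, 0 = background. */
const uint8_t PAL_LEFT[4]  = {0, 1, 0, 1}; /* lit whenever bit 0 (low plane) is set  */
const uint8_t PAL_RIGHT[4] = {0, 0, 1, 1}; /* lit whenever bit 1 (high plane) is set */

/* Returns 1 if pixel (x, y) of the tile appears lit under the given palette. */
int pixel_lit(const Tile *t, int x, int y, const uint8_t pal[4])
{
    int lo = (t->lo[y] >> (7 - x)) & 1;
    int hi = (t->hi[y] >> (7 - x)) & 1;
    return pal[(hi << 1) | lo];
}
```

The same tile number can then appear once in each half of the nametable, with the attribute table picking the left or right palette, so the two planes behave like two independent 1-bit tiles.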

The main trouble is accessing CHR-RAM. As it can only be safely read and written during vblank (that is, from the NMI handler), I came up with a scheme that lets the main code generate a "display list" to be run during the next NMI. This is written to a ring buffer, which at present takes up almost half of RAM.
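A minimal sketch of the producer/consumer idea, in C rather than NESHLA. It queues plain "write this byte at this VRAM address" records, which may well differ from how the actual engine encodes its list; `Ring`, `ring_push`, `ring_run` and the 256-byte size are all assumptions for the example.

```c
#include <stdint.h>

#define RING_SIZE 256u  /* power of two, so indices wrap with a mask */
#define RING_MASK (RING_SIZE - 1)

typedef struct {
    uint8_t  data[RING_SIZE];
    unsigned head;  /* next byte the vblank handler will consume */
    unsigned tail;  /* next free byte for the main loop to fill  */
} Ring;

unsigned ring_used(const Ring *r) { return (r->tail - r->head) & RING_MASK; }

/* Main loop: queue "write `value` at VRAM address `addr`". Returns 0 if full. */
int ring_push(Ring *r, uint16_t addr, uint8_t value)
{
    if (ring_used(r) + 3 >= RING_SIZE) return 0;  /* never fill completely */
    r->data[r->tail] = (uint8_t)(addr >> 8);   r->tail = (r->tail + 1) & RING_MASK;
    r->data[r->tail] = (uint8_t)(addr & 0xFF); r->tail = (r->tail + 1) & RING_MASK;
    r->data[r->tail] = value;                  r->tail = (r->tail + 1) & RING_MASK;
    return 1;
}

/* Vblank handler: run at most `budget` records against `vram` (which must be
   large enough for every queued address). Returns how many records ran. */
int ring_run(Ring *r, uint8_t *vram, int budget)
{
    int done = 0;
    while (done < budget && ring_used(r) >= 3) {
        uint16_t addr = (uint16_t)(r->data[r->head] << 8 |
                                   r->data[(r->head + 1) & RING_MASK]);
        vram[addr] = r->data[(r->head + 2) & RING_MASK];
        r->head = (r->head + 3) & RING_MASK;
        done++;
    }
    return done;
}
```

The `budget` argument stands in for the limited vblank time: whatever does not fit simply stays queued for the next NMI.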

I made a decision early on not to use WRAM, so it is not possible to store the bitmap in CPU-accessible RAM for buffering purposes. Thus the NMI routine must:
- set the VRAM address
- read the currently set pixels
- perform the set or clear operation
- store this somewhere temporarily (since VRAM is accessed serially)
- set the address back to the start
- write the stored data back
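
The steps above can be modelled roughly like this in C. The `Ppu` struct is a toy stand-in for the $2006/$2007 port (auto-increment by 1, plus the one-access read delay that real $2007 reads have below the palette range); `rmw_set` and the other names are invented for the sketch and are not from the Nestify source.

```c
#include <stdint.h>

/* Toy model of the PPU's VRAM port ($2006 address, $2007 data). */
typedef struct {
    uint8_t  mem[0x2000]; /* pattern-table space (CHR-RAM) */
    uint16_t addr;        /* current VRAM address, auto-increments by 1 */
    uint8_t  readbuf;     /* $2007 reads are delayed by one access */
} Ppu;

void ppu_set_addr(Ppu *p, uint16_t a) { p->addr = a & 0x1FFF; }

uint8_t ppu_read(Ppu *p)
{
    uint8_t out = p->readbuf;            /* returns the previously buffered byte */
    p->readbuf = p->mem[p->addr];
    p->addr = (p->addr + 1) & 0x1FFF;
    return out;
}

void ppu_write(Ppu *p, uint8_t v)
{
    p->mem[p->addr] = v;
    p->addr = (p->addr + 1) & 0x1FFF;
}

/* OR mask[i] into n consecutive bytes starting at addr (n <= 8). */
void rmw_set(Ppu *p, uint16_t addr, const uint8_t *mask, int n)
{
    uint8_t tmp[8];
    int i;
    ppu_set_addr(p, addr);               /* set the VRAM address */
    (void)ppu_read(p);                   /* dummy read primes the read buffer */
    for (i = 0; i < n; i++)
        tmp[i] = ppu_read(p) | mask[i];  /* read, set bits, hold in CPU RAM */
    ppu_set_addr(p, addr);               /* set the address back to the start */
    for (i = 0; i < n; i++)
        ppu_write(p, tmp[i]);            /* write the stored data back */
}
```

A clear operation would be the same loop with `& ~mask[i]` in place of `| mask[i]`.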

There are quite a few optimizations on this theme. For instance, if we are updating a dirty tile we can update bits and perform the necessary copy simultaneously. I have a system that tracks, for each tile, whether it was updated on the even or odd frame and how many primitives currently occupy that tile, so that when the last prim is removed I can do a fast clear of the tile.
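
A guess at what that per-tile bookkeeping could look like, sketched in C. The field layout and the names `TileInfo`, `tile_add_prim` and `tile_remove_prim` are assumptions; only the idea (frame parity plus a primitive count, with a fast clear when the count hits zero) comes from the description above.

```c
#include <stdint.h>

typedef struct {
    uint8_t prims;       /* primitives currently drawn into this tile */
    uint8_t last_frame;  /* parity (0 or 1) of the frame that last touched it */
} TileInfo;

typedef enum { TILE_ERASE_PIXELS, TILE_FAST_CLEAR } EraseKind;

/* A primitive starts occupying this tile on the given frame. */
void tile_add_prim(TileInfo *t, unsigned frame)
{
    t->prims++;
    t->last_frame = (uint8_t)(frame & 1);
}

/* A primitive leaves this tile. If it was the last one, the whole tile can be
   zeroed without reading it back; otherwise its pixels must be erased one by one. */
EraseKind tile_remove_prim(TileInfo *t, unsigned frame)
{
    t->last_frame = (uint8_t)(frame & 1);
    if (t->prims > 0) t->prims--;
    return t->prims == 0 ? TILE_FAST_CLEAR : TILE_ERASE_PIXELS;
}
```

The fast path matters because a blind 8-byte zero fill skips the read half of the read-modify-write cycle entirely.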

My end goal for the moment is to get a bit more speed (already 30% faster than two days ago) and do something in the style of the old Windows Mystify screensaver.

Source for NESHLA is at https://github.com/hcs64/Nestify/

---
Current ROM http://hcs64.com/files/nestify7.nes


Last edited by hcs on Fri Jan 27, 2012 12:35 am, edited 4 times in total.

 
 Post subject:
Posted: Thu Dec 29, 2011 1:54 pm

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19113
Location: NE Indiana, USA (NTSC)
Wireframe with dirty tiles? Sounds like Ian Bell's tank demo.

Mystify screensaver? Sounds like Qix.


 
 Post subject:
Posted: Thu Dec 29, 2011 2:17 pm

Joined: Mon Nov 27, 2006 11:34 pm
Posts: 31
Location: NYC
Cool, I didn't realize there was source available for the Tank Demo. I'll hold off on looking at it for a while, but it should be fun to see what I've missed.


 
 Post subject:
Posted: Thu Dec 29, 2011 7:26 pm

Joined: Mon Nov 27, 2006 11:34 pm
Posts: 31
Location: NYC
Very little progress with speed, but I do have the full Mystify effect running:
http://hcs64.com/files/nestify1.nes

Pretty awful speed, gets under 1 FPS regularly. But I did what I set out to do lo those months ago.


 
 Post subject:
Posted: Thu Dec 29, 2011 7:57 pm

Joined: Mon Jan 03, 2005 10:36 am
Posts: 2962
Location: Tampere, Finland
Not bad for not using WRAM (pretty sure Tank Demo does).

_________________
Download STREEMERZ for NES from fauxgame.com! — Some other stuff I've done: kkfos.aspekt.fi


 
 Post subject:
Posted: Thu Dec 29, 2011 8:02 pm

Joined: Mon Nov 27, 2006 11:34 pm
Posts: 31
Location: NYC
Nope; though Elite uses WRAM, the Tank Demo does not. It also doesn't persist stuff across frames, though, which is where a lot of the unavoidable CHR-RAM access comes from.


 
 Post subject:
Posted: Thu Dec 29, 2011 8:54 pm

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19113
Location: NE Indiana, USA (NTSC)
You technically don't need to persist stuff across frames if you just redraw the four polygons from scratch every frame.


 
 Post subject:
Posted: Fri Dec 30, 2011 1:51 am

Joined: Mon Nov 27, 2006 11:34 pm
Posts: 31
Location: NYC
The "lines erase each other" behavior is part of the effect I'm going for.

If I wanted non-interfering lines, I'd have to be a lot smarter than I am now to redraw without the last 3 polygons still having to pull from CHR-RAM to merge blocks. So unless I can draw the 4 lines of each side simultaneously (which is worth looking into for other uses), I don't get much advantage, and additionally I'd need to redraw these 16 long lines each frame. Real wireframe has the advantage of fewer intersections (and near-intersections), and the majority of them (the corners) are completely predictable, so when drawing you can hold onto the corner between lines. That is something I ought to do as well, but corners are just a small part of the performance issues here.

Which is not to say that I shouldn't do all this stuff much faster than it is currently done.


 
 Post subject:
Posted: Fri Dec 30, 2011 4:05 am

Joined: Sat Sep 22, 2007 3:42 pm
Posts: 65
Cool. Do you actually plot dots, or is it tile swapping?

I want to see some line vector cube, hidden surface! Add some sprites and music too. :)


 
 Post subject:
Posted: Fri Dec 30, 2011 6:43 am

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19113
Location: NE Indiana, USA (NTSC)
What the tank demo does is store a copy of the nametable in RAM and then use that for allocating tiles in each frame. Before plotting each pixel, it looks in the RAM copy of the nametable to check whether the tiles are already allocated, and if not, allocates the next numbered tile and writes it back to the nametable. Then during what appears to be forced blanking, it rewrites the nametable and all used tiles using some sort of double buffering technique.
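
A rough C sketch of the allocate-on-first-touch scheme as described in this post; it is not taken from the tank demo's source. Reserving tile 0 as the shared blank tile, and the names `TileAlloc`, `alloc_reset` and `alloc_tile_for_pixel`, are assumptions made for the example.

```c
#include <stdint.h>
#include <string.h>

#define NT_W 32
#define NT_H 30
#define NO_TILE 0   /* assumption: tile 0 is reserved as the shared blank tile */

typedef struct {
    uint8_t nametable[NT_H][NT_W]; /* RAM copy of the nametable */
    uint8_t next_tile;             /* next unallocated tile number */
} TileAlloc;

/* Start of frame: every cell points at the blank tile again. */
void alloc_reset(TileAlloc *a)
{
    memset(a->nametable, NO_TILE, sizeof a->nametable);
    a->next_tile = 1;
}

/* Returns the tile number covering pixel (x, y), allocating the next numbered
   tile on first touch. Returns NO_TILE once all 255 tiles are taken. */
uint8_t alloc_tile_for_pixel(TileAlloc *a, int x, int y)
{
    uint8_t *cell = &a->nametable[y / 8][x / 8];
    if (*cell == NO_TILE) {
        if (a->next_tile == 0) return NO_TILE; /* counter wrapped: out of tiles */
        *cell = a->next_tile++;
    }
    return *cell;
}
```

Cells that are never touched keep pointing at the blank tile, so only tiles that actually contain pixels ever need uploading.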


 
 Post subject:
Posted: Fri Dec 30, 2011 9:07 am

Joined: Mon Nov 27, 2006 11:34 pm
Posts: 31
Location: NYC
Kreese wrote:
Cool. Do you actually plot dots, or is it tile swapping?
I want to see some line vector cube, hidden surface! Add some sprites and music too. :)

It's all naive Bresenham dot drawing, which is again part of the problem. I'd like to flesh it out more fully (some music would be fun), but it's already running way too slow at the moment. I don't think sprites would mix well with the ridiculous framerate, either. I considered a cube but it just didn't seem that exciting.
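
For reference, the textbook integer Bresenham loop plotting into a plain 1-bit bitmap looks something like the following in C. It uses a linear 192x168 buffer in CPU memory purely for illustration (the real engine has no such buffer and goes through CHR-RAM); `bitmap`, `plot` and `line` are names made up for the sketch.

```c
#include <stdint.h>
#include <stdlib.h>

#define W 192
#define H 168

uint8_t bitmap[H][W / 8];  /* 1 bit per pixel, MSB = leftmost pixel */

/* Set one pixel, ignoring anything off-screen. */
void plot(int x, int y)
{
    if (x < 0 || x >= W || y < 0 || y >= H) return;
    bitmap[y][x >> 3] |= (uint8_t)(0x80 >> (x & 7));
}

/* All-octant Bresenham: one plot() call per pixel, no multiplies or divides. */
void line(int x0, int y0, int x1, int y1)
{
    int dx = abs(x1 - x0), sx = x0 < x1 ? 1 : -1;
    int dy = -abs(y1 - y0), sy = y0 < y1 ? 1 : -1;
    int err = dx + dy;
    for (;;) {
        plot(x0, y0);
        if (x0 == x1 && y0 == y1) break;
        int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }  /* step in x */
        if (e2 <= dx) { err += dx; y0 += sy; }  /* step in y */
    }
}
```

The "naive" part is that every pixel is its own read-modify-write; batching all the pixels that land in the same byte or tile is the obvious next step.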

tepples wrote:
What the tank demo does is store a copy of the nametable in RAM and then use that for allocating tiles in each frame. Before plotting each pixel, it looks in the RAM copy of the nametable to check whether the tiles are already allocated, and if not, allocates the next numbered tile and writes it back to the nametable. Then during what appears to be forced blanking, it rewrites the nametable and all used tiles using some sort of double buffering technique.

I implemented it with a totally static nametable on the assumption that it would just be more trouble to reallocate everything. Maybe I need to reconsider that.

My current scheme is to ship tiles out as soon as possible, but that is counterproductive when I just need to read them back in again. It may make sense to ditch the code-based ring buffer for a tile-based (or line-based) LRU, and have the eviction and writing (until the end of the frame) handled directly during vblank. This would make more efficient use of memory, but at the cost of more precious vblank cycles. I don't like the potential for thrashing, though it can't be much worse than what I have now, and when the working set is under the cache size it would be much better.
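
One way the tile-based LRU idea could be sketched, in C. This only tracks which tile lives in which slot and which one gets evicted; the pixel data and the actual vblank write-back are left out. The slot count, the timestamp-based eviction, and the names `TileCache`, `cache_init` and `cache_touch` are all assumptions.

```c
#include <stdint.h>

#define CACHE_SLOTS 4
#define EMPTY 0xFFFFu

typedef struct {
    uint16_t tile[CACHE_SLOTS];   /* which tile each slot holds, or EMPTY */
    uint32_t stamp[CACHE_SLOTS];  /* last-use time, larger = more recent  */
    uint32_t clock;
} TileCache;

void cache_init(TileCache *c)
{
    int i;
    for (i = 0; i < CACHE_SLOTS; i++) { c->tile[i] = EMPTY; c->stamp[i] = 0; }
    c->clock = 0;
}

/* Returns the slot now holding `tile`. On a miss, *evicted receives the tile
   that was pushed out (to be written to CHR-RAM at the next vblank), or EMPTY
   if a free slot was used. On a hit, *evicted is EMPTY. */
int cache_touch(TileCache *c, uint16_t tile, uint16_t *evicted)
{
    int i, victim = 0;
    *evicted = EMPTY;
    for (i = 0; i < CACHE_SLOTS; i++) {
        if (c->tile[i] == tile) { c->stamp[i] = ++c->clock; return i; }
        if (c->stamp[i] < c->stamp[victim]) victim = i;  /* oldest so far */
    }
    *evicted = c->tile[victim];
    c->tile[victim] = tile;
    c->stamp[victim] = ++c->clock;
    return victim;
}
```

On a real NES the 32-bit timestamps would be too expensive; a small linked list or a clock-hand approximation would probably be the practical choice.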

Maybe I should go further in the direction of cycle counting and allow the line drawing to happen opportunistically in vblank? So many choices...


 
 Post subject:
Posted: Mon Jan 02, 2012 5:52 am

Joined: Mon Nov 27, 2006 11:34 pm
Posts: 31
Location: NYC
Got another 10% or so out of it with a simplified dlist buffering strategy. This way, whatever is available when the NMI hits gets used, and a BRK is used to easily tell how far it was able to execute. Still a shambling monstrosity.

http://hcs64.com/files/nestify2.nes

Interestingly, it is actually slower (by 4%) on the rotating square benchmark. This is probably due to it wasting time once the vblank has filled up, whereas it used to be able to run ahead and start filling the next vblank. It does generally run faster on Mystify; more testing shows it is faster by 13%. It seems like I need to bring back the old multiple buffering while still allowing an incomplete dlist to run.

---

All around faster with a more flexible buffering method.
http://hcs64.com/files/nestify3.nes


 
 Post subject:
Posted: Tue Jan 03, 2012 5:02 pm

Joined: Mon Nov 27, 2006 11:34 pm
Posts: 31
Location: NYC
Just when I was about to give up, I fixed some ugly bugs (should always use bcc/bcs for unsigned compares), which helped both stability and speed; it's almost tolerable now. I also made some modifications to the scheme I was using for opportunistic processing, and turned off the PPU while updating, to be safe and not screw everything up if I wind up a few cycles over.
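
To show why bcc/bcs matter, here is a small C model of the flags a 6502 CMP produces. The post doesn't say which branch the buggy code actually used; branching on the negative flag (BMI) is assumed here as one common mistake. `cmp6502`, `less_unsigned` and `less_via_bmi` are names invented for the sketch.

```c
#include <stdint.h>

typedef struct { int carry; int negative; int zero; } Flags;

/* Models the flags set by 6502 CMP: A - M, computed in 8 bits. */
Flags cmp6502(uint8_t a, uint8_t m)
{
    Flags f;
    uint8_t diff = (uint8_t)(a - m);
    f.carry = a >= m;                /* C set when no borrow: A >= M, unsigned */
    f.zero = a == m;
    f.negative = (diff & 0x80) != 0; /* N is just bit 7 of the difference */
    return f;
}

/* Correct unsigned "A < M": branch with BCC (carry clear). */
int less_unsigned(uint8_t a, uint8_t m) { return !cmp6502(a, m).carry; }

/* Assumed buggy version: branch with BMI (negative set). */
int less_via_bmi(uint8_t a, uint8_t m) { return cmp6502(a, m).negative; }
```

For example, comparing 200 against 10 gives a difference of 190, which has bit 7 set, so the BMI version wrongly reports 200 < 10. The two versions agree only while the operands are within 127 of each other, which is why bugs like this tend to show up only at large coordinates.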

http://hcs64.com/files/nestify4.nes


 
 Post subject:
Posted: Wed Jan 04, 2012 4:02 am

Joined: Fri Nov 12, 2004 2:49 pm
Posts: 7232
Location: Chexbres, VD, Switzerland
Nice,
what are the differences between this and the "Tank Demo by Ian Bell" that already exists on the NESdev main page? I think your demo "only" updates the pattern table while Ian Bell's updates both the pattern and name tables, but I'm not so sure how they made it that fast.

_________________
Life is complex: it has both real and imaginary components.


 
 Post subject:
Posted: Wed Jan 04, 2012 7:04 am

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19113
Location: NE Indiana, USA (NTSC)
The secret is that the tank demo's tiles are double buffered, and any blank tile is never written to VRAM.

