It is currently Thu Aug 22, 2019 6:46 am

All times are UTC - 7 hours



Forum rules





Post new topic Reply to topic  [ 23 posts ]  Go to page 1, 2  Next
Author Message
PostPosted: Tue Jun 11, 2019 11:30 pm 
Offline
Formerly Espozo
User avatar

Joined: Mon Sep 15, 2014 4:35 pm
Posts: 3501
Location: Richmond, Virginia
The FastRom patch thread had got me thinking about this; what I'm thinking of, is a demo with 128, 16x16 blocks that bounce off the sides of the screen and change color when they overlap (trying to keep this as simple as possible). Just speculating, is there any hope this would ever run at 60fps, even with nothing else going on? (Is there a similar demo on similarly powerful hardware?) It's entirely impossible you could achieve this level of performance by checking every block against every other block, because that would end up being 128 * 127 = 16,256 comparisons...

I have heard people suggesting using sorting when dealing with huge numbers of checks, but other than the obvious problem of having to sort the bounding boxes first, I don't know how you would apply this to both axis. You could still check whatever objects overlap on the sorted axis "regularly" though, and it should still be much faster than no sorting at all.

(Also, not relevant to this particular case, but, unless I'm not thinking straight, another problem with this approach would be dealing with bounding boxes that are different sizes, as it's possible one bounding box could entirely fit inside another one on that axis, in which case you can't really determine the order.)

The only other option I'm aware of, is to break the screen into quadrants, find what bounding boxes occupy what quadrants, and only perform checks for a bounding box for what quadrants it's in. I'm not really sure what the ideal size is; too small, and you'll end up with too many instances of bounding boxes spanning multiple quadrants (and too many quadrants to look through, depending on how this is implemented I suppose), but too large, and you'll end up doing to many collision checks in each quadrant.

Does anyone know what is actually used in practice, through looking at source code, looking at / creating disassemblies, etc.?


Top
 Profile  
 
PostPosted: Wed Jun 12, 2019 12:10 am 
Offline

Joined: Tue Oct 06, 2015 10:16 am
Posts: 965
I would do that via a grid-linked-list, either 2d if they are all the same size or 1d if they vary. However no matter what the structure was, that's probably too much for SNES to handle.


Top
 Profile  
 
PostPosted: Wed Jun 12, 2019 7:17 am 
Offline

Joined: Sun May 11, 2014 8:36 am
Posts: 94
Location: France
Why do you want to do that?
Well, I tried to do it because I had a free morning.

On the other hand, you can not do this in a frame, on the SNES you have about 45000 cycles / frame (and about 5000 cycles for the VBlank).
So I had to do the test on a lot of frame, so there's a little latency between the collision and the display (where if the speed is too big too).

I 1024 collision test per frame, maybe with a grid in this particular case we could only do 1024 test, but good to be honest I do not see much interest to continue, impossible to make a game with so much constraint

Screen :
Image


Attachments:
main.smc [512 KiB]
Downloaded 89 times
Top
 Profile  
 
PostPosted: Wed Jun 12, 2019 8:53 am 
Offline
User avatar

Joined: Fri Nov 24, 2017 2:40 pm
Posts: 170
I used to write physics engines for a living, and this can be a fairly endlessly difficult problem if you want a super-duper-generic-and-scalable solution.

The most important step you can do first is to split your data into separate types that don't collide. (ex: moving/non-moving, player/enemy) I've often seen people trying to implement complicated solutions to collide a player against a list of bullets. You can't really improve the speed of colliding one object against many (that's already just O(n)).

The easiest solution to implement is probably 1D sort and sweep. (Example Code) Sort the objects on the axis you think they will be most spread out on (probably horizontally). Use insertion sort as the sorting will change little from frame to frame and you'll quickly settle into an average runtime of O(n). To collide the objects, start at the beginning (ex: leftmost side) of the list, and check collisions with objects until the bounds don't overlap anymore (ex: the right side of the current object is less than the left side of the next object to be checked for collisions). If your objects are well distributed it will run in ~O(n) time, but will end up as O(n^2) if they get too bunched up. 2D sort and sweep can help with that, but isn't so trivial to implement.

Another option that might work well on the SNES is a spatial grid or spatial hash like Calima suggested. Basically split the screen up into a grid, and either directly back each of the cells with 2D array, or hash the cell coordinates into a 1D array. Each cell contains a list of objects that are in the cell. The cell size needs to be chosen experimentally, but a good place to start is 2x the size of the average object. This algorithm scales much better than 1D sort and sweep, but is a lot more work to implement correctly. While it has an average runtime of O(n), it has a HUGE constant cost which will probably make it perform poorly with fewer than thousands of objects. On the upper end, it let me kinda-realtime simulate 20,000 particles 10+ years ago on a Core2 Duo: https://www.youtube.com/watch?v=dD-um_8KqpE

The algorithm I use in Chipmunk2D nowadays is an axis aligned bounding box tree + temporal coherence tidbits. Basically a binary tree whose nodes are smaller and smaller bounding boxes, and the leaves are objects you want to collide. The tree is kept (roughly) balanced by reinserting objects when their bounds change. To keep from rebuilding the tree every frame, objects are given bounding boxes that are slightly larger than their real size and also extruded slightly in their direction of movement. These are the bounds stored in the leaves, and only need to be reinserted in the tree when the actual object exits the extended bounds. To keep from having to traverse the tree every frame to find overlapping leaves, a set of colliding pairs is maintained in a separate structure during tree insertion. So to run the collision detection, it needs update the bounds of all the objects, reinsert a few of them in the tree, and iterate the colliding pairs set. This scales really well to thousands of objects even when there is no information about their size or distribution.


Top
 Profile  
 
PostPosted: Wed Jun 12, 2019 9:10 am 
Offline
User avatar

Joined: Mon Jan 23, 2006 7:47 am
Posts: 201
Location: Germany
slembcke wrote:
it let me kinda-realtime simulate 20,000 particles 10+ years ago on a Core2 Duo: https://www.youtube.com/watch?v=dD-um_8KqpE

Reminds me of Agenda. :)

_________________
My current setup:
Super Famicom ("2/1/3" SNS-CPU-GPM-02) → SCART → OSSC → StarTech USB3HDCAP → AmaRecTV 3.10


Top
 Profile  
 
PostPosted: Wed Jun 12, 2019 9:17 am 
Offline

Joined: Wed May 19, 2010 6:12 pm
Posts: 2885
Quote:
128 * 127 = 16,256 comparisons


It's more like 8128 comparisons, because half of the comparisons are redundant.

Quote:
I would do that via a grid-linked-list, either 2d if they are all the same size or 1d if they vary.


What about sprites that are overlapping two regions at once?

@Kannagi
I'm surprised you were able to whip up a demo so fast.


Top
 Profile  
 
PostPosted: Wed Jun 12, 2019 11:31 am 
Offline

Joined: Tue Oct 06, 2015 10:16 am
Posts: 965
You make a rule, like "top left corner determines the cell they go in". Then if the object being checked spans two to four cells, you check objects in those cells too.


Top
 Profile  
 
PostPosted: Wed Jun 12, 2019 11:52 am 
Offline

Joined: Sun May 11, 2014 8:36 am
Posts: 94
Location: France
@psycopathicteen
I developed an engine for SNES with optimized function / macro ;)
So I can easily make demos quickly ^^

Anyway, whatever the chosen algorithm, it would be necessary to sort quickly and well and to have only 8 elements on average to test for each object (or 16, but it will be necessary to perform a one test out of two).
In any case, good luck, but a manic shooter is possible without necessarily going through such an extreme test of 128 * 127


Top
 Profile  
 
PostPosted: Wed Jun 12, 2019 10:41 pm 
Offline
Formerly Espozo
User avatar

Joined: Mon Sep 15, 2014 4:35 pm
Posts: 3501
Location: Richmond, Virginia
Very cool demo! :) I could use frame advance, but I'm lazy, what is the frame rate? I was expecting it to be really sluggish, but it looked pretty smooth (although the sprites are moving fairly slow).

slembcke wrote:
I used to write physics engines for a living, and this can be a fairly endlessly difficult problem if you want a super-duper-generic-and-scalable solution.

Yeah; this is useless for a game, because there are too little assumptions that have to be made (all objects are the same size, objects never spawn or are destroyed, etc.) and no game would ever have to do this, but I thought this might make a good demo idea. It's impressive when you know how much work the CPU has to do, and it's also visually appealing.

I don't think collision detection beyond "check every object against every other object" is totally out of the question for the SNES though; it's not impossible you could have a 2 player shooter where you have to check 32 player projectiles against 32 enemies (or some other similar configuration) in an extreme case. Of course, there will likely still be lag no matter what you do from having to process the rest of the game. I think I like the idea of the 1D sort and sweep for something like this; that's probably right where you have to many objects for "regular" collision detection but not so many that will justify the overhead of a more advanced algorithm.


Top
 Profile  
 
PostPosted: Thu Jun 13, 2019 12:51 am 
Offline
User avatar

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 7565
Location: Canada
There's a lot of academic literature out there on this subject. A lot of techniques in modern use aren't that different from things you might have used 30 years ago too.

The most important concept seems to be "broad phase" vs "narrow phase". In the broad phase you do some coarse approximation of collision to collect smaller groups of objects that are close together, and then in the narrow phase you do accurate collision tests between the objects in the small group only.

I think most of the techniques for broad phase have been mentioned already, but to summarize some of them:
  • "Sweep and prune" keeps sorted lists of objects on each axis (or possibly just one axis). This makes selecting nearby objects within a range quick to do, since they will be adjacent in the list.
  • Grid based methods collect objects into spaces of a coarse grid.
  • Tree methods divide space into a searchable tree. Nearby objects will end up sharing a common branch in the tree. (e.g. BSP, Quadtree)

All of these have tons of variations and tweaks, and each method has different useful properties. A grid approach might require a relatively large memory buffer, but selecting an element from the grid might be very fast. Trees tend to be very good for sparsely populated spaces, especially in 3D spaces where grids start to take up a ton of memory. If most of your action occurs on a single axis, a sweep and prune list for one axis can be extremely effective.

I found this powerpoint while looking for some summary I could link to. It seems to give a lot of relevant terms at least, though it is light on description of them. A good starting point reference:
https://studylib.net/doc/5711955/collis ... structures


Top
 Profile  
 
PostPosted: Thu Jun 13, 2019 12:59 am 
Offline

Joined: Sun May 11, 2014 8:36 am
Posts: 94
Location: France
Thanks, the demo runs at 60 FPS, but the collisions are them at 4 FPS.
So there is a certain latency between the display / movement of the bullet(60FPS) and the collisions (4FPS).

It's more feasible to do 32 bullets * 32 enemies, that's 1024 test, exactly what I can manage in one frame ;)
If you handle collisions on 30 FPS then a game is feasible.

But personally for my shmup, I prefer this configuration:
24 bullets ship
72 bullets ennemy
12 ennemy

It's 113 sprites (and I'll have sprites for the power up / effect) so I'll have 128 sprites in total (I'm forced to use meta sprite).
And I think I can reduce the number of ship bullets to add to enemy bullets.


Top
 Profile  
 
PostPosted: Thu Jun 13, 2019 7:39 am 
Offline

Joined: Thu Apr 18, 2019 9:13 am
Posts: 160
slembcke wrote:
Another option that might work well on the SNES is a spatial grid or spatial hash like Calima suggested. Basically split the screen up into a grid, and either directly back each of the cells with 2D array, or hash the cell coordinates into a 1D array. Each cell contains a list of objects that are in the cell. The cell size needs to be chosen experimentally, but a good place to start is 2x the size of the average object. This algorithm scales much better than 1D sort and sweep, but is a lot more work to implement correctly. While it has an average runtime of O(n), it has a HUGE constant cost which will probably make it perform poorly with fewer than thousands of objects. On the upper end, it let me kinda-realtime simulate 20,000 particles 10+ years ago on a Core2 Duo: https://www.youtube.com/watch?v=dD-um_8KqpE


The per-grid-space time cost can be eliminated entirely in exchange for some small per-item time and storage cost. If items are exact-match points, each item will need to indicate whether it is the first item in its grid space as well as the identity of the next item in its grid space (if any), each non-empty grid space must identify the "first" item therein, and empty grid spaces may identify any item. Before adding an item to a grid space, check whether the item identified thereby is the first item in that grid space. If not, figure that the grid space was actually empty.

Larger items may be handled as a group of point items, or using fancier approaches, but the trick of avoiding grid re-initialization applies regardless.


Top
 Profile  
 
PostPosted: Thu Jun 13, 2019 1:59 pm 
Offline
User avatar

Joined: Fri Nov 24, 2017 2:40 pm
Posts: 170
Sounds kinda similar to what I do in Chipmunk's spatial hash implementation. I timestamp the cells, and if the timestep doesn't match when inserting or querying against a cell, then the list is old and added back to the pool.


Top
 Profile  
 
PostPosted: Fri Jun 14, 2019 4:14 am 
Offline

Joined: Tue Feb 07, 2017 2:03 am
Posts: 749
For something that high on something like a SNES you might be in Recursive Dimensional Clustering territory.(See Game Programming Gems 2) the main trick is your objects don't move that fast and hence once sorted to "update" sorts are 1~2 positions, this is something say a multiplexer sort exploits. You can also keep and maintain a "needs to move X to separate" and track that number rather than do a full collision each time.

Since you have fixed size you can do a Minkowski collision on the AABB of the one you are looking at compared to the centre points of the rest to speed it up, fairly easily.

For changing colour when overlapping I would exploit hardware and enable blending modes put them on different planes and let hardware do as much as it can.


Top
 Profile  
 
PostPosted: Mon Jul 08, 2019 5:59 pm 
Offline

Joined: Fri Jul 04, 2014 9:31 pm
Posts: 1076
Attachment:
128collision.sfc [64 KiB]
Downloaded 49 times


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 23 posts ]  Go to page 1, 2  Next

All times are UTC - 7 hours


Who is online

Users browsing this forum: No registered users and 5 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
Powered by phpBB® Forum Software © phpBB Group