All times are UTC - 7 hours





 Post subject: N64 benchmarks
PostPosted: Fri Dec 07, 2018 11:57 am 
Joined: Tue Oct 06, 2015 10:16 am
Posts: 849
As I did for the NES some time back, here's some data to help decide which codecs to pick on the N64. I plan to bench some audio and video codecs next. Everything was built with gcc 8.2 at -O3.

Code:
Results from cen64, which slightly differs from hw (~5%).

Text decompression, source LGPLv3 7.5kb, speed in kb/s

Algo    | Ratio | Speed | License, comments
-------------------------------------------
zstd    | 0.333 | 1457  | BSD, requires ~160kb RAM
zlib    | 0.343 | 2823  | zlib, requires ~4kb RAM (tinfl)
lzo     | 0.402 | 4773  | GPL, no RAM required
lz4hc   | 0.475 | 10471 | BSD, no RAM required
lzjb    | 0.591 | 4998  | CDDL, no RAM required, nemequ github version


Audio, 10s 44100 Hz mono clip, % realtime

Algo            | Ratio | Speed | License, comments
-------------------------------------------
Speex           | 0.038 | 208   | BSD, fixed point
Vorbis 128      | 0.158 | 410   | BSD, tremor lowram, measured ~35kb
Vorbis 96       | 0.122 | 458   |
Vorbis 64       | 0.089 | 498   |
Vorbis 48       | 0.068 | 498   |
Opus 64         | 0.099 | 215   | BSD, fixed point, measured ~95kb
Opus 48         | 0.075 | 229   |
Opus 32         | 0.049 | 252   |
MP3 128         | 0.131 | 215   | PD, no RAM required, lieff/minimp3
MP3 96          | 0.109 | 215   |
MP3 64          | 0.087 | 219   |
MP3 32          | 0.044 | 430   | Lame chose to downsample to 22kHz and mpeg-2l3


Audio, 10s 16000 Hz mono clip, % realtime

Algo            | Ratio | Speed | License, comments
-------------------------------------------
Speex           | 0.071 | 582   | BSD, fixed point
Vorbis 64       | 0.173 | 1066  | BSD, tremor lowram, measured ~32kb
Vorbis 48       | 0.142 | 1165  |
Vorbis 32       | 0.111 | 1206  |
Opus 64         | 0.266 | 252   | BSD, fixed point
Opus 48         | 0.199 | 264   |
Opus 32         | 0.135 | 276   |


Zstd is pretty disappointing given how hyped it is. Its compression is barely better than zlib's (0.333 vs 0.343) at about half the speed (1457 vs 2823 kb/s), with huge RAM usage (~160kb vs ~4kb).


Last edited by calima on Thu Dec 13, 2018 8:29 am, edited 3 times in total.

 Post subject: Re: N64 benchmarks
PostPosted: Fri Dec 07, 2018 12:02 pm 
Joined: Sun Sep 19, 2004 11:12 pm
Posts: 20870
Location: NE Indiana, USA (NTSC)
How does Zstandard compare to implementations of Deflate other than zlib's, such as 7-Zip's or Google's Zopfli? I ask because I'm familiar with these two in particular from the advzip and advpng tools in AdvanceCOMP. Decompression speed probably wouldn't differ much, but compression would probably be slower, and the rate might differ.

Would you be interested in results for DTE and Huffword codecs as a low water mark? But I'll admit these may not be quite as useful on Nintendo 64 as they are on a small-RAM, execute-in-place environment like the NES.


 Post subject: Re: N64 benchmarks
PostPosted: Fri Dec 07, 2018 12:32 pm 
Joined: Tue Oct 06, 2015 10:16 am
Posts: 849
Zstd on modern computers beats even the best zlib implementations, according to all reports. The small size of the test data here probably hinders it a bit, and it's not very speedy on an old MIPS like this. I'll probably use zlib for everything; it hits the sweet spot here.
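
A rough sketch of that trade-off, using only the numbers from the text decompression table in the first post. It assumes "Ratio" means compressed size over original size (lower is better), and it ignores RAM use and licensing. The function names are made up for illustration.

```python
# (name, ratio, decompression speed in kb/s), copied from the table in the first post
CODECS = [
    ("zstd", 0.333, 1457),
    ("zlib", 0.343, 2823),
    ("lzo", 0.402, 4773),
    ("lz4hc", 0.475, 10471),
    ("lzjb", 0.591, 4998),
]


def dominated(a, b):
    """True if b is no worse than a on both axes and strictly better on one."""
    _, ratio_a, speed_a = a
    _, ratio_b, speed_b = b
    no_worse = ratio_b <= ratio_a and speed_b >= speed_a
    better = ratio_b < ratio_a or speed_b > speed_a
    return no_worse and better


def pareto_front(codecs):
    """Names of the codecs that no other codec beats on both ratio and speed."""
    return [a[0] for a in codecs if not any(dominated(a, b) for b in codecs)]


print(pareto_front(CODECS))
```

On these figures only lzjb drops out, since lz4hc is both smaller and faster. Which of the remaining four is the "sweet spot" depends on how you weigh size against speed and RAM.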

As for any additional codecs, sure, I'll add any data points.


 Post subject: Re: N64 benchmarks
PostPosted: Fri Dec 07, 2018 12:58 pm 
Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3725
Location: Mountain View, CA
Results for lzjb? (Yes I'm aware lz4 is a faster implementation/replacement for lzjb)


 Post subject: Re: N64 benchmarks
PostPosted: Fri Dec 07, 2018 1:02 pm 
Joined: Sun Apr 13, 2008 11:12 am
Posts: 7820
Location: Seattle
I wonder if Zstd is specifically crippled here by the N64's tiny cache.


 Post subject: Re: N64 benchmarks
PostPosted: Sat Dec 08, 2018 11:50 am 
Joined: Tue Oct 06, 2015 10:16 am
Posts: 849
lzjb and speex added. Speex compresses quite well, but at these speeds it's not that suitable for many voices at once. For cutscenes or RPG-style talking to one character, it should work great.
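
To put rough numbers on "many voices at once", here is a small sketch. It assumes "% realtime" means decode speed relative to playback speed (so 208 means decoding 2.08x faster than real time) and that cost scales linearly with the number of streams. The 25% CPU budget is a made-up example, not a measurement.

```python
import math


def cpu_share_per_voice(percent_realtime):
    """Fraction of the CPU needed to decode one stream in real time."""
    return 100.0 / percent_realtime


def max_voices(percent_realtime, cpu_budget=1.0):
    """How many streams fit into cpu_budget (1.0 = the whole CPU)."""
    return math.floor(cpu_budget * percent_realtime / 100.0)


# Speex figures from the audio tables in the first post
for label, speed in (("Speex 44100 Hz", 208), ("Speex 16000 Hz", 582)):
    print(
        label,
        f"{cpu_share_per_voice(speed):.0%} of the CPU per voice,",
        max_voices(speed), "voices with the whole CPU,",
        max_voices(speed, cpu_budget=0.25), "with a quarter of it",
    )
```

Under those assumptions a single 44100 Hz Speex voice already eats close to half the CPU, which is why one talking character is fine but a mix of several is not.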

lzjb was fast to test, but I had never even heard of it. Why do you find it interesting? ZFS was a Sparc thing, no MIPS relation.


 Post subject: Re: N64 benchmarks
PostPosted: Sat Dec 08, 2018 12:02 pm 
Joined: Sun Sep 19, 2004 11:12 pm
Posts: 20870
Location: NE Indiana, USA (NTSC)
SILK, the low-rate voice mode in Opus, is similar to Speex in several ways. Is Opus on the whole too slow for the N64? Or Codec 2?


 Post subject: Re: N64 benchmarks
PostPosted: Sat Dec 08, 2018 12:03 pm 
Joined: Tue Oct 06, 2015 10:16 am
Posts: 849
It's on the list to test, along with vorbis and mp3.


 Post subject: Re: N64 benchmarks
PostPosted: Sat Dec 08, 2018 7:42 pm 
Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3725
Location: Mountain View, CA
calima wrote:
lzjb was fast to test, but I had never even heard of it. Why do you find it interesting? ZFS was a Sparc thing, no MIPS relation.

ZFS isn't a "Sparc" thing, it was originally a "Solaris thing". Around the time of the Oracle buy-out of Sun, OpenIndiana/Illumos happened (think: open-source Solaris), which then resulted in parts of ZFS becoming open-source (though under CDDL), which resulted in it being imported into FreeBSD and a fusefs version for Linux (slow). This all later resulted in OpenZFS and ZFS on Linux -- so now FreeBSD, Linux, and OpenIndiana/Illumos all have ZFS (regardless of arch; x86, amd64, aarch64/ARM, etc.).

Why I found it interesting: because I've known it to be faster than gzip, faster than zlib, but slower than lz4 (which is extremely new), and wanted to see how it performed on the N64.

I don't know if there's some "easy to add" code that would be testable, but gzip and bzip2 (for text) might be interesting as well. I've seen many cases where text compresses better with gzip than bzip2, and in other cases the exact opposite. Another one to consider might be some bare-bones native Huffman implementation, although I wouldn't be surprised if one of the previously-tested algorithms dynamically implements something like that.

For audio, you might look into Codec 2, which is known for being OSS and having extremely high compression rates, but again I have no idea how easy this would be to add/test.


 Post subject: Re: N64 benchmarks
PostPosted: Sat Dec 08, 2018 9:24 pm 
Joined: Sun Sep 19, 2004 11:12 pm
Posts: 20870
Location: NE Indiana, USA (NTSC)
PKZIP and Gzip use the same Deflate algorithm as zlib, and bzip2 is very RAM-intensive on the order of 1 MB, which is one-fourth of the N64's RAM.

How much of the decoding for MP3, Vorbis, Speex, Opus, Codec 2, or FLAC can be done on the RSP? FLAC can be turned into a time-domain lossy codec using the LossyWAV preprocessor, which bit-crushes each 512-sample block with noise-shaped dithering so that FLAC has fewer significant bits to code.
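
A minimal sketch of the bit-crushing idea, not LossyWAV's actual algorithm: the real tool picks the number of bits to drop per block through its own analysis and applies noise shaping, while this version takes `bits_to_remove` as a hypothetical parameter and uses plain rounding on 16-bit samples. `wasted_bits` shows what a lossless coder can then exploit, namely low bits that are zero across the whole block.

```python
def crush_block(samples, bits_to_remove):
    """Round each 16-bit sample to a multiple of 2**bits_to_remove (no dither)."""
    if bits_to_remove <= 0:
        return list(samples)
    half = (1 << bits_to_remove) >> 1
    # Largest representable multiple, so rounding up cannot overflow 16 bits
    top = (32767 >> bits_to_remove) << bits_to_remove
    out = []
    for s in samples:
        rounded = ((s + half) >> bits_to_remove) << bits_to_remove
        out.append(min(rounded, top))
    return out


def wasted_bits(samples):
    """Number of low bits that are zero in every sample of the block."""
    combined = 0
    for s in samples:
        combined |= s
    if combined == 0:
        return 0  # all-zero block, nothing to measure
    return (combined & -combined).bit_length() - 1


block = [0, 5, 12, -5, 1000]
crushed = crush_block(block, 3)
print(crushed, wasted_bits(block), wasted_bits(crushed))
```

Decoding is unchanged, since the output is still an ordinary FLAC stream. All of the extra work happens on the encoding side.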


 Post subject: Re: N64 benchmarks
PostPosted: Sun Dec 09, 2018 4:10 am 
Joined: Tue Oct 06, 2015 10:16 am
Posts: 849
Audio codecs generally don't benefit from SIMD, there isn't any vectorizable processing going on. The RSP's SU (scalar unit) lacks multiply instructions and 64-bit instructions, so even using that as a "second core" would be slower than the main core. I expect graphics processing to take most of the RSP's frame time anyway.


 Post subject: Re: N64 benchmarks
PostPosted: Sun Dec 09, 2018 6:58 am 
Joined: Sun Sep 19, 2004 11:12 pm
Posts: 20870
Location: NE Indiana, USA (NTSC)
calima wrote:
Audio codecs generally don't benefit from SIMD, there isn't any vectorizable processing going on.

Not even inverse FFTs or MDCTs or filtering?


 Post subject: Re: N64 benchmarks
PostPosted: Sun Dec 09, 2018 11:03 am 
Joined: Tue Oct 06, 2015 10:16 am
Posts: 849
The amount of data in one audio packet is so small that the overhead usually kills any speedup. FFT/DCT for images is a different case, if you process a MB at once instead of a hundred bytes.


 Post subject: Re: N64 benchmarks
PostPosted: Sun Dec 09, 2018 12:16 pm 
Joined: Sun Apr 13, 2008 11:12 am
Posts: 7820
Location: Seattle
... How bad is the overhead? I haven't yet found any documentation on how writing code for the RDP/RSP works.


 Post subject: Re: N64 benchmarks
PostPosted: Mon Dec 10, 2018 1:20 am 
Joined: Tue Oct 06, 2015 10:16 am
Posts: 849
Well, I was speaking in general, as in even on x86_64 you won't get much speedup if any from vectorizing parts of audio decoding. You have to load data to the vector, often from unaligned or scattered addresses, do the calculation, and store. The load/permute/store steps may make the 8x/16x processing step speedup worthless if you don't have much data. More so if the vectorizable parts alternate with non-vectorizable.
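
As a back-of-the-envelope model of that argument (an Amdahl-style estimate, not a measurement of any real decoder): the non-vectorizable part keeps its full cost, the vectorizable part shrinks by the lane count, and the load/permute/store work is added on top. All input values below are hypothetical.

```python
def vector_speedup(vector_fraction, lanes, overhead_fraction):
    """
    Estimated overall speedup from vectorizing part of a decoder.

    vector_fraction   -- share of scalar run time that can be vectorized (0..1)
    lanes             -- ideal speedup of the vectorized arithmetic (e.g. 8 or 16)
    overhead_fraction -- extra load/permute/store time, as a share of scalar run time
    """
    new_time = (1.0 - vector_fraction) + vector_fraction / lanes + overhead_fraction
    return 1.0 / new_time


# Hypothetical audio-like case: little vectorizable work, noticeable shuffling cost
print(round(vector_speedup(0.3, 8, 0.2), 2))
# Same case with slightly more overhead: now slower than the scalar code
print(round(vector_speedup(0.3, 8, 0.3), 2))
# Hypothetical image-like case: mostly vectorizable, overhead well amortized
print(round(vector_speedup(0.9, 8, 0.05), 2))
```

With only 30% of the time vectorizable, even 8 lanes barely break even once the shuffling cost is counted, and a bit more overhead makes it a net loss.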

RSP specifically: vector loads have three delay slots, meaning an aligned, perfect load effectively takes four instructions' worth of time. Then you have to DMA in and out of the 4kb memory, giving further overhead. "SGI_Nintendo_64_RSP_Programmers_Guide.pdf" is available on the ultra64.ca site, as well as an RDP register doc.
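
A rough way to count that overhead, as a sketch only. The 4096-byte memory and the four-slot load cost come from the paragraph above. The 16-byte vector width is an assumption, and so is the worst case where nothing useful gets scheduled into the delay slots. In practice part of the 4kb also has to hold output and tables, which is why `work_bytes` is a parameter.

```python
DMEM_BYTES = 4096   # RSP data memory size, as stated above
LOAD_SLOTS = 4      # one vector load plus three delay slots, as stated above
VECTOR_BYTES = 16   # assumption: one 128-bit vector register filled per load


def dma_chunks(total_bytes, work_bytes=DMEM_BYTES):
    """Number of DMA-in/DMA-out round trips needed to stream total_bytes."""
    return -(-total_bytes // work_bytes)  # ceiling division


def load_slots(total_bytes):
    """Instruction slots spent on aligned vector loads if no delay slot is filled."""
    vectors = -(-total_bytes // VECTOR_BYTES)
    return vectors * LOAD_SLOTS


for size in (100, 1 << 20):
    print(size, "bytes:", dma_chunks(size), "DMA round trips,",
          load_slots(size), "load slots")
```

The per-byte load cost is the same in both cases. The difference is that a hundred-byte packet pays for a full DMA round trip and setup to do almost no arithmetic.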

I've read pretty much all N64 docs by now. In some ways it's better and in others worse than expected. For example there is no flipped Z comparison mode, and rendering triangles is very much a PITA, but on the other hand the RSP will allow many kinds of software pixel effects. Gaussians, additive rendering, better scaling algos, maybe even some form of shadow mapping.

