All times are UTC - 7 hours



PostPosted: Tue Aug 16, 2016 12:07 pm 

Joined: Wed May 19, 2010 6:12 pm
Posts: 2433
About audio, how do MP3s and OGGs keep everything compressed at a certain rate, like 128 kbps?


PostPosted: Tue Aug 16, 2016 12:38 pm

Joined: Mon Sep 15, 2014 4:35 pm
Posts: 3165
Location: Nacogdoches, Texas
They're both lossy.


PostPosted: Tue Aug 16, 2016 12:45 pm

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19354
Location: NE Indiana, USA (NTSC)
There are several methods for controlling how much loss is applied over the course of an audio or video stream.

  1. "Average bitrate" uses two-pass encoding. First it reads all the audio once to determine how acoustically complex each frame (split second of audio) is in order to calculate how much audible distortion would occur at various bitrates. Then it uses this information to estimate the overall quality level to apply over a piece of music to keep total data divided by time equal to, say, 128 kbps. This doesn't keep the stream at a rock-solid 128 kbps and is thus not ideal for streaming over a channel with a hard maximum rate, but it's great for downloads or physical media if you have enough rate headroom that overall capacity is the limiting factor.
  2. "Constant bitrate" dials the distortion up and down based on the acoustic complexity of each frame so that each uses exactly the same amount of data. Simpler codecs such as ADPCM (IMA, BRR, VAG, etc.) do this implicitly.
  3. "Bit reservoir" is a method used in MP3 to allow the data for a less complex frame to include part of the data used in later, more complex frames. This method can be thought of as short-term ABR, allowing some of the consistency of average bitrate over a channel with a given peak rate.
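
To make the difference between strict per-frame CBR and a bit reservoir concrete, here is a toy model in Python. It is only a sketch under stated assumptions: `demands` (bits each frame would like at the target quality), `rate` (bits the channel delivers per frame), and `reservoir_max` are hypothetical names, and no real encoder allocates bits this simply.

```python
def allocate_strict_cbr(demands, rate):
    """Every frame may use at most `rate` bits; unused bits are wasted."""
    return [min(d, rate) for d in demands]


def allocate_with_reservoir(demands, rate, reservoir_max):
    """Bits left unused by simple frames are banked for later complex frames.

    Toy model of short-term ABR: the channel still delivers exactly `rate`
    bits per frame, but a frame may also spend what earlier frames saved.
    """
    reservoir = 0
    allocation = []
    for d in demands:
        available = rate + reservoir
        used = min(d, available)
        allocation.append(used)
        # Savings carry over, but the reservoir has a hard cap.
        reservoir = min(available - used, reservoir_max)
    return allocation


def shortfall(demands, allocation):
    """Total bits the frames wanted but did not get (a proxy for distortion)."""
    return sum(d - a for d, a in zip(demands, allocation))


if __name__ == "__main__":
    demands = [50, 50, 200, 100]  # hypothetical per-frame complexity, in bits
    print(allocate_strict_cbr(demands, 100))
    print(allocate_with_reservoir(demands, 100, 150))
```

In this model the complex third frame is starved under strict CBR, while the reservoir version pays for it with bits the two quiet frames did not need.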


PostPosted: Tue Aug 16, 2016 12:53 pm

Joined: Sun Apr 13, 2008 11:12 am
Posts: 6538
Location: Seattle
Because they don't operate in the time domain.

Unlike BRR or its close relative ADPCM, which take a sample in, apply some simple filters, and emit a sample out, everything newer than MP2 operates in the frequency domain. They usually use the MDCT, and then use simple rules to figure out what data they can throw away (the "lossy" stage) before using some lossless compression algorithm.
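
As a rough illustration of the "transform, then throw data away" idea, here is a minimal Python sketch of an MDCT round trip with a sine window and 50% overlap-add. The function names, the block size `n=8`, and the uniform quantizer `step` are assumptions made for this example; real codecs pick what to discard with a psychoacoustic model and then entropy-code the result, neither of which is shown.

```python
import math


def sine_window(n2):
    """Sine window of length 2N; satisfies w[t]^2 + w[t+N]^2 = 1."""
    return [math.sin(math.pi * (t + 0.5) / n2) for t in range(n2)]


def mdct(block):
    """2N windowed samples in, N frequency coefficients out."""
    n = len(block) // 2
    return [sum(block[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                for t in range(2 * n))
            for k in range(n)]


def imdct(coeffs):
    """N coefficients in, 2N aliased samples out (scaled for a windowed codec)."""
    n = len(coeffs)
    return [2.0 / n * sum(coeffs[k] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                          for k in range(n))
            for t in range(2 * n)]


def codec_roundtrip(signal, n=8, step=0.0):
    """Encode and decode `signal`; step > 0 rounds coefficients (the lossy stage)."""
    win = sine_window(2 * n)
    pad = (-len(signal)) % n
    # Pad so that every real sample is covered by two overlapping blocks.
    x = [0.0] * n + list(signal) + [0.0] * (pad + n)
    out = [0.0] * len(x)
    for start in range(0, len(x) - 2 * n + 1, n):
        block = [x[start + t] * win[t] for t in range(2 * n)]
        coeffs = mdct(block)
        if step > 0:
            coeffs = [round(c / step) * step for c in coeffs]
        rec = imdct(coeffs)
        for t in range(2 * n):
            out[start + t] += rec[t] * win[t]  # overlap-add cancels the aliasing
    return out[n:n + len(signal)]
```

With `step=0.0` the overlap-add reconstructs the input up to floating-point error; raising `step` trades accuracy for coefficients that are cheaper to store.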


JPEG is substantially similar, and someone's written a handy guide of how its compression works: (1), (2).



'Constant bitrate' MP3 actually isn't. Each MP3 frame contains what's called a "bit reservoir" which allows it to reserve some of the bits in the current frame of audio for the next frame. So it's not as good as a "real" VBR encoding, but it allows for a guaranteed bandwidth allocation instead.

Vorbis (and Opus) are true VBR codecs and have no bit reservoir. There, the only way to get a CBR-like effect is to adjust quality on a frame-by-frame basis, which isn't really all that great, quality-wise.


PostPosted: Tue Aug 16, 2016 1:27 pm

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19354
Location: NE Indiana, USA (NTSC)
lidnariq wrote:
'Constant bitrate' MP3 actually isn't. Each MP3 frame contains what's called a "bit reservoir" which allows it to reserve some of the bits in the current frame of audio for the next frame. So it's not as good as a "real" VBR encoding, but it allows for a guaranteed bandwidth allocation instead.

Vorbis (and Opus) are true VBR codecs and have no bit reservoir. There, the only way to get a CBR-like effect is to adjust quality on a frame-by-frame basis, which isn't really all that great, quality-wise.

As I understand it, the only reason that Vorbis and Opus don't use short-term ABR (which MP3 calls "bit reservoir") is that Xiph is waiting for MP3 patents to expire.


PostPosted: Tue Aug 16, 2016 10:33 pm

Joined: Wed May 19, 2010 6:12 pm
Posts: 2433
I still wonder how much the SNES can compress graphics on its own. I'll write code for a Huffman decoder right here.

Code:
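huffman:
rep #$10            // 16-bit X/Y (tree pointer and bit counter)
sep #$20            // 8-bit accumulator
ldx tree
-;                  // outer loop: fetch the next compressed byte
lda (source)
sta byte_buffer
rep #$20
inc source          // 16-bit pointer increment
sep #$20
ldy #$0008
-;                  // inner loop: follow one tree branch per bit
lda $0000,x
beq found_byte      // node type 0 = leaf
lsr byte_buffer
bcs +
inx #3              // bit 0: left child directly follows this node
dey
bne -
bra --
+;                  // bit 1: right child address is stored in the node
lda $0002,x
xba
lda $0001,x
tax
dey
bne -
bra --

found_byte:
lda $0001,x         // leaf value
sta (destination)
rep #$20
inc destination
ldx tree            // restart at the root
dec length          // 16-bit count; sep below leaves the Z flag alone
sep #$20
bne -
rts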
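
For anyone who wants to check the tree layout off-console, here is a Python sketch of the same decoding loop. It assumes the format the assembly implies: 3-byte nodes, where a first byte of 0 marks a leaf whose value is the second byte, and otherwise the left child sits 3 bytes later and bytes 1-2 hold the little-endian location of the right child. In this sketch that location is an offset into `tree` rather than an absolute address, and the function name is made up.

```python
def huffman_decode(tree, data, length):
    """Decode `length` symbols from `data` using a 3-byte-per-node tree.

    Bits are consumed least significant first, matching `lsr byte_buffer`.
    Raises StopIteration if `data` runs out before `length` symbols appear.
    """
    bits = ((byte >> i) & 1 for byte in data for i in range(8))
    out = bytearray()
    node = 0
    while len(out) < length:
        if tree[node] == 0:              # leaf: emit value, restart at root
            out.append(tree[node + 1])
            node = 0
            continue
        if next(bits):                   # bit 1: jump to the stored right child
            node = tree[node + 1] | (tree[node + 2] << 8)
        else:                            # bit 0: left child follows this node
            node += 3
    return bytes(out)


if __name__ == "__main__":
    # Hypothetical tree: A = 0, B = 10, C = 11
    tree = bytes([1, 6, 0,  0, 65, 0,  1, 12, 0,  0, 66, 0,  0, 67, 0])
    print(huffman_decode(tree, bytes([0x1A]), 4))
```

The example tree and input byte are invented for the demonstration; the byte 0x1A carries the bits 0, 1, 0, 1, 1, 0 when read from the least significant end.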


PostPosted: Wed Aug 17, 2016 4:16 am

Joined: Thu Aug 20, 2015 3:09 am
Posts: 298
Would Huffman-style rANS be any faster than regular Huffman on the SNES? Or, coming at it the other way, can the SNES handle multiplication fast enough to decode full rANS in a reasonable amount of time?


PostPosted: Wed Aug 17, 2016 2:05 pm

Joined: Wed May 19, 2010 6:12 pm
Posts: 2433
I did some math, and found out the Huffman code would only be able to do about 200 compressed bytes per frame. Not quite enough.

Although I did think up a fast way to do pixel-wise RLE. Have the runs of pixels go vertically instead of horizontally. Instead of drawing the entire vertical runs of pixels, just draw points at the vertical edges with the XOR value of the two surrounding colors. Then do XOR filling to fill in between the lines.

Edit:
It would still probably be kinda slow. Maybe an "xor 8x1 slivers and remove blank bytes" would do the trick.
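
The XOR-fill idea can be sketched in a few lines of Python for a single column of pixels. The function names are made up, and the color above the first pixel is assumed to be 0; this only shows the principle, not an SNES-ready routine.

```python
def xor_edges(column):
    """Store each pixel as the XOR with the pixel above it.

    Inside a run of one color the result is 0, so only edges are nonzero.
    """
    edges = []
    above = 0
    for color in column:
        edges.append(color ^ above)
        above = color
    return edges


def xor_fill(edges):
    """Rebuild the column by XOR-accumulating from top to bottom."""
    column = []
    current = 0
    for e in edges:
        current ^= e
        column.append(current)
    return column


if __name__ == "__main__":
    column = [3, 3, 3, 5, 5, 0, 0, 7]
    print(xor_edges(column))
    print(xor_fill(xor_edges(column)))
```

Long runs turn into long stretches of zero bytes, which is what makes a later "remove blank bytes" pass worthwhile.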


PostPosted: Fri Aug 19, 2016 9:42 pm

Joined: Wed May 19, 2010 6:12 pm
Posts: 2433
I just did some experimenting with music samples, and I found out it's possible to make even 8 kHz samples sound good in Audacity. Go into the "equalizer" and make a slope going up 12 dB from 2 kHz to 3 kHz, and up to 30 dB at 4 kHz.

It's funny how much unnecessary muffling most programs add to samples when they're downsampling, just to prevent aliasing artifacts.


PostPosted: Sat Aug 20, 2016 4:43 am

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19354
Location: NE Indiana, USA (NTSC)
The term "unnecessary muffling" makes me think you have a lot to learn about signal processing. Have you watched the Digital Show & Tell video on Xiph.org Video and read about the Nyquist theorem?

Now that Pocket Heaven is dead, I'll make this the official site for a demo that I call "Fake Highs Off U". When someone asked me why the music in Luminesweeper sounds better than 18 kHz music sounds through GSM Player, I took a short sample of "The legend of MAX" from Dance Dance Revolution Extreme to demonstrate interpolation techniques.

LOM_lerping.ogg (23 seconds)

I think* this is what I did for that demo:
  1. Original, with energy up to 16 kHz
  2. Lowpass to 9 kHz to simulate downsampling to 18157 Hz, a common sample rate for GBA audio and close to the Neo Geo sample rate
  3. Nearest neighbor interpolation of the 18 kHz signal to 36 kHz
  4. Linear interpolation of the 18 kHz signal to 36 kHz
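
For reference, steps 3 and 4 boil down to something like the following Python sketch of 2x upsampling. The function names are hypothetical, and holding the last sample at the end of the linear version is an assumption made to keep the output length at exactly double.

```python
def upsample2_nearest(samples):
    """Nearest neighbor: repeat every sample, producing a staircase."""
    out = []
    for s in samples:
        out += [s, s]
    return out


def upsample2_linear(samples):
    """Linear: insert the midpoint between each sample and the next one."""
    out = []
    for i, s in enumerate(samples):
        nxt = samples[i + 1] if i + 1 < len(samples) else s
        out += [s, (s + nxt) / 2]
    return out


if __name__ == "__main__":
    print(upsample2_nearest([0, 4, 2]))
    print(upsample2_linear([0, 4, 2]))
```

The staircase from the nearest-neighbor version keeps strong images of the signal above the original Nyquist frequency, while linear interpolation attenuates them more.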


* Wayback Machine by Archive.org is useless. Whenever a domain expires and ends up sold, the new owners can hide the archive from the user using /robots.txt.


PostPosted: Sat Aug 20, 2016 9:03 am

Joined: Wed May 19, 2010 6:12 pm
Posts: 2433
I meant extra muffling, as in: whatever type of anti-aliasing filter Audacity uses to resample music also mutes a large portion of sub-Nyquist frequencies.


PostPosted: Sat Aug 20, 2016 4:16 pm

Joined: Fri Jul 04, 2014 9:31 pm
Posts: 818
Not as much as you're adding back: http://src.infinitewave.ca/

I suspect your method is making the samples sound better by boosting what high frequencies are left so as to somewhat compensate for the loss of everything past the Nyquist. I wouldn't recommend using a single set of filter settings for this in the general case; you could get some really nasty results if you don't twiddle the knobs first to see how the material responds.

I suppose one could design a program that used power spectrum analysis and psychoacoustics to try to figure out how to add back the high-frequency energy lost to the bandlimiting procedure in a form that sounds better than simple Nyquist folding. But that sounds complicated...

...

Oh, hey - SoX is apparently free. How have I never heard of it before? Just look at those graphs...


PostPosted: Wed Aug 24, 2016 9:52 pm

Joined: Wed May 19, 2010 6:12 pm
Posts: 2433
Since we're talking about sound quality, I felt like analysing what types of harmonic content can be created using different loop lengths.

perfect 5ths require 2 wavelengths
perfect 4ths require 3 wavelengths
major 3rds require 4 wavelengths
major 2nds require 8 wavelengths
15 cents require 128 wavelengths

Anything between a major 2nd and 15 cents can make your ears bleed.


PostPosted: Wed Aug 24, 2016 10:32 pm

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5899
Location: Canada
psycopathicteen wrote:
Since we're talking about sound quality, I felt like analysing what types of harmonic content can be created using different loop lengths.

perfect 5ths require 2 wavelengths
perfect 4ths require 3 wavelengths
major 3rds require 4 wavelengths
major 2nds require 8 wavelengths
15 cents require 128 wavelengths

Anything between a major 2nd and 15 cents can make your ears bleed.

To be more accurate, there's two numbers of wavelengths in any harmonic ratio like this. E.g. you need 3 wavelengths of the higher pitch in a perfect fifth in the space of 2 wavelengths of the lower note. There are a lot of useful harmonic ratios:
  • major second = 9:8
  • minor third = 6:5
  • major third = 5:4
  • perfect fourth = 4:3
  • perfect fifth = 3:2
  • minor sixth = 8:5
  • major sixth = 5:3
  • minor seventh = 7:4
  • octave = 2:1

These are just intonation intervals, however. The fifth, fourth, and second will sound normal, but just thirds will be slightly out of tune with music using the common equal tempered scale. The minor seventh will be very out of tune compared to equal temperament. (They will, however, be perfectly in tune with each other. These are pure harmonic ratios.)
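
To put numbers on "slightly" and "very" out of tune, a short Python sketch can convert each ratio to cents (1200 per octave) and compare it with the nearest equal-tempered semitone. The function names are made up for this example.

```python
import math

JUST_RATIOS = {
    "major second": (9, 8),
    "minor third": (6, 5),
    "major third": (5, 4),
    "perfect fourth": (4, 3),
    "perfect fifth": (3, 2),
    "minor sixth": (8, 5),
    "major sixth": (5, 3),
    "minor seventh": (7, 4),
    "octave": (2, 1),
}


def cents(p, q):
    """Size of the interval p:q in cents."""
    return 1200 * math.log2(p / q)


def deviation_from_equal_temperament(p, q):
    """Cents by which p:q differs from the nearest 100-cent step."""
    c = cents(p, q)
    return c - 100 * round(c / 100)


if __name__ == "__main__":
    for name, (p, q) in JUST_RATIOS.items():
        print(f"{name:15s} {p}:{q}  {deviation_from_equal_temperament(p, q):+.2f} cents")
```

The fifth and fourth land within about 2 cents of equal temperament, the thirds and sixths are off by roughly 14 to 16 cents, and 7:4 is about 31 cents flat.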

There are other useful harmonic ratios too, but they can be looked up if you're interested.


PostPosted: Thu Aug 25, 2016 9:12 pm

Joined: Fri Jul 04, 2014 9:31 pm
Posts: 818
Questions re: the SPC700:

1) What is the relationship between addw, carry, and half carry? Specifically, can I simply use addw without worrying about the flag values? From what I can tell, it sets them but doesn't use them...

2) It seems that the observed variance in SPC700 clock timing is between about zero and about +0.3%. Is this true of both NTSC and PAL? What sort of safety margin would the experts recommend?

