Understanding sound for the first time

Bowie90333212391
Posts: 11
Joined: Sun Aug 16, 2015 9:02 am

Understanding sound for the first time

Post by Bowie90333212391 »

I have my NES emulator working great (Input, graphics, etc.) and am moving on to getting sound. I read through the wiki to understand how the APU works. There are a lot of technical terms like sampling, length counters, envelopes, pulse/triangle waves, etc.

Up to this point, everything has made sense to me. My issue is that I have no experience working with sound engineering or signal processing, so I'm not even sure where to start.

Is there some kind of book or class I can sign up for that would give me a basic understanding of what's going on here? The books I've skimmed through (mainly undergraduate DSP textbooks) assume so much prerequisite knowledge and advanced math that isn't relevant to getting sound working. I just don't want to spend months reading through those kinds of books only to discover that they aren't applicable to APU emulation.

I'm just looking for a starting point... Has anyone had the same issue, and if so, what books/articles/websites did you turn to for understanding the prerequisite knowledge?
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Understanding sound for the first time

Post by tepples »

James
Posts: 431
Joined: Sat Jan 22, 2005 8:51 am
Location: Chicago, IL

Re: Understanding sound for the first time

Post by James »

Don't worry about DSP for now. Once you get it working, you may choose to learn some DSP stuff to implement better resampling, etc., but there's no DSP involved in generating the APU output (well, unless you count downsampling the signal to e.g., 48kHz, but that's as simple as taking every ~37th sample -- no complex math involved).

As far as emulating the APU goes... it was hard for me to wrap my head around it as well and I'm sure someone else will explain it better than I can. Blargg's APU reference (http://nesdev.com/apu_ref.txt), particularly the diagrams, was probably what helped me the most. Check that out if you haven't already.
get nemulator
http://nemulator.com
zeroone
Posts: 939
Joined: Mon Dec 29, 2014 1:46 pm
Location: New York, NY

Re: Understanding sound for the first time

Post by zeroone »

@Bowie90333212391 Do you have access to an API that can convert a stream of samples into sound? For instance, are you able to generate a pure tone from the points of a sine wave?
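
As a rough illustration of the "pure tone from the points of a sine wave" test, here is a minimal Python sketch. It only assumes the standard library; the names `SAMPLE_RATE`, `sine_samples` and `write_wav` are made up for this example, and writing a WAV file stands in for whatever audio API you actually use (SDL or otherwise).

```python
import math
import struct
import wave

SAMPLE_RATE = 48000  # samples per second (assumed output rate)


def sine_samples(freq_hz, seconds, amplitude=0.5):
    """Return a list of signed 16-bit samples for a pure tone."""
    count = int(SAMPLE_RATE * seconds)
    peak = int(amplitude * 32767)  # scale into the 16-bit range
    return [
        int(peak * math.sin(2.0 * math.pi * freq_hz * n / SAMPLE_RATE))
        for n in range(count)
    ]


def write_wav(path, samples):
    """Write mono 16-bit PCM samples to a WAV file."""
    with wave.open(path, "wb") as f:
        f.setnchannels(1)            # mono
        f.setsampwidth(2)            # 2 bytes = 16 bits per sample
        f.setframerate(SAMPLE_RATE)
        f.writeframes(struct.pack("<%dh" % len(samples), *samples))
```

Calling `write_wav("tone.wav", sine_samples(440.0, 1.0))` should give a one-second 440 Hz tone you can open in any player or wave editor. If that works, the remaining job is producing the right numbers from the APU instead of from `math.sin`.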
Jarhmander
Formerly ~J-@D!~
Posts: 569
Joined: Sun Mar 12, 2006 12:36 am
Location: Rive nord de Montréal

Re: Understanding sound for the first time

Post by Jarhmander »

James wrote:Don't worry about DSP for now. Once you get it working, you may choose to learn some DSP stuff to implement better resampling, etc., but there's no DSP involved in generating the APU output (well, unless you count downsampling the signal to e.g., 48kHz, but that's as simple as taking every ~37th sample -- no complex math involved).
If you downsample like this, it will work, but the resulting sound will be a bit harsh because of aliasing. The good news is that DSP knowledge is only required for the downsampling operation, and even then, there are libraries available that do just that.

Do you have trouble playing a software-generated sound, or understanding how the APU makes sound waves from the APU registers?
((λ (x) (x x)) (λ (x) (x x)))
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Re: Understanding sound for the first time

Post by rainwarrior »

The simplest way to downsample is to take several samples and average them. If you generate samples at 4x your samplerate, this is enough to knock off the most unpleasant parts of the aliasing. (It's not perfect, by any means, but it's very easy to implement and has passable quality.)
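
A minimal Python sketch of that averaging idea, assuming the input was already generated at an integer multiple of the output rate (4x here). The function name `downsample_by_averaging` is made up for this example.

```python
def downsample_by_averaging(samples, factor=4):
    """Average each group of `factor` input samples into one output sample.

    A trailing group shorter than `factor` is dropped.
    """
    out = []
    for start in range(0, len(samples) - factor + 1, factor):
        group = samples[start:start + factor]
        out.append(sum(group) / factor)
    return out
```

For example, `downsample_by_averaging([0, 0, 4, 4, 8, 8, 8, 8])` gives `[2.0, 8.0]`: the step between 0 and 4 is smoothed into an in-between value rather than snapping to one side.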

To get noise to sound correct, I recommend manually averaging it over the cycles run per sample (i.e. having a "mini" downsampler inside the noise generator). It's the one thing on the NES that really won't sound right without significant oversampling.
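
One way the "mini downsampler inside the noise generator" could look, as a hedged Python sketch. The class `Noise` and its method names are hypothetical, and it leaves out the envelope, length counter and period lookup table; it only shows a 15-bit shift register (feedback from bit 0 XOR bit 1, the non-looped mode) whose per-cycle output is summed and averaged each time the emulator asks for an output sample.

```python
class Noise:
    """Hypothetical minimal noise channel: 15-bit LFSR plus a box average."""

    def __init__(self, period, volume):
        self.period = period   # CPU cycles between shifts (assumed >= 1)
        self.volume = volume   # 0..15
        self.timer = period
        self.lfsr = 1          # shift register, must start nonzero
        self.total = 0         # sum of per-cycle outputs since last sample
        self.cycles = 0        # number of cycles accumulated

    def clock(self):
        """Advance one CPU cycle and accumulate the current output."""
        self.timer -= 1
        if self.timer == 0:
            self.timer = self.period
            feedback = (self.lfsr ^ (self.lfsr >> 1)) & 1  # bit 0 XOR bit 1
            self.lfsr = (self.lfsr >> 1) | (feedback << 14)
        # The channel is silent while bit 0 of the register is set.
        self.total += 0 if (self.lfsr & 1) else self.volume
        self.cycles += 1

    def take_sample(self):
        """Return the average output since the last call, then reset."""
        avg = self.total / self.cycles if self.cycles else 0.0
        self.total = 0
        self.cycles = 0
        return avg
```

The emulator would call `clock()` once per CPU cycle and `take_sample()` once per output sample, so fast noise periods come out as a fractional level rather than a randomly picked on/off value.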
Bowie90333212391
Posts: 11
Joined: Sun Aug 16, 2015 9:02 am

Re: Understanding sound for the first time

Post by Bowie90333212391 »

Okay, I'll take a look at the resources you guys posted here. I'm using SDL, and I'm just trying to understand the basics, including the terminology. I guess my issue is that I lack the prerequisite knowledge to understand sound (literally zero knowledge). That was pretty clear from my reference to DSP, which I erroneously thought was necessary just to play the sound.

I also found this post viewtopic.php?t=8491 which is a very useful resource.

I'll read through everything and see what I can get from all this. Thanks guys.
Disch
Posts: 1848
Joined: Wed Nov 10, 2004 6:47 pm

Re: Understanding sound for the first time

Post by Disch »

THE BASICS:

If you've ever used a music player that has an oscilloscope view, you'll be able to get the idea easier. Here's a video of an actual oscilloscope:
https://youtu.be/pdC_aITNFG0

Basically an oscilloscope lets you see the movement of the sound wave. These waves roughly correspond to the movement of the cone inside your speaker. No movement = no sound.
"Wider" waves are lower pitch.
"Taller" waves are louder.


"PCM" is a digital recording of such a sound wave. If you open a wav or other kind of PCM audio file in a wave editor like GoldWave or SoundForge, you'll be able to see the entire wave all at once on a graph. Here's an example:
http://imgur.com/a/uSTb5
Digital audio is just a really long wave/graph like that.


If you think of it in terms of speaker movement, the 'Y' axis is the position of the speaker, and the 'X' axis is time.



A "sample" is merely 1 point on that graph. Since computers can't really record the analog movement of a speaker, they "sample" it -- or take a snapshot of it -- many thousands of times per second. You can think of it the same way as video -- in video you have a "frame" which is a snapshot of what is being output to the monitor at the given time, whereas with audio you have a "sample" which is a snapshot of the speaker cone position at the given time.

Video needs to have a 'framerate' indicating the time between individual frames. 60 FPS is going to move through individual frames twice as fast as 30 FPS, and thus produce smoother output for the user. Similarly, audio has a 'samplerate' indicating the time between individual samples. 44100 samples per second is going to move through individual samples twice as fast as 22050 samples per second -- and just like with video, higher samplerates typically produce smoother output*. Samplerates are measured in hertz (Hz), which basically is just "times per second". So 44100 Hz == 44100 samples per second.



The hardest part of NES audio emulation -- and the reason for all this talk about "downsampling" the others are throwing at you -- is that the NES outputs 1 sample every CPU cycle. This means it has a samplerate of 1789772.7272 Hz (on NTSC) ... much higher than your computer outputs. Your emulator will probably only want to output 44100 Hz, which means you have to find a way to transform the NES samplerate down to the more reasonable PC samplerate.


The easiest way to do this is a "nearest neighbor" approach. 1789772.7272 / 44100 = ~40.58 ... so you can get away with outputting 1 sample every ~40.58 CPU cycles and simply dropping the rest. The sound quality won't be great, but it'll be recognizable.
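
A hedged Python sketch of the nearest-neighbor idea above. Since the ratio is not a whole number, this version uses a running accumulator so that it keeps a sample every 40 or 41 cycles, averaging out to ~40.58. The names `CPU_HZ`, `OUT_HZ` and `nearest_neighbor` are made up for this example.

```python
CPU_HZ = 1789772.7272  # NTSC CPU clock, one APU output per cycle
OUT_HZ = 44100         # assumed PC output rate


def nearest_neighbor(cpu_samples, cpu_hz=CPU_HZ, out_hz=OUT_HZ):
    """Keep one input sample per output period and drop the rest."""
    out = []
    acc = 0.0
    for s in cpu_samples:
        acc += out_hz
        if acc >= cpu_hz:      # one full output period has elapsed
            acc -= cpu_hz      # keep the fractional remainder
            out.append(s)
    return out
```

With toy rates it is easy to follow by hand: `nearest_neighbor(list(range(10)), cpu_hz=10, out_hz=4)` keeps inputs 2, 4, 7 and 9, i.e. 4 outputs for every 10 inputs.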

A better approach would be averaging (strictly speaking a simple box filter rather than "linear interpolation") -- which could be done by creating an average of the output over each ~40.58 cycles and outputting that. This is a bit more CPU intensive, but will sound MUCH better and is totally passable as far as quality goes.
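
The averaging approach can be sketched the same way, again only as an illustration with made-up names: instead of keeping one sample and dropping the rest, sum everything seen during the output period and emit the mean.

```python
def average_downsample(cpu_samples, cpu_hz=1789772.7272, out_hz=44100):
    """Emit the mean of all input samples seen during each output period."""
    out = []
    acc = 0.0     # fractional position within the current output period
    total = 0.0   # running sum of inputs in this period
    count = 0     # number of inputs in this period
    for s in cpu_samples:
        total += s
        count += 1
        acc += out_hz
        if acc >= cpu_hz:
            acc -= cpu_hz
            out.append(total / count)
            total = 0.0
            count = 0
    return out
```

With toy rates, `average_downsample(list(range(10)), cpu_hz=10, out_hz=4)` gives `[1.0, 3.5, 6.0, 8.5]`. Each output reflects every input in its window, which is why short pulses no longer vanish or get exaggerated depending on where the kept sample happens to land.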

There are other techniques that are more complicated, but get really good quality with lower CPU demand. Blargg wrote a document on "Band limited synthesis" which is the technique I use. It's worth checking out later, but I would not recommend trying it for your very first time.





* There's a limit as to how high of a samplerate/framerate you need, though. Human hearing is not perfect and once you reach a certain samplerate you don't really gain anything by going any higher. 44100 Hz is generally accepted to be that threshold which is why it's the common standard.
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Re: Understanding sound for the first time

Post by rainwarrior »

Disch wrote:* There's a limit as to how high of a samplerate/framerate you need, though. Human hearing is not perfect and once you reach a certain samplerate you don't really gain anything by going any higher. 44100 Hz is generally accepted to be that threshold which is why it's the common standard.
The recommended samplerate these days is 48000 Hz.

This doesn't have much to do with human hearing, it's just that most devices these days will use that by default instead of 44100 Hz (which used to be the most common), and a lot of drivers have crummy resamplers that add unpleasant ringing/distortion to source audio that is delivered to the driver at 44100 Hz.
Disch
Posts: 1848
Joined: Wed Nov 10, 2004 6:47 pm

Re: Understanding sound for the first time

Post by Disch »

I thought they settled on 44 kHz back in the CD days because human hearing can't pick up frequencies above 22 kHz.

But whatever, I'll take your word for it. =)
lidnariq
Posts: 11432
Joined: Sun Apr 13, 2008 11:12 am

Re: Understanding sound for the first time

Post by lidnariq »

Wikipedia has, of course, much ink spilled about the origin of the 44.1kHz sample rate.
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Re: Understanding sound for the first time

Post by rainwarrior »

I meant that the standard samplerate was changed to 48000 Hz, but not to accommodate human hearing (44100 Hz already encodes enough information to cover human hearing, but 48000 Hz had some implementation advantages).
TmEE
Posts: 960
Joined: Wed Feb 13, 2008 9:10 am
Location: Norway (50 and 60Hz compatible :P)

Re: Understanding sound for the first time

Post by TmEE »

Since the introduction of AC97, pretty much all sound cards have a 24.576000 MHz clock as the only thing connected to the codec chip or ADC or DAC, making them based on 32000 Hz, 48000 Hz and their multiples (32000 * 768, 48000 * 512). 44100 Hz (and its multiples) are based on a 33.868800 MHz or 16.934400 MHz clock (44100 * 768 or 384).
If possible, 48/64/96/128/144/192 kHz should be the preferred choice; it will avoid some most probably nasty resampling on the OS/driver side on most hardware.
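
The clock arithmetic above can be checked with a few lines of Python. This is only a sanity check of the division, with made-up names, and says nothing about how a particular codec actually derives its rates.

```python
CLOCK_48K_FAMILY = 24_576_000   # 24.576 MHz
CLOCK_44K_FAMILY = 33_868_800   # 33.8688 MHz


def divides_evenly(clock_hz, sample_rate_hz):
    """True if the sample rate is an integer division of the clock."""
    return clock_hz % sample_rate_hz == 0


for rate in (32000, 44100, 48000, 96000, 192000):
    print(rate,
          divides_evenly(CLOCK_48K_FAMILY, rate),
          divides_evenly(CLOCK_44K_FAMILY, rate))
```

32000, 48000, 96000 and 192000 Hz divide evenly only into the 24.576 MHz clock, while 44100 Hz divides evenly only into the 33.8688 MHz one, so a card built around the first clock cannot produce 44100 Hz by simple division.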