
granular synthesis analysis tool

Posted: Tue Jun 23, 2009 6:12 pm
by lidnariq
About a year ago I wrote a tool that automatically takes a (monaural) audio file and converts it to a series of bytes you can write to the NES noise channel control registers. This is specifically useful for making percussive sounds, and works best on snare drum and hi-hat samples.

I kept on meaning to go and make it less fragile but never got around to it. I figure it'd be better to post it here now, so that either 1- someone else benefits from it 2- someone else inspires me to fix it.

There are a few samples inside. You are probably most interested in noise.asm and noise.obj. (I'm using the xa65 assembler, because it's the only one in Debian.) The actual converter itself is analyze.pl -- it's verbosely documented, so if you want to rewrite it in a different language you should be ok.

How it works:
The tool takes a reference set of sounds that the noise hardware can generate, takes their FFTs, and then compares these to the FFTs of the file to be converted, 1/60th of a second of audio at a time, finding the best match for each slice. (This is a kind of granular synthesis.)
Caveats:
* This tool needs perl, PDL, and PDL::Audio.
* I wrote it on Linux. It might work on Windows, but I've not even looked at it.
* I generated the reference audio using FCEU, which I'm certain isn't very accurate.
* The closest-FFT algorithm is the wrong one to choose for the short-loop mode of the noise hardware, so tonal things convert badly.
* Conversion is simply wrong for files not sampled at 48kHz (the result is a tempo-preserving pitch shift of the file by 48kHz/input rate).
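
The matching step described under "How it works" can be sketched roughly as below. This is not the code from analyze.pl: the function names, the naive DFT (used only to stay dependency-free) and the squared-error distance between magnitude spectra are all assumptions made for illustration, and the sketch assumes the input and the reference sounds share one sample rate.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (slow, but needs no libraries)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def best_match(frame, reference_spectra):
    """Index of the reference spectrum closest to this frame's spectrum.

    Assumption: "closest" means smallest squared error between magnitudes.
    """
    spectrum = magnitude_spectrum(frame)

    def distance(ref):
        return sum((a - b) ** 2 for a, b in zip(spectrum, ref))

    return min(range(len(reference_spectra)),
               key=lambda i: distance(reference_spectra[i]))

def convert(samples, sample_rate, reference_spectra, frames_per_second=60):
    """Cut the input into 1/60 s grains and pick one reference per grain."""
    frame_len = sample_rate // frames_per_second  # 800 samples at 48 kHz
    return [
        best_match(samples[start:start + frame_len], reference_spectra)
        for start in range(0, len(samples) - frame_len + 1, frame_len)
    ]
```

Each returned index would then be looked up in a table of register values, one entry per reference sound; a trailing partial frame is simply dropped here.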

Anyway: http://eamp.org/li/noisegrain.7z [937kB] It's only so big because I'm including the reference noisewaveforms.wav -- generating it involved stupid hackery.


edit x2: Please use the C version, preanalyze.c (6.28 KiB); it doesn't have any of the build obnoxiousness of the PDL version. preanalyze2.c (8.58 KiB) is the same but also produces a residual (FFT resynthesis of error signal).
A demonstration NES image for the C version is at newnoise.nes (16.02 KiB), or read on from http://nesdev.com/bbs/viewtopic.php?p=69538#p69538

Posted: Wed Jun 24, 2009 1:59 am
by Bregalad
Wow, this sounds absolutely awesome! Very useful for percussion and sound effects too.

Posted: Wed Jun 24, 2009 3:33 am
by neilbaldwin
Cool idea but somewhat limited if only using the noise channel. There are pitch/tonal qualities to many percussion sounds that can't be simulated with noise alone.

I always found the best way was to do it by ear - load a drum sample into a sample editor and play it back at 1/2 or 1/4 speed. You can then clearly hear the changes in pitch over time and mimic this in your wavetable tables.

Having said that, I didn't actually get to hear the generated sounds (for some reason many of the files came out garbage when I unpacked the archive) so they could be awesome :)

Posted: Wed Jun 24, 2009 4:32 am
by Dwedit
The reference sounds good enough. The noise channel isn't that hard to emulate correctly once you understand how the shift register works.
Can't test it now though, since Cygwin's perl doesn't come with PDL.
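
For readers unfamiliar with the shift register mentioned above, here is a minimal sketch of the commonly documented behaviour: a 15-bit register whose feedback is bit 0 XOR bit 1 in the normal mode, or bit 0 XOR bit 6 in the short-loop mode, with the channel silenced whenever bit 0 is set. The function names are made up for this example, and timing (the period table and frame counter) is left out.

```python
def lfsr_step(reg, short_mode=False):
    """Advance the 15-bit noise shift register by one clock."""
    tap = 6 if short_mode else 1
    feedback = (reg ^ (reg >> tap)) & 1
    return (reg >> 1) | (feedback << 14)

def lfsr_period(seed=1, short_mode=False):
    """Count clocks until the register returns to its starting value."""
    reg = lfsr_step(seed, short_mode)
    steps = 1
    while reg != seed:
        reg = lfsr_step(reg, short_mode)
        steps += 1
    return steps

def noise_bits(count, short_mode=False, seed=1):
    """First `count` output bits: 1 = channel audible, 0 = silenced."""
    reg = seed
    bits = []
    for _ in range(count):
        reg = lfsr_step(reg, short_mode)
        bits.append(0 if reg & 1 else 1)  # bit 0 set mutes the output
    return bits
```

With a seed of 1 the normal mode repeats after 32767 clocks; the short-loop mode repeats after 93 or 31 clocks depending on the register contents when the mode is entered, which is why it sounds tonal.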

Posted: Wed Jun 24, 2009 6:24 am
by lidnariq
neilbaldwin wrote:Cool idea but somewhat limited if only using the noise channel. There are pitch/tonal qualities to many percussion sounds that can't be simulated with noise alone.
Yeah -- I've seen the simulators that use one sine and one white noise source behind a VCF, which work remarkably well. My rationale for not digging into that was 1- it's a lot harder 2- the NES only has 5 channels so I'd rather see what I can do with one. Since the noise channel only has 512 (really 481) unique sounds it can make, searching the space by brute force is straightforward.
neilbaldwin wrote:I always found the best way was to do it by ear - load a drum sample into a sample editor and play it back at 1/2 or 1/4 speed. You can then clearly hear the changes in pitch over time and mimic this in your wavetable tables.
I never thought of trying to do it by ear -- but that makes sense.
neilbaldwin wrote:Having said that, I didn't actually get to hear the generated sounds (for some reason many of the files came out garbage when I unpacked the archive) so they could be awesome :)
Some of them are compellingly close, many are so-so. However, I should fix this problem -- does 7z complain about the files, or something else? (MacOS, Windows, and Linux disagree about what marks a newline in a text file -- if it's just the text files that are giving you trouble)

Posted: Wed Jun 24, 2009 6:26 am
by tepples
It might be easier just to fit a 2-pole linear predictive model to the sound using the Levinson-Durbin algorithm, determine its pole frequency, and find the noise pitch that has the closest pole frequency.
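
A rough sketch of the first half of that suggestion, estimating the pole frequency of a 2-pole linear predictive model, might look like this. The function names are invented for the example, the autocorrelation is the plain (biased) sum, and mapping the result onto a noise period is left open because the thread does not say how each noise setting's pole frequency would be obtained.

```python
import math

def autocorrelation(samples, max_lag):
    """Autocorrelation r[0..max_lag] of one frame."""
    n = len(samples)
    return [sum(samples[i] * samples[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Coefficients a[1..order] of A(z) = 1 + sum(a[k] * z**-k)."""
    a = [1.0] + [0.0] * order
    error = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        reflection = -acc / error
        prev = a[:]
        for k in range(1, m):
            a[k] = prev[k] + reflection * prev[m - k]
        a[m] = reflection
        error *= 1.0 - reflection * reflection
    return a[1:]

def pole_frequency(samples, sample_rate):
    """Resonant frequency in Hz of the 2-pole model, or None if there is none."""
    r = autocorrelation(samples, 2)
    if r[0] == 0:
        return None  # silent frame
    a1, a2 = levinson_durbin(r, 2)
    if a1 * a1 >= 4.0 * a2:
        return None  # real poles: no resonance to report
    angle = math.acos(-a1 / (2.0 * math.sqrt(a2)))
    return angle * sample_rate / (2.0 * math.pi)
```

Feeding it a pure sine should return roughly that sine's frequency; on real drum hits the estimate would be taken frame by frame.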

Posted: Wed Jun 24, 2009 7:01 am
by neilbaldwin
Don't get me wrong, I think it's a noble effort and if you could make it work with pitch too I'd definitely use it.

I'm currently writing a NES audio engine (known around these parts as 'Nijuu') and my drum synthesis wavetables use any or all of the NES hardware voices. I recently uploaded an editor I made on the NES itself to enable you to live-preview your sounds:

http://dutycyclegenerator.com

(apologies for the lack of navigation, scroll down to bottom and work up)

Re. 7z. That was my fault. Years ago I'd installed some stand-alone (flaky) unzipper for 7z files. I've just thrown it in the bin and unarchived it properly and it's not corrupted any more.

I've still not heard the sounds yet though - the .NES file won't open in Nestopia (I'm on OSX).

Posted: Wed Jun 24, 2009 7:25 am
by neilbaldwin
I got the code compiling with ASM6. Pretty interesting experiment, though I was right about the sounds ;)

Posted: Wed Jun 24, 2009 9:55 am
by B00daW
I would really like to see this capable of creating FTI (FamiTracker Instrument) files for the noise channel.

Posted: Wed Jun 24, 2009 5:15 pm
by B00daW
Thought I'd help those who are too lazy to compile xa65 for themselves.

http://average.truechiptilldeath.com/ne ... .5-w32.zip

There's the "Windows version" compiled with Cygwin.

Posted: Wed Jun 24, 2009 5:47 pm
by B00daW
Also... I can almost get it working in Windows, but it is really picky on the version of Types.pm used. Could you supply that please?

Posted: Wed Jun 24, 2009 8:20 pm
by lidnariq
B00daW wrote:Also... I can almost get it working in Windows, but it is really picky on the version of Types.pm used. Could you supply that please?
That's so weird. Anyway, http://eamp.org/Types.pm
tepples wrote:It might be easier just to fit a 2-pole linear predictive model to the sound using the Levinson-Durbin algorithm, determine its pole frequency, and find the noise pitch that has the closest pole frequency.
That would make sense... My background is more in DIP than audio, so I've not really done any work on pole estimation. It didn't help that my preceding class on DSP used a book where the author was very down on same.

The other problem is that the different values written to $400E have different loudnesses, too. Which is why I'm searching the entire 512-entry space of frequency and volume rather than just the 32-entry space of frequency.
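
To make the size of that search space concrete, the sketch below enumerates every combination as a pair of register bytes. It assumes the usual register layout ($400C bits 0-3 are the volume, with bits 4 and 5 set for constant volume and length-counter halt; $400E bit 7 is the mode and bits 0-3 the period index). The "481" reading, that all volume-0 settings collapse into one silent sound, is an interpretation of the figure quoted earlier in the thread, not something stated there.

```python
def noise_settings():
    """Yield (byte for $400C, byte for $400E) for every combination."""
    for volume in range(16):
        for mode in range(2):
            for period in range(16):
                yield 0x30 | volume, (mode << 7) | period

all_settings = list(noise_settings())          # 16 * 2 * 16 combinations
frequency_only = {e for _, e in all_settings}  # mode + period, volume ignored

# Assumption: every volume-0 setting is the same silence, so count it once.
audible = [s for s in all_settings if s[0] & 0x0F]
distinct = len(audible) + 1

print(len(all_settings), len(frequency_only), distinct)
```

A brute-force matcher would compare each input frame against one reference spectrum per entry in `all_settings`.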
neilbaldwin wrote:Don't get me wrong, I think it's a noble effort and if you could make it work with pitch too I'd definitely use it.
Do you mean fix the sample-rate thing? Or using two voices?

Posted: Wed Jun 24, 2009 11:49 pm
by neilbaldwin
lidnariq wrote:
neilbaldwin wrote:Don't get me wrong, I think it's a noble effort and if you could make it work with pitch too I'd definitely use it.
Do you mean fix the sample-rate thing? Or using two voices?
Yep, using multiple voices. I appreciate that that may make it massively more complicated.

Seems like I'm the only one concerned about this though so don't go out of your way to accommodate me - I'm quite happy using my little editor.

Though if you like a challenge...... :)

Posted: Thu Jun 25, 2009 2:09 am
by Bananmos
"Absolutely awesome" sounds like a mild compliment in this case. I would never have imagined that you could make the noise channel sound so much like real samples while still only updating its registers once per frame. It's always fun to see how people's accomplishments in NES music still surprise us.

Posted: Thu Jun 25, 2009 5:54 am
by B00daW
Dwedit wrote:The reference sounds good enough. The noise channel isn't that hard to emulate correctly once you understand how the shift register works.
Can't test it now though, since Cygwin's perl doesn't come with PDL.
Check out ActivePerl and use the xa65 build above. I just ran the Makefile.PL in the root of PDL and dumped whatever libs from PDL it needed into Perl's /lib directory.