ADPCM codec?

tokumaru
Posts: 11858
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru » Sat Nov 17, 2007 10:07 pm

In this post (http://nesdev.com/bbs/viewtopic.php?p=28396#28396), tepples wrote: Eventually, I said screw it, I'm just writing my own ADPCM playback engine.
That would rock! Any progress? :D

tepples
Posts: 22052
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples » Sat Nov 17, 2007 10:29 pm

It depends on motivation. The bitrate would be the same as for DPCM (about 32 kbit/s), just possibly better quality. If 3/4 of a 512 KB ROM were to be used, that would be about 1.5 minutes of audio. What kind of game would benefit from that much recorded speech?
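
The arithmetic behind that estimate can be checked with a short sketch. It assumes 1 bit per sample at the fastest NTSC DPCM rate (about 33.1 kHz, which is presumably the figure behind "about 32 kbit/s"); the function and constant names are made up for illustration.

```python
# Rough playback-time estimate for a 1-bit-per-sample stream.
# Assumption: NTSC CPU clock and the fastest DMC period (54 CPU cycles per bit).
NTSC_CPU_HZ = 1_789_773
FASTEST_DMC_PERIOD = 54  # CPU cycles per output bit at rate $F


def playback_seconds(rom_bytes, bits_per_second):
    """Seconds of audio that fit in rom_bytes at the given bitrate."""
    return rom_bytes * 8 / bits_per_second


bitrate = NTSC_CPU_HZ / FASTEST_DMC_PERIOD   # a bit over 33,000 bit/s
budget = 512 * 1024 * 3 // 4                 # 3/4 of a 512 KB ROM, in bytes
seconds = playback_seconds(budget, bitrate)
print(f"{bitrate:.0f} bit/s, {seconds:.1f} s, {seconds / 60:.2f} min")
```

At a nominal 32 kbit/s the same budget comes to a little over 98 seconds, so "about 1.5 minutes" holds either way.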

tokumaru
Posts: 11858
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru » Sat Nov 17, 2007 10:45 pm

tepples wrote: What kind of game would benefit from that much recorded speech?
Yeah, I didn't expect it to be smaller than DPCM. Hmm... I can't think of anything very amusing with that little speech, but maybe we should look into speech synthesis... I've seen programs reproducing speech from just a few different audio clips (I really don't know how that works, but that's why I said "look into it"). Maybe it would be possible to have a bunch of small clips that, when correctly combined, would produce intelligible sentences. Sure, it'd sound a bit robotic, but it would be your NES talking with a robotic voice, which fits! Maybe some sort of artificial intelligence program? Sure sounds like a novelty to me.

EDIT: Seems like we could make the NES speak Spanish: http://en.wikipedia.org/wiki/Diphone

Memblers
Site Admin
Posts: 3880
Joined: Mon Sep 20, 2004 6:04 am
Location: Indianapolis

Post by Memblers » Sat Nov 17, 2007 10:59 pm

tokumaru wrote: I've seen programs reproducing speech from just a few different audio clips (I really don't know how that works, but that's why I said "look into it"). Maybe it would be possible to have a bunch of small clips that when correctly combined would produce intelligible sentences.
I made a speech synth that works like that. I don't think I ever released it, but you can hear it on the last track of my Chipography NSF. Using DPCM (and not at the highest sample rate; I think it was rate $C, IIRC), the samples barely fit in 16 KB and I had to trim a little bit off of them.

So the quality would benefit quite a bit from ADPCM, but it could be a bit more cumbersome. My speech synth is entirely IRQ-driven, so you can do anything you want while it's talking.
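
For a sense of scale, here is a minimal sketch of how much DPCM audio 16 KB holds. It assumes the commonly documented NTSC DMC rate table and takes the "$C" recollection above at face value; `dpcm_seconds` is a hypothetical helper, not part of any existing tool.

```python
NTSC_CPU_HZ = 1_789_773
# NTSC DMC rate table: CPU cycles per output bit for rate indices $0-$F.
DMC_PERIODS = [428, 380, 340, 320, 286, 254, 226, 214,
               190, 160, 142, 128, 106, 84, 72, 54]


def dpcm_seconds(sample_bytes, rate_index):
    """Playback time of 1-bit DPCM data at the given DMC rate index."""
    bits_per_second = NTSC_CPU_HZ / DMC_PERIODS[rate_index]
    return sample_bytes * 8 / bits_per_second


for rate in (0xC, 0xF):
    print(f"rate ${rate:X}: {dpcm_seconds(16 * 1024, rate):.2f} s in 16 KB")
```

Under those assumptions the whole clip bank is under eight seconds of audio at rate $C, and roughly half that at the highest rate.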

Bregalad
Posts: 7951
Joined: Fri Nov 12, 2004 2:49 pm
Location: Chexbres, VD, Switzerland

Post by Bregalad » Sun Nov 18, 2007 3:22 am

I guess the simplest language to synthesize would be Japanese, since it has only 100 diphones or so. French and English would be almost impossible to do, though.

atari2600a
Posts: 324
Joined: Fri Jun 29, 2007 10:25 pm
Location: Earth, Milkyway Galaxy, The Universe, M-Theory

Post by atari2600a » Sun Nov 18, 2007 4:18 am

Memblers wrote:
tokumaru wrote: I've seen programs reproducing speech from just a few different audio clips (I really don't know how that works, but that's why I said "look into it"). Maybe it would be possible to have a bunch of small clips that when correctly combined would produce intelligible sentences.
I made a speech synth that works like that. I don't think I ever released it, but you can hear it on the last track of my Chipography NSF. Using DPCM (and not at the highest sample rate; I think it was rate $C, IIRC), the samples barely fit in 16 KB and I had to trim a little bit off of them.

So the quality would benefit quite a bit from ADPCM, but it could be a bit more cumbersome. My speech synth is entirely IRQ-driven, so you can do anything you want while it's talking.
Would you say an NES speech synth could be produced with a more analog approach, using a square or triangle channel, like Berzerk? Maybe when I get around to messing with the sound registers I'll try that...

          *=$0000
loop      JMP loop
          .eof

NotTheCommonDose
Posts: 523
Joined: Thu Jun 29, 2006 7:44 pm
Location: lolz!

Post by NotTheCommonDose » Sun Nov 18, 2007 11:17 am

This is something I've actually been waiting for.

tepples
Posts: 22052
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples » Sun Nov 18, 2007 11:27 am

NotTheCommonDose wrote: This is something I've actually been waiting for.
The codec, or a speech synthesizer using the codec?

Celius
Posts: 2157
Joined: Sun Jun 05, 2005 2:04 pm
Location: Minneapolis, Minnesota, United States

Post by Celius » Sun Nov 18, 2007 12:27 pm

The NES could speak awkward Japanese if you had sound samples of the following sounds being pronounced:

a, i, u, e, o, k, g, s, sh, z, t, ch, ts, n, h, p, b, f, m, r, y

If you could have a sample that was about 0.06 seconds long for each of these, you could string them together to make Japanese words.

You could maybe even skip the "y" sound, and just use "i", because "ya" sounds pretty much the same as "ia".
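
A minimal sketch of the stringing-together step, assuming words are written in romaji and each unit in the list above maps to one 0.06-second clip. `split_units` is a hypothetical helper; it only picks the clip sequence and does no audio work.

```python
# Units from the list above; "y" is kept here even though it could be dropped.
UNITS = ["a", "i", "u", "e", "o", "k", "g", "s", "sh", "z", "t", "ch",
         "ts", "n", "h", "p", "b", "f", "m", "r", "y"]
UNIT_SECONDS = 0.06  # per-clip length suggested in the post


def split_units(romaji):
    """Greedy longest-match split of a romanized word into clip names."""
    by_length = sorted(UNITS, key=len, reverse=True)  # try "sh" before "s"
    out, pos = [], 0
    while pos < len(romaji):
        for unit in by_length:
            if romaji.startswith(unit, pos):
                out.append(unit)
                pos += len(unit)
                break
        else:
            # No unit matched at this position.
            raise ValueError(f"no clip for {romaji[pos:]!r}")
    return out


for word in ("sushi", "tsunami", "chikara"):
    clips = split_units(word)
    print(word, clips, f"{len(clips) * UNIT_SECONDS:.2f} s")
```

Note that the list as written has no "w", "d", or "j", so a word containing those letters raises an error in this sketch rather than guessing a substitute.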

tepples
Posts: 22052
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples » Sun Nov 18, 2007 12:42 pm

Celius wrote: The NES could speak awkward Japanese if you had sound samples of the following sounds being pronounced:

a, i, u, e, o, k, g, s, sh, z, t, ch, ts, n, h, p, b, f, m, r, y

If you could have a sample that was about 0.06 seconds long for each of these, you could string them together to make Japanese words.
It would sound like the "animalese" from Animal Crossing.

Celius
Posts: 2157
Joined: Sun Jun 05, 2005 2:04 pm
Location: Minneapolis, Minnesota, United States

Post by Celius » Sun Nov 18, 2007 1:02 pm

Haha, yeah it would. But the animalese is just really fast. Maybe 0.06 seconds is too fast, but I've worked with a cartoon where every frame is shown for 0.07 seconds at minimum. The mouth movements were really hard to get right because that was just too slow in some places. However, these sounds are 0.06 seconds each, and two of them make a syllable, so most syllables are 0.12 seconds long at minimum. This may be a moderate speed for Japanese, but it might be too slow. I'm sure it would be hard to make it sound natural. Every sample would have to be pretty monotone to keep it from sounding like animalese.

NotTheCommonDose
Posts: 523
Joined: Thu Jun 29, 2006 7:44 pm
Location: lolz!

Post by NotTheCommonDose » Sun Nov 18, 2007 1:07 pm

The codec.

Bregalad
Posts: 7951
Joined: Fri Nov 12, 2004 2:49 pm
Location: Chexbres, VD, Switzerland

Post by Bregalad » Sun Nov 18, 2007 1:08 pm

You would have to play some samples faster than others to sound slightly more natural, such as playing the end of a sentence lower (or higher if the sentence is a question).

NotTheCommonDose
Posts: 523
Joined: Thu Jun 29, 2006 7:44 pm
Location: lolz!

Post by NotTheCommonDose » Sun Nov 18, 2007 1:31 pm

But the program will do that, right?

tepples
Posts: 22052
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples » Sun Nov 18, 2007 1:39 pm

It's difficult to make cycle-timed code that will smoothly change the pitch of a sample.
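
A rough sketch of why: it assumes a software playback loop that writes one sample every N CPU cycles on an NTSC machine, so pitch can only move by changing N. The 8 kHz base rate and both function names are arbitrary choices for illustration.

```python
NTSC_CPU_HZ = 1_789_773


def cycles_per_sample(sample_rate_hz):
    """Whole CPU cycles between output writes for a given playback rate."""
    return round(NTSC_CPU_HZ / sample_rate_hz)


def shifted_cycles(base_cycles, semitones):
    """Cycle count needed to play the same data `semitones` higher or lower."""
    return round(base_cycles / 2 ** (semitones / 12))


base = cycles_per_sample(8000)  # hypothetical base playback rate
for semitones in (-2, -1, 0, 1, 2):
    n = shifted_cycles(base, semitones)
    print(f"{semitones:+d} semitones: {n} cycles/sample ({n - base:+d})")
```

Each pitch step needs a different amount of padding in the timed loop, and a smooth glide would mean changing that padding from sample to sample while the decoding work itself stays the same length.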
