
All times are UTC - 7 hours





PostPosted: Fri Aug 05, 2016 3:25 pm 

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5891
Location: Canada
tepples wrote:
Can an all-up or all-down DMC sample reduce sample and hold artifacts?

Interesting idea. I guess at best it would be linear resampling, and at worst it would be no better than sample-and-hold. I think between the +/- 2 increment granularity of DPCM, having only a few useful frequencies to choose from, and latency of the DMC channel, the practicality seems dubious to me.

You could modify my example for a linear resampling just to inspect the upper bound of quality improvement, if you like.


PostPosted: Fri Aug 05, 2016 6:24 pm

Joined: Fri Nov 19, 2004 7:35 pm
Posts: 3968
Next question...
Let's say you synthesize an entire frame's worth of sound data once per frame. Then your IRQ handler just becomes something that spits out one byte and then increments the counter. Sine wave synthesis would still be fast, only it could be done ahead of time rather than in time with the IRQ routine. You could also run the output handler every scanline or every other scanline.

Once per scanline (15720 Hz) is 262 bytes per frame; once every other scanline (7860 Hz) is 131 bytes per frame. (Yeah, I'm multiplying by 60 here...)
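
As a quick check of those figures, here is a small Python sketch. It assumes 262 scanlines per NTSC frame and a nominal 60 frames per second, as the post does; the function name `buffer_plan` is made up for illustration.

```python
# Buffer size and playback rate when one byte is output every N scanlines.
# Assumes 262 scanlines per frame and a nominal 60 frames per second.
SCANLINES_PER_FRAME = 262
FRAMES_PER_SECOND = 60

def buffer_plan(scanlines_per_sample):
    """Return (bytes per frame, sample rate in Hz)."""
    bytes_per_frame = SCANLINES_PER_FRAME // scanlines_per_sample
    return bytes_per_frame, bytes_per_frame * FRAMES_PER_SECOND

every_line = buffer_plan(1)
every_other_line = buffer_plan(2)
print(every_line, every_other_line)
```

The buffer size is the memory cost of pre-calculating: one byte of RAM per output sample per frame.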

_________________
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!


PostPosted: Sat Aug 06, 2016 9:41 pm

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5891
Location: Canada
Yeah, possibly a nice memory for speed tradeoff.

I was thinking that this might be a reasonable use case for $2003/$2004 OAM updates. You could use it to update a small number of sprites each frame, allowing animation. Obviously it wouldn't be great for gameplay, but you could still do some animated sequences with limited sprite update bandwidth.


PostPosted: Sun Aug 07, 2016 5:51 pm

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5891
Location: Canada
I ended up getting really interested in this idea, so I decided to make a weekend project of it.

I wrote a player that software mixes 2 square channels, and outputs their result every 102 cycles (~17547 Hz), and interleaves it with updates of the other channels. No IRQs or anything, just cycle counted code making full use of the CPU (very stable timing this way). Still working on a way to make the data stream a bit smaller, but the player itself is about 750 bytes.
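
For reference, the sample rate quoted above follows directly from the cycle count. A minimal sketch, assuming the nominal NTSC CPU clock of about 1.789773 MHz (the constant and function names are only illustrative):

```python
# Sample rate of a cycle-counted output loop.
NTSC_CPU_HZ = 1_789_773  # assumed nominal NTSC CPU clock in Hz

def sample_rate(cycles_per_sample, cpu_hz=NTSC_CPU_HZ):
    """Output rate in Hz when one sample is written every N CPU cycles."""
    return cpu_hz / cycles_per_sample

rate = sample_rate(102)
nyquist = rate / 2  # square-wave harmonics above this fold back as aliasing
print(f"{rate:.0f} Hz sample rate, {nyquist:.0f} Hz Nyquist limit")
```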

The resulting aliasing at this frequency is not too bad, and it almost adds an interesting, slightly squelchy texture. I made it take VRC6 data as input. It ignores the saw channel, the DPCM channel (obviously), and sweep, and the duty on the two extra channels is always square. I could probably add a saw channel and duty control for a full VRC6 recreation, but the samplerate would be reduced (and aliasing increased). I'm not sure I care to do that at this point; I really wanted something to make new music with, not to duplicate the VRC6 specifically. I used VRC6 just because it's easy to make data for in Famitracker (and since the squares are 16-bit phase oscillators, they have extended frequency range, so the VRC6's extra bit helps).


Here is a demo playing a modified version of a tune from Akumajou Densetsu (the Famicom Castlevania III), but with my player substituting for the VRC6. (I also added some extra percussion to the noise channel to make up for the DPCM being lost, and moved the saw channel to the triangle, which was unused.)

http://rainwarrior.ca/projects/nes/ad_sound_demo.nsf


Last edited by rainwarrior on Sun Aug 07, 2016 10:36 pm, edited 1 time in total.

PostPosted: Sun Aug 07, 2016 7:01 pm
Site Admin

Joined: Mon Sep 20, 2004 6:04 am
Posts: 3487
Location: Indianapolis
I was thinking of hacking something up based on this thread too. But that NSF sounds brilliant, rainwarrior.

Bregalad's volume control trick is pretty cool. As a composer I've never been a big fan of the sine wave sound though, it's kinda... naked. I can remember doing some tracks where every channel used sine waves and I liked that well enough, but I've never felt the urge to mix a sine wave with other wave channels. Though I suppose in practice it doesn't sound all that different from a triangle wave (but I've never tried multi-channel triangle waves either). I'm sure others could do something cool with it though.

Anyways, the idea I had was to try a 4-bit arbitrary waveform with 4-bit volume control. That way you could just OR them together and use a 256-byte LUT.
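
A rough sketch of how such a table could be built, in Python for illustration. The nibble order (volume in the high nibble), the linear scaling, and the 0..15 output range are all assumptions; the post does not specify them.

```python
# 256-byte lookup table indexed by (volume << 4) | sample.
# Assumptions: volume sits in the high nibble, scaling is linear,
# and the scaled result stays in the 0..15 range.
def build_volume_lut():
    lut = bytearray(256)
    for volume in range(16):
        for sample in range(16):
            lut[(volume << 4) | sample] = (sample * volume) // 15
    return lut

LUT = build_volume_lut()

def scaled_sample(sample4, volume4):
    """OR the two nibbles together and look up the scaled sample."""
    return LUT[(volume4 << 4) | sample4]

print(scaled_sample(15, 15), scaled_sample(10, 6), scaled_sample(15, 0))
```

On the 6502 the same lookup would be an OR followed by one indexed load, so the multiply costs nothing at run time; only the table occupies ROM.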

rainwarrior wrote:
You'd get less distortion if you could skip the update and catch up instead. Shifting the whole timeline back causes worse problems than a sample-and-hold over a missed sample while retaining consistent timing.


Yeah, definitely. I remember expecting my original Squeedo synth to sound like crap with sprite DMA, but it was hardly noticeable for just that reason. The synth itself kept running, the NES just missed the samples and held the last one during DMA. With just the 2A03 I suppose we'd need the update rate to be divisible into 512 (or 513) to keep things kinda sane. I think Dwedit's pre-calculating suggestion is probably easier to deal with.

Actually one of my WIP mapper designs (another el-cheapo one) could have an 8-bit CPU cycle counter at no extra cost really, I figured it'd be perfect for something like this if someone was crazy enough to use it in something.


PostPosted: Sun Aug 07, 2016 8:46 pm

Joined: Sat Jul 12, 2014 3:04 pm
Posts: 950
rainwarrior wrote:
I made it to take input as a VRC6. It ignores the saw channel, the DPCM channel (obviously), sweep, and the duty on the two extra channels is always square.…Akumajou Densetsu (famicom Castlevania III) but with my player substituting for the VRC6. (I also added some extra percussion to the noise channel to make up for the DPCM being lost, and moved the saw channel to the triangle which was unused.)

For those without a 2ear03, was there any sweep, and were the extra channels nonsquare in the original piece?


PostPosted: Sun Aug 07, 2016 8:50 pm

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5891
Location: Canada
I made a ROM version in case that's more fun:
http://rainwarrior.ca/projects/nes/ad_sound_demo.nes

Myask wrote:
For those without a 2ear03, was there any sweep, and were the extra channels nonsquare in the original piece?

The original piece didn't use sweep, but it definitely used various non-square duties, and it had a saw channel too (moved in my arrangement to the triangle, which it didn't use for anything). It was also using DPCM for drums, which I added stuff to the noise channel to replace. It's not supposed to be an exact replica, or anything, just a proof of concept.


PostPosted: Mon Aug 08, 2016 3:44 am

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19335
Location: NE Indiana, USA (NTSC)
Memblers wrote:
I can remember doing some tracks where every channel used sine waves and I liked that well enough, but I've never felt the urge to mix a sine wave with other wave channels. Though I suppose in practice it doesn't sound all that different from a triangle wave (but I've never tried multi-channel triangle waves either).

Years ago, I did composition with multi-channel triangle waves, albeit with others for percussion (ogg | tracker source). I eventually remade it on the NES as a stress test for Pently's channel interruption features (ogg | nsf | Pently source).


PostPosted: Mon Aug 08, 2016 10:06 am

Joined: Fri Nov 12, 2004 2:49 pm
Posts: 7314
Location: Chexbres, VD, Switzerland
This demo indeed sounds very good! I am surprised that you were able to get square waves to sound that good; my personal experience was that square waves sound awful when they are at a low frequency and are neither filtered nor at a frequency which evenly divides the sampling rate.

However, correct me if I'm wrong, but you used a sample rate much, much higher than what Dwedit suggested.

I am not against the idea of buffering samples and replaying them during the frame, but it creates some other issues, such as the sampling rate needing to be a whole multiple of the VBlank rate. It's not an absolute need, but it is very helpful (hem hem, somehow this REALLY reminds me of the GBA).


PostPosted: Mon Aug 08, 2016 11:53 am

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5891
Location: Canada
Bregalad wrote:
However, correct me if I'm wrong but you used a sample rate much, much higher than what Dwedit suggested.

My goal was to make the samplerate as high as I could make practical. I figured 2 squares is enough (sort of like a compromise between MMC5 and 5B expansions), and started with something like this:
Code:
square_mix:                ; 6 (jsr)
; mixes and outputs the pseudo-MMC5 square channels to $4011
; clobbers A, flags, temp+0
; time: 78 cycles (always the last 78 cycles of a 102 cycle loop)
   lda phase0+0           ; +3 = 9
   clc                    ; +2 = 11
   adc accum0+0           ; +3 = 14
   sta phase0+0           ; +3 = 17
   lda phase0+1           ; +3 = 20
   adc accum0+1           ; +3 = 23
   sta phase0+1           ; +3 = 26
   assert_branch_page :+
   bmi :+
      lda #0             ; +2+2 = 30
      jmp :++            ; +3 = 33
   :
      lda a:volum0       ; +3+4 = 33
   :
   sta temp+0             ; +3 = 36
   lda phase1+0           ; +3 = 39
   clc                    ; +2 = 41
   adc accum1+0           ; +3 = 44
   sta phase1+0           ; +3 = 47
   lda phase1+1           ; +3 = 50
   adc accum1+1           ; +3 = 53
   sta phase1+1           ; +3 = 56
   assert_branch_page :+
   bmi :+
      lda #0             ; +2+2 = 60
      jmp :++            ; +3 = 63
   :
      lda a:volum1       ; +3+4 = 63
   :
   clc                    ; +2 = 65
   adc temp+0             ; +3 = 68
   sta $4011              ; +4 = 72
   rts                    ; +6 = 78
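
To make the behaviour of the routine above easier to follow, here is a rough Python model of it. The function names and the list-based state are mine, not part of the player; only the logic (16-bit add, test bit 15, sum the volumes) is taken from the assembly.

```python
# Model of square_mix: two 16-bit phase accumulators, each contributing
# its volume while bit 15 of the phase is set (the bmi branch), else 0.
def square_mix(phase, accum, volume):
    """Advance both oscillators one sample; return (new phases, output)."""
    out = 0
    new_phase = []
    for p, a, v in zip(phase, accum, volume):
        p = (p + a) & 0xFFFF      # 16-bit add, carry out is discarded
        new_phase.append(p)
        if p & 0x8000:            # high byte negative: upper half of the square
            out += v
    return new_phase, out         # out is what gets written to $4011

def accum_for(freq_hz, sample_rate_hz):
    """Phase increment for a pitch: one wave cycle is 65536 phase units."""
    return round(freq_hz * 65536 / sample_rate_hz)

phase = [0, 0]
outputs = []
for _ in range(4):
    phase, out = square_mix(phase, [0x8000, 0x4000], [10, 5])
    outputs.append(out)
print(outputs)
```

Since $4011 only takes a 7-bit value, the two volumes have to sum to 127 or less.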

So I started with these 78 cycles, and tried to think what the minimum I needed between them was. At first I figured I should update registers in groups of 3, to get a whole channel at once, so I set a goal of 27 cycles (LDA zp + STA (zp), Y x 3) in between each sample. That was sort of the initial basis for the choice of samplerate. (Actually, I also forgot that STA (zp), Y was 1 cycle longer than LDA (zp), Y, so I mistakenly believed it was going to be 24 cycles while working on it.)
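
The cycle arithmetic in that paragraph, spelled out as a small Python sketch. The timings are the standard 6502 ones: LDA (zp),Y takes 5 cycles when no page boundary is crossed, while STA (zp),Y always takes 6.

```python
# Standard 6502 cycle counts for the instructions mentioned above.
CYCLES = {"lda zp": 3, "lda (zp),y": 5, "sta (zp),y": 6}

# Three register updates per gap, each a load followed by an indirect store.
intended = 3 * (CYCLES["lda zp"] + CYCLES["sta (zp),y"])
# The same sum with the store mistakenly costed like LDA (zp),Y.
mistaken = 3 * (CYCLES["lda zp"] + CYCLES["lda (zp),y"])

SQUARE_MIX_CYCLES = 78
loop_cycles = SQUARE_MIX_CYCLES + mistaken  # the 24-cycle gap that was kept
print(intended, mistaken, loop_cycles)
```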

After I wrote this version, though, I realized that it didn't really matter if I updated all 3 values at once, and I could make the data a lot smaller by treating each register as its own RLE'd data stream. At that point I rewrote everything from scratch, and arbitrarily stuck with 24 cycles between sample updates. That's what determined the samplerate in the end.

I could have picked a number larger or smaller than 24, but 24 was enough, and it's hard to change now without a complete rewrite. The sample loop is easy to change, though: if that number 78 changes, the cycle count of the loop is just a constant I change in the exporter, and then I add or subtract a few "killing time" samples from the main loop to hit a target close to 60 Hz.

The main loop is pretty simple:
Code:
play:
; NMI/IRQ should be disabled
   nop3                   ; 3
@loop:
   ; each loop should do 292 sample loops * 102 cycles = 29784 cycles
   jsr delay_21           ; +21 = 24
   jsr square_mix         ; loops: 1
   jsr read_controller    ; loops: +10 = 11
   ; exit loop if masked buttons are pressed
   lda pad                ; 3 = 3
   and padmask            ; +3 = 6
   assert_branch_page :+
   beq :+
      rts
   :
   ldx #0                 ; +3+2 = 11
   ldy #0                 ; +2 = 13
   stx chan               ; +3 = 16
   nop
   nop
   nop
   nop                    ; +8 = 24
   jsr square_mix         ; loops: +1 = 12
   .repeat 18
      jsr update_channel ; loops: +(18*14) = 264
   .endrepeat
   .repeat 28
      jsr nop_loop       ; loops: +28 = 292
   .endrepeat
   jmp @loop              ; 3

At this point, though, it'd be hard to add new features without sacrificing something else; probably the samplerate would have to drop significantly. With any more data streams (e.g. 3 more streams for a saw channel) it's easy to just add a couple of new ones to the main loop, but what's not easy at this point is fitting anything else within 60 Hz. There are only 28 "spare" samples right now and each stream takes 14 samples to update. If I use a longer loop length, I get fewer samples, though... so basically I'd have to increase the in-between (24 cycle) code until I can do more per sample too.
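
The per-frame budget of the main loop can be tallied like this. This is a Python sketch using the sample counts from the loop comments and an assumed nominal NTSC CPU clock; the dictionary keys are only labels.

```python
# Sample-slot budget of one pass through the main loop.
NTSC_CPU_HZ = 1_789_773  # assumed nominal NTSC CPU clock in Hz
CYCLES_PER_SAMPLE = 102
SAMPLES_PER_STREAM = 14

slots = {
    "first square_mix": 1,
    "read_controller": 10,
    "second square_mix": 1,
    "update_channel x18": 18 * SAMPLES_PER_STREAM,
    "nop_loop x28": 28,
}

total_samples = sum(slots.values())
cycles_per_loop = total_samples * CYCLES_PER_SAMPLE
loop_hz = NTSC_CPU_HZ / cycles_per_loop
extra_streams_that_fit = slots["nop_loop x28"] // SAMPLES_PER_STREAM
print(total_samples, cycles_per_loop, round(loop_hz, 2), extra_streams_that_fit)
```

With only two streams' worth of spare slots, the three streams a saw channel would need do not fit without changing the loop length.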

Eventually I could find a balance that fits a given set of features, but I figure I'm already getting a lot of musical capability out of just two squares, and rewriting the in-betweens is tedious, so I don't really want to add more features at this point. (I'll release source eventually, in case others want to play with it.) Rewriting the in-betweens isn't too bad, actually, maybe 150 lines of code, but I'm not much interested in doing that over and over, maybe I'd do it if I had a cool idea.

The one last thing I want to do is add some internal repeat detection to the exporter, which should help reduce the data stream sizes a lot more. Updating a channel really only takes about 7 samples ideally, but I basically made it twice as long to accommodate this repeat feature. Currently it's just looping at the end of the song, but it can actually be used for arbitrary repeats.

Oh, I should also mention that I borrowed shiru's lightweight 2A03 emulator from famitone2/nsf2data to make my exporter. It came in very handy.

Also, since it's entirely CPU driven, if you run it on PAL everything adjusts to the PAL CPU rate, so it stays in tune but slows down by ~7% (56 Hz). Much less of a change than the usual 20% 60/50 Hz difference, so for cross-region code you might even just leave it as-is, or you could just adjust the number of empty sample loops per frame to bring it back up to 60 Hz.
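
A sketch of the PAL numbers, in Python. The clock constants are assumed nominal values for the NTSC and PAL CPUs, and the variable names are only illustrative.

```python
# Update rate of the same 292-sample loop on NTSC and PAL CPUs.
NTSC_CPU_HZ = 1_789_773  # assumed nominal clocks in Hz
PAL_CPU_HZ = 1_662_607
CYCLES_PER_SAMPLE = 102
SAMPLES_PER_LOOP = 292

def loop_rate(cpu_hz, samples=SAMPLES_PER_LOOP):
    """Passes through the main loop per second."""
    return cpu_hz / (samples * CYCLES_PER_SAMPLE)

ntsc_hz = loop_rate(NTSC_CPU_HZ)
pal_hz = loop_rate(PAL_CPU_HZ)
slowdown_percent = (1 - pal_hz / ntsc_hz) * 100

# Sample loops per pass that would bring PAL back to roughly 60 Hz.
pal_samples_for_60 = round(PAL_CPU_HZ / (60 * CYCLES_PER_SAMPLE))
print(round(pal_hz, 2), round(slowdown_percent, 1), pal_samples_for_60)
```

Under these assumptions, dropping 20 of the 28 empty sample loops would put a PAL build back near 60 Hz.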


PostPosted: Mon Aug 08, 2016 4:03 pm

Joined: Fri Nov 19, 2004 7:35 pm
Posts: 3968
If you're using almost 100% CPU usage, it gets a lot more competitive. SuperNSF does GBA-like mod playback on a NES.

PostPosted: Mon Aug 08, 2016 5:04 pm

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5891
Location: Canada
Yes, I've used SuperNSF before. It is quite good for NSF demos.

I'm not "competing", just having fun making my own alternative. I wanted something that I could use with Famitracker, doesn't require 4k NSF bankswitching, and could fit in a small enough space to maybe use as a title screen or something. Probably I could modify SuperNSF to meet these goals, if I wanted to, but why would I want to?


PostPosted: Wed Aug 10, 2016 1:24 pm

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5891
Location: Canada
Here's a beta test and full source of my "two squares" engine, if anybody wants to play around with it.

Edit: I ended up adding duty control, so it's basically VRC6 without saw now, with some aliasing.

http://rainwarrior.ca/projects/nes/dmmc_beta.zip


Last edited by rainwarrior on Fri Dec 09, 2016 4:20 pm, edited 1 time in total.

PostPosted: Sat Aug 20, 2016 10:21 pm

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5891
Location: Canada
Was thinking about the idea of subtracting two waves to control volume. We mentioned before that this only works for sine, but I got to thinking about a similar approach with saw waves, or whether there are other practical saw waves?


If you could reset the phase of a lower pitched saw wave every time a higher pitched one rolls over, you could subtract one from the other to control volume; in this case the difference in pitch would control the difference in volume.

There's a jitter problem, though, since if your saw wave is just the high bits of a phase accumulator, it will normally roll over "in between" samples, so there's a difficulty in propagating the phase reset to other saw with correct phase. I think it would slip by a fixed amount each time, causing some sort of cross-modulation. The slip amount could be pre-calculated, though, and added during the phase reset to compensate, but it seems a bit of a complicated solution.


As an alternative, you could just do a single volume-controlled saw as a phase accumulator with a threshold; i.e. every time the threshold is exceeded, subtract the threshold from the phase. This would combine pitch and volume into the same accumulator increment, so the value you added per sample depends on both at once. Probably this would be a bit more practical than the "two saws" idea above.
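
A minimal Python sketch of that threshold accumulator, to show how pitch and volume share the increment. It uses plain integers and assumes the increment is smaller than the threshold; a 6502 version would presumably use fixed-point values.

```python
# Saw wave from a phase accumulator that wraps at a threshold.
# Peak level is set by the threshold; pitch by increment / threshold.
def threshold_saw(increment, threshold, count):
    phase = 0
    out = []
    for _ in range(count):
        phase += increment
        if phase >= threshold:   # threshold exceeded: wrap, keeping the remainder
            phase -= threshold
        out.append(phase)
    return out

loud = threshold_saw(6, 24, 8)    # period of 4 samples, peaks at 18
quiet = threshold_saw(3, 12, 8)   # same period, half the amplitude
print(loud, quiet)
```

Halving both numbers keeps the period the same and halves the output level, which is why a single increment has to encode pitch and volume together.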


The VRC6 itself has a very nice way of doing it, i.e. volume/output is an accumulator, but phase is just 7 fixed steps locked to a division of the samplerate (not really a "true" saw, but close enough?). It operates at a much higher frequency than you could do in software, of course, so tuning might be an issue if you did something similar in software (i.e. ~150x lower samplerate).
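
Here is a simplified Python model of the scheme as described above. It is not a cycle-accurate VRC6 emulation: the 8-bit accumulator width and the use of its top 5 bits as output are assumptions made for this sketch.

```python
# Simplified stepped saw: the accumulator is cleared at the start of each
# period, then grows by a fixed rate on each of 7 steps. The rate therefore
# acts as the volume, and the step clock alone sets the pitch.
def stepped_saw(rate, periods=1, steps=7):
    out = []
    for _ in range(periods):
        acc = 0                       # reset at the start of each period
        for _ in range(steps):
            out.append(acc >> 3)      # assumed: output is the top 5 bits
            acc = (acc + rate) & 0xFF # assumed: 8-bit accumulator
    return out

print(stepped_saw(8), stepped_saw(16))
```

In software each step would have to last a whole number of output samples, so the available periods are multiples of 7 samples, which is where the tuning concern comes from.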


PostPosted: Sun Aug 21, 2016 7:41 pm

Joined: Thu Aug 20, 2015 3:09 am
Posts: 297
I haven't got around to testing it myself, but I'm pretty sure you can implement a divisor of the full CPU clock rate in software:

Code:
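update_wavetable:
   jmp (wavetable_period)     ; enter the clockslide at the address stored in wavetable_period

   ; clockslide executes here, controlling clock divisor

   lda wavetable_index        ; current position in the wavetable
   and wavetable_length_mask  ; wrap the index to the table length
   tay
   lda (wavetable_base), y    ; fetch the next sample
   sta $4011                  ; write it to the DMC output level
   inc wavetable_index
   rts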

It only gives you one channel, and you'll need the Deflemask trick for timing since the speed of execution changes with pitch. The minimum period would also be quite high, depending on how often the routine above gets called (the exact interval doesn't matter as long as it's always the same), but anything below ~172 cycles starts to sound out of tune anyway, and you can exchange table length for pitch range at runtime.
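
One possible reading of the ~172-cycle figure, as a Python sketch: if the period can only change in whole CPU cycles, the pitch gap between adjacent periods grows as the period shrinks. The assumption that the clockslide adjusts the period in single-cycle steps is mine.

```python
import math

def cents_between(period):
    """Pitch gap, in cents, between a period of N and N + 1 cycles."""
    return 1200 * math.log2((period + 1) / period)

for period in (100, 172, 400):
    print(period, round(cents_between(period), 1))
```

Around 172 cycles the step between neighbouring pitches is about 10 cents, and it gets coarser for shorter periods.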

