All times are UTC - 7 hours





Post new topic Reply to topic  [ 98 posts ]  Go to page Previous  1 ... 3, 4, 5, 6, 7  Next
Author Message
PostPosted: Sat Jul 04, 2015 8:23 pm 
Joined: Sun Apr 13, 2008 11:12 am
Posts: 6297
Location: Seattle
It depends on exactly how the math balances out. Lowpass filter design usually has parameters like "highest frequency that must be passed", "lowest frequency that must be blocked", "maximum passband ripple", "minimum stopband rejection", and placing your stopband more precisely may (or may not) help with that.

This is probably more relevant when you don't need fractional (or irrational) sample rate interpolation to get from the generated frequency to the PC standard one.
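
To get a rough feel for how those parameters trade off, fred harris's rule of thumb estimates the FIR length as taps ≈ (fs / transition width) × (rejection in dB / 22). The sketch below only applies that estimate; the class and method names are made up for illustration, and the 20 kHz, 22 kHz, 24 kHz and 100 dB figures are assumed example numbers, not a recommendation.

```java
final class FirLengthEstimate {

  // fred harris's rule of thumb:
  //   taps ~= (fs / transitionWidth) * (rejectionDb / 22)
  // It is an order-of-magnitude estimate, not an exact design result.
  static int estimateTaps(double fs, double passEdge, double stopEdge,
      double rejectionDb) {
    double transitionWidth = stopEdge - passEdge;
    return (int) Math.ceil(fs / transitionWidth * rejectionDb / 22.0);
  }

  public static void main(String[] args) {
    double fs = 19687500.0 / 11.0; // NTSC CPU clock taken as the input rate (Hz)
    // Narrow transition band: pass up to 20 kHz, block from 22 kHz.
    System.out.println(estimateTaps(fs, 20000.0, 22000.0, 100.0));
    // Wider transition band: pass up to 20 kHz, block from 24 kHz.
    System.out.println(estimateTaps(fs, 20000.0, 24000.0, 100.0));
  }
}
```

Doubling the transition width roughly halves the estimate, which is the sense in which moving the stopband edge can change the cost of the filter.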


PostPosted: Sat Jul 04, 2015 9:14 pm 
Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19116
Location: NE Indiana, USA (NTSC)
At some point, an FIR filter becomes at least as attractive as an IIR filter. For one thing, a symmetric kernel not only guarantees stability and linear phase but also lets you cut computation in half. And you get decimation for free, as you can calculate the convolution only for those output samples you're going to actually use. Furthermore, if you know that your input is going to be a piecewise constant function, as the output of most 8-bit-era PSGs is, you can differentiate the signal and integrate the FIR filter. This is the theory behind BLEP (band-limited step) synthesis used in blip_buf. The big disadvantage of FIR is a somewhat wider transition band unless you do filtering and decimation in stages.
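
A minimal sketch of the first two points (folding a symmetric kernel and computing only the outputs that survive decimation) might look like this. The class name, the method signature and the toy 3-tap kernel are assumptions for illustration; this is not how blip_buf works, and the kernel is not a designed filter.

```java
final class SymmetricFirDecimator {

  // Computes only every factor-th output of the convolution of input with a
  // symmetric kernel h (h[k] == h[h.length - 1 - k]). Mirrored taps share a
  // single multiplication, so the inner loop runs over half the kernel.
  static double[] decimate(double[] input, double[] h, int factor) {
    int taps = h.length;
    int half = taps / 2;
    int outputs = input.length < taps ? 0 : (input.length - taps) / factor + 1;
    double[] out = new double[outputs];
    for (int n = 0; n < outputs; n++) {
      int start = n * factor; // outputs in between are never computed
      double acc = 0.0;
      for (int k = 0; k < half; k++) {
        acc += h[k] * (input[start + k] + input[start + taps - 1 - k]);
      }
      if ((taps & 1) == 1) {
        acc += h[half] * input[start + half]; // centre tap of an odd-length kernel
      }
      out[n] = acc;
    }
    return out;
  }

  public static void main(String[] args) {
    double[] h = {0.25, 0.5, 0.25}; // toy symmetric kernel
    double[] x = {1, 1, 1, 1, 5, 5, 5, 5, 5};
    for (double y : decimate(x, h, 2)) {
      System.out.println(y);
    }
  }
}
```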


PostPosted: Sun Jul 05, 2015 7:45 pm 
Joined: Tue Dec 21, 2004 8:35 pm
Posts: 600
Location: Argentina
rainwarrior wrote:
http://ptolemy.eecs.berkeley.edu/eecs20/week12/implementation.html

Edit: sorry, this is just an example of an FIR filter without resampling, but the basic technique is the same. Multiply an array of input samples with the array of weights (FIR filter) and sum them up to produce an output sample.


Pretend you are with a person that knows almost nothing about DSP (me in this case), so you say:

Code:
Multiply an array of input samples with the array of weights (FIR filter) and sum them up to produce an output sample.


So I ask you:
- input samples: are these the samples that I take every ~37.3 CPU cycles for 48 kHz output?
- array of weights (FIR filter): what is that?
- sum them up to produce an output sample: I think I understand that, but a clarification would be welcome.

So you are dealing with a total noob now :-)
Code is appreciated.

_________________
ANes


PostPosted: Sun Jul 05, 2015 7:49 pm 
Joined: Mon Dec 29, 2014 1:46 pm
Posts: 710
Location: New York, NY
Anes wrote:
So I ask you:
- input samples: are these the samples that I take every ~37.3 CPU cycles for 48 kHz output?
- array of weights (FIR filter): what is that?
- sum them up to produce an output sample: I think I understand that, but a clarification would be welcome.

So you are dealing with a total noob now
Code is appreciated.


https://www.youtube.com/watch?v=r7ypfE5TQK0
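
In case code is easier to follow than the video, here is a minimal sketch of what "weights" means: each output sample is a sum of the most recent input samples, each multiplied by its own weight. The class name and both weight arrays are made up for illustration and are not a designed filter; with every weight equal to 1/N, the result is just the plain average of the last N samples.

```java
final class WeightedSum {

  // One output sample = weights[0] * newest input + weights[1] * the one
  // before it + ... (a single step of an FIR filter).
  static double outputSample(double[] input, int newest, double[] weights) {
    double sum = 0.0;
    for (int k = 0; k < weights.length; k++) {
      sum += weights[k] * input[newest - k];
    }
    return sum;
  }

  public static void main(String[] args) {
    double[] input = {0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0};
    double[] average = {0.2, 0.2, 0.2, 0.2, 0.2}; // plain 5-sample average
    double[] tapered = {0.1, 0.2, 0.4, 0.2, 0.1}; // middle samples count more
    // Start at index 4 so that five input samples are always available.
    for (int n = 4; n < input.length; n++) {
      System.out.println(outputSample(input, n, average)
          + " " + outputSample(input, n, tapered));
    }
  }
}
```

Choosing the weights is the filter design step; the multiply-and-sum loop stays the same whatever weights are used.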


PostPosted: Mon Jul 06, 2015 2:03 pm 
Joined: Mon Dec 29, 2014 1:46 pm
Posts: 710
Location: New York, NY
James wrote:
Here's some code for designing a filter and converting it to second order sections. Use the coefficients (Bs and As) to implement each section as a transposed direct form II filter.


Why form II instead of form I?

Edit: Maybe because you can normalize a0 and b0 to 1, saving 2 multiplications?
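
For comparison, a single second-order section in transposed direct form II can be sketched as below, assuming a0 has already been normalized to 1. It keeps two state variables per section, whereas direct form I keeps four delayed values (two past inputs and two past outputs). The class name and the coefficients in main are assumptions for illustration only.

```java
final class Biquad {

  private final double b0, b1, b2, a1, a2; // a0 is assumed to be 1
  private double d1, d2;                   // the only two state variables

  Biquad(double b0, double b1, double b2, double a1, double a2) {
    this.b0 = b0;
    this.b1 = b1;
    this.b2 = b2;
    this.a1 = a1;
    this.a2 = a2;
  }

  // Transposed direct form II update for one input sample.
  double process(double x) {
    double y = b0 * x + d1;
    d1 = b1 * x - a1 * y + d2;
    d2 = b2 * x - a2 * y;
    return y;
  }

  public static void main(String[] args) {
    // y[n] = x[n] + 0.5 * y[n - 1], fed with a unit impulse.
    Biquad section = new Biquad(1.0, 0.0, 0.0, -0.5, 0.0);
    System.out.println(section.process(1.0));
    System.out.println(section.process(0.0));
    System.out.println(section.process(0.0));
  }
}
```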


Last edited by zeroone on Mon Jul 06, 2015 2:44 pm, edited 1 time in total.

PostPosted: Mon Jul 06, 2015 2:41 pm 
Joined: Sat Jan 22, 2005 8:51 am
Posts: 427
Location: Chicago, IL
zeroone wrote:
James wrote:
Here's some code for designing a filter and converting it to second order sections. Use the coefficients (Bs and As) to implement each section as a transposed direct form II filter.


Why form II instead of form I?

http://www.earlevel.com/main/2003/02/28/biquads/

_________________
get nemulator
http://nemulator.com


PostPosted: Mon Jul 06, 2015 7:43 pm 
Joined: Mon Dec 29, 2014 1:46 pm
Posts: 710
Location: New York, NY
@ James, lidnariq, tepples: Thanks for all your help. I think I finally got it.

Using GNU Octave, I generated this 13th-order elliptic filter, split into 7 sections (six biquads plus one first-order section), with the following code:

Code:
clear
fc = 20000; % Cut-off frequency (Hz)
fs = 19687500.0 / 11.0; % NTSC Sampling rate (Hz)
%fs = 53203425.0 / 32.0; % PAL Sampling rate (Hz)
order = 13; % Filter order

[z, p, k] = ellip(order, 0.1, 100, 2*fc/fs);

[sos, g] = zp2sos(z, p, k);

Bs = sos(:,1:3)
num2hex(Bs)
As = sos(:,4:6)
num2hex(As)
[nsec,temp]=size(sos);
nsamps = 100000;
x = g*[1,zeros(1,nsamps-1)];
for i=1:nsec
 x = filter(Bs(i,:),As(i,:),x);
end
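X = fft(x);
f=[0:nsamps-1]*fs/nsamps;
pscale = 48;
n = floor(nsamps/pscale); % number of FFT bins to plot
figure(1);
plot(f(1:n),20*log10(abs(X(1:n)))); % magnitude response in dB
grid('on');
axis([0 fc*2 -100 5]); % set after plot() so the limits are not reset
legend('off');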


The cutoff frequency is set to 20000 Hz and it attenuates down to -100 dB by 22000 Hz, satisfying the Nyquist criterion for the traditional audio sampling rates of 44100+ Hz.

lidnariq wrote:
James was using a sample rate of 1786830 intentionally instead of the NTSC 1789772⁸/₁₁: the NTSC NES produces video at 60.1 Hz, not 60 Hz as the end-user's monitor does. If one wants to avoid visible tearing, you need to slow down the emulated NES by 1646ppm:

39375000÷11 Hz × 1½ ÷ (341×262-0.5) = 60.0988…
39375000÷11 Hz ÷ 2 × 60 ÷ 60.0988… = 1786830 Hz exactly
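
The two lines of arithmetic above can be checked with a few lines of Java. This sketch only restates the quoted formulas; the class, constant and method names are made up for illustration.

```java
final class NtscRates {

  // 39375000 / 11 Hz, the base frequency used in the quoted arithmetic.
  static final double BASE_HZ = 39375000.0 / 11.0;

  // Video frame rate: BASE_HZ * 1.5 divided by (341 * 262 - 0.5).
  static double frameRate() {
    return BASE_HZ * 1.5 / (341.0 * 262.0 - 0.5);
  }

  // CPU rate (BASE_HZ / 2) slowed down so that frames match the monitor rate.
  static double slowedCpuRate(double monitorHz) {
    return BASE_HZ / 2.0 * monitorHz / frameRate();
  }

  public static void main(String[] args) {
    System.out.println(frameRate());         // about 60.0988 Hz
    System.out.println(slowedCpuRate(60.0)); // about 1786830 Hz
  }
}
```

Algebraically the second result reduces to 20 × (341 × 262 - 0.5), which is why it comes out as a whole number.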


In my emulator, the CPU, PPU and APU use an independent timer to run at the correct rates. The monitor is updated at its refresh rate on a separate thread and they coordinate using the thread disruptor pattern. This is a topic that can be discussed in a separate thread if anyone is interested in further details.

Below is a version in Java that uses 64-bit doubles. I tested out 32-bit floats and it remains stable and mostly accurate. But, I found that the 64-bit version does not consume a lot of CPU on my box; so, I'll stick with the extra accuracy for now.

Like the Butterworth version that I posted earlier, this one provides an addInputSample() method that is invoked once per CPU cycle and the listener will be called back at the output sampling frequency supplied to the constructor.

Code:
public class Decimator {
 
  private static final double I2 = 1.0 / 2.0;
  private static final double I3 = 1.0 / 3.0;
  private static final double I6 = 1.0 / 6.0;
 
  private static final double NTSC_SAMPLING_FREQUENCY = 19687500.0 / 11.0;
  private static final double PAL_SAMPLING_FREQUENCY = 53203425.0 / 32.0;
 
  private static final long[] LONG_NTSC_A1 = {
    0xbfff67701c84a3deL,
    0xbfff8b62242e56e4L,
    0xbfffaf25694786a4L,
    0xbfffc959eca889e2L,
    0xbfffda7652351a06L,
    0xbfffe6495d699ef1L,
    0xbfef576a39439bdaL,
  };
 
  private static final long[] LONG_NTSC_A2 = {
    0x3feed6d89d961d3bL,
    0x3fef28cd096bf76aL,
    0x3fef7a3a53cd4f67L,
    0x3fefb5a2213e3abbL,
    0x3fefdbe39ed2f06eL,
    0x3feff554ab2f0006L,   
  };
 
  private static final long[] LONG_NTSC_B1 = {
    0xbfff2a2df4fd2ad2L,
    0xbfffbe3738d86ac8L,
    0xbfffd94c2258d55cL,
    0xbfffe22ba0cf6805L,
    0xbfffe5b14c11db72L,
    0xbfffe6ffa75a14ebL,
  };
 
  private static final long LONG_NTSC_G = 0x3ecc208a2d079678L;
 
  private static final double[] NTSC_A1;
  private static final double[] NTSC_A2;
  private static final double[] NTSC_B1;
  private static final double NTSC_G;
 
  static {
    NTSC_A1 = toDoubleArray(LONG_NTSC_A1);
    NTSC_A2 = toDoubleArray(LONG_NTSC_A2);
    NTSC_B1 = toDoubleArray(LONG_NTSC_B1);
    NTSC_G = Double.longBitsToDouble(LONG_NTSC_G);
  }
 
  private static final long[] LONG_PAL_A1 = {
    0xbfff5baa1b81a309L,
    0xbfff81d9b2eb15feL,
    0xbfffa7df44761c9dL,
    0xbfffc3c2600f5556L,
    0xbfffd5ff6fbab417L,
    0xbfffe2a551b417d0L,
    0xbfef4aa6ceb6c246L,
  };
 
  private static final long[] LONG_PAL_A2 = {
    0x3feec08d9ac7eb51L,
    0x3fef18943f900329L,
    0x3fef7018a2a55ed4L,
    0x3fefaffb4199e44aL,
    0x3fefd9237788b71bL,
    0x3feff4844bf56456L, 
  };
 
  private static final long[] LONG_PAL_B1 = {
    0xbfff08b370e2ff76L,
    0xbfffb3ce724731b7L,
    0xbfffd3295d0112b5L,
    0xbfffdd7036a20286L,
    0xbfffe184a77ef413L,
    0xbfffe307f8bb95fdL,
  };
 
  private static final long LONG_PAL_G = 0x3ece3d98b822795aL;
 
  private static final double[] PAL_A1;
  private static final double[] PAL_A2;
  private static final double[] PAL_B1;
  private static final double PAL_G;
 
  static {
    PAL_A1 = toDoubleArray(LONG_PAL_A1);
    PAL_A2 = toDoubleArray(LONG_PAL_A2);
    PAL_B1 = toDoubleArray(LONG_PAL_B1);
    PAL_G = Double.longBitsToDouble(LONG_PAL_G);
  } 
 
  private final double g;
  private final double[] a1;
  private final double[] a2;
  private final double[] b1;
  private final double[] d1 = new double[7];
  private final double[] d2 = new double[6];
  private final double[] ys = new double[4];
  private final double inputSamplingFrequency;
  private final double inputSamplingPeriod;
  private final double outputSamplingPeriod;
  private final DecimatorListener listener;
 
  private double time;
  private int index;
 
  public Decimator(boolean ntsc, double outputSamplingFrequency,
      DecimatorListener listener) {
   
    if (ntsc) {
      a1 = NTSC_A1;
      a2 = NTSC_A2;
      b1 = NTSC_B1;
      g = NTSC_G;
      inputSamplingFrequency = NTSC_SAMPLING_FREQUENCY;     
    } else {
      a1 = PAL_A1;
      a2 = PAL_A2;
      b1 = PAL_B1;
      g = PAL_G;
      inputSamplingFrequency = PAL_SAMPLING_FREQUENCY;
    }
   
    this.listener = listener;
   
    inputSamplingPeriod = 1.0 / inputSamplingFrequency;
    outputSamplingPeriod = 1.0 / outputSamplingFrequency;
  }
 
  private static double[] toDoubleArray(long[] values) {
    double[] ds = new double[values.length];
    for(int i = values.length - 1; i >= 0; i--) {
      ds[i] = Double.longBitsToDouble(values[i]);
    }
    return ds;
  }
 
  public void addInputSample(double x) {
   
    double y;
    for(int i = 0; i < 6; i++) {
      y = x + d1[i];   
      d1[i] = b1[i] * x - a1[i] * y + d2[i];
      d2[i] = x - a2[i] * y;
      x = y;
    }
    y = x + d1[6];
    d1[6] = x - a1[6] * y;       
    ys[index] = g * y;
   
    time += inputSamplingPeriod;
    if (time >= outputSamplingPeriod) {
      listener.processOutputSample(computeSpline(
          1.0 - (time - outputSamplingPeriod) * inputSamplingFrequency,
          ys[(index - 3) & 3],
          ys[(index - 2) & 3],
          ys[(index - 1) & 3],
          ys[index]));
      time -= outputSamplingPeriod;
    }
   
    index = (index + 1) & 3;
  }
   
  private double computeSpline(double t, double y0, double y1, double y2,
      double y3) {
    double c0 = y1;
    double c1 = y2 - I3 * y0 - I2 * y1 - I6 * y3;
    double c2 = I2 * (y0 + y2) - y1;
    double c3 = I6 * (y3 - y0) + I2 * (y1 - y2);
    return ((c3 * t + c2) * t + c1) * t + c0;
  }   
}

public interface DecimatorListener {
  void processOutputSample(double sample);
}


PostPosted: Tue Jul 07, 2015 7:48 am 
Joined: Thu Feb 28, 2013 11:14 am
Posts: 43
Wow, all this looks overcomplicated to me. In my emulator I used very simple code. It doesn't seem to have any major problems, so why bother with all this 16th-order stuff if 2nd order could be enough as an approximation? You can keep inventing ever nicer filters all day, but I doubt anyone expects perfect audio from an emulator.


PostPosted: Tue Jul 07, 2015 8:16 am 
Joined: Mon Dec 29, 2014 1:46 pm
Posts: 710
Location: New York, NY
x0000 wrote:
Wow, all this looks overcomplicated to me. In my emulator I used very simple code. It doesn't seem to have any major problems, so why bother with all this 16th-order stuff if 2nd order could be enough as an approximation? You can keep inventing ever nicer filters all day, but I doubt anyone expects perfect audio from an emulator.


Audibly, I am unable to hear a difference between the 13th-order elliptic filter and simply averaging every ~37.2 samples together for any of the games that I tested. My advice to emulator developers: keep it simple.

Edit:

lidnariq wrote:
A 37-sample boxcar FIR has awful frequency rejection...

As a demonstration, try listening to a square wave at 9.4kHz with that implementation; there should be a pretty audible alias (-43dBFS) at 1kHz (from the 5th overtone).

Alternatively, try the tonal noise, which randomly (depending on the exact value of the LFSR when you switch from periodic to tonal) has either a VERY LOUD or completely absent 31st harmonic: [$400E] with $80-$82 should produce aliases from this overtone (i.e. above 24kHz, for 48kHz sample rate output)


True. My test program, which simply records the gain in decibels for a range of frequencies (a very slow Fourier-like transform), produces these results. Nonetheless, it sounds fine to my ear and through my speakers. I wonder if the sound hardware or the sound API does its own filtering and decimating.
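
The same numbers can be estimated in closed form. An N-sample moving average has magnitude response |sin(πfN/fs) / (N·sin(πf/fs))|, and a tone above half the output rate folds back into the audible band. The sketch below evaluates both for the 5th harmonic of a 9.4 kHz square wave; the class and method names are made up, and a fixed N = 37 is an assumption, since the averaging decimator actually alternates between 37 and 38 input samples per output.

```java
final class BoxcarResponse {

  // Gain in dB of an n-sample moving average at frequency f (Hz), with input
  // sample rate fs. Assumes 0 < f < fs, so sin(w) is not zero.
  static double gainDb(int n, double f, double fs) {
    double w = Math.PI * f / fs;
    double magnitude = Math.abs(Math.sin(n * w) / (n * Math.sin(w)));
    return 20.0 * Math.log10(magnitude);
  }

  // Frequency at which a tone at f appears after sampling at outRate.
  static double aliasOf(double f, double outRate) {
    double folded = f % outRate;
    return folded > outRate / 2.0 ? outRate - folded : folded;
  }

  public static void main(String[] args) {
    double fs = 19687500.0 / 11.0; // NTSC input rate (Hz)
    double harmonic = 5 * 9400.0;  // 5th harmonic of a 9.4 kHz square wave
    System.out.println(gainDb(37, harmonic, fs));
    System.out.println(aliasOf(harmonic, 48000.0));
  }
}
```

With these inputs the boxcar attenuates the 47 kHz harmonic by only roughly 31 dB before it folds down to 1 kHz, and the harmonic itself is one fifth the amplitude of the fundamental.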

For reference:

Code:
public class AveragingDecimator {

  private final double inputSamplingPeriod;
  private final double outputSamplingPeriod;
  private final DecimatorListener listener;
 
  private double time;
  private double sumOfSamples;
  private int numberOfSamples;
 
  public AveragingDecimator(
      double inputSamplingFrequency,
      double outputSamplingFrequency,
      DecimatorListener listener) {

    this.inputSamplingPeriod = 1.0 / inputSamplingFrequency;
    this.outputSamplingPeriod = 1.0 / outputSamplingFrequency;
    this.listener = listener;
  }
 
  public void addInputSample(double x) {
   
    sumOfSamples += x;
    numberOfSamples++;
   
    time += inputSamplingPeriod;
    if (time >= outputSamplingPeriod) {
      listener.processOutputSample(sumOfSamples / numberOfSamples);
      sumOfSamples = 0;
      numberOfSamples = 0;
      time -= outputSamplingPeriod;
    }
  }
}


PostPosted: Tue Jul 07, 2015 10:35 am 
Joined: Tue Dec 21, 2004 8:35 pm
Posts: 600
Location: Argentina
zeroone wrote:
Anes wrote:
So I ask you:
- input samples: are these the samples that I take every ~37.3 CPU cycles for 48 kHz output?
- array of weights (FIR filter): what is that?
- sum them up to produce an output sample: I think I understand that, but a clarification would be welcome.

So you are dealing with a total noob now
Code is appreciated.


https://www.youtube.com/watch?v=r7ypfE5TQK0


I watched and re-watched this YouTube video, but my English listening comprehension is very poor (although he writes on paper).
I'm studying signal generation with a friendly book and I'm clarifying things.

Anyway, I still don't understand what is meant by "weights": can somebody explain it to me, please?
As you can see, I'm very lost on this topic...

I'm surely boring the people who, as far as I can see, know the topic, but I just want to get the sound better.

_________________
ANes


PostPosted: Tue Jul 07, 2015 10:44 am 
Joined: Mon Dec 29, 2014 1:46 pm
Posts: 710
Location: New York, NY
Anes wrote:

I watched and re-watched this YouTube video, but my English listening comprehension is very poor (although he writes on paper).
I'm studying signal generation with a friendly book and I'm clarifying things.

Anyway, I still don't understand what is meant by "weights": can somebody explain it to me, please?
As you can see, I'm very lost on this topic...

I'm surely boring the people who, as far as I can see, know the topic, but I just want to get the sound better.


Are you generating 8-bit samples or 16-bit samples?


PostPosted: Tue Jul 07, 2015 11:18 am 
Joined: Tue Dec 21, 2004 8:35 pm
Posts: 600
Location: Argentina
I'm generating 16-bit samples....

_________________
ANes


PostPosted: Tue Jul 07, 2015 11:42 am 
Joined: Mon Dec 29, 2014 1:46 pm
Posts: 710
Location: New York, NY
Anes wrote:
I'm generating 16-bit samples....


Verify that you are clamping the sample values to the appropriate range (i.e. -32768 to 32767 if you are using signed values and 0 to 65535 for unsigned) as opposed to letting the values wrap around on underflow/overflow. Make sure the sound API that you are adding those values to is set to use signed or unsigned sample values accordingly. Also, verify that each audio unit contributing to the total sound sample is properly scaled to 16 bits.

What is your buffering strategy?
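
To illustrate the difference between clamping and wrapping for signed 16-bit output, here is a minimal Java sketch. The class and method names are made up for illustration; the same idea applies in C.

```java
final class SampleClamp {

  // Rounds to the nearest integer and clamps to the signed 16-bit range
  // instead of letting the value wrap around.
  static short toInt16(double sample) {
    long rounded = Math.round(sample);
    if (rounded > Short.MAX_VALUE) {
      return Short.MAX_VALUE; // 32767
    }
    if (rounded < Short.MIN_VALUE) {
      return Short.MIN_VALUE; // -32768
    }
    return (short) rounded;
  }

  public static void main(String[] args) {
    int tooLoud = 40000;
    // A plain narrowing cast keeps only the low 16 bits, so the value wraps.
    System.out.println((short) tooLoud);  // prints -25536
    // Clamping pins the value at the top of the range instead.
    System.out.println(toInt16(tooLoud)); // prints 32767
  }
}
```

A wrapped sample jumps from near full positive to a large negative value, which is heard as a loud click; a clamped sample merely flattens the peak.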


PostPosted: Tue Jul 07, 2015 12:16 pm 
Joined: Tue Dec 21, 2004 8:35 pm
Posts: 600
Location: Argentina
zeroone wrote:
Anes wrote:
I'm generating 16-bit samples....


Verify that you are clamping the sample values to the appropriate range (i.e. -32768 to 32767 if you are using signed values and 0 to 65535 for unsigned) as opposed to letting the values wrap around on underflow/overflow. Make sure the sound API that you are adding those values to is set to use signed or unsigned sample values accordingly. Also, verify that each audio unit contributing to the total sound sample is properly scaled to 16 bits.

What is your buffering strategy?



I'm using a circular DirectSound buffer of 16-bit signed values.

When the CPU "cc_counter" reaches or passes samples_div (approx. 37.3 cc for 48 kHz), I do the following:

Code:
if (Sound.cc_counter >= samples_div)
{
   float sample;
   sample = avgsamples(Sound.X, Sound.cc_counter);
   sample *= amp_scale;
   sample *= 0x07FFF;
   sample = clip_sample(sample);
   Sound.buffer[Sound.buffer_index] = sample;
   Sound.buffer_index++;
   Sound.amp = 0;
   Sound.cc_counter -= samples_div;
}


- "Sound.X" is a float array that holds 42 floats, so the index never passes the maximum.

- avgsamples(Sound.X, Sound.cc_counter); does the following:
Code:
float avgsamples(float * out_buffer, int len)
{
   float sample = 0;
   for (int i = 0; i < len; i++)
      sample += out_buffer[i];
   return sample / (float) len;
}

- amp_scale is defined as:
Code:
#define ChangeScale()(amp_scale = pow(1.122018454, g_db));

Where g_db ranges from -36 to 12 dB (this is for volume control).
- sample *= 0x7FFF; This is where I have doubts: I do that to scale it.
- then clip_sample(sample) is defined as:
Code:
static inline float clip_sample(float sample_value) {
  if (sample_value < -32000) return -32000;
  if (sample_value > 32000) return 32000;
  return sample_value;
}

as tepples taught me.
- Sound.Buffer[] is an array that can hold 800 samples for the frame.
- Sound.buffer_index is set to "0" to start again when the frame ends.

What is wrong?
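
For reference, the constant 1.122018454 in ChangeScale() is 10^(1/20), so pow(1.122018454, g_db) is the usual decibel-to-amplitude conversion 10^(g_db/20). A minimal Java sketch of that conversion follows; the class and method names are made up for illustration.

```java
final class DbGain {

  // Converts a level in decibels to a linear amplitude factor.
  static double dbToGain(double db) {
    return Math.pow(10.0, db / 20.0);
  }

  public static void main(String[] args) {
    // The two ends and the middle of the -36 dB .. +12 dB volume range.
    System.out.println(dbToGain(-36.0));
    System.out.println(dbToGain(0.0));
    System.out.println(dbToGain(12.0));
  }
}
```

At the top of the range the factor is close to 4, so an averaged sample above about 0.25 would exceed the 16-bit range after the multiplication by 0x7FFF and would have to be clipped.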

_________________
ANes


PostPosted: Tue Jul 07, 2015 12:34 pm 
Joined: Mon Dec 29, 2014 1:46 pm
Posts: 710
Location: New York, NY
Why are you clamping to [-32000, 32000] instead of [-32768, 32767]?

