Gauss Table Creation
Re: Gauss Table Creation
I've just gone back to the basic table[x] = e^((x^2)/curv) * volume formula, and tested what happens when using values other than e=2.718281828, like this:
curv=36000, volume=1305, e=2.000
curv=55000, volume=1305, e=2.718281828 (euler's number)
curv=72000, volume=1305, e=4.000
The curv values are needed to match the result close to the middle table entry. I was expecting to see huge differences on other table entries. But no, the results are identical in all three cases. Surprise. Magic. I've no clue how that is possible (yes, my maths are very poor). At least I've now figured out that it doesn't seem to matter whether one uses 2.718281828 or other values (except, supposedly, e=1.000 couldn't work).
Please excuse my amateurish attempts. Are there any maths gurus out there who could jump in?
Re: Gauss Table Creation
A number raised to the power of a product of two factors is the number raised to the first factor, then raised to the other factor.
a^(b * c) = (a^b)^c
This means that the following identities hold:
e^(y / curvA) = (e^(1 / curvA))^y
2^(y / curvB) = (2^(1 / curvB))^y
4^(y / curvC) = (4^(1 / curvC))^y
What you're seeing is that (2^(1 / curvB)) = (4^(1 / curvC)) because 2^2 = 4^1, and (e^(1 / curvA)) isn't too different.
2.000^(1/36000) = 1.000019254
2.718281828^(1/55000) = 1.000018182
4.000^(1/72000) = 1.000019254
Re: Gauss Table Creation
Thanks! I am really not familiar with that stuff (though I suspect that I might have learned it at school ages ago). Some more working e:curv pairs:
curv=36000, volume=1305, e=0.500
curv=36000, volume=1305, e=2.000
curv=55000, volume=1305, e=2.718281828 (euler's number)
curv=72000, volume=1305, e=4.000
curv=108000, volume=1305, e=8.000
curv=1, volume=1305, e=1.00001925
The last one, with curv=1 and e=1.00001925, means that one could simplify the formula to table[x] = e^(x^2) * volume.
The only problem is that the formula is wrong either way. The middle table entry at x=256 is okay, but higher values near x=128 are a bit too small, and the lower values near x=384 a bit too large. Any ideas how to fix that?
Re: Gauss Table Creation
This gets fairly close for SNES, but it seems like the window function needs another cosine term or something:
Code:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

/* Windowed-sinc low-pass: fills a symmetric impulse response of num_coeffs taps. */
static void generate_lp(double* coeffs, unsigned num_coeffs, double tb_center, double a, double b)
{
    double f_s = tb_center;
    int N = num_coeffs / 2;

    for(int i = 0; i < N; i++)
    {
        double k = (0.5 + i);                              /* distance from the center, in taps */
        double pik_N = M_PI * 2.0 * k / (num_coeffs - 1);  /* window phase */
        double c_k = sin(M_PI * 2.0 * k * f_s) / k;        /* sinc term */
        double w_k = (1.0 - (a + b)) + a * cos(pik_N) + b * cos(2.0 * pik_N);  /* window term */
        double r = c_k * w_k;

        /* Mirror each value around the center. */
        coeffs[N + i] = r;
        coeffs[N - i - 1] = r;
    }
}

/* Scale all coefficients so that they sum to v. */
static void normalize(double* coeffs, unsigned num_coeffs, double v)
{
    double sum = 0;

    for(unsigned i = 0; i < num_coeffs; i++)
        sum += coeffs[i];

    double multiplier = v / sum;

    for(unsigned i = 0; i < num_coeffs; i++)
        coeffs[i] *= multiplier;
}

static const int apu_halfimp[512] =
{
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2,
2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5,
6, 6, 6, 6, 7, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 10,
11, 11, 11, 12, 12, 13, 13, 14, 14, 15, 15, 15, 16, 16, 17, 17,
18, 19, 19, 20, 20, 21, 21, 22, 23, 23, 24, 24, 25, 26, 27, 27,
28, 29, 29, 30, 31, 32, 32, 33, 34, 35, 36, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56,
58, 59, 60, 61, 62, 64, 65, 66, 67, 69, 70, 71, 73, 74, 76, 77,
78, 80, 81, 83, 84, 86, 87, 89, 90, 92, 94, 95, 97, 99, 100, 102,
104, 106, 107, 109, 111, 113, 115, 117, 118, 120, 122, 124, 126, 128, 130, 132,
134, 137, 139, 141, 143, 145, 147, 150, 152, 154, 156, 159, 161, 163, 166, 168,
171, 173, 175, 178, 180, 183, 186, 188, 191, 193, 196, 199, 201, 204, 207, 210,
212, 215, 218, 221, 224, 227, 230, 233, 236, 239, 242, 245, 248, 251, 254, 257,
260, 263, 267, 270, 273, 276, 280, 283, 286, 290, 293, 297, 300, 304, 307, 311,
314, 318, 321, 325, 328, 332, 336, 339, 343, 347, 351, 354, 358, 362, 366, 370,
374, 378, 381, 385, 389, 393, 397, 401, 405, 410, 414, 418, 422, 426, 430, 434,
439, 443, 447, 451, 456, 460, 464, 469, 473, 477, 482, 486, 491, 495, 499, 504,
508, 513, 517, 522, 527, 531, 536, 540, 545, 550, 554, 559, 563, 568, 573, 577,
582, 587, 592, 596, 601, 606, 611, 615, 620, 625, 630, 635, 640, 644, 649, 654,
659, 664, 669, 674, 678, 683, 688, 693, 698, 703, 708, 713, 718, 723, 728, 732,
737, 742, 747, 752, 757, 762, 767, 772, 777, 782, 787, 792, 797, 802, 806, 811,
816, 821, 826, 831, 836, 841, 846, 851, 855, 860, 865, 870, 875, 880, 884, 889,
894, 899, 904, 908, 913, 918, 923, 927, 932, 937, 941, 946, 951, 955, 960, 965,
969, 974, 978, 983, 988, 992, 997, 1001, 1005, 1010, 1014, 1019, 1023, 1027, 1032, 1036,
1040, 1045, 1049, 1053, 1057, 1061, 1066, 1070, 1074, 1078, 1082, 1086, 1090, 1094, 1098, 1102,
1106, 1109, 1113, 1117, 1121, 1125, 1128, 1132, 1136, 1139, 1143, 1146, 1150, 1153, 1157, 1160,
1164, 1167, 1170, 1174, 1177, 1180, 1183, 1186, 1190, 1193, 1196, 1199, 1202, 1205, 1207, 1210,
1213, 1216, 1219, 1221, 1224, 1227, 1229, 1232, 1234, 1237, 1239, 1241, 1244, 1246, 1248, 1251,
1253, 1255, 1257, 1259, 1261, 1263, 1265, 1267, 1269, 1270, 1272, 1274, 1275, 1277, 1279, 1280,
1282, 1283, 1284, 1286, 1287, 1288, 1290, 1291, 1292, 1293, 1294, 1295, 1296, 1297, 1297, 1298,
1299, 1300, 1300, 1301, 1302, 1302, 1303, 1303, 1303, 1304, 1304, 1304, 1304, 1304, 1305, 1305
};

int main(int argc, char* argv[])
{
    double coeffs[1024];

    generate_lp(coeffs, 1024, 0.16 / 256.0, 0.50, 0.08);
    normalize(coeffs, 1024, 2048 * 256.0);

    /* Compare the first half of the impulse response against the dumped table. */
    for(unsigned i = 0; i < 512; i++)
    {
        int c = floor(0.5 + coeffs[i]);  /* round to nearest */

        printf("0x%03x: APU=%d Calced=%d (%f)", i, apu_halfimp[i], c, coeffs[i]);
        if(apu_halfimp[i] != c)
            printf(" MISMATCH (%f)", 0.5 - (coeffs[i] - floor(coeffs[i])));
        printf("\n");
    }
    return 0;
}
Re: Gauss Table Creation
Had an idea that fixed it so it generates a 100% match, new version attached.
Attachment: apu.cpp (4.29 KiB)
Re: Gauss Table Creation
Cool. With the sine and cosine it's very different from the exponential stuff that I had been trying. And your constants, with no more than 2 fractional digits, look quite simple. They even look like percent values, originally specified as plain integers without any fractional digits. Are you going to try the same formula for the PSX gauss table, too?
Re: Gauss Table Creation
I can get close with PS1, but there are still a lot of off-by-one errors.
The number of -1 values is difficult to reconcile. I'm wondering if I dumped/logged the SPU table incorrectly (from incorrect assumptions about the SPU's interpolation math?), in such a way that wouldn't really have much of an audible effect, but would complicate trying to recreate the original generation algorithm.
Re: Gauss Table Creation
I have these values in the table at http://problemkaputt.de/psxspx.htm#spuadpcmpitch
I guess the main difference is using values other than 0.16 and 0.08 for PSX?
And for the larger fractional part... replace 800h/sum by 7F80h/sum?
EDIT: Or did you mean the problem is getting -1 at all, because the formula normally outputs 0 as smallest value?
That might be a rounding issue, subtract 0.5 from the result?
EDIT: Gave it a try, too. Replacing 800h by 7F80h, and 0.16 by 0.256 is getting relatively close for PSX.
But it's still off by +/-2 or so, so you might already have better results. Which values are you using currently?
Re: Gauss Table Creation
nocash wrote: EDIT: Gave it a try, too. Replacing 800h by 7F80h, and 0.16 by 0.256 is getting relatively close for PSX. But it's still off by +/-2 or so, so you might already have better results. Which values are you using currently?
0.256 as well, with about the same results, even after brute-force checking different rounding modes and scaling values.
Re: Gauss Table Creation
Does the PlayStation's table have any defects where the four values for one fractional phase (table[x], table[256 + x], table[511 - x], and table[255 - x]) add up to other than 100%? I know the Super NES table has a few such values. If there aren't such values in the PlayStation's table, but some values there are +/- 1, they might be where particular sets of 4 have been corrected in a post-processing step.
Re: Gauss Table Creation
Mednafen wrote: Had an idea that fixed it so it generates a 100% match, new version attached.
That's incredible! Really awesome work, thank you for sharing~ ^^
Re: Gauss Table Creation
tepples wrote: Does the PlayStation's table have any defects where the four values for one fractional phase (table[x], table[256 + x], table[511 - x], and table[255 - x]) add up to other than 100%? I know the Super NES table has a few such values. If there aren't such values in the PlayStation's table, but some values there are +/- 1, they might be where particular sets of 4 have been corrected in a post-processing step.
Yes and no. The inaccuracy in sum +/-1 is there, but it won't cause overflows on PSX.
On SNES the sum of four values is 800h +/- 1 (that's bad because it may exceed 800h)
On PSX the sum of four values is 7F80h +/- 1 (that's better because it won't exceed 8000h)
I've rearranged the code from Mednafen a bit to make it easier to see which immediates are used in which place:
Code:
for i=0 to 511
  k = (0.5 + i)
  s = (sin(PI * k * 1.280 / 1024))             ;for PSX: Use 2.048 instead of 1.280
  t = (cos(PI * k * 2.000 / 1023) - 1) * 0.50
  u = (cos(PI * k * 4.000 / 1023) - 1) * 0.08
  table[511 - i] = s * (t + u + 1.0) / k
next i
For SNES that +/-1 error can be fixed by computing the perfect scaling factor for each group of four values. But that doesn't help on PSX.
In the above formula, the /1024 versus /1023 looks a bit odd. But trying to use only /1024 (or /1023 or /1023.5) is making things worse.
Maybe PSX needs another cosine, multiplied by some small value like 0.0000x or so? That might add some fractional bits that aren't needed on SNES.
Re: Gauss Table Creation
I've got exact matches for PSX! The trick is to do the scaling & error adjustment in separate steps:
 First, calc sum of ALL table entries, and scale all entries accordingly (like the normalize function in your old code version).
 Next, calc sum of each FOUR table entries (like your newer code), but don't use that as scaling value; instead compute difference = sum - 7F80h, and then fix the four table entries by subtracting difference/4.
However, that's working for PSX only. Doing the same on SNES is messing up a few values. Either one shouldn't do that on SNES... or maybe one could do so when adjusting some constants in the SNES.
Re: Gauss Table Creation
What I don't understand about the formula,
Code:
for i=0 to 511
  k = (0.5 + i)
  s = (sin(PI * k * 1.280 / 1024))             ;for PSX: Use 2.048 instead of 1.280
  t = (cos(PI * k * 2.000 / 1023) - 1) * 0.50
  u = (cos(PI * k * 4.000 / 1023) - 1) * 0.08
  table[511 - i] = s * (t + u + 1.0) / k
next i
how and why does that magically result in,
Code:
table[0+i] + table[255-i] + table[256+i] + table[511-i] = constant ;with "i=0..127", and "constant" being same in each case
is there some maths rule that could explain that effect?
Currently, the groups of four values don't sum up to the exact same constant. But it's almost perfect, and when understanding why it is so close, then one could perhaps improve the formula to get totally perfect results (and no longer need the extra steps for fixing the errors in the sum of four values).
There are some webpages explaining what happens when adding or multiplying sine and cosine values, maybe that could explain how the formula works. The extra difficulty is the "divide by k" step.
Hmmmm, or is the formula just working because it was supposed to match Sony's gauss tables? I wouldn't be surprised if changing a few parameters in the formula could be used to reproduce the shape of a chicken's egg, which wouldn't mean that Sony (or the chicken) had actually used that formula for creating eggs.
Re: Gauss Table Creation
nocash wrote: Hmmmm, or is the formula just working because it was supposed to match Sony's gauss tables? I wouldn't be surprised if changing a few parameters in the formula could be used to reproduce the shape of a chicken's egg, which wouldn't mean that Sony (or the chicken) had actually used that formula for creating eggs.
Indeed. The importance of algorithms over constant tables isn't just to remove a few lines from a source code file, it's to show you what the constants mean. That's why the NES color generation formula is worth its weight in gold. There's an infinite number of algorithms to produce any table, and we'll never know which one is correct here unless Sony tells us.
But by finally having a bit-perfect algorithm, it's a solid base. Now we can try and understand why values are what they are, and what significance they have. It's a critical starting point. I'm excited to see what happens from here.