possible ways to speed up 6502 core?
-
- Posts: 271
- Joined: Sun Mar 27, 2011 10:49 am
- Location: Victoria, BC
Re: possible ways to speed up 6502 core?
Doing a computed goto is (slightly) faster than using a switch, because you do it at the end of an opcode instead of looping back around. One jump instead of two.
Of course, Java doesn't have any facilities for doing a computed goto (AFAIK). And unconditional branches are all but free on a modern CPU anyway. On CPUs that execute at two billion cycles a second, a million extra unconditional jumps per second really, really doesn't matter.
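For what it's worth, the switch-in-a-loop form looks roughly like the sketch below in Java. It is a minimal illustration, not anyone's actual core: `TinyCore`, its fields, and the choice to treat BRK as a halt are all assumptions, and only a handful of opcodes are handled (ADC ignores decimal mode and every flag except carry).

```java
/** Hypothetical, heavily simplified 6502-style core using switch dispatch. */
final class TinyCore {
    int a;                                  // accumulator
    int pc;                                 // program counter
    boolean carry;                          // carry flag (the only flag modelled here)
    final int[] mem = new int[0x10000];     // 64 KiB address space, one int per byte

    /** Runs from start until BRK ($00) and returns the number of instructions executed. */
    int run(int start) {
        pc = start;
        int executed = 0;
        while (true) {
            int opcode = mem[pc++ & 0xFFFF];
            executed++;
            switch (opcode) {
                case 0xA9:                  // LDA #imm
                    a = mem[pc++ & 0xFFFF];
                    break;
                case 0x69: {                // ADC #imm (binary mode only)
                    int sum = a + mem[pc++ & 0xFFFF] + (carry ? 1 : 0);
                    carry = sum > 0xFF;
                    a = sum & 0xFF;
                    break;
                }
                case 0x18:                  // CLC
                    carry = false;
                    break;
                case 0xEA:                  // NOP
                    break;
                case 0x00:                  // BRK, treated as "halt" in this sketch
                    return executed;
                default:
                    throw new IllegalStateException(
                            "unimplemented opcode 0x" + Integer.toHexString(opcode));
            }
        }
    }
}
```

Each `break` falls back to the top of the `while` loop, which is the "second jump" mentioned above. With all 256 opcodes filled in, the case labels are dense, which should let javac emit a `tableswitch` (a jump table) rather than a `lookupswitch`.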
Re: possible ways to speed up 6502 core?
I'd like fceux to be more efficient, if only from a power saving standpoint. Currently it takes about 35% of one core, with X taking an additional 20% (apparently it draws inefficiently?).
Re: possible ways to speed up 6502 core?
calima wrote: I'd like fceux to be more efficient, if only from a power saving standpoint. Currently it takes about 35% of one core, with X taking an additional 20% (apparently it draws inefficiently?).
Er. Even on my Athlon/1333 it was tremendously lighter weight than that...
Re: possible ways to speed up 6502 core?
Overall, Atom and ARM tend to be slower than Athlon. An Atom's work per clock is close to that of a Pentium 4.
Re: possible ways to speed up 6502 core?
This is on a Phenom II, but the core is likely not at full speed.
Re: possible ways to speed up 6502 core?
Honestly, if 6502 emulation performance is a problem, then you might just not be doing an optimal job of the serial implementation.
PPU is usually way more bottlenecky on slow CPUs.
- GradualGames
- Posts: 1106
- Joined: Sun Nov 09, 2008 9:18 pm
- Location: Pennsylvania, USA
- Contact:
Re: possible ways to speed up 6502 core?
It's actually working really well like I said. I'm kinda casting a wide net for improving the performance further. The actual issue I'm experiencing I think has to do with power save features on android devices, because it just mysteriously throttles down to like 1/4 speed even though my thread is doing the exact same work every time.
I don't really even have a PPU, anyway. It just renders the tiles straight, rather than scanline by scanline. I.e. it's not a real NES emulator; see the GGVm thread if curious.
Re: possible ways to speed up 6502 core?
In other words, you have an HLE PPU (high-level emulated Picture Processing Unit). The best comparison, I guess, would be PocketNES for Game Boy Advance, which also has an HLE PPU because it maps the NES's tiled backgrounds and sprites onto those of the GBA. Yet it somehow runs at full speed on a 16.8 MHz ARM7TDMI.
- GradualGames
- Posts: 1106
- Joined: Sun Nov 09, 2008 9:18 pm
- Location: Pennsylvania, USA
- Contact:
Re: possible ways to speed up 6502 core?
I have another question relevant to what I'm working on: I know that the CPU clock speed from the wiki is 1.789773 MHz, which amounts to about 29829.55 cycles per 1/60th of a second, correct? How many actual instructions per frame does that amount to, on average? I am going to guess the average number of cycles taken by any given instruction is about 3 (seeing some as low as 2 and some as high as 7)?
Right now, one of the metrics my cpu spits out is "instructions per second." I have no concept of ticks or cycles, it is a purely high level cpu simulator. Thus when I'm seeing a metric such as: 2,570,423 instructions per second, this is ridiculously faster than the NES needs to be, if I'm correct in the above paragraph at all.
One thing I learned today is Android throttles down CPUs when they get hot. I do seem to observe a degradation of performance over time, and if I let it sit and start over the speed is restored.
Perhaps all I need to do is manually throttle the cpu with Thread.sleep, at least on mobile devices, in an attempt to stress the cpu less? After all, there's absolutely no reason to be "overclocking" to the extreme degree that it is right now.
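To put rough numbers on the question above, here is a small sketch of the arithmetic. The class and method names are made up for illustration, and the 3 cycles per instruction figure is only the guess from this post, not a measured average. Under that guess it works out to a bit under 10,000 instructions per frame, or roughly 600,000 per second, so 2,570,423 per second would be a little over four times what is needed.

```java
/** Hypothetical helper for back-of-the-envelope NES CPU timing. */
final class NesTiming {
    /** NTSC CPU clock in Hz, as quoted from the wiki in the post above. */
    static final double CPU_HZ = 1_789_773.0;

    /** CPU cycles available in one frame at the given frame rate. */
    static double cyclesPerFrame(double framesPerSecond) {
        return CPU_HZ / framesPerSecond;
    }

    /** Estimated instructions per frame for an ASSUMED average cycle cost. */
    static double instructionsPerFrame(double framesPerSecond,
                                       double avgCyclesPerInstruction) {
        return cyclesPerFrame(framesPerSecond) / avgCyclesPerInstruction;
    }

    public static void main(String[] args) {
        System.out.printf("cycles/frame: %.2f%n", cyclesPerFrame(60.0));
        System.out.printf("instructions/frame at 3 cycles each: %.0f%n",
                instructionsPerFrame(60.0, 3.0));
    }
}
```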
- GradualGames
- Posts: 1106
- Joined: Sun Nov 09, 2008 9:18 pm
- Location: Pennsylvania, USA
- Contact:
Re: possible ways to speed up 6502 core?
...hmm...from what I'm reading, sleep will still use the cpu. Sounds like I want wait. Problem is I can't know when I should do that. Since GGVm already has game-specific knowledge perhaps I can tell it where all nmi wait spin loops are and do wait/notify to give the thread some rest between frames.
Re: possible ways to speed up 6502 core?
Dots per frame = 341 * 262 - 0.5 = 89341.5
Cycles per frame = 89341.5 / 3 = 29780.5
where each "cycle" is one time the CPU reads or writes memory, including dummy reads for certain instructions
So how hard would it be to make your CPU spit out the metric "memory reads and writes per second"?
Yes, once you run out of memory accesses, blocking until the next host vblank would be a good idea.
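As a rough sketch of the numbers above, plus one way a core could keep a per-frame budget of memory accesses: `FrameBudget` and its methods are hypothetical names, and charging one unit per read or write (dummy reads included) is assumed to be done by the caller.

```java
/** Hypothetical per-frame budget counted in CPU memory accesses. */
final class FrameBudget {
    static final int DOTS_PER_SCANLINE = 341;
    static final int SCANLINES_PER_FRAME = 262;
    static final int DOTS_PER_CPU_CYCLE = 3;

    /** 341 * 262 - 0.5, matching the figure in the post. */
    static double dotsPerFrame() {
        return DOTS_PER_SCANLINE * SCANLINES_PER_FRAME - 0.5;
    }

    /** One CPU cycle (one memory access) per three PPU dots. */
    static double cyclesPerFrame() {
        return dotsPerFrame() / DOTS_PER_CPU_CYCLE;
    }

    private double remaining;   // accesses left in the current frame; may go negative

    /** Adds one frame's worth of cycles, carrying over any overshoot or fraction. */
    void startFrame() {
        remaining += cyclesPerFrame();
    }

    /** Charges the given number of memory accesses; returns true while budget remains. */
    boolean charge(int accesses) {
        remaining -= accesses;
        return remaining > 0;
    }

    double remaining() {
        return remaining;
    }
}
```

When `charge` returns false, the CPU thread would be the one to block until the next host vblank.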
Another good idea is automatic speed hacking, as implemented in PocketNES. If the system reads a location in a tight loop, stop the CPU until the next interrupt, like this:
Code: Select all
  lda nmis
:
  cmp nmis
  beq :-
GradualGames wrote: Since GGVm already has game-specific knowledge perhaps I can tell it where all nmi wait spin loops are
Can you tell it to automatically recognize patterns like that?
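One possible way to recognize that exact pattern is sketched below. This is not how PocketNES does it; `IdleLoopDetector` and `isSpinWait` are hypothetical names, and the check only covers a CMP (zero page or absolute) followed immediately by a BEQ that branches back to that same CMP. Other spin-wait shapes would need their own checks.

```java
/** Hypothetical detector for a "cmp addr / beq back-to-cmp" wait loop. */
final class IdleLoopDetector {
    static final int CMP_ZP = 0xC5;    // CMP zero page, 2 bytes
    static final int CMP_ABS = 0xCD;   // CMP absolute, 3 bytes
    static final int BEQ = 0xF0;       // BEQ relative, 2 bytes

    /**
     * Returns true if the instruction at pc is CMP addr and the next
     * instruction is a BEQ whose target is that same CMP.
     * mem holds one byte (0..255) per element and must cover 64 KiB.
     */
    static boolean isSpinWait(int[] mem, int pc) {
        pc &= 0xFFFF;
        int opcode = mem[pc];
        int length;
        if (opcode == CMP_ZP) {
            length = 2;
        } else if (opcode == CMP_ABS) {
            length = 3;
        } else {
            return false;
        }
        int beqAt = (pc + length) & 0xFFFF;
        if (mem[beqAt] != BEQ) {
            return false;
        }
        // The branch offset is a signed byte relative to the address after the BEQ.
        int offset = (byte) mem[(beqAt + 1) & 0xFFFF];
        int target = (beqAt + 2 + offset) & 0xFFFF;
        return target == pc;
    }
}
```

If this returns true while the core is sitting on the CMP, the CPU thread could stop executing and wait for the next NMI instead of spinning.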
- GradualGames
- Posts: 1106
- Joined: Sun Nov 09, 2008 9:18 pm
- Location: Pennsylvania, USA
- Contact:
Re: possible ways to speed up 6502 core?
I think I have a proof of concept working for a hard-coded example (pattern recognition can wait, especially if this doesn't wind up solving the hot cpu problem on android devices). Now I just need to figure out how to measure instructions per second correctly taking into account when the thread is sleeping, so I can observe whether this really does get me an improvement or not.
Re: possible ways to speed up 6502 core?
Here are the top oprofile results from fceux if anyone's interested. Indeed not the 6502 core, but sound and drawing are taking most of the time.
Code: Select all
samples  %        image name                          symbol name
12098    31.7391  fceuxg                              NeoFilterSound(int*, int*, unsigned int, int*)
10644    27.9245  fceuxg                              Blit8ToHigh(unsigned char*, unsigned char*, int, int, int, int, int)
2273      5.9632  fceuxg                              RefreshLine(int)
1948      5.1106  libasound_module_rate_speexrate.so  resampler_basic_interpolate_single
1869      4.9033  fceuxg                              X6502_RunDebug(int)
979       2.5684  fceuxg                              RDoSQ1()
843       2.2116  fceuxg                              RDoSQ2()
809       2.1224  fceuxg                              FCEUPPU_Loop(int)
749       1.9650  fceuxg                              FlushEmulateSound()
741       1.9440  libc-2.7.so                         memset
Re: possible ways to speed up 6502 core?
You get the biggest CPU emulation speedup from *idle loop skipping*. But unless you're on a 16MHz ARM or something, you probably won't notice.
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!
- GradualGames
- Posts: 1106
- Joined: Sun Nov 09, 2008 9:18 pm
- Location: Pennsylvania, USA
- Contact:
Re: possible ways to speed up 6502 core?
Welp, as a quick update, it turns out my 6502 core wasn't the bottleneck after all, it was how I was using OpenGL. One game was running around 23fps on a 3-year-old phone of mine. I changed some things in how I'm using OpenGL and now it's 60fps. Crazy stuff...
That said...I actually found a simpler way to use wait/notify on my cpu thread. I just count the instructions and when it reaches 10,000 (a rough estimate based on roughly 30,000 cycles per frame) it blocks, and then I notify on every nmi.
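In case it helps anyone, the idea can be sketched roughly like this. It is a simplified illustration rather than the actual GGVm code: `FrameGate` and its method names are made up, and the budget is whatever instruction count you pick (10,000 in my case).

```java
/** Hypothetical gate that parks the CPU thread once a per-frame budget is used up. */
final class FrameGate {
    private final int budget;       // instructions allowed between NMIs
    private int executed;           // instructions run since the last release
    private boolean nmiPending;     // set by onNmi(), consumed by the CPU thread

    FrameGate(int budget) {
        this.budget = budget;
    }

    /** Called by the CPU thread after each instruction; blocks when the budget is spent. */
    synchronized void afterInstruction() {
        if (++executed < budget) {
            return;
        }
        try {
            while (!nmiPending) {   // loop guards against spurious wakeups
                wait();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // leave the flag set for the caller
            return;
        }
        nmiPending = false;
        executed = 0;
    }

    /** Called once per frame by whichever thread raises the NMI. */
    synchronized void onNmi() {
        nmiPending = true;
        notifyAll();
    }

    synchronized int executed() {
        return executed;
    }
}
```

The CPU loop would call `afterInstruction()` after every opcode and the frame timer would call `onNmi()`. Keeping a `nmiPending` flag means an NMI that arrives before the CPU thread starts waiting is not lost.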