It is currently Sun Aug 25, 2019 7:24 pm

All times are UTC - 7 hours





Post new topic Reply to topic  [ 51 posts ]  Go to page Previous  1, 2, 3, 4  Next
Author Message
PostPosted: Fri Aug 09, 2019 6:05 pm 
Offline

Joined: Wed May 19, 2010 6:12 pm
Posts: 2886
rainwarrior wrote:
Oziphantom wrote:
A single FPGA sure, no chance, but I don't see why you have to just use 1.

I don't believe that's feasible either. The FPGAs have to be able to handle not just the complexity but the speed as well, and combining multiple FPGAs has diminishing returns, especially dealing with all the connections between them.

For a rough comparison, here's 3 generations of just the CPU:

SNES: 22K transistors, 4MHz CPU
PS1: 1M transistors, 34MHz CPU
PS2: 13M transistors, 300MHz CPU

SNES is currently proven to be commercially viable to reproduce in an FPGA. PS1... not yet. I've heard of an FPGA PS1 project, but not a complete one. PS2 isn't even on the table. (Please correct me if I'm wrong.)

I'm sure at a high enough price there are suitable FPGAs to reproduce a PS2, but I'm also pretty sure they're so expensive ($$$$?) it would be ridiculous to try to use them for this purpose.


General purpose CPUs and software emulators, on the other hand, are already overcoming these problems at a much more reasonable price. There's been something like a 20-year lag between a system becoming viable to emulate on a PC and becoming viable to clone in an FPGA.


A 65816 has 22k transistors? I thought the 6502 had 3500 transistors. If that's correct then how did it jump so much in transistor count?


PostPosted: Fri Aug 09, 2019 7:57 pm 
Offline

Joined: Wed Nov 30, 2016 4:45 pm
Posts: 146
Location: Southern California
psycopathicteen wrote:
A 65816 has 22k transistors? I thought the 6502 had 3500 transistors. If that's correct then how did it jump so much in transistor count?

According to https://en.wikipedia.org/wiki/Transistor_count, WDC's 65c02 (ie, CMOS) has 11,500 transistors, and the 65816 does have 22,000. My guess is that the CMOS uses totem poles all over the place to speed up operation (rather than passive pull-ups, which have a longer RC time constant), and then of course there are all the added instructions and addressing modes. Additionally, the '816 has more and wider registers (remember the bank registers, 16-bit stack pointer, 16-bit direct-page register, and A, X, and Y that can optionally be 16-bit), the 16-bit ALU, and the 24-bit mux'ed address bus.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources


PostPosted: Fri Aug 09, 2019 8:05 pm 
Offline
User avatar

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 7568
Location: Canada
tepples wrote:
You don't have to put the whole console in one FPGA.

I don't think anyone said you had to, but at the same time it's not necessarily helpful to split them up vs. choosing a bigger FPGA, either. The Super NT and Mega SG are both single-FPGA systems, and there's probably a good reason for that.

...and I still doubt there is currently any reasonably priced FPGA or combination of FPGAs that you could use to recreate a PS2 CPU. (Completely ignoring the additional mountain of engineering work required.)

As per calima's complaint, PS2 emulation is still struggling a bit / still in its infancy, but there isn't an FPGA version waiting in the wings to solve all those problems. Developing a good FPGA clone takes as much research as developing an emulator; switching paradigms doesn't do 20 years of platform research automatically. Give it time, and there'll be a better PS2 emulator. A good PS2 hardware clone is probably many years away yet, or maybe it will never come. The only reason the current FPGA systems exist is that we have arrived at a perfect storm of cheap enough hardware and deep public console research knowledge bases to borrow from.


PostPosted: Fri Aug 09, 2019 11:26 pm 
Offline

Joined: Tue Feb 07, 2017 2:03 am
Posts: 750
For mass production, having one less chip to place and one less chip to program is a saving ;) Sure, you would prefer a single-chip design, but adding more is possible; it just adds more complexity and manufacturing cost ;)

For something like a PS2, though, I would go with a Zynq UltraScale+, as it has 2/4 ARM cores plus FPGA logic. They are over $300, though... not a practical thing at this point, as you can get a real PS2 for $50. With the PS2, I would argue that the "sub-pixel accurate, press the button on the pad and there is zero lag" case doesn't really exist: PS2 games that run at 50fps are rare, with most coming in at 12~25, so using a multi-core CPU to emulate it would be "good enough". The PS2, GC, and Xbox are probably the final frontier for such a product, though, as they are the last consoles that are pre-LCD and designed for composite CRTs. The key technology innovation I would see for these emulators is getting a good way to program multiple cores and keep them lock-stepped.
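That lock-stepping idea can be sketched with a per-emulated-cycle barrier. This is a toy illustration only (the core names and the one-barrier-per-cycle granularity are invented here; a real emulator would need something far cheaper per cycle than an OS-level barrier):

```python
# Toy sketch: keep two emulated cores lock-stepped across host threads.
# A barrier guarantees neither core starts cycle c+1 until both finish c.
import threading

def make_core(name, barrier, cycles, trace):
    def run():
        for c in range(cycles):
            trace.append((name, c))   # do this core's work for cycle c
            barrier.wait()            # rendezvous before the next cycle
    return threading.Thread(target=run)

def run_lockstep(cycles=100):
    barrier = threading.Barrier(2)    # two cores in lockstep
    trace = []
    cores = [make_core("cpu", barrier, cycles, trace),
             make_core("sa1", barrier, cycles, trace)]
    for t in cores:
        t.start()
    for t in cores:
        t.join()
    return trace                      # interleaved per-cycle work log
```

The barrier guarantees that both cores' entries for cycle c appear in the trace before either core's entry for cycle c+1, which is exactly the lock-step property; the (large) cost is two thread wake-ups per emulated cycle.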


PostPosted: Sat Aug 10, 2019 1:35 am 
Offline

Joined: Fri Jul 04, 2014 9:31 pm
Posts: 1076
rainwarrior wrote:
What is "sub-pixel latency" supposed to mean?

I was responding to the claim, regarding lag, that an emulator could have "none". In the context of a SNES emulator, I interpreted that as meaning zero (or, to be generous, less than two) extra master clocks between input and output (not necessarily perceptible result, as that's partly on the display device) as compared with real hardware. I would probably be satisfied with less timing precision than that, personally, but that was the claim.

Quote:
If you want 2 simulated chips to correlate with cycle-accurate timing, that's entirely doable on PCs and RPis with emulators.

I know an emulator can be very accurate. I also know that multi-frame latency isn't inherent to software emulation in principle. But in practice, achieving both high accuracy and low latency at the same time is very hard, in particular for a specific console/chip combo that byuu has been complaining about quite recently. As you say, most of the latency is not due to the emulator, but is imposed by the computing environment that runs it. And while it may be different for NES, you can't run an accurate SNES emulator on a system that offers easier low-level access - you need a PC.

Quote:
This is not a performance problem, and I don't know why you think it must be. Many emulators are already doing this kind of thing just fine.

It is absolutely a performance problem. Why do you think higan takes so much CPU power? It's not the individual chips; it's all the syncing. You cannot run higan anywhere near full speed on a RPi.

Which brings up another way in which very low latency could potentially be expensive. If you're simulating exactly what the console does in real time, you have to sync all the chips every cycle. Unless things have changed quite a bit since last time I checked, the only reason higan runs at full speed even on a modern high-powered PC is that it's smart about only syncing when it has to. This means it can't generate half-dots every 93 ns on the tick; it's asynchronous and the output only comes together properly because the results are buffered.

(Correct me if I'm wrong about this.)

Quote:
that's a video device problem

The video device problem is a separate problem, and it applies equally to FPGAs or any other method of making new hardware pretend to be an old console. With a CRT it's not a problem. With an HDTV, the problem is actually worse for original hardware than for clones capable of using HDMI. I'd rather leave the display technology argument aside because it's a whole other discussion.

Quote:
Incidentally, you could probably do scanline-by-scanline output on many PCs' built-in video hardware in VGA mode while connected to a CRT, if you really wanted to go down this road. (Shader language would not help with this in any way, IMO.)

Interesting. But could you do pixel-by-pixel?

I mentioned shader language because I know that GPUs can be used for massively parallel general computing, and it occurred to me that it might be possible to leverage this in an emulator if the goal was a combination of very high accuracy and very low latency. I was basically handwaving at that point because I don't have any expertise with GPU programming.

Quote:
But even if you did this, this concept of zero latency vs 1 frame of latency is almost meaningless.

One frame is absolutely perceptible. Ever play Mario Golf? Even on a CRT, the shot control timing is noticeably late. Also, I've worked with digital music creation tools, and an ASIO buffer size of 20 ms (only a little more than a frame) is unacceptably long for live playing. It's on the same order as the amount of time it takes for a piano key to fully depress after being struck firmly.

According to that keyboard latency page linked earlier, humans can detect as little as 2 ms (0.12 frames) of lag, and perceived lag does make you worse at what you're doing.

...

Dwedit wrote:
Emulators are able to use tricks involving savestates (RunAhead) to skip the game's internal lag frames, and show frames from the future, and thus reduce input lag.

Interesting trick.
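A toy model of that savestate trick (all names here are invented; real implementations such as RetroArch's Run-Ahead work against an emulator core's savestate API): if a game internally delays input by N frames, emulate N extra frames with the current input, display the future frame, and roll back.

```python
import copy

class TinyGame:
    """Toy 'game' whose visible state reacts to input 2 frames late."""
    INTERNAL_LAG = 2

    def __init__(self):
        self.pending = [0] * self.INTERNAL_LAG   # game's input pipeline
        self.screen = 0

    def frame(self, button):
        self.pending.append(button)
        self.screen = self.pending.pop(0)        # shows 2-frame-old input

def run_ahead(game, button, lag_frames):
    """Run lag_frames into the future, display that, then roll back."""
    game.frame(button)                  # the real frame
    saved = copy.deepcopy(game)         # savestate
    for _ in range(lag_frames):         # run ahead with the same input held
        game.frame(button)
    future_screen = game.screen         # what we actually display
    return future_screen, saved         # roll back so game logic stays honest
```

Without run-ahead the button press takes two extra frames to appear on screen; with it, the press is visible on the very frame it happens, while the restored state keeps the game's simulation untouched.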

Quote:
If you have a CRT plugged in, you have beaten the original hardware at latency.

Not if you're using a framebuffer, you haven't (well, the typical full-frame buffer anyway). Also, there are other factors besides the monitor that induce latency on a PC, even if you do figure out how to do direct line-by-line output.

What is hard GPU sync?

Quote:
good enough performance to run multiple frames at once.

That's kinda the catch, isn't it?

Besides, once you try to compensate for more lag than the original game had, you no longer have the necessary input data to emulate the future frames regardless of how fast you can render them, and run-ahead is no longer a perfect display of what the real system would show. If it takes five frames for your controller input to make it through the USB driver to the emulator to the graphics card to the monitor to the actual screen, you need to handle some of that some other way because run-ahead will give you glitches.

All of this game-specific hacking doesn't really damage the case that an FPGA is a "purer" way to get high accuracy at low latency (if anyone were to attempt to make such a case)...

...

Oziphantom wrote:
93143 wrote:
Super Accelerator System
This is an SA-1 cart? Adding a 10MHz 65816 is not going to be a problem at all; even if we write the code to run at the minimal step and emulate the various bus levels, I don't see that a current CPU would have any issue. Is there somewhere that documents the issues faced?

From the Mesen-S thread:
byuu wrote:
The thing that hurts libco with the SA1 is that both the SNES CPU and SA1 can simultaneously access BWRAM and IRAM, which are of course volatile, and ROM can be dynamically remapped. So in effect, for perfect synchronization you would have to synchronize to the other component every time ROM, BWRAM, IRAM, and I/O registers were accessed, which is almost every cycle.
And again:
byuu wrote:
The design of the SA1 is ingenious and evil: the CPU cannot be stalled because the SNES CPU has no concept of external wait states (/DTACK on the Genesis, for instance.) So instead, the SA1 detects when the SA1 CPU tries to access ROM, BWRAM, or IRAM while the SNES CPU is accessing it, and will insert wait states into the SA1 CPU.
Three years ago in a different thread:
byuu wrote:
SA-1 memory conflict stalling is going to be the thing that totally destroys us. We're chasing our tails over a bit of SFX timing issues, but the SA1 is probably running 30% faster than it should.

Now, byuu is sometimes a bit hyperbolic about SNES emulation issues, but that doesn't sound trivial to me.
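For what it's worth, the wait-state scheme byuu describes can be modelled in a few lines (a toy arbiter with invented names, not the real SA-1 logic): the SNES CPU always wins the bus, and the SA-1 side stalls whenever both want the same region on the same cycle.

```python
def simulate_bus(snes_accesses, sa1_accesses):
    """Each list holds the bus region ('ROM', 'BWRAM', 'IRAM', or None)
    that each CPU wants on successive cycles. The SNES CPU can never be
    stalled, so conflicts insert wait states into the SA-1 instead.
    Returns the number of SA-1 wait states inserted."""
    waits = 0
    sa1_queue = list(sa1_accesses)      # SA-1 accesses still outstanding
    for snes_region in snes_accesses:
        if not sa1_queue:
            break
        sa1_region = sa1_queue[0]
        if sa1_region is not None and sa1_region == snes_region:
            waits += 1                  # conflict: SA-1 stalls, retries next cycle
        else:
            sa1_queue.pop(0)            # SA-1 access proceeds this cycle
    return waits
```

The point of the sketch is that the stall count depends on the exact cycle-by-cycle interleaving of both CPUs' accesses, which is why an emulator can't resolve it without synchronizing the two cores almost every cycle.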

Oziphantom wrote:
Rahsennor wrote:
getting low latency on a 'modern' PC is a stone bitch.
This isn't an Emulation vs FPGA argument; this is a "custom-designed thing to do task X" vs "giant general-purpose machine that multitasks and runs lots of different software" argument.

That's the entirety of the argument. An FPGA is not a unique philosophical primitive. If you think about it, it's really just a computer with an unusual architecture, programmed in an unusual language. Running a simulated console on an FPGA is software emulation.

And for a variety of practical reasons, it's much better suited to certain low-latency parallel applications than C++ on a Windows PC.


PostPosted: Sat Aug 10, 2019 11:06 am 
Offline
User avatar

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 7568
Location: Canada
93143 wrote:
rainwarrior wrote:
Incidentally, you could probably do scanline-by-scanline output on many PCs' built-in video hardware in VGA mode while connected to a CRT, if you really wanted to go down this road.

Interesting. But could you do pixel-by-pixel?

Maybe, but this is a bit ludicrous? This is orders of magnitude beyond any threshold of human reaction or even perception (0.3μs?), and also beyond the fidelity of relevant peripherals (i.e. light gun). I completely don't understand the goal of synching video output that finely. There is no plausible purpose.

I mentioned VGA output as a curiosity, just because a lot of PCs happen to still have it, and you could probably run your computer in FreeDOS to get pretty direct access to it with essentially no intervening OS or drivers. It's kind of a bad suggestion though, since you now have a ton of other problems to solve (esp. peripherals like sound device). An RPi is probably infinitely more practical to think about.

93143 wrote:
rainwarrior wrote:
But even if you did this, this concept of zero latency vs 1 frame of latency is almost meaningless.

One frame is absolutely perceptible. ... Also, I've worked with digital music creation tools, and an ASIO buffer size of 20 ms (only a little more than a frame) is unacceptably long for live playing. It's on the same order as the amount of time it takes for a piano key to fully depress after being struck firmly.

According to that keyboard latency page linked earlier, humans can detect as little as 2 ms (0.12 frames) of lag, and perceived lag does make you worse at what you're doing.

I said "almost meaningless", not "imperceptible". There is no hard threshold where there is too much lag, everything depends on purpose.

1. 2ms of lag is not the same problem as 100ms of lag. They should not be equivocated.

2. Musical instrument applications are a different purpose with different latency constraints. The games we're talking about are universally synching to a frame, and gameplay is a reactive feedback loop. A piano is not this.

Under fairly ideal conditions, you have an average of ~7ms (half a frame) of input lag before your input is read by the game. Then there's a whole frame (~15ms) before it begins to display, then there's the time until the part of the screen you want to see the change on displays, maybe another ~7ms on average. On top of this basic ~30ms, you then have ~100-300ms of human reaction time completing the feedback loop.

An extra 2ms on 130ms is a very small effect. Certainly measurable by scientific methods, but still very small. 15ms makes a difference, but how much? Not very much compared to 100ms, which can easily happen on an unluckily configured PC/peripheral/monitor combo. That scale of latency is worth addressing. Less than 1 frame? I really don't care. Magnitude is relevant.
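For the record, that budget works out roughly like this (assuming a 60Hz display, input polled once per frame, and a uniform scan position; the numbers are illustrative, not measurements):

```python
# Back-of-envelope latency budget for a frame-synced game on a 60 Hz display.
FRAME_MS = 1000 / 60            # ~16.7 ms per frame

input_wait  = FRAME_MS / 2      # average wait until the game polls input
frame_delay = FRAME_MS          # one frame before the result starts to draw
scan_wait   = FRAME_MS / 2      # average wait until your part of the screen scans

pipeline_ms = input_wait + frame_delay + scan_wait   # ~33 ms device pipeline
reaction_ms = 130               # low-end human reaction time
total_ms    = pipeline_ms + reaction_ms

extra = 2                       # the 2 ms of lag under discussion
share = 100 * extra / total_ms  # its share of the whole feedback loop (~1.2%)
```

So an extra 2ms is on the order of one percent of the complete feedback loop, which is the sense in which magnitude matters here.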


Playing a musical instrument through ASIO is a different act entirely, and most of the problem you're describing is granularity rather than latency. Having a rhythmic input quantized to the size of the buffer jitters it very noticeably. If you simply added a few ms of latency without being quantized to the buffer, the effect of latency alone would be much more subtle on the scales we're talking about. 15ms of quantization is pretty severe; 15ms of latency much less so, at least in a musical context. 15ms is the same amount of latency you get by playing along with a drummer who is 5 metres away. 15ms is not even a single cycle of a moderately low tone.
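Both acoustic comparisons check out with quick arithmetic (assuming ~343 m/s for the speed of sound in air; values are approximate):

```python
# Sanity-check the two acoustic comparisons.
SPEED_OF_SOUND = 343.0                           # m/s in air at ~20 C

drummer_delay_ms = 5 / SPEED_OF_SOUND * 1000     # drummer 5 m away: ~14.6 ms
tone_hz = 1000 / 15                              # tone whose period is 15 ms: ~66.7 Hz
```

A ~67Hz tone is roughly the second-lowest C on a piano, so 15ms really is about one cycle of a genuinely low note.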

Playing music is not about making split second feedback reactions against visual stimulus, it's about doing something in precise regular rhythm and timing. Again there is a problem of conflating two separate but correlated problems, because software used with ASIO doesn't usually have much ability to deal with granularity as a separate issue.

93143 wrote:
(SA-1 stuff) Now, byuu is sometimes a bit hyperbolic about SNES emulation issues, but that doesn't sound trivial to me.

Well, okay there's a lot going on there, I will grant that this synchronization is probably an easier problem to solve in the FPGA domain, at least when ignoring all other issues. I think I said as much in an earlier post already.

I would still argue that it's not an insurmountable problem and eventually a better solution could be found, but that begs the question whether it's worth solving. Another quote from just below one of the quotes you took:
byuu wrote:
So essentially, yes, bsnes' SA1 core is cycle accurate. But Snes9X's SNES CPU and SA1 cores are both opcode-based, and to my knowledge it does not break any games.

Essentially we're talking about passing a hardware test ROM. Yes, it's a measurable difference in that scientific context. Does it make any difference to the visual or audio output of any of the relevant games?

Byuu is going down a line of research, testing every minute difference that can be accounted for to see if it makes a difference. Usually the first way of implementing anything like this is going to be deliberately inefficient. The goal is to get a test running, not make it efficient. The initial point is finding the magnitude of its effect.

So the question of whether it could be more efficient is a bit moot. If this never demonstrates an effect that we can see in a game we want to run... how much engineering effort do you want to put into making this feature we don't need faster or more efficient? There's a call to action missing here.

I'm not saying that byuu is slouching at all. What I'm saying is that this particular thing probably could be solved more efficiently than byuu's current solution (there's always a faster way). The real question is whether there is a valuable point in doing so vs. how much work it would take. I'm sure there's been effort already to address it, but ultimately how much it affects things is very relevant when choosing what to spend time on. If it made Yoshi explode, it'd get solved faster.

If the only difference is "I can write a test that determines this is not a SNES", well there's a bunch of ways an FPGA system will fail too if you deliberately write new software to expose such a thing (and some of those things are easier to emulate). There are barriers here that are simply not practical to cross on either side of this problem. There's always another irrelevant test you can write.


PostPosted: Sat Aug 10, 2019 11:53 am 
Offline

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 21564
Location: NE Indiana, USA (NTSC)
I'm agreeing with what you say about 2 ms latency vs. 100 ms. To put it in context, 2 ms is about what you get with the scaler in a Hi-Def NES or Analogue Nt mini, as the circular buffer in Hi-Def NES block RAM holds that many scanlines' worth of pixels. I'm also agreeing with what you say about byuu getting it right first and fast later.

But one caveat related to bidirectional synchronization between S-CPU and SA1:
rainwarrior wrote:
I would still argue that it's not an insurmountable problem and eventually a better solution could be found, but that begs the question whether it's worth solving. Another quote from just below one of the quotes you took:
byuu wrote:
So essentially, yes, bsnes' SA1 core is cycle accurate. But Snes9X's SNES CPU and SA1 cores are both opcode-based, and to my knowledge it does not break any games.

Essentially we're talking about passing a hardware test ROM. Yes, it's a measurable difference in that scientific context. Does it make any difference to the visual or audio output of any of the relevant games?

It depends on whether "the relevant games" is a closed class (licensed games only) or an open one (headroom for compatibility with homebrew yet to be released). Without any sort of warning from the emulator that a game is entering dangerous territory sync-wise, a homebrew dev may easily end up accidentally relying on some behavior difference between hardware and the primary emulator. It'd be like the NES scene back when NESticle-only homebrews and ROM hacks were common, or the Super NES scene now, when Snes9x/ZSNES-only Super Mario World hacks are still common. This is why BGB, a Game Boy emulator, includes a bunch of warnings that the user can turn on to break into the debugger whenever the CPU performs a read or write that doesn't have the expected effect.
Attachment: bgb_exceptions.png
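The kind of check BGB does could be sketched like this (an invented `WarnBus` API, not BGB's actual code): wrap the emulated bus and record any access that has no effect, or an unexpected effect, on real hardware.

```python
# Toy memory bus that flags "dangerous" accesses a debugger could break on.
class WarnBus:
    def __init__(self, rom, ram_size=0x2000):
        self.rom = rom                    # read-only region at the bottom
        self.ram = bytearray(ram_size)    # RAM mapped directly after ROM
        self.warnings = []                # a debugger could break on these

    def read(self, addr):
        if addr < len(self.rom):
            return self.rom[addr]
        off = addr - len(self.rom)
        if off < len(self.ram):
            return self.ram[off]
        self.warnings.append(f"read from unmapped ${addr:04X}")
        return 0xFF                       # open-bus-ish value

    def write(self, addr, value):
        if addr < len(self.rom):
            self.warnings.append(f"write to ROM ${addr:04X}")   # no effect
            return
        self.ram[addr - len(self.rom)] = value & 0xFF
```

A homebrew game that accidentally writes to ROM or reads unmapped space runs "fine" in a forgiving emulator, but a bus like this surfaces the mistake immediately instead of letting the dev ship it.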


I imagine what may be going through at least one reader's mind: "Then make one emulator that's real-time and another that's cycle accurate. Test a homebrew game under development for technical compliance in the accurate emulator, and test for play balance in the fast one." But relying solely on automated tests for technical compliance, with play balance testing being completely separate, has its own drawbacks. One is test coverage: extensive play balance testing may cover more obscure code paths that the programmers accidentally failed to test for technical compliance.

_________________
Pin Eight | Twitter | GitHub | Patreon


PostPosted: Sat Aug 10, 2019 12:41 pm 
Offline
User avatar

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 7568
Location: Canada
The topic is FPGA vs Emulator, and the question was whether it can run accurately on an RPi. Demanding that developer tools also run at full framerate on an underpowered RPi is a bit out of hand, IMO. That's way outside the scope of what FPGAs are even for. It's an inappropriate comparison.

If you want to test for hardware compliance in the way you're describing, neither an FPGA nor an emulator is a sufficient test, and for this purpose FPGAs are very much a worse tool for most aspects of the work.

Emulators can have integrated debuggers, breakpoints, watch for various conditions, scripting, etc. This is just not going to happen on an FPGA clone. That's a great thing about emulators like byuu's: you can throw extra CPU around to test edge cases that would be completely impractical to verify on hardware.


In terms of accuracy FPGA clones are at best no better than the most accurate emulators, all other factors ignored. As a software development tool they are worse.

In terms of efficiency for playing games accurately, they might be able to function with lower power consumption but that's a little bit apples to oranges considering all the other hardware involved. Again, the primary purpose of the FPGA clone is that it's compatible with your cartridges and other original hardware.


PostPosted: Sat Aug 10, 2019 1:39 pm 
Offline
User avatar

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 7568
Location: Canada
tepples wrote:
Then make one emulator that's real-time and another that's cycle accurate. ... But relying solely on automated tests for technical compliance, with play balance testing being completely separate, has its own drawbacks. One is test coverage: extensive play balance testing may cover more obscure code paths that the programmers accidentally failed to test for technical compliance.

This is not solved by FPGA clones, and you're taking the case in the discussion way out of proportion.

For broad testing, the best coverage comes from having more people testing. Having something that runs in emulators will get a lot more people to play it. A compromised emulator in 1000 hands is better than an accurate one in 3.

Broad testing is most likely to catch logical errors in programming. Bugs due to subtle SA-1 timing differences would be extraordinarily unlikely to come up in that kind of testing, and certainly not in the short term. Yeah, in some absolute sense it "could" happen; a million monkeys typing Shakespeare and whatnot. In the real world it's just not very significant. Proportion is important. FPGA clones are great to get testing on, but I think the idea that they will usefully catch some specific class of bug that emulators can't is way off base.

This is also ignoring the ways in which it's easier for an emulator to be accurate than an FPGA. It is only in some particular issues like correlating timing that it seems to be easier... and even then, often it isn't, really. Often the accurate thing is still very "real-time" viable, and it almost feels a little bit propagandist to start throwing around terms like that to describe the issue. Accuracy and CPU load aren't a 1:1 thing, and many things that are accurate are perfectly efficient to implement; they just take the right knowledge. A lot of things that make various emulators high-CPU have nothing to do with accuracy. The term just doesn't apply very cleanly to anything in the real world except extremes like Visual2A03. Even when talking about higan, its application is pretty muddy.


PostPosted: Sat Aug 10, 2019 5:41 pm 
Offline

Joined: Fri Jul 04, 2014 9:31 pm
Posts: 1076
rainwarrior wrote:
93143 wrote:
But could you do pixel-by-pixel?
Maybe, but this is a bit ludicrous?

Of course, but the claim I was responding to was that an emulator could have no latency.

There are two issues here. One is the literal no-latency case, which, while not obviously impossible, is certainly absurdly difficult to pull off and not really necessary even if you could. The other is the matter of how low you can reasonably get the latency, and whether it matters, vs. what you can do with an FPGA.

Quote:
2ms of lag is not the same problem as 100ms of lag. They should not be equivocated.

You mean equated, I hope, or something along those lines. Equivocation is a different word entirely, and should not be used lightly in a debate...

The point is that latency much smaller than a frame is in fact perceptible, and may even affect performance at certain tasks. It seems reasonable to extrapolate that the difference between 1.7 and 2.7 frames of end-to-end lag is not irrelevant even in a practical sense. You will feel it, and it will affect gameplay.

Quote:
On top of this basic ~30ms, you then have ~100-300ms of human reaction time completing the feedback loop.

It's not just about a feedback loop. It's about timing your actions precisely, which requires much finer control than that. The PIO I've experienced in emulated Super Mario Kart is an egregious worst case; there are a lot of ways the gamefeel can get worse before it reaches that level.

The human brain compensates for its own signal acquisition and processing lag. You can't just add it to the lag from the computer, because it doesn't do the same thing to the experience.

Quote:
Playing a musical instrument through ASIO is a different act entirely, and most of the problem you're describing is granularity rather than latency.

It's not just buffer size that can cause problems. I've programmed a digital piano, and playing the key attack noise before the main tone starts really does substantially affect the feel, even if the attack is muted. The immediacy is gone. It's not as bad as jitter, but it's still relevant. 20 ms is not fast when you're talking about feedback from a tactile input device. Being a few metres away from the drummer isn't as big a deal because it's just the sound, the lag is consistent, and nothing else in the band needs to sync that precisely with the beat to sound good to the audience, but if you were the drummer and you experienced that level of lag it would absolutely throw you off.

...it occurs to me that being nearly 20 feet away from the drummer might actually be a problem in some cases. Live performances do suffer when there's enough space separating the performers, and the problem compounds quickly when they aren't experienced at lag compensation...

Quote:
Playing music is not about making split second feedback reactions against visual stimulus

Neither is videogaming, for the most part. Games like Punch-Out are the exception, not the rule. A game like Super Mario Bros. or Mario Golf or Guitar Hero is about performing relatively simple tasks with much finer precision than you could ever manage by just reacting with no plan.

Quote:
Essentially we're talking about passing a hardware test ROM.

There's another side to the argument tepples is making. If anybody ever writes an SA-1 game that depends on it working like real hardware beyond the level of the Snes9X hack, a solution will have to be found or that game won't work properly in emulators that don't go the extra mile.

...

Attempting a summary of what I feel are some of the relevant points:

PC:
- not hardware compatible
- latency depends on system configuration and emulator system interface design (and a publicly-distributed emulator can't rely on certain types of crazy hardware tricks)
- non-expert users may additionally suffer effects such as screen tearing, judder, long audio buffer times, and OS-induced timing glitches
- high power draw
- PC user experience, nothing like a console
- can be very accurate, but may require expensive hardware and creative programming to guarantee full speed

FPGA:
- hardware compatible
- no latency (quoting you here)
- lower power draw
- much more console-like user experience
- can be very (perfectly?) accurate at no additional cost to anyone other than the programmer

Raspberry Pi:
- not hardware compatible
- potentially low latency
- lower power draw
- potentially somewhat console-like user experience once it's set up (no carts though)
- lower accuracy due to limited computational resources

It's not so much "purity" as it is the overall fidelity of the experience. Being able to use real cartridges in a standalone device with minimal latency is a lot more like actually using a NES or SNES, as compared with spending hundreds of watts on laggy software emulation on a giant multipurpose PC that requires you to go to extreme measures to play your games without breaking the law. I'd go so far as to say that if PC emulation was all we had, the true experience of playing a classic console would ultimately be lost, regardless of the accuracy of the emulation.

Which is why the display technology issue bugs me so much. How many kids nowadays have tried Super Mario Bros. 3 in an emulator and gotten the impression that that was how the game actually felt back in the day? How many have tried it on a NES Classic on an HDTV and thought that? You'd think you could solve the problem by using a real NES - but plugged into an HDTV it's not going to be any better, and in fact it's likely to be worse.

Real pre-3D console games put modern gaming to shame with how immediate the gamefeel was. Something has been lost.


Last edited by 93143 on Sat Aug 10, 2019 7:31 pm, edited 1 time in total.

PostPosted: Sat Aug 10, 2019 7:28 pm 
Offline
User avatar

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 7568
Location: Canada
93143 wrote:
Of course, but the claim I was responding to was that an emulator could have no latency.

You said this twice now, but nobody claimed this.

The closest was dwedit, who suggested a game's built-in latency could be taken advantage of with a rewind-and-replay emulator for lower overall latency, and even linked a video demonstration. It's quite factual and demonstrable, but it can do objectionable things to games that don't have built-in delays or have more quickly divergent simulations. Not a complete general solution, IMO, but an interesting solution for some games. (It has some great applications in netplay though.)

93143 wrote:
You mean equated, I hope, or something along those lines. Equivocation is a different word entirely, and should not be used lightly in a debate...

I meant only that I felt that disparate ranges of latency were being conflated in this discussion, and I felt the need to disambiguate them. Wasn't trying to call you a liar, if that's how you took it.

93143 wrote:
The human brain compensates for its own signal acquisition and processing lag. You can't just add it to the lag from the computer, because it doesn't do the same thing to the experience.

I completely disagree with this. The human response is absolutely part of the interaction feedback loop, and very much affects how strong the effect of additional lag is.

Similarly if you wanted to use an FPGA clone with a modern TV rather than a CRT, which is actually one of their big selling points, the amount of lag that TV has is completely relevant to the discussion. If you don't have a low latency TV, the better latency of an FPGA clone won't have nearly as much impact. That's something you need to know about and account for if low latency is a reason that attracts you to an FPGA clone product.


And that's kinda the whole point I was trying to make in a nutshell: in an absolute sense you could say that device A has 5 ms more latency than device B, or that device A gets one particular aspect of timing better than device B, but OP is asking for opinions, and it's important to know how much these objective comparisons matter. The total amount of latency in the loop is what's relevant.
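To make that concrete, here's a back-of-envelope latency budget. Every figure below is an assumed ballpark for illustration, not a measurement of any particular device or TV:

```python
# Illustrative end-to-end latency budgets, in milliseconds.
# All numbers are assumed ballparks, not measurements.
def total_latency(stages):
    return sum(stages.values())

fpga_crt = {
    "controller poll": 8,    # up to one 60 Hz frame; ~half on average
    "console/core":    1,    # FPGA core responds within the frame
    "CRT scanout":     8,    # mid-screen average
}
emulator_hdtv = {
    "controller poll": 8,
    "USB + OS input":  5,
    "emulator frame":  17,   # one buffered frame
    "GPU present":     17,   # an extra queued frame (no hard sync)
    "TV processing":   33,   # scaler/deinterlacer, Game Mode off
}

print(total_latency(fpga_crt), total_latency(emulator_hdtv))
```

The point of the arithmetic: a 5 ms difference between two devices is real, but it's small next to the rest of the chain — fix the TV and presentation path first, or the device comparison barely matters.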


Similarly, I think the idea that the Super NT has more accurate SA-1 timing is meaningless for the machine's purpose. It won't make any difference for any game you play, and as a machine it is not built to be a developer tool or a perfect substitute. It's a very compatible machine, and that's it. If you really want to do something that depends on that kind of thing, this product is not good enough for that. Get a SNES and build yourself an SA-1 dev board.

The "but if some homebrewer needs it in the future" is an argument people bring up all the time for various pet emulator features, and I don't buy it here. There are infinite homebrew possibilities like this, and almost none of them are ever going to happen in a real release. You and tepples aren't even proposing something someone could do with it, only that it's possible, and probably by accident? That's not compelling.

The moment someone actually does release something that needs it that people want to play, emulators will adapt. Same deal with the Super NT: it's already had many firmware updates addressing compatibility issues. So has every FPGA clone system. They get fixed when incompatibilities are found, just like emulators are. FPGA clones can't be "perfectly" accurate any more or less than emulators can.


PostPosted: Sat Aug 10, 2019 7:41 pm 
Offline

Joined: Wed May 19, 2010 6:12 pm
Posts: 2886
Garth wrote:
psycopathicteen wrote:
A 65816 has 22k transistors? I thought the 6502 had 3500 transistors. If that's correct then how did it jump so much in transistor count?

According to https://en.wikipedia.org/wiki/Transistor_count, WDC's 65c02 (ie, CMOS) has 11,500 transistors, and the 65816 does have 22,000. My guess is that the CMOS uses totem poles all over the place to speed up operation (rather than passive pull-ups which have a longer RC time constant), and then of course there are all the added instructions and addressing modes, and additionally for the '816, more and wider registers (remember the bank registers, 16-bit stack pointer, 16-bit direct-page register, and A, X, and Y that can optionally be 16-bit), the 16-bit ALU, and the 24-bit mux'ed address bus).


Wow, I never knew CMOS inherently took more transistors than NMOS.


PostPosted: Sat Aug 10, 2019 8:16 pm 
Offline
User avatar

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 7568
Location: Canada
93143 wrote:
What is hard GPU sync?

I assume it refers to something like this:
https://docs.microsoft.com/en-us/windows/uwp/gaming/reduce-latency-with-dxgi-1-3-swap-chains

There was a point, maybe during or after the DirectX 9 era, where the default way of presenting the backbuffer introduced extra latency in favour of higher GPU throughput/parallelism. This crept in slowly through driver interfaces, and it's part of why some emulator communities have superstitions about Direct3D versus OpenGL backends; sometimes a driver responded differently to each. (Part of the eternal arms race to get that FPS number higher.)

Thankfully they exposed this mechanism directly and explicitly in DXGI 1.3 (circa 2013?) with waitable swap chains, which let you take back that trade. Other APIs like Vulkan have similar capabilities. Video drivers may still override it at their discretion, but at this point I think it works in most cases.


PostPosted: Sat Aug 10, 2019 9:02 pm 
Offline
User avatar

Joined: Fri Nov 19, 2004 7:35 pm
Posts: 4202
"What is Hard GPU Sync?" Good question. As far as I know, it's OpenGL-only; I'll go peek at what it does.

Okay, here we go... It's this: glFenceSync combined with glClientWaitSync
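For intuition about what that pair of calls buys you: `glClientWaitSync` blocks the CPU until the GPU has actually finished the fenced work, so the emulator can never run a frame (or more) ahead of what's on screen. The same idea can be mimicked in plain Python with a worker thread standing in for the GPU — purely an analogy, not GL code:

```python
# CPU-side analogy of "hard GPU sync": a worker thread plays the GPU,
# and after submitting each frame the main thread blocks on a fence
# (an Event, standing in for glFenceSync + glClientWaitSync) until the
# frame is really done, so frames can never queue up behind each other.
import queue
import threading

def gpu_worker(work_q):
    while True:
        frame, fence = work_q.get()
        if frame is None:           # shutdown sentinel
            break
        # ... pretend to render `frame` here ...
        fence.set()                 # signal: this frame has finished

work_q = queue.Queue()
threading.Thread(target=gpu_worker, args=(work_q,), daemon=True).start()

frames_in_flight = 0
for frame in range(3):
    fence = threading.Event()       # ~ glFenceSync
    work_q.put((frame, fence))
    fence.wait()                    # ~ glClientWaitSync: block here
    frames_in_flight = max(frames_in_flight, work_q.qsize())

work_q.put((None, None))
print(frames_in_flight)             # prints 0: nothing ever queued up
```

The trade is the same one the real calls make: you give up the throughput of letting the driver buffer a frame or two in exchange for the lowest possible input-to-photon delay.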

_________________
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!


PostPosted: Sun Aug 11, 2019 11:04 am 
Offline

Joined: Mon Apr 16, 2007 10:07 am
Posts: 122
All this talk of lag with emulators vs. FPGA vs. original HW fails to take into account one thing (unless I've missed it):

Not all inputs on (most) display devices are treated equally... Unless you game exclusively on monitors.

Ignoring emulators for a moment, let's just compare PLD-based re-implementations to original hardware.

Pretty much all original hardware before the Dreamcast era output a video signal that a large percentage of non-CRT televisions will treat as interlaced... Add lag for deinterlacing. Next, the television has to scale the SD image up to the resolution of its panel... Add more lag. And because the input is coming from an SD analogue source, some TVs won't let you disable all picture processing, so that's more lag again.

Add a device like a RetroTINK 2X and you can reduce the lag to under one frame on some displays. Add something like an OSSC and you're nearly golden, except for the edge cases, which gaming has a number of. A game wants to switch between 240p/480i? Enjoy your blank screen for a few seconds as everything re-syncs. Your game switches resolution every time you go in and out of the menu? See what I'm getting at?

Now, with a PLD re-implementation you can sidestep nearly all of this. At the very least you can build in a scandoubler, which instantly removes one of the larger sources of lag in the chain. Many of the commercial FPGA-based re-implementations output at least 720p, meaning that most TVs end up using simpler and quicker upscaling paths, and those that can do 1080p can sidestep the TV's scaler altogether. With the right tweaks to the video metadata, most TVs will treat the input as if it's coming from a computer and disable nearly all of their built-in, lag-inducing picture and motion 'enhancing' features.
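The scale of the savings is easy to ballpark: a TV that buffers a full frame to deinterlace "480i" eats about two 60 Hz fields before showing anything, while a scandoubler inside the FPGA core only needs to hold about one scanline. The figures below are rough NTSC assumptions, not measurements of any specific TV:

```python
# Rough NTSC numbers, assumed for illustration only.
FIELD_MS = 1000 / 59.94          # one field / one 240p frame
SCANLINE_MS = FIELD_MS / 262.5   # one scanline

deinterlacer_lag = 2 * FIELD_MS       # TV buffers ~a full frame first
scandoubler_lag = 1 * SCANLINE_MS     # line-doubling holds ~1 line

print(round(deinterlacer_lag, 1), round(scandoubler_lag, 3))
```

Roughly 33 ms versus a few hundredths of a millisecond — which is why moving the scandoubling into the console-side device, before the TV ever sees the signal, is such a big win.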

_________________
Insert witty sig. here...

