Windowed VSync

ccovell
Posts: 1045
Joined: Sun Mar 19, 2006 9:44 pm
Location: Japan

Re: Windowed VSync

Post by ccovell »

You can skip my post, but coming from an upbringing on systems that had instantaneous scrolling and animation (NES, Amiga) the lagginess & shearing that still seems unavoidable in Windows makes me... (searches for the "puke" smiley)
rainwarrior
Posts: 8734
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Re: Windowed VSync

Post by rainwarrior »

It's not unavoidable at all if you go to fullscreen mode and put your game thread at higher priority.

The reason Windows is laggy and sheary in windowed mode is that you're doing 10 things at once. Your NES couldn't do that at all. Stuff that ran on the Amiga Workbench had plenty of lag and flicker.
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Windowed VSync

Post by tepples »

I did several things at once on a 1999 laptop with one 0.333 GHz P6-class core. Clock speeds, number of cores, and instructions per clock have increased since then. Shouldn't a 2 GHz quad core i7 (6x clock, 4x parallelism, 2x IPC) be able to run more than 10 things?
ReaperSMS
Posts: 174
Joined: Sun Sep 19, 2004 11:07 pm

Re: Windowed VSync

Post by ReaperSMS »

Various issues with some of the ideas in this discussion:

"VSync should just be an interrupt!": Ok, who gets it, and in what order? If it's a single core machine, at best everything that asked to be notified will find out about it, in some order. You have no idea what other programs are trying to do, and the OS can't really enforce things. Likely outcome: you tear anyways, because your particular timeslice doesn't land until halfway down the screen.

"Scrolling and large movements look terrible, they didn't on my Amiga!": Drawing under Windows is inherently tied to the WM_PAINT message, which gets delivered to each window that needs it. For it to show up, first the message goes into the application's message queue, which its main thread has to pump occasionally. Windows has no control over when that happens, and applications are terrible about doing that in a timely fashion. When you get the usual "Blah (Not Responding)", that's the application not pumping its message loop. If it isn't pumped, the message doesn't go to the window, and no draw commands happen.

Once the message is delivered, Windows calls the window proc, and usually that just switches on the message type. WM_PAINT handlers have some additional requirements: they have to wrap their draws with BeginPaint() and EndPaint(), which validates the part of the screen they're asked to repaint. The drawing for most things is generally done with GDI, which has a terrible track record for hardware acceleration. Additionally, applications can catch and override any of the messages Windows sends down, including the ones involved in painting the window frame, title bar, etc.

On top of all that, a lot of apps do not bother trying to paint just the part that was invalidated, and redraw their entire window no matter what. Usually they start with a clear, to make sure things look right, and may take a particularly long time to iterate through redrawing all their controls. You've probably seen the crap that shows up in parts of a window that get uncovered when that app isn't responding to WM_PAINT for some reason. Apps that have a more active update approach (emulators, games, anything that's doing particularly intensive video playback, etc) may update an offscreen buffer, and blit it over in WM_PAINT, but that isn't necessarily guaranteed to happen at any particular time.

Further complicating that are situations like dragging, or hitting menus, etc. Sometimes people write horrible message loops that stall the rest of the application while the user is interacting with that stuff. Whether you drag windows or outlines is under user control; if it's outlines, you won't get any WM_PAINT messages anyway.

Lastly, there's just a ton of data to move around sometimes. A naive redraw may redraw every pixel in the window 2-3 times, between clearing, and excessive amounts of text or transparent junk. A 1024x768 window at 32bpp is ~3.1M of noncontiguous data in the framebuffer. 1920x1080 is ~8.3M.
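To put rough numbers on the data volume mentioned above, here is a minimal sketch of the arithmetic. It assumes 32 bits per pixel and treats "redrawn 2-3 times" as whole extra passes over the window; the function names are made up for illustration.

```python
def framebuffer_bytes(width, height, bits_per_pixel=32):
    """Bytes touched by one full pass over a window of this size."""
    return width * height * bits_per_pixel // 8


def redraw_traffic_bytes(width, height, passes, bits_per_pixel=32):
    """Bytes written if every pixel is redrawn `passes` times (naive redraw)."""
    return framebuffer_bytes(width, height, bits_per_pixel) * passes


for w, h in [(1024, 768), (1920, 1080)]:
    one_pass = framebuffer_bytes(w, h)
    three_passes = redraw_traffic_bytes(w, h, 3)
    print(f"{w}x{h}: {one_pass / 1e6:.1f} MB per pass, "
          f"{three_passes / 1e6:.1f} MB for 3 passes")
```

At 60 redraws per second, the three-pass 1920x1080 case would be on the order of 1.5 GB/s of writes, which is why partial repaints matter.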

tl;dr: GUIs are Hard.
WedNESday
Posts: 1284
Joined: Thu Sep 15, 2005 9:23 am
Location: Berlin, Germany

Re: Windowed VSync

Post by WedNESday »

VSync should just be a bit going from low to high, like on the NES. Yes, I know that all kinds of programs might wait for it to go high and then all want a piece of the action at the same time, but it would still be a nice feature, especially if you could change via the Windows API which program(s) get the highest priority.
ccovell
Posts: 1045
Joined: Sun Mar 19, 2006 9:44 pm
Location: Japan

Re: Windowed VSync

Post by ccovell »

ReaperSMS wrote:Various issues with some of the ideas in this discussion:

"Scrolling and large movements look terrible, they didn't on my Amiga!": Drawing under Windows is inherently tied to the WM_PAINT message, which gets delivered to each window that needs it. For it to show up, first the message goes into the application's message queue, which its main thread has to pump occasionally. Windows has no control over when that happens, and applications are terrible about doing that in a timely fashion. When you get the usual "Blah (Not Responding)", that's the application not pumping its message loop. If it isn't pumped, the message doesn't go to the window, and no draw commands happen.

...

On top of all that, a lot of apps do not bother trying to paint just the part that was invalidated, and redraw their entire window no matter what. Usually they start with a clear, to make sure things look right, and may take a particularly long time to iterate through redrawing all their controls. You've probably seen the crap that shows up in parts of a window that get uncovered when that app isn't responding to WM_PAINT for some reason. Apps that have a more active update approach (emulators, games, anything that's doing particularly intensive video playback, etc) may update an offscreen buffer, and blit it over in WM_PAINT, but that isn't necessarily guaranteed to happen at any particular time.

tl;dr: GUIs are Hard.
Rainwarrior: Well, the Amiga in that vid had a clock speed of 7.14 MHz... so some GUI redraw lag is quite forgivable.

Another thing compounding that, and related to what Reaper said, the default redrawing mode for Amiga windows is "simple" which means everything gets redrawn every time something is moved, resized... There is a "smart" redraw which uses the blitter to store hidden windows in RAM, then blit them to "VRAM" when they are revealed as windows are moved. Dumb programs might also only use the "simple" method.

Hopefully smart Windows programs also use some kind of smart window caching routine and don't tie up the rest of the system too.

But I guess this topic is about window refresh, not redrawing, so ignore my comments... :arrow:
WedNESday
Posts: 1284
Joined: Thu Sep 15, 2005 9:23 am
Location: Berlin, Germany

Re: Windowed VSync

Post by WedNESday »

What if I were to wait for the vblank with a DirectX function, then start a timer and time, say, 100 frames to get an average? That way, could I just wait for my timer to hit 0 and then Blt windowed?
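A minimal sketch of the averaging idea in this post, assuming you already have a list of timestamps (in seconds) captured right after each vblank wait returned. How those timestamps are obtained is left out; `average_frame_period` and `next_vblank` are hypothetical helper names, not part of any API.

```python
import math


def average_frame_period(vblank_times):
    """Mean interval between consecutive vblank timestamps, in seconds."""
    if len(vblank_times) < 2:
        raise ValueError("need at least two timestamps")
    # Using first and last only is equivalent to averaging every interval.
    return (vblank_times[-1] - vblank_times[0]) / (len(vblank_times) - 1)


def next_vblank(last_vblank, period, now):
    """Predict the first vblank after `now`, extrapolating from a known one."""
    frames_ahead = math.floor((now - last_vblank) / period) + 1
    return last_vblank + frames_ahead * period


# Simulated capture: 101 timestamps spanning 100 frames of an ideal 60 Hz display.
samples = [i / 60.0 for i in range(101)]
period = average_frame_period(samples)
print(f"estimated period: {period * 1000:.3f} ms")
print(f"next vblank after t=0.020 s: {next_vblank(0.0, period, 0.020):.4f} s")
```

Any error in the measured period accumulates with each extrapolated frame, so a scheme like this would presumably need to re-anchor on a real vblank every so often.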
natt
Posts: 76
Joined: Fri Oct 26, 2012 5:27 pm

Re: Windowed VSync

Post by natt »

WedNESday wrote:What if I were to wait for the vblank with a DirectX function, then start a timer and time, say, 100 frames to get an average? That way, could I just wait for my timer to hit 0 and then Blt windowed?
Do you have access to a timer that good? Probably not.
snarfblam
Posts: 143
Joined: Fri May 13, 2011 7:36 pm

Re: Windowed VSync

Post by snarfblam »

It sounds like it's worth a try. Just keep in mind that the accuracy of timers can vary between different machines.
koitsu
Posts: 4201
Joined: Sun Sep 19, 2004 9:28 pm
Location: A world gone mad

Re: Windowed VSync

Post by koitsu »

QueryPerformanceCounter() with QueryPerformanceFrequency(). Or possibly timeGetTime().

Bottom line: if the underlying hardware the OS is being run on lacks decent timecounters (i.e. the HPET implementation is broken or not enabled, ACPI DSDT is wrong, TSC is broken (this would affect a lot more than just QPC though!), or (and god forbid this be the case) the system's TC is the i8254), then too bad. I wouldn't bother trying to cater to those folks either, especially anyone using the i8254.
natt
Posts: 76
Joined: Fri Oct 26, 2012 5:27 pm

Re: Windowed VSync

Post by natt »

QueryPerformanceCounter() doesn't give you a way to sleep your thread and be woken up with accurate timing, though. And if you just spin on it, you're wasting CPU time, and the OS will swap you out at some point; could miss your vblank that way?

BizHawk uses D3D9's D3DPRESENT_INTERVAL_ONE if the user asks for vsync. It seems to work well for me in windowed mode (low CPU usage, stable with 30 Hz test patterns), but I guess some implementations are better than others.
koitsu
Posts: 4201
Joined: Sun Sep 19, 2004 9:28 pm
Location: A world gone mad

Re: Windowed VSync

Post by koitsu »

natt wrote:QueryPerformanceCounter() doesn't give you a way to sleep your thread and be woken up with accurate timing, though. And if you just spin on it, you're wasting CPU time, and the OS will swap you out at some point; could miss your vblank that way?
Please define "accurate timing". Give me numbers and units. How granular are we talking here? 1ms? 2ms? 10ms? Longer? Without actual numbers it's difficult to say "you're right" or "let me point you to a resource...".

On *IX systems -- I'll use FreeBSD as an example but I know Linux and Solaris have their own implementations as well -- we have kqueue(2) which allows a userland application a way to tell the kernel to "call back into" the userland application and run code at a set interval (resolution as low as 1ms -- see EVFILT_TIMER).

I'm fairly certain Windows has an equivalent of this; this article implies such. But in general, as discussed in this thread, there are certain CPU-level features (specifically CPU P-states and EIST) that can add delays to the interval time. There's nothing you can do about that, honestly -- and don't think those necessarily will cause massive amounts of latency (see the very last post in that thread; the author insisted those features were the cause of stuttering, but the issue turned out to be in his code).

Some posts on stackoverflow also indicate that GetTickCount(), preceded by a call to timeBeginPeriod() (to request 1ms resolution), tends to be more accurate. The problem with GetTickCount() is that you do have to spin in a tight loop. Not very ideal. I will admit here that I have not looked at Windows' thread creation API so I don't know what's available WRT that.

But as I understand it, on Windows the best option you have to keep a CPU from being maxed out while waiting for a thread to fire is Sleep(). And that's discussed in this thread.
natt
Posts: 76
Joined: Fri Oct 26, 2012 5:27 pm

Re: Windowed VSync

Post by natt »

koitsu wrote:
natt wrote:QueryPerformanceCounter() doesn't give you a way to sleep your thread and be woken up with accurate timing, though. And if you just spin on it, you're wasting CPU time, and the OS will swap you out at some point; could miss your vblank that way?
Please define "accurate timing". Give me numbers and units. How granular are we talking here? 1ms? 2ms? 10ms? Longer? Without actual numbers it's difficult to say "you're right" or "let me point you to a resource...".
Vblank can vary quite a bit from setup to setup, so I'll put some specific numbers on it: My nVidia drivers, at 1920x1080@60.00 Hz with CVT reduced blanking, give a vblank length of 465 us. Using DMT only puts that up to 667 us. At that resolution and refresh rate, GTF and CVT are out of the range of HDMI single speed (165 MHz), and so are not possible. (I don't think my setup is all that atypical?)
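For reference, vblank lengths like these can be reproduced from a mode's timing parameters: blanking lines times the time per scanline. The sketch below assumes the commonly published 1920x1080@60 modelines for CVT reduced blanking (138.5 MHz, 2080x1111 total) and for the DMT/CEA-861 mode (148.5 MHz, 2200x1125 total). Those timing values are not from the post, and an actual driver may use slightly different ones.

```python
def vblank_duration_us(pixel_clock_mhz, h_total, v_active, v_total):
    """Length of vertical blanking in microseconds.

    pixel_clock_mhz is pixels per microsecond, so h_total / pixel_clock_mhz
    is the duration of one scanline, including horizontal blanking.
    """
    line_time_us = h_total / pixel_clock_mhz
    blanking_lines = v_total - v_active
    return blanking_lines * line_time_us


# (pixel clock in MHz, horizontal total, vertical active, vertical total)
# Assumed standard timings, not measured values.
MODES = {
    "CVT reduced blanking": (138.5, 2080, 1080, 1111),
    "DMT / CEA-861": (148.5, 2200, 1080, 1125),
}

for name, timing in MODES.items():
    print(f"{name}: {vblank_duration_us(*timing):.1f} us of vblank")
```

Either way, the window is well under a millisecond, which is the granularity being asked for here.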

I'm not aware of any capability in MS Windows to time that finely without spinning or involving some particular hardware interrupt that isn't universally available. There are a lot of Windows APIs out there, though.
koitsu
Posts: 4201
Joined: Sun Sep 19, 2004 9:28 pm
Location: A world gone mad

Re: Windowed VSync

Post by koitsu »

...but QueryPerformanceCounter() will give you microsecond precision, despite slight bits of overhead (depends on what you're doing). So I'm still not convinced this is a problem. Here's some other information I found, which also contains code:

http://www.geisswerks.com/ryan/FAQS/timing.html

The conclusion he reaches at the end of this article is that a combination of the two models should work best. And given what I understand of both of those models, I'm in full agreement. This, to me, is separate from the thread issue -- again, I have little experience/knowledge about Windows' threading model, but Sleep() should do the trick for the most part.
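A rough sketch of what combining the two approaches could look like, assuming "the two models" means sleeping (cheap but coarse) and spinning on a high-resolution counter (precise but CPU-hungry). It is written with Python's standard `time` module rather than the Win32 calls directly; on Windows, `time.perf_counter()` is backed by QueryPerformanceCounter(). The function name and the 2 ms margin are arbitrary choices for illustration.

```python
import time


def wait_until(deadline, spin_margin=0.002):
    """Block until time.perf_counter() reaches `deadline` (in seconds).

    Sleeps for the bulk of the wait so the CPU stays idle, then busy-waits
    through the last `spin_margin` seconds, where sleep granularity is too
    coarse to trust. Returns the counter value observed on wake-up.
    """
    while True:
        now = time.perf_counter()
        remaining = deadline - now
        if remaining <= 0:
            return now
        if remaining > spin_margin:
            # Coarse phase: give the time slice back to the OS.
            time.sleep(remaining - spin_margin)
        # Fine phase: fall through and loop, polling the counter.


start = time.perf_counter()
target = start + 1 / 60          # one 60 Hz frame from now
woke = wait_until(target)
print(f"overshoot: {(woke - target) * 1e6:.0f} us")
```

The margin trades CPU time for accuracy: it has to be at least as large as the worst oversleep you expect from the OS scheduler, and the thread can still be preempted during the spin.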

Overall, if there is some API that's similar to FreeBSD kqueue(2) that lets the kernel call userland code at set intervals but with microsecond granularity, that would be ideal. But I guess my point is that PC hardware today, and over the past 5-6 years, has offered HPETs and ACPI timers that are extremely granular. Software can detect ones which aren't, so I'm left wondering where the issue lies, other than "all this is so damn complex!" (with which I completely agree :-) ).
blargg
Posts: 3715
Joined: Mon Sep 27, 2004 8:33 am
Location: Central Texas, USA

Re: Windowed VSync

Post by blargg »

Doesn't the OS allow you to have it blit the entire window on the next frame without any tearing? It seems like very commonly-needed functionality, for anything showing any kind of video. At least on OS X (even back at 10.3, maybe earlier), this was automatic; once you were done with all the window drawing, I believe you just called a function to tell the OS to update it, and it would all occur on the next frame without any visual glitching or half-drawn things appearing.