Just to give my 2 cents on the Mesen part of things:
Mesen is more or less optimized to run on at least 3 different threads (emulation, frame decoding/filtering, rendering), so running it on any dual-core will result in sub-par performance - especially since I abuse spin locks for their low latency. Spin locks only work well so long as you actually have a free core to spin on without slowing down the thread you're waiting on.
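For anyone curious, a spin lock is basically a loop hammering an atomic flag instead of asking the OS to put the thread to sleep. Here's a minimal sketch (hypothetical - not Mesen's actual implementation, and the names are made up) of why it wants a free core:

#include <atomic>

//Minimal spin lock sketch (hypothetical, not Mesen's actual code).
//Acquire() busy-waits on an atomic flag instead of sleeping - the
//wakeup latency is excellent when the lock holder runs on another
//core, but on a dual-core the spinning thread can steal CPU time
//from the very thread it's waiting on.
class SpinLock
{
private:
	std::atomic_flag _lock = ATOMIC_FLAG_INIT;

public:
	void Acquire()
	{
		//Burns a full core until Release() is called on another thread
		while(_lock.test_and_set(std::memory_order_acquire)) {}
	}

	void Release()
	{
		_lock.clear(std::memory_order_release);
	}
};

On a quad-core, the spinning thread and the thread it's waiting on each get their own core, so the wait costs almost nothing; on a dual-core they end up competing for CPU time, which is where the sub-par performance comes from.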
On the upside, this design means Mesen can run HDNes' HD packs with very little FPS drop on a quad-core machine (e.g. Super Mario Bros goes from ~250fps to maybe ~190fps on my machine).
Also, a lot of features result in small performance losses - e.g. the debugger, cheats, unlimited sprites, support for HDNes' HD pack format, etc. I try to optimize where I can (mostly using VS' profiler), but I'm not going to start trying to optimize cache misses in an era where most low-end computers are already able to run Mesen at 2-3x normal speed. That made a lot of sense in 2005, but not so much in 2017 (stuff like Raspberry Pis aside).
And, this is a matter of taste of course (I'm sure some people might say the same about Mesen's code), but Nestopia's code can be very hard to follow. In particular, stuff like this drives me insane:
https://github.com/rdanbrook/nestopia/b ... .cpp#L1435
It might result in slightly faster code, but in my opinion it makes the code much harder to read.
This kind of thing is also why Nestopia's PPU code ends up at 3.4k lines, versus Mesen's 1k lines.
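To show the kind of trade-off I mean, here's a made-up illustration (hypothetical code, not taken from either emulator): dispatching each PPU fetch phase through a table of tiny handlers avoids a branch or two in the dispatch, but a plain switch expresses the same sequence in one readable spot:

#include <cstdint>
#include <cstdio>

//Hypothetical illustration - not actual code from Nestopia or Mesen.
//Style A: one tiny handler per fetch phase, dispatched through a
//member function pointer table. The dispatch is branch-free, but the
//logic ends up scattered across many near-identical functions.
class TableDrivenPpu
{
public:
	void Tick(uint32_t cycle)
	{
		(this->*_handlers[cycle & 0x03])();
	}

private:
	void FetchNametable() { std::printf("NT fetch\n"); }
	void FetchAttribute() { std::printf("AT fetch\n"); }
	void FetchPatternLow() { std::printf("BG low fetch\n"); }
	void FetchPatternHigh() { std::printf("BG high fetch\n"); }

	typedef void (TableDrivenPpu::*Handler)();
	static const Handler _handlers[4];
};

const TableDrivenPpu::Handler TableDrivenPpu::_handlers[4] = {
	&TableDrivenPpu::FetchNametable,
	&TableDrivenPpu::FetchAttribute,
	&TableDrivenPpu::FetchPatternLow,
	&TableDrivenPpu::FetchPatternHigh
};

//Style B: the same behavior as a plain switch - compilers typically
//turn this into a jump table anyway, and the whole fetch sequence is
//visible in one place.
class SwitchPpu
{
public:
	void Tick(uint32_t cycle)
	{
		switch(cycle & 0x03) {
			case 0: std::printf("NT fetch\n"); break;
			case 1: std::printf("AT fetch\n"); break;
			case 2: std::printf("BG low fetch\n"); break;
			case 3: std::printf("BG high fetch\n"); break;
		}
	}
};

Multiply that across a whole PPU and the line count difference adds up pretty quickly.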
P.S.: I'm not trying to hate on Nestopia or anything - it's a great emulator, and I've used it as a reference countless times!