Current state of programmable logic

Discuss hardware-related topics, such as development cartridges, CopyNES, PowerPak, EPROMs, or whatever.

Moderators: B00daW, Moderators

lidnariq
Posts: 10277
Joined: Sun Apr 13, 2008 11:12 am
Location: Seattle

Re: Current state of programmable logic

Post by lidnariq » Thu May 09, 2019 10:59 am

Hijacking OAMDMA is guaranteed to safely handle any analog timing issues, because it's only running at the same data rate as the CPU, not at twice the CPU's rate.

I don't know of any useful repository of analog timing measurements. I've recorded a very short list of random things: PPU A12 vs MMC3 IRQ and M2 vs Mclk.

As far as digital bits:
The 2A03 runs off two synchronous biphase clocks: the standard 6502 50% φ1/φ2 for everything internal (visual2a03 node "clk0"), and M2, with a 15/24 duty cycle (revision E and newer), for everything external (visual2a03 node 11200). We've seen several bugs come from M2 being asserted before φ2 on write cycles.

We used to think that the original letterless 2A03 had an 18/24 duty cycle, but looking more closely at the die shots, it looks like it might be 17/24 instead.
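For concreteness, here is the arithmetic those duty-cycle figures imply (my own back-of-envelope sketch, not from the thread; it assumes the usual 21.477272 MHz NTSC master clock and 12 master clocks per CPU cycle, i.e. 24 half-periods of the master clock per M2 period):

```python
MASTER_HZ = 21_477_272           # NTSC master clock, 21.477272 MHz
HALF_PERIODS_PER_CPU_CYCLE = 24  # 12 master clocks per CPU cycle

cpu_period_ns = 1e9 * 12 / MASTER_HZ                    # ~558.7 ns
half_period_ns = cpu_period_ns / HALF_PERIODS_PER_CPU_CYCLE  # ~23.3 ns

# High/low times for the two duty cycles mentioned above:
for duty, label in [(15, "rev. E and later"), (17, "letterless, die-shot estimate")]:
    high_ns = duty * half_period_ns
    print(f"M2 {duty}/24 ({label}): high ~{high_ns:.1f} ns, "
          f"low ~{cpu_period_ns - high_ns:.1f} ns")
```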

The 2C02 runs off a single 50% biphase clock at the pixel clock, which I've personally taken to calling "left half dot" and "right half dot"; Visual2C02 calls it "pclk0" and "pclk1". Normal PPU fetch cadence during rendering is {ALE idle AssertRD doRead}.

I don't think we know anything beyond the divider ratios for the 2A07 (16 master clock cycles/instruction cycle) and 2C07 (5 master clock cycles/pixel).
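Those divider ratios do pin down the PAL clock rates; a quick sketch (my own arithmetic, assuming the usual 26.6017125 MHz PAL master clock):

```python
PAL_MASTER_HZ = 26_601_712.5  # PAL master clock, 26.6017125 MHz

cpu_hz = PAL_MASTER_HZ / 16   # 2A07: 16 master clocks per CPU cycle
dot_hz = PAL_MASTER_HZ / 5    # 2C07: 5 master clocks per pixel

print(f"2A07 CPU clock: {cpu_hz / 1e6:.6f} MHz")
print(f"2C07 dot clock: {dot_hz / 1e6:.6f} MHz")
```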

supercat
Posts: 161
Joined: Thu Apr 18, 2019 9:13 am

Re: Current state of programmable logic

Post by supercat » Thu May 09, 2019 12:07 pm

lidnariq wrote:Hijacking OAMDMA is guaranteed to safely handle any analog timing issues, because it's only running at the same data rate as the CPU, not at twice the CPU's rate.
I was under the impression that the weird controller-port interactions with DMA were caused by the fact that DMA cycles didn't use the same timing as normal CPU accesses.

There are byte-stuffing approaches that run at one byte of data per two CPU cycles (make opcode fetches get NOPs, and use the dummy operand fetch for data transfer using the CPU address), which would allow the low-order address bits to be used by the memory directly. I don't see any advantage of OAM-DMA over such approaches sufficient to justify the extra address latching that would be required to do anything useful during the second cycle. If one wanted to use OAM-DMA to feed data that would be meaningless to the OAM and then later send meaningful data, that would be possible, but one would lose the ability to extend the "useful" part of vblank by disabling background rendering and ensuring that no sprites are too high on the screen.
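As a rough throughput comparison of the two schemes (my own sketch, not from the thread; the 513-cycle OAM DMA figure is the commonly cited even-cycle case, and the byte-stuffing scheme assumes a 2-cycle NOP per transferred byte):

```python
def oam_dma_cycles():
    # OAM DMA always moves a full 256-byte page: one alignment/idle
    # cycle (two if triggered on an odd cycle) plus a read+write pair
    # per byte. Even-cycle case shown here.
    return 1 + 2 * 256

def byte_stuffed_cycles(n_bytes):
    # Byte-stuffing as described above: each forced 2-cycle NOP spends
    # one cycle on the opcode fetch and one on the dummy operand fetch
    # that actually carries the data byte.
    return 2 * n_bytes

print(oam_dma_cycles())          # 513 cycles for 256 bytes
print(byte_stuffed_cycles(256))  # 512 cycles for 256 bytes
```

Either way the raw rate is about one byte per two CPU cycles; the difference is in what addresses appear on the bus, not in speed.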

lidnariq
Posts: 10277
Joined: Sun Apr 13, 2008 11:12 am
Location: Seattle

Re: Current state of programmable logic

Post by lidnariq » Thu May 09, 2019 1:06 pm

supercat wrote:I was under the impression that the weird controller-port interactions with DMA were caused by the fact that DMA cycles didn't use the same timing as normal CPU accesses.
Only when DMA preempts normal execution, as for the DPCM fetch. OAMDMA is triggered directly by the CPU itself, so its timing is guaranteed safe. (It works very similarly to the 2600's WSYNC.)
If one wanted to use OAM-DMA to feed data that would be meaningless to the OAM and then later send meaningful data, that would be possible, but one would lose the ability to extend the "useful" part of vblank by disabling background rendering and ensuring that no sprites are too high on the screen.
Writing to $2004 after rendering has started seems to have only temporary (within those scanlines) effects, no changes to the contents of primary OAM DRAM.

supercat
Posts: 161
Joined: Thu Apr 18, 2019 9:13 am

Re: Current state of programmable logic

Post by supercat » Thu May 09, 2019 4:00 pm

lidnariq wrote:
If one wanted to use OAM-DMA to feed data that would be meaningless to the OAM and then later send meaningful data, that would be possible, but one would lose the ability to extend the "useful" part of vblank by disabling background rendering and ensuring that no sprites are too high on the screen.
Writing to $2004 after rendering has started seems to have only temporary (within those scanlines) effects, no changes to the contents of primary OAM DRAM.
Using OAM-DMA to fetch data that isn't meaningful to the OAM would corrupt the OAM if done at any time when the system isn't rendering the frame. If 1- or 2-cycle-per-byte CPU-based transfer methods don't work, OAM-DMA might be usable as a substitute, but relying upon the OAM to ignore transfers would seem hokey unless there was no other way to get good transfer speeds. If one wants a 20-line top border, having to wait until the end of vblank before starting a transfer would forfeit about half of one's potential update window.
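For a sense of scale, a back-of-envelope budget for that trade-off (my own sketch, using the common NTSC figures of 341 dots per scanline, 3 dots per CPU cycle, and 20 scanlines of vblank proper):

```python
CYCLES_PER_SCANLINE = 341 / 3  # ~113.67 CPU cycles per scanline
VBLANK_LINES = 20

# Update window if transfers must finish inside vblank proper:
vblank_cycles = VBLANK_LINES * CYCLES_PER_SCANLINE            # ~2273

# Update window if rendering stays disabled through a 20-line top border:
bordered_cycles = (VBLANK_LINES + 20) * CYCLES_PER_SCANLINE   # ~4547

print(f"vblank only: ~{vblank_cycles:.0f} CPU cycles")
print(f"vblank + 20 blanked border lines: ~{bordered_cycles:.0f} CPU cycles")
```

Restricting transfers to vblank proper would indeed give up roughly half of the window in the 20-line-border case.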

lidnariq
Posts: 10277
Joined: Sun Apr 13, 2008 11:12 am
Location: Seattle

Re: Current state of programmable logic

Post by lidnariq » Fri Feb 05, 2021 10:21 pm

Digging this old thread back up...
infiniteneslives wrote:
Tue Apr 10, 2018 8:12 am
Altera MAX V [are] a bit annoying for two reasons though. They all require 1.8v core supply, so you'll likely need both a 1.8v and 3.3v regulator on board. The smallest device 5M40 only comes in MBGA & EQFP (with 0.4mm pitch) packages. [...] These facts paired with requirement for level shifters kinda makes the 5M40 unreasonable for multi-chip discrete to MMC1 scale mappers.
(emphasis added by me)

I've been looking into them a bit more closely, and I realized that its 4V tolerance plays nicely with the NES's NMOS design. Pins that are driven exclusively by the 2A03 and 2C02 don't go above 4V in normal operation anyway (to wit: CPU A0-A14, M2, R/W; PPU A8-A13, /RD, /WR).

Furthermore, pins that are driven by 74LS-series parts have very little current-sourcing capability in this top volt of their output swing, so only minimal loading is needed to move PPU A0-A7 and /ROMSEL into a range that's safe for this CPLD. Measuring a Motorola 74LS02, I saw an output voltage of 4.3V at no load, 4.1V at 35µA, and 3.5V at 120µA.
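Interpolating those three measurements gives a rough idea of how much loading "minimal" means (my own sketch, not from the thread; it assumes the output is roughly linear between the measured points, which is only approximately true for an LS totem-pole output):

```python
# (load current in amps, measured output voltage) from the 74LS02 above
points = [(0e-6, 4.3), (35e-6, 4.1), (120e-6, 3.5)]

# Effective source impedance in the 35-120 uA region:
(i1, v1), (i2, v2) = points[1], points[2]
r_source = (v1 - v2) / (i2 - i1)          # ~7.1 kOhm

# Load current needed to pull the output down to 4.0 V, and the
# ground-referenced pulldown resistor that would sink it:
i_target = i1 + (v1 - 4.0) / r_source     # ~49 uA
r_pulldown = 4.0 / i_target               # ~81 kOhm

print(f"~{r_source / 1e3:.1f} kOhm source impedance, "
      f"~{r_pulldown / 1e3:.0f} kOhm pulldown to reach 4.0 V")
```

So on this model even a pulldown in the tens of kilohms (or the CPLD input's own leakage plus a weak resistor) would keep those LS-driven pins under 4V.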

However, the remaining pins - CPU D0-D7, PPU AD0-AD7, and PPU /A13 - are driven by CMOS parts (RAMs, the '368s, and in the NES-001 the 74HCU04) and those still require extra care. Furthermore, those top-loaders that use the BUxxx ASICs put /ROMSEL back into question.

But it looks better than needing down-translation for every signal.

(This post tickled by noticing closeout pricing for the 5M40ZE64C5N part on Digi-Key right now: 94¢ each, regardless of volume.)

infiniteneslives
Posts: 2102
Joined: Mon Apr 04, 2011 11:49 am
Location: WhereverIparkIt, USA

Re: Current state of programmable logic

Post by infiniteneslives » Tue Feb 16, 2021 2:19 pm

Another thing to consider is modern clones like the AVS. I'm only guessing, but I'd expect it's driving 5V CMOS levels on all signals?

In any event ~$1 is a pretty decent price.

FWIW, I've mostly moved on to Gowin LittleBee devices over the past year (GW1N, 1152 LUTs in TQFP-100) since Lattice announced EOL on the Mach-XO family. They also announced EOL on the LC4000V series; RIP, 5V-tolerant modern CPLDs. (IDK about the Atmel offerings; they were always too expensive to be of interest, IMO.)

The default GW1N does require a 1.2v regulator, but they have a Mach-XO2 pin-compatible version that only requires a 3.3v regulator. The 72 kbit of internal block RAM is a nice perk for use as PRG-RAM, provided the entire CPU bus is level-shifted.
If you're gonna play the Game Boy, you gotta learn to play it right. -Kenny Rogers

lidnariq
Posts: 10277
Joined: Sun Apr 13, 2008 11:12 am
Location: Seattle

Re: Current state of programmable logic

Post by lidnariq » Tue Feb 16, 2021 2:36 pm

infiniteneslives wrote:
Tue Feb 16, 2021 2:19 pm
FWIW I’ve mostly moved on to Gowin little bee devices over the past year (GW1N 1152LUTs in TQFP-100) [...] The default GW1N does require a 1.2v regulator,
That's basically the same as the iCE40HX1K. Any thoughts comparing the two?
IDK about atmel offerings they were always too expensive to be of interest IMO).
Yeah, the 5V ATF1502 got significantly more expensive over the past year. The bigger ones were too expensive to be worth looking into.

lidnariq
Posts: 10277
Joined: Sun Apr 13, 2008 11:12 am
Location: Seattle

Re: Current state of programmable logic

Post by lidnariq » Wed Feb 17, 2021 8:14 pm

infiniteneslives wrote:
Tue Feb 16, 2021 2:19 pm
They also announced EOL on LC4000V series, RIP 5v tolerant modern CPLDs
Source? I haven't been able to find any information about this on their web page. And they're still recommending the 4000V series as a replacement for some other more mature parts. Mouser only mentions EOL for the non-ROHS 48-pin TQFP, and DigiKey doesn't for the same model. And it looks like the LC4000Z series (which are 5V-tolerant but require two power supplies) are still ok?

infiniteneslives
Posts: 2102
Joined: Mon Apr 04, 2011 11:49 am
Location: WhereverIparkIt, USA

Re: Current state of programmable logic

Post by infiniteneslives » Thu Feb 18, 2021 1:41 pm

lidnariq wrote:
Wed Feb 17, 2021 8:14 pm
infiniteneslives wrote:
Tue Feb 16, 2021 2:19 pm
They also announced EOL on LC4000V series, RIP 5v tolerant modern CPLDs
Source? I haven't been able to find any information about this on their web page. And they're still recommending the 4000V series as a replacement for some other more mature parts. Mouser only mentions EOL for the non-ROHS 48-pin TQFP, and DigiKey doesn't for the same model. And it looks like the LC4000Z series (which are 5V-tolerant but require two power supplies) are still ok?
Well, I suppose I shouldn't say "officially announced EOL", but effectively that's what they've (secretly?) done. Here's all I was told by my Arrow sales rep last spring when discussing a purchase of LAMXO256C-3TN100E & LC4032V-75TN44C: "note: Lattice is raising prices on product families at least 10 years or older starting June 29, 2020. Expect price increases annually going forward."

Beyond that, leverclassic, which is the only available EDA tool for the LC4000V series, no longer has a free license. It costs a whopping $590 per year to spin a new fuse file for any old devices requiring leverclassic. I sent an email to an internal Lattice contact I've had over the years, griping about that move, which feels unnecessary especially considering they haven't issued an update for leverclassic in over a decade from what I can tell. I mentioned that I understood the move was presumably in response to all devices requiring it now being in EOL, and he didn't correct me in his brief reply, FWIW. To be fair, they did issue me a free license renewal in response to my complaint, and mentioned I could come back asking for a free renewal every year, too. I was also able to revive an old retired laptop that had an outdated license file; changing the date in Windows to within a year of the license date did allow leverclassic to boot & build fuse files.

So yeah, maybe Lattice hasn't officially announced EOL on the LC4000V & Mach-XO families, but it sure looks and smells like EOL. Price hikes are always the first step of EOL, and the license cost is just a nice kick in the rear on your way out the door.

lidnariq wrote:
Tue Feb 16, 2021 2:36 pm
infiniteneslives wrote:
Tue Feb 16, 2021 2:19 pm
FWIW I’ve mostly moved on to Gowin little bee devices over the past year (GW1N 1152LUTs in TQFP-100) [...] The default GW1N does require a 1.2v regulator,
That's basically the same as the iCE40HX1K. Any thoughts comparing the two?
Yeah I would agree they are pretty comparable. The only Lattice families I've used to date are LC4000V, Mach-XO, & Mach-XO2. I imagine working with the iCE40HX1K isn't much different than Mach-XO2.

Glossing over the datasheets here's what I see that's meaningful:
  • The iCE40HX1K doesn't appear to have an internal oscillator? And the QFP-100 package doesn't have a PLL? The GW1N has both of these. I question how the iCE40HX1K configures itself from flash if there is truly no internal oscillator; perhaps they're being lame and didn't provide it to the user? Even the Mach-XO family has an internal oscillator available to the user, albeit not well advertised. If there really is no available internal oscillator/PLL, that's a pretty big loss, since it requires an external clock source; have fun designing mapper logic without one. Asynchronous glitch-free logic is a massive pain. I see a high-speed clock (20MHz+) as a hard requirement, especially when it comes to combating /ROMSEL delay, PPU A12 filtering, and connecting synchronous internal block RAMs to the CPU/PPU.
  • Block RAMs: the iCE40HX1K has 16 blocks of 4kbit RAM, where the GW1N-1 has 4 blocks of 18kbit RAM. The extra RAM on the GW1N comes in the form of a 9th bit, so it's a bit hard to make use of, but it is there if useful somehow. One drawback to a larger number of smaller blocks on the iCE40HX1K is that I'm assuming you'll end up using some fabric logic for output multiplexers when targeting 8KByte PRG/CHR-RAM. The bigger GW1N block RAMs don't consume any fabric logic for an 8Kx8bit RAM, with each block servicing 2 bits per byte. 1 bit per byte on the iCE40HX1K would create 4KByte of RAM with 8 blocks, so a full 8KByte would have to multiplex between 2 of those 4KByte sections; that probably doesn't amount to a significant amount of logic. The smaller blocks are probably more versatile if you have different uses, something to consider anyway.
  • The iCE40HX1K doesn't appear to have any user flash available? The GW1N-1 has 96kbit of user flash. It's not the simplest to interface with; there are provided IP blocks to make it easier, but they consume quite a bit of logic. I have goals to create my own lightweight flash controller that would allow the 6502 to issue a few commands to the mapper, which would back up block RAM (PRG-RAM) into the FPGA user flash, as well as restore it. That would emulate battery backup for PRG-RAM without the costs/hardware or concerns that come with traditional battery backup. There are always PRG-ROM flash saves of course, but PRG-ROM flash saves are a bit more awkward since you can't execute from PRG-ROM while saving; the block RAM -> user flash approach wouldn't have that issue.
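A quick sanity check of the block-RAM accounting above (my own sketch; bit counts are taken from the post, and the GW1N figure ignores its 9th parity bit):

```python
def blocks_needed(block_bits, total_bytes=8192, width_bits=8):
    """Minimum number of RAM blocks to hold an 8 KByte x 8-bit memory."""
    total_bits = total_bytes * width_bits
    return -(-total_bits // block_bits)  # ceiling division

# iCE40HX1K: 4 kbit blocks -> needs all 16 of its blocks for 8Kx8
print(blocks_needed(4 * 1024))
# GW1N-1: 18 kbit blocks -> needs all 4 of its blocks (2 data bits/byte each)
print(blocks_needed(18 * 1024))
```

So a single 8KByte PRG-RAM or CHR-RAM fully consumes the block RAM on either part; the devices differ in how the bits are sliced, not in total usable capacity at this size.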
Having said all that, it's safe to say the GW1N-1 is intended by Gowin to be a direct competitor to the Mach-XO2 1200, since they offer a LQ100X package that's pin-compatible with the Mach-XO2 1200. Quite a bold, nearly unheard-of move in the programmable logic market, but if you can't decide between the GW1N & Mach-XO2, you can build your prototype with both and decide later, or potentially never commit. The Mach-XO2 does have a decent internal oscillator with a wide range of frequency settings, a PLL, and user flash. Beyond that, the XO2 has hardened I2C, SPI, & timer/counters, which is a nice bonus that the GW1N doesn't provide.

Ultimately for me it comes down to price, and stability of that price during the life of the design. I've been a supporter of Lattice devices for the past decade, since I first started using them. Admittedly there weren't many other options in this market segment, though, especially in the past ~5yrs. My biggest frustration with Lattice was that the stellar prices I would be quoted early on, while making design decisions, rarely proved valid/meaningful when it came time for me to order their devices during production. So in all my years of purchasing Lattice devices, I never really knew what I would end up paying. If I pushed hard enough and bought a large enough qty, I could typically at least get close to prices I had previously paid. It was a pain, but I didn't really have anywhere else to go, until Gowin entered the scene anyway. I had been tempted to try out Gowin since fall 2019, but it was the effective EOL of the Mach-XO devices in summer 2020 that pushed me to make the leap.

While I can't say everything with Gowin is perfect, the documentation is good [EDIT: "okay/good enough" is probably more fair in comparison to the big 3], but there are some items, especially with the EDA tool, that are confusing and not explained by any documentation. I ran into a few problems getting my first prototype flashed & configuring itself at boot; I expect I wouldn't have had this issue had I started with one of their dev boards, but hearing from some peers, I'm not the first to stumble in bringing up their first Gowin design. But I can't really complain, as they have some great FAEs & sales reps here in the US; they really have been great and are a big part of why I've had a good experience with Gowin as a company. They've been quick to resolve any problem/question I've run into. My first production purchase of parts was easy to make as quoted, and I'm in the process of making another purchase with no change in pricing.

I don't even really know the best way to compare pricing between Gowin & Lattice. Gowin has some parts listed on edgeelectronics, but none(?) of them are in stock ready for purchase as samples, so take online pricing there with a big grain of salt. If I were looking to make a design decision for Lattice based on pricing, I would probably base it on the prices listed on Digikey. I can't divulge too much, but I will say neither the Lattice pricing nor the experiences I've had to date make the Mach-XO2 or iCE40 devices attractive to me the way they once were, now that I've migrated to Gowin. This is also easy for me to say now that I've paid my dues to get on the other side of the learning curve with GW1N nuances and supply chain, YMMV...

Beyond that, it is nice to see some open-source toolchain support being created for Gowin devices. There are a number of different efforts/options depending on the exact device, but I'm excited about @pepijndevos' work with the GW1N. https://www.patreon.com/pepijndevos
If you're gonna play the Game Boy, you gotta learn to play it right. -Kenny Rogers
