PC Engine specifics

Post by za909 » Sun Mar 10, 2019 4:08 am

Hi,

I have a growing interest in working with the PC Engine, and having read the most common documentation out there (MagicKit, MagicEngine, etc.) I have a few fairly specific questions that I could not find the answers to. I suspect that the system has not been torn to shreds as much as the NES, so there are still some unknown, very specific behaviors in the box. Hopefully I can learn some of those answers here.

- Transfer instructions:
+ What I'd need to know is how the transfer instructions are built up, especially since the TIMER or a VDC IRQ might fire during a transfer instruction. Is the IRQ delayed until the transfer finishes (similarly to how OAM DMA blocks IRQ on the NES) or is the transfer interrupted at some point? Perhaps between two iterations?

- VDC Horizontal Sync and draw length registers:
+ I have once read that since the programmer can decide the screen layout and the usage of VRAM in a "tug of war" fashion between tile data and nametable data, a 512x240 virtual screen size is perfectly normal operation. However, the draw length register can allow you to set the speed of the pixel clock and actually show a 512x240 image on the screen. Sprites have a 9th X coordinate bit that helps with sprite-related issues if you do that, but apparently the VRAM is not supposed to be able to react fast enough to reads if the pixel clock speed is above a certain threshold. Is this actually true, and are there any known long-term consequences for showing the full 512 pixels wide image on the screen?

- VDC data access
+ Is the bus shared between the VDC and the HuC6280 in a manner that is similar to the NES architecture? Meaning that accessing VRAM outside of VBlank scanlines is discouraged and can result in potentially corrupting / writing to an unknown VRAM address?

- Detecting the end of VBlank / extending VBlank
+ Is there a VDC flag that can help you detect when VBlank ends, so that your main code can stay in sync with the VDC? (Starting game logic when VBlank is ended) Otherwise, is it possible to force blanking to make sure your NMI handler never spills out of VBlank and corrupts VRAM data, without NES-style sprite ram decay?

Thank you!

Post by Gilbert » Sun Mar 10, 2019 6:50 pm

I'm not good at tech but:
za909 wrote:Hi,
- VDC Horizontal Sync and draw length registers:
+ I have once read that since the programmer can decide the screen layout and the usage of VRAM in a "tug of war" fashion between tile data and nametable data, a 512x240 virtual screen size is perfectly normal operation. However, the draw length register can allow you to set the speed of the pixel clock and actually show a 512x240 image on the screen. Sprites have a 9th X coordinate bit that helps with sprite-related issues if you do that, but apparently the VRAM is not supposed to be able to react fast enough to reads if the pixel clock speed is above a certain threshold. Is this actually true, and are there any known long-term consequences for showing the full 512 pixels wide image on the screen?
Just refer to this.
Basically at the hi-res (512 horizontal pixels) setting the VRAM would be overclocked under normal timing. To remedy this you need to set the VDC to 2-cycle mode, so that VRAM is fetched every 2 cycles instead of 1. There shouldn't be any drawback as under this mode the H-blank is still long enough to fetch all 16 sprites in a single line.
However, at the mid-res setting (often referred to as the "320" horizontal pixel mode), the official specs also advise setting the VDC to 2-cycle mode (though a number of games did not follow this), and the shorter H-blank (its length depends on how many pixels are actually set to display per line) would usually leave the system unable to display all 16 sprites on a single line.

Post by ccovell » Mon Mar 11, 2019 2:33 am

Hi, there.

The block transfer instructions clobber the values in the X and Y registers, something I haven't seen mentioned anywhere except the Japanese Develo magazine. Something to watch out for, if you have transfers that aren't a multiple of $100 bytes.
An interrupt occurs after a transfer is completed, delaying the interrupt. More details are in Elmer's posts on the forum that was linked.

The 512-pixel mode does not cause problems with displaying sprites, if you use the default VRAM reading procedure, again described in tons of detail above.

Read/Write access is shared between the CPU and VDC during the active screen, and so VRAM can be written to during the active display, very much unlike the NES, SNES, and Genesis. One thing to watch out for is if you use HSync interrupts, they can interrupt your manual VRAM write mid-word. Also, changing this HSync interrupt means resetting the RCR trigger to another scanline by writing to the VDC registers, so you have to restore the VRAM write register once you get out of that interrupt. There are annoying complications like this.
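
The register-restore problem described above can be sketched with a toy model. This is only an illustration in Python, not real PC Engine code: the class `ToyVDC`, the function `run`, and the variable `shadow_ar` are hypothetical names, the model works on whole words rather than the separate low and high bytes the hardware takes, and it assumes the VDC's register-select value cannot be read back, so the main code has to keep its own copy in RAM.

```python
VDC_MAWR = 0x00  # VRAM write address register
VDC_VWR = 0x02   # VRAM data write register
VDC_RCR = 0x06   # raster compare register


class ToyVDC:
    """Toy model: one register-select value decides where data writes land."""

    def __init__(self):
        self.ar = 0                    # currently selected register
        self.regs = {VDC_MAWR: 0}      # plain VDC registers
        self.vram = {}                 # VRAM contents, keyed by word address

    def select(self, reg):
        self.ar = reg

    def write(self, value):
        if self.ar == VDC_VWR:
            # Data written to VWR goes to VRAM at MAWR, then MAWR advances.
            addr = self.regs[VDC_MAWR]
            self.vram[addr] = value
            self.regs[VDC_MAWR] = addr + 1
        else:
            self.regs[self.ar] = value


def run(restore_ar):
    """Simulate a raster interrupt landing between 'select' and 'write'."""
    vdc = ToyVDC()
    shadow_ar = VDC_VWR        # main code records its selection in RAM
    vdc.select(VDC_VWR)

    # Interrupt handler: re-arm RCR for another scanline (value is arbitrary).
    vdc.select(VDC_RCR)
    vdc.write(0x0080)
    if restore_ar:
        vdc.select(shadow_ar)  # put back whatever the main code had selected

    vdc.write(0x1234)          # main code resumes its VRAM write
    return vdc


if __name__ == "__main__":
    print(run(restore_ar=True).vram)    # the word reaches VRAM
    print(run(restore_ar=False).regs)   # the word overwrites RCR instead
```

Without the restore, the main code's data word ends up in RCR and nothing reaches VRAM, which is the kind of complication the post warns about.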

I don't know how one can detect the END of VBlank, but I guess if you set an RCR interrupt to (for example) line $0136 (sixteen scanlines before the beginning of the active display) you could signal this to your code. But anyway, since you can access the VDC and VRAM inside the active display, there isn't a strong need to extend VBlank.

To get a taller or shorter screen (like only 192 scanlines in the active display as on the SMS and in Dragon's Curse) you can adjust the VDS and VSW registers.

Anyhow, I have a few PCE programming tutorial videos that get into the initial screen setup a bit; they might be useful: https://www.youtube.com/playlist?list=P ... zmyLo9wDzl

Post by tepples » Mon Mar 11, 2019 6:42 am

To make your game adjust for 4:3 and 16:9 screens, use the 5.37 MHz pixel clock on 4:3 and the 7.16 MHz pixel clock on 16:9. That way, you can make the visible background 256 or 336 pixels wide as the player chooses.
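
A quick arithmetic sketch of where those two clock figures and the 336-pixel width could come from. It assumes the usual NTSC master clock of 21.47727 MHz divided by 4, 3, or 2 for the three dot-clock settings, and it assumes the width is rounded down to a whole number of 8-pixel tiles; the names `MASTER_HZ`, `DIVIDERS`, `pixel_clock_hz`, and `equivalent_width` are made up for this example.

```python
MASTER_HZ = 21_477_270  # assumed NTSC master clock (6 x 3.579545 MHz)

# Assumed master-clock divider for each dot-clock setting.
DIVIDERS = {"low": 4, "mid": 3, "high": 2}


def pixel_clock_hz(mode):
    """Dot clock in Hz for the given setting."""
    return MASTER_HZ / DIVIDERS[mode]


def equivalent_width(width_low, mode, multiple=8):
    """Pixels that take as long to draw as `width_low` low-res pixels,
    rounded down to a multiple of the tile width."""
    exact = width_low * DIVIDERS["low"] / DIVIDERS[mode]
    return int(exact // multiple) * multiple


if __name__ == "__main__":
    for mode in ("low", "mid", "high"):
        print(mode,
              round(pixel_clock_hz(mode) / 1e6, 2), "MHz,",
              equivalent_width(256, mode), "pixels")
```

Under these assumptions, 256 low-res pixels cover the same stretch of the scanline as about 341 mid-res pixels, and 336 is the nearest tile-aligned width below that.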

Post by za909 » Thu Mar 14, 2019 12:44 am

Dear all, thank you for the responses, they are all extremely helpful, so I need to make sure to save these specifics for the future, especially the transfer instructions clobbering the X and Y registers, which I would've no doubt run into otherwise. I wouldn't be surprised if ST0, ST1, and ST2 clobber the A register. Also, the IRQs getting delayed might be a bit of a problem when playing pitched samples (like an orchestra hit), but the TIMER frequency seems to be pretty low and a single sample might get delayed every now and then, which shouldn't distort the sound too much.
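
To put rough numbers on the IRQ delay, here is a small estimate in Python. It relies on two figures that are not from this thread and should be treated as assumptions: the commonly cited cost of 17 + 6n CPU cycles for an n-byte block transfer, and a TIMER tick of 1024 CPU cycles at the 7.16 MHz CPU clock. The function names are hypothetical.

```python
CPU_HZ = 7_159_090      # assumed high-speed CPU clock (21.47727 MHz / 3)
TIMER_PRESCALE = 1024   # assumed CPU cycles per TIMER tick


def block_transfer_cycles(length):
    """Cycle cost of one block transfer under the assumed 17 + 6*n figure."""
    if not 1 <= length <= 0x10000:
        raise ValueError("length must be 1..65536 bytes")
    return 17 + 6 * length


def worst_case_irq_delay_us(length):
    """Longest an IRQ could wait if it arrives just as the transfer starts."""
    return block_transfer_cycles(length) * 1_000_000 / CPU_HZ


if __name__ == "__main__":
    for n in (32, 256, 0x2000):
        cycles = block_transfer_cycles(n)
        print(f"{n:5d} bytes: {cycles:6d} cycles, "
              f"{worst_case_irq_delay_us(n):8.1f} us, "
              f"{cycles / TIMER_PRESCALE:5.2f} timer ticks")
```

If those figures hold, short transfers delay an interrupt by well under one TIMER tick, while a single $2000-byte transfer could hold one off for dozens of ticks, so splitting large copies into smaller chunks would be the safer choice while sample playback is running.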

Additionally, I have started a discussion about the audio quirks before, but I have educated myself a bit more in that regard and now I unfortunately have even more uncertainties in my head! :oops:
I want to attempt stringing together multiple waveforms, possibly one or two for "pluck" effects, or 4-8 to synthesize the attack portion of an instrument for example (like a fake "FM synthesis sound") after which the sound sustains on a particular waveform. In most cases, 2-3 frames would pass between a waveform change, if not more.
The MagicEngine documentation mentions that the $0804 register allows you to simply turn off the waveform generation. Now this would let me upload my waveform with a TIN instruction in about 1000 cycles or so. The position read/write buffer of the waveform doesn't really matter because I am rewriting all 32 samples, and I in fact want to AVOID resetting the buffer position with the D/A bit, to reduce possible artifacts. The question is whether the channel retains its voltage level when I turn off the waveform generator, or it falls back to 0, producing a pop / interruption in the sound.
My plan B in the case it falls back to 0 would be to not turn off the waveform at all, but instead set the channel to the lowest possible frequency, and update the waveform then. Around 1748 CPU cycles pass between two waveform steps, and uploading the full waveform takes a few hundred less than that, but unfortunately I would possibly have to upload the waveform twice to ensure that the buffer pointer does not advance in the middle of it, creating a "missed" waveform step. I can live with the CPU cost of that though, and I would refrain from using TIMER or scanline IRQ based samples in that case.

Do you possibly have any tips on which case I should prepare for?

Post by ccovell » Thu Mar 14, 2019 7:20 am

People (myself included) have been experimenting with numerous ways to update the 32 bytes in each waveform buffer without generating an audible "pop" and no method works perfectly. The dead PCEfx forums have a few threads on this, e.g.:

https://www.pcenginefx.com/forums/index ... ic=22227.0
https://www.pcenginefx.com/forums/index ... ic=21779.0

Oh, and there are no worries about the STx instructions clobbering any registers.

Post by za909 » Mon Mar 18, 2019 2:31 pm

Thanks for the replies, I will see what I can get out of the audio knowing that there really isn't a good way to avoid that phase misalignment... (except for that 1 in 32 chance where it will line up perfectly with the wave phase).

I have since made a small fork of the open source ASM6 to support the added instruction layer of the HuC6280 on top of the 6502 instruction set. I have grown too attached to the conventions of ASM6 to be able to move on to something else when the target platform is so similar to the NES. I will make this fork public soon, because there's bound to be someone else out there that could find it useful, as lazily made as it is, considering I didn't bother with the block transfers and added them by using assembler macros.

Post by Pokun » Tue Mar 19, 2019 8:51 am

za909 wrote: I have since made a small fork of the open source ASM6 to support the added instruction layer of the HuC6280 on top of the 6502 instruction set. I have grown too attached to the conventions of ASM6 to be able to move on to something else when the target platform is so similar to the NES. I will make this fork public soon, because there's bound to be someone else out there that could find it useful, as lazily made as it is, considering I didn't bother with the block transfers and added them by using assembler macros.
You bet there is! I don't really mind PCEAS (specifically Elmer's updated fork) but I'd love to have HuC6280 support for ASM6. Maybe the fork can be merged with the 65816 fork and the asm6f fork collection some day (when things like block transfer instructions are fixed).

Post by freem » Tue Mar 19, 2019 9:39 pm

za909 wrote:I will make this fork public soon, because there's bound to be someone else out there that could find it useful, as lazily made as it is
This would be nice; I tried to do this a few years back but got hung up on some things.

As for merging it into asm6f, I wouldn't be against it, but a decent way to set the target would be needed. (Probably best discussed elsewhere when the time comes.)

Post by za909 » Wed Mar 20, 2019 2:57 am

freem wrote:This would be nice; I tried to do this a few years back but got hung up on some things.

As for merging it into asm6f, I wouldn't be against it, but a decent way to set the target would be needed. (Probably best discussed elsewhere when the time comes.)
That is basically why I said "soon", but didn't really specify it. The bulk of the changes have been made, but I still need to account for the fact that HuC6280 zero page starts at $2000, and that standard non-indexed indirect instructions are being mistaken for zero page. It is a wild ride because I am still familiarizing myself more with C in the process, and also the lack of comments and some variable naming choices are not exactly helping... but I'll crack the code given enough time.

Post by Pokun » Fri Mar 22, 2019 9:24 am

A directive that lets the user manually set the zero/direct page location would solve that. As it is required for the relocatable direct page on the 65816, asm16 already has that I think (or maybe not? I can't see it being mentioned). And yeah, a directive for choosing the processor type, like what ca65 has, would be needed. Among other things, asm6f already uses the unofficial mnemonic SAX for an illegal opcode, while SAX is a different, legal opcode on the HuC6280.
Last edited by Pokun on Tue Mar 26, 2019 5:42 am, edited 1 time in total.

Post by za909 » Tue Mar 26, 2019 12:41 am

I have since fixed most of the issues and added a bit of code that forces instructions whose operand falls within the PCE's zero page to be looked up again and have their operand truncated to its low 8 bits before being emitted to the output file. I'll have to put a bit more time into it though, because it looks like the for loop responsible for evaluating the opcode and the operand is very much dependent on the order of the array items for each mnemonic. When looking for the zero page opcode, the search might stop at zero page indexed by X if that opcode comes first in the list of array items, since it also fits the same initial search criteria.
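
The lookup problem can be illustrated with a small Python sketch. This is not ASM6's actual code: the table `LDA_MODES` and the function `encode_lda` are hypothetical, and the table order deliberately puts "zp,x" ahead of "zp" to mimic the case described above. The idea is to decide the addressing mode first and then match on the mode name, instead of stopping at the first table entry whose operand size fits.

```python
ZP_BASE = 0x2000  # HuC6280 zero page as the CPU sees it ($2000-$20FF)

# Hypothetical per-mnemonic table: (mode name, opcode, operand size in bytes).
# "zp,x" is listed before "zp" on purpose.
LDA_MODES = [
    ("zp,x", 0xB5, 1),
    ("zp", 0xA5, 1),
    ("abs,x", 0xBD, 2),
    ("abs", 0xAD, 2),
]


def encode_lda(operand, indexed_x=False):
    """Encode LDA, using the short form when the operand lies in zero page."""
    if not 0 <= operand <= 0xFFFF:
        raise ValueError("operand must fit in 16 bits")

    in_zp = ZP_BASE <= operand <= ZP_BASE + 0xFF
    if in_zp:
        wanted = "zp,x" if indexed_x else "zp"
    else:
        wanted = "abs,x" if indexed_x else "abs"

    for mode, opcode, size in LDA_MODES:
        if mode != wanted:
            continue              # an operand-size match alone is not enough
        if size == 1:
            # Truncate to the low 8 bits before emitting.
            return bytes([opcode, operand & 0xFF])
        return bytes([opcode, operand & 0xFF, operand >> 8])

    raise ValueError("no such addressing mode: " + wanted)


if __name__ == "__main__":
    print(encode_lda(0x2010).hex())        # short zero page form
    print(encode_lda(0x2010, True).hex())  # zero page indexed by X
    print(encode_lda(0x2200).hex())        # absolute form
```

Because the match is on the mode name, reordering the table entries does not change which opcode is emitted.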

Once that works I can at least say that I have implemented everything HuC6280-related in some way. Moving on to making it 65816-compatible doesn't sound like a difficult task from there.

Post by abridgewater » Sat Apr 06, 2019 11:05 pm

ccovell wrote:The block transfer instructions clobber the values in the X and Y registers, something I haven't seen mentioned anywhere except the Japanese Develo magazine. Something to watch out for, if you have transfers that aren't a multiple of $100 bytes.
An interrupt occurs after a transfer is completed, delaying the interrupt. More details are in Elmer's posts on the forum that was linked.
Ah, could you elaborate on this, please? All of the information that I've found, including the Develo "mook" (bookazine?) says that X, Y, and A are stashed on the stack during transfers (using three bytes, which may be significant if you're doing something clever with stack space and the S register), although some of the flowcharts imply that they are merely written twice rather than written and restored. And my current PC-Engine development setup absolutely depends on X being preserved across a two-byte TII transfer and some longer TIA transfers, none of which are a multiple of 256 bytes, so the assertion that at least X is clobbered doesn't make sense to me.

Post by ccovell » Mon Apr 08, 2019 8:45 pm

Okay, I whipped up a quick test.

Hmm, sorry for the misinformation, then. It looks like the Y, A, and X registers are pushed to the stack as part of the Txx instructions. I don't know exactly why anymore, but as part of my programming, using a Txx instruction with a data length of 6-7 bytes or so screwed up the registers, whereas transferring $2000 bytes or so didn't.

edit: maybe it was done when I transferred $FC00-$FFF5 in ROM to RAM mapped into $C000-$DFFF, before I had mapped in bank $F8 RAM into the $2000 range, or some other rare case.
Attachment: BlockTransferTest.gif (7.66 KiB)

Post by za909 » Wed Apr 10, 2019 1:45 am

It would make sense though, because there is probably some kind of internal "buffer register" to hold some of the operands, as there are 6 bytes that need to be kept track of. My guess is that the source, destination and length are likely stored in a combination of these buffers and X & Y, and then the copied data is transferred via A. Someone more familiar with the breakdown of instruction execution and value fetches on the 65xx family CPUs would be able to explain how this actually works. For instance, does the fixed "overhead" cycle count of the transfer instructions, which is there regardless of the number of iterations, suggest both a register push and a pull afterwards?
