PEA/PER/PEI instruction and stack relative addressing modes

Discussion of hardware and software development for Super NES and Super Famicom.

SusiKette
Posts: 125
Joined: Fri Mar 16, 2018 1:52 pm
Location: Finland

Post by SusiKette » Sun Oct 04, 2020 9:29 am

I got back into SNES assembly, and I guess the first thing to do is to clear up a few things I haven't been able to find answers to yet.

1. According to this site, the PEA/PER/PEI instructions push an effective address onto the stack. What is the point of these instructions, and what are some practical uses for them?

2. How do the 'Stack relative' and 'Stack relative indirect indexed' addressing modes work?

EDIT: Are hardware registers available in all data banks or only in a specific one? Bank 0, probably?
Avatar is pixel art of Noah Prime from Astral Chain

calima
Posts: 1238
Joined: Tue Oct 06, 2015 10:16 am

Post by calima » Sun Oct 04, 2020 9:42 am

PEA/PER/PEI: dynamically calculated jumps, or simply a quicker version of "lda; pha; lda; pha".
Stack addressing is extremely useful, C-like even. But instead of putting a long description here, I'll direct you to the book "Programming the 65816". A PDF is easy to find.
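
For question 2, a minimal sketch of both stack-relative modes may help. It assumes ca65 syntax with A and the index registers in 16-bit mode, and a data bank that contains `Table`. The names `Caller`, `AddTwo`, `SumTable`, and `Table` are made up for illustration and do not come from any real code base.

```asm
.p816
.a16
.i16

Caller:
  pea $0005            ; second argument
  pea $0003            ; first argument
  jsr AddTwo           ; A = $0008 on return
  plx                  ; drop the two arguments again
  plx                  ; (clobbers X, keeps the result in A)

  pea .loword(Table)   ; pointer argument
  jsr SumTable         ; A = $0030 on return
  plx                  ; drop the pointer
  rts

; Stack relative: "n,s" reads the byte(s) at S+n in bank 0.
; Right after the JSR the stack looks like this:
;   1,s and 2,s   return address
;   3,s and 4,s   first argument  ($0003)
;   5,s and 6,s   second argument ($0005)
AddTwo:
  lda 3,s
  clc
  adc 5,s
  rts

; Stack relative indirect indexed: "(n,s),y" fetches a 16-bit pointer
; from S+n, then reads from that address plus Y in the current data bank.
SumTable:
  ldy #$0000
  lda (3,s),y          ; Table[0]
  ldy #$0002
  clc
  adc (3,s),y          ; plus Table[1]
  rts

Table:
  .word $0010, $0020
```

The offsets start at 1 because S always points at the next free byte, so 0,s is never a pushed value.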

RainbowSprinklez
Posts: 4
Joined: Sun Oct 04, 2020 7:07 am
Location: United States

Post by RainbowSprinklez » Sun Oct 04, 2020 10:37 am

TL;DR: It's faster.

Imagine pushing 16 bits to the stack.
PEA $0000
is 5 cycles.

lda #$0000
pha
is 7 cycles (with A in 16-bit mode). Another benefit of PEA is that it leaves A alone.

lidnariq
Posts: 9843
Joined: Sun Apr 13, 2008 11:12 am
Location: Seattle

Post by lidnariq » Sun Oct 04, 2020 11:49 am

Unrolled loops of PEA and PEI were used by a homebrew port of Super Mario Brothers to the Apple IIgs...
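
The idea behind such unrolled loops, sometimes called stack blasting, is to point S at the end of a buffer and let each PEA store two bytes while moving the pointer for free. A rough sketch in ca65 syntax follows. `FillBuffer`, `Buffer`, and `SavedSP` are hypothetical labels; the buffer has to live in bank 0 because the stack always does, and no interrupt (including NMI on the SNES) may fire while S is redirected.

```asm
.p816
.a16
.i16

; Assumed: SavedSP is a 2-byte variable and Buffer is 8 bytes of
; bank 0 RAM, both reachable with the current data bank.
FillBuffer:
  php                  ; remember register sizes and the I flag
  sei                  ; mask IRQs (NMI has to be disabled separately)
  rep #$30             ; 16-bit A and X/Y
  tsc
  sta SavedSP          ; save the real stack pointer
  lda #Buffer+7        ; S = last byte of the buffer
  tcs
  pea $7777            ; writes Buffer+7 and Buffer+6
  pea $5555            ; writes Buffer+5 and Buffer+4
  pea $3333            ; writes Buffer+3 and Buffer+2
  pea $1111            ; writes Buffer+1 and Buffer+0
  lda SavedSP
  tcs                  ; restore the real stack pointer
  plp
  rts
```

The buffer fills from the top down because pushes decrement S. PEI works the same way but pushes a word read from the direct page instead of a constant.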

dougeff
Posts: 2772
Joined: Fri May 08, 2015 7:17 pm
Location: DIGDUG

Post by dougeff » Sun Oct 04, 2020 11:53 am

Also, PEA always pushes 2 bytes regardless of the size of any of the registers. If A is in 8-bit mode, you would have to REP first for PHA to push 2 bytes.

I believe PER is for relocatable code (like on an Apple IIgs computer), where the program could be located anywhere in RAM and still function. In that case, BRA and BRL are also used instead of JMP.
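
Both points can be sketched in a few lines of ca65 syntax. `Example`, `Return`, and `Subroutine` are placeholder labels, and the PER/BRL pair is just one common way to build a position-independent call, not something taken from a particular program.

```asm
.p816
.a8                    ; A is 8-bit here
.i16

Example:
  pea $1234            ; still pushes two bytes: $12, then $34
  pla                  ; 8-bit pull: A = $34
  pla                  ; A = $12

; Position-independent "JSR": PER pushes an address computed relative
; to the program counter, and BRL branches relative as well, so this
; keeps working wherever the code is loaded (within one bank).
  per Return-1         ; RTS adds 1 to the address it pulls
  brl Subroutine
Return:
  rts                  ; execution continues here after Subroutine

Subroutine:
  rts                  ; returns to Return
```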
nesdoug.com -- blog/tutorial on programming for the NES

SusiKette
Posts: 125
Joined: Fri Mar 16, 2018 1:52 pm
Location: Finland

Post by SusiKette » Sat Oct 10, 2020 12:01 pm

So, these things are used mostly for relocatable code then? I guess they mostly won't be necessary on SNES since the code is at a fixed location.

By the way, this is a bit off topic, but what do the "Main Screen" and "Sub-Screen" on the SNES actually mean? There seem to be some registers that point to these, but I haven't found any explanation on what they actually are.

NovaSquirrel
Posts: 421
Joined: Fri Feb 27, 2009 2:35 pm
Location: Fort Wayne, Indiana

Post by NovaSquirrel » Sat Oct 10, 2020 1:47 pm

SusiKette wrote (Sat Oct 10, 2020 12:01 pm):
> So, these things are used mostly for relocatable code then? I guess they mostly won't be necessary on SNES since the code is at a fixed location.
>
> By the way, this is a bit off topic, but what do the "Main Screen" and "Sub-Screen" on the SNES actually mean? There seem to be some registers that point to these, but I haven't found any explanation on what they actually are.

PEA is very helpful for pushing bank values for use with the PLB instruction.

I use this macro taken from lorom-template which presents it as two separate 8-bit values:

Code: Select all

;;
; Pushes two constant bytes in the order second, first
; to be pulled in the order first, second.
.macro ph2b first, second
.local first_, second_, arg
first_ = first
second_ = second
arg = (first_ & $FF) | ((second_ & $FF) << 8)
  pea arg
.endmacro
This means you can do stuff like this:

Code: Select all

  ph2b BankNum1, BankNum2
  plb
  ; Insert code that uses data in the first bank
  plb
  ; Insert code that uses data in the second bank

As for the main screen and sub screen: the main screen is the set of layers that is actually displayed, and the sub screen is the set of layers that gets added to or subtracted from the main screen if color math is enabled. The sub screen is also used for "Pseudo Hires" mode, in which the horizontal screen resolution is expanded to 512 pixels by alternating between the main screen and sub screen every half pixel.
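
To make the main/sub split concrete, here is a small sketch in ca65 syntax that blends BG2 into BG1 by addition. It assumes A is in 8-bit mode and the data bank points at a bank where the PPU registers are visible; the choice of layers is arbitrary and only for illustration.

```asm
.p816
.a8

  lda #%00000001
  sta $212C            ; TM: BG1 on the main screen
  lda #%00000010
  sta $212D            ; TS: BG2 on the sub screen
  lda #%00000010
  sta $2130            ; CGWSEL: color math uses the sub screen, not the fixed color
  lda #%00000001
  sta $2131            ; CGADSUB: add (bit 7 clear), no halving, applied to BG1
```

Setting bit 7 of $2131 would subtract instead of add, and bit 6 halves the result.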

lidnariq
Posts: 9843
Joined: Sun Apr 13, 2008 11:12 am
Location: Seattle

Post by lidnariq » Sat Oct 10, 2020 2:25 pm

SusiKette wrote (Sat Oct 10, 2020 12:01 pm):
> So, these things are used mostly for relocatable code then?

No, only PER.

PEI and PEA are useful any time you need to get a number onto the stack, whether a constant (PEA) or a variable (PEI).

Since they both always push 16 bits, regardless of the M and X bits, they can be used to set D (PEA / PLD) even when A is in 8-bit mode (without doing the stupid LDA #hh, XBA, LDA #ll, TAD; TAD is an alternate mnemonic for TCD). I'm certain there are lots of other little things like this too.
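
A short sketch of both instructions in ca65 syntax, with A kept in 8-bit mode throughout. The direct page offset $10 and the value $2100 are arbitrary examples.

```asm
.p816
.a8                    ; A stays 8-bit throughout
.i16

; PEI pushes the 16-bit value stored at a direct page location
; (assumes a 2-byte variable at direct page offset $10):
  pei ($10)            ; pushes the word at D+$10 and D+$11
  plx                  ; X = that word (X is 16-bit here)

; PEA pushes a constant, so D can be set without touching A:
  pea $2100
  pld                  ; direct page now starts at $2100

; The 8-bit-A alternative mentioned above, which clobbers A and B:
  lda #$21
  xba
  lda #$00
  tcd                  ; transfers the full 16-bit accumulator (B:A) to D
```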

strat
Posts: 375
Joined: Mon Apr 07, 2008 6:08 pm
Location: Missouri

Post by strat » Sat Oct 10, 2020 9:12 pm

Here's an example of pea from Zombies Ate My Neighbors.

Code: Select all

call nmi routines:
$80/83E0 A5 0C       LDA $0C    [$00:000C]   A:C349 X:002C Y:0100 P:eNvmxdIzc
$80/83E2 F0 33       BEQ $33    [$8417]      A:C349 X:002C Y:0100 P:eNvmxdIzc
$80/83E4 85 10       STA $10    [$00:0010]   A:C349 X:002C Y:0100 P:eNvmxdIzc
$80/83E6 A2 38 00    LDX #$0038              A:C349 X:002C Y:0100 P:eNvmxdIzc
$80/83E9 BD A0 12    LDA $12A0,x[$80:12CC]   A:C349 X:002C Y:0100 P:eNvmxdIzc
$80/83EC F0 23       BEQ $23    [$8411]      A:C349 X:002C Y:0100 P:eNvmxdIzc

; push return address on stack so routine loaded from $12A0,X can return here
$80/83EE 4B          PHK                     A:C349 X:002C Y:0100 P:eNvmxdIzc
$80/83EF F4 00 84    PEA $8400               A:C349 X:002C Y:0100 P:eNvmxdIzc

; push routine address queued in $12A0,X on stack
$80/83F2 E2 20       SEP #$20                A:C349 X:002C Y:0100 P:eNvmxdIzc
$80/83F4 BD A2 12    LDA $12A2,x[$80:12CE]   A:C349 X:002C Y:0100 P:eNvMxdIzc
$80/83F7 48          PHA                     A:C349 X:002C Y:0100 P:eNvMxdIzc
$80/83F8 C2 30       REP #$30                A:C349 X:002C Y:0100 P:eNvMxdIzc
$80/83FA BD A0 12    LDA $12A0,x[$80:12CC]   A:C349 X:002C Y:0100 P:eNvmxdIzc
$80/83FD 48          PHA                     A:C349 X:002C Y:0100 P:eNvmxdIzc
$80/83FE 86 12       STX $12    [$00:0012]   A:C349 X:002C Y:0100 P:eNvmxdIzc

; does not return from this routine!  It calls the routine pushed on the stack
$80/8400 6B          RTL                     A:C349 X:002C Y:0100 P:eNvmxdIzc

; routine called by RTL above will return here since $808400 was pushed on stack

; erase pointer to NMI routine that was just executed if carry clear
$80/8401 A6 12       LDX $12    [$00:0012]   A:C349 X:002C Y:0100 P:eNvmxdIzc
$80/8403 B0 0C       BCS $0C    [$8411]      A:C349 X:002C Y:0100 P:eNvmxdIzc
$80/8405 A9 00 00    LDA #$0000              A:C349 X:002C Y:0100 P:eNvmxdIzc
$80/8408 9D A0 12    STA $12A0,x[$80:12CC]   A:C349 X:002C Y:0100 P:eNvmxdIzc
$80/840B C6 0C       DEC $0C    [$00:000C]   A:C349 X:002C Y:0100 P:eNvmxdIzc
$80/840D C6 10       DEC $10    [$00:0010]   A:C349 X:002C Y:0100 P:eNvmxdIzc
$80/840F F0 06       BEQ $06    [$8417]      A:C349 X:002C Y:0100 P:eNvmxdIzc

; loop to do this again if X >= 0
$80/8411 CA          DEX                     A:C349 X:002C Y:0100 P:eNvmxdIzc
$80/8412 CA          DEX                     A:C349 X:002C Y:0100 P:eNvmxdIzc
$80/8413 CA          DEX                     A:C349 X:002C Y:0100 P:eNvmxdIzc
$80/8414 CA          DEX                     A:C349 X:002C Y:0100 P:eNvmxdIzc
$80/8415 10 D2       BPL $D2    [$83E9]      A:C349 X:002C Y:0100 P:eNvmxdIzc
$80/8417 60          RTS                     A:C349 X:002C Y:0100 P:eNvmxdIzc
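
The core trick in that trace, boiled down to a sketch in ca65 syntax. `CallQueued`, `Queue`, and `AfterCall` are my own names (`Queue` stands in for the table at $12A0, `AfterCall` for $8401), and the entry layout of a 16-bit address minus one followed by a bank byte is inferred from the trace, not from any documentation. The 65816 has no indirect JSL, which is presumably why the call is built by hand.

```asm
.p816
.a16
.i16

CallQueued:                  ; X = offset of a queue entry
  phk                        ; return bank (the bank this code runs in)
  pea .loword(AfterCall-1)   ; return address; RTL adds 1 after pulling
  sep #$20
.a8
  lda Queue+2,x              ; bank byte of the target routine
  pha
  rep #$20
.a16
  lda Queue,x                ; target address minus 1
  pha
  rtl                        ; pulls the target address and jumps to it
AfterCall:
  ; the target's own RTL ends up here
  rts
```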

creaothceann
Posts: 270
Joined: Mon Jan 23, 2006 7:47 am
Location: Germany

Post by creaothceann » Sun Oct 11, 2020 2:05 am

SusiKette wrote (Sat Oct 10, 2020 12:01 pm):
> what do the "Main Screen" and "Sub-Screen" on the SNES actually mean?

You could read anomie's docs or watch this channel for that.

In short, the SNES has a horizontal resolution of 512 pixels, it's just outputting the same color for every pixel pair (called "dot") in most cases, exceptions being the pseudo-hires bit $2133.3 = 1 and BG modes 5 and 6. Registers like $212C (mainscreen designation) and $212D (subscreen designation) apply only to one pixel of each dot. This is useful when 'color math with subscreen' is enabled, which adds/subtracts the subscreen pixel to/from the mainscreen pixel. By disabling a layer only on one screen, the result is that it appears translucent.

Fun facts: IIRC Nintendo says the hires mode has the subscreen shifted to the left by half a dot. I'd say it's more accurate to say that the non-hires modes have the output delayed by one pixel. (Would be interesting to see it on a real CRT TV / oscilloscope.) Also, from anomie: "The subscreen pixel is clipped (by windows) when the main-screen pixel to the LEFT is clipped, not when the one to the RIGHT is clipped as you'd expect. What happens with pixel column 0 is unknown."
My current setup:
Super Famicom ("2/1/3" SNS-CPU-GPM-02) → SCART → OSSC → StarTech USB3HDCAP → AmaRecTV 3.10

lidnariq
Posts: 9843
Joined: Sun Apr 13, 2008 11:12 am
Location: Seattle

Post by lidnariq » Sun Oct 11, 2020 12:44 pm

creaothceann wrote (Sun Oct 11, 2020 2:05 am):
> Fun facts: IIRC Nintendo says the hires mode has the subscreen shifted to the left by half a dot. I'd say it's more accurate to say that the non-hires modes have the output delayed by one pixel. (Would be interesting to see it on a real CRT TV / oscilloscope.) Also, from anomie: "The subscreen pixel is clipped (by windows) when the main-screen pixel to the LEFT is clipped, not when the one to the RIGHT is clipped as you'd expect. What happens with pixel column 0 is unknown."

How would I set up a test for this? I'm happy to convert directions into a ROM and then take photos, but I'm not sure I understand the specific registers and changes I'd need to show.

creaothceann
Posts: 270
Joined: Mon Jan 23, 2006 7:47 am
Location: Germany

Post by creaothceann » Mon Oct 12, 2020 12:26 am

These should be the registers to set:

Code: Select all

RESET:
; write %00000000 to $212C (TM      ) to disable backgrounds & sprites
; write %00000000 to $2121 (CGADD   )
; write %11111111 to $2122 (CGDATA  )
; write %11111111 to $2122 (CGDATA  ) to make color 0 (the entire screen) white
; write %00001111 to $2100 (INIDISP ) to enable rendering at full brightness
; write       112 to $4209 (VTIMEL  )
; write         0 to $420A (VTIMEH  )
; write %10100000 to $4200 (NMITIMEN) for an IRQ halfway down the screen

IRQ:  ; write %00001000 to $2133 (SETINI) to  enable pseudo-hires mode
NMI:  ; write %00000000 to $2133 (SETINI) to disable pseudo-hires mode
(to be added: ROM header, interrupt acknowledgements)

As a result, the lower half of the screen should be shifted to the left by 1 pixel if my theory is correct.
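
A possible translation of that list into ca65 syntax, as a sketch only. It is not the source of any actual test ROM: the labels are made up, the data bank is assumed to be $00, and the ROM header, vectors, and the usual full PPU initialization are left out. Note that CGDATA sits at $2122 and that bit 7 of INIDISP is forced blank, so it has to be clear for anything to be displayed.

```asm
.p816
.a8
.i16

Reset:
  sei
  clc
  xce                    ; switch to native mode
  sep #$20               ; 8-bit A
  rep #$10               ; 16-bit X/Y
  stz $212C              ; TM: no backgrounds or sprites on the main screen
  stz $2121              ; CGADD: select color 0
  lda #$FF
  sta $2122              ; CGDATA: low byte
  sta $2122              ; CGDATA: high byte, color 0 is now white
  lda #%00001111
  sta $2100              ; INIDISP: forced blank off, brightness 15
  lda #112
  sta $4209              ; VTIMEL
  stz $420A              ; VTIMEH
  lda #%10100000
  sta $4200              ; NMITIMEN: NMI on, V-count IRQ on
  cli                    ; let the IRQ through
Forever:
  bra Forever

IrqHandler:
  rep #$20
.a16
  pha                    ; save all 16 bits of A
  sep #$20
.a8
  lda $4211              ; TIMEUP: reading acknowledges the IRQ
  lda #%00001000
  sta $2133              ; SETINI: pseudo-hires on
  rep #$20
.a16
  pla
  rti

NmiHandler:
  rep #$20
.a16
  pha
  sep #$20
.a8
  lda $4210              ; RDNMI: reading acknowledges the NMI
  stz $2133              ; SETINI: pseudo-hires off
  rep #$20
.a16
  pla
  rti
```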

lidnariq
Posts: 9843
Joined: Sun Apr 13, 2008 11:12 am
Location: Seattle

Post by lidnariq » Tue Oct 13, 2020 7:19 pm

creaothceann wrote (Mon Oct 12, 2020 12:26 am):
> As a result the lower half of the screen should be shifted to the left for 1 pixel if my theory is correct.

Yup. One high-res pixel to the left.

The right edge also moves. Naively, I would have expected the horizontal blanking period to be slightly shorter to account for this difference; instead the active portion of scanline #141 is one high-res pixel shorter. I have to assume that the IRQ is in the middle of the display.

Measurements from my TDS1002:
Sync on scanlines #30-141: active field starts at 11.56µs
30-140: active field ends at 59.23µs
Sync on scanlines #142-253: active field starts at 11.48µs
141-253: active field ends at 59.14µs

My oscilloscope's Video sync detects each scanline at 250ns after the falling edge of composite sync starts.
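
As a sanity check on those numbers, assuming the NTSC master clock of about 21.477 MHz and four master clocks per normal dot (neither figure is stated in this thread), one high-res pixel lasts roughly 93 ns. That is consistent with the measured shifts of 80 to 90 ns, given that the readings are only resolved to 0.01 µs:

```latex
\begin{align*}
t_{\text{dot}} &= \frac{4}{21.477\ \text{MHz}} \approx 186.2\ \text{ns}, &
t_{\text{hires}} &= \tfrac{1}{2}\, t_{\text{dot}} \approx 93.1\ \text{ns} \\
\Delta t_{\text{start}} &= 11.56\ \mu\text{s} - 11.48\ \mu\text{s} = 0.08\ \mu\text{s}, &
\Delta t_{\text{end}} &= 59.23\ \mu\text{s} - 59.14\ \mu\text{s} = 0.09\ \mu\text{s}
\end{align*}
```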
Attachment: hires-early.7z (source and sfc included, 10.13 KiB)

creaothceann
Posts: 270
Joined: Mon Jan 23, 2006 7:47 am
Location: Germany

Post by creaothceann » Wed Oct 14, 2020 12:37 am

lidnariq wrote (Tue Oct 13, 2020 7:19 pm):
> The right edge also moves. Naively, I would have expected the horizontal blanking period to be slightly shorter to account for this difference; instead the active portion of scanline #141 is one high-res pixel shorter. I have to assume that the IRQ is in the middle of the display.

Thanks for testing!

Yeah, IRQs are fired by the 5A22, so the IRQ handler would execute somewhere in line 141. This also means that $2133.3 takes effect immediately, i.e. it's not cached at the start of a line/field/frame. The effect on graphics would probably be immediate, as it simply controls whether the subscreen pixels affect the output.
Attachment: hires_early.pdf (171.52 KiB)

turboxray
Posts: 120
Joined: Thu Oct 31, 2019 12:56 am

Post by turboxray » Wed Oct 14, 2020 7:47 am

lidnariq wrote (Sun Oct 04, 2020 11:49 am):
> Unrolled loops of PEA and PEI were used by a homebrew port of Super Mario Brothers to the Apple IIgs...

Color Computer games have used the 6809's 'U' register for the 'stack blasting' technique to draw graphics since the early 80s. Nice to see it implemented on the IIgs. The Mega Drive has a similar trick that abuses the user stack pointer register, but I think only a couple of demos use it. It wouldn't surprise me if the ST used it too for graphics.
