All times are UTC - 7 hours

PostPosted: Mon Dec 09, 2013 2:58 pm 
Joined: Mon Mar 27, 2006 5:23 pm
Posts: 1338
Hitting a snag here.

Origin::BG means a background/window pixel with the priority bit clear.
Origin::BGP means a background/window pixel with the priority bit set.

If I treat priority BG pixels as always being on top:

Code:
      unsigned ox = sx + tx;
      if(ox < 160) {
        //When LCDC.D0 (BG enable) is off, OB is always rendered above BG+Window
        if(status.bg_enable) {
          if(pixels[ox].origin == Pixel::Origin::BGP) continue;
          if(attr & 0x80) {
            if(pixels[ox].origin == Pixel::Origin::BG) {
              if(pixels[ox].palette > 0) continue;
            }
          }
        }
        pixels[ox].color = color;
        pixels[ox].palette = index;
        pixels[ox].origin = Pixel::Origin::OB;
      }


The result is the bridge overlapping on top of Shantae at the beginning of a new game, cutting off the sprite.

If I allow palette color 0 on a BG priority pixel to be transparent and allow a sprite on top of it:

Code:
      unsigned ox = sx + tx;
      if(ox < 160) {
        //When LCDC.D0 (BG enable) is off, OB is always rendered above BG+Window
        if(status.bg_enable) {
          if(attr & 0x80) {
            if(pixels[ox].origin == Pixel::Origin::BG || pixels[ox].origin == Pixel::Origin::BGP) {
              if(pixels[ox].palette > 0) continue;
            }
          }
        }
        pixels[ox].color = color;
        pixels[ox].palette = index;
        pixels[ox].origin = Pixel::Origin::OB;
      }


Then Link is visible on his horse before the window pans onto him in the intro of Legend of Zelda: Oracle of Ages.

I can't do both, so clearly I'm not understanding something here ...

Full CGB renderer source:

Code:
void PPU::cgb_render() {
  for(auto& pixel : pixels) {
    pixel.color = 0x7fff;
    pixel.palette = 0;
    pixel.origin = Pixel::Origin::None;
  }

  if(status.display_enable) {
    cgb_render_bg();
    if(status.window_display_enable) cgb_render_window();
    if(status.ob_enable) cgb_render_ob();
  }

  uint32* output = screen + status.ly * 160;
  for(unsigned n = 0; n < 160; n++) output[n] = video.palette[pixels[n].color];
  interface->lcdScanline();
}

//Attributes:
//0x80: 0 = OAM priority, 1 = BG priority
//0x40: vertical flip
//0x20: horizontal flip
//0x08: VRAM bank#
//0x07: palette#
void PPU::cgb_read_tile(bool select, unsigned x, unsigned y, unsigned& tile, unsigned& attr, unsigned& data) {
  unsigned tmaddr = 0x1800 + (select << 10);
  tmaddr += (((y >> 3) << 5) + (x >> 3)) & 0x03ff;

  tile = vram[0x0000 + tmaddr];
  attr = vram[0x2000 + tmaddr];

  unsigned tdaddr = attr & 0x08 ? 0x2000 : 0x0000;
  if(status.bg_tiledata_select == 0) {
    tdaddr += 0x1000 + ((int8)tile << 4);
  } else {
    tdaddr += 0x0000 + (tile << 4);
  }

  y &= 7;
  if(attr & 0x40) y ^= 7;
  tdaddr += y << 1;

  data  = vram[tdaddr++] << 0;
  data |= vram[tdaddr++] << 8;
  if(attr & 0x20) data = hflip(data);
}

void PPU::cgb_render_bg() {
  unsigned iy = (status.ly + status.scy) & 255;
  unsigned ix = status.scx, tx = ix & 7;

  unsigned tile, attr, data;
  cgb_read_tile(status.bg_tilemap_select, ix, iy, tile, attr, data);

  for(unsigned ox = 0; ox < 160; ox++) {
    unsigned index = ((data & (0x0080 >> tx)) ? 1 : 0)
                   | ((data & (0x8000 >> tx)) ? 2 : 0);
    unsigned palette = ((attr & 0x07) << 2) + index;
    unsigned color = 0;
    color |= bgpd[(palette << 1) + 0] << 0;
    color |= bgpd[(palette << 1) + 1] << 8;
    color &= 0x7fff;

    pixels[ox].color = color;
    pixels[ox].palette = index;
    pixels[ox].origin = (attr & 0x80 ? Pixel::Origin::BGP : Pixel::Origin::BG);

    ix = (ix + 1) & 255;
    tx = (tx + 1) & 7;
    if(tx == 0) cgb_read_tile(status.bg_tilemap_select, ix, iy, tile, attr, data);
  }
}

void PPU::cgb_render_window() {
  if(status.ly - status.wy >= 144u) return;
  if(status.wx >= 167u) return;
  unsigned iy = status.wyc++;
  unsigned ix = (7 - status.wx) & 255, tx = ix & 7;

  unsigned tile, attr, data;
  cgb_read_tile(status.window_tilemap_select, ix, iy, tile, attr, data);

  for(unsigned ox = 0; ox < 160; ox++) {
    unsigned index = ((data & (0x0080 >> tx)) ? 1 : 0)
                   | ((data & (0x8000 >> tx)) ? 2 : 0);
    unsigned palette = ((attr & 0x07) << 2) + index;
    unsigned color = 0;
    color |= bgpd[(palette << 1) + 0] << 0;
    color |= bgpd[(palette << 1) + 1] << 8;
    color &= 0x7fff;

    if(ox - (status.wx - 7) < 160u) {
      pixels[ox].color = color;
      pixels[ox].palette = index;
      pixels[ox].origin = (attr & 0x80 ? Pixel::Origin::BGP : Pixel::Origin::BG);
    }

    ix = (ix + 1) & 255;
    tx = (tx + 1) & 7;
    if(tx == 0) cgb_read_tile(status.window_tilemap_select, ix, iy, tile, attr, data);
  }
}

//Attributes:
//0x80: 0 = OBJ above BG, 1 = BG above OBJ
//0x40: vertical flip
//0x20: horizontal flip
//0x08: VRAM bank#
//0x07: palette#
void PPU::cgb_render_ob() {
  const unsigned Height = (status.ob_size == 0 ? 8 : 16);
  unsigned sprite[10], sprites = 0;

  //find first ten sprites on this scanline
  for(unsigned s = 0; s < 40; s++) {
    unsigned sy = oam[(s << 2) + 0] - 16;
    unsigned sx = oam[(s << 2) + 1] -  8;

    sy = status.ly - sy;
    if(sy >= Height) continue;

    sprite[sprites++] = s;
    if(sprites == 10) break;
  }

  //render backwards, so that first sprite has highest priority
  for(signed s = sprites - 1; s >= 0; s--) {
    unsigned n = sprite[s] << 2;
    unsigned sy = oam[n + 0] - 16;
    unsigned sx = oam[n + 1] -  8;
    unsigned tile = oam[n + 2] & ~status.ob_size;
    unsigned attr = oam[n + 3];

    sy = status.ly - sy;
    if(sy >= Height) continue;
    if(attr & 0x40) sy ^= (Height - 1);

    unsigned tdaddr = (attr & 0x08 ? 0x2000 : 0x0000) + (tile << 4) + (sy << 1), data = 0;
    data |= vram[tdaddr++] << 0;
    data |= vram[tdaddr++] << 8;
    if(attr & 0x20) data = hflip(data);

    for(unsigned tx = 0; tx < 8; tx++) {
      unsigned index = ((data & (0x0080 >> tx)) ? 1 : 0)
                     | ((data & (0x8000 >> tx)) ? 2 : 0);
      if(index == 0) continue;

      unsigned palette = ((attr & 0x07) << 2) + index;
      unsigned color = 0;
      color |= obpd[(palette << 1) + 0] << 0;
      color |= obpd[(palette << 1) + 1] << 8;
      color &= 0x7fff;

      unsigned ox = sx + tx;
      if(ox < 160) {
        //When LCDC.D0 (BG enable) is off, OB is always rendered above BG+Window
        if(status.bg_enable) {
          if(attr & 0x80) {
            if(pixels[ox].origin == Pixel::Origin::BG || pixels[ox].origin == Pixel::Origin::BGP) {
              if(pixels[ox].palette > 0) continue;
            }
          }
        }
        pixels[ox].color = color;
        pixels[ox].palette = index;
        pixels[ox].origin = Pixel::Origin::OB;
      }
    }
  }
}


Last edited by byuu on Tue Dec 10, 2013 10:51 am, edited 1 time in total.

PostPosted: Mon Dec 09, 2013 11:18 pm

Joined: Sat Aug 28, 2010 9:01 am
Posts: 190
Image

Image

The reason that the bridge post is above Shantae's sprite is the priority bit in the BG attribute map, as you can see in BGB's excellent VRAM viewer. This also makes sense graphically, since she's walking between the posts. The priority bit is set for that tile, and LCDC.0 is set, so clearly those tiles (except for pixels using color 0) should overlap the sprite.

Unless you're saying that some other part of the bridge is overlapping her sprite when you change the behavior? Screenshot?

Image

Image

I don't see how this would create a contradiction with Zelda OoA. The maroon hills, in the BG, don't have their priority bits set, so the sprite is on top. The ground tiles, in the window, do have their priority bits set, and the ground is drawn with palettes 1 and 2, not 0, so the ground overlaps the sprite.

And while we're talking graphics... Considering the focus on cycle accuracy in the (B)SNES part of Higan, I'm surprised you went for a line-based renderer and not a pixel-based one. The program running on the GB CPU can still change various registers like SCX and SCY while a line is being drawn to the screen (i.e. during mode 3). At least one officially released game makes heavy use of that: Prehistorik Man uses it in its intro to draw text using palette changes, and probably in some places in the gameplay as well. Not to mention demoscene demos, which (ab)use this a lot, for example Mental Respirator and 20Y, which use it for things like wobbly image stretching and other special effects.
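
To illustrate why a mid-scanline write matters, here is a toy sketch (not higan's code) contrasting the two approaches. The names ScxWrite, renderLineBased and renderDotBased are made up for the example, and it assumes one output pixel per dot with no fetch latency, which real hardware does not guarantee.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical record of a CPU write to SCX that lands during mode 3.
struct ScxWrite {
  unsigned dot;    // output pixel at which the write takes effect (assumed)
  uint8_t  value;  // new SCX value
};

// Line-based: SCX is sampled once, before the line is drawn, so the write
// is ignored. Each entry is the BG column fetched for that output pixel.
inline std::array<uint8_t, 160> renderLineBased(uint8_t scx, ScxWrite) {
  std::array<uint8_t, 160> column{};
  for(unsigned ox = 0; ox < 160; ox++) column[ox] = uint8_t(scx + ox);
  return column;
}

// Dot-based: SCX is re-read for every pixel, so the write shows up mid-line.
inline std::array<uint8_t, 160> renderDotBased(uint8_t scx, ScxWrite w) {
  std::array<uint8_t, 160> column{};
  for(unsigned ox = 0; ox < 160; ox++) {
    if(ox == w.dot) scx = w.value;
    column[ox] = uint8_t(scx + ox);
  }
  return column;
}
```

With SCX starting at 0 and a write of 16 at dot 80, both renderers agree up to pixel 79, but only the dot-based one jumps to column 96 at pixel 80.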

Image Image Image Image

And another "while we're at it." I had to download v093r07 since Shantae didn't work in v093 and I saw this: "I'm posting a beta release of higan, in the hopes of getting some feedback on the new library system (pictured below; more info on the forums), and reports of any potential new regressions before an official release."

Regressions? You mean apart from the new "features", including not being able to import files as a command line argument (useful for dragging and dropping a ROM file over higan.exe or a shortcut to it) or the now absent ability to organize one's ROMs in folders? I do appreciate the focus on accuracy in Higan, but I think the GUI/library aspect of the program is too driven by ideology, even removing legitimate features that don't have to interfere with the purity philosophy of the library. And just to state the obvious, you know my opinion on cross-loading GB and GBC files.


Last edited by nitro2k01 on Tue Dec 10, 2013 12:30 am, edited 3 times in total.

PostPosted: Tue Dec 10, 2013 12:10 am

Joined: Sat Aug 28, 2010 9:01 am
Posts: 190
Ok, looking at the code, I think I see what the problem is.
Snippet #1 doesn't check the palette index of the BG pixel, so BG priority pixels overlap sprite pixels unconditionally.
Snippet #2 only runs the palette check on a BG priority pixel when the sprite's own priority bit (attr & 0x80) is set as well, so *both* bits are required.

I believe the correct solution is that the BG pixel is considered for the palette check if either of two criteria is met:
1) The BG pixel has BG priority (origin BGP).
2) The sprite has its priority bit set (attr & 0x80). This also implies an origin test for BG, so OB pixels aren't incorrectly handled.

This gives the following code:
Code:
      unsigned ox = sx + tx;
      if(ox < 160) {
        //When LCDC.D0 (BG enable) is off, OB is always rendered above BG+Window
        if(status.bg_enable) {
          if(pixels[ox].origin == Pixel::Origin::BGP || (attr & 0x80 && pixels[ox].origin == Pixel::Origin::BG)) {
            if(pixels[ox].palette > 0) continue;
          }
        }
        pixels[ox].color = color;
        pixels[ox].palette = index;
        pixels[ox].origin = Pixel::Origin::OB;
      }

Perhaps pixels[ox].origin == Pixel::Origin::BG should be replaced with pixels[ox].origin != Pixel::Origin::OB to be more "correct" but assuming there are only three sources (BG, BGP and OB) the two comparisons would be logically equivalent because of the left side of the ||.


PostPosted: Tue Dec 10, 2013 10:58 am

Joined: Mon Mar 27, 2006 5:23 pm
Posts: 1338
Exophase helped out a lot with a priority list: BG0 < OBJL < BGL < OBJH < BGH
I was wrongly using: BG0 < OBJL < BGL | BGH < OBJH

You pretty much had it right as well, thank you.
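
As a rough sketch, the priority list can be turned into a rank comparison. The reading of the five names in the comments is an interpretation, not something spelled out in the thread, and the function names are invented for the example.

```cpp
#include <cassert>

// Layer ranks from the list above; a higher rank wins the pixel.
// Interpretation (an assumption):
//   BG0  = BG/window color 0
//   OBJL = sprite with attribute bit 0x80 set (wants to sit behind BG)
//   BGL  = nonzero BG/window color, tile priority bit clear
//   OBJH = sprite with attribute bit 0x80 clear
//   BGH  = nonzero BG/window color, tile priority bit set
enum Rank { BG0, OBJL, BGL, OBJH, BGH };

inline Rank bgRank(unsigned bgIndex, bool bgTilePriority) {
  if(bgIndex == 0) return BG0;
  return bgTilePriority ? BGH : BGL;
}

inline Rank objRank(bool objBehindBg) {
  return objBehindBg ? OBJL : OBJH;
}

// True when an opaque sprite pixel should replace the BG/window pixel.
// With LCDC.0 clear, the code in this thread always lets the sprite win.
inline bool objWins(bool bgEnable, unsigned bgIndex, bool bgTilePriority, bool objBehindBg) {
  if(!bgEnable) return true;
  return objRank(objBehindBg) > bgRank(bgIndex, bgTilePriority);
}
```

Under this reading, the comparison gives the same answers as the combined condition in the corrected snippet: the sprite pixel is skipped only when the BG pixel is nonzero and either priority bit is set.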

> And while we're talking graphics... Considering the focus on cycle accuracy in the (B)SNES part of Higan, I'm surprised you went for a line based renderer and not a pixel based one.

Well, the Game Boy is the least documented system I've ever worked on, by a full order of magnitude. I was afraid to implement a dot-based renderer without knowledge of when various memory values and registers are fetched.

But yeah, since I'm working on the PPU, I may as well do it right. We can adjust the read timings later. So okay, I rewrote both the DMG and CGB renderers to be dot-based.

Image

> You mean apart from the new "features", including not being able to import files as a command line argument (useful for dragging and dropping a ROM file over higan.exe or a shortcut to it) or the now absent ability to organize one's ROMs in folders? ... And just to state the obvious, you know my opinion on cross-loading GB and GBC files.

I don't mean any offense, and I greatly appreciate your help here, but I've no interest in discussing these topics any more than I already have.

The GUI is ethos, the core is higan. If you write a GUI for higan, you can support any methodology you like. My emulation cores also eventually appear in RetroArch, OpenEmu, Mednafen, etc.


PostPosted: Tue Dec 10, 2013 12:38 pm

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19094
Location: NE Indiana, USA (NTSC)
byuu wrote:
The GUI is ethos, the core is higan.

Then let me get this straight with a compiler analogy:

Role | byuu's project | Compiler analogy
GUI | ethos | Code::Blocks
Set of cores | higan | GCC (GNU Compiler Collection)
Individual core | bsnes | gcc (GNU C Compiler)

Do I have the analogous terminology right?


PostPosted: Tue Dec 10, 2013 3:29 pm

Joined: Mon Mar 27, 2006 5:23 pm
Posts: 1338
Correct.

The main point of confusion is that the official binary release is called by the emulator name instead of the GUI name. And that's because I've had about 10 GUIs now.

Even more fun is that I refuse to be inconsistent in my emulator naming, so my GB/C emulator is also called bgb. But it's not like anyone sees the individual emulator names anymore, it's easier to just call it all higan. I am sure beware hates me even more for that though :P


PostPosted: Wed Dec 11, 2013 9:32 am

Joined: Sat Aug 28, 2010 9:01 am
Posts: 190
byuu wrote:
> And while we're talking graphics... Considering the focus on cycle accuracy in the (B)SNES part of Higan, I'm surprised you went for a line based renderer and not a pixel based one.

Well, the Game Boy is the least documented system I've ever worked on, by a full order of magnitude. I was afraid to implement a dot-based renderer without knowledge of when various memory values and registers are fetched.

But yeah, since I'm working on the PPU, I may as well do it right. We can adjust the read timings later. So okay, I rewrote both the DMG and CGB renderers to be dot-based.

Image
Fair enough. I understand your fear. But I thought you would start from the bottom up, so to speak. Time allowing, I'd be willing to test stuff and help you, maybe write test ROMs if needed. I'll first direct you to a dycp test by beware:

http://akane.bircd.org/dycptest2.gb

Image

Bottom = reference. Top = drawn with dycp (palette changes). The reference image is valid on DMG, but not on GBC, where it is off by one. I suspect that GBC takes one extra (machine) cycle to look up the RGB value for the palette, or something like that. Of course, there are a couple of different variables that come into play.

And yes, so much left to document, still.

Another thing I plan to do at some point is to make a bus capture device. I think that's a relatively long way into the future, but I think it would definitely produce useful information.
byuu wrote:
> You mean apart from the new "features", including not being able to import files as a command line argument (useful for dragging and dropping a ROM file over higan.exe or a shortcut to it) or the now absent ability to organize one's ROMs in folders? ... And just to state the obvious, you know my opinion on cross-loading GB and GBC files.

I don't mean any offense, and I greatly appreciate your help here, but I've no interest in discussing these topics any more than I already have.

The GUI is ethos, the core is higan. If you write a GUI for higan, you can support any methodology you like. My emulation cores also eventually appear in RetroArch, OpenEmu, Mednafen, etc.
I'm sorry about this. To give you some background, I didn't bring these things up here simply to be flippant. It came out of actual frustration (though admittedly microfrustration) while investigating the problems you asked about.

Here are the events that led up to that paragraph: Firstly, I organize the ROMs in Higan in folders, one folder for official games, and one for test/development ROMs, which are located in the GB and GBC folders in the library. So I started by importing the Shantae ROM into v093. It didn't work, so I downloaded v093r07. At that point I moved it to the "official games" folder by habit. I started v093r07 and noticed that my library was empty. (Since I had placed everything in subfolders.) I assumed the library had been moved to a different location or something, so I dragged the ROM file onto the window to reimport it. Didn't work. (I didn't think of trying to drag and drop the purified version at that point.) I quit Higan and tried to drop the file on the .exe. Also didn't work. When I went to import the game from the import tab, I remember thinking that it was silly. Something that should be simple took so many extra clicks to get done.

That friction, together with being easily irritated from sleep deprivation and the fact that you asked for feedback, made me think it was a good idea to complain about it. If one of those three things hadn't been true, I probably wouldn't have mentioned it. I apologize for the rash way of delivering the message. I should have put it differently and explained why, or maybe not have mentioned it at all.

I'll however still maintain that those two things I mentioned don't necessarily violate the ethos of Higan. I.e.:
1) Allow you to browse subfolders. Not free file system browsing, but allowing the user to browse folders they have created inside the different platforms' directories.
2) Drag and drop of non-pure ROM files. If you're concerned that people will abuse this to bypass the library, make those actions import the file without auto-starting it. Or show an error message. Anything is better than silently doing nothing here, imo.

If you still don't want to discuss it further, no reply to this is needed. I get it.
byuu wrote:
Correct.

The main point of confusion is that the official binary release is called by the emulator name instead of the GUI name. And that's because I've had about 10 GUIs now.

Even more fun is that I refuse to be inconsistent in my emulator naming, so my GB/C emulator is also called bgb. But it's not like anyone sees the individual emulator names anymore, it's easier to just call it all higan. I am sure beware hates me even more for that though :P
Thanks for clarifying this. I was wondering if the bsnes name was dead now. However, didn't the GB emulator use to be called bgameboy? I might just have made that up in my head to disambiguate it from beware's emulator. But if this was the case, why did you change the name? Not that it really matters, for the aforementioned reason.


PostPosted: Wed Dec 11, 2013 10:14 am

Joined: Mon Mar 27, 2006 5:23 pm
Posts: 1338
> Top = drawn with dycp (palette changes.)

Oh, fun. I made tests like that for the SFC. A lot harder there with things like a moving DRAM refresh locking the system every scanline, variable memory timings, fast-mode accesses, and penalty cycles abound.

But the best way I found to log this information on hardware: first write a routine that synchronizes you to the exact first cycle of a new frame when called. Then write a function that consumes exactly N cycles when called (N taken from a register parameter). Then write a loop that syncs to frame, then seeks ++N cycles, does some hardware write, then logs the results or displays them onscreen. Very easy to find exact points when various values are read and written.
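
The sweep itself would be written in assembly on the real machine, but its shape can be sketched against a simulated target. Everything below (ToyHardware, its 100-cycle frame, the single hidden latch point) is invented for illustration and models no actual console.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for real hardware: it latches `reg` at one hidden cycle per frame.
struct ToyHardware {
  static constexpr unsigned FrameCycles = 100;  // made-up frame length
  unsigned latchCycle;                          // the unknown being probed
  unsigned cycle = 0;
  uint8_t reg = 0, latched = 0;

  explicit ToyHardware(unsigned latchCycle) : latchCycle(latchCycle) {}

  // Step 1: synchronize to the first cycle of a new frame.
  void syncToFrame() { cycle = 0; reg = 0; latched = 0; }

  // Step 2: consume exactly n cycles.
  void burn(unsigned n) {
    while(n--) {
      if(cycle == latchCycle) latched = reg;
      cycle++;
    }
  }

  void finishFrame() { burn(FrameCycles - cycle); }
};

// Step 3: sweep the delay N, write after N cycles, and record whether the
// write was seen. Returns the last N at which it still took effect.
inline int lastEffectiveDelay(ToyHardware& hw) {
  int last = -1;
  for(unsigned n = 0; n < ToyHardware::FrameCycles; n++) {
    hw.syncToFrame();
    hw.burn(n);
    hw.reg = 1;          // the hardware write under test
    hw.finishFrame();
    if(hw.latched == 1) last = int(n);
  }
  return last;
}
```

The delay at which the write stops having an effect marks the cycle where the value is sampled, which is the point the procedure is meant to locate.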

Very, very time consuming, though.

As for your test ROM:

Image

Huh, looks like I made a pretty good guess as to when to start rendering the display. Only off by one or two pixels, but the effect still works.

> I was wondering if the bsnes name was dead now. However, didn't the GB emulator use to be called bgameboy.

Yes, but I added GBA emulation and bgameboyadvance looked ugly. Of course now you could say bnes/bsnes should be bfc/bsfc with my UI naming things by their official names. So yeah, I much prefer to just call it all higan now.

But it's much more like Mednafen where each emulator used has its own name, than like MESS. I'd love to emulate the N64 one day and complete the not-complete-crap (eg no VB) Nintendo cartridge generation, but CPUs aren't anywhere near fast enough for how I want to do it yet.

> Time allowing, I'd be willing to test stuff and help you, maybe write test ROMs if needed.

I would greatly appreciate that! In that case, I'll just go ahead and post my current challenge instead of making another thread.

In Donkey Kong Land, every ~30 frames you get a flickering effect:

Image Image

Happens in v093 official too, not an artifact of recent rendering changes.

Seems at the very bottom, the game sets SCX=0 for the little "Select Game" area at the bottom, and SCX=32 for the top of the screen.

Watching a log, it seems that sometimes the second write misses the entire frame, and hits the next frame before it happens alone, and that causes the flickering.

Downside is that I really don't know GB-Z80 at all, so tracelogs aren't all that helpful to me yet.

Code:
frame  //LY=0
20!  //what we read back on Y=64,X=0
write at 127,0=00  //LY=127, LX=0 (bottom window area) = <SCX value written>
write at 143,160=20  //LY=143,LX=160 (end of display for next frame window area)
frame
20!
write at 127,0=00
write at 143,160=20
frame
20!
write at 127,0=00
frame
0!
write at 143,160=20
frame
20!
write at 127,0=00
write at 143,160=20
frame
20!
write at 127,0=00
write at 143,160=20
frame
20!
write at 127,0=00
frame
0!
write at 143,160=20
frame
20!
write at 127,0=00
write at 143,160=20


Ever see that before?


PostPosted: Wed Dec 11, 2013 12:18 pm

Joined: Sat Aug 28, 2010 9:01 am
Posts: 190
This is one of those bugs that "could be anything". But I'm looking into it.


PostPosted: Wed Dec 11, 2013 1:32 pm

Joined: Mon Mar 27, 2006 5:23 pm
Posts: 1338
I think this is it.

Code:
void CPU::mmio_write(uint16 addr, uint8 data) {
  if(addr == 0xff46) {  //DMA
    for(unsigned n = 0x00; n <= 0x9f; n++) {
      bus.write(0xfe00 + n, bus.read((data << 8) + n));
    //add_clocks(4);
    }
    return;
  }
  ...
}


Without the time penalty for LCD OAM transfers, the problem goes away.

I know this technically runs in parallel while the DMG CPU only reads from HRAM.

Just implemented the transfer in parallel as an initial test, and the problem remains fixed. But man, that's a painful operation to do. I have to hook the OAM DMA transfer test in every clock tick of the CPU, and all of my opcode read/write functions need to also test for it and reject reads outside of HRAM.
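
A minimal sketch of that arrangement, assuming one byte is copied every 4 clocks (the rate implied by the commented-out add_clocks(4) above) and that only ff80-fffe stays readable, as described in this post. ToyBus and its members are hypothetical names, and the 0xff returned for rejected reads is a placeholder rather than known hardware behavior.

```cpp
#include <cassert>
#include <cstdint>

// OAM DMA advanced from the CPU's clock tick instead of all at once.
struct ToyBus {
  uint8_t mem[0x10000] = {};
  bool dmaActive = false;
  uint16_t dmaSource = 0;
  unsigned dmaIndex = 0, dmaClock = 0;

  // Equivalent of a write to ff46: latch the source page and start copying.
  void startDma(uint8_t page) {
    dmaActive = true;
    dmaSource = uint16_t(page << 8);
    dmaIndex = 0;
    dmaClock = 0;
  }

  // Called for every clock the CPU consumes.
  void tick() {
    if(!dmaActive) return;
    if(++dmaClock < 4) return;
    dmaClock = 0;
    mem[0xfe00 + dmaIndex] = mem[dmaSource + dmaIndex];
    if(++dmaIndex == 0xa0) dmaActive = false;
  }

  // CPU-side read: while DMA runs, only HRAM (ff80-fffe) is served.
  // Returning 0xff for rejected reads is a placeholder value.
  uint8_t cpuRead(uint16_t addr) const {
    if(dmaActive && !(addr >= 0xff80 && addr <= 0xfffe)) return 0xff;
    return mem[addr];
  }
};
```

The cost mentioned above is visible here: tick() has to run on every CPU clock and every read goes through the extra dmaActive test.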

By the way, do you know what happens if you read from 0000-ff7f or ffff during an OAM DMA transfer? Does it mirror around to HRAM, or does it return the MDR (open bus)?


PostPosted: Wed Dec 11, 2013 3:01 pm

Joined: Sat Aug 28, 2010 9:01 am
Posts: 190
Ugh! Yeah, that's all wrong. That behaves as if the CPU is halted while the OAM DMA is happening. In the end, it was really just a timing error from "somewhere else". Right now I have breakpoints all over the place in BGB (beware's emulator; I will personally call your emulator bgameboy in the future to disambiguate it.) in the DKL ROM to look for suspicious stuff that might break.

byuu wrote:
Just implemented the transfer in parallel as an initial test, and the problem remains fixed. But man, that's a painful operation to do. I have to hook the OAM DMA transfer test in every clock tick of the CPU, and all of my opcode read/write functions need to also test for it and reject reads outside of HRAM.
If it's any comfort, beware has decided against emulating the OAM inaccessibility accurately because of the effort it would take/inefficiency it would create. I think he even commented that if anyone would be crazy enough to implement it accurately, it would be you. :D

beware does support detecting illegal access during OAM DMA. The way he does this (iirc - I may be hazy on the details) is as follows:
If interrupts are disabled, and the OAM routine follows the standard routine verbatim (true for 99% of games, but not DKL incidentally), nothing special is done, since illegal access would be impossible anyway. Otherwise (and if the user has "break on bad OAM DMA" enabled in the exceptions), a special hidden access breakpoint (existing debug infrastructure) is set for reads/writes to 0000-feff. Said breakpoint expires based on a timer.

I once suggested a slightly clusterfuckish idea to beware, which he rejected. Have a 256 place lookup table each for reads and writes which contains a function pointer for each position. These tables handle address decoding for reads and writes respectively. The index is the high byte of the address. During OAM, replace the pointers in index 0-254 with a special handler. This would be a relatively cheap way (in terms of CPU use) to handle the issue. And you really get the rest of the address decoding "for free" with this method, even if it would be a rewrite. Or slightly different, let the read and write handlers be function pointers, which you replace with a filter function during OAM DMA.

I suspect this is a method you will reject as well.
byuu wrote:
By the way, do you know what happens if you read from 0000-ff7f or ffff during an OAM DMA transfer? Does it mirror around to HRAM, or does it return the MDR (open bus)?
Neither. All addresses in the FFxx range are still accessible. What happens outside of that differs with the Gameboy type. I haven't investigated this fully, but here's what I know from my tests:

On DMG, if you read illegal memory, you get back the currently transferred byte.

On GBC, things are more complicated. If you read from the same memory area as the OAM source, you get the currently transferred DMA byte, just like on DMG. (What these memory areas are, exactly, depends on the GBC's internal memory decoding and remains to be investigated, but at the very least I know that ROM and WRAM are different areas.)

But it gets even more hairy. When reading from a different memory area than the one the transfer source is in, there are still side effects in some cases. I've yet to figure out exactly how this happens, but it could be something like ANDing the lower 8 address bits with the DMA counter.

Gambatte does emulate the first aspect of the bug on GBC (getting a DMA byte when reading from the same memory area.) but not the other.

I have not investigated whether reading during OAM DMA corrupts the transfer. I have not investigated writes at all.

There is only one game, which is not an official Nintendo release, that relies on the inaccessibility being implemented to at least some capacity: Super Connard. If you're wondering, that translates to Super Asshole and it's only used for emulator detection to display the text "fuck you, emulator." As far as I know, Gambatte is the only emulator that runs this ROM uncracked. The author of the ROM does recommend that you crack the ROM as an exercise.


PostPosted: Wed Dec 11, 2013 4:19 pm

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19094
Location: NE Indiana, USA (NTSC)
Or going by the 3- or 4-character codes that appear on the cartridge: bhvc/bnes, bshvc/bsns, bdmg, bvue (complete crap), bnus (PCs may never be fast enough), bcgb, bagb, bntr (PCs may never be fast enough)


PostPosted: Wed Dec 11, 2013 4:21 pm

Joined: Mon Mar 27, 2006 5:23 pm
Posts: 1338
> I will personally call your emulator bgameboy in the future to disambiguate it.

Sure, that's perfectly acceptable and fine. He named his emulator first. You can also just say higan if you like.

> I think he even commented that if anyone would be crazy enough to implement it accurately, it would be you.

:D

Thing is, I probably won't get too insanely precise with my non-SNES cores. I hear the DMG LCD can actually halt its own clocking in order to catch up when it gets bogged down. I'm probably just going to target as close to 100% compatibility as I can get, along with emulating anything that I reasonably can.

(I really want to emulate GBA prefetching behavior, but there's zero documentation on it, and I don't have the ability to run my own GBA code on hardware. And ARM scares the hell out of me anyway.)

> Said breakpoint expires based on a timer.

Clever, since he has breakpoint stuff live in the official releases anyway.

> Or slightly different, let the read and write handlers be function pointers, which you replace with a filter function during OAM DMA.

The indirection would happen on every read/write, making it probably more demanding than a conditional check inside of them. Same goes for having a memory remapping system.

If you really want to absolutely avoid any impact on each individual read/write, your best method would be to overwrite the actual program code. You'd have to mark that page writable (VirtualProtect / mprotect) first, and then you could copy a smaller function that only does the HRAM read on top of the original op_read() function, and restore it when finished.

This is certainly abusing the language (you'd get the function size by subtracting the next function address from the target function address), but since bgb is Windows-only and closed-source anyway, as long as it works it doesn't really matter.

> I suspect this is a method you will reject as well.

I do something like this in my SNES emulator, where all memory is remappable. But the GB was simple enough that I don't think I bothered with a fancy memory system.

> If you're wondering, that translates to Super Asshole and it's only used for emulator detection to display the text "fuck you, emulator."

Heh. I've certainly dealt with emulator detections on the SNES. They pretty much use the least likely behavior to ever get emulated. d4s used the 12/16-step mul/div operations that run in parallel with the CPU for Breath of Fire II. Another person whose name escapes me detected the PRNG algorithm I used to initialize WRAM on power-on (it has a unique seed each run too, so it was pretty clever. I'll put a cryptographic RNG in there eventually.)

> Or going by the 3- or 4-character codes that appear on the cartridge

If they were consistent to all regions, I would. SNES = shvc/sns/snsp/skor/etc.


PostPosted: Wed Dec 11, 2013 5:22 pm

Joined: Sat Aug 28, 2010 9:01 am
Posts: 190
byuu wrote:
> Said breakpoint expires based on a timer.

Clever, since he has breakpoint stuff live in the official releases anyway.
Actually, he keeps them as separate as he can. Debug mode is pretty much turned on or off by a single conditional in the main loop. As long as debug mode is turned off, which it is in normal emulation mode, this check is not done/enabled at all. The purpose of BGB's DMA OAM detection is as a development aid, and has nothing to do with the emulation.

byuu wrote:
> Or slightly different, let the read and write handlers be function pointers, which you replace with a filter function during OAM DMA.

The indirection would happen on every read/write, making it probably more demanding than a conditional check inside of them. Same goes for having a memory remapping system.
You may or may not be right about that. If the CPU is smart about things, the table/function reference will stay in L2 cache, and the penalty isn't particularly hard.

Just to be clear about what I mean, it's something like this. Slight modification to the idea: the OAM handlers are in the same table, at an offset of 256.

Please excuse my rustiness in C(++), in case something below is syntactically wrong.

Code:
uint8 (*gbread[512]) (uint16 addr);

Table of 2*256 pointers. For example, indexes 0-63 would all contain a reference to the same handler for bank 0 reads. Entries 256-510 contain the OAM handlers.
Usage:
Code:
value = (*gbread[((addr>>8)&255) + oamoffset]) (addr);

Normally, oamoffset = 0 , and oamoffset = 256 during OAM.

It might be that this is more expensive than conditionals because of the memory access, but it can't be THAT bad considering you (presumably) already have a 256 place table of function references that you access frequently (the opcode decoder.)
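
Fleshed out into something compilable, the idea might look like the sketch below. ReadTable, readNormal and readBlocked are placeholder names, and both handlers return dummy values rather than real memory contents.

```cpp
#include <cassert>
#include <cstdint>

using ReadHandler = uint8_t (*)(uint16_t addr);

// Illustrative handlers; a real core would have one per memory region.
inline uint8_t readNormal(uint16_t addr) { return uint8_t(addr); }  // dummy data: low address byte
inline uint8_t readBlocked(uint16_t)     { return 0xff; }           // dummy value for filtered reads

struct ReadTable {
  ReadHandler gbread[512];
  unsigned oamoffset = 0;  // 0 normally, 256 while OAM DMA is running

  ReadTable() {
    for(unsigned page = 0; page < 256; page++) {
      gbread[page] = readNormal;
      // Pages 00-fe are filtered during DMA; page ff stays reachable.
      gbread[256 + page] = (page == 0xff) ? readNormal : readBlocked;
    }
  }

  uint8_t read(uint16_t addr) const {
    // The inner parentheses matter: + binds tighter than &.
    return gbread[((addr >> 8) & 255) + oamoffset](addr);
  }
};
```

Entering and leaving OAM DMA is then a single assignment to oamoffset, with no per-access branch.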
byuu wrote:
This is certainly abusing the language (you'd get the function size by subtracting the next function address from the target function address), but since bgb is Windows-only and closed-source anyway, as long as it works it doesn't really matter.
Note that he rejected my suggestion, and all code he's using is still kosher. Doing something like self-modifying code would very much go against his coding ideals, as far as I know them.


PostPosted: Wed Dec 11, 2013 8:00 pm

Joined: Mon Mar 27, 2006 5:23 pm
Posts: 1338
> The purpose of BGB's DMA OAM detection is as a development aid, and has nothing to do with the emulation.

Then I guess I don't understand what the performance issue is. I'd rather the emulation run at 2fps if it means faithfully recreating what users will see on real hardware under a development environment. But, to each their own.

> You may or may not be right about that. If the CPU is smart about things, the table/function reference will stay in L2 cache, and the penalty isn't particularly hard.

The same is true of branch prediction on a path that's almost always false. But yeah, in practice it's really not a huge deal, it's just never a good thing putting more code in your innermost, hottest code paths.

> value = (*gbread[(addr>>8)&255 + oamoffset]) (addr);

Yeah, I understand.

If you want to be horrified: bsnes has a table of 16,777,216 8-bit values. Each value is an ID to a vector of callback functions for read and write. So a memory access is: function[table[addr]](addr);

This allows my manifest files to map data at one-byte granularity. Which some things actually rely on.

Of course it murders cache and all of that, but oh well.
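
For readers who want to picture that layout, here is a small sketch of the lookup for reads only. It is not bsnes source: ToyMap, its map() helper and the "ID 0 means unmapped" convention are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Byte-granular memory map: one 8-bit handler ID per 24-bit address.
struct ToyMap {
  std::vector<uint8_t> table;                            // 16,777,216 IDs
  std::vector<std::function<uint8_t(uint32_t)>> reader;  // ID -> read callback

  ToyMap() : table(1u << 24, 0) {
    // ID 0: unmapped; returning 0x00 is a placeholder.
    reader.push_back([](uint32_t) -> uint8_t { return 0x00; });
  }

  // Map the inclusive range [lo, hi] to a callback; a range may be one byte.
  // Assumes hi <= 0xffffff and fewer than 256 mappings, since IDs are 8-bit.
  void map(uint32_t lo, uint32_t hi, std::function<uint8_t(uint32_t)> fn) {
    uint8_t id = uint8_t(reader.size());
    reader.push_back(std::move(fn));
    for(uint32_t addr = lo; addr <= hi; addr++) table[addr] = id;
  }

  // The access path from the post: function[table[addr]](addr).
  uint8_t read(uint32_t addr) const {
    addr &= 0xffffff;
    return reader[table[addr]](addr);
  }
};
```

Each access costs one lookup into a 16 MiB table followed by an indirect call, which is where the cache pressure comes from.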

> Doing something like self-modifying code would very much go against his coding ideals, as far as I know them.

And mine.

Typically one who wants speed will do whatever it takes, whereas one who wants accuracy will do whatever is necessary. So he seems to be somewhere in the middle.

