Any interest in an open source GoodNES clone?


proxy
Posts: 68
Joined: Tue Mar 08, 2011 9:45 am

Post by proxy »

Also, something I've noticed is that ROMs often seem to have "junk" after the PRG/CHR data. GoodNES includes this junk as part of its hashing. I counted 3507 ROMs which are longer than the iNES header says they need to be.

Often this junk is a URL or sometimes the title of the game. It never plays a part in the functionality of the ROM and has no net effect. (Why didn't DiskDude put his tag at the end of the ROM instead of crapping all over the iNES headers?)

This has a couple of results:

* GoodNES can detect ROMs even when they have headers that don't make sense.

* You can have two different hashes for two ROMs which have literally identical PRG+CHR.

Obviously point 1 is good and point 2 is bad :-P.
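
To make point 2 concrete, here is a minimal Python sketch. SHA-1 and the toy byte strings are assumptions for illustration only; nothing here is meant to reflect which hash GoodNES actually uses.

```python
import hashlib

def hash_to_end(rom: bytes, header_size: int = 16) -> str:
    """Hash everything after the 16-byte iNES header, trailing junk included."""
    return hashlib.sha1(rom[header_size:]).hexdigest()

# Toy stand-ins: a 16-byte header followed by fake PRG+CHR data.
header = b"NES\x1a" + bytes(12)
payload = bytes(range(256)) * 4

clean = header + payload
tagged = header + payload + b"http://example.com"  # same PRG+CHR plus junk

print(hash_to_end(clean) == hash_to_end(tagged))  # False
```

The two files carry identical PRG+CHR bytes, yet a "to end of file" hash treats them as different ROMs.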

So first things first, I will add to my DB entries which represent the ROMs when no junk is at the end.

I am considering having a feature which would do the following:

First detect if a ROM matches when doing a "to end of file" hash.
Once I've done that, I now know the correct size of the ROM since my DB has the PRG/CHR sizes.
After that, if the file is longer than expected, I could have an option to create a second ROM which is the same but has no junk at the end.
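
The steps above could be sketched roughly like this in Python. The function name and its parameters are hypothetical; it assumes the PRG/CHR sizes come from a database match and that the image has a 16-byte header and no 512-byte trainer.

```python
def split_trailing_junk(rom: bytes, prg_size: int, chr_size: int,
                        header_size: int = 16) -> tuple[bytes, bytes]:
    """Return (trimmed_rom, junk) given the PRG/CHR sizes from the database."""
    expected = header_size + prg_size + chr_size
    if len(rom) < expected:
        raise ValueError("file is shorter than the database entry says")
    # Everything past the expected size is treated as junk.
    return rom[:expected], rom[expected:]

# Toy example: 32 KiB PRG, 8 KiB CHR, and a short tag appended at the end.
rom = bytes(16) + bytes(32 * 1024) + bytes(8 * 1024) + b"junk"
trimmed, junk = split_trailing_junk(rom, 32 * 1024, 8 * 1024)
print(len(trimmed), junk)
```

Writing `trimmed` to a second file would give the "junk free" copy while leaving the original untouched for collectors.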

I think this is a reasonable approach: it will appease the collectors, since they won't have to acquire new ROMs to keep their collection complete, and it gives those of us who just want accurate digital backups an option to have "junk"-free ROMs.

What do you guys think?
Last edited by proxy on Mon Apr 11, 2011 2:44 pm, edited 1 time in total.
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

proxy wrote:So I have a question with regard to iNES 2.0. Has the submapper stuff been hashed out, is there any "authority" out there which has designated them? Currently, the way that I imagine them is something like this:

Mapper 4, Sub 0 (MMC3/MMC6 Generic/Unknown)
Mapper 4, Sub 1 (TBROM)
Mapper 4, Sub 2 (TEROM)
Mapper 4, Sub 3 (TFROM)
Mapper 4, Sub 4 (TGROM)
etc...
If the board can be reliably guessed from the size of PRG ROM, CHR ROM, PRG RAM, and CHR RAM, it does not need a submapper. For example, all MMC1 boards with 512 KiB PRG ROM, 8 KiB PRG RAM, and 8 KiB CHR RAM behave the same way as SUROM.
Mapper 2, Sub 0 (UxROM use 8 bits for PRG swap, supports much larger games)
Mapper 2, Sub 1 (UNROM use 3 bits for PRG swap)
Mapper 2, Sub 2 (UOROM use 4 bits for PRG swap)
These can be reliably guessed from PRG size.
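
As a rough illustration of guessing from PRG size, here is a Python sketch. The thresholds are derived only from the bit counts quoted above (16 KiB banks, 3 bits for UNROM, 4 bits for UOROM); the function name and the "oversize" label are made up for this example.

```python
KIB = 1024

def guess_uxrom_board(prg_size: int) -> str:
    """Guess a mapper 2 board from PRG ROM size, assuming 16 KiB banks."""
    banks = prg_size // (16 * KIB)
    if banks <= 8:       # 3 bank-select bits cover up to 128 KiB
        return "UNROM"
    if banks <= 16:      # 4 bank-select bits cover up to 256 KiB
        return "UOROM"
    return "UxROM (oversize)"  # needs more than 4 bank-select bits

print(guess_uxrom_board(128 * KIB))
print(guess_uxrom_board(256 * KIB))
```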

As for junk at the end: You might want to put in detection for when the appended data looks like a zip file.
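
One way such a check might look in Python, as a sketch only: it tests for the ZIP local-file signature and otherwise falls back on the standard library's `zipfile.is_zipfile`. The function name is hypothetical.

```python
import io
import zipfile

def junk_looks_like_zip(junk: bytes) -> bool:
    """True if the appended bytes start with a ZIP local-file signature
    or otherwise parse as a ZIP archive."""
    if junk.startswith(b"PK\x03\x04"):
        return True
    return zipfile.is_zipfile(io.BytesIO(junk))

# Build a tiny archive in memory to stand in for appended data.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as archive:
    archive.writestr("readme.txt", "hello")

print(junk_looks_like_zip(buf.getvalue()))
print(junk_looks_like_zip(b"http://example.com"))
```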
Drag
Posts: 1615
Joined: Mon Sep 27, 2004 2:57 pm

Post by Drag »

Given that commercial roms don't "change" as frequently as homebrew roms, the open GoodNES utility should have its main focus on commercial, non-public domain roms. As such, I think it shouldn't include any homebrew roms, save for ones that may have actually been "dumped". :P

Handling homebrew roms that are subject to updates and such would require a different solution, such as checking against an internet database, because that would be the easiest way to propagate updates.
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

Drag wrote:Given that commercial roms don't "change" as frequently as homebrew roms, the open GoodNES utility should have its main focus on commercial, non-public domain roms.
So does that exclude any game that was first sold on a cart but later liberated, such as the NES version of Elite?
Handling homebrew roms that are subject to updates and such would require a different solution
I'm glad you're starting to understand FitzRoy's point.
such as checking against an internet database, because that would be the easiest way to propagate updates.
I agree as long as the database can be downloaded separately, because some people don't have the ROM collection and continuous Internet access on the same machine.
Dwedit
Posts: 4924
Joined: Fri Nov 19, 2004 7:35 pm

Post by Dwedit »

Since when did the author of Elite get the authority to claim copyright away from the publisher and liberate it? I thought it was still considered a pirated copy, despite the author releasing it.

Edit: Looks like the game itself attributes copyright to the developers, not the publishers. Then it's okay?
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!
Drag
Posts: 1615
Joined: Mon Sep 27, 2004 2:57 pm

Post by Drag »

tepples wrote:So does that exclude any game that was first sold on a cart but later liberated, such as the NES version of Elite?
What's the set going to include? Surely it'll have all dumped rom images (from regular ROMs or prototypes dumped from ((e)e)proms or flash roms or what have you). If the game was complete but unreleased due to the inability to find a publisher, then I think it would qualify for the set, because the game reached a point where it was ready to be frozen to a rom and not changed (in other words, it reached a "static" form).
such as checking against an internet database, because that would be the easiest way to propagate updates.
I agree as long as the database can be downloaded separately, because some people don't have the ROM collection and continuous Internet access on the same machine.
Fine with me, just as long as there's some way to easily update the database.
B00daW
Posts: 586
Joined: Thu Jan 03, 2008 1:48 pm

Post by B00daW »

I wouldn't mind seeing NESToy resurrected to use the BootGod/GoodNES/no-intro databases that have been fixed. NESToy also did a good job of keeping a compressed archive of iNES headers that could patch the ROM images. Without an iNES header the binaries are effectively useless. Because the GoodNES and no-intro databases exclude iNES headers or don't fix them, people end up collecting giant masses of ROMs, some of which are unplayable unless you add your own headers based on personal research and experimentation. Some people without knowledge of the iNES header format would consider the image to be broken, when in fact the utilities are lacking.

Again we're also on sketchy ground when it comes to Parodius and NESdev board terms. It seems that we allow linking to ROMs on occasion, especially if they are unreleased or of broad interest, but the database preparation and ROM collection efforts of the community to get a functional and accurate NES/Famicom game database once again would be traipsing on a gray area.

I believe as long as people attempt to not directly link copyrighted ROM images in the forum, but link using PMs or other personal communication means to exchange images for databasing purposes, release databases in the public forum, and let it be known without a public link that the image archive is hosted in a torrent on Underground Gamer, that the NESdev community would have a complete and accurate image archive for reverse engineering, hacking, and emulator development.
kevtris
Posts: 504
Joined: Sat Oct 29, 2005 2:09 am
Location: Indianapolis

Post by kevtris »

proxy wrote: So I have a question with regard to iNES 2.0. Has the submapper stuff been hashed out, is there any "authority" out there which has designated them? Currently, the way that I imagine them is something like this:
Yes and no. I have made a "definitive" set of ROMs with 2.0 headers, but it's really outdated. I have been waiting patiently for a new set of ROMs to make the current list but so far have not gotten them.

If there is some tool that can extract just the headers from ROMs, I could run my set of 100 or so 2.0 ROMs through it and produce such a list.

I have a document I haven't released yet which has all the submappers I used defined, also... I guess I should clean that up and release it. I don't anticipate it changing too terribly much in the future... Though the Vs. stuff might change a little (mainly the controller type byte. The PPU and other bytes are OK).
/* this is a comment */
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

kevtris wrote:If there is some tool that can extract just the headers from ROMs, I could run my set of 100 or so 2.0 ROMs through it and produce such a list.
man dd

Can you copy the first 16 bytes out of every file using this?
lidnariq
Posts: 11432
Joined: Sun Apr 13, 2008 11:12 am

Post by lidnariq »

kevtris wrote:If there is some tool that can extract just the headers from ROMs, I could run my set of 100 or so 2.0 ROMs through it and produce such a list.
If you're using linux or cygwin, "for i in *.nes; do echo -en "$i"-; hd -n 16 "$i"; done".
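
For anyone without a Unix shell handy, a roughly equivalent Python sketch follows. The function name is made up here; it simply reads the first 16 bytes of every `.nes` file in a directory and formats them as hex.

```python
from pathlib import Path

def dump_headers(directory: str) -> dict[str, str]:
    """Map each .nes file name to the hex of its first 16 bytes."""
    headers = {}
    for path in sorted(Path(directory).glob("*.nes")):
        with path.open("rb") as f:
            headers[path.name] = f.read(16).hex(" ")
    return headers

# Print one "name: header bytes" line per ROM in the current directory.
for name, hex_bytes in dump_headers(".").items():
    print(f"{name}: {hex_bytes}")
```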
proxy
Posts: 68
Joined: Tue Mar 08, 2011 9:45 am

Post by proxy »

Another update :-).

So, there is a bit of a flaw in the way that GoodNES handles UNIF dumps. Basically, as far as I can tell, GoodNES will always try two things.

First, skip the first 16 bytes and hash the rest of the file. If that doesn't match anything in the DB, then hash the whole file and see if that matches the DB.

So, as you can imagine, UNIF always falls into the second category. The problem is that any two roms can have different meta-data but have the same PRG/CHR.
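
The two-step lookup described above might be sketched like this in Python. This is a guess at the behaviour as observed from outside, not GoodNES's actual code; SHA-1 and the `db` dictionary (digest to name) are assumptions.

```python
import hashlib
from typing import Optional

def identify(data: bytes, db: dict[str, str]) -> Optional[str]:
    """Try the headerless hash first, then fall back to the whole file."""
    for candidate in (data[16:], data):
        digest = hashlib.sha1(candidate).hexdigest()
        if digest in db:
            return db[digest]
    return None

# Toy database with one headerless entry and one whole-file entry.
ines_rom = bytes(16) + b"prg+chr"
unif_rom = b"UNIF" + bytes(28) + b"chunks"
db = {
    hashlib.sha1(b"prg+chr").hexdigest(): "iNES dump",
    hashlib.sha1(unif_rom).hexdigest(): "UNIF dump",
}
print(identify(ines_rom, db), identify(unif_rom, db))
```

Because the UNIF entry is keyed on the whole file, changing any metadata chunk changes the digest even when PRG/CHR are untouched.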

Here's my proposed solution:

I load the UNIF file into memory and process the blocks. I hash every PRG[0-9A-F] block in numerical order. I then hash every CHR[0-9A-F] block in numerical order. Non existent blocks have no effect, but you can have discontinuous blocks like PRG0, PRG2, but no PRG1.

I am thinking of making the hash include the UNIF block header so that an image with PRG0 and PRG2 wouldn't match an image with PRG0 and PRG1 where PRG1/2 are the same code. I know this is a corner case, but it's worth addressing if I can.

Then I concatenate the results in the order I collected them and hash that. Now we have a unique hash for UNIF files which will correctly identify that two files are the "same" even though they may have different meta-data. This now opens the possibility of correcting bad meta-data such as the MAPR/MIRR/etc blocks, similarly to the fixnes feature. This is necessary since at least one UNIF image in the GoodNES collection has a bad header, making correct loading impossible.
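
A minimal Python sketch of this idea follows. It assumes the usual UNIF layout (a 32-byte file header, then chunks made of a 4-byte ID, a 4-byte little-endian length, and the data); the function names are hypothetical and this is not libunif's API.

```python
import hashlib
import struct

def iter_unif_chunks(data: bytes):
    """Yield (chunk_id, payload) pairs from a UNIF image."""
    if data[:4] != b"UNIF":
        raise ValueError("not a UNIF file")
    pos = 32  # assumed fixed-size file header
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4]
        (length,) = struct.unpack_from("<I", data, pos + 4)
        yield chunk_id, data[pos + 8:pos + 8 + length]
        pos += 8 + length

def unif_rom_hash(data: bytes) -> str:
    """SHA-1 over the SHA-1 digests of PRG0..PRGF, then CHR0..CHRF."""
    chunks = dict(iter_unif_chunks(data))
    outer = hashlib.sha1()
    for prefix in (b"PRG", b"CHR"):
        for digit in b"0123456789ABCDEF":
            payload = chunks.get(prefix + bytes([digit]))
            if payload is not None:  # missing blocks have no effect
                outer.update(hashlib.sha1(payload).digest())
    return outer.hexdigest()
```

Metadata chunks such as MAPR never reach the hash, so two files that differ only in metadata (or in chunk order) get the same result.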

Any thoughts? Anyone see a problem with this approach?
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

proxy wrote:Here's my proposed solution:

I load the UNIF file into memory and process the blocks. I hash every PRG[0-9A-F] block in numerical order. I then hash every CHR[0-9A-F] block in numerical order. Non existent blocks have no effect, but you can have discontinuous blocks like PRG0, PRG2, but no PRG1.
Yeah, that matches what one would do when translating UNIF into iNES. The hash would be defined as the hash of the conversion into iNES.
I am thinking of making the hash include the UNIF block header so that an image with PRG0 and PRG2 wouldn't match an image with PRG0 and PRG1 where PRG1/2 are the same code. I know this is a corner case, but it's worth addressing if I can.
I'm not entirely sure I follow you. You don't need to hash the metadata because you can store that directly in the database. And you can do that because each individual cartridge's metadata isn't copyrighted (it's a fact). You can just load all the metadata chunks and compare them elementwise to your database.
Then I concatenate the results in the order I collected them and hash that.
Reminds me of the hash in ZapFC.
proxy
Posts: 68
Joined: Tue Mar 08, 2011 9:45 am

Post by proxy »

@tepples: I think perhaps you've misunderstood my plan :-). But that's ok, I'll explain:

What you describe is basically allocating enough memory and copying the PRG/CHR ROM into that memory and hashing it. This would work, but it requires extra memory. I'm not sure that I want to consider UNIF dumps to be duplicates of iNES dumps (at least not yet, I may be convinced otherwise).

What I am thinking is this:

suppose there are 2 PRG chips (PRG0 and PRG1) and one CHR0. Assuming the following:

SHA1(PRG0) = A
SHA1(PRG1) = B
SHA1(CHR0) = C

I would consider the hash of the UNIF dump to be:

SHA1(ABC)

This allows me to calculate the hashes of the chunks as I load them and not worry about "making" them contiguous in memory, while still maintaining the following properties:

* unique for each UNIF dump (disregarding meta-data)
* I can process the various ROM dumps in an image in the order in which they appear in the file (which can be ANY order), no jumping back and forth.
* no need to allocate memory for all chunks at once and copy data around
* considered different from iNES dumps.

The last two points are not 100% important, and I may be convinced to let them go, but this scheme would work.

Anyway, as for the part you aren't entirely sure on, I'll make it more clear:

Suppose we have two dumps with the following chips/hashes:

dump #1 has PRG0 (16k)/PRG1 (16k)/CHR0 (8k)

SHA1(PRG0) = A
SHA1(PRG1) = B
SHA1(CHR0) = C

dump #2 has PRG0 (16k)/PRG2 (16k)/CHR0 (8k)

SHA1(PRG0) = A
SHA1(PRG2) = B
SHA1(CHR0) = C

Note that it is PRG2, NOT PRG1. I don't want these two dumps to be considered equal. But if I just do as I originally planned:

SHA1(ABC)

They would be... but if I include the "PRG0\x00\x00\x40\x00" as part of the byte stream that I hash for the PRG chunk, then which chip it is becomes part of the "uniqueness".

Essentially, I would like the hash of identical code found on different chip indexes to be considered different. I know almost all ROMs only have PRG0, but who knows what exotic cart we'll find next ;-)

I know this is a corner case which is unlikely to happen... But since it is easily addressable, I think I may as well deal with it. I have a couple of variants on this theme in mind, but that's the general idea. Include the chip#/type as part of the secondary hash and it will be truly unique.
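
Here is a small Python sketch of the difference. For brevity it mixes only the 4-byte chunk ID into each inner hash rather than the full chunk header with its length field; the function name and toy data are assumptions.

```python
import hashlib

def combined_hash(chunks: list[tuple[bytes, bytes]], tag_chunks: bool) -> str:
    """Hash the per-chunk SHA-1 digests; optionally mix in each chunk ID."""
    outer = hashlib.sha1()
    for chunk_id, payload in chunks:
        inner = hashlib.sha1()
        if tag_chunks:
            inner.update(chunk_id)  # e.g. b"PRG1" vs b"PRG2"
        inner.update(payload)
        outer.update(inner.digest())
    return outer.hexdigest()

prg_a, prg_b, chr_c = b"\xAA" * 16, b"\xBB" * 16, b"\xCC" * 8
dump1 = [(b"PRG0", prg_a), (b"PRG1", prg_b), (b"CHR0", chr_c)]
dump2 = [(b"PRG0", prg_a), (b"PRG2", prg_b), (b"CHR0", chr_c)]

print(combined_hash(dump1, False) == combined_hash(dump2, False))  # True
print(combined_hash(dump1, True) == combined_hash(dump2, True))    # False
```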
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

proxy wrote:What you describe is basically allocating enough memory and copying the PRG/CHR ROM into that memory and hashing it. This would work, but it requires extra memory.
Don't worry about extra memory. The biggest licensed ROM for NES or Famicom is 1 MB. The system requirement to run Windows 7 alone, let alone any applications, is a thousand times that.

suppose there are 2 PRG chips (PRG0 and PRG1) and one CHR0. Assuming the following:

SHA1(PRG0) = A
SHA1(PRG1) = B
SHA1(CHR0) = C
I'd take SHA1(PRG0 + PRG1) and SHA1(CHR0), one hash for each bus.
* no need to allocate memory for all chunks at once and copy data around
With files that small, what's the disadvantage of "keep[ing] the entire input file in memory and scan[ning] it there"?
dump #1 has PRG0 (16k)/PRG1 (16k)/CHR0 (8k)

SHA1(PRG0) = A
SHA1(PRG1) = B
SHA1(CHR0) = C

dump #2 has PRG0 (16k)/PRG2 (16k)/CHR0 (8k)

SHA1(PRG0) = A
SHA1(PRG2) = B
SHA1(CHR0) = C

Note that it is PRG2, NOT PRG1. I don't want these two dumps to be considered equal.
I don't see why not. The data that the CPU sees consists of the same bytes in the same order, no matter what is silkscreened onto the chips. If PRG1 and PRG2 showed up at different bank addresses, the game wouldn't run anyway. Hence SHA1(PRG0 + PRG2) and SHA1(CHR0).
Essentially, I would like the hash of identical code found on different chip indexes to be considered different. I know almost all ROMs only have PRG0, but who knows what exotic cart we'll find next ;-)
The multi-chip games I can think of are Action 52 (three PRG ROMs) and the After Burner mapper (two CHR ROMs).
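
The one-hash-per-bus suggestion can be sketched in Python without copying chunks into one buffer, since feeding a hash object piece by piece gives the same digest as hashing the concatenation. The function name and the `chunks` mapping (chunk ID to payload) are hypothetical.

```python
import hashlib

def bus_hashes(chunks: dict[bytes, bytes]) -> tuple[str, str]:
    """Return (prg_hash, chr_hash), each over that bus's chunks in index order."""
    result = []
    for prefix in (b"PRG", b"CHR"):
        h = hashlib.sha1()
        # ASCII order puts 0-9 before A-F, which matches the chunk numbering.
        for chunk_id in sorted(c for c in chunks if c.startswith(prefix)):
            h.update(chunks[chunk_id])
        result.append(h.hexdigest())
    return result[0], result[1]

a, b, c = b"\xAA" * 16, b"\xBB" * 16, b"\xCC" * 8
print(bus_hashes({b"PRG0": a, b"PRG1": b, b"CHR0": c}) ==
      bus_hashes({b"PRG0": a, b"PRG2": b, b"CHR0": c}))  # True
```

Under this scheme the PRG0/PRG1 and PRG0/PRG2 dumps compare equal, which is exactly the behaviour argued for above.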
proxy
Posts: 68
Joined: Tue Mar 08, 2011 9:45 am

Post by proxy »

Fair enough points. I've actually just finished reworking some of my code for loading iNES to prefer memory-mapped files. I'll try to think of a similar scheme for UNIF :-).
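
For illustration only, hashing through a memory map might look like this in Python. This is not the code mentioned above; the function name is made up, and note that `mmap` raises `ValueError` on an empty file.

```python
import hashlib
import mmap

def hash_rom_body(path: str, header_size: int = 16) -> str:
    """SHA-1 of everything after the header, read through a memory map."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Release both views before the map is closed.
            with memoryview(mm) as view, view[header_size:] as body:
                return hashlib.sha1(body).hexdigest()
```

The slice is a view into the mapping, so the header is skipped without copying the file contents.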

You may be right that the chip # doesn't matter. I had it in my mind that it did, but perhaps it shouldn't?

On an unrelated note, there is a single UNIF ROM in my collection which claims to use UNIF revision 8. Tennessee at one point asked me to be the future maintainer of the UNIF standard; as far as I know, 7b was the latest. Was there ever an 8 (official or otherwise)? At the moment, libunif will refuse to consider files with a version > 7 as valid. I have two choices here:

1. bump the official version to 8 to make the ROM valid. If there was a UNIF version 8 and it had a feature 7b didn't, add it to the standard.

2. correct the ROM.
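
For reference, a version check of this kind might look like the following Python sketch. It is not libunif's code; it assumes the revision is stored as a 32-bit little-endian integer right after the "UNIF" tag, and the names are hypothetical.

```python
import struct

MAX_SUPPORTED_REVISION = 7  # mirrors the limit described above

def unif_revision(data: bytes) -> int:
    """Read the revision number from the start of a UNIF image."""
    if len(data) < 8 or data[:4] != b"UNIF":
        raise ValueError("not a UNIF file")
    (revision,) = struct.unpack_from("<I", data, 4)
    return revision

def is_supported(data: bytes) -> bool:
    """True if the file's revision is one this reader accepts."""
    return unif_revision(data) <= MAX_SUPPORTED_REVISION

rev8 = b"UNIF" + struct.pack("<I", 8) + bytes(24)
print(unif_revision(rev8), is_supported(rev8))
```

Option 1 would amount to raising the constant; option 2 would mean rewriting the four revision bytes in the file.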

Anyone have any insight into this mystery version 8 UNIF file?