I'd be interested to discuss a replacement for the file extension (or file name suffix). Let me first define what problems there are so we know what we need to solve.
User double-clicked a file. Which app starts?
Say I have a file whose name ends with .bin (generic ROM image), .cue (index of multitrack optical disc image), or .iso (ISO 9660 or UDF file system image), and I want to launch the appropriate emulator. What steps should an operating system take to determine this? I imagine there would be two steps: first determine the platform for which the image was made, and then determine the preferred emulator for that platform. In a well-designed operating system, how would an application register both the rules for recognizing a format and its ability to play that format? Would the registration have to be system-wide or per user?
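To make the two steps concrete, here is a minimal sketch of what I imagine, not a description of how any existing operating system does it. The registry layout, the function names, and the emulator names ("emu-a" and so on) are all hypothetical. The only facts it leans on are that a Mega Drive ROM header has a console name beginning with "SEGA" at offset 0x100 and that a Game Boy ROM header has the Nintendo logo bitmap at offset 0x104.

```python
from typing import Callable, Dict, Optional

def looks_like_mega_drive(data: bytes) -> bool:
    # The console name field at 0x100 starts with "SEGA".
    return data[0x100:0x104] == b"SEGA"

def looks_like_game_boy(data: bytes) -> bool:
    # The logo bitmap at 0x104 starts with CE ED 66 66; only the
    # first four bytes are checked in this sketch.
    return data[0x104:0x108] == bytes.fromhex("CEED6666")

# Hypothetical recognition rules that applications would register.
DETECTORS: Dict[str, Callable[[bytes], bool]] = {
    "mega-drive": looks_like_mega_drive,
    "game-boy": looks_like_game_boy,
}

def detect_platform(data: bytes) -> Optional[str]:
    """Step 1: determine the platform from the contents, not the name."""
    for platform, detector in DETECTORS.items():
        if detector(data):
            return platform
    return None

def preferred_emulator(platform: str,
                       system_prefs: Dict[str, str],
                       user_prefs: Dict[str, str]) -> Optional[str]:
    """Step 2: a per-user preference overrides the system-wide one."""
    return user_prefs.get(platform, system_prefs.get(platform))

def resolve(data: bytes,
            system_prefs: Dict[str, str],
            user_prefs: Dict[str, str]) -> Optional[str]:
    platform = detect_platform(data)
    if platform is None:
        return None
    return preferred_emulator(platform, system_prefs, user_prefs)
```

In this sketch the answer to "system-wide or per user" is "both, with the user's choice winning", which is only one of several defensible designs.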
Web client requested a file. What's its Content-type?
Or say I have a web server from which the user has requested a particular file. I want the web server to determine which Content-type value to send along with the file's contents. What steps should the web server take to determine this? Or say the user drags a file into a mail user agent's compose window as an attachment. What should the attachment's Content-type be once the user sends the email? From the previous topic, "Internet media types (MIME types) for retro file formats", I'm aware that even having a correct value to send in the first place is a serious undertaking because of the formal documentation that needs to be prepared and submitted to the IETF.
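As a rough illustration of the steps a server or mail client could take, here is a sketch that sniffs the first bytes of the file, then falls back to the name suffix, then to a generic type. The function names are hypothetical. `mimetypes.guess_type` is the Python standard library's suffix lookup, and the PNG signature is the one well-known magic number I've included. Whether sniffing should outrank the suffix is itself a design choice that this sketch assumes rather than argues for.

```python
import mimetypes
from typing import Optional

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def sniff(head: bytes) -> Optional[str]:
    # Content-based detection; a real table would have many more entries.
    if head.startswith(PNG_SIGNATURE):
        return "image/png"
    return None

def content_type_for(filename: str, head: bytes) -> str:
    # 1. Trust recognizable contents first.
    sniffed = sniff(head)
    if sniffed is not None:
        return sniffed
    # 2. Fall back to the name suffix via the standard library's table.
    guessed, _encoding = mimetypes.guess_type(filename)
    # 3. If nothing matches, send the generic binary type.
    return guessed or "application/octet-stream"
```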
User opened an assembly file. What instruction set is it in?
Or say I have a text editor that can edit assembly language for 6502/65816, Z80, 68000, x86, SM83 (8080-like CPU in Game Boy), SPC700, MIPS, SuperH, or ARM. A single project may have code for two or more instruction sets, such as 65816 and SPC700 (Super NES), or 68000, Z80, and SuperH (Genesis 32X), or SuperH and ARM (Dreamcast). How would an editor know what set of syntax highlighting rules to apply for a given .s or .asm file?
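One possible answer, sketched here purely as an assumption, is a per-project list of path patterns that map to instruction sets, in the spirit of a per-directory attributes file. The rule format and the name `isa_for` are hypothetical; `fnmatchcase` is from the Python standard library.

```python
from fnmatch import fnmatchcase
from typing import List, Optional, Tuple

def isa_for(path: str, rules: List[Tuple[str, str]]) -> Optional[str]:
    """Return the instruction set of the first rule whose pattern matches.

    In fnmatch patterns, "*" also matches "/", so more specific
    patterns have to be listed before more general ones.
    """
    for pattern, isa in rules:
        if fnmatchcase(path, pattern):
            return isa
    return None

# Hypothetical rules for a Super NES project with 65816 and SPC700 code.
RULES = [
    ("src/spc/*.s", "spc700"),
    ("src/*.s", "65816"),
]
```

With these rules, `isa_for("src/spc/driver.s", RULES)` picks the SPC700 highlighter and `isa_for("src/main.s", RULES)` picks the 65816 one, even though both files share the .s suffix.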
I'm at the combination Pizza Hut and Taco Bell
Should polyglot files, which are valid as multiple content types, receive any special treatment? Even apart from lightweight markup languages, such as Markdown, which are intended to be legible as plain UTF-8 text, there are pairs of formats for which making a polyglot is straightforward. These pairs can arise by design, such as a zipfile with a prepended extraction program. Or they can arise by accident, such as a Game Boy ROM that is also a valid PNG image because the entire program part was put in a chunk, or a ROM for Super NES and Game Boy where each platform sees its own program at a separate header ($7FC0 or $0100).
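One way to frame the polyglot question is that detection returns a list of every matching type rather than a single answer, leaving the choice to the user or caller. The sketch below assumes that framing. The labels and function names are hypothetical, and the PNG and Game Boy checks look only at signatures, so they do not prove the file is fully valid as either format. `zipfile.is_zipfile` is the standard library call and accepts a file-like object.

```python
import io
import zipfile
from typing import List

def is_png(data: bytes) -> bool:
    # Signature check only; the chunks are not validated.
    return data.startswith(b"\x89PNG\r\n\x1a\n")

def is_zip(data: bytes) -> bool:
    # ZIP readers locate the archive from the end of the file,
    # so data prepended to the archive does not hide it.
    return zipfile.is_zipfile(io.BytesIO(data))

def has_game_boy_logo(data: bytes) -> bool:
    # First four bytes of the logo bitmap at 0x104.
    return data[0x104:0x108] == bytes.fromhex("CEED6666")

DETECTORS = [
    ("png", is_png),
    ("zip", is_zip),
    ("game-boy", has_game_boy_logo),
]

def matching_types(data: bytes) -> List[str]:
    """Return every label whose detector accepts the data."""
    return [label for label, detector in DETECTORS if detector(data)]
```

A file that comes back with more than one label is a polyglot candidate as far as these detectors can tell, and the open question is what the operating system should do with that list.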
These specific examples motivated this:
- Use of .bin in the Atari 2600 scene, the Mega Drive scene, and the early Game Boy Advance scene
- Use of .iso and .bin/.cue by multiple disc-based consoles
- Use of .spc by music in SPC700 save state format when it was already taken by Authenticode software publisher certificates
- Use of .wad for Wii channel packaging when it was already taken by id Software's Doom
- No-Intro's use of .md instead of .gen for Mega Drive ROMs when .md was already taken by Markdown
- Use of .deb for both Debian GNU/Linux application packaging and FCEUX debug files
- Use of .gg for both Game Gear ROM images and lists of ROM patches for Game Genie
- Disputes in the gbdev Discord server as to the appropriate extension for SM83 assembly language source code files