Mysterious pirate CD with swapped bytes

krzysiobal
Posts: 764
Joined: Sun Jun 12, 2011 12:06 pm
Location: Poland

Mysterious pirate CD with swapped bytes

Post by krzysiobal » Sun Jun 14, 2020 1:36 pm

Historical background
In the 1990s there was a computer marketplace in the centre of Warsaw, Poland, at 25 Grzybowska Street, open every Sunday. At that time it was the biggest place in my country where you could buy hardware and software for PC, Amiga, Commodore and other machines. I visited it a few times as a 9-year-old kid in 1997-1998. The marketplace consisted of open-air stands where you could buy boxed hardware and software (mostly legal and genuine). You can see what it looked like in one scene of the "Ekstradycja 3" movie:
https://youtu.be/mVwBiIBar5c?t=2238

In addition to the marketplace, there was a primary school nearby where, also on Sundays, you could buy unboxed stuff, mostly pirated software. You could either choose from pre-recorded sets, or the seller could burn a CD (20 PLN = $5) or write a 3.5" floppy (5 PLN = $1.25) with whatever you wanted. This is shown in the following short movie:
https://youtu.be/mxQqsqqH8ao?t=62

Shortly after buying our first computer (Pentium 166MMX), my father bought me a CD from there called "Gry dla Dzieci" (Games for Kids).
I still have it. It contains a number of DOS and Windows 95 games:
* 3D Dinosaur Adventure
* Barbie super model
* Beauty & the Beast
* DIZZY collection
* Fatty Bear birthday surprise
* Follow the Reader
* Hand of fate PL (Legend of Kyrandia 2)
* Hunchback of Notre Dame (Dzwonnik z Notre Dame)
* Jungle Book
* Lion King
* Literki-cyferki
* Mickey 123
* Mickey ABC
* Mickey Jigsaw Puzzle
* Matematyka dla klas 3,4
* Putt Putt 1
* Putt Putt 2
* Scooter's Magic Castle FULL CD
* SMURFS PL (full CD)
* Timon & Pumbaa Jungle Games

Some of the games cannot even be found on the internet nowadays (even Google does not know them), for example:
1) Matematyka klasa III-IV, (c) 1994 by MAREX, Autorzy: Zbigniew Oględzki, Marek Wapniarz (Eng. Mathematics for classes III-IV; "Autorzy" = authors) - it was compiled in Turbo Pascal and teaches the four basic arithmetic operations

2) Literki-cyferki (Eng. Letters and digits)

3) Scooter's Magic Castle V2.0 CD - this one can be "grabbed" from the internet, but the version on this CD lets you choose one of three background music tracks

Even more curiously, a few games on the CD are packed as ARJ archives, with information about the sellers (company name, table number in the marketplace) added as the archive comment, for example:

Code: Select all

    PACKED & SCANNED FOR VIRUS BY AGNES
             Gielda komputerowa
   ul. Grzybowska 25, 1 pietro, stolik 30
or

Code: Select all

╔═════════════════════════════════════════════════╗
║            Packed by Wojciech Piegat            ║
║            BEER  SOFTWARE & HARDWARE            ║
║      Gielda komputerowa, ul.Grzybowska 25       ║
╚═════════════════════════════════════════════════╝

Going back to the heart of the problem - almost none of the games on that CD can be extracted and launched; unpacking reports a CRC error. This is not because the CD is scratched or unreadable: after ~23 years I can still read the whole CD and make an ISO image of every sector (that golden Maxwell CD is probably tens of times more durable than today's CD-Rs). This issue has been bothering me ever since I got the CD.

A few days ago I decided to focus on the problem. I compared a few unpacked files from the CD with their equivalents downloaded from the internet and found that they differ. The differences are bizarre: it looks like some two-byte pairs were overwritten with other two-byte pairs. The differing regions are short (hundreds to a few thousand bytes) and almost always start at offsets ending in B8, for example:

file: \README.EXE
size: 186868 bytes
differences only in region: $1AB8-$1FFF

\DATA folder (it belongs to the Smurfs teletransporter game; of its 2421 files, 211 differ, for example):
file: DATA\JEU_F\S01\S01C\S01C.MUX
size: 507429 bytes
differences only in region: $162B8-$167FF, $176B8-$19257
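
The comparison described above can be sketched roughly as follows. This is only an illustration, not the tool actually used for the listings: the function names and the `max_gap` parameter are made up, and `SECTOR_SIZE` assumes that each file starts on a 2048-byte sector boundary of the CD.

```python
from typing import List, Tuple

# Assumption: 800h (2048) user-data bytes per sector, files start on a sector boundary.
SECTOR_SIZE = 0x800


def diff_regions(good: bytes, bad: bytes, max_gap: int = 0) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) offsets of the runs where the buffers differ.

    Runs separated by at most `max_gap` equal bytes are merged, because bytes
    inside a damaged region can match the original by coincidence.
    """
    regions: List[Tuple[int, int]] = []
    start = None
    last = 0
    for i in range(min(len(good), len(bad))):
        if good[i] != bad[i]:
            if start is None:
                start = i
            last = i
        elif start is not None and i - last > max_gap:
            regions.append((start, last))
            start = None
    if start is not None:
        regions.append((start, last))
    return regions


def describe(start: int, end: int) -> Tuple[int, int, int]:
    """Return (sector index, offset inside that sector, region length in bytes)."""
    return start // SECTOR_SIZE, start % SECTOR_SIZE, end - start + 1


if __name__ == "__main__":
    # The two regions quoted in the post that end on a sector boundary.
    for start, end in [(0x1AB8, 0x1FFF), (0x162B8, 0x167FF)]:
        sector, offset, length = describe(start, end)
        print(f"${start:X}-${end:X}: sector {sector}, offset {offset:X}h, {length:X}h bytes")
```

Under that sector-boundary assumption, both $1AB8-$1FFF and $162B8-$167FF begin 2B8h bytes into a sector, run to the end of that sector, and are 548h bytes long.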

My questions
At first I thought that the bytes were only swapped and that, if I knew the algorithm, I could swap them back. However, the histogram of byte occurrences shows that the counts do not match, so it is not a pure reordering. So I was wondering what could cause that. I have 3 hypotheses:

* Intentionally burning broken files - I rather doubt it. As far as I can recall, after a few weeks I returned to the seller and pointed out the broken game, and he copied it to a floppy for free. Unfortunately I don't quite remember how he explained the fault to me.

* The files were corrupted by a virus present on the seller's computer before the CD was burned

* The burning software used by the seller had a deliberate "bug" (or feature) that started burning invalid data once the shareware trial period had elapsed. This seems the most plausible to me, since I think I read about it somewhere, but do you know of any DOS CD-burning software of that era (1997) that behaved like that?
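
For reference, the histogram check mentioned before the hypotheses could look something like the sketch below. The function names are invented for illustration. If the damaged file were only a reordering of the original bytes, both byte histograms would be identical.

```python
from collections import Counter
from typing import Dict


def is_permutation(good: bytes, bad: bytes) -> bool:
    """True if `bad` contains exactly the same bytes as `good`, in any order."""
    return Counter(good) == Counter(bad)


def histogram_delta(good: bytes, bad: bytes) -> Dict[int, int]:
    """Map byte value -> (count in bad - count in good), for values whose counts differ."""
    delta = Counter(bad)
    delta.subtract(Counter(good))
    return {value: diff for value, diff in delta.items() if diff != 0}


if __name__ == "__main__":
    original = b"abcd"
    print(is_permutation(original, b"badc"))   # reordered only
    print(histogram_delta(original, b"abab"))  # bytes replaced, not just moved
```

A non-empty delta means some byte values were replaced by others rather than moved, which is what the post reports.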

I attach the first sectors from image of that CD (just the lead-in and directory structure, without data) as well as one of the files that have swapped bytes.

Most CD burning programs write their "credits" into the first sectors, but this one just looks like this:

Code: Select all

CD001 CD-RTOS CD-BRIDGE                                                  
1997091514461500 1997091514461500 0000000000000000 0000000000000000  
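
The fields shown above come from the ISO 9660 primary volume descriptor. A minimal sketch for dumping them from an image, assuming the image uses plain 2048-byte sectors (a raw 2352-byte dump would need a different sector size and header skip); `parse_pvd` is just an illustrative name:

```python
def parse_pvd(image: bytes, sector_size: int = 2048) -> dict:
    """Extract a few text fields from the primary volume descriptor (sector 16)."""
    pvd = image[16 * sector_size:17 * sector_size]
    # Byte 0 is the descriptor type (1 = primary), bytes 1..5 are "CD001".
    if len(pvd) < 2048 or pvd[0] != 1 or pvd[1:6] != b"CD001":
        raise ValueError("no primary volume descriptor at sector 16")

    def text(start: int, end: int) -> str:
        return pvd[start:end].decode("ascii", "replace").rstrip(" \x00")

    return {
        "system_id": text(8, 40),          # e.g. "CD-RTOS CD-BRIDGE"
        "volume_id": text(40, 72),
        "application_id": text(574, 702),  # where burning tools often leave their name
        "created": text(813, 829),         # 16 digits: YYYYMMDDHHMMSSss
        "modified": text(830, 846),
    }


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as f:
        for key, value in parse_pvd(f.read()).items():
            print(f"{key}: {value!r}")
```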

Even funnier, this CD is detected as CD-XA, whereas a CD normally burned in Nero is detected as CD-DA.

Additionally, one of the directories (\SCOOTER\EAKIDS\SCOOTCD\SAMPLES\) also seems to have "broken" entries, so it looks like either the burning software made a mess of it, or the whole CD image was stored as an ISO file and something happened to it before burning.
Attachments
Gry Dla Dzieci.iso.zip (186 KiB)
readme.exe.zip (20.4 KiB)

nocash
Posts: 1231
Joined: Fri Feb 24, 2012 12:09 pm
Contact:

Re: Mysterious pirate CD with swapped bytes

Post by nocash » Sun Jun 14, 2020 2:46 pm

CD-XA can use FORM1 sectors with 800h bytes and error correction, or optionally, FORM2 sectors with 914h bytes without error correction. The sector CRC is optional for FORM2; if the CRC is missing then you won't even see whether there are errors.
Maybe that's the problem? The burner might have tried to use FORM2 to squeeze more data onto the disc, although it would be a very bad idea to do so for exe or arj files (the FORM2 stuff is intended only for audio/video streaming, not for normal "binary" files).
To see if it is FORM2, have a look at the sector headers (of sectors that contain bad data), or look at bit12 of the file attribute flags in the directory, https://problemkaputt.de/psx-spx.htm#cd ... escriptors
PS. CD-DA would be Digital Audio, that is actually very common... but not for CDROM data discs ; )
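
A rough sketch of the sector-header check, assuming a raw dump with full 930h-byte sectors (a normal .iso with 800h bytes per sector no longer contains the headers) and assuming the disc really is CD-XA, so that mode 2 sectors carry a subheader. `sector_form` is a made-up name.

```python
SYNC = b"\x00" + b"\xff" * 10 + b"\x00"  # 12-byte sync pattern at the start of a data sector
RAW_SECTOR_SIZE = 0x930                  # 2352 bytes


def sector_form(raw: bytes) -> str:
    """Classify one raw sector from its header and (for XA) its subheader."""
    if len(raw) != RAW_SECTOR_SIZE or raw[:12] != SYNC:
        raise ValueError("not a raw 930h-byte data sector")
    mode = raw[15]                       # bytes 12..14 are the MM:SS:FF address
    if mode != 2:
        return f"MODE{mode}"
    submode = raw[18]                    # subheader: file, channel, submode, coding info
    # Submode bit 5 set means FORM2 (914h data bytes, no error correction).
    return "MODE2/FORM2" if submode & 0x20 else "MODE2/FORM1"


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as f:
        index = 0
        while True:
            raw = f.read(RAW_SECTOR_SIZE)
            if len(raw) < RAW_SECTOR_SIZE:
                break
            print(index, sector_form(raw))
            index += 1
```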
homepage - patreon - you can think of a bit as a bottle that is either half full or half empty

nocash

Re: Mysterious pirate CD with swapped bytes

Post by nocash » Sun Jun 14, 2020 3:37 pm

PPS. All CDROM data sectors (including those on CD-XA discs) are internally stored inside of 930h-byte CD-DA audio sectors. The bytes are encoded as optical codes (as far as I remember, each code is 14bit wide (EFM), with 256 valid codes for 256 bytes, and many invalid codes) (and maybe the optical code for byte 8Bh is somehow harder to identify; that might depend on the optics, so a different cdrom drive might act better on those bytes, and/or fail on other bytes).

Even though the FORM2 data layer has no data error correction, the underlying audio sectors do also have some audio error correction, which might have actually fixed some bytes, but apparently that wasn't good enough for fixing all bytes. Instead, it looks as if the audio layer has given up on audio error correction - and instead reused the previous 16bit sample (from the previous 2x16bit stereo sample).
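
The "reused previous sample" idea can be tested on a good/bad pair of buffers. The sketch below is only a hypothetical check (the function name is invented): it counts how many damaged 16bit words are copies of the word 4 bytes earlier, which is the same channel of the previous stereo sample. It assumes the buffers are aligned to 16bit words in the same way as on the disc.

```python
def held_sample_fraction(good: bytes, bad: bytes) -> float:
    """Fraction of damaged 16-bit words in `bad` that equal the word 4 bytes earlier in `bad`."""
    damaged = 0
    held = 0
    length = min(len(good), len(bad)) & ~1  # whole 16-bit words only
    for i in range(4, length, 2):
        if bad[i:i + 2] != good[i:i + 2]:
            damaged += 1
            if bad[i:i + 2] == bad[i - 4:i - 2]:
                held += 1
    return held / damaged if damaged else 0.0


if __name__ == "__main__":
    good = bytes(range(16))
    bad = bytearray(good)
    bad[8:10] = bad[4:6]  # simulate one word replaced by the previous sample
    print(held_sample_fraction(good, bytes(bad)))
```

A fraction close to 1 on real data would support the idea, a fraction close to 0 would speak against it. Since the bytes on the disc are scrambled, a faithful test would have to compare the sectors in their scrambled form rather than the extracted files.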

EDIT: I forgot that data sectors are "scrambled" internally (all data bytes are XORed with a "random" pattern, that's done because the error correction works better for "random" data, as opposed to all-zerofilled sectors).
So, if you have two bugged 8Bh bytes, that's really mysterious, because they are each XORed with different random bytes, so they end up with different optical codes on the disc.
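
To illustrate the scrambling, here is a small sketch. It assumes the standard CD-ROM scrambler (a 15bit shift register with polynomial x^15 + x + 1, preset to 0001h, applied to bytes 12..2351 of each raw sector). Treat it as an illustration and not as a verified implementation of the spec.

```python
def scramble_table(length: int = 2340) -> bytes:
    """Generate the XOR pattern for the sector bytes that follow the 12-byte sync."""
    reg = 0x0001
    out = bytearray(length)
    for i in range(length):
        byte = 0
        for bit in range(8):
            byte |= (reg & 1) << bit             # output the LSB first
            feedback = (reg ^ (reg >> 1)) & 1    # taps for x^15 + x + 1
            reg = (reg >> 1) | (feedback << 14)
        out[i] = byte
    return bytes(out)


def scramble(sector: bytes) -> bytes:
    """XOR everything after the sync field with the pattern; applying it twice restores the input."""
    table = scramble_table(len(sector) - 12)
    return sector[:12] + bytes(b ^ t for b, t in zip(sector[12:], table))


if __name__ == "__main__":
    # Two identical 8Bh bytes at neighbouring positions end up as different values.
    sector = bytes(12) + b"\x8b" * 2340
    scrambled = scramble(sector)
    print(f"{scrambled[12]:02X} {scrambled[13]:02X}")
```

With this pattern the first two 8Bh bytes after the sync become 8Ah and 0Bh, so identical bytes at different positions are stored differently on the disc.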