asm6 assumes little-endian host [not anymore!]

blargg
Posts: 3715
Joined: Mon Sep 27, 2004 8:33 am
Location: Central Texas, USA

Post by blargg » Tue Jun 29, 2010 4:56 am

I was trying to get asm6 working on my machine, but all the data values were being output as zero. I finally traced it to the code assuming that the host is little-endian; in many places it essentially does this:

Code:

int i = ...;
write_bytes_to_output( &i, 1 ); // for byte output
write_bytes_to_output( &i, 2 ); // for 16-bit word output
rather than the much more portable

Code:

int i = ...;
unsigned char c = i;
write_bytes_to_output( &c, 1 );
unsigned char word [2] = { i, i >> 8 };
write_bytes_to_output( word, 2 );
Last edited by blargg on Wed Oct 06, 2010 3:11 pm, edited 1 time in total.
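
A minimal, self-contained sketch of the portable approach in the post above. The helper names put_byte and put_word_le are hypothetical (they are not asm6 functions), and the bytes go into a caller-supplied buffer rather than asm6's real output routine. Because the bytes are extracted with shifts and masks, the result does not depend on how the host lays out an int in memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, not asm6's actual API. */

/* Store the low 8 bits of value at out[0]; returns the byte count. */
static size_t put_byte(unsigned char *out, int value)
{
    unsigned int v = (unsigned int)value;
    assert(out != NULL);
    out[0] = (unsigned char)(v & 0xFFu);
    return 1;
}

/* Store the low 16 bits of value, low byte first (6502 order). */
static size_t put_word_le(unsigned char *out, int value)
{
    unsigned int v = (unsigned int)value;
    assert(out != NULL);
    out[0] = (unsigned char)(v & 0xFFu);
    out[1] = (unsigned char)((v >> 8) & 0xFFu);
    return 2;
}
```

Calling put_word_le(buf, 0x1234) leaves 0x34 in buf[0] and 0x12 in buf[1] on both little-endian and big-endian hosts.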

koitsu
Posts: 4218
Joined: Sun Sep 19, 2004 9:28 pm
Location: A world gone mad

Post by koitsu » Tue Jun 29, 2010 1:44 pm

Doing this for every single write_bytes_to_output() call is wasteful.

The source code should be modified to either:
a) have an appropriate #ifdef to define little or big endian (most people use LITTLE_ENDIAN and BIG_ENDIAN), or,
b) have a detection subroutine that runs *once* that detects+defines endian state and modify write_bytes_to_output() to utilise said state.

(a) is more efficient (and more common) but requires the user to rebuild the binary from source if they change architectures (fairly common anyway), while (b) doesn't require this.

Detection is easy:

Code:

#define LITTLE_ENDIAN 0
#define BIG_ENDIAN    1

/* Prototype declaration */
int test_endian(void);

/* Function definition */
int test_endian(void) {
  int i = 1;
  char *p = (char *)&i;

  if (p[0] == 1) { return LITTLE_ENDIAN; }
  return BIG_ENDIAN;
}
Again, you'd only need to call this once.
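
A sketch of how option (b) might look when the detection result is cached and then used by an output helper. All names here (host_endian, detect_host_endian, store_word_le, the HOST_ENDIAN_* constants) are hypothetical, chosen partly because some system headers already define LITTLE_ENDIAN and BIG_ENDIAN. It assumes unsigned short is exactly two 8-bit bytes and ignores mixed-endian hosts.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names for this sketch; not taken from asm6. */
enum host_endian { HOST_ENDIAN_UNKNOWN, HOST_ENDIAN_LITTLE, HOST_ENDIAN_BIG };

/* Inspect the first byte in memory of an unsigned int holding 1. */
static enum host_endian detect_host_endian(void)
{
    unsigned int probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1 ? HOST_ENDIAN_LITTLE : HOST_ENDIAN_BIG;
}

/* Run the detection on first use only and cache the result. */
static enum host_endian host_endian(void)
{
    static enum host_endian cached = HOST_ENDIAN_UNKNOWN;
    if (cached == HOST_ENDIAN_UNKNOWN)
        cached = detect_host_endian();
    return cached;
}

/* Copy value to out in little-endian order, swapping the host's
   bytes only when the host is big-endian. */
static void store_word_le(unsigned char *out, unsigned short value)
{
    unsigned char raw[sizeof value];
    assert(sizeof value == 2); /* assumption stated in the lead-in */
    memcpy(raw, &value, sizeof value);
    if (host_endian() == HOST_ENDIAN_LITTLE) {
        out[0] = raw[0];
        out[1] = raw[1];
    } else {
        out[0] = raw[1];
        out[1] = raw[0];
    }
}
```

Every call to store_word_le still pays for a conditional test, which is the cost the shift-based version avoids without needing any detection at all.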

Post by blargg » Tue Jun 29, 2010 3:00 pm

I found more endian-dependence. In places it does *(short*)char_ptr = '=' for example, instead of the portable strcpy( char_ptr, "=" ). Or similarly, if ( *(short*)char_ptr == '=' ) instead of if ( !strcmp( char_ptr, "=" ) ).
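
To see why the cast breaks: stored as a 16-bit value, '=' (0x3D) is the bytes 3D 00 on a little-endian host, which reads back as the string "=", but 00 3D on a big-endian host, which reads back as an empty string. A sketch of portable equivalents that still avoid a library call follows; the helper names are hypothetical and this is not asm6 code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helpers illustrating the portable forms. */

/* True when s is exactly the one-character string made of c.
   c must not be '\0', so s[1] is only read when s[0] is non-zero. */
static int is_single_char(const char *s, char c)
{
    assert(s != NULL && c != '\0');
    return s[0] == c && s[1] == '\0';
}

/* Overwrite dst with the one-character string made of c.
   dst must have room for at least two chars. */
static void set_single_char(char *dst, char c)
{
    assert(dst != NULL);
    dst[0] = c;
    dst[1] = '\0';
}
```

Both helpers touch the string one char at a time, so they also sidestep the alignment and aliasing questions that the short* cast raises.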

And then there are the pointer casts, which I haven't figured out yet. Some of them use pointers as booleans, doing char_ptr = (char*) 1 for true, rather than the portable static char true_sentinel; char_ptr = &true_sentinel.

The twisted nature of these optimizations makes it a kind of interesting project to consider making portable, especially doing so while making minimal modifications.

Post by koitsu » Tue Jun 29, 2010 3:22 pm

I'm not sure I'd call those "optimisations" (no offence intended, loopy). The most recent examples look like an accident waiting to happen; ouch.

Regarding the pseudo-boolean stuff: depending on how the code works and what its intention is, it might make more overall sense to make it adhere to C99 (which officially offers the "bool" type through <stdbool.h>).

The ASM6 parser isn't really very... well, I guess I shouldn't go there. I think the original point was that loopy wrote it for his own use, released it into the wild for others to use, yadda yadda. This is probably one of those "it works except when it doesn't" situations. :-)

Post by blargg » Tue Jun 29, 2010 3:41 pm

Yeah, I partly just wanted to warn anyone considering compiling this on a big-endian machine, and perhaps post a portable version. I was impressed with the simplicity of the memory model. It was clearly made to just assemble and work, without arcane segments and other things used by other assemblers. I mainly just wanted to have it around so I could see how difficult it was to modify code I release to work with it.

The thing with the pointers is that it's also using them as normal pointers. So it's a sort of bool/char* union, but without having to use a union. It may even use this as the type flag, so if the pointer is (char*)1, then it's a bool with the value true. If it's NULL, then it's a bool with the value false. If it's neither, then it's a normal char* pointing at something.
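
For comparison, here is a sketch of what the same three-way value might look like with an explicit tag instead of reserved pointer values. The type and function names are hypothetical and nothing here is taken from asm6; it only illustrates the bool/char* idea described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tagged value: the kind field says how to read the rest. */
enum value_kind { VALUE_FALSE, VALUE_TRUE, VALUE_STRING };

struct value {
    enum value_kind kind;
    const char *str; /* meaningful only when kind == VALUE_STRING */
};

/* Build a boolean value; any non-zero b counts as true. */
static struct value make_bool(int b)
{
    struct value v;
    v.kind = b ? VALUE_TRUE : VALUE_FALSE;
    v.str = NULL;
    return v;
}

/* Build a string value that borrows s (no copy is made). */
static struct value make_string(const char *s)
{
    struct value v;
    assert(s != NULL);
    v.kind = VALUE_STRING;
    v.str = s;
    return v;
}

/* True only for the boolean value true. */
static int value_is_true(struct value v)
{
    return v.kind == VALUE_TRUE;
}

/* The string payload, or NULL when v is a boolean. */
static const char *value_string(struct value v)
{
    return v.kind == VALUE_STRING ? v.str : NULL;
}
```

The explicit tag costs extra storage per value, which is the overhead the pointer trick avoids, but it does not depend on which addresses a platform can hand out.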

Near
Founder of higan project
Posts: 1550
Joined: Mon Mar 27, 2006 5:23 pm

Post by Near » Tue Jun 29, 2010 7:11 pm

koitsu wrote: Again, you'd only need to call this once.
Of course, the problem with such a test function is that it can only be accomplished at run-time. Thus, any time you were to actually use a big- or little-endian-specific function, you'd have to go through a function pointer or conditional test. Or go really evil and rely on self-modifying code :)

Best bet is to try and detect the platform based on compiler-specific #defines, and fall back on letting the user manually choose endianness. And finally, create a run-time assertion on startup to ensure the correct endian was chosen.
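
The three steps in that suggestion can be sketched as follows. HOST_BIG_ENDIAN is a hypothetical macro name invented for this example; __BYTE_ORDER__, __ORDER_BIG_ENDIAN__ and __ORDER_LITTLE_ENDIAN__ are predefined by GCC and Clang, and other compilers would need their own checks or the manual fallback.

```c
#include <assert.h>
#include <string.h>

/* Steps 1 and 2: use compiler-specific defines when available, otherwise
   require the user to pass -DHOST_BIG_ENDIAN=0 or -DHOST_BIG_ENDIAN=1.
   HOST_BIG_ENDIAN is a hypothetical name for this sketch. */
#ifndef HOST_BIG_ENDIAN
#  if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
      __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#    define HOST_BIG_ENDIAN 1
#  elif defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
        __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#    define HOST_BIG_ENDIAN 0
#  else
#    error "Unknown host byte order: build with -DHOST_BIG_ENDIAN=0 or 1"
#  endif
#endif

/* Return 1 if the host stores the most significant byte first. */
static int runtime_is_big_endian(void)
{
    unsigned int probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 0;
}

/* Step 3: call once at startup to confirm the compile-time choice. */
static void check_host_endian(void)
{
    assert(runtime_is_big_endian() == HOST_BIG_ENDIAN);
}
```

Note that assert is compiled out when NDEBUG is defined, so a release build that should keep the check would need an explicit test and error exit instead.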

Still, for an NES assembler, is the speed benefit really worth all the extra hassle when you can use the same code on all platforms? I can't imagine writing more than 1MB of data this way. Surely the added overhead isn't even close to 1ms.
blargg wrote: I was impressed with the simplicity of the memory model. It was clearly made to just assemble and work, without arcane segments and other things used by other assemblers.
I've been trying to convince people of that approach for over a decade now. That kind of flexible magic can be there, but it should only be required when it is really needed.
blargg wrote: So it's a sort of bool/char* union, but without having to use a union. It may even use this as the type flag, so if the pointer is (char*)1, then it's a bool with the value true. If it's NULL, then it's a bool with the value false. If it's neither, then it's a normal char* pointing at something.
So then, I assume a value of (char*)2 represents a file not found condition?

Dwedit
Posts: 4352
Joined: Fri Nov 19, 2004 7:35 pm

Post by Dwedit » Tue Jun 29, 2010 7:13 pm

byuu wrote: So then, I assume a value of (char*)2 represents a file not found condition?
I wonder how many people here do not read The Daily WTF.
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!

cpow
NESICIDE developer
Posts: 1097
Joined: Mon Oct 13, 2008 7:55 pm
Location: Minneapolis, MN

Post by cpow » Tue Jun 29, 2010 8:13 pm

blargg wrote:Yeah, I partly just wanted to warn anyone considering compiling this on a big-endian machine
Hey Blargg, ANY chance I can get you to try compiling/using PASM on a big-endian machine?

It requires flex/bison. On Windows I use cygwin but it also compiles in Linux and OSX. It compiles both a library (static, included in NESICIDE) and an executable (for me to test it externally from NESICIDE).

It is written to be ASM6 syntax compatible.

Source for it is here:

http://www.gitorious.org/nesicide/nesic ... r/compiler

(the makefile and the files prefixed with pasm_)

loopy
Posts: 396
Joined: Sun Sep 19, 2004 10:52 pm
Location: UT

Post by loopy » Tue Jun 29, 2010 8:42 pm

I would definitely not look to asm6 as an example of good code. In some places, it's quite atrocious. Understand that it was originally written for an audience of one (myself), to be run on exactly one computer. Very much a "just make it work" mentality. At the time, I had no knowledge of how a parser should be written, I was figuring things out as I went along.
blargg wrote: The twisted nature of these optimizations makes it a kind of interesting project to consider making portable, especially doing so while making minimal modifications.
I'm glad you're finding it interesting, at least.

Post by blargg » Wed Jun 30, 2010 5:16 am

I just want to apologize for somewhat making an example of asm6. I was kind of annoyed that what could have worked fine (no OS-specific crap, for example) was marred by endian-dependence. I understand your goals for it (audience: one) and it goes way beyond meeting that. That *(short*) stuff to compare/set one-character strings is pretty clever, even if it isn't portable.

Personally I think little-endian is the correct way to go, for many reasons. Seems PowerPC is one of the last holdouts (I know it has a little-endian mode, but I am pretty sure that causes extra overhead in some cases as compared to big-endian mode, like for unaligned accesses). For one, the 6502 would have taken an extra cycle all the time on absolute indexed instructions if it were big-endian, unless it threw in extra hardware to compensate. The only downside is viewing in hex, but that is easily remedied by displaying from right-to-left. That way 78 56 34 12 gets displayed as 12 34 56 78.
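
The right-to-left display idea can be sketched in a few lines. The function name hex_right_to_left is hypothetical; it simply formats a run of bytes starting from the highest address, so a little-endian value reads in its natural digit order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format n bytes as two-digit hex, last byte first,
   separated by spaces. out must hold at least 3 * n chars. Returns out. */
static char *hex_right_to_left(char *out, const unsigned char *bytes, size_t n)
{
    char *p = out;
    size_t i;
    assert(out != NULL && bytes != NULL && n > 0);
    for (i = n; i > 0; i--) {
        p += sprintf(p, "%02X", (unsigned int)bytes[i - 1]);
        if (i > 1)
            *p++ = ' ';
    }
    *p = '\0';
    return out;
}
```

Given the four bytes 78 56 34 12 in memory order, the helper produces the text "12 34 56 78", matching the example in the post.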

tepples
Posts: 22052
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples » Wed Jun 30, 2010 5:33 am

blargg wrote:The only downside is viewing in hex, but that is easily remedied by displaying from right-to-left.
But then that would take the "om" out of "homebrew".

But seriously, this sort of union-between-pointers-and-result-codes reminds me of techniques used in dynamically typed languages such as PHP and Python. If I were writing an assembler today, I'd probably do it in Python.

Post by blargg » Wed Jun 30, 2010 12:03 pm

Dynamic typing takes all the fun out of it. It's more fun if you are able to pack a type field into a char*, without using any more bits than normal. Just hope your platform's malloc never returns (char*)1 as a valid memory block...

Post by Near » Wed Jun 30, 2010 9:10 pm

blargg wrote:Just hope your platform's malloc never returns (char*)1 as a valid memory block...
You know the sad part is that, thanks to programming and/or emulation, I actually would worry about exactly that. "Hmm, there's a one in four billion chance that malloc could return an address of one. Meh, unacceptable. The risks are just too great."

You have to wonder about what kind of effects such a perfectionist mentality has on the rest of our lives :P

So, workaround time:

Code:

static const intptr_t False = NULL;
static const intptr_t True = &False;

Post by blargg » Thu Jul 01, 2010 4:35 am

intptr_t just complicates it. Use void* and it works much more smoothly (in C, due to implicit void* conversion), and fully portably:

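Code:

#include <assert.h>
#include <stddef.h>

/* true_ptr and po_ptr each point at themselves, so no string or
   malloc block can ever share their addresses. */
static void* false_ptr = NULL;
static void* true_ptr = &true_ptr;
static void* po_ptr = &po_ptr;

int main( void )
{
	const char* ptr;
	ptr = false_ptr;
	assert( !ptr );

	ptr = true_ptr;
	assert( ptr && ptr == true_ptr );

	ptr = "str";
	assert( ptr != false_ptr && ptr != true_ptr );
	return 0;
}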

Post by tepples » Thu Jul 01, 2010 4:51 am

byuu wrote:Hmm, there's a one in four billion chance that malloc could return an address of one.
Some platforms' ABIs specify that malloc() can never return a pointer with any of the three low-order bits set. For example, glibc guarantees 8-byte alignment. And with int being at least 2 bytes long on CHAR_BIT==8 machines like those that run the vast majority of emulators, I'd wager that every platform worth caring about aligns all pointers from malloc() to at least two bytes.

Yes, I just assumed that CHAR_BIT==8, but I can document that assumption in a compile-time assertion: code that results in declaring a negative-size array if it is false.

Code:

#include <limits.h>

extern int CTASSERT_eight_bit_bytes[(CHAR_BIT == 8) ? 1 : -1];
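
A sketch that puts the two ideas from this post side by side: the negative-size-array compile-time assertion, and a run-time spot check that a malloc() result has its low address bit clear and so cannot equal (char*)1. The names are hypothetical, the pointer-to-integer conversion is implementation-defined, and a spot check is only evidence about one allocation, not the guarantee an ABI document would give.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Compile-time check: the array size becomes -1, and the build fails,
   if bytes are not 8 bits wide. The typedef name is hypothetical. */
typedef char ctassert_eight_bit_bytes[(CHAR_BIT == 8) ? 1 : -1];

/* Return 1 if p's low address bit is clear, so p cannot equal (char*)1.
   Assumes the usual flat mapping of addresses to integers. */
static int low_bit_clear(const void *p)
{
    return ((uintptr_t)p & 1u) == 0;
}

/* Allocate one block of the given size and report whether malloc
   kept the low address bit clear for it. */
static int malloc_is_even_aligned(size_t size)
{
    void *block = malloc(size);
    int ok;
    if (block == NULL)
        return 1; /* allocation failed: nothing to collide with */
    ok = low_bit_clear(block);
    free(block);
    return ok;
}
```

Using a typedef rather than an extern array keeps the check from declaring an object, while failing the build in the same way when the condition is false.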
