## Math vs Language in programming


aa-dav
Posts: 91
Joined: Tue Apr 14, 2020 9:45 pm
Location: Russia

### Math vs Language in programming

(Heavily inspired by pages 3-4 of the neighbouring topic http://forums.nesdev.com/viewtopic.php?f=5&t=20421 )
(I posted most of the text below on the Ars Technica forums a couple of months ago, but it now seems appropriate for discussion here.)

Another thing about the early era of computers and mathematics that amused me is the history of the first compilers.
It involves a monumental figure in the history of computing: Grace Hopper https://en.wikipedia.org/wiki/Grace_Hopper
She was part of the team that developed the famous UNIVAC I computer, and later she created one of the first compilers.
Her first product was A-0, whose evolution led to A-1, A-2, A-3 and... B-0.
I didn't find detailed information about A-0 and A-1 (it looks like they never left the lab), but A-2 was a kind of virtual machine with a rich set of math instructions.
Its extension, A-3, was marketed under the name ARITH-MATIC https://en.wikipedia.org/wiki/ARITH-MATIC. Its 'syntax' resembles an assembler for this machine, but greatly simplified.
But the next language, AT-3, known as MATH-MATIC https://en.wikipedia.org/wiki/MATH-MATIC, uses text-based programs and math notation for expressions.
For example:


``````
X1 = (7*10^3*Y*A*SIN ALPHA)^3 / (B POW D+C POW E) .
``````
(In the original code on the wiki, ^3 is a single symbol.) As you can see, the names didn't lie.
BUT
In the next language, B-0, known as FLOW-MATIC https://en.wikipedia.org/wiki/FLOW-MATIC, the math was gone!
Let's quote Grace Hopper herself:
I used to be a mathematics professor. At that time I found there were a certain number of students who could not learn mathematics. I then was charged with the job of making it easy for businessmen to use our computers. I found it was not a question of whether they could learn mathematics or not, but whether they would. […] They said, ‘Throw those symbols out — I do not know what they mean, I have not time to learn symbols.’ I suggest a reply to those who would like data processing people to use mathematical symbols that they make them first attempt to teach those symbols to vice-presidents or a colonel or admiral. I assure you that I tried it.
And 'B' stands for 'business'.
B-0 had a great influence on COBOL (COmmon Business-Oriented Language), and all these languages use 'natural-language-oriented' expressions for calculations, like this:


``````
ADD a, b TO c
``````
ABAP, the programming language of SAP R/3, has a similar syntax:


``````
ADD TAX TO PRICE.
``````
I suppose education in those times was far from modern reality. An educated bookkeeper, for example, was taught to calculate X percent of a value Y (for a tax form) step by step, without abstract knowledge of symbolic equations and operations on them, as in z = y * x / 100;
Another quote from Grace Hopper:
Manipulating symbols was fine for mathematicians but it was no good for data processors who were not symbol manipulators. Very few people are really symbol manipulators. If they are they become professional mathematicians, not data processors. It's much easier for most people to write an English statement than it is to use symbols. So I decided data processors ought to be able to write their programs in English, and the computers would translate them into machine code. That was the beginning of COBOL...
8<-------------------- (end of old post)

And you know what? Assembly languages are suspiciously 'human-readable, non-math' things.
Look at all these 'MOVE/LOAD/SAVE', 'ADD/SUB' and so on!
(And my point in topic mentioned above was:) Why not:


``````
eax = Data1        ; mov
eax += ebx         ; add
Data2 -c= eax      ; sub with carry
ecx ^= edx         ; xor (eor)
pc = Label1 if ZF  ; conditional jump
``````
Why have the creators of assemblers always stayed so far from math notation?

aa-dav
Posts: 91
Joined: Tue Apr 14, 2020 9:45 pm
Location: Russia

### Re: Math vs Language in programming

I found some old i8086 assembler code (TASM):


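``````
        mov ax, [WORD PTR zspStr + 2]   ; segment half of the string pointer
        mov ds, ax
        mov si, [WORD PTR zspStr]       ; offset half of the string pointer
        mov ah, 2                       ; DOS function 02h: print the character in DL
@@Loop:
        mov dl, [si]                    ; fetch the next character
        and dl, dl                      ; sets ZF on the terminating zero
        jz short @@Exit
        int 21h
        inc si
        jmp short @@Loop
@@Exit:
        ret
``````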
and rewrote it in "math notation":


``````
        ax = [WORD PTR zspStr + 2]
        ds = ax
        si = [WORD PTR zspStr]
        ah = 2
@@Loop:
        dl = [si]
        dl &= dl
        ip +s= @@Exit ifzf
        int 21h
        si++
        ip +s= @@Loop
@@Exit:
        ret
``````
It looks cryptic at first glance, but I suppose that's just a temporary effect.
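
To show how mechanical this rewrite is, here is a minimal Python sketch of a line-by-line translator. It is only an illustration: the function name `to_math_notation` and the lookup tables are made up for this example, it covers just the handful of mnemonics used in this thread, and writing a non-short jump as `ip = target` is an assumption rather than something specified above.

```python
# Hypothetical mnemonic tables; only instructions that appear in this thread.
BINARY = {"mov": "=", "add": "+=", "sub": "-=", "sbb": "-c=", "and": "&=", "xor": "^="}
UNARY = {"inc": "++"}
JUMPS = {"jmp": "", "jz": " ifzf"}  # suffix appended after the jump target


def to_math_notation(line):
    """Rewrite one assembly line; anything unrecognised is returned unchanged."""
    text = line.strip()
    mnemonic, _, rest = text.partition(" ")
    mnemonic = mnemonic.lower()
    rest = rest.strip()
    operands = [part.strip() for part in rest.split(",")]

    if mnemonic in BINARY and len(operands) == 2:
        # "mov ax, bx" becomes "ax = bx"
        return f"{operands[0]} {BINARY[mnemonic]} {operands[1]}"
    if mnemonic in UNARY and rest:
        # "inc si" becomes "si++"
        return f"{rest}{UNARY[mnemonic]}"
    if mnemonic in JUMPS and rest:
        target, assign = rest, "="
        if target.startswith("short "):
            # Short jumps are relative, written here as "+s=" like in the post.
            assign = "+s="
            target = target[len("short "):].strip()
        return f"ip {assign} {target}{JUMPS[mnemonic]}"
    # Labels, "int 21h", "ret" and everything else pass through untouched.
    return text
```

Feeding the TASM listing through this function line by line should give the same instructions as the hand-written version, apart from indentation, which the sketch does not preserve.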

rox_midge
Posts: 88
Joined: Mon Sep 19, 2005 11:51 am

### Re: Math vs Language in programming

aa-dav wrote:
Thu Jul 30, 2020 6:14 pm
Why have the creators of assemblers always stayed so far from math notation?
The glib answer is "because anything more complicated would turn into C." Sure, you could add some syntactic sugar for assignments, but just like real sugar, you wind up wanting some more. You could rework your example like so:


``````
        ax = *(zspStr+2);
        ds = ax;
        si = *(zspStr);
        ah = 2;
@@Loop:
        dl = *si;
        dl &= dl;
        if (z) goto @@Exit;
        int 21h;
        si++;
        goto @@Loop;
@@Exit:
        return;
``````
But this is getting really close to C. There are some high-level assemblers that do take this approach - NESHLA comes to mind - but in general, if you want to reason about the code at a higher level, you'd use a higher level language.

lidnariq
Posts: 9500
Joined: Sun Apr 13, 2008 11:12 am
Location: Seattle

### Re: Math vs Language in programming

Many, many years ago, Tran (Thomas Pytel) released the source code for Timeless, but it was raw asm, and someone else did a fairly literal translation into C. It looks awfully familiar.

strat
Posts: 364
Joined: Mon Apr 07, 2008 6:08 pm
Location: Missouri

### Re: Math vs Language in programming

Fun fact: a programmer who worked on the N64 version of Resident Evil 2 remarked that Capcom's original C code for the PS1 looked a lot like assembly (I'd link the article, but it's mostly about how he got FMV running on the N64).

tokumaru
Posts: 11755
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

### Re: Math vs Language in programming

Sega had tools to convert 68000 assembly to C, which they used when porting Genesis games to the PC (Sonic CD, Sonic 3 & Knuckles, etc.). Maybe Capcom was doing something similar.

Oziphantom
Posts: 859
Joined: Tue Feb 07, 2017 2:03 am

### Re: Math vs Language in programming

Probably because it was their first C project. I remember when I first converted to C, I coded in a very asm-like style, mainly because I was used to caring about how the code was generated and the C compilers sucked. On top of that, this was the compiler for the MIPS3000, which was a very obscure system, and the toolchain was shipped rushed and buggy. These days I accept the compiler will mostly beat me. In 1996 I would smash it in every aspect of every task without thinking. So to get performance you needed to sort of code in asm. I'm sure if we looked at the CC65 code most people here are writing, it would be more asm-like, as you are trying to game the compiler into generating what you want. Also, the slow performance and very tight memory limits of the PS1 mean you need to pull every trick you can. 2MB RAM might sound like a lot, but you have to remember it doesn't have a ROM cart, so it has to hold all logic, decode buffers and your ELF, and you really want to get your code into the 4K of cache.

strat
Posts: 364
Joined: Mon Apr 07, 2008 6:08 pm
Location: Missouri

### Re: Math vs Language in programming

Oziphantom wrote:
Fri Jul 31, 2020 11:59 pm
2MB RAM might sound like a lot, but you have to remember it doesn't have a ROM cart, so it has to hold all logic, decode buffers and your ELF, and you really want to get your code into the 4K of cache.
A recent YT review of the original Bonk series on Turbografx pointed out Bonk 3 on CD has fewer frames of animation for Bonk himself than the HuCard version. Apparently whoever did the CD version found there wasn't enough room for all the engine code and assets at the same time.

turboxray
Posts: 82
Joined: Thu Oct 31, 2019 12:56 am

### Re: Math vs Language in programming

I wouldn't necessarily consider something like "r0 = r1 & \$1ff" to be closer to math syntax, simply because "=" is a glaring difference. The "=" symbol is assignment here, not equality or true equivalence like in math. I think "r1 & \$1ff => r0" or "r0 := r1 & \$1ff" would be closer to math syntax.

Anyway, that notation is pretty much "three address code" in compiler design (using quadruples). It's basically the intermediate code/syntax right before it's converted to assembly.
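
For anyone who hasn't met the term, here is a minimal Python sketch of flattening an expression into three-address quadruples. It is an illustration under assumptions, not how any particular compiler does it: the function name `three_address`, the `(op, arg1, arg2, result)` tuple layout and the `t0, t1, ...` temporary names are chosen for this example, and it borrows Python's own parser via the standard `ast` module, so the input has to use Python expression syntax.

```python
import ast
import itertools

# Operators this sketch understands, mapped to their printed form.
OPERATORS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*",
             ast.BitAnd: "&", ast.BitOr: "|", ast.BitXor: "^"}


def three_address(expression, target):
    """Flatten an infix expression into (op, arg1, arg2, result) quadruples."""
    temporaries = (f"t{n}" for n in itertools.count())
    code = []

    def emit(node):
        # Leaves are used directly; every binary operation gets a fresh temporary.
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
            left = emit(node.left)
            right = emit(node.right)
            result = next(temporaries)
            code.append((OPERATORS[type(node.op)], left, right, result))
            return result
        raise ValueError("unsupported expression")

    value = emit(ast.parse(expression, mode="eval").body)
    code.append(("=", value, None, target))  # final copy into the destination
    return code


for quad in three_address("(r1 & mask) + r2", "r0"):
    print(quad)
```

Each quadruple has at most one operator and three names, which is why the result reads so much like the "math notation" assembly earlier in the thread.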

aa-dav
Posts: 91
Joined: Tue Apr 14, 2020 9:45 pm
Location: Russia

### Re: Math vs Language in programming

I think "math" notation is shorter, cleaner and avoids ambiguities like "move from vs move to", which was the rocket engine of the previous topic.
We were taught the symbols of math operations well and don't have problems reading them, unlike the generals and bookkeepers Grace Hopper dealt with.

So I think a 'C-like' assembler is a good thing. And I ran an experiment: a virtual machine architecture with a C-like machine instruction style.
That is, every instruction has the form:
A ?= B @cc
where A and B are operands, @cc is an (optional) condition code and ? is the operation code.
The discussion above gave me an interesting idea: replace @cc with a prefix if (cc), which would be more readable!
Every instruction takes B and possibly A (for a two-operand instruction), loads them into the ALU together with the operation code, waits for the result and saves it (back) to A.
All registers and memory are 16 bits wide.
There are eight registers, R0-R7.
R5 has the synonym SP and points to the stack. Indirect reading increments it and indirect writing decrements it.
R6 has the synonym PC (program counter) and, at the start of execution of the current instruction, points to the next word. Indirect reading increments it (this is how immediate data is implemented).
R7 has the synonym FLAGS and cannot be read or written indirectly.
There is one instruction format only:

SRC/DST - register codes.
DI/SI - indirection flags (the operand is the memory location whose address is in the register). The special case 1111 for SI/SRC or DI/DST (that is, indirect R7) is treated as an immediate address for the argument (that is, an immediate word is read and used as the address of the argument).
COND - condition
CMD - ALU instruction code
TO - two-operand instruction flag. It causes operand A to be fetched and fed to the ALU along with B. The same CMD with a different TO can do different things, so the number of different instructions is 32.

That is, B can be one of: imm16, [ addr16 ], Rx, [ Rx ]
A can be one of: [ addr16 ], Rx, [ Rx ]

I wrote a simple emulator and assembler for this machine and implemented the following instructions:
ONE OPERAND:
= - copy
=? - copy with ZF and SF update
=+1 - increment
=-1 - decrement
=+2 - increment by 2
=-2 - decrement by 2
TWO OPERAND:
+=, +c= - add and add with carry
-=, -c= - the same with sub
<?> - compare
&=, |=, ^= - and/or/xor

Condition codes are the usual ones: @z, @nz, @c, @nc and so on.
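
The execution model described above can be sketched in a few lines of Python. This is not the emulator mentioned in this post, just a minimal illustration under several assumptions: instructions are given already decoded as `(dst, op, src, cond)` tuples (the binary format is not modelled), PC counts instructions instead of words, the flags live in a dict instead of R7, the auto-increment/decrement side effects of SP and PC are left out, only some of the listed operations are included, and which operations update which flags is a guess. All names here (`run`, `read`, `write`, `string_len`) are made up for the example.

```python
MASK = 0xFFFF   # registers and memory cells are 16 bits wide
PC = 6          # R6 is the program counter

# Two-operand operations (A ?= B) and one-operand operations (A = f(B)).
BINARY = {"+=": lambda a, b: a + b, "-=": lambda a, b: a - b,
          "&=": lambda a, b: a & b, "|=": lambda a, b: a | b,
          "^=": lambda a, b: a ^ b}
UNARY = {"=": lambda b: b, "=?": lambda b: b,
         "=+1": lambda b: b + 1, "=-1": lambda b: b - 1}
# Condition name -> (flag, value the flag must have)
CONDITIONS = {"z": ("z", True), "nz": ("z", False),
              "c": ("c", True), "nc": ("c", False)}


def read(operand, regs, mem):
    kind, value = operand
    if kind == "imm":
        return value & MASK
    if kind == "reg":
        return regs[value]
    if kind == "ind":            # [ Rx ]
        return mem[regs[value]]
    if kind == "abs":            # [ addr16 ]
        return mem[value]
    raise ValueError(f"bad operand kind: {kind}")


def write(operand, result, regs, mem):
    kind, value = operand
    if kind == "reg":
        regs[value] = result
    elif kind == "ind":
        mem[regs[value]] = result
    elif kind == "abs":
        mem[value] = result
    else:
        raise ValueError(f"bad destination kind: {kind}")


def run(program, regs, mem, max_steps=100_000):
    """Execute decoded instructions until PC runs past the end of the program."""
    flags = {"z": False, "c": False}
    regs[PC] = 0
    for _ in range(max_steps):
        if regs[PC] >= len(program):
            return regs
        dst, op, src, cond = program[regs[PC]]
        regs[PC] += 1            # PC already points at the next instruction
        if cond is not None:
            flag, required = CONDITIONS[cond]
            if flags[flag] != required:
                continue         # condition not met: skip the instruction
        b = read(src, regs, mem)
        if op in BINARY:
            raw = BINARY[op](read(dst, regs, mem), b)
            flags["c"] = not (0 <= raw <= MASK)   # assumption: carry or borrow
            flags["z"] = (raw & MASK) == 0        # assumption: ZF updated here too
        else:
            raw = UNARY[op](b)
            if op == "=?":
                flags["z"] = (raw & MASK) == 0
        write(dst, raw & MASK, regs, mem)
    raise RuntimeError("step limit reached")


def string_len(text):
    """Count the characters of a zero-terminated string with a decoded program."""
    r0, r1, r2, pc = ("reg", 0), ("reg", 1), ("reg", 2), ("reg", PC)
    loop, end = 1, 6                        # instruction indices of .loop and .end
    program = [
        (r1, "=", ("imm", 0), None),        #        r1 = 0
        (r2, "=?", ("ind", 0), None),       # .loop  r2 =? [ r0 ]
        (pc, "=", ("imm", end), "z"),       #        pc = .end @z
        (r0, "=+1", r0, None),              #        r0 =+1 r0
        (r1, "=+1", r1, None),              #        r1 =+1 r1
        (pc, "=", ("imm", loop), None),     #        pc = .loop
        (r0, "=", r1, None),                # .end   r0 = r1
    ]
    mem = [0] * 256
    base = 16                               # arbitrary buffer address
    for offset, char in enumerate(text):
        mem[base + offset] = ord(char)
    regs = [0] * 8
    regs[0] = base
    return run(program, regs, mem)[0]
```

Because a jump is just a conditional write to R6, the interpreter needs no separate branch instruction; that is the main thing the `A ?= B @cc` form buys.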

So I tested this machine/assembler, expecting it to be readable and easy to write.
For example:


``````
; string_print
; in: r0 - string buffer
string_print  r1  =? [ r0 ]
              ret @z
              r0  =+1 r0
              [ PORT_CONSOLE ] = r1
              pc  = string_print

; string_len
; in:  r0 - string buffer
; out: r0 - length of the string
string_len    r1  = 0
.loop         r2  =? [ r0 ]
              pc  = .end @z
              r0  =+1 r0
              r1  =+1 r1
              pc  = .loop
.end          r0  = r1
              ret
``````
PORT_CONSOLE is an in/out port mapped to memory.
So, I spent an hour or so testing it and...
This asm didn't bring any revolution for me at all. It was as hard to read as any other asm.
Well... I failed.

So, I think it's no worse than any other asm, but it's not really better.