PostPosted: Tue Apr 09, 2019 1:44 pm 

Joined: Sun Jun 30, 2013 7:59 am
Posts: 41
abridgewater wrote:
Having written macrocode and microcode level emulation for the TI Explorer Lisp Machine, I can say that the situation was a bit more complicated than that...
Wow! I wish I could reply with something interesting, but your post has just given me so much food for thought. I sense there's going to be a few days of lost sleep ahead trying to fully grok many of the concepts you spoke of :oops:

If you have the time, I have a couple of questions:

    * Do you have any opinions on the CONS vs the CADR processor? If I understand correctly they were quite different designs, though (if the naming wasn't telling enough!) both were built to run Lisp.

    * Do you have any particular books to recommend for the layman trying to understand microcode? Or any books relating to this methodology of designing an architecture around the language? I was doing a lot of research into whether hardware can be designed around running functional languages specifically, but most of it was going over my head, sadly.

Cheers! :)

Rahsennor wrote:
I continued working my way up, coding a simple Forth compiler and evolving it into a bare-metal OS (cooperative multitasking!)
Wow! Awesome stuff. Those are the kind of projects I have grandiose visions of achieving and... always end up having to take a step back and go back to researching :lol:

Rahsennor wrote:
It also taught me that complexity is the root of all evil, and that managing it - by minimizing it, structuring it and documenting it - is the key to success as a programmer.
I sort of get the impression that... it's easy to feel like you're reducing complexity, when in fact you're just trading one kind of complexity for another. With assembly you have little-to-no abstraction, so you're having to talk on the level of the machine. On the plus side... well, that's also the plus side! Every instruction has a 1-to-1 correspondence with what the hardware does. Do you sacrifice the "reality" of the situation (what the machine is actually doing) for a sort of idealised language which can better express the concepts of your program directly, but which requires a compiler/interpreter/complicated syntax and all that baggage?

Rahsennor wrote:
Probably not the kind of answer you were looking for, but that's the perspective I got going from low to high instead of high to low.
The more viewpoints, the better! Being a relatively new programmer (around a year or so, I guess), I decided to focus on functional programming almost immediately. I found a great many posts discussing the two learning paths:
* Learn procedural/OOP -> learn functional
* Learn functional -> learn procedural/OOP

The first path is supposedly significantly harder, because procedural/OOP is, for most people, quite a natural way of modelling processes. Functional, on the other hand (for someone without a mathematical background such as myself), can be much less intuitive. If you believe that to be true, it makes sense to tackle the more alien paradigm first, so you don't mould your brain into the more "obvious" mindset and then struggle to break out of it later. I'm not claiming these notions are true, of course, but I must have agreed in some way or I'd be learning different languages entirely -- assembler being the exception.

However, I think assembly is, in its own way, such a head-change from OOP (and even from high-level procedural code, to some extent) that you could make a similar argument. Better to understand what's really going on in the operation of a machine (or at least a relatively simple machine) and then build abstractions on top, than to start with abstractions and struggle to break them down later to get to the core of what's going on.


PostPosted: Tue Apr 09, 2019 5:12 pm 

Joined: Fri Jul 04, 2014 9:31 pm
Posts: 1097
I used to find it very aggravating that programming languages never seemed to have any connection with what was actually going on. When I learned BASIC in high school, it was obvious there was a ton of black magic under the hood that was actively being hidden from me. Same for C++ in college. Also, nobody ever explained how programs work on a low level, so I was learning the logical constructs before understanding the concept of program flow, which was really confusing (my memory is a bit dim that far back, but I aced the course so I must have figured it out at some point).

When I learned Matlab, I had to suppress my disgust at the totally artificial "environment" paradigm and just use it, and before long I actually liked it - Matlab isn't so much a programming language as an engineering tool, and if you take it for what it is it's actually quite nice. When I went back to C++, I didn't care any more because I had tasks I needed to accomplish, so understanding the syntax was more important than understanding exactly what was going on.

Mind you, C++ syntax is really opaque to somebody who doesn't know exactly what's going on...

Then I got the idea in my head of programming a SNES game, and started learning 65816 assembly (and later Super FX and SPC700 assembly). It was an amazing breath of fresh air. For the first time, I wasn't beholden to somebody else's idea of what constituted a useful abstraction; everything I was trying to learn was real. Now I know what a bus is, I know what a program counter is, I know what a stack is, I know what an interrupt is, I know what two's complement is. I still have no idea what a heap is... or actually how operating systems work in general, although I'm sure I could learn it more easily now than I could have before.

I've also learned a lot about programming best practices. Ironically a good bit of this is learning more about how powerful an optimizing compiler can be, which is something it would have been really nice to know 10 years ago when I was writing some of the most obscure, least DRY code imaginable because I thought it would be faster...
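
A tiny illustration of that (results depend on your compiler and flags, so take it as a sketch rather than gospel): division by a constant is one of the classic things people used to hand-optimize, and GCC or Clang at -O2 will typically turn it into a "magic number" multiply plus shifts all by themselves.

Code:
/* Hypothetical example - let the compiler do the clever part.
 * Build with `gcc -O2 -S div10.c` and read the generated assembly:
 * mainstream compilers usually emit a reciprocal multiply and shifts
 * here rather than a divide instruction, so hand-rolled shift tricks
 * mostly just make the source harder to read. */
unsigned div10(unsigned x)
{
    return x / 10;
}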


PostPosted: Tue Apr 09, 2019 7:14 pm 

Joined: Thu Aug 20, 2015 3:09 am
Posts: 471
CrowleyBluegrass wrote:
Wow! Awesome stuff. Those are the kind of projects I have grandiose visions of achieving and... always end up having to take a step back and go back to researching :lol:

The great thing about Forth is that - when applied correctly - it makes everything so simple. Writing that Forth OS was the most eye-opening thing I've ever done as a programmer. Things I'd been struggling to figure out for years just suddenly happened, in as little as three lines of code.

I highly recommend checking out Forth, and the (freely downloadable) book Thinking Forth, even if you never use the language again. Like the title of the book implies, it's more a way of thinking than a programming language, and a useful addition to anyone's toolbox.

CrowleyBluegrass wrote:
I sort of get the impression that... it's easy to feel like you're reducing complexity, when in fact you're just trading one kind of complexity for another.

And you'd be right! That's what separates the good programmers from the great ones, if you ask me.

CrowleyBluegrass wrote:
Do you sacrifice the "reality" of the situation (what the machine is actually doing) for a sort of idealised language which can better express the concepts of your program directly, but which requires a compiler/interpreter/complicated syntax and all that baggage?

That's the trade-off you have to make. The key point to remember is that it is a tradeoff - sometimes abstraction is a good thing. There are a lot of solved problems in computer science. Why not let the computer take care of them for you, so you can focus on the bits that aren't solved?

Even assembly language is an abstraction, when you think about it. Memorize the instruction set, and you can code in binary - is that worth the extra effort?
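
To make that concrete (purely illustrative - the bytes below are hand-assembled 6502 machine code): the assembler is "only" mapping mnemonics onto an opcode table, so in principle you could type the bytes in yourself.

Code:
#include <stdio.h>

/* Illustration only: the exact bytes a 6502 assembler emits for a
 * two-instruction snippet.  LDA immediate is opcode $A9; STA absolute
 * is $8D followed by a little-endian address. */
static const unsigned char snippet[] = {
    0xA9, 0x00,        /* LDA #$00   ; load the accumulator with zero       */
    0x8D, 0x00, 0x02,  /* STA $0200  ; store it at absolute address $0200   */
};

int main(void)
{
    for (size_t i = 0; i < sizeof snippet; i++)
        printf("%02X ", snippet[i]);  /* what you'd type into a machine-code monitor */
    putchar('\n');
    return 0;
}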

CrowleyBluegrass wrote:
Being a relatively new programmer (around a year or so, I guess), I decided to focus on functional programming almost immediately. I found a great many posts discussing the two learning paths:
* Learn procedural/OOP -> learn functional
* Learn functional -> learn procedural/OOP

This is a tangent, but please be careful not to conflate object-oriented programming and procedural programming. They're two distinct concepts - you can be object-oriented and functional, for example. Don't get me wrong, OOP is a great tool - but so is a hammer. Swing either of them too long and everything starts to look like a nail.

CrowleyBluegrass wrote:
The first path is supposedly significantly harder, because procedural/OOP is, for most people, quite a natural way of modelling processes. Functional, on the other hand (for someone without a mathematical background such as myself), can be much less intuitive.

I don't agree with that. I think functional programming languages have a reputation for being alien and mathematical because most of them are designed by and for mathematicians, who seem to take pride in being incomprehensible. People associate OOP with procedural programming precisely because it's such a natural way to think about procedural programming. But there's an excellent metaphor for functional programming too: plumbing! POSIX shell pipelines are the obvious example, but there are a few others, along with borderline examples like APL.

There aren't nearly enough of them, though; it's a sorely underrated programming paradigm.

CrowleyBluegrass wrote:
Better to understand what's really going on in the operation of a machine (or at least a relatively simple machine) and then build abstractions on top, than to start with abstractions and struggle to break them down later to get to the core of what's going on.

That's the way I learned, and I've never regretted it. :beer: :D

(Wow, that ended up way longer than I thought. I'll stop cluttering up your thread now, haha.)


PostPosted: Wed Apr 10, 2019 9:20 am 

Joined: Fri Nov 24, 2017 2:40 pm
Posts: 170
One word of caution about trying to learn from assembly for 30+ year old CPUs is that what defines performance has radically changed on modern CPUs. Things that used to be expensive (floating point ops, a lot of math functions, etc) can on average run in a single CPU cycle or two now. On the other hand, things that used to be really cheap like RAM access or conditionals may not be. Accessing RAM that isn't in the cache can cost 100x more cycles than a square root, and a mispredicted conditional can cost dozens.
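
To make the memory point concrete, here's a contrived little C sketch (numbers vary wildly from machine to machine, so treat it as an illustration rather than a benchmark): it sums the same 64 MiB of ints twice, once front to back and once in a scattered order. The arithmetic is identical; the access pattern is what you pay for.

Code:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N ((size_t)16 * 1024 * 1024)  /* 16M ints = 64 MiB, far larger than any cache */

int main(void)
{
    int *data = malloc(N * sizeof *data);
    if (!data) return 1;
    for (size_t i = 0; i < N; i++) data[i] = (int)i;

    long long sum = 0;
    clock_t t0 = clock();
    for (size_t i = 0; i < N; i++)
        sum += data[i];                    /* sequential: the prefetcher keeps up */
    clock_t t1 = clock();
    for (size_t i = 0; i < N; i++)
        sum += data[(i * 16777259u) % N];  /* scattered: an odd stride modulo a power of
                                              two still visits every element exactly once,
                                              but almost every access misses the cache */
    clock_t t2 = clock();

    printf("sequential %.3fs, scattered %.3fs (sum=%lld)\n",
           (double)(t1 - t0) / CLOCKS_PER_SEC,
           (double)(t2 - t1) / CLOCKS_PER_SEC, sum);
    free(data);
    return 0;
}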

As Koitsu pointed out, modern x86 CPUs don't actually run x86 instructions anymore either. They are a hardware based interpreter for x86 bytecode, and translate it into something else (purportedly much more RISC like). The "real" CPU has more registers than the instruction set exposes. So when you cleverly zero a register by XORing it with itself, the CPU just points it at a zeroed register from the pool. Some CPUs have a dedicated "zero register" that the external register gets bound to. They can analyze, reorder, and run multiple instructions at the same time. They can even speculatively start running instructions after a conditional by assuming which branch it will take and throwing away the results if guessed wrong. Most of this applies even to the ARM CPU you probably have in your pocket.

The downside of all of this is that it's harder to make assumptions about performance by looking at the assembly. You can't simply count the cycles anymore. Memory access patterns and such have a huge effect, and it can be impossible to know how it's organized just by looking at an algorithm that processes data. The way the data is constructed often matters more.
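
And a rough sketch of "the way the data is constructed matters" (again, exact numbers will vary): the same sum-one-field loop, with the records laid out two different ways. With an array of structs the loop drags a whole 32-byte record through the cache to use 4 bytes of it; with the fields split out into plain arrays, every byte fetched is useful.

Code:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define COUNT ((size_t)1 << 22)   /* ~4 million records (the AoS copy is 128 MiB) */

/* Array-of-structs layout: 32 bytes per record, but the loop only wants x. */
struct particle { float x, y, z, mass, vx, vy, vz, pad; };

int main(void)
{
    struct particle *aos = malloc(COUNT * sizeof *aos);   /* array of structs  */
    float *soa_x = malloc(COUNT * sizeof *soa_x);         /* just the x values */
    if (!aos || !soa_x) return 1;
    for (size_t i = 0; i < COUNT; i++) { aos[i].x = (float)i; soa_x[i] = (float)i; }

    float s1 = 0.0f, s2 = 0.0f;
    clock_t t0 = clock();
    for (size_t i = 0; i < COUNT; i++) s1 += aos[i].x;   /* 32 bytes fetched per 4 used */
    clock_t t1 = clock();
    for (size_t i = 0; i < COUNT; i++) s2 += soa_x[i];   /* contiguous, cache-friendly  */
    clock_t t2 = clock();

    printf("AoS %.3fs, SoA %.3fs (sums %g %g)\n",
           (double)(t1 - t0) / CLOCKS_PER_SEC,
           (double)(t2 - t1) / CLOCKS_PER_SEC, s1, s2);
    free(aos); free(soa_x);
    return 0;
}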


PostPosted: Wed Apr 10, 2019 10:35 am 

Joined: Sun Jun 30, 2013 7:59 am
Posts: 41
slembcke wrote:
One word of caution about trying to learn from assembly for 30+ year old CPUs is that what defines performance has radically changed on modern CPUs. [...]
Duly noted. I touched on that a little when discussing what "close to the metal" actually means, compared to what it's often implied to mean. I suppose most people asking this type of question would be talking in terms of speed or efficiency and how that could perhaps be applied to tweaking modern-day assembly. I meant it in a more general sense: things like the relative simplicity and directness, or the general management of projects, and how that may have influenced work in other paradigms, modern-architecture or otherwise. I don't do much tweaking of modern assembly output, so in truth I'm rather oblivious when it comes to that subject, other than being aware of how different it is from a 6502.


PostPosted: Wed Apr 10, 2019 2:08 pm 

Joined: Fri Jul 04, 2014 9:31 pm
Posts: 1097
I think knowing how a 30-year-old CPU works is much better than having no idea how any CPU works. All you really need to do is read a description of how modern hardware is different, like the one above, and all the differences make sense - the designers are trying to get around various bottlenecks that limited older hardware, and the methods they use are easy to understand for someone familiar with that older hardware. It's even better if you've read a description of intermediate designs like the Pentium and Pentium II, so you get a sense for how the modern stuff evolved.

If a modern CPU were just an upclocked 6502 with wider buses, it wouldn't be able to run any faster than about 50 MHz, because a truly random access in modern RAM still takes on the order of 10-20 ns (if I've understood CAS latency correctly) and a 6502 does a memory access on every single clock cycle. The big gains in performance are architectural.

I do think it's kinda goofy that modern mainline CPUs are basically hardware emulators, but I guess it's faster this way if you start with backward compatibility as a requirement in every design since the 8086...


PostPosted: Wed Apr 10, 2019 6:49 pm 

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 4213
Location: A world gone mad
slembcke wrote:
One word of caution about trying to learn from assembly for 30+ year old CPUs is that what defines performance has radically changed on modern CPUs. Things that used to be expensive (floating point ops, a lot of math functions, etc) can on average run in a single CPU cycle or two now.

Surely you jest (see: Latency column, and that column's description).


PostPosted: Wed Apr 10, 2019 7:16 pm 

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 21676
Location: NE Indiana, USA (NTSC)
FMUL happens in 5 cycles on a Silvermont/Airmont (such as the Pentium N3710 in this very laptop). FDIV or FSQRT is slower at roughly 40 cycles. But that's still a lot faster than 150 cycles for 8x8 multiplication on a 6502 without a quarter square table.
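
For anyone who hasn't met the quarter-square trick: it rewrites a*b as floor((a+b)^2/4) - floor((a-b)^2/4), so an 8x8 multiply becomes essentially two table lookups and a subtraction. A rough C sketch of the idea (the real thing would of course be 6502 assembly with the table sitting in memory):

Code:
/* Sketch of quarter-square multiplication: a*b == f(a+b) - f(|a-b|),
 * where f(n) = floor(n*n/4).  The difference is exact because a+b and
 * a-b always have the same parity.  This just verifies the identity. */
#include <stdio.h>

static unsigned qsq[511];                 /* f(n) for n = 0..510 (255 + 255) */

static unsigned mul8(unsigned char a, unsigned char b)
{
    unsigned diff = a >= b ? a - b : b - a;
    return qsq[a + b] - qsq[diff];        /* two lookups and a subtraction */
}

int main(void)
{
    for (unsigned n = 0; n < 511; n++)
        qsq[n] = (n * n) / 4;             /* integer division = floor */

    for (unsigned a = 0; a < 256; a++)    /* sanity-check every 8x8 product */
        for (unsigned b = 0; b < 256; b++)
            if (mul8((unsigned char)a, (unsigned char)b) != a * b) {
                printf("mismatch at %u*%u\n", a, b);
                return 1;
            }
    printf("quarter-square table reproduces all 8x8 products\n");
    return 0;
}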

_________________
Pin Eight | Twitter | GitHub | Patreon


PostPosted: Thu Apr 11, 2019 9:47 am 

Joined: Fri Nov 24, 2017 2:40 pm
Posts: 170
Yeah, but that's the latency of each instruction as if it were run in isolation in the pipeline. Generally speaking, the instruction scheduler and compilers are pretty good at hiding that latency. A pretty easy contrived example is to make an array of a million values, run square root on all of them and compare against the cycle count register. On my i7 machines it averages a couple of cycles per loop. Heavily pipelined, superscalar magic! In more practical code the CPU might not be able to hide the latency quite as well, but it can often be fixed by reworking your math slightly to avoid dependencies on the results of expensive instructions.

Anyway, that's sort of what I'm getting at though. You can't just look at the cycle counts and conclude that chart is how long an instruction takes to actually run in a live program. It's way more complicated than that.

edit: Correction, I'm misremembering this a little. The average is less than two cycles per float for vectorized loops on my 1.7 GHz i7 ultrabook. For non-vectorized loops it takes ~4 cycles.

Specifically, I ran this and got the following numbers. I started writing an article (which I never finished) about old optimizations no longer being viable, after being involved in a conversation where somebody suggested replacing sqrt() calls with the old Quake rsqrt() approximation and somebody else chimed in that a lookup table would be even faster. My response was: "Holy crap! No! Regular square roots are faster AND more accurate in 201X."

Code:
bzero(): 0.10 cycles/byte
memcpy(): 0.33 cycles/byte
Copy every 16th byte: 0.44 cycles/byte
Copy every 32nd byte: 0.44 cycles/byte
Copy every 64th byte: 0.47 cycles/byte
Copy every 128th byte: 0.41 cycles/byte
Copy every 256th byte: 0.24 cycles/byte
Copy bytes scrambled: 83.57 cycles/byte
x + 1: 1.68 cycles/loop
1 / x: 1.79 cycles/loop
sqrt(): 4.06 cycles/loop
sqrt() vec: 1.67 cycles/loop
q_sqrt(): 7.17 cycles/loop
table_sqrt(): 75.12 cycles/loop
rsqrt(): 3.54 cycles/loop
q_rsqrt(): 6.71 cycles/loop
log(): 7.39 cycles/loop
exp(): 7.71 cycles/loop
pow(): 8.11 cycles/loop
cos(): 7.31 cycles/loop
acos(): 10.47 cycles/loop
atan2(): 8.43 cycles/loop
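
If anyone wants to try this sort of measurement themselves, the loop is roughly along these lines (a sketch, not the program that produced the numbers above; it assumes x86-64 with a compiler that exposes __rdtsc() via <x86intrin.h>, and note that the TSC counts reference cycles, not necessarily core clock cycles):

Code:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <x86intrin.h>   /* __rdtsc() on GCC/Clang; link with -lm */

#define N (1u << 20)     /* about a million floats */

int main(void)
{
    float *in  = malloc(N * sizeof *in);
    float *out = malloc(N * sizeof *out);
    if (!in || !out) return 1;
    for (unsigned i = 0; i < N; i++) in[i] = (float)(i + 1);

    unsigned long long t0 = __rdtsc();
    for (unsigned i = 0; i < N; i++)
        out[i] = sqrtf(in[i]);   /* independent iterations, so the pipeline
                                    can overlap the latency of each sqrt */
    unsigned long long t1 = __rdtsc();

    printf("sqrtf(): %.2f TSC ticks per element (checksum %f)\n",
           (double)(t1 - t0) / N, (double)out[N - 1]);
    free(in); free(out);
    return 0;
}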


PostPosted: Mon Apr 15, 2019 6:27 am 

Joined: Wed May 19, 2010 6:12 pm
Posts: 2900
That it takes forever.


PostPosted: Fri May 24, 2019 9:38 am 

Joined: Sat Feb 16, 2013 11:52 am
Posts: 333
Rahsennor wrote:

(There was even a pinout of a Zilog Z80 in the back! Remember when user-level computer books/manuals had actual pinouts in them? Imagine trying to do that now.)


You need to look harder, but they're out there somewhere. My 2015 laptop, which I still use to this day, has half of its user manual dedicated to schematics of absolutely everything in it.

[images: schematic pages scanned from the laptop's manual]

What is a VGA NVVDD decoupler circuit? I have no idea, but there are people who do know what it is, so it's great to have those schematics explicitly laid out in the manual that comes in the laptop's box.



PostPosted: Tue May 28, 2019 1:14 am 

Joined: Thu Aug 20, 2015 3:09 am
Posts: 471
I stand impressed. Who made it?


PostPosted: Sat Jun 01, 2019 12:45 pm 

Joined: Sat Feb 16, 2013 11:52 am
Posts: 333
I was complaining earlier in this thread that I wasn't able to apply my ASM expertise to any real world jobs, but actually I just recently got hired as a software developer due to my 6502 stuff and there's a possibility I'll be messing with ARM programming soon. Neat.

Rahsennor wrote:
I stand impressed. Who made it?


Clevo Co. from Taiwan (ROC), aka Sager Computers. I think their stuff serves as OAM machines for quite a few companies, actually.



PostPosted: Sat Jun 01, 2019 3:56 pm 

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 21676
Location: NE Indiana, USA (NTSC)
You know you've been programming classic Nintendo consoles too long when you misspell OEM (original equipment manufacturer) as OAM (object attribute memory).

_________________
Pin Eight | Twitter | GitHub | Patreon

