Compiling and linking theory
- Chris Corbyn
- Breakbeat Nuttzer
- Posts: 13098
- Joined: Wed Mar 24, 2004 7:57 am
- Location: Melbourne, Australia
Compiling and linking theory
Not PHP related...
I'm quite used to using gcc to compile and link source code for me, but in the back of my mind I don't *really* know what's going on under-the-surface. Should I care? Or should I just accept that compiling and linking are required steps to produce executable files from source code without really caring how it works?
I mean, what's the compiler doing when it reads source code and produces a binary file (what do the instructions in that file look like?)? And what exactly is the linker doing when it links in shared objects etc? I don't particularly want to know in any great depth, I just want to have more of an understanding about the processes involved.
Would learning assembler help me, or is that just shifting down to a slightly lower level with the same questions left unanswered? Maybe I need to know just a little about the instructions that are sent to the CPU, such as what a simple if..else looks like at the CPU level?
Anybody here ever written a simple (really simple, not very smart) virtual machine with its own defined set of instructions to demonstrate such a thing?
Re: Compiling and linking theory
Chris Corbyn wrote: I don't *really* know what's going on under-the-surface. Should I care?
Only where it concerns you. I know there are situations where you need to drop out of C or C++ and into assembler for a tight loop that is running slow. Things like that. Maybe one compiler generates faster code than another. That's what I would be concerned with. Don't spin wheels learning things that won't benefit you.
- Chris Corbyn
- Breakbeat Nuttzer
- Posts: 13098
- Joined: Wed Mar 24, 2004 7:57 am
- Location: Melbourne, Australia
Re: Compiling and linking theory
astions wrote: Don't spin wheels learning things that won't benefit you.
I guess I'm hoping that understanding more about what's going on will be beneficial to my writing C code. But maybe not... It's not like I've really delved deep into the guts of the Zend Engine in a quest to find out how PHP works... but I know the general principles for how the interpreter reads source code and what it does with it (PHP is mostly just using yacc to parse the source and run a bunch of callbacks in the Zend Engine).
Executable files are a whole different ballgame though... Apart from the parsing part of the process.
I've always been more of a "behind the scenes" programmer cos that's usually where all the interesting stuff is happening.
Re: Compiling and linking theory
If I could write everything in assembly without it increasing development time by a factor of 1000... I would be all there.. seriously. I'm all into understanding what is going on under the hood and perfecting everything. But.. C and C++ were created for good reason. The compilers save you immense amounts of time by allowing you to write code at a higher level.
-
alex.barylski
- DevNet Evangelist
- Posts: 6267
- Joined: Tue Dec 21, 2004 5:00 pm
- Location: Winnipeg
Re: Compiling and linking theory
Quote: I'm quite used to using gcc to compile and link source code for me, but in the back of my mind I don't *really* know what's going on under-the-surface. Should I care? Or should I just accept that compiling and linking are required steps to produce executable files from source code without really caring how it works?
As an applications developer, it's not really a concern, IMHO -- although learning anything new will rarely kill you.
Quote: what do the instructions in that file look like
Depends on the compiler I suppose, but machine code is a typical result.
Quote: Would learning assembler help me, or is that just shifting down to a slightly lower level with the same questions left unanswered? Maybe I need to know just a little about the instructions that are sent to the CPU, such as what a simple if..else looks like at the CPU level?
Learning assembly never hurts, but won't really help you understand the process much. You would still need to learn the specifics of your particular compiler/linker system(s). The idea behind a linker, historically, was to let a programmer write modular applications: test a function library, compile it into machine language once, and then leave it out of the compilation phase, so compilation only occurred on the files you were actually working on. The compiled object files were then linked into a single unit and the appropriate headers were added for the target platform, such as the Windows PE file format.
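To make the compile/link split a bit more concrete, here is a minimal sketch in C. The function names add() and compute() are made up for illustration, and everything sits in one file only to keep the example short. The point is that the compiler only needs a declaration to translate a call; finding the actual code for the function is left for later.

```c
/* Declaration only: tells the compiler the name and type of add(),
   but not where its machine code lives. */
int add(int a, int b);

/* The compiler can translate this call without having seen add()'s body.
   If add() were defined in another .c file, the object file for this one
   would carry an unresolved reference to the symbol "add", and the linker
   would fill it in when the object files are combined. */
int compute(void)
{
    return add(2, 3);
}

/* Definition. In a multi-file project this could live in its own source
   file, be compiled once, and simply be linked in from then on. */
int add(int a, int b)
{
    return a + b;
}
```

With gcc, `gcc -c file.c` stops after compilation and writes an object file, and running gcc on the resulting .o files performs the link step.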
Machine language doesn't really have "IF" statements; instead you would use conditional jump instructions. Whether a conditional jump actually "jumps" to its label usually depends on a value in a flags register (might be different on today's architectures). It either jumps or falls through, nothing more, nothing less.
There is a bewildering array of jump mnemonics:
JZ = Jump if Zero
JG = Jump if greater than
JNZ = Jump if not Zero
JLE = Jump if less than or equal
JNG = Jump if not greater than
To make matters worse, many mnemonics use the exact same instruction opcode and there are usually instructions for signed and unsigned values.
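As a rough illustration of why signed and unsigned values need separate instructions, here is a small C sketch (the function names are made up for the example). The same 32 bits compare differently depending on how they are interpreted. On x86 the "greater/less" jumps such as JG and JLE are the signed ones, while the unsigned counterparts use "above/below" names such as JA and JB; exactly which instructions a compiler emits for these functions will vary.

```c
#include <stdint.h>

/* Signed comparison: the bits are read as two's-complement values,
   so a pattern of all ones means -1. */
int greater_signed(int32_t a, int32_t b)
{
    return a > b;
}

/* Unsigned comparison: the same bit pattern is read as a non-negative
   value, so all ones means 4294967295. */
int greater_unsigned(uint32_t a, uint32_t b)
{
    return a > b;
}
```

Passing -1 and 1 to the signed version gives "not greater", while the same bit patterns passed to the unsigned version give "greater".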
High level language constructs do not translate into machine/assembly very easily, so I cannot easily give an exact example without firing up VS and viewing the assembly.
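For a simple variable test like:
Code:
if ($a == $b) {
    echo 'Hello World';
}
the assembly would be roughly (pseudo-assembly, not real compiler output):
Code:
mov EAX, $a ; Copy the data at memory offset $a into register EAX
cmp EAX, $b ; Compare the value in register EAX to the value at memory offset $b and set the zero flag if they are equal
je HELLO_WORLD ; Jump to the label if they were equal
ret ; Otherwise fall through and return
HELLO_WORLD:
echo 'Hello World'; ; Placeholder for the code that prints the string
ret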
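Since the thread asks about a really simple virtual machine, here is a minimal sketch of one in C that runs the same compare-and-jump shape as the listing above. The instruction set (OP_LOAD, OP_CMP, OP_JE, OP_PRINT, OP_HALT) is invented for this example and is not modelled on any real CPU; OP_PRINT just counts instead of printing so the result is easy to check.

```c
#include <stddef.h>

/* Hypothetical instruction set, invented for this sketch. */
enum opcode {
    OP_LOAD,   /* acc = arg                          */
    OP_CMP,    /* zero_flag = (acc == arg)           */
    OP_JE,     /* if zero_flag is set, jump to arg   */
    OP_PRINT,  /* stands in for echo 'Hello World'   */
    OP_HALT    /* stop the program                   */
};

struct instr {
    enum opcode op;
    int arg;
};

/* Runs a program and returns how many times OP_PRINT executed. */
int run(const struct instr *prog, size_t len)
{
    int acc = 0;        /* plays the role of EAX */
    int zero_flag = 0;  /* plays the role of the flags register */
    int printed = 0;
    size_t pc = 0;      /* program counter */

    while (pc < len) {
        struct instr in = prog[pc++];  /* fetch, then advance */
        switch (in.op) {
        case OP_LOAD:  acc = in.arg; break;
        case OP_CMP:   zero_flag = (acc == in.arg); break;
        case OP_JE:    if (zero_flag) pc = (size_t)in.arg; break;
        case OP_PRINT: printed++; break;
        case OP_HALT:  return printed;
        }
    }
    return printed;
}

/* The if ($a == $b) test, encoded line for line like the listing. */
int hello_if_equal(int a, int b)
{
    const struct instr prog[] = {
        { OP_LOAD,  a },  /* 0: mov EAX, $a    */
        { OP_CMP,   b },  /* 1: cmp EAX, $b    */
        { OP_JE,    4 },  /* 2: je HELLO_WORLD */
        { OP_HALT,  0 },  /* 3: ret            */
        { OP_PRINT, 0 },  /* 4: HELLO_WORLD:   */
        { OP_HALT,  0 }   /* 5: ret            */
    };
    return run(prog, sizeof prog / sizeof prog[0]);
}
```

The jump does nothing clever: it either overwrites the program counter or leaves it alone, which is the "jumps or falls through" behaviour described above.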
Quote: I know there are situations where you need to drop out of C or C++ and into assembler for a tight loop that is running slow.
Compilers for at least 10 years have been more than capable of optimizing most loops for you. It's very rare that you can actually optimize C/C++ code by hand; if anything you would be best off just using inline assembler or inlining functions called inside a loop -- better yet use a macro.
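For what inlining a function called inside a loop looks like in practice, here is a small sketch in C with made-up names (square, SQUARE, sum_squares_inline, sum_squares_macro). Whether either form is actually faster than a plain function call depends on the compiler and its flags, and an optimizing compiler may well inline a small function like this on its own.

```c
#include <stddef.h>

/* Macro version: expanded textually at each use, so no call is made.
   Note that the argument is evaluated twice. */
#define SQUARE(x) ((x) * (x))

/* Function version: "inline" is a hint that the compiler may paste
   the body into the caller instead of emitting a call. */
static inline int square(int x)
{
    return x * x;
}

/* Sums the squares of n values using the inline function. */
int sum_squares_inline(const int *v, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += square(v[i]);
    return total;
}

/* Same loop using the macro. */
int sum_squares_macro(const int *v, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += SQUARE(v[i]);
    return total;
}
```

Both loops compute the same result; the difference is only in how the squaring step reaches the generated code.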
Quote: I've always been more of a "behind the scenes" programmer cos that's usually where all the interesting stuff is happening
Then it's probably worth reading up on just to satisfy your desire to understand...
-
alex.barylski
- DevNet Evangelist
- Posts: 6267
- Joined: Tue Dec 21, 2004 5:00 pm
- Location: Winnipeg
Re: Compiling and linking theory
In case anyone is interested, I dug up this little article, which has a graphic explaining the whole C/C++ build process:
http://www.tenouk.com/ModuleW.html
- Chris Corbyn
- Breakbeat Nuttzer
- Posts: 13098
- Joined: Wed Mar 24, 2004 7:57 am
- Location: Melbourne, Australia
Re: Compiling and linking theory
Cheers PCSpectra. Insightful.
- Chris Corbyn
- Breakbeat Nuttzer
- Posts: 13098
- Joined: Wed Mar 24, 2004 7:57 am
- Location: Melbourne, Australia
Re: Compiling and linking theory
I've discovered the iTunes U section of the iTunes store. Stanford have a ridiculous amount of free lecture videos along with the lecture slides, assignments etc online. Their CS107 course (Programming Paradigms) covers C, C++, Scheme and Python, discusses the paradigms used in those languages, and also delves into how they are translated into assembly.
Haven't watched much yet but glad I've stumbled upon all their content. Kudos to Stanford!
I didn't realise there are different forms of assembly too.
-
alex.barylski
- DevNet Evangelist
- Posts: 6267
- Joined: Tue Dec 21, 2004 5:00 pm
- Location: Winnipeg
Re: Compiling and linking theory
Quote: I didn't realise there are different forms of assembly too.
I remember playing with TASM (Borland I think) and their assembler supported things like macros and even went as far as to offer OOP...imagine that...using objects in assembly.
Re: Compiling and linking theory
Quote: I didn't realise there are different forms of assembly too.
Assemblers are specific to the CPU the code runs on. The assembler for an 8086 chip is different from an assembler for a 6502 chip (note: these are obsolete, ancient CPU chips, used just as an example). And I know the assembler for my old DEC PDP-8/E wouldn't spit out anything recognizable by a PC.