Behind the scenes of the C compilation process

Alina de los Santos
4 min read · Jun 19, 2021


Every program starts out as source code: files written in a programming language (C, in our case) that humans can read but a processor cannot execute directly. The source code must first be translated into machine language. That is the job of a compiler, a program that takes code written in a particular language and turns it into instructions the processor understands. The examples in this post use the GNU Compiler Collection (GCC), a compiler developed by the GNU Project that supports several programming languages and is distributed for free.

GCC compiler logo and its friendly mascot (yes, its mascot is a gnu ;-) )

In this post we will look at the different stages of the compilation process. But first, how do we compile and run a program? To compile a source file named “test.c”, we would use the command below:

gcc test.c -o test

In this example, the -o option sets the name of the output file, so an executable named “test” is generated. If -o is omitted, the output file is named a.out.

To run the program, type ./ followed by the name of the executable, for example ./test.

As mentioned above, compilation takes place in stages. Each stage takes the output of the previous stage as its input and produces the input for the next one, until the last stage generates an executable. Bear in mind that a failure at any stage stops the process with an error, and no executable is generated. The four stages are:

  • Pre-processing
  • Compilation
  • Assembly
  • Linking

To keep all the intermediate files in your directory alongside the executable, use the command below:

gcc -save-temps test.c -o test

Now let’s see what happens during each of these stages.

Pre-processing

During this first stage the preprocessor expands included files, removes comments, expands macros, and handles conditional compilation. With -save-temps, the output is stored in the file test.i, and it holds a great deal of text. Most importantly, the comments are gone and the line #include <stdio.h> is no longer there: it has been replaced by the contents of the header.

To stop after this stage, use the command below, which prints the preprocessed source to standard output:

gcc -E test.c
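
To watch these transformations on a small scale, here is a minimal sketch. The file name macros.c and every identifier in it (BUFFER_SIZE, SQUARE, VERBOSE, LOG_PREFIX and the three functions) are made up for illustration and are not taken from the original test.c. Running gcc -E macros.c on it should show the comments gone and the macros replaced by their expansions.

```c
/* macros.c: hypothetical file, used only to illustrate pre-processing. */

/* Object-like macro: every later BUFFER_SIZE is replaced by 64. */
#define BUFFER_SIZE 64

/* Function-like macro: the argument is substituted as text,
   hence the parentheses around x and around the whole body. */
#define SQUARE(x) ((x) * (x))

/* Conditional compilation: only one branch survives pre-processing. */
#ifdef VERBOSE
#define LOG_PREFIX "[verbose] "
#else
#define LOG_PREFIX ""
#endif

/* After pre-processing, the statement reads: return ((n + 1) * (n + 1)); */
int square_of_next(int n)
{
    return SQUARE(n + 1);
}

/* After pre-processing, the statement reads: return 64; */
int buffer_size(void)
{
    return BUFFER_SIZE;
}

/* Returns "" by default, or "[verbose] " when VERBOSE is defined. */
const char *log_prefix(void)
{
    return LOG_PREFIX;
}
```

Running gcc -E -DVERBOSE macros.c defines VERBOSE on the command line, so the other branch of the #ifdef is the one that is kept.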

Compilation

The next stage translates the preprocessed code into assembly language and produces the intermediate file test.s, which the assembler takes as its input.

To stop after this stage, use:

gcc -S test.c
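
To get an assembly listing short enough to read line by line, it helps to start from a tiny function. The sketch below is only an illustration: the file name add.c and the function add are hypothetical and not part of the original example. Running gcc -S add.c should write add.s. The exact instructions in it depend on the target architecture, the GCC version and the optimization level.

```c
/* add.c: hypothetical file, used only to illustrate the compilation stage. */

/* Adds two integers. In the generated assembly, look for a label that
   contains the function name, followed by a handful of instructions
   that compute a + b and return the result. */
int add(int a, int b)
{
    return a + b;
}
```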

Assembly

In the third stage the assembler takes test.s as its input and turns it into the object file test.o, which contains machine-level instructions. Calls to functions defined elsewhere, printf() in our case, are not resolved at this stage; only the code in the file itself is converted into machine language.

To stop after this stage, use:

gcc -c test.c
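
The point about unresolved calls can be seen with a file that calls printf() but has no main(). In this sketch the file name greet.c and the function greet are hypothetical. Running gcc -c greet.c should succeed and produce greet.o, even though the code for printf() is not in the file: the object file only records that printf is needed. Trying to build an executable from this file alone (gcc greet.c -o greet) should fail at the linking stage, because no main() is defined.

```c
/* greet.c: hypothetical file, used only to illustrate the assembly stage. */
#include <stdio.h>

/* Prints a greeting and returns the number of characters written.
   stdio.h only declares printf(); its machine code lives in the
   C library and is attached later, by the linker. */
int greet(const char *name)
{
    return printf("Hello, %s!\n", name);
}
```

On systems that provide the nm tool, nm greet.o should list printf as an undefined symbol (marked U), which is what the linker resolves in the last stage.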

Linking

In the final stage, function calls are linked to their definitions. The linker also adds some extra code that runs when the program starts and ends. We can check this by comparing the output of size test.o and size test: the sizes reported for the executable are larger than those of the object file.

Thanks for reading this article. I hope your curiosity has been satisfied!
