Introduction to C

From Applied Science

Because this course is an introduction, the algorithms studied don't use the most advanced features of C. Many things aren't covered here, only the most elementary ones needed to understand how a program is executed. Because C notation resembles mathematics and even spoken language, you usually just have to read the code carefully to understand the order of the operations.


A short summary about the C language
  • Keywords. They are reserved words with a fixed and unchangeable meaning. For example: a function or variable cannot be named 'while', because 'while' already has a fixed meaning. The exception is text: we can print the word 'while' on screen because in this case the word is treated as a char sequence, not as a command. The same applies to symbols such as commas and semicolons: their meaning is fixed and they must be used as determined by the language's syntax.
  • The assignment operation. In the very beginning, to read a = 2 as "a is equal to two" is ok, because the variable 'a' has the value two after the assignment. But when we do arithmetic operations, then we have to read the assignment operation properly: "the value two is assigned to the variable a".

    For example:
    a = b + c;

    This is read right to left: first the operation 'b + c' is processed, then the resulting value is assigned to the variable 'a'.
  • Variable declaration. It would be tedious to program while having to memorize memory positions all the time. To provide a more comfortable way of using the computer's memory, the programming language lets us use variable names, which are much easier to understand and memorize than numerical codes. Each variable needs a type, because it would be confusing and wasteful to store every kind of value in the same way. An analogy: it'd be confusing to store liquids in boxes and it'd be a waste to use a very large box to store a very small object.

/* Variable declaration */
int var;
float var2;
double var3;

/* Variable declaration including assignment of some value */
int var = 2;
float var2 = 4.5;

/* Precision loss, the digits of the non integer part are lost */
int var = 2.5;

/* Mixing types. The type with the highest precision dominates the expression. In this example, float has preference over the integer 2. The result of this operation is 2.5 */
float a;
float b = 5;
int c = 2;
a = b/c;

/* Special case of dominance: even though the var "a" is float, the result is going to be 2.0 because the quotient operation is done first and with integers, the assignment operation is done later. */
float a;
int b = 5;
int c = 2;
a = b/c;
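
The cases above can be checked with a small sketch. The function names are made up for this illustration, and the cast (float) is an addition that the snippets above don't use: it is one common way to force a division with a fractional result.

```c
/* Both operands are int, so the division is done with integers first.
   Only afterwards the result is converted to float. */
float int_quotient(int b, int c) {
    float a = b / c;         /* with b = 5 and c = 2: 5 / 2 is 2, stored as 2.0 */
    return a;
}

/* Converting one operand to float makes the whole division a float division. */
float float_quotient(int b, int c) {
    float a = (float)b / c;  /* with b = 5 and c = 2: 5.0 / 2 is 2.5 */
    return a;
}

/* Storing a non-integer value in an int discards the fractional part. */
int truncate_to_int(double x) {
    int var = (int)x;        /* 2.5 becomes 2 */
    return var;
}
```

Called with 5 and 2, int_quotient returns 2.0 while float_quotient returns 2.5, which matches the two "dominance" cases described above.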


  • The comma ','. In the same way we separate the items of a list with commas, variables are separated with commas in a program. Commas are also used to separate the parameters and arguments of a function. The compiler doesn't "see" blank spaces, therefore int a,b,c; declares 3 variables of the int type; the presence or absence of blank spaces before or after the commas is a matter of readability.
  • The semicolon ';'. It's used to end a command (a statement), not a line of text.
Practical example:

a = c + 3
+ d - 2;

Visually the command has two lines, but to the compiler the arithmetic expression above ends only at the semicolon in the second line. A common error is to place a semicolon where there shouldn't be one, which ends a command prematurely.

Example:

/* the semicolon ends the 'if' with an empty command, so 'command' is always executed, whether the conditional is true or false */
if (conditional);
command;


Another example:

/* it's permitted to write this way, but it's much worse for reading */
command_1; command_2; command_3;


  • Parentheses '( )'. In functions and arithmetic expressions, they have the same meaning as in mathematics. In flow control commands, such as if and loops, the conditional must be enclosed by a pair of them. Notice that the compiler doesn't "see" blank spaces: use spaces to make your code easier to read, but leaving them out causes no compilation or execution problem.
  • Conditionals. The commands controlled by an if or a loop are executed only if the conditional is true. In the beginning it's common to confuse the '==' operator (logic equality) with '=' (assignment). Check the C documentation to know all the available operators.

    There is a subtle difference between the assignment operation and the logic equality. Assignments group right to left: in a = b = 2, the value 2 is assigned to 'b' first and then to 'a'. Equality tests group left to right, like most other operators. This rarely causes serious confusion, because it doesn't change the behaviour of the algorithms studied. Check the language's docs to know the operators' precedence and associativity. Notice that a conditional doesn't make sense if placed outside a command: a == 2; alone compiles, but it doesn't produce any effect.
  • Ternary or conditional operator '?:': Some 'if ... else' statements can be rewritten with the conditional operator as follows:

(conditional expression) ? (expression 1) : (expression 2)

Which is read "is the condition true?". If the answer is positive, "expression 1" is evaluated; otherwise, "expression 2" is evaluated. Parentheses aren't required if the expression is straightforward, but when there are other operations (assignment, relational, logic, arithmetic) it's a good idea to use them to avoid confusion or compilation errors.

Side note: every conditional is an expression, but not every expression is a condition for something.
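
As a minimal sketch of the equivalence described above, the two hypothetical functions below return the larger of two values, once with the conditional operator and once with 'if ... else'. The names max_of and max_of_if_else are invented for this example.

```c
/* Returns the larger of two ints using the conditional operator. */
int max_of(int a, int b) {
    return (a > b) ? a : b;
}

/* The same logic written with if ... else, for comparison. */
int max_of_if_else(int a, int b) {
    if (a > b) {
        return a;
    } else {
        return b;
    }
}
```

Both versions give the same result for any pair of values; the conditional operator is just shorter when each branch is a single expression.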

  • The curly brackets pair '{ }'. They enclose blocks of commands. They are required in functions and in flow control structures that have more than one command to execute. For example:

if (conditional) {
command_1;
command_2;
}


Another example:

for (counter; expression; increment) if (conditional) {
command_1;
command_2;
}

In this case, the 'for' didn't require brackets because there is just one command nested in it, which is the 'if'. The two commands, 1 and 2, are nested directly under the 'if', not under the 'for'. That's why the brackets are part of the 'if' block, not the 'for' block. Because the 'if' keeps its own brackets, it can be written on the same line as the 'for' command, as done here.

  • The '#' (hash) symbol. At the top of the program lie the #include <library.h> and #define directives. Functions such as printf() and scanf() are declared in header files. Without including the header we cannot properly use the functions of that library, unless we define those functions in our own program. Header files are meant to organize groups of functions. Not every function is going to be used, that's why there isn't just one header for all. Functions declared in headers are meant to be reused with ease. Because the algorithms studied in this introduction are simple and no large, complex programs are written, this type of function management is not studied.

#define YES 1
#define NO 0

With these defines we can write YES and NO in place of 1 and 0, respectively, which makes the code easier to read.
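
A short sketch of both directives in use follows. The functions is_even and report are hypothetical, written only for this illustration.

```c
#include <stdio.h>   /* needed because printf() is declared there */

#define YES 1
#define NO  0

/* Returns YES when n is even, NO otherwise. */
int is_even(int n) {
    if (n % 2 == 0) {
        return YES;
    }
    return NO;
}

/* Prints whether n is even or odd. */
void report(int n) {
    if (is_even(n) == YES) {
        printf("%d is even\n", n);
    } else {
        printf("%d is odd\n", n);
    }
}
```

Before compiling, every YES and NO in the code is replaced by 1 and 0, so the program behaves exactly as if the numbers had been typed directly.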

  • Placeholders. In arithmetic expressions '%' means the remainder operation (both operands must be integers, and so is the remainder). In the function printf() it marks the beginning of a placeholder.
Example:

printf("variable's value is %d", variable);

The placeholder '%d' will be replaced by the variable's value when printed. That's how "dynamic" texts are generated: texts whose printed content depends on the variables' values. Check the C documentation to know all the "placeholders" available. A common error is to print a variable's value with the wrong placeholder, which makes the output wrong even though the calculations may have been done correctly; this can bring some serious headaches...
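
The two meanings of '%' can be seen side by side in the sketch below. The function names are invented for this example; %d, %f and %c are standard printf() placeholders for int, double and char values, and ".2" asks for two digits after the decimal point.

```c
#include <stdio.h>

/* '%' as an arithmetic operator: remainder of an integer division. */
int remainder_of(int a, int b) {
    return a % b;
}

/* '%' inside the text of printf(): each placeholder is replaced, in order,
   by the value of the matching argument.
   The function returns what printf() returns: the number of characters printed. */
int print_values(int count, double price, char grade) {
    return printf("count=%d price=%.2f grade=%c\n", count, price, grade);
}
```

For instance, print_values(3, 4.5, 'A') prints the line count=3 price=4.50 grade=A.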

  • Increments and decrements. Some expressions can take advantage of shorthands, such as '++' or '--', to make them shorter. Warning! There cannot be a white space between the two plus signs ('+ +' is wrong), but there can be spaces before or after the pair. Check the C language's docs to know all the shorthands available.
Example:

/* Each pair means the same thing. Shorthand first, followed by the expanded version */
count++;
count = count + 1;

count--;
count = count - 1;

count += a;
count = count + a;

/* be careful with this one! */
count = count + count++;

About the last example: suppose that the variable 'count' has the value 1. When we read the expression, is the calculation 1 + 1 or 1 + 2? If the increment operation is evaluated first, the final result is 3, but it could be 2 if the compiler applied the increment second. In C this expression has undefined behaviour: the language doesn't say in which order the two changes to 'count' happen, so different compilers may give different results. To avoid the problem, we split the expression into two lines, two commands; this way we are sure about which operation is evaluated first. Practical example to differentiate an increment done first from one done second:


int a = 0, b;

while (a != 10) {
b = ++a;
printf("\n%d", b);
}

What's the difference between ++a and a++? In the first case, the loop is going to count from 1 to 10, in the second case, from 0 to 9. In the first, the variable 'a' is incremented first, before the value being assigned to the variable 'b'. In the second, the assignment is done first, before the increment operation. Be careful when tracking that! There are two operations done in sequence and in the same loop's iteration, don't count one iteration and two increments!
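
The difference can also be isolated outside a loop. The sketch below uses hypothetical function names; the last function shows one possible way of splitting the ambiguous expression into two commands, assuming the intended meaning is "add first, increment afterwards".

```c
/* Pre-increment: 'a' is incremented first, then its new value is assigned to 'b'. */
int value_from_pre_increment(int a) {
    int b = ++a;
    return b;
}

/* Post-increment: the old value of 'a' is assigned to 'b', then 'a' is incremented. */
int value_from_post_increment(int a) {
    int b = a++;
    return b;
}

/* Two separate commands leave no doubt about the order of the operations. */
int add_then_increment(int count) {
    count = count + count;   /* with count = 1, this gives 2 */
    count++;                 /* now 3 */
    return count;
}
```

Starting from 0, the first function returns 1 and the second returns 0, which is why the loop above counts from 1 to 10 with ++a and from 0 to 9 with a++.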

  • Arrays and matrices. An array or matrix is declared in the same way as a variable, with an additional pair of brackets for each dimension in use.

int array[10] = {1,2,3,4,5,6,7,8,9,10};
int matrix[2][2] = {{1,2},{3,4}};

/* Wrong! It's forbidden to assign values like this after declaring the array or matrix */
array[10] = {1};

/* if fewer values than positions are listed, the remaining positions are set to zero */
int array[10] = {1,2,3,4,5};

/* Tricky! the array's index can also have increments or decrements operations. Be careful because the increment can also be evaluated before or after the expression in which the array is in. */
array[i++];

Notice that the compiler doesn't distinguish white spaces, therefore array [] or matrix[] [ ] are okay to use. In this introduction, the size of each dimension in a declaration must be an integer constant, not a variable. It doesn't make sense to have half-elements or half-positions in a set. Arrays and matrices of dynamic size, which can vary in size during the program's execution, are not studied in this introduction. It's okay if not all positions of the array are used; waste of memory is nothing to worry about (for now). In the example of assigning values at the matrix's declaration, it's not a rule to have all elements laid out in rows and columns like that; it's just been done to make it easier to read.

A common error is to confuse the element with the position and vice-versa. For example: list[10] = 123; The value 123 is the element, and it is assigned to the position with index 10 of the array named 'list'. Warning! When an array of n elements is declared, the first index is always zero, therefore the last one is always n - 1.
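
A small sketch of indexing from 0 to n - 1 follows. SIZE and the function names are assumptions made for this example only.

```c
#define SIZE 5

/* Fills an array with the squares 0, 1, 4, 9, 16 and adds them up.
   The valid indices go from 0 to SIZE - 1. */
int sum_of_squares(void) {
    int squares[SIZE];
    int i, total = 0;

    for (i = 0; i < SIZE; i++) {
        squares[i] = i * i;          /* the element i * i goes to position i */
    }
    for (i = 0; i < SIZE; i++) {
        total = total + squares[i];
    }
    return total;
}

/* Only the first 5 positions receive a value in the declaration;
   the remaining ones are set to zero. */
int last_of_partly_filled(void) {
    int array[10] = {1, 2, 3, 4, 5};
    return array[9];                 /* index 9 is the last valid position */
}
```

Notice that both loops stop at i < SIZE: squares[SIZE] would already be outside the array.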

  • Characters. Fundamentally, they are the same as simple variables for single values and arrays for sequences of values, the difference is that characters rely on a table where each numerical code matches one character.

char letter = 'a';
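char word[6] = {"house"};

In the same way as arrays we can assign values to a 'string' with a pair of curly brackets. The array needs one position more than the number of letters, because C ends every string with the null character '\0'. Be careful because, as in arrays, the first index is zero and the last is n - 1.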
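
The sketch below illustrates characters as numbers and strings as arrays. It assumes the ASCII table, which is what most systems use, and the function names are hypothetical. strlen() comes from the standard header <string.h> and counts the characters before the terminating '\0'.

```c
#include <string.h>

/* "house" has 5 letters, but the array has 6 positions:
   the last one holds the null character '\0' that ends the string. */
int length_of_house(void) {
    char word[6] = "house";
    return (int)strlen(word);   /* the '\0' is not counted */
}

/* Strings are indexed like arrays: the first letter is at index 0. */
char first_letter_of_house(void) {
    char word[6] = "house";
    return word[0];
}

/* A char is stored as a numerical code; in the ASCII table 'a' is 97. */
int code_of(char letter) {
    return (int)letter;
}
```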

  • Logical expressions. By default, in C, TRUE is associated with any value different than zero, while zero is associated to FALSE. Therefore:

/* conditional is always true */
if (5) command;

/* conditional is always false */
if (0) command;

The comparison operations, such as (a > b) or (a == b), are also associated to TRUE or FALSE. Therefore, a = (1 > 2), 'a' receives the value 0.

Compound comparisons are evaluated from left to right and stop as soon as the result is known (this is called short-circuit evaluation and is guaranteed by the language). In the case of the '||' operator (condition_1 || condition_2), if 1 is true, then 2 isn't evaluated because the condition has already been satisfied. In the case of the '&&' operator (condition_1 && condition_2), if 1 is true, the evaluation continues to 2, because if 2 is false, then the conditional isn't satisfied. If 1 is false, then the conditional fails without having to evaluate 2. In combinations of both (condition_1 || condition_2 && condition_3), '&&' has higher precedence than '||', so the expression is read as condition_1 || (condition_2 && condition_3); use parentheses to make the intended grouping explicit.

It's possible to invert an affirmative test with a negative one. For example:

while (!(a == 2)) command;

Instead of executing while 'a' is equal to 2, the loop is going to execute while 'a' is not equal to 2. That has the same effect as replacing '==' with '!='. This is similar to probability: sometimes it's easier to think about the opposite case.
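
The way compound comparisons stop early can be observed by counting how many conditions are actually evaluated. This is only a sketch: the counter 'calls' and all function names are invented for the illustration.

```c
/* Counts how many times check() has been called. */
static int calls = 0;

/* Stands for a condition: records that it was evaluated and returns its value. */
int check(int value) {
    calls++;
    return value;
}

/* Returns how many of the two conditions were evaluated with '||'. */
int calls_for_or(int first, int second) {
    calls = 0;
    if (check(first) || check(second)) { /* nothing to do */ }
    return calls;
}

/* Returns how many of the two conditions were evaluated with '&&'. */
int calls_for_and(int first, int second) {
    calls = 0;
    if (check(first) && check(second)) { /* nothing to do */ }
    return calls;
}

/* The same three values give different results depending on the grouping. */
int or_grouped_first(int a, int b, int c)  { return (a || b) && c; }
int and_grouped_first(int a, int b, int c) { return a || (b && c); }
```

With the values 1, 0, 0 the first grouping gives 0 and the second gives 1, which is why parentheses are worth writing in mixed expressions.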

  • Declaring and using functions. A function has to be declared much like a variable, because it also occupies some place in the computer's memory. When the course reaches the first program compiled in C, we already make use of 'return 0' and the main function, although we don't know yet what those are. Check the C language's docs to know all the function types available.

/* this is a function definition. The parameters have no rule about the order of each one */
int func (int a) {

/* variable is local, it's invisible to other functions */
int var_local;

/* some operation done here */

/* returns some value */
return some_value;
}

It's by understanding the difference between a local variable and a "global" (quoted because it's in the sense of non local) that we understand why variables can have the same name if they belong to different functions.

/* 'a' is not a function's parameter, it's a local variable */
int func () { int a; }

/* this makes no sense, it confuses defining a function with calling it */
int func (int a = 2) { }

You can't assign a value to a parameter within the function's declaration.

/* this is a function call. The arguments are sorted in the same order as the function's parameters */
func (argument1, argument2, ...);

A function must be declared before being called, otherwise the program won't compile. However, there is a way to define a function after calling it, which is achieved through function prototypes.

/* function prototype, there is no need to give the parameter variable's names here */
int func (int, ...);

/* when the parameter is a pointer or array / matrix, you can omit the name, but don't forget the asterisk! */
int func2 (int *, int *);

/* main function, where the body of your program lies */
int main() { }

/* to declare a function inside another is forbidden */
int main() { int sub_func() { }}

/* declaring the function */
int func (int param1, ...) { }

/* array / matrix as a parameter, notice that first index is always empty */
int func2 (int array[], int matrix[][index_max]) { }

The prototype's purpose is to inform the compiler of the function's return type and its parameters' types. This avoids problems with function calls made with incompatible argument types or an incorrect number of arguments. Another purpose of prototypes is to allow programs to be split across multiple files, but that is not done in the introduction because all algorithms and programs studied are simple enough not to need such a management technique.

Side note: if a function returns a value, it may be used as part of or be the conditional itself. But if a function has nothing to return it cannot be used in the conditional, as "nothing" can neither be associated with TRUE nor FALSE.
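
A minimal sketch of a prototype allowing a call to appear before the definition follows. The functions square, sum_of_squares and is_large are hypothetical names chosen for this example.

```c
/* Prototype: tells the compiler the return type and the parameter type.
   The parameter's name may be omitted here. */
int square(int);

/* This function calls square() before its definition appears in the file.
   That is allowed because of the prototype above. */
int sum_of_squares(int a, int b) {
    return square(a) + square(b);
}

/* Definition of square(), placed after the call. */
int square(int n) {
    return n * n;
}

/* A returned value can be used directly as part of a conditional. */
int is_large(int n) {
    if (square(n) > 100) {
        return 1;
    }
    return 0;
}
```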

  • Pointers and functions. Pointers are declared much like variables, with the addition of an asterisk to denote that it's a pointer and not a regular variable. The 'ampersand' operator denotes "address (of memory) of".

/* syntax to declare pointers */
int *p = &var;
int *p;

/* The first assigns the value 10 not to the pointer, but to the memory location pointed to by the pointer. Be careful! What if the pointer has been declared but has not been assigned a memory address? In this case the value 10 is written to some unknown memory address. The second is "linking" the pointer to some memory address. Be careful! 'p' received the address of 'var', not the value of 'var'! */
*p = 10;
p = &var;

/* this way is wrong */
*p = &var;

/* if p has been declared as a pointer, then the syntax is wrong. p cannot be "converted" to a regular variable */
p = 10;

Pointers are variables with a specific purpose. They require a type like any other variable, but they are not used as a regular variable. In this introduction, pointers are useful mainly in combination with functions. The definition of a pointer is tightly connected to how the computer's memory works. The best way to understand them is to solve problems which require pointers.
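
A classic problem that requires pointers is exchanging the values of two variables inside a function. The sketch below is one common way of doing it; swap_values and write_through_pointer are names invented for this example.

```c
/* Exchanges the values stored at the two addresses received.
   Without pointers, the function would only change its own local copies. */
void swap_values(int *x, int *y) {
    int temp = *x;   /* value stored where x points */
    *x = *y;
    *y = temp;
}

/* Changes a variable indirectly, through a pointer linked to it. */
int write_through_pointer(void) {
    int var = 1;
    int *p = &var;   /* p receives the address of var, not its value */
    *p = 10;         /* 10 is stored in the location pointed to by p, which is var */
    return var;
}
```

A call looks like swap_values(&a, &b); the '&' is needed because the function expects memory addresses, not values.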

Let's see what the syntax of pointers, arrays, matrices and functions looks like when they are all used in combination with each other. Notice that the distinction between local and non local also applies to pointers. The variable's name doesn't need to match the pointer's name:

/* can only receive variable's memory address */
int func (int *p) { }

/* receives a variable's memory address and a value */
int func2 (int *p, int value) { }

/* receives a function's memory address. Before the first parenthesis comes the return type of the pointed function (double is assumed here, only as an example). In the first parenthesis, the pointer. In the second, the pointed function's parameter(s). As in prototypes it's permitted to omit the variable's names. Be careful! The parameter(s) are from the pointed function, not from the func3 function. In this example, func3 has just one parameter, the pointer. */
double func3 (double (*p)(float, float, ...)) { }
int main() {

/* a declaration of a var and an array with 10 positions */
int var, array[10];

/* declaration of a pointer including assignment of the array's memory address' beginning position. To properly understand why, when a function has an array as one of its parameters, we can use both the array's name and the array's memory address to index zero to achieve the same effect, we have to understand pointer's arithmetic. */
int *p = &array[0];

/* the function "func" can only be called with memory addresses as arguments, not with values */
func(&var);

/* this isn't calling the function with the value of the element at index 5, but rather with that element's memory address */
func(&array[5]);

/* wrong! This is passing the pointer p's own memory address to the function. A pointer to int cannot receive the address of another pointer, unless we are dealing with pointers to pointers, which is not the case */
func(&p);

/* the first argument is a memory address, the array's beginning. That's why it only has the array's name. The second is neither the value 5 nor the memory address of it. It's the value which is stored at the array's index number 5 */
func2(array, array[5]);

/* the func3 call is done like any other function call, but in this case, the pointed function's name is the memory address itself, don't use '&' in the same way as with variables */
func3(name);
}
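
The pieces above can be put together in a compilable sketch. All function names here (set_to, sum, apply, add, multiply) are hypothetical, and the pointed functions are assumed to return double and take two floats.

```c
/* Receives a memory address and a value, and stores the value at that address. */
void set_to(int *p, int value) {
    *p = value;
}

/* Array as a parameter: the function receives the address of the first element,
   so the number of elements has to be passed separately. */
int sum(int array[], int n) {
    int i, total = 0;
    for (i = 0; i < n; i++) {
        total = total + array[i];
    }
    return total;
}

/* Two functions with the same return type and parameter types. */
double add(float a, float b)      { return a + b; }
double multiply(float a, float b) { return a * b; }

/* Receives a function's memory address and calls the pointed function. */
double apply(double (*op)(float, float), float a, float b) {
    return op(a, b);
}
```

Calls would look like set_to(&array[1], 20), sum(array, 3) or sum(&array[0], 3), and apply(add, 1.5f, 2.5f). In the last one the function's name alone is already its address, so no '&' is used.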