2 C++

2.1 Introduction

In the Introduction to Algorithms and Programming (IAP), we covered the basics of programming in C++. This course will build on that foundation. Although newer standards exist (C++20, C++23, C++26), we will use the C++17 standard throughout this course. C++11, C++14, and C++17 introduced several new features to the language and the Standard Template Library (STL). While these features may differ slightly from older versions, this will not significantly impact our course, but they do allow you to utilize some of the newer language features if you wish.

You should review a list of new C++ features to understand what the more modern versions of the language offer. Check the GCC, Clang, and MSDN websites for information on new features and compiler support. Keep in mind the specific language version when looking at code on the Internet, as such code may use features that are not part of the C++17 standard and may fail to compile properly in our environment.

The main aim of this course is to help you understand how to organize data at the lowest level in memory and consider the efficiency implications of different layouts. We will generally use raw pointers and manually handle memory allocation and deallocation. Note that this is not best practice in real-world coding; in those contexts, you should use smart pointers and RAII to manage memory, helping to avoid bugs and prevent leaks.

2.2 Compilation

2.2.1 Command Line

To compile a C++ program from the command line, use g++:

g++ -std=c++11 -o HelloWorld source1.cpp source2.cpp

Explanation

g++: Invoke the GNU C++ compiler.
-std=c++17: Tell g++ to activate all the C++17 language features.
-o HelloWorld: Compile and Link the code to create an executable called: HelloWorld.
source1.cpp source2.cpp: A list of input C++ source files.

2.2.2 Make

Typing the relevant g++ command and listing all your files is time-consuming and error-prone in larger projects. Typically, projects use Makefiles instead. make is a program that reads a Makefile and runs the relevant commands to compile your program. Once you have created your Makefile, to fully compile your program, you just type make in the terminal instead of the entire g++ command every time you need to compile.

Note that the indentation in Makefiles is mandatory and should be performed with a single TAB.

Here is a simple Makefile to automate the example above:

all:
    g++ -std=c++17 -o HelloWorld source1.cpp source2.cpp

Now, whenever you type make in the terminal, it will compile your program as though you had typed the full g++ command each time.

make is actually a very clever program and can automatically detect all the .cpp files in the folder. The code below shows a more complicated Makefile. Don’t be intimidated—all you have to do is change the value of BIN from HelloWorld to the name of the output file you want. The rest will just work, assuming all the .cpp files in the folder should be compiled and linked together. This file compiles each .cpp file into an object file, then links them all together into an executable with the name stored in the BIN variable. Additionally, make can detect which files have changed. So, if you have files that take a long time to compile, only the updated .cpp files will be recompiled, then linked with the old object files. This can significantly speed up the compilation process!

BIN=HelloWorld            # Output executable/program name
CXX=g++                   # Which compiler to invoke
CXXFLAGS=-std=c++17 -Wall # What flags/options to use

SRC=$(wildcard *.cpp)     # Creates a list of all .cpp files 
                          #  (source1.cpp, source2.cpp)
OBJ=$(SRC:%.cpp=%.o)      # Creates a list of object filenames
                          #  (source1.o, source2.o)
# Remember that source code is compiled into objects (.o) first,
#  and then linked into one executable file

all: $(OBJ)              # Before the executable, compile each OBJ 
    $(CXX) -o $(BIN) $^  # g++ -o HelloWorld source1.o source2.o

%.o: %.cpp               # If .cpp changed, recompile .o
    $(CXX) $@ -c $<      # g++ source1.o -c source1.cpp

clean:                   # Rule to clean up any extra files
    rm -f *.o            # Delete all object files
    rm $(BIN)            # Delete the main executable

We have also created a target to tidy up the folder. If you type make clean, it will delete the compiled object and executable files, leaving only your source code behind.

2.2.3 Integrated Development Environments (IDEs)

Integrated Development Environments (IDEs) are software tools that enable you to create and manage projects with multiple source code files. These IDEs typically offer syntax highlighting for easier reading, file management for proper code structuring, and features for debugging and compilation. Using an IDE is highly recommended as it simplifies the coding process.

In this course, we will use Qt Creator. However, other excellent C++ IDEs, such as VS Code and CLion, are also available and suitable for this course. You can register for a free license using your Wits email address for both Qt Creator and CLion. In each lab, we will provide project files specifically for Qt Creator. You can open these files in Qt Creator, which will automatically load all the necessary source code, header files, and settings required to run the program.

Note that the IDE is primarily an editor, and we generally still require a separate compiler for each language we want to use. On Ubuntu, we can use the compiler included in the build-essential package; on Windows, the Qt Creator installation wizard can install the MinGW package; and on macOS, Xcode can install the Clang compiler, or Brew can install the g++ toolchain. This setup is discussed later in Section 2.19.

Within Qt Creator, simply click the play or debug button to compile, link, and run your code.

2.3 Hello World!

A Hello World! example is always a good way to check that your environment is correctly setup, ensuring you’re able to compile and run the simplest of programs. Copy the code below into a file called hello.cpp:

#include <iostream>

using namespace std;

int main(){
    cout << "Hello World!" << endl;
    return 0;
}

Download the generic makefile code and save it as makefile without any extensions.

Open a terminal, navigate to the folder containing the files you just created, and run make. You should see output similar to that of Figure 2.1.

Figure 2.1: Building Hello World

We see that make ran two commands, the first to compile the C++ code into an object file (hello.o), and then to link the object into an executable (HelloWorld). List the files using the ls command. You should see hello.o and HelloWorld. Run the new executable by typing ./HelloWorld and you should see output similar to that of Figure 2.2.

Figure 2.2: Building Hello World

Recall that any line that starts with a hash (#) is a preprocessor directive. The preprocessor looks at the code first and follows any preprocessor directives before passing the final code to the actual compiler. Remember that preprocessor commands do not end with a semicolon (;). Line 1 tells the preprocessor to include the contents of the iostream header file. To view the output of the preprocessor, use the -E command of g++. In the terminal, type the following command to save the preprocessed version of our program to preprocessed.cpp and have a look at the code that the compiler actually sees!

g++ -E -o preprocessed.cpp hello.cpp

C++ allows us to organize functions and objects into groups called namespaces. C++ puts its built-in features in the std namespace. Line 3 of our program tells the compiler to look for functions and objects inside the std namespace. If we did not include Line 3, we would need to explicitly tell the compiler where to find each object using the scope (::) operator. Thus, we could have written our program like this:

#include <iostream>

//using namespace std;

int main(){
    std::cout << "Hello World!" << std::endl;
    return 0;
}

Remember that C++ always starts execution with the int main() function, which we see on line 5. This is just a normal function that returns an int when it finishes, as seen on line 7. On most systems, returning 0 tells the operating system that everything finished correctly. Returning anything else usually means there was an error.

Finally, on line 6, our program uses the cout object from the std namespace to print to the terminal through stdout. We first print the words “Hello World!” followed by the end-of-line character (endl). std::endl prints out a new line character and flushes the output buffer. Flushing the output buffer tells the system to actually print the characters to the screen.

2.4 Data Types

C++ gives us a number of built-in data types. These allow us to store different types of information, such as integers, real numbers and characters. Some of the most useful data types are listed in Table 2.1 below.

Table 2.1: C++ Data Types Minimum Sizes
Type	Name	Minimum Size
`bool`	Boolean (true/false)	1 Bit
`char`	Character	8 Bits
`int`	Integer	16 Bit
`long`	Long Integer	32 Bit
`long long`	Even Longer Integer	64 Bit
`float`	Single Precision Floating Point Number	6 Significant Digits
`double`	Double Precision Floating Point Number	10 Significant Digits

While the C++ standard provides a minimum size for each type, compilers are allowed to implement the types using larger sizes if they wish, based on the architecture of the system. Usually this is influenced by the word length of the hardware. We can use the sizeof() function to tell us how big the type is, in bytes, on our machine – by multiplying by 8 we can calculate bits.

#include <iostream>
using namespace std;

int main(){
cout << "Type\t\tSize (Bits)" << endl
     << " bool\t\t "          << 8*sizeof(bool)      << endl
     << " char\t\t "          << 8*sizeof(char)      << endl
     << " int\t\t "           << 8*sizeof(int)       << endl
     << " long\t\t "          << 8*sizeof(long)      << endl
     << " long long\t "       << 8*sizeof(long long) << endl
     << " float\t\t "         << 8*sizeof(float)     << endl
     << " double\t\t "        << 8*sizeof(double)    << endl;
    return 0;
}

Figure 2.3: Sizeof Data Types in GCC on an x86_64 architecture

Based on the sizes calculated above, 2.4 shows the range of each type. Each type - other than bool, float and double – has two versions: signed and unsigned. The C++ standard says that at a signed variable’s range should be evenly split between positive and negative numbers. It does not dictate the actual representation of these numbers although in practice most architectures, including x86, x86_64 and ARM use Twos Complement to represent negative numbers. Floats and Doubles on the other hand usually use the formats specified in the IEEE Standard for Floating-Point Arithmetic (IEEE754). By default char is unsigned, while the other integral types are signed. There are also a number of types defined in cstdint that has explicitly sized types: int8_t, int16_t, int32_t, uint8_t, uint16_t, uint32_t, which are only defined if the architecture supports it.

Figure 2.4: C++ Data Ranges on x86_64 compiled with GCC

Ranges can be found by including the limits header and using:

std::numeric_limits<T>::min()
std::numeric_limits<T>::max()

For example:

cout << numeric_limits<int>::min() << " to " << numeric_limits<int>::max() << endl;

2.5 Strings

Another useful data type is the C++ string. In C strings are represented as arrays of characters with a null character (\0) at the end. C++ strings are implemented in the same way, but are wrapped in an class that makes them easier to deal with. For example, we can check the length of a string by writing myString.size() which just returns the number of characters directly. In contrast, with C-string, we would need to loop through the array counting each character until we reach the null-character.

To use strings, we first need to include the string header. After that we can use strings in the same way that we can use any other variable. You should familiarise yourself with the basic operations of the string class.

shows some of the common operations used on the string class, but there are many more and you should look up what other functions are available so that you don’t have to rewrite them when you need them.

#include <iostream>
#include <string>

using namespace std;

int main(){
  // I'm an empty string.
  string s1;
  // s2 gets a copy of the characters in s1
  string s2 = s1;
  // s3 gets a copy of the characters in the string literal
  string s3 = "HelloWorld!";
  // s4 contains "yyyyy"
  string s4(5,'y');
  // We can print strings
  cout << s3 << endl;
  // We can read a word until we find a space
  cin >> s1;
  // ... or we can read all the individual words in the buffer
  while(cin >> s1){
    // Do something with each word we find until we run out
  }
  // We can read a line from stdin
  getline(cin, s);
  // Check if a string is empty
  if(s1.empty()) {}
  // Get the size of a string
  int n = s4.size();
  for(int i = 0; i < n; i++){
    // We can access individual characters of the string.
    char c = s4[i];
  }
  // Concatenate string. s5 is "HelloWorld!yyyyy"
  string s5 = s3 + s4;
  // Copy characters from one string to another
  s1 = s4;
  // Check whether strings have the same characters
  if (s1 == s4) {}
  // ... or check that they don't
  if (s1 != s5) {}
  // ... or check alphabetical order (<, <=, ==, >, >=)
  if (s2 < s4) {}
}

Note that strings are sequences of char elements. This means that the type of text we store in them should use no more than 8-bits per character. If we are planning on storing unicode characters (e.g. U+1F914 = 🤔), then we need to use another type of string specialisation, such as std::u16string or std::u32string.

2.6 Reading input from `stdin`

In the previous examples, we wrote text to the terminal through stdout by using the std::cout object. To read from the terminal through stdin we make use of the std::cin object.

cin’s syntax is similar to that of cout except the angle brackets face the other direction, illustrating how data goes from cin to the relevant variable. shows some standard ways of reading from the stdin using cin.

2.7 Vectors

Vectors are a type of smart array in C++. They have many uses and can be accessed in a way similar to the standard arrays that you have already seen. We will discuss them later on in this course after dealing with our first few data structures. You could store a number of strings in a vector as shown in .

2.8 Branching (If Statements)

All programs need to do some sort of logic. Under some conditions they do one thing, and under others they do something else. We use if statements to select which branch of our code will be run. If statements have the following syntax:

linenos=false if(condition) // Statements to execute if condition is true else // Statements to execute if condition is false

The condition can be any expression. If boolean, the condition is either true or false. If any other type, the result will be cast to an integer and if it is 0 the expression is false, if it is any other number the expression is true.

We can also check whether pointers point to some address or whether they are the nullptr. The two approaches in \[lst:testpointers\] are equivalent.

linenos=false if(ptr != nullptr) // ptr points to some address else // ptr is null

linenos=false if(ptr) // ptr points to some address else // ptr is null

The 3 fundamental logic operators are NOT(!), AND(&&) and OR(\(||\)). With round brackets () they can be used in combination with any other logical operators or expressions to produce a boolean value. AND and OR are both short-circuiting, which means that an evaluation starts on the left and as soon as the overall answer is known, evaluation stops. This is particularly useful when avoiding errors like dividing by zero as seen in .

In general, if the first part of a logical AND is False, then the program knows the whole thing is False and doesn’t check the second part. If the first part of a logical OR is True, then the program knows the whole thing is True and doesn’t check the second part.

2.9 Loops

Loops allow us to repeat sections of code until some condition is no longer true. There are 3 types of loops in C++. All loops can be exited early by using the break keyword. Similarly, you can skip to the end of the body by calling continue, after this the increment, if present, will be performed and condition checked. Based on the condition, we will either enter the loop again or the program will proceed with the statements after the loop.

2.9.1 While Loops

A While Loop checks the condition at the start of the loop. If the condition is false, then it will not execute the expressions inside the body of the loop. If the condition is true, then it will run through the expressions inside the body of the loop, in sequence, repeatedly, until the condition is no longer true.

If the condition starts off false, then the inner contents of the loop will never run.

linenos=false while(condition) // Do stuff

2.9.2 Do Loops

A Do Loop is almost exactly the same as a while loop, except it checks the condition at the end of the loop. This means that the body of the loop is guaranteed to run at least once, regardless of whether the condition is true or false.

Once the body has executed once, the condition is checked, if it is true, the loop repeats, if not the program continues past to statements following the loop.

linenos=false do // Do stuff while(condition);

2.9.3 For Loops

It often happens that we need to repeat some code a number of times. For Loops provide a convenient method to do so. For loops first run the initialiser, this usually declares some counter variable. It then checks that the condition, if it is true, then we enter the loop. After executing the last statement in the body of the loop, we run the code in the increment section, which usually updates our counter. After that, we check the condition again and either enter the loop again (if its true) or continue with the code that follows the loop (if its false).

In the example above, the program first gets to the loop and runs the initialiser. This only ever happens once, it declares i and initialises it to 0. After that we check that our counter is less than the length of the string (11). It is less, so we print out character 0. We then perform the increment so that the counter is now 2. We check if it is less than 11, it is, so we enter the loop. We continue like this until the counter is 10, we check the condition, 10 is less than 11 so we enter the loop and print the character. The counter is then updated to 12. We check the condition and now it is false so we jump to the end of the loop, print out the end of line character and continue with the rest of the program.

2.10 Pointers

2.10.1 Basics

Pointers are variables, just like any other, except they always contain the memory address of another object/variable. To aid understanding in this section, we will use hexadecimal representations for memory addresses, and decimal representations for values. Suppose we have the following code fragment:

int myNumber; myNumber = 25;

	0x10	0x11	0x12	0x13
		25
	myNumber

Memory Layout

When the compiler sees the code in \[lst:pointer1\], it allocates enough space to store myNumber in memory. In 5 we can see that the compiler decided to store myNumber in memory address 0x11. Now a pointer to myNumber is simply a variable that store the address 0x11.

int myNumber; myNumber = 25;

int *myPointer; myPointer = &myNumber;

	0x10	0x11	0x12	0x13
		25		0x11
	myNumber			myPointer

Memory Layout with a Pointer

To declare a pointer, we use the * operator as seen on line 4 in \[lst:pointer2\]. To get the address of myNumber, we use the reference operator (&), as seen on line 5 of \[lst:pointer2\].

Pointers are useful, because we can follow the address, 0x11, and see what value is stored there. To do this we just use the dereference operator (*) followed by the name of the pointer variable. We can also use a dereferenced pointer to edit the original variable. This is illustrated in \[lst:pointer3\] and .

int myNumber; myNumber = 25;

int *myPointer; myPointer = &myNumber;

cout << "The value stored in myNumber is: " << myNumber << "" "The value stored in myPointer is: " << myPointer << "" "If we follow myPointer we get: " << *myPointer << endl;

cout << "Running: *myPointer = 42;" << endl; *myPointer = 42; cout << "The value stored in myNumber is: " << myNumber << "" "The value stored in myPointer is: " << myPointer << "" "If we follow myPointer we get: " << *myPointer << endl;

cout << "Running: myNumber = 1;" << endl; myNumber = 1; cout << "The value stored in myNumber is: " << myNumber << "" "The value stored in myPointer is: " << myPointer << "" "If we follow myPointer we get: " << *myPointer << endl;

Sizeof Data Types in GCC on an x86_64 architecture

This processes is fairly easy to remember.

If we want to declare an integer, we use: |int varName|
If we want to declare a pointer to an integer, we use: |int *pName|

This means that both varName and *pName are of type int. Pointers can point to any type or object that has been defined, so we can have declarations like:

bool *pointer_to_bool; char *pointer_to_char; int *pointer_to_int; long *pointer_to_long; long long *pointer_to_long_long; float *pointer_to_float; double *pointer_to_double; // We can also have pointers to Structs and Classes Car *pointer_to_car; Student *pointer_to_student; string *pointer_to_string;

2.10.2 Initialisation & Null Pointers

The most common reason your program will crash is because of a Segmentation Fault².

This means that you have tried to access memory that is outside your programs memory segment. Most commonly this is because you have tried to deference or follow a pointer that doesn’t actually point to anything. Often this happens because you haven’t initialised a pointer to point to something before trying to dereference it – this is dangerous and the best outcome is that your program crashes. Other outcomes could be that your program corrupts itself or returns results that look correct, but aren’t.

int *myPointer; // This might crash myPointer doesn’t point to anything :( // or it might not crash if myPointer happens to have a // value that points to memory we own... in this case the // program might look like it works, but will break eventually. cout << myPointer\[0\] << endl;

// Allocate memory and make myPointer point to it myPointer = new int\[10\]; // Do stuff... myPointer\[0\] = 42; cout << myPointer\[0\] << endl; // We’re now finished with that memory so we can deallocate it. delete [] myPointer

// This might crash because we don’t own that memory anymore :( // or it might not crash because the operating system // hasn’t reclaimed the memory yet :"( cout << myPointer\[0\] << endl;

When you first declare a pointer, it contains garbage – whatever was in that memory space at the time. To save yourself a lot of pain and suffering, you should always initialise a pointer the nullptr as shown below:

linenos=false int *myPointer = nullptr;

If you try to dereference the nullptr your program will definitely crash – this is a good thing because you can identify the problem immediately. Similarly, when you delete an object to which a pointer points, you should set it to the nullptr so ensure that you don’t accidentally use it.

linenos=false delete [] myPointer; // Deletes the thing that myPointer points to myPointer = nullptr;

2.11 References

References are a unique construct in C++. They are like aliases to other variables or objects, while they’re similar to pointers in that they refer to some object, they are different in some fundamental ways.

A reference must be initialised to point at an object when it is declared.
A reference cannot be changed to point at a different object after its been declared.
A reference is guaranteed never to be null because of these rules.

When using a reference to an object, imagine that you’re actually using the original object. This is particularly useful for passing objects to functions by reference, rather than using pointers.

In the past, you needed to use pointers to pass by reference. The danger here, is that if one passes an invalid or null pointer, the function may crash when dereferencing it. Even worse, the program may not crash and rather corrupt other data in memory. References are safe because they must be initialised to an existing object when they’re declared and will therefore avoid these kinds of bugs.

References are much safer to use than pointers, but they are not as powerful. They should be used over pointers when appropriate, but are not replacements for pointers.

\[lst:references\] shows some example uses of references.

Passing By Value / References

2.12 Structs & Classes

Structs and classes in C++ are basically the same thing. They allow us to group variables and data together into a single object. By default, variables in a struct are public and variables in a class are private. Structs and Classes can also contain member functions. These functions are locked inside and can only access variables and other data declared inside the struct/class. From this point onwards, any mention of classes means both classes and structs. Before we can use a class we must define it.

In \[lst:student\] we can see that we are defining a class called Student. Because it is a class as opposed to a struct, the members are automatically private, so strictly line 5 is not necessary – I always include it because I think it makes the code more descriptive and easier to read. Line 7 tells the compiler that any definitions following it should be considered public and should be accessible from outside the class.

Lines 6 and 8 – 11 then declare a number of class variables. These variables are accessible by any function in the class regardless of whether they are public or private.

Line 16 is a special function with the same name as the class. This is called the constructor and is run every time we declare a new instance of a student. It can reserve memory if needed, or it do any other setup work we need before our class is ready to use. In our case, it just prints out some text and initialises the pointer to nullptr.

Line 22, on the other hand, is the destructor. The destructor is run when the object is destroyed. This function should free any memory that was explicitly reserved by the object. When an object reserves space and doesn’t free it when being destroyed, the operating system still thinks that the memory is used. If this happens again and again, eventually the system will run out of memory. This is called a memory leak and they can be very hard to find – so always be sure to free memory when you’re done with it!

Usually class definitions are split into separate files. Let’s assume that the student class above was saved in student.h. We can now use our class as we would use any other object in C++.

\[lst:student_main\] shows how we could use our class. Firstly we need to include the header file for student and iostream. We use " " for student.h because it is in the same folder, while we use \(< >\) for iostream because the system must locate the file in the system path. Line 10 shows how we can declare a variable called myStudent which is an instance of our new class. Line 12 shows a way of declaring a pointer and creating a new instance to which it points. Whenever we declare something using the new keyword, we MUST delete it and free the memory when we’re finished. Because myStudent was not declared with the new keyword, it is automatically deleted when it goes out of scope.

Lines 15 – 18 show how we access the different members of the object, while lines 21 – 24 show how we can access the members through a pointer. In line 27 we place the address of myStudent in the myFriend pointer of anotherStudent. Make sure you understand what is happening here.

Classes are very powerful and are foundational to Object Oriented Programming (OOP). We will not use most of the features of OOP in this course, but you will learn about them later on in your degree.

2.13 Arrays

Arrays allow us to store a number of items of any type. When you allocate an array its size is fixed. An arrays size can only be ‘changed’ by allocating a new array, copying the elements across and then freeing the old array.

When you ask for memory for an array, you get one large contiguous block of memory, this means that accessing an array is fast, because the compiler knows exactly where to look to find the \(n^{th}\) object (startAddress + i x sizeof(datatype)). You need to manually remember the length of the array and the program will not automatically crash if you try access something beyond the end of the array. You need to be very careful that you do not index beyond the bounds of your array. This is why things like vector exist – to wrap arrays and do some extra safety checks for you.

shows how to allocate, use and free an array. We see that there are 2 different ways to allocate the arrays, statically and dynamically, these are discussed in 14.

shows the two main ways of dealing with 2D arrays. The first is to statically allocate the arrays, allowing the compiler to calculate the correct addresses. The second method is to use an array of pointers. Always remember to free dynamically allocated memory.

2.14 Static & Dynamic Allocation

C++ has two areas where variables can be stored:

The Stack
The Heap or FreeStore

When you declare a variable using the new keyword, you are telling the compiler to reserve space on the heap for that variable. When you declare a variable normally without the new keyword, the compiler reserves space for that variable inside the stack. We will revisit this concept after we have learned about stacks.

But for now, it is sufficient to say that each function call gets its own space on the stack. When you statically allocate a variable, part of that stack memory is used. When the function returns, the computer reclaims all the memory that the function used on the stack and destroys any static variables that were created within that functions frame.

Dynamically allocated variables, however, are stored on the heap, and they continue to live until they are explicitly freed or until the program is terminated. This has 2 very important implications for now.

Firstly, when you want to return an object or array from a function using a pointer, you must allocate that object dynamically as all static objects are destroyed when the function returns. \[lst:static_return\] illustrates the incorrect way of returning data.

Output from [lst:static1]

Output from [lst:static2]

We can see that \[lst:static1\] looks like it returns the correct result, but when we rearrange the code to \[lst:static2\], the program now produces the incorrect output. This is because the first time allocate_and_fill runs, it returns a pointer to statically allocated memory. When the function finishes, this memory is reclaimed by the system. In the first example, we read the memory before anything else has a chance to overwrite it, but in the second example, when we call the function again, it happens to be allocated the same memory as the first time and overwrites the original contents in memory. This is very dangerous behaviour. In the best case, the program crashes, but often it will be very difficult to find. Always be careful when returning pointers and references to statically allocated memory.

shows one correct way of returning an array. The only change is on line 6 where we now allocate the array dynamically on the heap.

The second important implication of allocating space on the heap is that it persists until out program terminates. This means that if repeatedly call a function like the one in \[lst:dynamic1\] without manually freeing the memory, we will eventually use up all the memory on the system and the computer will probably crash. An easy rule of thumb is that every new must have a corresponding delete, and every new[] must have a corresponding delete[].

So back in our example, we must now remember that we need to call delete []arr1; delete []arr2; at the end of main in order to make sure we don’t leak memory. Later we’ll learn of some smart pointers that delete stuff automatically when we’re finished with them.

2.15 Recursion

Recursion is when a function calls itself. Recursion is used to solve problems when a larger problem can be solved in terms of a similar smaller problem. The Factorial and Fibonacci functions are good examples this.

Consider the factorial as defined in maths: \[n! = n \times (n-1) \times (n-2) \cdots 2 \times 1 = n \times (n-1)!\]

We can see that \(n!\) is just \(n\) multiplied by \((n-1)!\). In turn, \((n-1)!\) is just \((n-1)\) multiplied by \((n-2)!\) and this continues until we reach 1. This should looks very similar to mathematical induction, and just like we need a base case in induction, we need a terminating point in recursion. In programming, this termination point is called an escape hatch. In this example, we can terminate when we reach 1 or even 0. \[lst:factorial\] shows a recursive solution to the factorial problem.

Where \(F_0 = F_1 = 1\), the Fibonacci sequence is defined as: \[F_{n} = F_{n-1} + F_{n-2}\] What do you think a recursive solution to this problem would look like? What escape hatch would we need?

2.16 Classes

In C++ (and all programming languages that support Object Orientation) we are able to create our own types by connecting many of the primitive types together. The normal primitive types are the ones that come with C++ and are built into the language – such as int, float, char, double and pointers. Note that a vector and all the other types that you #include into your files are usually classes and not primitive types. They have been added to C++ through the Standard Template Library.

A vector is a good example of why we might like a class. We will study them in more detail later on, but fundamentally, vectors store an integer that tracks the number of items being stored, it tracks the amount of memory that has been reserved and it keeps a pointer to that memory. It is useful to bundle these 3 variables into a single object that we can think of as a vector. We can then perform operations on that vector, such as push_back() or size(). When we group a number of types together into a single new type, this is called a Class. A class is a new type of variable – for example, Car is a Class, but a specific instance of a Car is an object.

class Car{
public:
   int num_wheels, num_doors;
   string colour;
   int coolness;
   
   Car(){
      // This is a constructor, which runs when the car is created
      // It should allocate memory and setup variables.
      num_wheels = 4;
      num_doors = 4;
      coolness = 10;
   }
   ~Car(){
      // This is a destructor, which runs when the car is destroyed
      // It should free memory and clean up any mess that it made (opened any files etc.)
   }
   
   float price(){
      // This is a member function
      if(colour == "Pink"){
         return 100*coolness + 10*num_wheels;
      }else{
         return 10*coolness + 5*num_wheels;
      }
   }
};

void main(){
   Car richardsCar;
   richardsCar.num_wheels = 5;     // We have a spare wheel
   richardsCar.colour = "Pink";    // Because I can
   richardsCar.coolness = 1000000; // Obviously
   
   Car stevesCar;
   stevesCar.num_wheels = 3;   // We've lost a wheel or two
   stevesCar.colour = "Blue";
   stevesCar.coolness = 1;

}

In the example above, Car is a class, it tells the compiler that we can have variables of type Car. richardsCar on the other hand, is an instance of a Car and we would say that richardsCar is an object. We see that the Car type has some members called num_wheels, num_doors, colour and coolness. These are values that are associated with each car that we create. In the example above we see that richardsCar and stevesCar are both instances of Cars and therefore each one has its own version of num_wheels, coolness etc.

Figure 2.5: Classes store their own values for member variables.

Classes can contain more than just data. They can also contain member functions. These functions belong to the class and usually operate on the data stored inside the class. For example, we have a function called price. This function only makes sense in the context of an actual instance of a Car. So we could ask “How much would you pay for richardsCar or stevesCar?” but you wouldn’t ever ask “How much would you pay for Car?”

In C++, classes are just like normal variable types. We can pass an instance of a car into a function by value or by reference. We can also get the address of that object as well.

class Car{...};

void print(Car& curr){ // The car is passed by reference
   cout << "There are " << curr.num_wheels << " wheels"
        << "and the car is " << curr.colour << "." << endl;
}
void paint(Car* curr, string new_colour){ // A pointer to the car is passed
   // When we have a pointer to an object we have two different ways
   // of accessing members:
   // A) Access the member function/variable through the pointer
   curr->colour = new_colour;
   // B) or Dereference the pointer and access the member
   (*curr).colour = new_colour;
}

void broken_paint(Car curr, string new_colour){
   // Watch out, here we pass the car by VALUE
   // This means that the function receives a copy of the car object
   //  and if we make any changes, they do not affect the original car
   curr.colour = new_colour;
}

void main(){
   Car richardsCar;
   richardsCar.num_wheels = 5;     // We have a spare wheel
   richardsCar.colour = "Pink";    // Because I can
   richardsCar.coolness = 1000000; // Obviously
   
   Car stevesCar;
   stevesCar.num_wheels = 3;   // We've lost a wheel or two
   stevesCar.colour = "Blue";
   stevesCar.coolness = 1;

   // Print takes references to the object.
   print(richardsCar); // There are 5 wheels and the car is Pink.
   print(stevesCar);   // There are 3 wheels and the car is Blue.
   
   // Paint takes pointers to the object and we need to pass the address.
   paint(&richardsCar, "Neon Pink"); // Richard's car is now Neon Pink.
   paint(&stevesCar, "Jungle Grey"); // Steve's car is now Jungle Grey.
   
   // broken_paint takes a copy of the object
   broken_paint(richardsCar, "Black"); // Richard's car is still Neon Pink
}

There are many finer details to objects and Object Orientation as a programming paradigm. We can do fancy things like abstract/virtual classes, interfaces, inheritance and polymorphism which allow us to create modular, reusable code. This is the main standard for programming in industry and you should learn as much about this paradigm as you can. You will learn about these concepts more formally in second and third-year courses (mostly in Mobile Computing 2 and Software Design 3).

For now, the important thing to remember about classes is that they allow us to build abstractions – you don’t know how a vector or a car works, but you know what member functions you can call and the documentation should tell you what those functions do. We don’t need to know how the class achieves those goals, just that the class will make sure it works. When you used a vector in IAP, you knew that you could use myVector.push_back(thing) even though you didn’t know how that function manipulated the underlying memory. The class abstracts away, or hides, the implementation details from you and you can focus on solving the actual problem that you care about.

However, we can’t completely ignore what the class is doing. Just because we’re making a single function call, doesn’t mean that the object is not doing a lot of work. These functions may contain loops, they may perform various memory allocations and could do any amount of work that we don’t know about. It is important to know that the way we are using classes is efficient. This is why we will look at a number of different data structures (like vectors) and algorithms (like push_back) to understand their computational complexity. This will allow you to use the right structure for the job and you’ll know that your code will still be efficient and scalable.

Here is another detailed description of classes and how they work:

There are also some more explanations if you are still not comfortable with classes:

Additional Reading:

Chapter 1.5 (the basics covered here) – Goodrich, Tamassia, and Mount (2011)
Chapter 2 (more advanced) – Goodrich, Tamassia, and Mount (2011)
Chapter 1.4 – Weiss (2014)
Chapter 2.3, 2.4 – Dale, Weems, and Richards (2018)

Goodrich, Tamassia, and Mount (2011) cover this topic particularly nicely.

2.17 Pointers

Pointers are discussed in some detail in Section 10 of the “C++ Revision Notes” on Moodle. Please ensure that you’ve gone through those notes in detail. We will revisit the idea of pointers and dynamic memory allocation multiple times in this course.

I strongly recommend that you read through these sites and examples as well:

Additional Reading:

Chapter 1.5 – Weiss (2014)

2.18 Debugging

Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.

Brian W. Kernighan

Finding bugs in your code can be an incredibly difficult task. When you have large, complex software systems, tracking down and fixing a bug can be a huge investment. In the best case, your code will crash and you’ll know that there is a problem. In the worst case, your code may continue running, but might have subtly corrupted it’s internal memory structures - in this case it looks like it is working, but may give you incorrect results. Debugging is such an important process that many software design patterns and tool have been developed purely to make debugging easier.

The most important tool when there is a bug in your code is called the debugger. It allows you you pause the execution of your code (with a breakpoint) and inspect the state of your program’s memory. You can view the call stack and the value stored in any of the memory addreses belonging to your application. You can also step through your code line by line and see where things start to go wrong.

Another useful time to pull out the debugger is when your code crashes. Unfortunately, C++ does not give us particularly useful error messages when our application crashes. Often it will just say Segmentation fault (core dumped). This generally means that we’ve tried to access memory that our program is not allowed to access and the operating system shut down the program for security purposes. Without a debugger it can be extremely hard to find out where our code crashed, but if we run our code with a debugger then it will show us exactly which line was running when the code crashed.

The video below explains how to use the GDB debugger and the debugger in QtCreator. In general, you should always use the debugger in QtCreator.

2.19 Setting up your environment

There are a few different ways to get your environment set up. I suggest that you use QtCreator as your IDE. Throughout this course, all labs will come with a QtCreator project file.

Go to https://www.qt.io/offline-installers and download the “5.12.x Offline Installers” file for your relevant operating system.

2.19.1 Linux

While connected to the internet, open a terminal and run the following commands:

cd Downloads
# Install the compiler
sudo apt install build-essential                  
# Allow execution
chmod +x ./qt-opensource-linux-x64-5.12.12.run
# Install Qt, QtCreator
./qt-opensource-linux-x64-5.12.12.run

If the version numbers are different then update the filenames above accordingly. When you launch QtCreator the first time, it may show a bar along the bottom with a button to “Link with Qt”. If so, click that button and then enter the Qt installation path that you used during installation.

2.19.2 Windows/MacOS

MinGW is a version of g++ for Windows. MinGW can be installed as part of QtCreator (see the steps below). You can also download MinGW-64 directly from http://mingw-w64.org/doku.php. This will give you a terminal and technically everything that you need to compile, but generally I suggest that you just use QtCreator to install everything.

During the install process, make sure to select "MinGW 64-bit" (under both "Qt 5.12.x" and under "Developer and Designer Tools" ) and "Qt Creator 5.0.2" at the bottom of the list of components as shown in Figure 2.10.

If you would prefer to use CLion there is a tutorial here: https://www.jetbrains.com/help/clion/quick-tutorial-on-configuring-clion-on-windows.html. Wits has licences for all the JetBrains products, just sign up for an account with your Wits email address.

2.19.3 Installing Qt

Figure 2.6: If you don’t have a Qt account then create one for free, enter the username and password, then click Next.

Figure 2.7: Tick both boxes and click Next.

Figure 2.8: Click Next.

Figure 2.9: Set the path where you want to install Qt. The default value should be fine. Remember this path. Click Next.

Figure 2.10: Windows: Make sure to select “MinGW 64-bit” (in two places) and “Qt Creator 5.0.x”, then click Next.

Figure 2.11: Linux: Make sure to select “Desktop gcc 64-bit” and “Qt Creator 5.0.2”, then click Next.

Figure 2.12: Accept the terms of the GPLv3 and click Next.

Figure 2.13: Click Install.

References

Dale, Nell B, Chip Weems, and Tim Richards. 2018. C++ Plus Data Structures. 6th ed. Jones & Bartlett Learning.

Goodrich, Michael T, Roberto Tamassia, and David M Mount. 2011. Data Structures and Algorithms in c++. 2nd ed. John Wiley & Sons.

Weiss, Mark A. 2014. Data Structures & Algorithm Analysis in c++. 4th ed. Pearson Education.

You’ll remember segment registers from BCO!↩︎

2 C++

2.1 Introduction

2.2 Compilation

2.2.1 Command Line

2.2.2 Make

2.2.3 Integrated Development Environments (IDEs)

2.3 Hello World!

2.4 Data Types

2.5 Strings

2.6 Reading input from stdin

2.7 Vectors

2.8 Branching (If Statements)

2.9 Loops

2.9.1 While Loops

2.9.2 Do Loops

2.9.3 For Loops

2.10 Pointers

2.10.1 Basics

2.10.2 Initialisation & Null Pointers

2.11 References

2.12 Structs & Classes

2.13 Arrays

2.14 Static & Dynamic Allocation

2.15 Recursion

2.16 Classes

2.17 Pointers

2.18 Debugging

2.19 Setting up your environment

2.19.1 Linux

2.19.2 Windows/MacOS

2.19.3 Installing Qt

References

2.6 Reading input from `stdin`