Chapter 3: Modularity
3.1
At the language level, C++ represents interfaces by declarations. A declaration specifies all that's needed to use a function or a type.
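As a minimal sketch of that split, the example below keeps a function's declaration (its interface) apart from its definition (its implementation). The names sum_of_roots and twice_sum_of_roots are hypothetical, chosen only for illustration.

```cpp
#include <cmath>
#include <vector>

// Declaration: the interface. This is all a caller needs to see.
double sum_of_roots(const std::vector<double>& v);

// A caller can be compiled against the declaration alone.
double twice_sum_of_roots(const std::vector<double>& v)
{
    return 2 * sum_of_roots(v);
}

// Definition: the implementation, which could live in another file.
double sum_of_roots(const std::vector<double>& v)
{
    double sum = 0;
    for (double x : v)
        sum += std::sqrt(x);
    return sum;
}
```

Here twice_sum_of_roots is written before the body of sum_of_roots exists, which is the property that separate compilation relies on.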
3.2
C++ supports a notion of separate compilation where user code sees only declarations of the types and functions used. The definitions of those types and functions are in separate source files and are compiled separately. ... A library is often a collection of separately compiled code fragments (e.g., functions).
- Header files: Place declarations in separate files, called header files, and textually #include a header file where its declarations are needed.
- Modules: Define module files, compile them separately, and import them where needed. Only explicitly exported declarations are seen by code importing the module.
Strictly speaking, using separate compilation isn't a language issue; it is an issue of how best to take advantage of a particular language implementation. However, it is of great practical importance. The best approach to program organization is to think of the program as a set of modules with well-defined dependencies, represent that modularity logically through language features, and then exploit the modularity physically through files for effective separate compilation.
A .cpp file that is compiled by itself (including the .h files it #includes) is called a translation unit. A program can consist of many thousands of translation units.
3.3
If you #include header1.h before header2.h the declarations and macros in header1.h might affect the meaning of the code in header2.h. If instead you #include header2.h before header1.h, it is header2.h that might affect the code in header1.h.
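The order dependence can be sketched in a single file by pasting in what two headers might contain. The contents below, including the macro LIMIT and the function limit(), are hypothetical stand-ins and not taken from any real header.

```cpp
// Stand-in for header1.h (hypothetical contents): defines a macro.
#define LIMIT 10

// Stand-in for header2.h (hypothetical contents): supplies its own
// default only if nobody defined LIMIT earlier.
#ifndef LIMIT
#define LIMIT 100
#endif

// With the two stand-ins in this order, limit() returns 10.
// Without the earlier #define it would return 100.
inline int limit() { return LIMIT; }
```

The code that header2.h contributes is textually identical in both cases; only what was included before it changes its meaning.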
We are finally about to get a better way of expressing physical modules in C++.
The way we use this module is to import it where we need it.
The differences between headers and modules are not just syntactic:
- A module is compiled once only (rather than in each translation unit in which it is used).
- Two modules can be imported in either order without changing their meaning.
- If you import something into a module, users of your module do not implicitly gain access to (and are not bothered by) what you imported: import is not transitive.
When defining a module, we do not have to separate declarations and definitions into separate files; we can if that improves our source code organization, but we don't have to.
3.2.2 from 3rd edition
In C++20, we finally have a language-supported way of directly expressing modularity.
export module Vector;
export class Vector {
public:
Vector(int s);
double& operator[](int i);
int size();
private:
double * elem;
int sz;
};
Vector::Vector(int s): elem{new double[s]}, sz{s} {}
//... some other implementation details
export bool operator==(const Vector& v1, const Vector& v2) { /* comparison logic */ }
This defines a module called Vector, which exports the class Vector, all its member functions, and the non-member function defining operator ==.
3.4
The simplest way to access a name in another namespace is to qualify it with the namespace name (e.g. std::cout and My_code::main). The "real main()" is defined in the global namespace, that is, not local to a defined namespace, class, or function.
using-declaration
A using-declaration makes a name from a namespace usable as if it was declared in the scope in which it appears.
using-directive
A using-directive makes unqualified names from the named namespace accessible from the scope in which we placed the directive. So after the using-directive for std, we can simply write cout rather than std::cout.
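The three ways of reaching a name can be compared side by side. This is a small sketch; the namespace contents (greeting and answer) and the three wrapper functions are hypothetical names introduced for illustration.

```cpp
#include <string>

namespace My_code {
    std::string greeting() { return "hello"; }
    int answer() { return 42; }
}

std::string qualified()
{
    return My_code::greeting();   // explicit qualification
}

std::string with_using_declaration()
{
    using My_code::greeting;      // only greeting becomes usable unqualified
    return greeting();
}

int with_using_directive()
{
    using namespace My_code;      // every name in My_code becomes accessible
    return answer() + static_cast<int>(greeting().size());
}
```

Each using is placed inside a function body, so its effect ends with that function's scope.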
When we use modules, things are different.
export module vector_printer;
import std;
using namespace std; // This is OK, won't leak to the users.
export
template<typename T>
void print(vector<T>& v) { /*...*/ }
Importantly, the use of the namespace directive does not affect users of our module; it is an implementation detail, local to the module.
By using a using-directive, we lose the ability to selectively use names from that namespace, so this facility should be used carefully, usually for a library that's pervasive in an application or during a transition for an application that didn't use namespaces.
3.5
For error handling, the major tool is the type system itself.
This has been moved to Chapter 4: Error Handling in the 3rd edition.
3.5.1
Assuming that out-of-range access is a kind of error that we want to recover from, the solution is for the Vector implementer to detect the attempted out-of-range access and tell the user about it. The user can then take appropriate action.
double& Vector::operator[](int i){
if (i < 0 || size() <= i){
throw out_of_range("Vector::operator[]");
}
return elem[i];
}
The throw (throw exception) transfers control to a handler for exceptions of type out_of_range in some function that directly or indirectly called Vector::operator[](). To do that, the implementation will unwind the function call stack as needed to get back to the context of the caller.
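On the caller's side, a handler for out_of_range might look like the sketch below. The Vector here is a minimal stand-in written for this example, and try_access is a hypothetical helper that returns either "ok" or the message carried by the exception.

```cpp
#include <stdexcept>
#include <string>

class Vector {   // minimal stand-in for the Vector discussed in the text
public:
    explicit Vector(int s) : elem{new double[s]{}}, sz{s} {}
    ~Vector() { delete[] elem; }
    Vector(const Vector&) = delete;
    Vector& operator=(const Vector&) = delete;

    double& operator[](int i)
    {
        if (i < 0 || size() <= i)
            throw std::out_of_range{"Vector::operator[]"};
        return elem[i];
    }
    int size() const { return sz; }
private:
    double* elem;
    int sz;
};

// Tries to write to v[i]; reports what happened as a string.
std::string try_access(Vector& v, int i)
{
    try {
        v[i] = 7;          // may throw
        return "ok";
    }
    catch (const std::out_of_range& err) {
        return err.what(); // control arrives here after stack unwinding
    }
}
```

For a Vector of size 3, try_access(v, 1) returns "ok", while try_access(v, 3) returns "Vector::operator[]".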
A function that should never throw an exception can be declared noexcept.
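A short sketch of such a declaration follows; swap_ints is a hypothetical function chosen because exchanging two ints cannot fail.

```cpp
#include <utility>

// A swap of two ints cannot fail, so it is declared noexcept.
void swap_ints(int& a, int& b) noexcept
{
    const int tmp = a;
    a = b;
    b = tmp;
}

// The noexcept operator reports the promise at compile time.
static_assert(noexcept(swap_ints(std::declval<int&>(), std::declval<int&>())),
              "swap_ints is declared noexcept");
```

If a function declared noexcept throws anyway, std::terminate() is called, so the specifier is a promise and not a check.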
3.5.2
The use of exceptions to signal out-of-range access is an example of a function checking its argument and refusing to act because a basic assumption, a precondition, didn't hold. Whenever we define a function, we should consider what its preconditions are and consider whether to test them. For most applications it is a good idea to test simple invariants.
Such a statement of what is assumed to be true for a class is called a class invariant, or simply an invariant. It is the job of a constructor to establish the invariant for its class (so that the member functions can rely on it) and for the member functions to make sure that the invariant holds when they exit.
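One way a constructor can establish an invariant is to reject arguments that would break it. The sketch below assumes the invariant "elem points to an array of sz doubles, with sz non-negative"; the Vector is a minimal stand-in, and construction_fails is a hypothetical helper used only to observe the outcome.

```cpp
#include <stdexcept>

class Vector {   // minimal stand-in; assumed invariant: sz >= 0 and elem holds sz doubles
public:
    explicit Vector(int s)
    {
        if (s < 0)
            throw std::length_error{"Vector constructor: negative size"};
        elem = new double[s]{};
        sz = s;
    }
    ~Vector() { delete[] elem; }
    Vector(const Vector&) = delete;
    Vector& operator=(const Vector&) = delete;

    int size() const { return sz; }
private:
    double* elem = nullptr;
    int sz = 0;
};

// Reports whether constructing a Vector of size s was refused.
bool construction_fails(int s)
{
    try {
        Vector v(s);
        return false;
    }
    catch (const std::length_error&) {
        return true;
    }
}
```

Because the constructor throws before any object exists, no member function ever sees a Vector with a negative size.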
3.5.3
Throwing an exception is not the only way of reporting an error that cannot be handled locally. A function can indicate that it cannot perform its allotted task by:
- throwing an exception
- somehow returning a value indicating failure
- terminating the program (by invoking a function like terminate(), exit(), or abort()).
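The second alternative, returning a value that indicates failure, can be sketched with std::optional (C++17). The function parse_digit is hypothetical, and std::optional is only one of several ways to encode failure in a return value.

```cpp
#include <optional>

// Converts a character to its digit value, or reports failure
// through an empty optional instead of throwing.
std::optional<int> parse_digit(char c)
{
    if ('0' <= c && c <= '9')
        return c - '0';
    return std::nullopt;
}
```

The caller has to test the result before using it, which is the cost of this style: a forgotten test goes unnoticed, whereas an uncaught exception does not.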
3.5.4
A contract mechanism was proposed for C++20 but was removed from the draft before the standard was finalized. The aim was to support users who want to rely on testing to get programs right -- running with extensive run-time checks -- but then deploy code with minimal checks.
The standard library offers the debug macro, assert(), to assert that a condition must hold at runtime.
If the condition of an assert() fails in "debug mode," the program terminates. If we are not in debug mode, the assert() is not checked.
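A small sketch of assert() guarding a function's assumptions is shown below; average is a hypothetical function. In practice, "debug mode" means that the macro NDEBUG is not defined when <cassert> is included.

```cpp
#include <cassert>

// Computes the mean of n values starting at p.
double average(const double* p, int n)
{
    assert(p != nullptr);   // checked only when NDEBUG is not defined
    assert(n > 0);

    double sum = 0;
    for (int i = 0; i < n; ++i)
        sum += p[i];
    return sum / n;
}
```

Since the checks disappear when NDEBUG is defined, the conditions inside assert() should have no side effects that the program depends on.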
3.5.5
The static_assert mechanism can be used for anything that can be expressed in terms of constant expressions.
In general, static_assert(A, S) prints S as a compiler error message if A is not true.
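Two compile-time checks in this style are sketched below. The helper km_per_s is a hypothetical name; both conditions are constant expressions, so a failure would stop compilation with the given message.

```cpp
constexpr double C = 299792.458;             // speed of light in km/s

constexpr double km_per_s(double km_per_h)   // hypothetical helper
{
    return km_per_h / (60 * 60);
}

// Both conditions are evaluated by the compiler, not at run time.
static_assert(4 <= sizeof(int), "integers are too small");
static_assert(km_per_s(160.0) < C, "can't go that fast");
```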
3.6
3.6.1
It is not uncommon for a function argument to have a default value; that is, a value that is considered preferred or just the most common. We can specify such a default by a default function argument.
This is a notationally simpler alternative to overloading.
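As a sketch, the hypothetical function to_text below takes a base that defaults to 10, so one declaration serves both the common call and the explicit one. It assumes 2 <= base <= 16.

```cpp
#include <string>

// Formats value in the given base; base defaults to 10.
// Assumes 2 <= base <= 16 (hypothetical example, no range check).
std::string to_text(int value, int base = 10)
{
    const std::string digits = "0123456789abcdef";
    if (value == 0)
        return "0";

    const bool negative = value < 0;
    // Work on the unsigned magnitude so the most negative int is handled too.
    unsigned int magnitude = negative ? 0u - static_cast<unsigned int>(value)
                                      : static_cast<unsigned int>(value);
    std::string text;
    while (magnitude > 0) {
        text.insert(text.begin(), digits[magnitude % base]);
        magnitude /= base;
    }
    if (negative)
        text.insert(text.begin(), '-');
    return text;
}
```

A call such as to_text(255) uses the default and yields "255", while to_text(255, 16) yields "ff". Without the default, the same interface would need two overloads.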
3.6.2
The default for value return is to copy and for small objects that's ideal. We return "by reference" only when we want to grant a caller access to something that is not local to the function.
RVO (return value optimization): when a function returns a local object by value, the compiler either elides the copy or uses move semantics, so there is no need to return a pointer just to avoid an expensive copy.
Matrix operator+(const Matrix& x, const Matrix& y)
{
Matrix res;
//... do some computation
return res;
}
A Matrix may be very large and expensive to copy even on modern hardware. So we give Matrix a move constructor that very cheaply moves the Matrix out of operator+(). Even if we don't define a move constructor, the compiler is often able to optimize away the copy (elide the copy) and construct the Matrix exactly where it is needed. This is called copy elision.
We should not regress to using manual memory management.
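A compact sketch of returning a large object by value follows. This Matrix is a hypothetical stand-in that stores its elements in a std::vector, so it gets a move constructor implicitly; the operator assumes both operands have the same number of elements.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in: a flat "matrix" whose storage is a std::vector,
// which gives it cheap move operations without any manual memory management.
struct Matrix {
    std::vector<double> data;
    explicit Matrix(std::size_t n = 0) : data(n, 0.0) {}
};

// Assumes x and y hold the same number of elements.
Matrix operator+(const Matrix& x, const Matrix& y)
{
    Matrix res(x.data.size());
    for (std::size_t i = 0; i < res.data.size(); ++i)
        res.data[i] = x.data[i] + y.data[i];
    return res;   // elided or moved; the elements are not copied again
}
```

The caller simply writes Matrix c = a + b; and never sees a pointer or a delete.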
3.4.4: Suffix Return Type from the 3rd edition
C++ allows adding the return type after the argument list when we want to be explicit about the return type.
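In this notation, auto in front means "the return type follows the argument list". The sketch below shows a plain case and one where the return type depends on the parameters; the function add is a hypothetical example.

```cpp
// The return type is written after the argument list.
auto mul(int i, double d) -> double
{
    return i * d;
}

// The suffix position lets the return type refer to the parameters.
template<typename T, typename U>
auto add(T a, U b) -> decltype(a + b)
{
    return a + b;
}
```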
Bjarne found this suffix return type notation more logical than the traditional prefix one.
3.6.3
The auto [n, v] declares two local variables n and v with their types deduced from read_entry()'s return type. This mechanism for giving local names to members of a class object is called structured binding.
As usual, we can decorate auto with const and &.
There must be the same number of names defined for the binding as there are nonstatic data members of the class, and each name introduced in the binding names the corresponding member. There will not be any difference in the object code quality compared to explicitly using a composite object; the use of structured binding is all about how best to express an idea.
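A sketch of both forms follows: a plain auto binding on a returned struct, and an auto& binding that modifies map elements in place. The Entry layout and the helper describe are assumptions made for this example; only the names read_entry, n, and v come from the text.

```cpp
#include <istream>
#include <map>
#include <sstream>   // std::istringstream, for trying the functions out
#include <string>

struct Entry {       // assumed layout: two nonstatic data members
    std::string name;
    int value;
};

Entry read_entry(std::istream& is)
{
    std::string s;
    int i = 0;
    is >> s >> i;
    return {s, i};
}

// Binds n to Entry::name and v to Entry::value.
std::string describe(std::istream& is)
{
    auto [n, v] = read_entry(is);
    return n + "=" + std::to_string(v);
}

// auto& binds references, so ++value changes the map's own elements.
void increment_all(std::map<std::string, int>& m)
{
    for (auto& [key, value] : m)
        ++value;
}
```

With std::istringstream is{"width 80"}, the call describe(is) returns "width=80".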
A complex has two data members, but its interface consists of access functions, such as real() and imag(). Mapping a complex<double> to two local variables, such as re and im is feasible and efficient, but the technique for doing so is beyond the scope of this book.
3.7
- Don't put a using-directive in a header file.
TODO
Implement the module.
For Bazel support, see:
- https://github.com/bazelbuild/proposals/pull/354
- https://github.com/PikachuHyA/bazel_cxx20_module_test
- https://github.com/PikachuHyA/bazel/actions/runs/6638058320/workflow