Chapter 13: Utilities

13.1

Such components (classes and templates) are often called vocabulary types because they are part of the common vocabulary we use to describe our designs and programs.

13.2

Resource Management: A resource is something that must be acquired and later (explicitly or implicitly) released. Examples are memory, locks, sockets, thread handles, and file handles.

A thread will not proceed until a scoped_lock's constructor has acquired the mutex. The corresponding destructor releases the resource. This is an application of RAII (the "Resource Acquisition Is Initialization" technique). RAII is fundamental to the idiomatic handling of resources in C++.
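The acquire-in-constructor, release-in-destructor pattern can be sketched with a small handle class. This is only an illustration: Traced_resource and the global trace log are hypothetical names invented here to make the order of events visible; they are not part of the standard library or of the book.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical trace log, used only to make acquisition and release visible.
std::vector<std::string> trace;

// Hypothetical resource handle: acquires in the constructor, releases in the destructor.
class Traced_resource {
public:
    explicit Traced_resource(std::string name) : name_{std::move(name)} {
        trace.push_back("acquire " + name_);
    }
    ~Traced_resource() { trace.push_back("release " + name_); }

    // A handle like this one is typically not copyable.
    Traced_resource(const Traced_resource&) = delete;
    Traced_resource& operator=(const Traced_resource&) = delete;

private:
    std::string name_;
};

void use_resources() {
    Traced_resource a{"a"};
    Traced_resource b{"b"};
    trace.push_back("work");
}   // b is released first, then a: destruction runs in reverse order of construction
```

No explicit release call appears in use_resources(); leaving the scope (normally or through an exception) is what triggers the release.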

13.2.1 <memory> / unique_ptr & shared_ptr

In <memory> the standard library provides two "smart pointers" to help manage objects on the free store:

  • unique_ptr to represent unique ownership
  • shared_ptr to represent shared ownership

When you really need the semantics of pointers, unique_ptr is a very lightweight mechanism with no space or time overhead compared to correct use of a built-in pointer. A unique_ptr is a handle to an individual object (or an array) in much the same way that a vector is a handle to a sequence of objects.

The shared_ptr is similar to unique_ptr except that shared_ptrs are copied rather than moved. The shared_ptrs for an object share ownership of that object; the object is destroyed when the last of its shared_ptrs is destroyed. Thus shared_ptr provides a form of garbage collection that respects the destructor-based resource management of the memory-managed objects. This is neither cost-free nor exorbitantly expensive, but it does make the lifetime of the shared object hard to predict. Use shared_ptr only if you actually need shared ownership.

struct S{
    int i;
    std::string s;
};
auto p1 = std::make_shared<S>(1, "Ankh Morpork");  // p1 is a shared_ptr
auto p2 = std::make_unique<S>(2, "Oz");            // p2 is a unique_ptr

Using make_shared() is not just more convenient than separately making an object using new and then passing it to a shared_ptr, it is also notably more efficient because it does not need a separate allocation for the use count that is essential in the implementation of a shared_ptr.

Where do we use "smart pointers" rather than resource handles with operations designed specifically for the resource (e.g. vector/thread)? The answer is "when we need pointer semantics":

  • when we share an object, we need pointers (or references) to refer to the shared object, so a shared_ptr becomes the obvious choice (unless there is an obvious single owner).
  • when we refer to a polymorphic object in classical object-oriented code, we need a pointer (or a reference) because we don't know the exact type of the object referred to (or even its size), so a unique_ptr becomes the obvious choice.
  • A shared polymorphic object typically requires shared_ptrs.

We do not need to use a pointer to return a collection of objects from a function; a container that is a resource handle will do that simply and efficiently.
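A minimal sketch of the two "pointer semantics" cases: a factory returning a polymorphic object through unique_ptr, and a shared_ptr whose use count grows when it is copied. Shape, Circle, Square, make_shape() and owners_after_copy() are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Shape {
    virtual ~Shape() = default;   // virtual destructor: deletion through Shape* is safe
    virtual std::string name() const = 0;
};
struct Circle : Shape { std::string name() const override { return "circle"; } };
struct Square : Shape { std::string name() const override { return "square"; } };

// The caller does not know the exact type, so ownership is handed over as unique_ptr<Shape>.
std::unique_ptr<Shape> make_shape(bool round) {
    if (round) return std::make_unique<Circle>();
    return std::make_unique<Square>();
}

// Copying a shared_ptr adds an owner; the object lives until the last owner goes away.
long owners_after_copy() {
    auto p1 = std::make_shared<Circle>();
    std::shared_ptr<Shape> p2 = p1;   // shared ownership of the same Circle
    return p1.use_count();
}
```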

13.2.2 std::move() & std::forward()

The choice between moving and copying is mostly implicit. A compiler will prefer to move when an object is about to be destroyed (as in a return) because that's assumed to be the simpler and more efficient operation.

Confusingly, std::move() doesn't move anything. Instead, it casts its argument to an rvalue reference, thereby saying that its argument will not be used again and therefore may be moved. It should have been called something like rvalue_cast. Like other casts, it's error-prone and best avoided. It exists to serve a few essential cases. Consider

template<typename T>
void swap(T& a, T& b){
    T tmp{std::move(a)};    // the T constructor sees an rvalue and moves
    a = std::move(b);       // the T assignment sees an rvalue and moves
    b = std::move(tmp);     // the T assignment sees an rvalue and moves
}

We don't want to repeatedly copy potentially large objects, so we request moves using std::move(). However, it can cause errors:

std::string s = "to-be-moved";
std::vector<std::string> vs;
vs.push_back(std::move(s));  // so far so good: s is moved into vs
std::cout << s[2];           // ERROR: use after move; s is probably empty, so s[2] is undefined behavior

This use of std::move() is considered (by Bjarne) to be too error-prone for widespread use. Don't use it unless you can demonstrate significant and necessary performance improvement.

The state of a moved-from object is in general unspecified, but all standard-library types leave a moved-from object in a state where it can be destroyed and assigned to. It would be unwise not to follow the lead. For a container (e.g. vector and string), the moved-from state will be "empty". For many types, the default value is a good empty state: meaningful and cheap to establish.
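As a small sketch of what is safe to do with a moved-from object: the example below moves a vector and then assigns a new value to the source instead of reading its old contents. The function name reuse_after_move() is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Returns {size of the destination, size of the source after reassignment}.
std::pair<std::size_t, std::size_t> reuse_after_move() {
    std::vector<std::string> src{"a", "b", "c"};
    std::vector<std::string> dst{std::move(src)};   // move construction: dst takes over the elements

    // Do not rely on the old contents of src here; destroying it or assigning to it is fine.
    src = std::vector<std::string>{"x"};

    return {dst.size(), src.size()};
}
```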

Forwarding arguments is an important use case that requires moves. We sometimes want to transmit a set of arguments on to another function without changing anything (to achieve "perfect forwarding"):

template<typename T, typename... Args>
unique_ptr<T> make_unique(Args&&... args){
    return unique_ptr<T>{new T{std::forward<Args>(args)...}};
}

The standard-library std::forward() differs from the simpler std::move() by correctly handling subtleties to do with lvalue and rvalue. Use std::forward() exclusively for forwarding and don't forward() something twice; once you have forwarded an object, it's not yours to use anymore.
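To see what "correctly handling lvalue and rvalue" means in practice, here is a minimal sketch. The names kind(), relay() and relay_without_forward() are hypothetical; only std::forward is from the standard library.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Two overloads that report which value category they received.
std::string kind(const std::string&) { return "lvalue"; }
std::string kind(std::string&&)      { return "rvalue"; }

// Forwards x with its original value category preserved.
template<typename T>
std::string relay(T&& x) { return kind(std::forward<T>(x)); }

// Without forward, the named parameter x is always an lvalue inside the function.
template<typename T>
std::string relay_without_forward(T&& x) { return kind(x); }
```

Calling relay() with a temporary selects the rvalue overload, while relay_without_forward() silently falls back to the lvalue overload (and therefore to a copy in real code).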

13.3

Use span (C++20) for range checking. A span is basically a (pointer, length) pair denoting a sequence of elements. A span gives access to a contiguous sequence of elements. The elements can be stored in many ways, including in vectors and built-in arrays. Like a pointer, a span does not own the elements it points to. In that, it resembles a string_view and an STL pair of iterators.

void fs(span<int> p){
    for (int& x: p) x = 0;
}

int a[100];
fs(a);
fs(a, 1000);  // ERROR: span expected
fs({a+10, 100}); // ERROR: a range error

In this way, the common case, creating a span directly from an array, is now safe (the compiler computes the element count) and notationally simple. The common case where a span is passed along from function to function is simpler than for (pointer, count) interfaces and obviously doesn't require extra checking.

void f1(span<int> p);
void f2(span<int> p){
    // ...
    f1(p);
}

When used for subscripting (e.g. r[i]), a gsl::span does range checking and a gsl::fail_fast is thrown in case of a range error (the C++20 std::span does not guarantee such a check for operator[]). Note that just a single range check is needed for the loop. Thus, for the common use where the body of a function using a span is a loop over the span, range checking is almost free.

13.4

Specialized containers

container         meaning
T[N]              built-in array: a fixed-size contiguously allocated sequence of N elements of type T; implicitly converts to T*
array<T,N>        a fixed-size contiguously allocated sequence of N elements of type T; preferred over T[N]
bitset<N>         a fixed-size sequence of N bits
vector<bool>      a sequence of bits compactly stored in a specialization of vector; highly discouraged
pair<T,U>         two elements of types T and U
tuple<T...>       a sequence of an arbitrary number of elements of arbitrary types
basic_string<C>   a sequence of characters of type C; provides string operations
valarray<T>       an array of numeric values of type T; provides numeric operations

Some notes:

  • array, vector, tuple elements are contiguously allocated; forward_list and map are linked structures.
  • bitset and vector<bool> hold bits and access them through proxy objects; all other standard-library containers can hold a variety of types and access elements directly.

No single container could serve all of these needs because some needs are contradictory, for example, "ability to grow" vs. "guaranteed to be allocated in a fixed location", and "elements do not move when elements are added" vs. "contiguously allocated".

13.4.1

The semantics here is about a fixed-size sequence of elements.

An array (from <array>) is a fixed-size sequence of elements of a given type where the number of elements is specified at compile time. Thus, an array can be allocated with its elements on the stack, in an object, or in static storage. The elements are allocated in the scope where the array is defined. There is no overhead (time or space) involved in using an array compared to using a built-in array. An array does not follow the "handle to elements" model of STL containers. Instead, an array directly contains its elements.

The number of elements in the initializer must be equal to or less than the number of elements specified for the array, and the element count must be a constant expression:

array<int, 3> a1 = {1,2,3};   //OK
array<int, 3> a2 = {1,2,3,4}; // ERROR, wrong size in initializer
array<int, n> a3 = {1,2,3};   // ERROR, size is not constexpr

When necessary, an array can be explicitly passed to a C-style function that expects a pointer:

void f(int* p, int sz);
void g(){
    array<int, 10> a;
    f(a, a.size());       // ERROR: no conversion
    f(&a[0], a.size());   // OK: C-style use
    f(a.data(), a.size());// OK: C-style use
    auto p = find(a.begin(), a.end(), 77); // C++/STL-style use
}
void h(){
    Circle a1[10];
    array<Circle, 10> a2;

    // BAD: sizeof(Circle) may be different from sizeof(Shape)
    // if sizeof(Shape) < sizeof(Circle), then built-in array pointer conversion can
    // be a disaster.
    Shape* p1 = a1;
    p1[3].draw();  // wrong memory address

    // std::array doesn't allow this: there is no conversion from array<Circle, 10> to Shape*.
    Shape* p2 = a2;  // ERROR: no conversion
}

Note that when an array is stored on the stack, we need to be careful: the stack is a limited resource (especially on some embedded systems), and stack overflow is nasty.
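The explicit hand-over from std::array to a C-style interface can be sketched as a complete example. c_style_sum(), sum_via_data() and contains() are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical C-style interface: a pointer plus an element count.
int c_style_sum(const int* p, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i != n; ++i) total += p[i];
    return total;
}

int sum_via_data() {
    std::array<int, 4> a{1, 2, 3, 4};        // the size is part of the type
    return c_style_sum(a.data(), a.size());  // explicit conversion to (pointer, count)
}

// STL-style use: the array carries its own size, so no count is passed.
bool contains(const std::array<int, 4>& a, int x) {
    return std::find(a.begin(), a.end(), x) != a.end();
}
```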

13.4.2

The semantics here is about bit operations.

Class bitset<N> generalizes the notion of small sets of flags efficiently operated on through bitwise operations on integers, by providing operations on a sequence of N bits [0:N), where N is known at compile time. For sets of bits that don't fit into a long long int, using a bitset is much more convenient than using integers directly. For smaller sets, bitset is usually optimized. If you want to name the bits, rather than numbering them, you can use a set or an enumeration.

A bitset can be initialized with an integer or a string:

bitset<9> bs1 {"110001111"};
bitset<9> bs2 {0b110001111};

Operations include bitwise operators (&,^,~,|) and left/right shift (<<, >>). The operations to_ullong() and to_string() provide the inverse operations to the constructors.

void binary(int i){
    bitset<8 * sizeof(int)> b = i;
    cout << b.to_string() << '\n';
    cout << b << '\n'; // we can also write out the bitset object
}
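A short sketch of the operations just listed, assuming two 9-bit sets. The function names are hypothetical; the bitset members used (operator&, operator<<, count(), to_string(), to_ullong()) are standard.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <string>

// Bits set in both operands.
std::string common_bits() {
    std::bitset<9> bs1{"110001111"};
    std::bitset<9> bs2{0b1'1100'1100};   // same as "111001100"
    return (bs1 & bs2).to_string();
}

// Shifting left drops the bits that move past position N-1.
std::string shifted_left_by_two() {
    std::bitset<9> bs{"110001111"};
    return (bs << 2).to_string();
}

// to_ullong() is the inverse of construction from an integer.
unsigned long long as_number() {
    return std::bitset<9>{"110001111"}.to_ullong();
}

std::size_t ones() {
    return std::bitset<9>{"110001111"}.count();
}
```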

13.4.3

the semantics here is about data collection.

Often we need some data that is just data; that is, a collection of values, rather than an object of a class with well-defined semantics and an invariant for its values. In this case, struct is ideal, or you can consider tuple and pair.

The first member of a pair is called first, whereas the second member is called second. We can also use structured binding to handle these two elements:

void f2(const vector<Record>& v){
    // the algorithm returns a pair of iterators; less is a comparison on Record defined elsewhere
    auto eq_rng = equal_range(v.begin(), v.end(), Record{"Reg"}, less);
    for (auto iter = eq_rng.first; iter != eq_rng.second; ++iter) {}
    // structured binding
    auto [first, last] = eq_rng;
    for (auto iter = first; iter != last; iter++) {}
}

The standard-library pair (from <utility>) is quite frequently used in the standard library and elsewhere. A pair provides operators, such as =, ==, <, if its elements do. Type deduction makes it easy to create a pair without explicitly mentioning its type:

std::pair p1{v.begin(), 2};
auto p2 = std::make_pair(v.begin(), 2);

If you need more than two elements (or fewer), you can use tuple. A tuple is a heterogeneous sequence of elements. Older code tends to use make_tuple() because template argument type deduction from constructor arguments is a C++17 feature.

Access to tuple members is through a get function template. The elements of a tuple are numbered (starting with zero) and the indices must be constants. Further, if an element of a tuple has a unique type in that tuple, the element can be named by the type.

tuple<string, int, double> t1{"Shark", 123, 3.14};
string s = get<0>(t1);        // get an element by index
string s2 = get<string>(t1);  // get an element by type -- if the type is unique.
get<string>(t1) = "Tuna";     // get can be used to write!
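Put together as a compilable sketch: a function returning a pair, and a tuple that is written through get and then unpacked with a structured binding. min_max() and rename_fish() are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <utility>

// Returns {smaller, larger}; the pair type is deduced from the constructor arguments (C++17).
std::pair<int, int> min_max(int a, int b) {
    return a < b ? std::pair{a, b} : std::pair{b, a};
}

std::string rename_fish() {
    std::tuple<std::string, int, double> t{"Shark", 123, 3.14};
    std::get<std::string>(t) = "Tuna";    // access by type: string is unique in this tuple
    std::get<1>(t) += 1;                  // access by constant index
    auto [name, id, weight] = t;          // structured binding copies the three elements
    return name + ":" + std::to_string(id);
}
```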

13.5

the semantics here is about alternatives.

The standard library provides three types to express alternatives:

  • variant to represent one of a specified set of alternatives (from <variant>)
  • optional to represent a value of a specified type or no value (from <optional>)
  • any to represent one of an unbounded set of alternative types (from <any>)

13.5.1

A variant<A,B,C> is often a safer and more convenient alternative to explicitly using a union. When you assign or initialize a variant with a value, it remembers the type of that value. Later, we can inquire what type the variant holds and extract the value.

if (holds_alternative<string>(m)){
    cout << get<string>(m);   // get is a nonmember function template
} else {
    int err = get<int>(m);
}

This can be used for error handling. We can also use a variant in more powerful ways:

using Node = variant<Expression, Statement, Declaration, Type>;
void check(Node* p){
    std::visit(overloaded{
        [](Expression& e) { /*...*/ },
        [](Statement& s) { /*...*/ },
        //....
    }, *p);
}

This is basically equivalent to a virtual function call, but potentially faster (you really need to measure). Unfortunately, overloaded is necessary and not standard. It's a "piece of magic" that builds an overload set from a set of arguments (usually lambdas):

template<class... Ts>
struct overloaded : Ts...{
    using Ts::operator()...;
};
template<class... Ts>
    overloaded(Ts...) -> overloaded<Ts...>; // deduction guide.

The "visitor" visit then applies () to the overload, which selects the most appropriate lambda to call according to the overload rules. A deduction guide is a mechanism for resolving subtle ambiguities, primarily for constructors of class templates in foundation libraries.
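A self-contained sketch of visit with overloaded, applied to the "value or error code" use mentioned earlier. Result and describe() are hypothetical names; overloaded is the helper from the text, repeated here so the example compiles on its own.

```cpp
#include <cassert>
#include <string>
#include <variant>

template<class... Ts>
struct overloaded : Ts... {
    using Ts::operator()...;
};
template<class... Ts>
overloaded(Ts...) -> overloaded<Ts...>;   // deduction guide

// Either a message or an error code.
using Result = std::variant<std::string, int>;

std::string describe(const Result& r) {
    // visit calls the lambda whose parameter matches the alternative currently held.
    return std::visit(overloaded{
        [](const std::string& s) { return "message: " + s; },
        [](int err)              { return "error code: " + std::to_string(err); }
    }, r);
}
```

If a lambda for one alternative is missing, the call to visit fails to compile, which is one of the advantages over a hand-written chain of holds_alternative tests.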

13.5.2

An optional can be useful for functions that may or may not return an object.

optional<string> compose_message(istream& s){
    string mess;
    // ...
    if (no_problem){
        return mess;
    } else{
        return {};  // the empty optional
    }
}
if (auto m = compose_message(cin))  // true only if m holds a value
    cout << *m;  // dereference optional to get value.

Note the curious use of *. An optional is treated as a pointer to its object rather than the object itself. The optional equivalent to nullptr is the empty object, {}.

If we try to access an optional that does not hold a value, the result is undefined; an exception is not thrown. Thus, optional is not guaranteed type safe.
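Because dereferencing an empty optional is undefined, it helps to test first or to use the checked accessors. A minimal sketch, with hypothetical function names; value_or() and value() are standard members (value() throws std::bad_optional_access when empty).

```cpp
#include <cassert>
#include <optional>

// Returns the digit's numeric value, or no value if c is not a decimal digit.
std::optional<int> parse_digit(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    return std::nullopt;   // same meaning as returning {}
}

// value_or() supplies a fallback instead of risking an unchecked dereference.
int digit_or(char c, int fallback) {
    return parse_digit(c).value_or(fallback);
}

// value() is the checked alternative to *: it throws if there is no value.
bool value_throws_when_empty() {
    try {
        (void)parse_digit('x').value();
    } catch (const std::bad_optional_access&) {
        return true;
    }
    return false;
}
```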

13.5.3

An any can hold an arbitrary type and know which type it holds. It is basically an unconstrained version of variant.

any compose_message(istream& s){
    string mess;
    //....
    if (no_problem)
        return mess;         // a string
    else
        return error_number; // an int
}
auto m = compose_message(cin);
string& s = std::any_cast<string&>(m);  // refers to the held string; throws if m holds an int

If we try to access an any holding a different type than the expected one, bad_any_cast is thrown. There are also ways of accessing an any that do not rely on exceptions.
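One non-throwing way to access an any is the pointer form of any_cast, which returns nullptr on a type mismatch. The sketch below shows both forms; describe() and throws_on_mismatch() are hypothetical names.

```cpp
#include <any>
#include <cassert>
#include <string>

// Pointer form: any_cast<T>(&a) yields nullptr instead of throwing when a does not hold a T.
std::string describe(const std::any& a) {
    if (const auto* s = std::any_cast<std::string>(&a)) return "string: " + *s;
    if (const auto* i = std::any_cast<int>(&a))         return "int: " + std::to_string(*i);
    return "unknown";
}

// Value form: throws std::bad_any_cast on a mismatch.
bool throws_on_mismatch() {
    std::any a = 42;
    try {
        (void)std::any_cast<std::string>(a);
    } catch (const std::bad_any_cast&) {
        return true;
    }
    return false;
}
```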

13.6

By default, standard-library containers allocate space using new. Operators new and delete provide a general free store (also called dynamic memory or heap) that can hold objects of arbitrary size and user-controlled lifetime.

Consider a long-running system that uses an event queue. Unfortunately, this can lead to massive fragmentation. After 100,000 events had been passed among 16 producers and 4 consumers, more than 6GB of memory had been consumed.

The traditional solution to the fragmentation problem is to rewrite the code to use a pool allocator. A pool allocator is an allocator that manages objects of a single fixed size and allocates space for many objects at a time, rather than using individual allocations. Fortunately, C++17 offers direct support for that. The pool allocator is defined in the pmr (polymorphic memory resource) subnamespace of std.

pmr::synchronized_pool_resource pool;
struct Event{
    pmr::vector<int> data = pmr::vector<int>(512, &pool); // let Events use the pool
};
pmr::list<shared_ptr<Event>> q {&pool};                   // let q use the pool

void producer(){
    for (int n = 0; n != LOTS; ++n){
        scoped_lock lk{m};   // m is a mutex defined elsewhere
        q.push_back(allocate_shared<Event, pmr::polymorphic_allocator<Event>>(&pool));
        cv.notify_one();     // cv is a condition_variable defined elsewhere
    }
}

Now, after 100,000 events had been passed among 16 producers and 4 consumers, less than 3MB memory had been consumed. That's about a 2000-fold improvement. Naturally, the amount of memory actually in use (as opposed to memory wasted to fragmentation) is unchanged.

The standard containers optionally take allocator arguments. The default is for the containers to use new and delete. Other polymorphic memory resources include

  • unsynchronized_pool_resource; like synchronized_pool_resource but can only be used by one thread
  • monotonic_buffer_resource; a fast allocator that releases its memory only upon its destruction and can only be used by one thread.
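One possible way to observe what a container requests from its memory resource is to wrap the resource in a counting adaptor. This is a sketch under assumptions: Counting_resource and bytes_for_reserve() are hypothetical names invented here; only memory_resource, its three virtual hooks, new_delete_resource() and pmr::vector are standard. It counts requests, not fragmentation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <vector>

// Hypothetical adaptor: forwards to an upstream resource and counts what passes through.
class Counting_resource : public std::pmr::memory_resource {
public:
    explicit Counting_resource(std::pmr::memory_resource* upstream = std::pmr::new_delete_resource())
        : upstream_{upstream} {}

    std::size_t allocations() const { return allocations_; }
    std::size_t bytes_requested() const { return bytes_; }

private:
    void* do_allocate(std::size_t bytes, std::size_t alignment) override {
        ++allocations_;
        bytes_ += bytes;
        return upstream_->allocate(bytes, alignment);
    }
    void do_deallocate(void* p, std::size_t bytes, std::size_t alignment) override {
        upstream_->deallocate(p, bytes, alignment);
    }
    bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override {
        return this == &other;
    }

    std::pmr::memory_resource* upstream_;
    std::size_t allocations_ = 0;
    std::size_t bytes_ = 0;
};

// Reports how many bytes a pmr::vector<int> asks for when reserving room for n elements.
std::size_t bytes_for_reserve(std::size_t n) {
    Counting_resource counter;
    {
        std::pmr::vector<int> v(&counter);   // v allocates through counter
        v.reserve(n);
    }   // v gives its memory back before counter goes out of scope
    return counter.bytes_requested();
}
```

Placing such an adaptor upstream of a pool resource (pass its address to the pool's constructor) would show how many large chunks the pool itself requests, which is one way to compare configurations.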

13.7

In <chrono>, the standard library provides facilities for dealing with time.

A clock returns a time_point (a point in time). Subtracting two time_points gives a duration (a period of time). Various clocks give their results in various units of time (e.g. nanoseconds), so it is usually a good idea to convert a duration into a known unit. That's what duration_cast does.

To simplify notation and minimize errors, <chrono> offers time-unit suffixes.

this_thread::sleep_for(10ms + 33us);

The chrono suffixes are defined in namespace std::chrono_literals. An elegant and efficient extension to <chrono>, supporting longer time intervals, calendars, and time zones, was added to the standard in C++20.
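A small sketch of durations, suffixes and duration_cast working together. The function names, including the timing helper time_call(), are hypothetical; steady_clock, duration_cast and the literal suffixes are standard.

```cpp
#include <cassert>
#include <chrono>

using namespace std::chrono_literals;   // enables the ms and us suffixes

// Adding milliseconds and microseconds yields a duration in the finer unit.
long long in_microseconds() {
    auto d = 10ms + 33us;
    return std::chrono::duration_cast<std::chrono::microseconds>(d).count();
}

// Casting to a coarser unit truncates toward zero.
long long in_whole_milliseconds() {
    auto d = 10ms + 33us;
    return std::chrono::duration_cast<std::chrono::milliseconds>(d).count();
}

// Times a callable: the difference of two time_points is a duration.
template<typename F>
std::chrono::microseconds time_call(F&& f) {
    auto t0 = std::chrono::steady_clock::now();
    f();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0);
}
```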

chrono also supports calendars:

auto spring_day = April/7/2018;
cout << weekday(spring_day) << '\n';             // Sat
cout << format("{:%A}\n", weekday(spring_day));  // Saturday

auto bad_day = January/0/2024;
if (!bad_day.ok()){
    std::cout << bad_day <<  " is not a valid day";
}

Dates are composed by overloading operator/ (slash) for the types year, month, and int. The resulting year_month_day type has conversions to and from time_point.

A time_zone is a time relative to a standard (called GMT or UTC) used by the system_clock. The standard library synchronizes with a global database (IANA) to get its answers right. The names of time zones are C-style strings of the form "continent/major city", such as "America/New_York" and "Asia/Tokyo".

13.8

When passing a function as a function argument, the type of the argument must exactly match the expectation expressed in the called function's declaration. We have three alternatives:

  • Use a lambda
  • Use std::mem_fn() to make a function object from a member function
  • Define the function to accept a std::function

13.8.1

Lambdas as adaptors: use a lambda to adapt x->f() to f(x).

// adapt p->draw() to be draw(p), conceptually
for_each(v.begin(), v.end(), [](Shape* p){ p->draw(); });

13.8.2

Given a member function, the function adaptor mem_fn(mf) produces a function object that can be called as a nonmember function:

for_each(v.begin(), v.end(), mem_fn(&Shape::draw));
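As a compilable sketch of the same idea, with a hypothetical Counter type standing in for Shape:

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

struct Counter {
    int n = 0;
    void bump() { ++n; }
};

// Calls c.bump() for every element by adapting the member function to a callable f(c).
int bump_all(std::vector<Counter>& v) {
    std::for_each(v.begin(), v.end(), std::mem_fn(&Counter::bump));

    int total = 0;
    for (const Counter& c : v) total += c.n;
    return total;
}
```

The object returned by mem_fn accepts either an object (as here) or a pointer to one (as in the Shape* example).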

13.8.3

The standard-library function is a type that can hold any object you can invoke using the call operator (). That is, an object of type function is a function object.

int f1(double);
function<int(double)> fct1 {f1};

function fct3 = [](Shape* p){ p->draw(); };  // type deduction -> fct3 is function<void(Shape*)>

Obviously, functions are useful for callbacks, for passing operations as arguments, for passing function objects, etc. However, a function may introduce some run-time overhead compared to direct calls, and a function doesn't participate in overloading. If you need to overload function objects (including lambdas), consider overloaded (see 13.5.1).
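The point that one function type can hold quite different callables is easiest to see in a list of callbacks. A minimal sketch; twice(), Scale, Step, run() and demo() are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <vector>

int twice(int x) { return 2 * x; }            // plain function

struct Scale {                                // function object with state
    int factor;
    int operator()(int x) const { return factor * x; }
};

using Step = std::function<int(int)>;         // anything callable as int(int)

// Applies each step in order to x.
int run(const std::vector<Step>& steps, int x) {
    for (const Step& step : steps) x = step(x);
    return x;
}

int demo() {
    std::vector<Step> steps{
        [](int x) { return x + 1; },   // lambda
        twice,                         // plain function
        Scale{3}                       // function object
    };
    return run(steps, 1);              // ((1 + 1) * 2) * 3
}
```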

13.9

A type function is a function that is evaluated at compile time given a type as its argument or returning a type. The standard library provides a variety of type functions to help library implementers to write code that takes advantage of aspects of the language, standard library, and code in general.

For numerical types, numeric_limits from <limits> presents a variety of useful information (e.g. numeric_limits<float>::min()). Similarly, object sizes can be found by the built-in sizeof operator.

Such type functions are part of C++'s mechanisms for compile-time computation that allow tighter type checking and better performance than would otherwise have been possible. Use of such features is often called metaprogramming or template metaprogramming.

Here the book introduces two examples: iterator_traits and type predicates.

13.9.1

The standard library provides a mechanism, iterator_traits, that allows us to check which kind of iterator is provided. Given that, we can improve the range sort() to accept either a vector or a forward_list.

template<typename Ran>
void sort_helper(Ran beg, Ran end, random_access_iterator_tag){
    sort(beg, end);
}

template<typename T>
using Value_type = typename T::value_type;

template<typename Fwd>
void sort_helper(Fwd beg, Fwd end, forward_iterator_tag){
    vector<Value_type<Fwd>> v{beg, end};
    sort(v.begin(), v.end());
    copy(v.begin(), v.end(), beg);
}

template<typename C>
void sort(C& c){
    using iter = Iterator_type<C>;   // C's iterator type (i.e. typename C::iterator)
    sort_helper(c.begin(), c.end(), Iterator_category<iter>{});
}

// use case
void test(vector<string>& v, forward_list<string>& lst){
    sort(v);
    sort(lst);
}

Note that Value_type<Fwd> is the type of Fwd's elements, called its value type. Every standard-library iterator has a member value_type.

Here we have two sort_helper functions. Iterator_type<C> returns the iterator type of C (i.e. C::iterator) and then Iterator_category<iter>{} constructs a "tag" value indicating the kind of iterator provided:

  • std::random_access_iterator_tag if C's iterator supports random access
  • std::forward_iterator_tag if C's iterator supports forward iteration

This technique, called tag dispatch, is one of several used in the standard library and elsewhere to improve flexibility and performance.

However, to extend this idea to types without member types, such as pointers, the standard-library support for tag dispatch comes in the form of a class template iterator_traits from <iterator>.

template<class T>
struct iterator_traits<T*>{
    using difference_type = ptrdiff_t;
    using value_type = T;
    using pointer = T*;
    using reference = T&;
    using iterator_category = random_access_iterator_tag;
};

template<typename Iter>
using Iterator_category = typename std::iterator_traits<Iter>::iterator_category;

Now an int* can be used as a random-access iterator despite not having a member type; Iterator_category<int*> is random_access_iterator_tag.
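The pieces above can be assembled into one compilable sketch. To avoid clashing with std::sort, the entry points are given the hypothetical names sort_range() and sort_container(), and Value_type is defined through iterator_traits here so that it also works for pointers.

```cpp
#include <algorithm>
#include <cassert>
#include <forward_list>
#include <iterator>
#include <vector>

template<typename Iter>
using Iterator_category = typename std::iterator_traits<Iter>::iterator_category;

template<typename Iter>
using Value_type = typename std::iterator_traits<Iter>::value_type;

// Random-access iterators: sort in place.
template<typename Ran>
void sort_helper(Ran beg, Ran end, std::random_access_iterator_tag) {
    std::sort(beg, end);
}

// Forward iterators: copy into a vector, sort it, and copy back.
template<typename Fwd>
void sort_helper(Fwd beg, Fwd end, std::forward_iterator_tag) {
    std::vector<Value_type<Fwd>> v(beg, end);
    std::sort(v.begin(), v.end());
    std::copy(v.begin(), v.end(), beg);
}

// The tag object selects the overload at compile time.
template<typename Iter>
void sort_range(Iter beg, Iter end) {
    sort_helper(beg, end, Iterator_category<Iter>{});
}

template<typename C>
void sort_container(C& c) {
    sort_range(c.begin(), c.end());
}
```

For a vector both overloads are viable, but the exact match on random_access_iterator_tag is preferred over the conversion to its base forward_iterator_tag.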

When concepts are introduced, many traits/traits-based techniques will be made redundant.

13.9.2

In <type_traits>, the standard library offers simple type functions, called type predicates, that answer a fundamental question about types:

bool b1 = std::is_arithmetic<int>();
bool b2 = std::is_arithmetic<string>();

Other examples are is_class, is_pod, is_literal_type, has_virtual_destructor, is_base_of. They are most useful when we write templates:

template<typename T>
constexpr bool is_arithmetic_v = std::is_arithmetic<T>();
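As an illustration of a type predicate steering a template, the sketch below uses the standard is_arithmetic_v and is_convertible_v in a static_assert and a compile-time if. The function to_text() is a hypothetical name.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Converts numbers with to_string and string-like values by construction.
template<typename T>
std::string to_text(const T& x) {
    static_assert(std::is_arithmetic_v<T> || std::is_convertible_v<T, std::string>,
                  "to_text needs a number or something convertible to std::string");

    if constexpr (std::is_arithmetic_v<T>)
        return std::to_string(x);   // only instantiated for arithmetic T
    else
        return std::string(x);      // only instantiated for the other case
}
```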

13.9.3

Obvious ways of using type predicates include conditions for static_asserts, compile-time ifs, and enable_ifs. The standard-library enable_if is a widely used mechanism for conditionally introducing definitions.

template<typename T>
class Smart_pointer{
    // ...
    T& operator*();
    std::enable_if_t<std::is_class_v<T>, T*> operator->();  // -> is meant to exist only if T is a class
};

If is_class_v<T> is true, the return type of operator->() is T*; otherwise, the definition of operator->() is ignored.

The syntax of enable_if is odd, awkward to use, and will in many cases be rendered redundant by concepts. However, enable_if is the basis for much current template metaprogramming and for many standard-library components. It relies on a subtle language feature called SFINAE ("Substitution Failure Is Not An Error").
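A minimal sketch of SFINAE at work on function templates, where enable_if_t removes an overload from consideration instead of causing an error. The function name classify() is hypothetical; enable_if_t, is_integral_v and is_floating_point_v are standard.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Participates in overload resolution only when T is an integral type.
template<typename T, std::enable_if_t<std::is_integral_v<T>, int> = 0>
std::string classify(T) { return "integral"; }

// Participates only when T is a floating-point type.
template<typename T, std::enable_if_t<std::is_floating_point_v<T>, int> = 0>
std::string classify(T) { return "floating-point"; }
```

For classify(1), substituting int into the second template fails, and that failure simply discards the overload; a call such as classify("text") matches neither and is rejected at compile time.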

16.5 source_location from the 3rd edition

When writing out a trace message or an error message, we often want to make a source location part of that message. The library provides source_location for that:

const source_location loc = source_location::current();

class source_location   // simplified view of the interface
{
public:
    const char* file_name() const;
    const char* function_name() const;
    unsigned int line() const;
    unsigned int column() const;
};

Code written before C++20 or needing to compile on older compilers uses macros __FILE__ and __LINE__ for this.

16.8 Exiting a Program

The standard library provides facilities to deal with the last case ("exit the program"):

  • exit(x): call functions registered with atexit() then exit the program with the return value x. If you need to, look up atexit(); it's basically a primitive destructor mechanism shared with the C language.
  • abort(): exit the program immediately and unconditionally with a return value indicating unsuccessful termination.
  • quick_exit(x): call functions registered with at_quick_exit(); then exit the program with the return value x
  • terminate(): call the terminate_handler. The default terminate_handler is abort().

These functions are for really serious errors. They do not invoke destructors; that is, they do not do ordinary and proper clean-up. The various handlers are used to take actions before exiting.

TODO

  • 13.6 allocator needs to find a way to observe the memory usage difference.