Mastering the C++17 STL

This blog post is a quick chapter by chapter summary of the book Mastering the C++17 STL by Arthur O’Dwyer. The author makes difficult C++ concepts easy to understand and follow. The entire book is also a good summary of the newer C++17 STL offerings.

Chapter 1: Classical Polymorphism and Generic Programming

This chapter focuses on polymorphic functions vs generic templated functions, with a detailed example. However, the key takeaway from this chapter is that, C++ now has an extremely heavy paradigm shift towards generic templated programming. It is the key to understand quite a majority of the newer C++ features.

Chapter 2: Iterators and Ranges

  • Iterators are a higher level abstractions than pointers. They provide a uniform interface to work with containers, and could be easier to work with for some (or most) containers. Containers and algorithms in STL have better support for iterators.
  • A pair of iterators defines a (left closed, right open) range
  • Iterator categories (companion article to read on cplusplus.com)
    • InputIterator: can be incremented, dereferenced as rvalues, only use in single pass, cannot be meaningfully copied (think of istream)
    • OutputIterator: can be incremented, dereferenced as lvalues, only use in single pass
    • ForwardIterator: Built upon InputIterator, provides forward only traversal
    • BidirectionalIterator: Built upon ForwardIterator, provides bidirectional traversal
    • RandomAccessIterator: Built upon BidirectionalIterator, provides pointer-arithmetic like random access traversal

Chapter 3: The Iterator-Pair Algorithms

  • Read-only range algorithms
    • distance(b, e): number of elements between b and e (could be negative)
    • count_if(a, b, p): number of elements x that satisfy p(x) between a and b
    • count(a, b, v): number of elements x that satisfy x==v between a and b
    • Note: std::count could be outperformed by container specific functions (e.g., std::set::count), same is true for some other algorithms
    • find(a, b, v): returns iterator to the first x such that x==v
    • find_if(a, b, p): returns iterator to the first x such that p(x)
    • find_if_not(a, b, p): returns iterator to the first x such that !p(x)
    • find_first_of(a1, b1, a2, b2): returns iterator to the first x in [a1, b1) such that x appears in [a2, b2)
    • find_first_of(a1, b1, a2, b2, p): returns iterator to the first x in [a1, b1) such that x appears in [a2, b2) and p(x) holds
    • all_of(a, b, p): return true if for all x between a and b, p(x) holds
    • any_of(a, b, p): return true if for some x between a and b, p(x) holds
    • none_of(a, b, p): return true if for no x between a and b, p(x) holds
  • Copy related algorithms
    • copy(InputIterator a, InputIterator b, OutputIterator c): copies range [a, b) into range starting at c
    • move(InputIterator a, InputIterator b, OutputIterator c): moves range [a, b) into range starting at c
    • Note: this move algorithm is defined in <algorithm> header, different from the std::move defined in <utility>
    • transform(InputIterator a, InputIterator b, OutputIterator c, Unary op): copies all x in [a, b) as op(x) into range starting at c
  • Write-only range algorithms
    • fill(a, b, v): fills elements in range [a, b) with value v
    • iota(a, b, v): fills elements in range [a, b) with sequentially increasing values starting with v and repetitively evaluating ++v
    • generate(a, b, g): fills elements in range [a, b) with successive result of g()
  • Algorithms that affect object lifetime
    • destroy_at(T* p): explicitly calls destructor ~T() on p
    • addressof(x): equivalent to &x (unless x is of a class type that overloads operator&)
    • uninitialized_copy(a, b, c): copies elements from the range [a, b) to an uninitialized memory area beginning at c
  • Permutational algorithms:
    • sort(a, b): sort the range using default comparator
    • sort(a, b, cmp): sort the range using comparator cmp
    • stable_sort: guarantees the order of equal elements are of the same order
    • swap(a, b): swaps two elements. This is the most basic building block for classes to be used in a container
    • reverse(a, b): reverses the elements in the range
    • partition(a, b, p): puts all elements x so that p(x) is true before all elements y with !p(y). Not stable. There are also is_partitioned and stable_partition functions.
    • rotate(a, m, b): rotates elements in [a, b) such that m becomes the first element and [a, m) have been rotated to after [m, b)
    • next_permutation(a, b): reorders elements in [a, b) so that they are the next permutation of the existing elements (e.g., for a std::vector<int> x { 2, 1, 3 }, calling next_permutation(x) changes x to { 2, 3, 1 }). There is also prev_permutation to use.
    • mismatch(a1, b1, a2, b2): returns the first mismatching pair of elements from two ranges
    • lexicographical_compare(a1, b1, a2, b2): compares two ranges in lexicographical order
  • Heaps, Heapsort, and Sorting
    • make_heap, pop_heap, push_heap, sort_heap: heap related algorithms
    • merge, inplace_merge: merge algorithms
    • lower_bound(a, b, v): returns iterator to the first element in the range [a, b) that’s not less than v, or b if not found
    • upper_bound(a, b, v): returns iterator to the first element in the range [a, b) that’s greater than v, or b if not found
    • remove_if(a, b, p): removes all elements x in [a, b) satisfying p(x), returns past-the-end iterator for the new end of range

Chapter 4: The Container Zoo

  • std::array<T, N>
    • Offers begin() and end()
    • Overloads operators =, ==, <
    • Can return a std::array from a function
    • Can put into containers
  • std::vector<T>
    • Uses an array behind the scenes, can direct access via data()
    • Grows by need automatically
    • Can explicitly resize too, resizes new memory if new size is bigger
    • Insert at the end via push_back and emplace_back
    • Erase operation is linear complexity
    • vector<bool> has pitfalls: internally represented using bits to represent bools, efficient but operator[] returns std::vector<bool>::reference type, convertible to bool but not directly bool.
    • If move constructors of a class not defined as noexcept, std::vector uses the copy constructor to do resizing, which is less efficient
  • std::deque<T>
    • Roughly two ended vector that can grow on both directions, internally using arbitrary number of “chunks” of contiguous memory
    • In addition to std::vector<T>’s API functions, also has push_front and pop_front
    • No specialization of std::deque<bool>
  • std::list<T>
    • Doubly linked list, dynamically allocating memories for elements
    • Operations such as splice, merge, and removing elements (remove, remove_if, unique) become cheap
    • std::list::sort function is faster than std::sort
    • Can traverse bidirectionally
    • No iterator invalidation
  • std::forward_list<T>
    • Mostly std::list<T>, without some features
    • Can’t get size, can’t iterate backward
    • Can’t do splice, push_back
    • Provides some other remedy functions: erase_after, insert_after, splice_after
  • std::stack<T, Ctr>, std::queue<T, Ctr>
    • An adapter, underlying storage managed by container type Ctr
    • Should limit usage of the functions to the adapters only
  • std::priority_queue<T, Ctr, Cmp>
    • Priority queue adapter
  • Tree structures: std::set<T>, std::map<K, V>
    • Elements are sorted in a binary search tree! Has impact on complexity of element lookup
  • std::multiset<T>, std::multimap<K, V>
    • Instead of a “counter” for the “multi” values, multiple elements are actually stored in the underlying BST
  • Hash structures: std::unordered_set<T>, std::unordered_map<K, V>
    • Based on hashes and buckets
    • Can manipulate the underlying buckets and load factors

Chapter 5: Vocabulary Types

  • A vocabulary type purports to provide a common language for dealing with its domain, e.g., numbers, strings, bools, size_t, time_t, …
  • std::reference_wrapper<T>: tag values we’d like to behave as references
  • enum class
  • std::tuple: a fixed number of possibly different types of values
    • std::get<I>(t): returns reference to the Ith element of t
    • std::tuple_size_v<decltype(t)>: size of the tuple t
    • std::tuple_element_t<I, decltype(t)>: type of the Ith element of t
    • std::tuple_cat(t1, t2, ...): concatenates all tuples t1, t2, … together as one tuple
    • std::forward_as_tuple(a, b, c, ...): creates a tuple of references that can be extracted by std::get<I>(t)
  • std::pair: a tuple of two; has first and second fields
  • std::variant<A, B, C, ...>: expressing an alternative type of either A, B, or C, …
    • std::holds_alternative<double>(v): tells if v holds a double in it
    • std::get<double>(v): gets the double value out of v
    • Use a “visitor” to visit different variant types
    • There is no make_variant as there are make_tuple, make_pair
  • std::optional<T>
    • Using std::variant<std::monostate, int> could represent optionals with the specific type std::monostate (with std::monostate{} as the value)
    • But using std::optional<T> and std::nullopt is more convenient
    • has_value(): tells if the optional is in “engaged” state; o.has_value() is equal to bool(o), so you can put an optional directly inside if condition
    • overloads operators ==, !=, <, >, <=, >=
    • o.value() gets the value if engaged, otherwise throws
    • *o returns a reference to the value contained in o, gets undefined behavior if it’s disengaged
    • o.value_or(x) returns a copy of teh value contained by o, or a copy of x if disengaged
  • std::any: holding an element of any type
    • Implemented using type erasure
  • std::function: holding a function object

Chapter 6: Smart Pointers

  • std::unique_ptr<T>
    • Behaves just as T*
    • Can’t be copied
    • Once moved, ownership of underlying object transfers
    • Can provide a deletion callback
    • Supports std::unique_ptr<T[]> (calling the correct delete function)
  • std::shared_ptr<T>
    • Shared ownership of objects, internally implements reference counter
    • Contains two fields: ptr (pointing to the managed object) and ctrl (pointing to control block)
    • Control block: vptr, use, weak, ptr (also pointing to the object), Deleter
    • Do not call sp.get() and use that raw pointer to initialize another shared_ptr!

Chapter 7: Concurrency

  • volatile and std::atomic<T> serve different purposes
    • volatile for memory that might change (between previous and next access)
    • std::atomic<T> for thread safe access
    • Using atomic to do complex computation: load the value first, then perform operation, then use compare_exchange_weak to set the atomic value to the computed
    • std::atomic<T> doesn’t work well with “big” types, such as string. Need mutex for this purpose.
  • std::mutex
    • Basic usage: declare mutex as static inside a function scope, and call lock and unlock to protect the operations
    • Problems with basic usage:
      • locking a mutex and forget to unlock
      • code has early return / exception so that mutex is not unlocked
      • mutex are different variables than those they are protecting, so in some usages one forgets to take the mutex lock
      • two or more mutex usage could cause deadlock
    • More idiomatic, RAII style usage: std::lock_guard
    • RUle of thumb: “always associate a mutex with its controlled data” (including both read and write)
      • On the surface, if a function is known to be only called from another which has the lock, then it is fine
      • If a function can be called from a different thread, then must be lock protected
  • Special purpose mutex types
    • std::recursive_mutex: remembers which thread has locked it; that thread can lock this mutex again, mutex counter is incremented; other threads will block until counter is 0
    • std::timed_mutex: aware of passage of time; supports try_lock_for() and try_lock_until() that works with the chrono library
    • std::recursive_timed_mutex: combination of both
    • Read-write locks: use a std::shared_timed_mutex; writers use a std::lock_guard on the mutex; readers use a std::shared_lock on the mutex
    • The current library does not allow “upgrading” a read lock (std::shared_lock) to a write lock (std::unique_lock) or downgrading; must release lock and re-acquire
  • Condition variables
    • std::condition_variable cv: cv.wait(lk) (lk is a lock_guard) puts thread to sleep; cv.notify_one() wakes up one thread that’s currently waiting on cv; cv.notify_all() wakes up all
  • Promises and futures – better treatment of waiting for conditions
    • std::promise<T> p creates a variable that “promises” to give you a T in future
    • std::future<T> f = p.get_future() obtains a handle to p’s future
    • p.set_value(42) is called in one thread
    • f.get() has the value of the promise p
    • If f.get() is called before p.set_value(42) is called, then the calling thread is blocked and waits
    • There is a specialization for std::promise<void> that’s just used for communicating “job done” and not pass value
    • promise and future doesn’t get rid of mutex, locks, or condition variables, just hides them; there is this price to pay when using
  • Packaged tasks: just like std::functions but with a future.

Chapter 8: Allocators

  • An allocators is a handle to a memory resource, behaves analogously to new and delete
  • Supports two member functions: allocate (i.e., new) and deallocate (i.e., delete)
  • Supports one non-member function: operator==
  • There are some other member / non-member functions but they are deprecated in C++17 and removed in C++20
  • To create an allocator, implement the “interface” std::pmr::memory_resource
  • std::allocator<T> is a stateless empty type that provides a handle to the global heap management by new and delete
  • To make a my_class<T> allocate-aware: add class A = std::allocator<T> as template parameter; in the emplace function, obtain a pointer to std::allocator_traits<A>; then use static_cast<T *> to turn that pointer into one for my_class<T>

Chapter 9: Iostreams

  • Input stream stages: buffering, lexing, parsing
  • Output stream stages: buffering, formatting
  • Standard C APIs: fopen, fclose, ftell, printf, snprintf, …
  • IO classes hierarchy (picture from this page)

IO Classes

  • IO streams have a rich set of manipulators such as set::setw(8), set::hex etc.
  • ostringstream has been used as a string formatter; use ostringstream::str() to obtain the final string
  • Converting numbers to strings
    • snprintf(buffer, sizeof buffer, "%d", int_value);
    • oss << int_value; str = oss.str();
    • str = std::to_string(int_value);
    • r = std::to_chars(buffer, std::end(buffer), int_value, 10); *r.ptr = '\0'
  • Converting strings to numbers
    • char* endptr; int_value = strtol(buffer, &endptr, 10);
    • int rc = sscanf(buffer, "%d", &int_value);
    • size_t endidx; int_value = std:stoi(str, &endidx, 10);
    • iss.str("42"); iss >> int_value;
    • std::from_chars_result r = std::from_chars(buffer, buffer + 2, int_value, 10);
  • Floating numbers are similar, the toi parts become tof in the functions
  • Reading a line of text: std::string line; while (std::getline(std::cin, line)) { process(line); };

Chapter 10: Regular expressions

  • Skimmed over this section; if needed, some online docs from cppreference.com could be good starting points

Chapter 11: Random numbers

  • Skimmed over this section; if needed, some online docs from cppreference.com could be good starting points

Chapter 12: Filesystem

  • Skimmed over this section; if needed, some online docs from cppreference.com could be good starting points
  • One interesting quote from this section: In POSIX, file names are strings of bytes. In Windows, file names are strings of Unicode characters.