Entropy, which is a term borrowed by me for the needs of this post from thermodynamics, is in short a measure of the disorder within a framework – its state of disorder. For centuries, people seem to be constantly moving in the opposite direction – order and organization. They praise symmetry in art, try to find harmony in their apartments, adapt calendars to properly organize (be in full control) their time using disciplines of all kinds. When it comes to expenses, minimalism comes to the rescue, which in a way facilitates access to this desired order.
However, the post is about something slightly different. I want to write about programming. Such clean-up measures did not elude this subject. People invented collections (containers) to simplify and systematize their work with data sets. Let’s look at them in the context of the C++ language.
Container types in C++
In the introduction, I outlined what containers are. This is what we call data sets along with operations defined on them to make it easier for a programmer to navigate and modify these sets. In C ++ they are implemented using class templates. All definitions are in the C++ Standard Library which is already considered an integral part of this language. However, each type of collection has its specific properties which, depending on the context, are its drawbacks or advantages. We divide them into two main types:
- sequence – they store objects of the same type in a linear manner
- associative – data is stored in structures that provide a fast lookup using keys
Associative containers have two versions, ordered and unordered. Besides, we distinguish the container adapters.
Sequential containers
These are collections where the item’s position is important. It depends on the time and place of setting the data in the collection. Modern C++ provides the following sequential containers.
- array – static contiguous array
- vector – dynamic contiguous array
- forward_list – singly-linked list
- list – doubly-linked list
- deque – double-ended queue, where elements can be added to the front or back of the queue.
According to what Linus Torvalds once said “Talk is cheap. Show me code.”:
#include <iostream>
#include <vector>
int main()
{
// Create a vector containing integers
std::vector<int> myVec = { 1,2,3 };
// Add remove one integer from vector
myVec.pop_back();
// Now myVec contains {1,2}
// Add one integer to vector
myVec.push_back(4);
// Now myVec contains {1,2,}
// Print out the vector
std::cout << “myVec = { “;
for (int n : myVec) {
std::cout << n << “, “;
}
std::cout << “};” << std::endl;
}
Output:
myVec = { 1, 2, 4, }; |
All sequence containers are used and behave in a very similar way. However, each one is better suited to something else. The programmer’s job is to determine when to use which. Which is best to use when you want to search this set frequently or, for example, insert data at the beginning.
Associative containers
Associative containers are another type of container implemented in the standard library. They are more search-oriented than for iteration or random access. There are the following types of collections of this type in C++:
- map – a collection of key-value pairs, sorted by keys
- set – a collection of unique keys
- multimap – a collection of key-value pairs
- multiset – a collection of keys
All mentioned types have two versions
- ordered – implemented using red-black trees
- unordered – implemented using hash tables
#include <iostream>
#include <map>
#include <string>
int main() {
// Create a map of three strings (that map to integers)
std::map<std::string, int> shoppingList{{“egg”, 10},
{“water”, 2},
{“flour”, 1},};
// update an existing value
shoppingList[“egg”] = 20;
// insert a new value
shoppingList[“chocolate”] = 2;
// Let’s print our map
for (const auto&[key, value] : shoppingList) // possible since C++17
{
std::cout << key << ” = ” << value << std::endl;
}
}
Output:
chocolate = 2 egg = 20 flour = 1 water = 2 |
Interestingly, the code, when replacing all std::map with std::unordered_map, would look identical for ordered and unordered types. So the question is – which one to choose?
The ordered form should be used when:
- You don’t mind the memory overhead of storing the hash table
- Ordering or ordered traversal of your data set matters
- There are not enough resources for hashing
- Custom comparison operator is preferred over hashing operator
- Guaranteed performance is needed
The unordered form should be used when:
- It is affordable – we have memory for hash table
- It can absorb the occasional long operation
- String data is used as a key
However, it is worth remembering, a good rule of thumb is that if performance is the only indicator you are interested in, then in general unordered versions are the way to go.
Containers adapters
Finally, we have a special type of (semi)container class. They are not completely containers, but rather wrappers for other, full-fledged containers. Such adapters encapsulate the container and adapt it accordingly. C++ programmers are familiar with the following:
- stack – LIFO container
- queue – FIFO container
- priority queue – provides a priority queue, which allows for constant-time lookup of the largest element
#include <iostream>
#include <stack>
#include <vector>
int main ()
{
std::deque<int> mydeque (3,100); // deque with 3 elements
std::vector<int> myvector (2,200); // vector with 2 elements
std::stack<int> first; // empty stack
std::stack<int> second (mydeque); // stack initialized to copy of deque
std::stack<int,std::vector<int> > third; // empty stack using vector
std::stack<int,std::vector<int> > fourth (myvector);
std::cout << “size of first: ” << first.size() << std::endl;
std::cout << “size of second: ” << second.size() << std::endl;
std::cout << “size of third: ” << third.size() << std::endl;
std::cout << “size of fourth: ” << fourth.size() << std::endl;
}
Output:
size of first: 0 size of second: 3 size of third: 0 size of fourth: 2 |
Summary
Containers are, as you can see, a powerful tool in the hands of programmers. They allow you to tame data sets optimally and transparently. But if anyone needed reasons to use them over and above their implementations, here’s a cheat sheet:
- STL containers are implemented correctly so no bugs are caused by them in our code
- STL containers are fast and efficient
- STL containers share common interfaces so it’s very convenient to use – kind of “if you have learned one you know them all” attitude
- STL containers are well-documented and it is common knowledge which makes your code more understandable by others
Good luck!
Originally created for Iteo.