Have you seen the presentation by Bjarne Stroustrup Why you should avoid Linked Lists? This presentation has received lots of popularity from many programmers even though the example Bjarne used to compare the performance of linked lists and vectors was quite misleading

He planned to show that vectors are better than linked lists even in scenarios that most programmers think are better suited for linked lists. The problem is, that in his example, the asymptotic complexity is the same when using the linked list or when using the vector. I don’t think any competent programmer would even consider using linked list in that case. He would probably use some kind of binary search tree which would dominate the performance as shown in another response to the presentation.

On Bjarne’s defense, he did correctly mention that the problem in his example is the linear search. The problem is that many people ignore it and jump to wrong conclusions like: Arrays are good, all other fancy containers are bad, and that is all because of cache.

In this post, I wanted to show an example where linked lists dominate arrays to demonstrate that they are not useless at all. The example is simply the following task, which might be quite common in real life: **Remove all odd integers from a sequence.**

This can be solved quite easily by the following code which is quite ordinary:

1 2 3 4 5 6 7 |
for (auto it = collection.begin(); it != collection.end();) { if (*it % 2 == 1) { it = collection.erase(it); } else { ++it; } } |

For linked lists this code has complexity of **O(n) **because we already have the iterator for the element we are erasing and do not have to perform linear-search for every element as in Bjarne’s example. For vectors the complexity is **O(n^2)** because we need to shift all of the following elements after every erase we perform.

Benchmarking on my computer shows an expected decisive victory of linked lists now.

1 2 3 4 5 6 7 8 |
Vector, size: 100000, algorithm: manual, time: 0.441025 s. List, size: 100000, algorithm: manual, time: 0.0020001 s. Vector, size: 200000, algorithm: manual, time: 2.14312 s. List, size: 200000, algorithm: manual, time: 0.0030002 s. Vector, size: 500000, algorithm: manual, time: 14.1428 s. List, size: 500000, algorithm: manual, time: 0.0070004 s. |

This clearly shows that linked lists are not useless. As with everything, you just need to use them in a right situation. In this case we already have an iterator inside the list so we can perform item insertion or removal in constant time. Of course, in real life we do not have only linked lists and vectors. It is true that for most tasks std::deque would outperform linked lists. Pure linked lists have really niche applications. As an example, they can possibly outperform std::deque when the items are huge.

Now comes the gotcha and the main point of this post. There exists another way how to perform this task and that is by using the standard library:

1 2 3 |
auto newEnd = std::remove_if(collection.begin(), collection.end(), [](int v) {return (v % 2 == 1);}); coll1.erase(newEnd, coll1.end()); |

After benchmarking this the results may seem surprising to you:

1 2 3 4 5 6 7 8 |
Vector, size: 100000, algorithm: remove_if, time: 0 s. List, size: 100000, algorithm: remove_if, time: 0.0020002 s. Vector, size: 200000, algorithm: remove_if, time: 0 s. List, size: 200000, algorithm: remove_if, time: 0.0030002 s. Vector, size: 500000, algorithm: remove_if, time: 0.0010001 s. List, size: 500000, algorithm: remove_if, time: 0.0080005 s. |

As you can see the vector is now slightly faster than the linked list. How can it be? Where did the complexity go? The answer is that the std::remove_if is a bit more clever than our approach above and the complexity for vector is again **O(n).**

The lesson from this is that you should really, really look at the <algorithm> library and, instead of imperfectly reinventing wheels, try to use it when you can.

The following is the whole code I used for benchmarking:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 |
#include <algorithm> #include <chrono> #include <iostream> #include <list> #include <string> #include <vector> template<typename T> std::chrono::duration<double> measure(T measuredCall) { auto start = std::chrono::system_clock::now(); measuredCall(); auto end = std::chrono::system_clock::now(); return end - start; } template<template<typename Type, typename Allocator> class Collection> Collection<int, std::allocator<int>> createCollection(std::size_t size) { Collection<int, std::allocator<int>> collection; for (std::size_t i = 0; i < size; ++i) { collection.push_back(i); } return collection; } template<template<typename Type, typename Allocator> class Collection> void collectionBenchmarks(const std::string & name, std::size_t size) { Collection<int, std::allocator<int>> coll1 = createCollection<Collection>(size); auto removeifRemoval = [&coll1]() { auto newEnd = std::remove_if(coll1.begin(), coll1.end(), [](int v) {return (v % 2 == 1);}); coll1.erase(newEnd, coll1.end()); }; std::cout << name << ", size: " << size << ", algorithm: remove_if, time: " << measure(removeifRemoval).count() << " s." << std::endl; Collection<int, std::allocator<int>> coll2 = createCollection<Collection>(size); auto manualRemoval = [&coll2]() { for (auto it = coll2.begin(); it != coll2.end();) { if (*it % 2 == 1) { it = coll2.erase(it); } else { ++it; } } }; std::cout << name << ", size: " << size << ", algorithm: manual, time: " << measure(manualRemoval).count() << " s." << std::endl; } void benchmarks(size_t size) { collectionBenchmarks<std::vector>("Vector", size); collectionBenchmarks<std::list>("List", size); std::cout << std::endl; } int main(int argc, char** argv) { benchmarks(100000); benchmarks(200000); benchmarks(500000); return 0; } |

“For vectors the complexity is O(n^2) because we need to shift all of the following elements after every erase we perform.”

False. You don’t need to shift per erase, just track read and write indices separately and it’s O(n).

That is one of the points of the article. The first method is far too commonly used even though it’s really ineffective with vectors. By using standard algorithms, if possible, we can be sure most of the time, that the solution is not even shorter, but also effecient.

Also this example is intentionally simple, but we can have more complicated access pattern where the method you mentioned would not be possible. First example that comes to my mind: Each element can be a line of text and we could move cursor up/down, and delete them. Yes, std::deque (or some other data structure) would serve us probably better than linked list, but vector would be linear for the removal now.