Software & Apps

Secret Weblog • The Humble For Loop in Rust

Rust has some very nice programming facilities built in, around an iterator concept. Its performance-focused rust and low level of control make it possible to use it without paying the cost of performance. Sometimes I prefer to use humble for loop though. In some cases, it combines high performance with high readability. I thought I would be motivated by why.

I’ve written before (um, well, today) about the humble for loop in JavaScriptand I think the heuristics involved bring languages ​​with collection data structures like vectors and hashmaps (JavaScript arrays, object/HashMap, Python lists and dictionaries). Let’s see how this works in Rust. But even if you’re not a Rust expert you should be able to follow most of it, as I’ve explained Rust concepts along the way.

This is about Rust, I will provide some performance numbers in this article. The exact details of these numbers are not important. They will vary depending on which data you are using. For the purpose of this article, if there is a percentage difference in performance, that doesn’t matter. Where it gets interesting is when I start talking about 3, 10, or 100 times faster, because it shows real algorithmic differences that remain in many contexts, and therefore help guide our behavior .

map

Let’s first consider a transform loop.

let mut result = Vec::new();
for entry in list {
    result.push(transform(entry));
}

This is almost identical to the examples you see in JavaScript or Python. Vec an expandable array in Rust, like a JavaScript array or a Python list.

The simple loop has good performance, but it is actually faster, by about 30% for a set of i32 numbers :

let mut result = Vec::with_capacity(list.len());
for entry in list {
    result.push(transform(entry));
}

The road Vec works (such as Python and JS) so it preallocates space in advance, so that each push does not require memory reallocation (which requires data copying, which is expensive). But since it doesn’t know how big the final sequence will be, it may not avoid the full shift.

Here we avoid relocations entirely by giving them the final value to be used immediately with_capacity(); we know that it is equal to the length of the input list.

Now let’s see map().

let result: Vec<i32> = list.into_iter().map(transform).collect()

A few explanations are in order:

  • into_iter() turns the original vector into an iterator. the for using a loop
    into_iter() also, but not clearly.

  • map() does not return a vector, unlike Javascript where it produces an array. Instead it returns an iterator (with a built-in mapping operation). call map() doesn’t really take much time because it constructs an iterator, not executes it.

  • collect() takes an iterator and returns it to a vector. This is doing the actual work.

I think we can agree on that map() in Rust is quite readable. While there are a few more concepts involved that you don’t need to understand when you use a
for loop, these concepts are so important to Rust that you need to learn them. Therefore .map() win the more declarative.

Oddly enough, map() can also be faster. In a loop of 10000 integers it is 6 times faster on my machine!

Why? map much faster? I’m not sure. I doubt the map() option the Rust compiler describes that it can avoid allocations by simply writing the original vector, while in the loop it cannot. Or maybe it uses SIMD? I tried to look at the compiler explorer but I am not yet clever enough to know it. Maybe someone can explain!

This is a great argument for using functional programming patterns when it’s simple! More declarative and faster? Let’s use it!

Reduce

In JavaScript we see that a for loop for building a new collection is orders of magnitude faster than using reducebut is it the same as Rust? After all, map() very good! Does this mean that in Rust we have to use fold (which is equivalent to reduce in Rust) than a for
loop too?

Let’s look at the implementation of flatten, the example I also used in my JavaScript post. Here’s the simple one for loop:

let mut flat = Vec::new();
for entry in list_of_lists {
    flat.extend(entry);
}

extend() a way to push multiple objects at once, like you have in Python.

Note that we can’t really do anything with the preallocated capacity here because we don’t know the total length of the flattened list in advance, so we don’t bother.

Here is the fold version closest to the one we use in JavaScript:

let flat = list_of_lists.into_iter().fold(Vec::new(), 
  |accumulator, list| (accumulator, list).concat()
);

Let’s choose this variant:

  • We use fold(). Rust has a reduce() BUT fold is the closest equivalent to JavaScript reduce. Not important to performance.

  • |accumulator, list| ... a closure, a function we pass just like we do in JS.

  • (accumulator, list).concat() a method of adding two vectors to create a new vector.

The simple for The loop is about 180 times faster for this method (for a list of 50,000 items after flattening).

If you are a Rust developer, you may object. This method is terrible! It allocates memory all over the place, no wonder it’s slower.

But in Rust we can avoid instantiating vectors all over the place because we have iterators. Let’s test it at the iterator level:

let flat = list_of_lists.into_iter().fold(Vec::new(), |accumulator, list| {
    accumulator.into_iter().chain(list).collect()
});

What happened here? We use .chain() to combine with accumulator iterator with list iterator, and then we collect it.

Did it help? Not really — it’s faster than that concat version of only 20% or more; not even close to for loop.

Whoah, wait. We’re still doing vectors all over the place here, as we’re collecting inside the closure!

Can we avoid this and just create iterators and finally collect them at the end?

let iter: BoxIteratori32>> = Box::new(std::iter::empty());
let flat = list_of_lists.into_iter()
    .fold(iter, |accumulator, list| Box::new(accumulator.chain(list)))
    .collect()

Yes we can, but we can see that it is more difficult to read, and it actually makes it more than 10 times slower! I’m not entirely sure why – apparently the overhead of building chained iterators and heap allocation kills performance.

If you are new to Rust, sorry. the Box> thing is to tell Rust to use dynamic dispatch so I can combine the empty iterator we initialize with iterators based on chain. It has different types of Rust and this is how to make them merge.

Isn’t there a declarative way in Rust to declare this and still have performance? I mean, besides .flatten() what’s up with Rust? We can use it
flat_map:

let flat = list_of_lists.into_iter().flat_map(|list| list).collect()

flat_map You are allowed to map but also create structure, which is then flattened. It is faster than ours concat approach, but still 4 times slower than for loop.

So the pattern holds: the humble for loop, even in Rust, is worth your consideration. use fold() if it makes your code easier to write, not because it should be faster.

Errors and maps

It is common in the code for an operation to fail and we want to handle such failures. Instead of using exceptions like many programming languages, Rust has a very nice system for propagating error values ​​just like you propagate any return value. I won’t go into the details here, but hopefully you can follow along.

Let’s consider map() again, but there are errors.

struct Error;

fn transform(entry: Entry) -> Result {
    // do stuff to create a Transformed from Entry. It might fail
}

fn transform_list(list: Vec) -> Result<Vec, Error> {
    let mut result = Vec::with_capacity(list.len());
    for entry in list {
        result.push(transform(entry)?);
    }
    result
}

We can see that the transform function returns a Resultwhich is a
Transformed item (success!) yes an error (some item describing the error).

Because change is flawed, transform_list will also err and thus must return Result too. We will get a list that was successfully updated, or an error.

When we have a mistake, we want to bail it out. That’s it ?
operator afterwards transform(entry)? doing; this is equivalent to an if statement that checks if the value being evaluated is false, and if so, it returns immediately from the function with the false value.

What it looks like map()?

fn transform_list(list: Vec) -> Result<Vec, Error> {
    list.into_iter().map(transform).collect()
}

It is functionally equivalent to the above, believe it or not. First we turn the list into an iterator, so we can iterate over all its values. Then we use the functional map with transform function. Then we collect it back into a vector. You would expect it to be Vec, Error>; a list of results – either values ​​or error items.

But since we declared the return value of the function to be
Result, Error> a VARIETY collect chosen to short-circuit and bail out with a fault as soon as the first one is detected, such as for loop.

And what about with_capacity? We don’t need this, because the implementation is smart enough to ask the iterator if it has a size indicator for that, and since it’s based on a vector with known length, it has one.

So we get a lot of agility for free.

More complex error handling

Again we learned that map() very good! use map()! Sometimes however the for
The loop makes error handling easier. Some things are hard to capture in a pattern that is both efficient and declarative. Let’s consider a fault plane.

pub fn fallible_flatten(list_of_lists: Vec<Vec<Result<i32, Error>>>) -> Result<Vec<i32>, Error> {
    let mut flat = Vec::new();
    for entry in list_of_lists {
        for j in entry {
            flat.push(j?);
        }
    }
    Ok(flat)
}

We flatten a list of lists, but every entry in the list can be an error. If we encounter such an error during flattening, we want to bail out immediately. We can’t use it extend here but it’s pretty straightforward otherwise.

How to write it with fold? If we do, the bright Rust clippy will immediately suggest that we use it try_fold designed for error handling.

pub fn fallible_flatten_fold(
    list_of_lists: Vec<Vec<Result<i32, Error>>>,
) -> Result<Vec<i32>, Error> {
    list_of_lists
        .into_iter()
        .try_fold(Vec::new(), |accumulator, entry| {
            let mut result = accumulator.clone();
            for j in entry {
                result.push(j?);
            }
            Ok(result)
        })
}

If we ignore the difference in performance inherent in folding a collection like this, I’d still argue with for loop is easier to read and write in this scenario.

CONCLUSION

It’s hard to go wrong map() if all you are doing is modifying a collection.

but fold more difficult. Does that mean fold nonsense? Absolutely not. I use fold in complex scenarios that build structures where a declarative route really pays off. Sometimes going up a level of abstraction allows you to do algorithmic improvements that blow away even the performance of for loop completely out of the water. But the humble for loop has more things for it.

for there are loops break. You can return from the middle of a for loop too. Complex error handling is understated ?.

To avoid for I can scratch my head and go through a larger catalog of concepts, and improve my mathematical reasoning abilities. While there is certainly a time and place for that, in many cases the changes for the writer and reader of the code favor the humble for loop, in Rust, as well. That informs the heuristics we use as programmers.

I use simple ones i32 integers for all experiments, and your performance
Carry On can be different for different data types. Your results will vary depending on what you’re looping and what you’re doing. Simple integers are the simplest for machines to handle.

To change a i32 in a f64 .map() much faster. Turning around i32 to the String the difference in performance disappears.

Normal use extend almost double the speed in my benchmark compared to a container where objects are manually pushed into a container.

let mut flat = Vec::new();
for entry in list_of_lists {
    for j in entry {
        flat.push(j);
    }
}

Still super fast compared to the alternatives, but slower.

In your test flat_map(|list| list) the Rust linter (Clippy) comes into play suggesting that you use flatten() instead. Bizarrely enough flat_map
The method of this simple benchmark is actually 30% faster.

Here is one way to use extend:

let e = entry.into_iter().collect::<Result<Vec<_>, _>>()?;
result.extend(e);

I suspect it won’t be faster since we need to instantiate the vector collect()
to filter out the errors, but I didn’t measure it.

2024-12-12 20:10:49

Related Articles

Leave a Reply

Your email address will not be published. Required fields are marked *

Back to top button