Neat Java 8 features - Stream API

This is the second of three posts and is about the Stream API.

Previous post: Neat Java 8 features - Lambda expressions
Next post: Neat Java 8 features - Concurrency API

Stream API

These methods make it easier to work with collections. What do you typically do with a collection? You iterate over it in order to modify some or all of its objects, find specific objects, filter it, or sort it, again and again… And you always start with for(...) {...} or while(...) {...}.

For example, suppose you have a collection of Items:

    // needs java.util.ArrayList, java.util.List, java.util.Random, java.util.UUID
    List<Item> collection = new ArrayList<>();
    for (int i = 0; i < 1000; i++) {
        collection.add(Item.randomInstance());
    }

    static class Item {
        private static final Random rand = new Random();

        String name;
        double price;

        Item(String name, double price) {
            this.name = name;
            this.price = price;
        }

        // creates an Item with a random 5-character name and a price in [0, 10)
        static Item randomInstance() {
            String name = UUID.randomUUID().toString().substring(0, 5);
            double amount = rand.nextDouble() * 10;
            return new Item(name, amount);
        }
    }

A common task is to filter this collection:

    List<Item> filtered = new ArrayList<Item>();
    for(Item i : collection) {
        if (i.price > 5)
            filtered.add(i);
    }
    System.out.println(filtered.size()+" items are more expensive than 5.00 whatever." );

With the stream methods you can do this as a one-liner:

    filtered = collection.stream().filter(i -> i.price > 5).collect(Collectors.toList());
    System.out.println(filtered.size()+" items are more expensive than 5.00 whatever." );
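
To make the equivalence of the two approaches easy to check, here is a minimal, self-contained sketch. It is only an illustration: the FilterDemo class, its helper methods and the fixed sample prices are my own assumptions (fixed values instead of random ones, so the result is reproducible), and the nested Item is a simplified stand-in for the class above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FilterDemo {
    // Simplified stand-in for the Item class above (hypothetical sample data below)
    static class Item {
        final String name;
        final double price;

        Item(String name, double price) {
            this.name = name;
            this.price = price;
        }
    }

    // Fixed sample data, so the result does not depend on random prices
    static List<Item> sampleItems() {
        return Arrays.asList(
                new Item("apple", 2.50),
                new Item("melon", 7.25),
                new Item("cheese", 9.75),
                new Item("bread", 5.00));
    }

    // Classic approach: iterate and copy matching items into a new list
    static List<Item> filterWithLoop(List<Item> items, double minPrice) {
        List<Item> filtered = new ArrayList<>();
        for (Item i : items) {
            if (i.price > minPrice) {
                filtered.add(i);
            }
        }
        return filtered;
    }

    // Stream approach: same result, the source list is left untouched
    static List<Item> filterWithStream(List<Item> items, double minPrice) {
        return items.stream()
                .filter(i -> i.price > minPrice)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Item> items = sampleItems();
        System.out.println(filterWithLoop(items, 5).size() + " items via loop");     // prints: 2 items via loop
        System.out.println(filterWithStream(items, 5).size() + " items via stream"); // prints: 2 items via stream
    }
}
```

Note that filter() does not change the original collection; it only decides which elements end up in the newly collected list.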

You can also do fancier things without all the usual for/while loop bloat:

    // Get the names of all items with a price > 9.5
    // as an alphabetically sorted list
    List<String> names = collection.stream()
                         .filter(i -> i.price > 9.5)
                         .map(i -> i.name)
                         .sorted()  // natural String order, same as String::compareTo
                         .collect(Collectors.toList());
    // print the list
    names.forEach(System.out::println);
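
The other everyday tasks mentioned at the start, finding specific objects and sorting by a custom criterion, can be written in the same style. The following is a rough sketch under my own assumptions: the FindAndSortDemo class, its method names and the sample data are made up for illustration, and Item is again a simplified stand-in.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class FindAndSortDemo {
    // Simplified stand-in for the Item class above (hypothetical sample data below)
    static class Item {
        final String name;
        final double price;

        Item(String name, double price) {
            this.name = name;
            this.price = price;
        }
    }

    static List<Item> sampleItems() {
        return Arrays.asList(
                new Item("apple", 2.50),
                new Item("melon", 7.25),
                new Item("cheese", 9.75),
                new Item("bread", 5.00));
    }

    // Sort by price (most expensive first) and keep only the names
    static List<String> namesByPriceDescending(List<Item> items) {
        return items.stream()
                .sorted(Comparator.comparingDouble((Item i) -> i.price).reversed())
                .map(i -> i.name)
                .collect(Collectors.toList());
    }

    // Find the first item (in list order) that is cheaper than the limit, if any
    static Optional<Item> firstCheaperThan(List<Item> items, double limit) {
        return items.stream()
                .filter(i -> i.price < limit)
                .findFirst();
    }

    // Check whether at least one item is more expensive than the limit
    static boolean anyMoreExpensiveThan(List<Item> items, double limit) {
        return items.stream().anyMatch(i -> i.price > limit);
    }

    public static void main(String[] args) {
        List<Item> items = sampleItems();
        System.out.println(namesByPriceDescending(items)); // prints: [cheese, melon, bread, apple]
        System.out.println(firstCheaperThan(items, 6.0).map(i -> i.name).orElse("none")); // prints: apple
        System.out.println(anyMoreExpensiveThan(items, 9.5)); // prints: true
    }
}
```

findFirst() returns an Optional, so the "nothing found" case has to be handled explicitly instead of checking for null.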

Apart from simplifying the code, a big advantage of the Stream API is that it makes parallelizing tasks very easy. Suppose, for example, you have a fairly expensive operation that does something with your Item:

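    Function<Item, Item> expensiveOperation = i -> {
        try {
            // simulate expensive work
            Thread.sleep(10);
        } catch (InterruptedException e) {
            // restore the interrupt flag instead of silently swallowing it
            Thread.currentThread().interrupt();
        }
        i.name = i.name.toUpperCase();
        return i;
    };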

…and you want to run that over the whole collection, you could use the for loop:

    long t = System.currentTimeMillis();
    for(Item i : collection) {
        expensiveOperation.apply(i);
    }
    t = System.currentTimeMillis() - t;
    System.out.println("for loop took "+t+" ms.");
    
    // output: for loop took 11793 ms.

Or you could very easily parallelize it and make use of all your CPU cores:

    long t = System.currentTimeMillis();
    collection.parallelStream().map(expensiveOperation).collect(Collectors.toList());
    t = System.currentTimeMillis() - t;
    System.out.println("parallelStream() took "+t+" ms.");
    
    // output: parallelStream() took 1498 ms.
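
If you want to try the comparison yourself, here is a small, self-contained sketch. It rests on my own assumptions: the ParallelDemo class and its helpers are hypothetical, it works on plain Strings instead of Items, and it uses a shorter sleep and fewer elements so it finishes quickly. The timings depend on your machine, so I deliberately give no expected numbers.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelDemo {
    // Simulated expensive operation; unlike the version above it does not
    // modify its input, it just returns a new upper-case String
    static final Function<String, String> SLOW_UPPERCASE = s -> {
        try {
            Thread.sleep(5);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return s.toUpperCase();
    };

    // Hypothetical sample data: "item0", "item1", ...
    static List<String> sampleNames(int n) {
        return IntStream.range(0, n)
                .mapToObj(i -> "item" + i)
                .collect(Collectors.toList());
    }

    static List<String> runSequential(List<String> names) {
        return names.stream().map(SLOW_UPPERCASE).collect(Collectors.toList());
    }

    static List<String> runParallel(List<String> names) {
        return names.parallelStream().map(SLOW_UPPERCASE).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = sampleNames(200);

        long t = System.nanoTime();
        List<String> sequential = runSequential(names);
        System.out.println("stream() took " + (System.nanoTime() - t) / 1_000_000 + " ms.");

        t = System.nanoTime();
        List<String> parallel = runParallel(names);
        System.out.println("parallelStream() took " + (System.nanoTime() - t) / 1_000_000 + " ms.");

        // collect(Collectors.toList()) keeps the encounter order of the list,
        // so both runs produce the same result
        System.out.println("same result: " + sequential.equals(parallel));
    }
}
```

The operation in this sketch has no side effects on shared state, which keeps the parallel version easy to reason about.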

But there are caveats, which is why I suggested using parallelStream() only for ‘fairly expensive’ operations:

  • Parallelization comes with an overhead. If the operation is very simple and quick, the sequential for loop is probably faster, because Java has to split the work across threads and merge the results.
  • Parallel streams run on a shared common thread pool (ForkJoinPool.commonPool()). If the operation could take a long time to finish, you could easily use up all the available threads of the pool and block other, faster or more important parallel tasks.
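
One workaround that is often used for the second point is to start the parallel stream from inside a dedicated ForkJoinPool, so that the work runs on that pool's threads instead of the common pool. Treat the following as a sketch only: as far as I know this behaviour is an implementation detail and not guaranteed by the Stream API documentation, and the DedicatedPoolDemo class and its method name are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;

class DedicatedPoolDemo {
    // Runs the parallel stream as a task of its own ForkJoinPool.
    // Assumption: a parallel stream started from a ForkJoinPool worker thread
    // uses that pool (observed behaviour, not a documented guarantee).
    static List<String> upperCaseInDedicatedPool(List<String> names, int threads) {
        ForkJoinPool pool = new ForkJoinPool(threads);
        try {
            Callable<List<String>> task = () -> names.parallelStream()
                    .map(String::toUpperCase)
                    .collect(Collectors.toList());
            return pool.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the result", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("stream pipeline failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Size of the shared pool that parallel streams use by default
        System.out.println("common pool parallelism: "
                + ForkJoinPool.commonPool().getParallelism());

        System.out.println(upperCaseInDedicatedPool(Arrays.asList("a", "b", "c"), 2)); // prints: [A, B, C]
    }
}
```

Creating a pool per call, as done here for brevity, is wasteful; in real code you would more likely keep one long-lived pool for slow tasks and reuse it.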