Overview of This Lesson

Collections in Java (which used to be referred to as "The Collections Framework") are utility objects that are used in place of arrays, where arrays aren't enough. For example, an array's size (length) is immutable in Java, but the size of a collection object is not: a collection object can grow or shrink without you having to copy the contents into a new array yourself. Collections can also be more efficient than arrays for certain tasks, because there are many different kinds of collections, each optimized for specific tasks.
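
As a rough sketch of that difference (the class name GrowDemo and the helper append() are made up for illustration), "growing" an array means allocating a bigger one and copying, while an ArrayList grows or shrinks when you ask it to:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GrowDemo {

    // An array's length is fixed, so "adding" an element means
    // copying everything into a new, larger array.
    static int[] append(int[] source, int value) {
        int[] bigger = Arrays.copyOf(source, source.length + 1);
        bigger[source.length] = value;
        return bigger;
    }

    public static void main(String[] args) {
        int[] numbers = {1, 2, 3};
        numbers = append(numbers, 4);
        System.out.println(numbers.length);   // 4

        // A collection changes size on request.
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
        list.add(4);
        list.remove(Integer.valueOf(1));      // removes the element 1, not index 1
        System.out.println(list);             // [2, 3, 4]
    }
}
```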

Generics (Parameterized Types)

In Java, Collections are parameterized types. This means that you define collections for specific objects (much like you define an array to hold only integers or only Strings). Parameterized types make your collections more modular and more cohesive (dedicated specifically to one specific kind of object instead of several types of objects).

Pre-Requisites

Before doing this lesson, make sure you've got a good understanding of Inheritance, and Abstract classes and Interfaces.

Generics: Parameterized Types

Before working with collections, it's important to understand Generics and Parameterized Types.

Generics allow you to parameterize types when using or creating classes and interfaces. In order to understand what this means, let's review some terms from introductory lessons and define some new ones:

In the Java documentation it is stated, "To reference the generic Box class from within your code, you must perform a generic type invocation, which replaces T with some concrete value, such as Integer: Box<Integer> integerBox;
You can think of a generic type invocation as being similar to an ordinary method invocation, but instead of passing an argument to a method, you are passing a type argument -- Integer in this case -- to the Box class itself."

The last sentence is helpful towards understanding how generics works: When you parameterize a type, imagine you are invoking a method where <T> shows you where you must pass the name of the class/type you want to use, and replacing <T> with a concrete type is passing that class/type name to the method.
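
To make the analogy concrete, here is a minimal sketch. The Box class below is a stand-in written for this example (only its name comes from the quoted documentation), and BoxDemo is a hypothetical wrapper class:

```java
public class BoxDemo {

    // A minimal generic class: T is the type parameter.
    public static class Box<T> {
        private T value;

        public void set(T value) { this.value = value; }
        public T get() { return value; }
    }

    public static void main(String[] args) {
        // Generic type invocation: Integer is the type argument "passed" to Box.
        Box<Integer> integerBox = new Box<>();
        integerBox.set(42);
        int n = integerBox.get();                       // no cast needed
        System.out.println(n);                          // 42

        // The same class, invoked with a different type argument.
        Box<String> stringBox = new Box<>();
        stringBox.set("hello");
        System.out.println(stringBox.get().length());   // 5
    }
}
```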

Why Do We Use Generics and Parameterized Types?

The main purpose of generics and parameterized types is to let you create classes and interfaces that can be used with a variety of different kinds of objects. For example, the Comparable<T> interface can work with any kind of object of type T. This is what lets you use the Arrays.sort() method on an array of your own objects: if you have an array of Circle objects and your Circle class implements Comparable<Circle> (and therefore implements compareTo(Circle)), then you can pass the array into Arrays.sort() and it will sort those objects.
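
A small sketch of that idea follows. The Circle class here is a simplified, hypothetical version with only a radius, and ordering circles by radius is an assumption made for the example:

```java
import java.util.Arrays;

public class SortCircles {

    // Hypothetical Circle class; only the radius matters for this sketch.
    public static class Circle implements Comparable<Circle> {
        public final double radius;

        public Circle(double radius) { this.radius = radius; }

        // Order circles from smallest to largest radius.
        @Override
        public int compareTo(Circle other) {
            return Double.compare(radius, other.radius);
        }
    }

    public static void main(String[] args) {
        Circle[] circles = { new Circle(3.0), new Circle(1.5), new Circle(2.0) };
        Arrays.sort(circles);   // uses Circle.compareTo() to order the elements
        for (Circle c : circles) {
            System.out.println(c.radius);
        }
        // prints 1.5, 2.0, 3.0 (one per line)
    }
}
```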

You will find many classes and interfaces in Java and Java frameworks that use parameterized types, so that those classes and interfaces can work with any kind of class/object, even classes that haven't been written yet!

An advantage of Generics and Parameterized Types is that your compiler will capture certain errors that might instead occur at run-time. To understand this, let's look at an example using ArrayList<E>, which is one of the Java Collections objects: an ArrayList is like a regular array, but it contains only objects and the size/length is mutable (allowed to change). In our first example, I'll use ArrayList without a concrete type to replace the generic type <E>:

ArrayList list = new ArrayList();
list.add(new Circle());
list.add(new Cylinder());
list.add(new Scanner(System.in));
for (Object o : list) {
    Circle c = (Circle)o;
    System.out.println(c.getArea());
}

When you don't specify a concrete type for the type parameter <E>, the Java list objects store only objects of type Object. Recall that a variable/element of type Object can receive any object, because Object is the parent of all Classes in Java.

Therefore, everything you store in the ArrayList is treated as an Object, so when you want to access the list elements, you'll need to cast them back into Circle objects, as we see inside the for-each loop on line 6.

The code in this example will throw a run-time exception (a ClassCastException) on line 6 for the 3rd element in the ArrayList, because a Scanner is not a Circle.

It's better to be notified of these kinds of errors during compile time instead of compiling and deploying your program only to have it crash later at some unknown time. Then you can fix the error while you're still developing your application, instead of later.

ArrayList is actually defined as ArrayList<E>: <E> is a type parameter for the ArrayList's element type. You should replace <E> with a concrete type: the name of the class you want ArrayList to contain instances of.

For example, we can define the ArrayList as a parameterized type for Circles (and children of Circle) using the statement:

ArrayList<Circle> list = new ArrayList<Circle>();

Using a parameterized type for ArrayList, the following code will now give us a compile error on line 4:

ArrayList<Circle> list = new ArrayList<Circle>();
list.add(new Circle());
list.add(new Cylinder());
list.add(new Scanner(System.in));
for (Circle c : list) {
    // Circle c = (Circle)o;  this line is no longer needed
    System.out.println(c.getArea());
}

The compile error will indicate that we can't place a Scanner into a list that is defined to contain Circle objects. The objects stored in the list must be of the same type as (or children of) the type argument. This is much better and easier to fix instead of having to deal with a runtime error later.

Notice that Line 6 inside the for-each loop has been commented out: Since we've defined the ArrayList to contain Circle objects (by using the concrete type Circle instead of the generic type E), it knows that it will only contain Circle objects or children of Circle. Objects stored in the list are stored as Circle objects, not Object objects, and we don't have to cast them. This then, is a second advantage to using generics: it makes casting unnecessary.

A third advantage of using generics and parameterized types is that they let you write more consistent code, including algorithms that work on any kind of collection. In fact, you'll see this already in place in the Collections classes and interfaces that are part of Java. For example, you can use an Iterator to iterate through any kind of List collection, no matter which List type it is and no matter what kind of objects the List contains.

Another example: you could write a method that sorts any kind of List object, whether it be an ArrayList, LinkedList, or some other kind of List. That kind of technique is beyond the scope of this tutorial at the moment, but perhaps it will be added some day.

As you learn about the various collection objects in Java, you'll be using parameterized types.

Collections Overview

The collections framework contains many classes and interfaces that not only model the collection objects themselves, but also provide tools that allow you to work with collections, performing many common tasks in the most efficient way.

What's in Java's Collections?

The Collections Framework includes:

There are different types of containers in the collections framework:

Collections Interfaces

There is a set of interfaces that helps to provide consistent behaviour between the different kinds of collections. For example, collections that have numeric indexes all have a get() method that accepts an index and returns the object at that index; collections without numeric indexes are traversed with an iterator's next() method, which moves to the next object, and maps have a get(key) method to retrieve the object stored with a specific key.

Having these common sets of behaviours means that it's easy to write re-usable code that processes collection objects, no matter what type of collection you're dealing with.
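
As an illustration, the sketch below defines one method against the Collection<String> interface and calls it with three different concrete collections. The class name TotalLength and the method totalLength() are invented for this example:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedList;

public class TotalLength {

    // Works with any Collection of Strings: lists, sets, queues...
    static int totalLength(Collection<String> words) {
        int total = 0;
        for (String w : words) {
            total += w.length();
        }
        return total;
    }

    public static void main(String[] args) {
        Collection<String> source = Arrays.asList("cat", "dog", "fish");
        System.out.println(totalLength(new ArrayList<>(source)));    // 10
        System.out.println(totalLength(new LinkedList<>(source)));   // 10
        System.out.println(totalLength(new HashSet<>(source)));      // 10
    }
}
```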

Here are some of the more common interfaces we'll be looking at:

Diagram: Iterable is the parent of Collection; Collection is the parent of List, Queue, and Set; off to the side is Map, which is the parent of SortedMap.
A hierarchy of some of the collections interfaces.

There are a few other interfaces in the diagram above, and we'll look at those in detail later.

Abstract Classes

There are also some abstract classes that make up the collections framework. These classes implement various interfaces and implement the methods defined in those interfaces. This makes it easier to create concrete classes: The concrete classes such as ArrayList<E> and HashSet<E> can be extended from abstract classes that already override methods defined in their respective interfaces.

Diagram: AbstractCollection implements Collection; it is the parent of AbstractSet, which implements Set, and AbstractList, which implements List. AbstractList is the parent of AbstractSequentialList.
Some common abstract classes in the collections framework

Why is it important to know the abstract classes? When you read the API documentation for a particular collections class, you'll have to remember to also check and see what methods are inherited from parent classes!
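
Here is a sketch of how much an abstract class can provide. The Range class is hypothetical: it extends AbstractList<Integer> and overrides only get() and size(), yet it can be printed and searched using methods it inherits:

```java
import java.util.AbstractList;
import java.util.List;

public class RangeDemo {

    // A read-only list of the integers start, start+1, ..., start+count-1.
    // AbstractList only requires get() and size(); methods such as
    // iterator(), contains(), indexOf() and toString() are inherited.
    public static class Range extends AbstractList<Integer> {
        private final int start;
        private final int count;

        public Range(int start, int count) {
            this.start = start;
            this.count = count;
        }

        @Override
        public Integer get(int index) {
            if (index < 0 || index >= count) {
                throw new IndexOutOfBoundsException("Index: " + index);
            }
            return start + index;
        }

        @Override
        public int size() {
            return count;
        }
    }

    public static void main(String[] args) {
        List<Integer> range = new Range(5, 4);
        System.out.println(range);               // [5, 6, 7, 8]
        System.out.println(range.contains(7));   // true
        System.out.println(range.indexOf(8));    // 3
    }
}
```

None of contains(), indexOf() or toString() appear in Range itself, which is why the inherited-methods part of the API documentation matters.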

Collections: Lists

Lists are used for storing lists of objects, much like a regular array. Lists use numeric indexes to provide direct access (otherwise known as random access) to elements/objects in the list. Lists can be sorted in a specific order and also allow for duplicate elements/objects.

Diagram: AbstractList implements the List interface; children of AbstractList include AbstractSequentialList, Vector, and ArrayList; LinkedList is a child of AbstractSequentialList; Stack is a child of Vector.
Some of the List classes.
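
The list behaviours described above (index access, duplicates, sorting) can be sketched as follows. ListDemo and sample() are names invented for this example:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ListDemo {

    static List<String> sample() {
        List<String> pets = new ArrayList<>();
        pets.add("dog");
        pets.add("cat");
        pets.add("dog");          // duplicates are allowed
        return pets;
    }

    public static void main(String[] args) {
        List<String> pets = sample();
        System.out.println(pets.get(1));   // cat (direct access by index)
        System.out.println(pets.size());   // 3
        Collections.sort(pets);            // natural (alphabetical) order
        System.out.println(pets);          // [cat, dog, dog]
    }
}
```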

Before getting into the concrete list classes, you might be interested in one other interface that's often used with lists: the Queue<E> Interface.

Concrete List Classes

Collections: Sets

Sets are collections of elements/objects that don't have numeric indexes, so you don't access them directly the same way you do with lists. Additionally, a Set is not allowed to contain duplicate values. Think of a set like a poker hand with 5 cards: you wouldn't have two of the same card in your hand - you'd have 5 unique cards.

Diagram: AbstractSet implements the Set interface; AbstractSet is the parent of HashSet and TreeSet; HashSet is the parent of LinkedHashSet; SortedSet is a child interface of the Set interface, and it is implemented by TreeSet.
Some of the Set classes and interfaces.

Sets resemble the sets you probably learned about in an intermediate math course. A set contains unique objects, and you can take multiple sets and find the intersection of those sets using a method like retainAll().

Sets don't use indexing, because there is no guaranteed order to a normal set; you can't just retrieve a Set element directly like you can with a list. To process a Set, you need to use a for-each loop or an Iterator.

Use a concrete set class when you want to maintain a collection of objects where you need to ensure there are no duplicates. For example, you could make a Quiz set which contains a set of unique questions, so you don't give the same question twice.
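
A short sketch of both ideas, uniqueness and intersection, using HashSet. The class name SetDemo and the helper intersection() are made up for illustration:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class SetDemo {

    // Returns the elements found in both sets, leaving the arguments unchanged.
    static Set<String> intersection(Set<String> a, Set<String> b) {
        Set<String> result = new HashSet<>(a);   // copy, because retainAll() modifies the set
        result.retainAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<String> hand = new HashSet<>();
        System.out.println(hand.add("ace of spades"));   // true
        System.out.println(hand.add("ace of spades"));   // false: the duplicate is not added
        System.out.println(hand.size());                 // 1

        Set<String> a = new HashSet<>(Arrays.asList("cat", "dog", "fish"));
        Set<String> b = new HashSet<>(Arrays.asList("dog", "fish", "ferret"));
        Set<String> both = intersection(a, b);
        System.out.println(both.size());                 // 2
        System.out.println(both.contains("dog"));        // true
    }
}
```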

Collections: Maps

Diagram: The Map interface has a child interface SortedMap, and Map also contains, via composition, a Map.Entry interface; AbstractMap implements Map; AbstractMap has a child HashMap, which has a child LinkedHashMap, and AbstractMap also has a child TreeMap, which implements SortedMap.
Some of the Map classes and interfaces.

Maps are used for storing collections of objects that can be easily retrieved with a specific key. The key can be any class type (so if you want a Map with integer keys, you'd have to use Integer, for example). Keys must be unique.

Each key is associated with an element value. These are called key-value pairs. The values don't have to be unique - you can store duplicate objects mapped to different keys.

A Map's elements are usually called entries, and can be modeled by the nested interface Map.Entry<K, V>.

Map.Entry<K, V> is nested inside the Map<K, V> interface. In fact, Map<K, V> contains a method entrySet() that returns a Set<Map.Entry<K, V>>! This Set object contains the set of Map.Entry<K, V> objects stored in the map.

You might use a Map to store a collection of configuration options where the key is the String name of the configuration option (e.g. "backgroundColour" or "fontSize") and the value is the user's preferred setting for this option. Or you could store information about managers at each store in a chain: The key could be a Store object for a specific store, and the value could be the Employee object for the manager of that store.
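
The configuration example might look roughly like this. MapDemo, defaults(), and the option values are assumptions made for the sketch; storing the values as Strings is just one possible choice:

```java
import java.util.HashMap;
import java.util.Map;

public class MapDemo {

    static Map<String, String> defaults() {
        Map<String, String> options = new HashMap<>();
        options.put("backgroundColour", "white");
        options.put("fontSize", "12");
        options.put("fontSize", "14");   // same key: replaces the old value
        return options;
    }

    public static void main(String[] args) {
        Map<String, String> options = defaults();
        System.out.println(options.get("fontSize"));   // 14
        System.out.println(options.size());            // 2

        // entrySet() returns the Map.Entry objects stored in the map.
        // A HashMap does not guarantee any particular order here.
        for (Map.Entry<String, String> entry : options.entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```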

Concrete Classes

Iterators

Iterators allow you to iterate through various types of collections.

Iterators are a very common design pattern used to visit elements in any data structure in such a way that it doesn't matter how the data sits in the collection, you can still iterate through all of the elements in some way. The Collection<E> interface extends the Iterable<E> Interface, which makes any Collection<E> object iterable.

To iterate through a collection, you obtain its Iterator<E> using the iterator() method that's defined in the Iterable<E> interface (so therefore it's inherited by all the Collection classes). An Iterator<E> has methods like next(), hasNext(), and remove(). Iterators make it easy to write re-usable code that can traverse through any type of collection, regardless of the type of collection or the type of objects the collection contains.

The List<E> interface contains the listIterator() method, which retrieves a ListIterator<E>, a more specific kind of Iterator made for List<E> collections. Where an Iterator<E> can only move forward through a collection and not add/edit elements, a ListIterator<E> can move forwards and backwards through a collection, and provides methods for adding or editing the objects in the list.
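
As a brief sketch of editing through a ListIterator, the hypothetical method upperCaseAll() below replaces each element in place with set():

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

public class ListIteratorDemo {

    // Replaces every element with its upper-case form, in place.
    static void upperCaseAll(List<String> words) {
        ListIterator<String> it = words.listIterator();
        while (it.hasNext()) {
            String word = it.next();
            it.set(word.toUpperCase());   // replaces the element just returned by next()
        }
    }

    public static void main(String[] args) {
        List<String> pets = new ArrayList<>(Arrays.asList("cat", "dog"));
        upperCaseAll(pets);
        System.out.println(pets);   // [CAT, DOG]
    }
}
```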

Note that iterators all have a remove() method, which removes the element most recently returned by next() from the underlying collection, but it should be used with great care.
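
One careful way to use remove() is sketched below: call next() first, then remove() at most once for that element. The method removeShort() is a made-up example:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class RemoveDemo {

    // Removes every word shorter than minLength, using the iterator's own remove().
    static void removeShort(List<String> words, int minLength) {
        Iterator<String> it = words.iterator();
        while (it.hasNext()) {
            String word = it.next();      // remove() acts on the element last returned by next()
            if (word.length() < minLength) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<String> pets = new ArrayList<>(Arrays.asList("cat", "ferret", "dog", "hamster"));
        removeShort(pets, 4);
        System.out.println(pets);   // [ferret, hamster]
    }
}
```

Removing through the iterator, rather than calling the list's own remove() inside the loop, keeps the iterator and the collection in step.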

Iterator Examples:

LinkedList<String> list = new LinkedList<>();
list.add("cat");
list.add("dog");
list.add("fish");
list.addFirst("ferret");  
list.addLast("hamster"); 

Iterator<String> iterFwd = list.iterator();
while (iterFwd.hasNext()) {
    System.out.println(iterFwd.next());
}
System.out.println();
ListIterator<String> iterBack = list.listIterator(list.size());
while (iterBack.hasPrevious()) {
    System.out.println(iterBack.previous());
}

The biggest benefit of using Iterators is that they are able to traverse or loop through any kind of collection. The example above would also work with an ArrayList, instead of a LinkedList:

ArrayList<String> list = new ArrayList<String>();
list.add("cat");
list.add("dog");
list.add("fish");
list.add(0, "ferret");
list.add("hamster");

// exact same code as previous example:

Iterator<String> iterFwd = list.iterator();
while (iterFwd.hasNext()) {
    System.out.println(iterFwd.next());
}
System.out.println();
ListIterator<String> iterBack = list.listIterator(list.size());
while (iterBack.hasPrevious()) {
    System.out.println(iterBack.previous());
}

The code that uses the Iterator in the previous two examples will even work with a set:

HashSet<String> list = new HashSet<>();
list.add("cat");
list.add("dog");
list.add("fish");
list.add("ferret");
list.add("hamster");

// exact same code as previous example:

Iterator<String> iterFwd = list.iterator();
while (iterFwd.hasNext()) {
    System.out.println(iterFwd.next());
}

Note that the ListIterator part of the code won't work because ListIterators are only available to Lists, not to Sets or Maps.

The For-Each Loop

You should already be familiar with the for-each loop:

for (String s : list) {
    System.out.println(s);
}

When you're reading (not modifying) a collection, you can use the for-each loop, because it also works in a consistent way for various kinds of collections.

Furthermore, if you're going to do nested iteration with collections, it's actually better to use a for-each loop because it's easier for programmers to unknowingly encounter logic errors when using Iterators in some nested loops (for more information see Geeks for Geeks: Iterator vs. Foreach in Java).
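
A common version of that mistake is calling the outer iterator's next() inside the inner loop, which advances the outer iterator once per inner pass instead of once per outer pass. The sketch below shows the nested for-each form, which leaves no next() call to misplace. The card data and the method allCards() are invented for this example:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NestedDemo {

    // Builds every suit/rank combination with nested for-each loops.
    static List<String> allCards(List<String> suits, List<String> ranks) {
        List<String> cards = new ArrayList<>();
        for (String suit : suits) {
            for (String rank : ranks) {
                cards.add(rank + " of " + suit);
            }
        }
        return cards;
    }

    public static void main(String[] args) {
        List<String> suits = Arrays.asList("hearts", "spades");
        List<String> ranks = Arrays.asList("ace", "king", "queen");
        List<String> cards = allCards(suits, ranks);
        System.out.println(cards.size());   // 6
        System.out.println(cards.get(0));   // ace of hearts
    }
}
```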