This article discusses java.util.EnumSet and java.util.EnumMap from Java’s standard libraries.

What are they?

EnumSet and EnumMap are compact, efficient implementations of the Set and Map interfaces. They have the constraint that their elements/keys come from a single enum type.

Like HashSet and HashMap, they are modifiable.

In contrast to HashSet, EnumSet:

  • Consumes less memory, usually.
  • Is faster at all the things a Set can do, usually.
  • Iterates over elements in a predictable order (the declaration order of the element type’s enum constants).
  • Rejects null elements.

In contrast to HashMap, EnumMap:

  • Consumes less memory, usually.
  • Is faster at all the things a Map can do, usually.
  • Iterates over entries in a predictable order (the declaration order of the key type’s enum constants).
  • Rejects null keys.
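
To make the ordering and null-handling points concrete, here is a small sketch. The class and method names (EnumCollectionBasics, iterationOrder, and so on) are made up for illustration; only the java.util and java.time types come from the JDK.

```java
import java.time.DayOfWeek;
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class EnumCollectionBasics {

  // Elements are added out of order but are iterated in declaration order.
  static List<DayOfWeek> iterationOrder() {
    Set<DayOfWeek> days = EnumSet.noneOf(DayOfWeek.class);
    days.add(DayOfWeek.FRIDAY);
    days.add(DayOfWeek.MONDAY);
    days.add(DayOfWeek.WEDNESDAY);
    return new ArrayList<>(days);
  }

  // EnumSet refuses null elements by throwing NullPointerException.
  static boolean rejectsNullElement() {
    Set<DayOfWeek> days = EnumSet.noneOf(DayOfWeek.class);
    try {
      days.add(null);
      return false;
    } catch (NullPointerException expected) {
      return true;
    }
  }

  // EnumMap refuses null keys in the same way (null values are still allowed).
  static boolean rejectsNullKey() {
    Map<DayOfWeek, String> names = new EnumMap<>(DayOfWeek.class);
    try {
      names.put(null, "nothing");
      return false;
    } catch (NullPointerException expected) {
      return true;
    }
  }
}
```

Calling iterationOrder() yields MONDAY, WEDNESDAY, FRIDAY even though FRIDAY was added first. A HashSet gives no such guarantee.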

If you’re wondering how this is possible, I encourage you to look at the source code:

  • EnumSet: A bit vector of the ordinals of the elements in the Set. This is an abstract superclass of RegularEnumSet and JumboEnumSet.
  • RegularEnumSet: An EnumSet whose bit vector is a single primitive long, which is enough to handle all enum types having 64 or fewer constants.
  • JumboEnumSet: An EnumSet whose bit vector is a long[] array, which is allocated with as many slots as are necessary for the given enum type. Two slots are allocated for 128 or fewer constants, three slots for 192 or fewer constants, etc.
  • EnumMap: A flat array of the Map’s values indexed by the ordinals of their keys.
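
The two ideas above (a bit vector of ordinals, and an array indexed by ordinal) can be sketched in a few lines. This is a simplified illustration, not the JDK source: the Color enum and the ColorSet and ColorMap classes are invented for the example, and the map sketch cannot tell an absent key from a null value.

```java
class OrdinalBackedSketch {

  enum Color { RED, GREEN, BLUE }

  // Set sketch: bit i of 'elements' is 1 when the constant with ordinal i is present.
  static final class ColorSet {
    private long elements;

    boolean add(Color c) {
      long before = elements;
      elements |= 1L << c.ordinal();
      return elements != before; // true only if the bit was newly set
    }

    boolean contains(Color c) {
      return (elements & (1L << c.ordinal())) != 0;
    }

    int size() {
      return Long.bitCount(elements);
    }
  }

  // Map sketch: slot i of 'values' holds the value mapped to the constant with ordinal i.
  static final class ColorMap<V> {
    private final Object[] values = new Object[Color.values().length];

    void put(Color key, V value) {
      values[key.ordinal()] = value;
    }

    @SuppressWarnings("unchecked")
    V get(Color key) {
      return (V) values[key.ordinal()];
    }
  }
}
```

Every operation here is a shift, a mask, or an array access. There is no hashing and no per-entry node object, which is where the memory and speed advantages come from.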

EnumSet and EnumMap cheat! They use privileged code like this:


/**
 * Returns all of the values comprising E.
 * The result is uncloned, cached, and shared by all callers.
 */
private static <E extends Enum<E>> E[] getUniverse(Class<E> elementType) {
    return SharedSecrets.getJavaLangAccess()
                        .getEnumConstantsShared(elementType);
}
                    

If you want all the Month constants, you might call Month.values(), giving you a Month[] array. There is a single backing array instance of those Month constants living in memory somewhere (a private field in the Class object for Month), but it wouldn’t be safe to pass that array directly to every caller of values(). Imagine if someone modified that array! Instead, values() creates a fresh clone of the array for each caller.

EnumSet and EnumMap get to skip that cloning step. They have direct access to the backing array.
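
The cloning behavior of values() is easy to observe from ordinary code. The following sketch (the ValuesCloneDemo class is invented for this example) shows that each call returns a distinct array and that scribbling on one copy affects nobody else.

```java
import java.time.Month;

class ValuesCloneDemo {

  static boolean returnsFreshArrayEachCall() {
    Month[] first = Month.values();
    Month[] second = Month.values();

    // Overwrite a slot in our own copy; the other copies are unaffected.
    first[0] = Month.DECEMBER;

    return first != second
        && second[0] == Month.JANUARY
        && Month.values()[0] == Month.JANUARY;
  }
}
```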

Effectively, no third-party versions of these classes can be as efficient. Third-party libraries that provide enum-specialized collections tend to delegate to EnumSet and EnumMap. It’s not that the library authors are lazy or incapable; delegating is the correct choice for them.

When should they be used?

Historically, Enum{Set,Map} were recommended as a matter of safety, taking better advantage of Java’s type system than the alternatives.

Prefer enum types and Enum{Set,Map} over int flags.

Effective Java goes into detail about this use case for Enum{Set,Map} and enum types in general. If you write a lot of Java code, then you should read that book and follow its advice.

Before enum types existed, people would declare flags as int constants. Sometimes the flags would be powers of two and combined into sets using bitwise arithmetic:


static final int OVERLAY_STREETS  = 1 << 0;
static final int OVERLAY_ELECTRIC = 1 << 1;
static final int OVERLAY_PLUMBING = 1 << 2;
static final int OVERLAY_TERRAIN  = 1 << 3;

void drawCityMap(int overlays) { ... }

drawCityMap(OVERLAY_STREETS | OVERLAY_PLUMBING);
                    

Other times the flags would start at zero and count up by one, and they would be used as array indexes:


static final int MONSTER_SLIME    = 0;
static final int MONSTER_GHOST    = 1;
static final int MONSTER_SKELETON = 2;
static final int MONSTER_GOLEM    = 3;

int[] kills = getMonstersSlain();

if (kills[MONSTER_SLIME] >= 10) { ... }
                    

These approaches got the job done for many people, but they were somewhat error-prone and difficult to maintain.

When enum types were introduced to the language, Enum{Set,Map} came with them. Together they were meant to provide better tooling for problems previously solved with int flags. We would say, “Don’t use int flags, use enum constants. Don’t use bitwise arithmetic for sets of flags, use EnumSet. Don’t use arrays for mappings of flags, use EnumMap.” This was not because the enum-based solutions were faster than int flags — they were probably slower — but because the enum-based solutions were easier to understand and implement correctly.
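
For comparison, here is roughly how the two int-flag examples above might look with enum types. This is a sketch: the Overlay and Monster enums, the sample kill counts, and the slewAtLeast helper are invented for illustration, and drawCityMap returns a description instead of drawing anything.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

class EnumFlagsDemo {

  enum Overlay { STREETS, ELECTRIC, PLUMBING, TERRAIN }

  enum Monster { SLIME, GHOST, SKELETON, GOLEM }

  // Replaces drawCityMap(int overlays): the parameter type now says what it holds.
  static String drawCityMap(Set<Overlay> overlays) {
    return "Drawing overlays " + overlays;
  }

  // Replaces the int[] indexed by MONSTER_* constants (sample data only).
  static Map<Monster, Integer> getMonstersSlain() {
    Map<Monster, Integer> kills = new EnumMap<>(Monster.class);
    kills.put(Monster.SLIME, 12);
    kills.put(Monster.GHOST, 3);
    return kills;
  }

  static boolean slewAtLeast(Map<Monster, Integer> kills, Monster monster, int count) {
    return kills.getOrDefault(monster, 0) >= count;
  }
}
```

A call such as drawCityMap(EnumSet.of(Overlay.STREETS, Overlay.PLUMBING)) cannot be handed an unrelated int by mistake, and the set prints as [STREETS, PLUMBING] rather than as the number 5.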

Fast forward to today, I don’t see many people using int flags anymore (though there are notable exceptions). We’ve had enum types in the language for more than a decade. We’re all using enum types here and there, we’re all using the collections framework. At this point, while Effective Java’s advice regarding Enum{Set,Map} is still valid, I think most people will never have a chance to put it into practice.

Today, we’re using enum types in the right places, but we’re forgetting about the collection types that came with them.

Prefer Enum{Set,Map} over Hash{Set,Map} as a performance optimization.

  • Prefer EnumSet over HashSet when the elements come from a single enum type.
  • Prefer EnumMap over HashMap when the keys come from a single enum type.

Should you refactor all of your existing code to use Enum{Set,Map} instead of Hash{Set,Map}? No.

Your code that uses Hash{Set,Map} isn’t wrong. Migrating to Enum{Set,Map} might make it faster. That’s it.

If you’ve ever used primitive collection libraries like fastutil or Trove, then it may help to think of Enum{Set,Map} like those primitive collections. The difference is that Enum{Set,Map} are specialized for enum types, not primitive types, and you can use them without depending on any third-party libraries.

Enum{Set,Map} don’t have identical semantics to Hash{Set,Map}, so please don’t make blind, blanket replacements in your existing code.

Instead, try to remember these classes for next time. If you can make your code more efficient for free, then why not go ahead and do that, right?

If you use IntelliJ IDEA, you can have it remind you to use Enum{Set,Map} with inspections:

  • Analyze – Run inspection by name – “Set replaceable with EnumSet” or “Map replaceable with EnumMap”

…or…

  • File – Settings – Editor – Inspections – Java – Performance issues – “Set replaceable with EnumSet” or “Map replaceable with EnumMap”

SonarQube can also remind you to use Enum{Set,Map}:

  • S1641: “Sets with elements that are enum values should be replaced with EnumSet”
  • S1640: “Maps with keys that are enum values should be replaced with EnumMap”

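For immutable versions of Enum{Set,Map}, see the following methods from Guava:

  • Sets.immutableEnumSet(elements)
  • Maps.immutableEnumMap(map)
  • Sets.toImmutableEnumSet()
  • Maps.toImmutableEnumMap(keyMapper, valueMapper)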

If you don’t want to use Guava, then wrap the modifiable Enum{Set,Map} instances in Collections.unmodifiableSet(set) or Collections.unmodifiableMap(map) and throw away the direct references to the modifiable collections.

The resulting collections may be less efficient when it comes to operations like containsAll and equals than their counterparts in Guava, which may in turn be less efficient than the raw modifiable collections themselves.
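
The JDK-only wrapping approach might look like the following sketch. The UnmodifiableEnumDemo class and its sample data are invented for illustration; the key point is that the modifiable collection is created and wrapped in one place and never stored anywhere else.

```java
import java.time.DayOfWeek;
import java.time.Month;
import java.util.Collections;
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

class UnmodifiableEnumDemo {

  static Set<Month> summerMonths() {
    // The EnumSet is wrapped immediately, so no direct reference to it escapes.
    return Collections.unmodifiableSet(
        EnumSet.of(Month.JUNE, Month.JULY, Month.AUGUST));
  }

  static Map<DayOfWeek, String> weekendAbbreviations() {
    Map<DayOfWeek, String> names = new EnumMap<>(DayOfWeek.class);
    names.put(DayOfWeek.SATURDAY, "Sat");
    names.put(DayOfWeek.SUNDAY, "Sun");
    // 'names' goes out of scope here; callers only ever see the wrapper.
    return Collections.unmodifiableMap(names);
  }
}
```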

Could the implementations be improved?

Since they can’t be replaced by third-party libraries, Enum{Set,Map} had better be as good as possible! They’re good already, but they could be better.

Enum{Set,Map} have missed out on potential upgrades since Java 8. New methods were added in Java 8 to Set and Map (or higher-level interfaces like Collection and Iterable). While the default implementations of those methods are correct, we could do better with overrides in Enum{Set,Map}.

This issue is tracked as JDK-8170826.

Specifically, these methods should be overridden:

  • {Regular,Jumbo}EnumSet.forEach(action)
  • {Regular,Jumbo}EnumSet.iterator().forEachRemaining(action)
  • {Regular,Jumbo}EnumSet.spliterator()
  • EnumMap.forEach(action)
  • EnumMap.{keySet,values,entrySet}().forEach(action)
  • EnumMap.{keySet,values,entrySet}().iterator().forEachRemaining(action)
  • EnumMap.{keySet,values,entrySet}().spliterator()

I put sample implementations on GitHub in case you’re curious what these overrides might look like. They’re all pretty straightforward.

Rather than walk through each implementation in detail, I’ll share some high-level observations about them.

  • The optimized forEach and forEachRemaining methods are roughly 50% better than the defaults (in terms of operations per second).
  • EnumMap.forEach(action) benefits the most, becoming twice as fast as the default implementation.
  • The iterable.forEach(action) method is popular. Optimizing it tends to affect a large audience, which increases the likelihood that the optimization (even if small) is worthwhile. (I’d claim that iterable.forEach(action) is too popular, and I’d suggest that the traditional enhanced for loop should be preferred over forEach except when the argument to forEach can be written as a method reference. That’s a topic for another discussion, though.)
  • The iterator.forEachRemaining(action) method is more important than it seems. Few people use it directly, but many people use it indirectly through streams. The default spliterator() delegates to the iterator(), and the default stream() delegates to the spliterator(). In the end, stream traversal may delegate to iterator().forEachRemaining(...). Given the popularity of streams, optimizing this method is a good idea!
  • The iterable.spliterator() method is critical when it comes to stream performance, but writing a custom Spliterator from scratch is a non-trivial task. I recommend this approach:
    • Check whether the characteristics of the default spliterator are correct for your collection. Oftentimes the defaults are too conservative; for example, EnumSet’s spliterator is currently missing the ORDERED, SORTED, and NONNULL characteristics. If they’re not correct, then provide a trivial override of the spliterator that uses Spliterators.spliterator(collection, characteristics) to define the correct characteristics.
    • Don’t go further than that until you’ve read through the implementation of that spliterator, and you understand how it works, and you’re confident that you can do better. In particular, your tryAdvance(action) and trySplit() should both be better. Write a benchmark afterwards to confirm your assumptions.
  • The map.forEach(action) method is extremely popular and is almost always worth overriding. This is especially true for maps like EnumMap that create their Entry objects on demand.
  • It’s usually possible to share code across the forEach and forEachRemaining methods. If you override one, you’re already most of the way there to overriding the others.
  • I don’t think it’s worthwhile to override collection.removeIf(filter) in any of these classes. For RegularEnumSet, where it seemed most likely to be worthwhile, I couldn’t come up with a faster implementation than the default.
  • Enum{Set,Map} could provide faster hashCode() implementations than the ones they currently inherit from AbstractSet and AbstractMap, but I don’t think that would be worthwhile. In general, I don’t think optimizing the hashCode() of collections is worthwhile unless it can somehow become a constant-time (O(1)) operation, and even then it is questionable. Collection hash codes aren’t used very often.
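
The "trivial override" of spliterator() recommended above can be sketched outside the JDK with a small view class. OrderedEnumSetView is a name invented for this example and is not one of the sample implementations mentioned earlier. It only demonstrates reporting richer characteristics through Spliterators.spliterator(collection, characteristics).

```java
import java.util.AbstractSet;
import java.util.EnumSet;
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;

class OrderedEnumSetView<E extends Enum<E>> extends AbstractSet<E> {

  private final EnumSet<E> delegate;

  OrderedEnumSetView(EnumSet<E> delegate) {
    this.delegate = delegate;
  }

  @Override
  public Iterator<E> iterator() {
    return delegate.iterator();
  }

  @Override
  public int size() {
    return delegate.size();
  }

  @Override
  public boolean contains(Object o) {
    return delegate.contains(o);
  }

  @Override
  public Spliterator<E> spliterator() {
    // Iteration follows declaration order, which is also the natural order of
    // enum constants, and null elements are impossible.
    return Spliterators.spliterator(
        this,
        Spliterator.DISTINCT
            | Spliterator.ORDERED
            | Spliterator.SORTED
            | Spliterator.NONNULL);
  }
}
```

The spliterator still traverses via iterator(), so this changes what stream operations are told about the source rather than how fast the traversal itself runs.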

Could the APIs be improved?

The implementation-level changes I’ve described are purely beneficial. There is no downside other than a moderate increase in lines of code, and the new lines of code aren’t all that complicated. (Even if they were complicated, this is java.util! Bring on the micro-optimizations.)

Since the existing code is already so good, though, changes of this nature have limited impact. Cutting one third or one half of the execution time from an operation that’s already measured in nanoseconds is a good thing but not game-changing. I suspect that those changes will cause exactly zero users of the JDK to write their applications differently.

The more tantalizing, meaningful, and dangerous changes are in the realm of the APIs.

I think that Enum{Set,Map} are chronically underused. They have a bit of a PR problem. Some developers don’t know these classes exist. Other developers know about these classes but don’t bother to reach for them when the time comes. It’s just not a priority for them. That’s totally understandable, but… There’s avoiding premature optimization and then there’s throwing away performance for no reason — performance nihilism? Maybe we can win their hearts with API-level changes.

No one should have to go out of their way to use Enum{Set,Map}. Ideally it should be easier than using Hash{Set,Map}. The EnumSet.allOf(elementType) method is a great example. If you want a Set containing all the enum constants of some type, then EnumSet.allOf(elementType) is the best solution and the easiest solution.

The high-level JDK-8145048 tracks a couple of ideas for improvements in this area. In the following sections, I expand on these ideas and discuss other API-level changes.

Add immutable Enum{Set,Map} (maybe?)

In a recent conversation on Twitter about JEP 301: Enhanced Enums, Joshua Bloch and Brian Goetz referred to theoretical immutable Enum{Set,Map} types in the JDK.

Joshua Bloch also discussed the possibility of an immutable EnumSet in Effective Java:

“The one real disadvantage of EnumSet is that it is not, as of release 1.6, possible to create an immutable EnumSet, but this will likely be remedied in an upcoming release. In the meantime, you can wrap an EnumSet with Collections.unmodifiableSet, but conciseness and performance will suffer.”

When he said “performance will suffer”, he was probably referring to the fact that certain bulk operations of EnumSet won’t execute as quickly when inside a wrapper collection (tracked as JDK-5039214). Consider RegularEnumSet.equals(object):


public boolean equals(Object o) {
    if (!(o instanceof RegularEnumSet))
        return super.equals(o);

    RegularEnumSet<?> es = (RegularEnumSet<?>)o;
    if (es.elementType != elementType)
        return elements == 0 && es.elements == 0;

    return es.elements == elements;
}
                    

It’s optimized for the case that the argument is another instance of RegularEnumSet. In that case the equality check boils down to a comparison of two primitive long values. Now that’s fast!

If the argument to equals(object) was not a RegularEnumSet but instead a Collections.unmodifiableSet wrapper, that code would fall back to its slow path.

Guava’s approach is similar to the Collections.unmodifiableSet one, although Guava does a bit better in terms of unwrapping the underlying Enum{Set,Map} and delegating to the super-fast optimized paths.

If your application deals exclusively with Guava’s immutable Enum{Set,Map} wrappers, you should get the full benefit of those optimized paths from the JDK. If you mix and match Guava’s collections with the JDK’s though, the results won’t be quite as good. (RegularEnumSet doesn’t know how to unwrap Guava’s ImmutableEnumSet, so a comparison in that direction would invoke the slow path.)

If immutable Enum{Set,Map} had full support in the JDK, however, it would not have those same limitations. RegularEnumSet and friends can be changed.

What should be done in the JDK?

I spent a long time and tested a lot of code trying to come up with an answer to this. Sadly the end result is:

I don’t know.

Personally, I’m content to use Guava for this. I’ll share some observations I made along the way.

Immutable Enum{Set,Map} won’t be faster than mutable Enum{Set,Map}.

The current versions of Enum{Set,Map} are really, really good. They’ll be even better once they override the defaults from Java 8.

Sometimes, having to support mutability comes with a tax on efficiency. I don’t think this is the case with Enum{Set,Map}. At best, immutable versions of these classes will be exactly as efficient as the mutable ones.

The more likely outcome is that immutable versions will come with a small penalty to performance by expanding the Enum{Set,Map} ecosystem.

Take RegularEnumSet.equals(object) for example. Each time we create a new type of EnumSet, are we going to change that code to add a new instanceof check for our new type? If we add the check, we make that code worse at handling everything except our new type. If we don’t add the check, we… still make that code worse! It’s less effective than it used to be; more EnumSet instances trigger the slow path.

Classes like Enum{Set,Map} have a userbase that is more sensitive to changes in performance than average users. If adding a new type causes some call site to become megamorphic, we might have thrown their carefully-crafted assumptions regarding performance out the window.

If we decide to add immutable Enum{Set,Map}, we should do so for reasons unrelated to performance.

As an exception to the rule, an immutable EnumSet containing all constants of a single enum type would be really fast.

RegularEnumSet sets such a high bar for efficiency. There is almost no wiggle room in Set operations like contains(element) for anyone else to be faster. Here’s the source code for RegularEnumSet.contains(element):


public boolean contains(Object e) {
    if (e == null)
        return false;
    Class<?> eClass = e.getClass();
    if (eClass != elementType && eClass.getSuperclass() != elementType)
        return false;

    return (elements & (1L << ((Enum<?>)e).ordinal())) != 0;
}
                    

If you can’t do contains(element) faster than that, you’ve already lost. Your EnumSet is probably worthless.

There is a worthy contender, which I’ll call FullEnumSet. It is an EnumSet that (always) contains every constant of a single enum type. Here is one way to write that class:



import java.util.function.Consumer;
import java.util.function.Predicate;

class FullEnumSet<E extends Enum<E>> extends EnumSet<E> {

  // TODO: Add a static factory method somewhere.
  FullEnumSet(Class<E> elementType, Enum<?>[] universe) {
    super(elementType, universe);
  }

  @Override
  @SuppressWarnings("unchecked")
  public Iterator<E> iterator() {
    // TODO: Avoid calling Arrays.asList.
    //       The iterator class can be shared and used directly.
    return Arrays.asList((E[]) universe).iterator();
  }

  @Override
  public Spliterator<E> spliterator() {
    return Spliterators.spliterator(
        universe,
        Spliterator.ORDERED |
            Spliterator.SORTED |
            Spliterator.IMMUTABLE |
            Spliterator.NONNULL |
            Spliterator.DISTINCT);
  }

  @Override
  public int size() {
    return universe.length;
  }

  @Override
  public boolean contains(Object e) {
    if (e == null)
      return false;

    Class<?> eClass = e.getClass();
    return eClass == elementType || eClass.getSuperclass() == elementType;
  }

  @Override
  public boolean containsAll(Collection<?> c) {
    if (!(c instanceof EnumSet))
      return super.containsAll(c);

    EnumSet<?> es = (EnumSet<?>) c;
    return es.elementType == elementType || es.isEmpty();
  }

  @Override
  @SuppressWarnings("unchecked")
  public void forEach(Consumer<? super E> action) {
    int i = 0, n = universe.length;
    if (i >= n) {
      Objects.requireNonNull(action);
      return;
    }
    do action.accept((E) universe[i]);
    while (++i < n);
  }

  @Override void addAll()               {throw uoe();}
  @Override void addRange(E from, E to) {throw uoe();}
  @Override void complement()           {throw uoe();}

  @Override public boolean add(E e)                          {throw uoe();}
  @Override public boolean addAll(Collection<? extends E> c) {throw uoe();}
  @Override public void    clear()                           {throw uoe();}
  @Override public boolean remove(Object e)                  {throw uoe();}
  @Override public boolean removeAll(Collection<?> c)        {throw uoe();}
  @Override public boolean removeIf(Predicate<? super E> f)  {throw uoe();}
  @Override public boolean retainAll(Collection<?> c)        {throw uoe();}

  private static UnsupportedOperationException uoe() {
    return new UnsupportedOperationException();
  }

  // TODO: Figure out serialization.
  //       Serialization should preserve these qualities:
  //         - Immutable
  //         - Full
  //         - Singleton?
  //       Maybe it's a bad idea to extend EnumSet?
  private static final long serialVersionUID = 0;
}
                    

FullEnumSet has many desirable properties. Of note:

  • contains(element) only needs to check the type of the argument to know whether it’s a member of the set.
  • containsAll(collection) is extremely fast when the argument is an EnumSet (of any kind); it boils down to comparing the element types of the two sets. It follows that equals(object) is just as fast in that case, since equals delegates the hard work to containsAll.
  • Since all the elements are contained in one flat array with no empty spaces, conditions are ideal for iterating and for splitting (splitting efficiency is important in the context of parallel streams).
  • It beats RegularEnumSet in all important metrics:
    • Query speed (contains(element), etc.)
    • Iteration speed
    • Space consumed

Asking for the full set of enum constants of some type is a very common operation. See: every user of values(), elementType.getEnumConstants(), and EnumSet.allOf(elementType). I bet the vast majority of those users do not modify (their copy of) that set of constants. A class that is specifically tailored to that use case has a good chance of being worthwhile.

Since it’s immutable, the FullEnumSet of each enum type could be a lazy-initialized singleton.

Should immutable Enum{Set,Map} reuse existing code, or should they be rewritten from scratch?

As I said earlier, the immutable versions of these classes aren’t going to be any faster. If they’re built from scratch, that code is going to look near-identical to the existing code. There would be a painful amount of copy and pasting, and I would not envy the people responsible for maintaining that code in the future.

Suppose we want to reuse the existing code. I see two general approaches:

  1. Do what Guava did, basically. Create unmodifiable wrappers around modifiable Enum{Set,Map}. Both the wrappers and the modifiable collections should be able to unwrap intelligently to take advantage of the existing optimizations for particular Enum{Set,Map} types (as in RegularEnumSet.equals(object)).
  2. Extend the modifiable Enum{Set,Map} classes with new classes that override modifier methods to throw UnsupportedOperationException. Optimizations that sniff for particular Enum{Set,Map} types (as in RegularEnumSet.equals(object)) remain exactly as effective as before without changes.

Of those two, I prefer the Guava-like approach. Extending the existing classes raises some difficult questions about the public API, particularly with respect to serialization.

What’s the public API for immutable Enum{Set,Map}? What’s the immutable version of EnumSet.of(e1, e2, e3)?

Here’s where I gave up.

  • Should we add public java.util.ImmutableEnum{Set,Map} classes?
  • If not, where do we put the factory methods, and what do we name them? EnumSet.immutableOf(e1, e2, e3)? EnumSet.immutableAllOf(Month.class)? Yuck! (Clever synonyms like “having” and “universeOf” might be even worse.)
  • Are the new classes instances of Enum{Set,Map} or do they exist in an unrelated class hierarchy?
  • If the new classes do extend Enum{Set,Map}, how is serialization affected? Do we add an “isImmutable” bit to the current serialized forms? Can that be done without breaking backwards compatibility?

Good luck to whoever has to produce the final answers to those questions.

That’s enough about this topic. Let’s move on.

Add factory methods

JDK-8145048 mentions the possibility of adding factory methods in Enum{Set,Map} to align them with Java 9’s Set and Map factories. EnumSet already has a varargs EnumSet.of(...) factory method, but EnumMap has nothing like that.

It would be nice to be able to declare EnumMap instances like this, for some reasonable number of key-value pairs:


Map<DayOfWeek, String> dayNames =
    EnumMap.of(
        DayOfWeek.MONDAY,    "lunes",
        DayOfWeek.TUESDAY,   "martes",
        DayOfWeek.WEDNESDAY, "miércoles",
        DayOfWeek.THURSDAY,  "jueves",
        DayOfWeek.FRIDAY,    "viernes",
        DayOfWeek.SATURDAY,  "sábado",
        DayOfWeek.SUNDAY,    "domingo");
                    

Users could use EnumMap’s copy constructor in conjunction with Java 9’s Map factory methods to achieve the same result less efficiently…


Map<DayOfWeek, String> dayNames =
    new EnumMap<>(
        Map.of(
            DayOfWeek.MONDAY,    "lunes",
            DayOfWeek.TUESDAY,   "martes",
            DayOfWeek.WEDNESDAY, "miércoles",
            DayOfWeek.THURSDAY,  "jueves",
            DayOfWeek.FRIDAY,    "viernes",
            DayOfWeek.SATURDAY,  "sábado",
            DayOfWeek.SUNDAY,    "domingo"));
                    

…but the more we give up efficiency like that, the less EnumMap makes sense in the first place. A reasonable person might start to question why they should bother with EnumMap at all — just get rid of the new EnumMap<>(...) wrapper and use Map.of(...) directly.

Speaking of that EnumMap(Map) copy constructor, the fact that it may throw IllegalArgumentException when provided an empty Map leads people to use this pattern instead:


Map<DayOfWeek, String> copy = new EnumMap<>(DayOfWeek.class);
copy.putAll(otherMap);
                    

We could give them a shortcut:


Map<DayOfWeek, String> copy = new EnumMap<>(DayOfWeek.class, otherMap);
                    

Similarly, to avoid an IllegalArgumentException from EnumSet.copyOf(collection), I see code like this:


Set<Month> copy = EnumSet.noneOf(Month.class);
copy.addAll(otherCollection);
                    

We could give them a shortcut too:


Set<Month> copy = EnumSet.copyOf(Month.class, otherCollection);
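
Until shortcuts like these exist, the two-step pattern can be tucked into small helpers. The sketch below is an assumption about how such helpers might be written in application code; EnumCopies, enumSetCopy, and enumMapCopy are invented names, not JDK APIs.

```java
import java.util.Collection;
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

class EnumCopies {

  // Stand-in for the proposed EnumSet.copyOf(elementType, collection).
  // Unlike EnumSet.copyOf(collection), this works for an empty source.
  static <E extends Enum<E>> EnumSet<E> enumSetCopy(
      Class<E> elementType, Collection<? extends E> source) {
    EnumSet<E> copy = EnumSet.noneOf(elementType);
    copy.addAll(source);
    return copy;
  }

  // Stand-in for the proposed new EnumMap<>(keyType, map).
  // Unlike new EnumMap<>(map), this works for an empty source.
  static <K extends Enum<K>, V> EnumMap<K, V> enumMapCopy(
      Class<K> keyType, Map<? extends K, ? extends V> source) {
    EnumMap<K, V> copy = new EnumMap<>(keyType);
    copy.putAll(source);
    return copy;
  }
}
```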
                    

Existing code may define mappings from enum constants to values as standalone functions. Maybe the users of that code would like to view those (function-based) mappings as Map objects.

To that end, we could give people the means to generate an EnumMap from a Function:


Locale locale = Locale.forLanguageTag("es-MX");

Map<DayOfWeek, String> dayNames =
    EnumMap.map(DayOfWeek.class,
                day -> day.getDisplayName(TextStyle.FULL, locale));

// We could interpret the function returning null to mean that the
// key is not present.  That would allow this method to support
// more than the "every constant is a key" use case while dropping
// support for the "there may be present null values" use case,
// which is probably a good trade.
                    

We could provide a similar factory method for EnumSet, accepting a Predicate instead of a Function:


Set<Month> shortMonths =
    EnumSet.filter(Month.class,
                   month -> month.minLength() < 31);
                    

This functionality could be achieved less efficiently and more verbosely with streams. Again, the more we give up efficiency like that, the less sense it makes to use Enum{Set,Map} in the first place. I acknowledge that there is a cost to making API-level changes like the ones I’m discussing, but I feel that we are solidly in the “too little API-level support for Enum{Set,Map}” part of the spectrum and not even close to approaching the opposite “API bloat” end.
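
As a rough idea of what the Function- and Predicate-based factories proposed above might do, here is a sketch written as ordinary static helpers with plain loops. EnumFactorySketch and its map and filter methods are hypothetical and exist only in this example. The map helper adopts the "null means absent" interpretation discussed earlier.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.function.Function;
import java.util.function.Predicate;

class EnumFactorySketch {

  // Hypothetical stand-in for the proposed EnumMap.map(keyType, function).
  static <K extends Enum<K>, V> EnumMap<K, V> map(
      Class<K> keyType, Function<? super K, ? extends V> function) {
    EnumMap<K, V> result = new EnumMap<>(keyType);
    for (K key : keyType.getEnumConstants()) {
      V value = function.apply(key);
      if (value != null) { // a null result is treated as "key not present"
        result.put(key, value);
      }
    }
    return result;
  }

  // Hypothetical stand-in for the proposed EnumSet.filter(elementType, predicate).
  static <E extends Enum<E>> EnumSet<E> filter(
      Class<E> elementType, Predicate<? super E> predicate) {
    EnumSet<E> result = EnumSet.noneOf(elementType);
    for (E element : elementType.getEnumConstants()) {
      if (predicate.test(element)) {
        result.add(element);
      }
    }
    return result;
  }
}
```

Code outside java.util has to go through getEnumConstants(), which clones the constants array on each call. A version inside the JDK could use the shared array instead.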

I don’t mean to belittle streams. There should also be more support for Enum{Set,Map} in the stream API.

Add collectors

Code written for Java 8+ will often produce collections using streams and collectors rather than invoking collection constructors or factory methods directly. I don’t think it would be outlandish to estimate that one third of collections are produced by collectors. Some of these collections will be (or could be) Enum{Set,Map}, and more could be done to serve that use case.

Collectors with these signatures should exist somewhere in the JDK:


public static <T extends Enum<T>>
Collector<T, ?, EnumSet<T>> toEnumSet(
    Class<T> elementType)

public static <T, K extends Enum<K>, U>
Collector<T, ?, EnumMap<K, U>> toEnumMap(
    Class<K> keyType,
    Function<? super T, ? extends K> keyMapper,
    Function<? super T, ? extends U> valueMapper)

public static <T, K extends Enum<K>, U>
Collector<T, ?, EnumMap<K, U>> toEnumMap(
    Class<K> keyType,
    Function<? super T, ? extends K> keyMapper,
    Function<? super T, ? extends U> valueMapper,
    BinaryOperator<U> mergeFunction)
                    

Similar collectors can be obtained from the existing collector factories in the Collectors class (specifically toCollection(collectionSupplier) and toMap(keyMapper, valueMapper, mergeFunction, mapSupplier)) or by using Collector.of(...), but that requires a little more effort on the users’ part, adding a little bit of extra friction to using Enum{Set,Map} that we don’t need.
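
For reference, this is one way the existing factories could be combined to get collectors with the proposed signatures. It is a sketch in user code: the EnumCollectors class is an invented name, and only the four-argument toEnumMap variant is shown.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class EnumCollectors {

  static <T extends Enum<T>> Collector<T, ?, EnumSet<T>> toEnumSet(
      Class<T> elementType) {
    // toCollection needs a supplier of empty containers, hence the Class argument.
    return Collectors.toCollection(() -> EnumSet.noneOf(elementType));
  }

  static <T, K extends Enum<K>, U> Collector<T, ?, EnumMap<K, U>> toEnumMap(
      Class<K> keyType,
      Function<? super T, ? extends K> keyMapper,
      Function<? super T, ? extends U> valueMapper,
      BinaryOperator<U> mergeFunction) {
    // The four-argument toMap is the only overload that accepts a map supplier.
    return Collectors.toMap(
        keyMapper, valueMapper, mergeFunction, () -> new EnumMap<>(keyType));
  }
}
```

A call site would then read stream.collect(EnumCollectors.toEnumSet(Month.class)) instead of spelling out the supplier lambda each time.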

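I referenced these collectors from Guava earlier in this article:

  • Sets.toImmutableEnumSet()
  • Maps.toImmutableEnumMap(keyMapper, valueMapper)
  • Maps.toImmutableEnumMap(keyMapper, valueMapper, mergeFunction)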

They do not require the Class object argument, making them easier to use than the collectors that I proposed. The reason the Guava collectors can do this is that they produce ImmutableSet and ImmutableMap, not EnumSet and EnumMap. One cannot create an Enum{Set,Map} instance without having the Class object for that enum type. In order to have a collector that reliably produces Enum{Set,Map} (even when the stream contains zero input elements to grab the Class object from), the Class object must be provided up front.

We could provide similar collectors in the JDK that would produce immutable Set and Map instances. For streams with no elements, the collectors would produce Collections.emptySet() or Collections.emptyMap(). For streams with at least one element, the collectors would produce an Enum{Set,Map} instance wrapped by Collections.unmodifiable{Set,Map}.

The signatures would look like this:


public static <T extends Enum<T>>
Collector<T, ?, Set<T>> toImmutableEnumSet()

public static <T, K extends Enum<K>, U>
Collector<T, ?, Map<K, U>> toImmutableEnumMap(
    Function<? super T, ? extends K> keyMapper,
    Function<? super T, ? extends U> valueMapper)

public static <T, K extends Enum<K>, U>
Collector<T, ?, Map<K, U>> toImmutableEnumMap(
    Function<? super T, ? extends K> keyMapper,
    Function<? super T, ? extends U> valueMapper,
    BinaryOperator<U> mergeFunction)
                    

I’m not sure that those collectors are worthwhile. I might never recommend them over their counterparts in Guava.
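
To show what the set-producing variant described above could look like, here is a minimal sketch built from existing JDK pieces. ImmutableEnumCollectors is an invented class name, and buffering the elements in a list first is simply the easiest way to defer the choice between Collections.emptySet() and a wrapped EnumSet. A real implementation would probably avoid the intermediate list.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class ImmutableEnumCollectors {

  static <T extends Enum<T>> Collector<T, ?, Set<T>> toImmutableEnumSet() {
    return Collectors.collectingAndThen(
        Collectors.<T>toList(),
        elements -> {
          if (elements.isEmpty()) {
            // No element to take the enum type from, so no EnumSet can be built.
            return Collections.<T>emptySet();
          }
          // EnumSet.copyOf reads the element type from the first element.
          Set<T> copy = EnumSet.copyOf(elements);
          return Collections.unmodifiableSet(copy);
        });
  }
}
```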

The StreamEx library also provides a couple of interesting enum-specialized collectors:

They’re interesting because they are potentially short-circuiting. With MoreCollectors.toEnumSet(elementType), when the collector can determine that it has encountered all of the elements of that enum type (which is easy — the set of already-collected elements can be compared to EnumSet.allOf(elementType)), it stops collecting. These collectors may be well-suited for streams having a huge number of elements (or having elements that are expensive to compute) mapping to a relatively small set of enum constants.

I don’t know how feasible it is to port these StreamEx collectors to the JDK. As I understand it, the concept of short-circuiting collectors is not supported by the JDK. Adding support may necessitate other changes to the stream and collector APIs.

Be navigable? (No)

Over the years, many people have suggested that Enum{Set,Map} should implement the NavigableSet and NavigableMap interfaces. Every enum type is Comparable, so it’s technically possible. Why not?

I think the Navigable{Set,Map} interfaces are a poor fit for Enum{Set,Map}.

Those interfaces are huge! Implementing Navigable{Set,Map} would bloat the size of Enum{Set,Map} by 2-4x (in terms of lines of code). It would distract them from their core focus and strengths. Supporting the navigable API would most likely come with a non-zero penalty to runtime performance.

Have you ever looked closely at the specified behavior of methods like subSet and subMap, specifically when they might throw IllegalArgumentException? Those contracts impose a great deal of complexity for what seems like undesirable behavior. Enum{Set,Map} could take a stance on those methods similar to Guava’s ImmutableSortedSet and ImmutableSortedMap: acknowledge the contract of the interface but do something else that is more reasonable instead…

I say forget about it. If you want navigable collections, use TreeSet and TreeMap (or their thread-safe cousins, ConcurrentSkipListSet and ConcurrentSkipListMap). The cross-section of people who need the navigable API and the efficiency of enum-specialized collections must be very small.
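
When navigation really is needed, copying into a TreeSet takes one line, because the natural ordering of enum constants is their declaration order. A small sketch follows; the NavigableEnumDemo class and monthsBefore method are invented for this example.

```java
import java.time.Month;
import java.util.NavigableSet;
import java.util.Set;
import java.util.TreeSet;

class NavigableEnumDemo {

  // Returns the months in 'months' that come strictly before 'limit'.
  static NavigableSet<Month> monthsBefore(Set<Month> months, Month limit) {
    NavigableSet<Month> sorted = new TreeSet<Month>(months);
    return sorted.headSet(limit, false);
  }
}
```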

There are few cases where the Comparable nature of enum types comes into play at all. In practice, I expect that the ordering of most enum constants is arbitrary (with respect to intended behavior).

I’ll go further than that; I think that making all enum types Comparable in the first place was a mistake.

  • Which ordering of Collector.Characteristics is “natural”, [CONCURRENT,UNORDERED] or [UNORDERED,CONCURRENT]?
  • Which is the “greater” Thread.State, WAITING or TIMED_WAITING?
  • FileVisitOption.FOLLOW_LINKS is “comparable” — to what? (There is no other FileVisitOption.)
  • How many instances of RoundingMode are in the “range” from FLOOR to CEILING?
    
    import java.math.RoundingMode;
    import java.util.EnumSet;
    import java.util.Set;
    
    class RangeTest {
      public static void main(String[] args) {
        Set<RoundingMode> range =
            EnumSet.range(RoundingMode.FLOOR,
                          RoundingMode.CEILING);
        System.out.println(range.size());
      }
    }
    
    // java.lang.IllegalArgumentException: FLOOR > CEILING
                                

There are other enum types where questions like that actually make sense, and those should be Comparable.

  • Is Month.JANUARY “before” Month.FEBRUARY? Yes.
  • Is TimeUnit.HOURS “larger” than TimeUnit.MINUTES? Yes.

Implementing Comparable or not should have been a choice for authors of individual enum types. To serve people who really did want to sort enum constants by declaration order for whatever reason, we could have automatically provided a static Comparator from each enum type:


Comparator<JDBCType> c = JDBCType.declarationOrder();
                    

It’s too late for that now. Let’s not double down on the original mistake by making Enum{Set,Map} navigable.

Conclusion

EnumSet and EnumMap are cool collections, and you should use them!

They’re already great, but they can become even better with changes to their private implementation details. I propose some ideas here. If you want to find out what happens in the JDK, the changes (if there are any) should be noted in JDK-8170826.

API-level changes are warranted as well. New factory methods and collectors would make it easier to obtain instances of Enum{Set,Map}, and immutable Enum{Set,Map} could be better-supported. I propose some ideas here, but if there are any actual changes made then they should be noted in JDK-8145048.