Tuesday, March 27, 2012

Refactoring to Java Enums

There are only two books (so far) I've ever read three times in full: Dune by Frank Herbert and The Pragmatic Progammer by Dave Thomas and Andy Hunt. From these you could rightly infer what I consider the pinnacle of SF and programming/software engineering literature. (Cryptonomicon will certainly join this list at some point.)

There is a much larger number of books I've read twice in full, so I'll spare you (and myself) the tedium of listing them here.

Then there are those "reference style" books where you have read parts of them dozens of times and the whole thing through at least once, maybe twice. Two that fall into that category for me are Josh Bloch's Effective Java and Martin Fowler's (et al.) Refactoring.

In fact, I recently just re-read Effective Java cover to cover and decided to read most of Refactoring again. The former, with the 2nd edition, has been updated to use many of the latest features of Java 5 and 6. The latter has not.


/* ---[ Java 5 enums: powerful, underutilized ]--- */

One of the themes in Effective Java (and there are many) is that Java enums are powerful and underutilized. For example, in Item 30 Bloch says:

Java's enum types are full-fledged classes, far more powerful than their counterparts in these other languages [C,C++,C#], where enums are essentially int values....

In addition to rectifying the deficiencies of int enums, enum types let you add arbitrary methods and fields and implement arbitrary interfaces. They provide high-quality implementations of all the Object methods, [and] they implement Comparable and Serializable...

So why would you want to add methods or fields to an enum type? For starters, you might want to associate data with its constants.... You can augment an enum type with any method that seems appropriate. An enum type can start life as a simple collection of enum constants and evolve over time into a full-featured abstraction.

While reading Chapter 1 of Refactoring, I felt the urge to update the example Fowler works through - not only to replace Vectors and Enumerations with Lists and Iterators, but in particular to explore the power of the Java enum.


/* ---[ Refactoring, Ch. 1, in brief ]--- */

So today I focus on the section of Ch. 1 where Fowler is extracting a switch statement into the State pattern and replacing a conditional with polymorphic classes.

You may want to peruse that chapter again to refresh yourself on the context (if you don't own the book and you are a programmer, you should obtain one post-haste), but I will review the part that is relevant to the refactoring of Fowler's code that I did.

Fowler starts with a Movie class that uses the old int constants style to doing enums to distinguish between three types of Movies that can be rented at a video rental store (and thus the example is showing its age in more ways than one).

package refactor;

public class Movie {
  public static final int REGULAR = 0;
  public static final int NEW_RELEASE = 1;
  public static final int CHILDRENS = 2;
  private String _title;
  private int _priceCode;

  public Movie(String title, int priceCode) {
    _title = title;
    _priceCode = priceCode;
  }

  public int getPriceCode() {
    return _priceCode;
  }

  public void setPriceCode(int arg) {
    _priceCode = arg;
  }

  public String getTitle (){
    return _title;
  };
}

At the starting point of the refactoring exercise, there are three classes: Customer, Rental and Movie. A Customer can have multiple Rentals and a Rental (in this case) is allowed to have only a single Movie. Only Customer has any business logic, the other classes are just value objects.

Fowler proceeds to refactor functionality out of Customer, first into Rental and then into Movie. In particular, since the Movie class has the "int enum" of REGULAR, CHILDRENS and NEW_RELEASE, he refactors Movie to provide two pieces of functionality that were originally in the Customer class:

  • getCharge: calculates the charge for renting the movie, which varies by Movie enum type
  • getFrequentRenterPoints: calculates the rental agency's bonus "frequent renter" points, which also varies by Movie enum type

So far he has basically been changing the location of a switch statement based on the int constants in Movie.

  switch (rental.getMovie().getPriceCode()) {
  case Movie.REGULAR:
    thisAmount += 2;
    if (rental.getDaysRented() > 2)
      thisAmount += (rental.getDaysRented() - 2) * 1.5;
    break;
  case Movie.NEW_RELEASE:
    thisAmount += rental.getDaysRented() * 3;
    break;
  case Movie.CHILDRENS:
    thisAmount += 1.5;
    if (rental.getDaysRented() > 3)
      thisAmount += (rental.getDaysRented() - 3) * 1.5;
    break;
  }    

After he has the switch statement in the right class, now he works to replace the conditional with polymorphism. He decides that the right way to do it is not to make subclasses of Movie (e.g., RegularMovie, ChildrensMovie, etc.), because a specific movie can change its type while the application is running (e.g., from "New Release" to "Regular"). Thus, he uses the Replace Type Code with State/Strategy refactoring pattern.

In the end, he creates an abstract Price state class that is used by Movie to implement the state-dependent methods getCharge and getFrequentRenterPoints. Here is the UML diagram:

State Pattern Using Price class

Here is the code for the abstract Price class and its concrete subclasses:

// in Price.java file
public abstract class Price {
  public abstract int getPriceCode();
  public abstract double getCharge(int daysRented);
  public int getFrequentRenterPoints(int daysRented) {
    return 1;
  }
}

// in ChildrensPrice.java file
public class ChildrensPrice extends Price {

  @Override
  public int getPriceCode() {
    return Movie.CHILDRENS;
  }

  @Override
  public double getCharge(int daysRented) {
    double result = 1.5;
    if (daysRented > 3)
      result += (daysRented - 3) * 1.5;
    return result;
  }

}

// in NewReleasePrice.java file
public class NewReleasePrice extends Price {

  @Override
  public int getPriceCode() {
    return Movie.NEW_RELEASE;
  }

  @Override
  public double getCharge(int daysRented) {
    return daysRented * 3;
  }

  @Override
  public int getFrequentRenterPoints(int daysRented) {
    if (daysRented > 1)
      return 2;
    else
      return 1;
  }
}

// in RegularPrice.java file
public class RegularPrice extends Price {

  @Override
  public int getPriceCode() {
    return Movie.REGULAR;
  }

  @Override
  public double getCharge(int daysRented) {
    double result = 2;
    if (daysRented > 2)
      result += (daysRented - 2) * 1.5;
    return result;
  }
}

Note that Price is an abstract class, rather than an interface. Because the getFrequentRenterPoints method is shared between two of the subclasses, it has been pulled up into the abstract base class. When I refactor this code further, you'll see that we can duplicate this behavior with an enum as well.

In the end, Fowler has achieved a clean design that adheres to the open/closed principle. It is open for extension by adding new subclasses to Price and it is closed for modification in that one never needs to modify the Price class in order to add a new type or state.


/* ---[ Refactoring to use Java 5 enums ]--- */

It is easy to see how to use a Java 5 enum for the original code (before Fowler's refactoring). It would just be a matter of replacing this:

public class Movie {
  public static final int REGULAR = 0;
  public static final int NEW_RELEASE = 1;
  public static final int CHILDRENS = 2;

  ...
}

with this:

public class Movie {

  public enum Price {
    REGULAR, NEW_RELEASE, CHILDRENS;
  }
  ...
} 

... and then refactoring the dependent classes to refer to Movie.Price.REGULAR, etc.

And that is how most people, from what I've seen and read, use Java enums. But as Bloch said earlier, Java enums are much more powerful than that. They are type-safe full-fleged Java immutable (final) classes where each enum entry (REGULAR, NEW_RELEASE, etc.) is a singleton for that entry. They can have constructors, implement arbitrary methods and implement interfaces. In fact, you can even put abstract methods on the "base" enum to require the concrete enum singleton entries to implement a method (as we'll see in my example below).

Here is the end product of my refactoring starting from Fowler's end product:

public enum Price {
  REGULAR {
    @Override
    public double getCharge(int daysRented) {
      double result = 2;
      if (daysRented > 2)
        result += (daysRented - 2) * 1.5;
      return result;
    }
  }, 
  CHILDRENS {
    @Override
    public double getCharge(int daysRented) {
      double result = 1.5;
      if (daysRented > 3)
        result += (daysRented - 3) * 1.5;
      return result;
    }
  }, 
  NEW_RELEASE{
    @Override
    public double getCharge(int daysRented) {
      return daysRented * 3;
    }

    @Override
    public int getFrequentRenterPoints(int daysRented) {
      if (daysRented > 1) return 2;
      else        return 1;
    }
  };

  public abstract double getCharge(int daysRented);

  /**
   * Default implementation of getFrequentRenterPoints for all 
   * types in the enum. May be overridden by specific enum types
   * if they give fewer or higher numbers of bonus points.
   * 
   * @param daysRented number of days the movie was rented
   * @return number of bonus points
   */
  public int getFrequentRenterPoints(int daysRented) {
    return 1;
  }
}

If you haven't used the advanced features of enums, this may look surprising.

First, notice that you can declare methods in the "main body" of the enum, and that they can even be abstract. In this case, I have created the getCharge and getFrequentRenterPoints methods. This means that all specific (singleton) entries of the enum have these methods. In the case of the abstract method, the compiler will require you to implementat that method in each entry body.

Which brings me to the second point - enum entries can have bodies that are specific to that entry and not shared with the other entries. Methods outside the entries are common to all entries and methods inside an entry are specific to the entry.

The zone of "inside an enum entry" is demarcated by a matching pair of curly braces after the declaration of the entry's name (e.g., REGULAR). This is just like the notation for a class body.

You create specific data fields and methods inside the enum entry body. In this case these entries don't have any state to retain (their name is the representation of the state needed), so I only have methods, not fields.

If one did need fields, how you would populate them with user/client-provided data? You would define a constructor, which would go in the section outside the entries (but inside the enum class body of course). Effective Java has a nice example of when you might need to have enum constructors.


/* ---[ Using the Price enum ]--- */

If you scroll back up and review the UML diagram, you'll see that the Movie class has both getCharge and getFrequentRenterPoints methods. In my refactoring the Movie class does the same thing. The only difference is that Price is an enum, not a regular Java class.

public class Movie {
  private String _title;
  private Price priceCode;

  public Movie(String title, Price priceCode) {
    _title = title;
    this.priceCode = priceCode;
  }

  public Price getPriceCode() {
    return priceCode;
  }

  public String getTitle() {
    return _title;
  }

  double getCharge(int daysRented) {
    return priceCode.getCharge(daysRented);
  }

  public int getFrequentRenterPoints(int daysRented) {
    return priceCode.getFrequentRenterPoints(daysRented);
  } 
}

Users of the Movie class would then need references the standalone Price enum when creating a Movie:

Movie m1 = new Movie("Grease", Price.REGULAR);
Movie m2 = new Movie("Cars", Price.CHILDRENS);


/* ---[ Analysis ]--- */

So is this better? When are enums more appropriate?

First of all, Java 5 enums are always more appropriate than using the "int enum" pattern. Their primary purpose is to replace that pattern. They bring type-safety to those constants, they bring a toString() method that prints out their name, they implement Comparable and Serializable and allow for the addition of arbitrary behavior.

Java 5 enums are appropriate when you need singleton behavior from each entry - only one copy of the REGULAR, CHILDREN and NEW_RELEASE enum objects will ever be created. In fact, they are such perfect singleton implementations, both in terms of proper initialization and thread-safety, that Josh Bloch recommends them as the best way to implement the Singleton pattern in Java now. If you want it to be a true singleton, then you only create one "entry", which you might call INSTANCE, like so:

public enum MySingleton {
  INSTANCE {
    // put instance fields and methods here
  };
}

Creating a singleton with guaranteed creation thread-safety and no known ways to create two (such as by Serialization attacks) has never been so easy.

Of course, as singletons, enums inherit some well-known pitfalls of the Singleton pattern, including the fact that they are difficult to test, as you can't dependency inject a mock or stub version of them. With a good dependency injection framework, like Spring, singletons are frequently not needed any more.

In any case, an enum with extra behavior is appropriate when you need behavior without state or where the state of each enum entry would the be same for any objects using them. In the case of my refactored version, the only state the Price enum needs is to know its type (which is what the enum is for) and do some calculations based on that state. There is only a need for one of them in the entire app, no matter how many Movie objects I need to create, so using a set of singleton immutable enums works. And the behavior of those enums does not leverage any external dependencies or resources - they just do some simple arithmetic and return an answer, so I don't need to mock or stub them for testing.

So finally, what about the open/closed principle? Well, with enums in my refactored version, one could argue that they do violate this principle. It isn't open for extension, since enums are final and cannot be subclassed. And it isn't closed for modification, as new entries of the enum cannot be created without editing the Price enum java source code file directly.

True critique. Enums do not support the open/closed principle and thus should only be used in situations where you have (and preferably own) the source code and can modify it yourself or when you actively want to prevent anyone from creating any other types. It is closed/closed intentionally.

But is my refactoring really worse than Fowler's on this criterion? Actually no, because even though his Price class follows the open/closed principle, the Movie class does not - it still expects its clients to indirectly reference the Price class via its int constants. To add a new int constant one would have to edit the Movie java file source code. So Fowler's overall refactored design doesn't follow the open/closed principle either.

If we truly wanted to follow that principle, we would have to refactor to some third design. I'll leave that as an exercise for any reader that might be interested in trying that out.


/* ---[ Antipattern: double bad ]--- */

A final side note: Fowler's example code has a smell in it that he didn't correct: using double to handle monetary values. A refactoring should also be done to use long, BigDecimal or a self-constructed Money class instead.

And that's the joy of refactoring - with new language features evolving and the list of code smells you are aware of growing, you will frequently be able to go back to old code and find a way to improve it.

No comments:

Post a Comment