|
Mark Davis / http://www.macchiato.com/ |
Durable Java | Immutables | Abstraction | Serialization | Liberté, Égalité, Fraternité | Hashing and Cloning
|
Design your code from the start to be durable--so it can evolve without breaking your clients' code.
Dr. Mark Davis is lead architect at IBM's Center for Java Technology, Silicon Valley, co-founder and president of the Unicode Consortium, and architect for the bulk of JDK1.1 internationalization. |
During the recent holidays, my extended family played a great set of Dictionary games.
For those of you who have never played it, the rules are pretty simple. All you need is an unabridged dictionary, pencils, and paper. In each round, one person is the dealer. He looks up an obscure word in the dictionary and writes down the definition. Each of the other players writes down a counterfeit definition and gives it to the dealer. The dealer then reads off all the definitions, and all the other players vote for one (other than their own).
For each vote for a counterfeit definition, the dealer and the counterfeit's author each get a point. For each vote for the right definition, the voter gets one point. So the strategy is for the dealer to pick a word with the most implausible real definition, and the other players to write the most plausible counterfeit definition. You get some really creative counterfeits--and some very funny real words.
So let's take an example: which of the following looks the most plausible?
im·mu·ta·ble /i(m)·'myü·tə·bəl/ adj. Not capable of or susceptible to change. [Middle English, from Latin immutabilis, from in- + mutabilis mutable. 15th century.]
That by two immutable things, in which it was impossible for God to lie, we might have a strong consolation. Heb. vi. 18.
im·mu·ta·ble /i(m)·'myü·tə·bəl/ adj. Not capable of being mute. [extension of Middle English muet, from Middle French, from Old French mu, from Latin mutus, probably from mu, representation of a muttered sound. 19th century.]
Percy blathered on and on despite Lizzy's best efforts to restrain him. "That man is immutable!" she thought. Jane Austen
For Java, the correct answer (the first definition) is very important, since the concept of immutable objects is crucial to designing durable APIs--and also a major factor in performance.
So what are immutable objects? They are objects that cannot be changed once they are created. There are a number of significant advantages to them:
The main disadvantage of immutable objects can be performance, so it is important to know when and where to use them.
We don't have to look far for examples: the Java API provides two very similar classes, one immutable and one mutable. String was designed to be immutable, and thus simpler and safer. StringBuffer was designed to be mutable, with tremendous advantages in performance. For example, look at the following code, where one uses string concatenation and the other uses the StringBuffer append method. (The doSomething() method is overloaded to take characters as either Strings or StringBuffers.)
| Case 1 |
for (int i = 0; i < source.length; ++i) {
    doSomething(i + ": " + source[i]);
}
| Case 2 |
StringBuffer temp = new StringBuffer();
for (int i = 0; i < source.length; ++i) {
    temp.setLength(0); // clear previous contents
    temp.append(i).append(": ").append(source[i]);
    doSomething(temp);
}
Behind your back, the compiler will optimize Case 1, but it still involves the creation of two objects per iteration. Case 2, on the other hand, avoids any object creation in the loop. Because of this, Case 2 can be over 1000% faster than Case 1 (depending on your JIT, of course). So let's take a little time out to talk about object creation.
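For readers who want to try the two cases, here is a minimal, self-contained sketch. The names LabelDemo, labelsByConcat, and labelsByBuffer are hypothetical, and the results are collected into lists only so the two approaches can be compared. It shows that both cases produce the same text; it makes no claim about the speed difference, which depends on your VM.

```java
// Sketch only: LabelDemo and its methods are hypothetical names.
import java.util.ArrayList;
import java.util.List;

class LabelDemo {
    // Case 1 style: each iteration builds a fresh String via concatenation.
    static List<String> labelsByConcat(String[] source) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < source.length; ++i) {
            out.add(i + ": " + source[i]);
        }
        return out;
    }

    // Case 2 style: one StringBuffer is cleared and refilled on every iteration.
    static List<String> labelsByBuffer(String[] source) {
        List<String> out = new ArrayList<>();
        StringBuffer temp = new StringBuffer();
        for (int i = 0; i < source.length; ++i) {
            temp.setLength(0); // clear previous contents
            temp.append(i).append(": ").append(source[i]);
            // toString() allocates a String; it is here only to record the result.
            out.add(temp.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        String[] source = {"alpha", "beta"};
        System.out.println(labelsByConcat(source));
        System.out.println(labelsByBuffer(source));
    }
}
```

In real code the buffer would be handed straight to a consumer such as doSomething(temp), so the snapshot taken here would not be needed.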
Many articles have been written about just how bad excessive object creation can be for your performance. Object creation alone will represent a significant proportion of the CPU time taken by your program. This time is so important that for most programs it is a valuable proxy for the overall speed--when doing performance analysis and enhancements, reducing object creation by 10% generally corresponds to a 10% increase in speed!
The lifetime cost of an allocated object includes a number of items you don't often consider. Think in terms of the lifetime cost of object creation, not just creation time alone:
Even though the lifetime cost may not be noticeable when your program is run in isolation, it can still matter. Nowadays, there are more and more programs running concurrently on the same machine. If your program eats up extra CPU cycles, it can affect the overall performance of the user's machine. While garbage collection may operate during lulls in your threads' operation, it will still affect the performance of other programs running concurrently on your machine.
One of the goals of your API should be to not require unnecessary object creation. Let's take a look at a simple example from Component to see how to do this. (The implementation given here is simplified slightly.)
public Dimension getSize() {
    return new Dimension(width, height);
}
Since the method has a functional return, I can have a complex expression involving the return value from this function. For example, here we find the horizontal position of a piece of text which is centered in a component.
x = (myComponent.getSize().width - metrics.stringWidth(s))/2;
|
Other examples of chaining functional returns are:
|
If Dimension were an immutable class, then a functional return of an object in a getter would have no cost. (Presumably, it would just be one of the component's fields, which is simply returned.) By being immutable, this would be a safe operation--the caller couldn't corrupt the internals of the component.
However, Dimension is not immutable, so Component is forced to create a new object each time this method is called. In this particular case, the drain on performance is probably small enough to be acceptable--but often it isn't.
One option for not forcing object creation is to make the Dimension an output parameter instead:
public void getSize(Dimension output) {
    output.width = width;
    output.height = height;
}
This avoids the requirement of object creation--or rather, it puts the responsibility on the caller. So why is this more efficient? Because the caller can then reuse the same Dimension object for a whole slew of code.
However, there are a lot of circumstances where a functional return is very handy for the programmer. Even where we have allocated a temporary Dimension previously, many people dislike breaking up the code into multiple statements. Look at how an output parameter changes our previous example:
myComponent.getSize(temp);
x = (temp.width - metrics.stringWidth(s))/2;
One way to get around this is to use the pass-in, pass-out idiom. With this idiom, you return the output parameter from the routine. The implementation becomes:
public Dimension getSize(Dimension output) {
    output.width = width;
    output.height = height;
    return output;
}
You can then provide your original API as a convenience method. If you make it final, any good JIT will compile out any overhead.
public final Dimension getSize() {
    return getSize(new Dimension());
}
Your clients then have the speed advantages of not forcing object creation, and the convenience of functional returns. (The cost of having a return value is very small.) The example becomes the following, if someone is very concerned about performance (anyone not concerned with performance can continue to use the original API).
x = (myComponent.getSize(temp).width - metrics.stringWidth(s))/2;
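To see where the saving comes from, here is a small sketch of the pass-in, pass-out idiom used in a loop. Size and Widget are hypothetical stand-ins for Dimension and Component, and centeredOffsets is an invented helper; the point is only that one scratch object serves every call.

```java
// Sketch only: Size, Widget, and WidgetDemo are hypothetical stand-ins.
class Size {
    int width;
    int height;
}

class Widget {
    private final int width;
    private final int height;

    Widget(int width, int height) {
        this.width = width;
        this.height = height;
    }

    // Pass-in, pass-out: fill the caller's object and hand it back for chaining.
    Size getSize(Size output) {
        output.width = width;
        output.height = height;
        return output;
    }

    // Convenience form: allocates a new Size on every call.
    final Size getSize() {
        return getSize(new Size());
    }
}

class WidgetDemo {
    // Computes the centering offset for each widget, reusing one scratch Size.
    static int[] centeredOffsets(Widget[] widgets, int contentWidth) {
        Size temp = new Size(); // allocated once, outside the loop
        int[] offsets = new int[widgets.length];
        for (int i = 0; i < widgets.length; ++i) {
            offsets[i] = (widgets[i].getSize(temp).width - contentWidth) / 2;
        }
        return offsets;
    }
}
```

The caller must not hold on to temp across calls and expect the old values to survive; that is the price of reusing it.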
|
|
When setters have functional returns, the normal return value is the main object itself, e.g. return this. For example, StringBuffer.append(...) returns itself, which allows functional chaining such as: result.append(", ").append(s); |
|---|
Let's summarize some of the main points about object creation before we turn back to immutables.
We saw that immutable objects are easier to return. So do they solve all our problems? Unfortunately not. The big disadvantage of immutables is: to make one with a different value, you have to create a whole new object.
For example, let's set two fields from different objects to the value 3.
myDimension.width = 3;
myColor = new Color(myColor.getRed(),
                    myColor.getGreen(),
                    3);
Dimension is mutable, and can just be set directly. Color is not, and a whole new object must be created. For a feeling of what that represents, look at a Mandelbrot program drawing on the screen. Since essentially every point displayed is a different color, repainting a 200 x 200 window will create (and eventually destroy) 40,000 objects.
The first step in making an object immutable is to make sure that there are no setters: only the constructor will change the values of any fields. Next, make all fields private. This is good advice for any class, but is crucial for immutables so that subclasses and other classes in the same package can't muck with your fields. There is one exception to this rule: you might have a lazy-evaluated field, such as a cache, whose value changes after construction time. This is ok as long as (1) semantically the object always behaves the same after construction, and (2) all accesses to that field internally are synchronized.
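These rules can be sketched in a few lines. The class Name below is hypothetical; declaring the class and its fields final goes beyond what the text asks for and is my own assumption about good practice. The cached field is the one exception described above: it changes after construction, but only inside a synchronized method, and callers always see the same value.

```java
// Sketch only: Name is a hypothetical class, not part of any library.
final class Name {
    private final String first;
    private final String last;
    private String cachedFull; // lazily computed; only touched while holding the lock

    Name(String first, String last) {
        this.first = first;
        this.last = last;
    }

    // Getters only; there are no setters.
    String getFirst() { return first; }
    String getLast() { return last; }

    // The cache is filled on first use, but the visible value never changes.
    synchronized String getFull() {
        if (cachedFull == null) {
            cachedFull = first + " " + last;
        }
        return cachedFull;
    }
}
```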
Creating a safe, truly immutable object can be expensive in some circumstances. If any getter returns a mutable field, that return value must be cloned. Otherwise the object is not truly immutable. Look at the following example:
public class AClass {
    public int[] getItems() {
        return items; // error
    }
    private int[] items;
}
...
int[] myItems = aClass.getItems();
myItems[1] = 0; // modifies aClass!
Dealing with arrays is particularly unpleasant, since there is no good way to make an immutable version of an array. If you pass back an array--or any other mutable--from an immutable class, you have to clone it.
return (int[])items.clone(); // corrected
Cloning getter results may be obvious to you. What may not be obvious is that when any mutable objects are passed in to the constructor, they must also be cloned. Otherwise the object is not truly immutable. Take this example:
public class AClass {
    public AClass(int count, int[] items) {
        this.count = count;
        this.items = items; // error
    }
    private int count;
    private int[] items;
}
...
int[] myItems = {1, 2, 3};
AClass aClass = new AClass(2, myItems);
myItems[1] = 0; // modifies aClass!
To fix this, the bad line has to be changed to:
this.items = (int[])items.clone(); // corrected
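Putting the two corrections together gives a class that copies on the way in and on the way out. This is a minimal sketch; ItemList is a hypothetical name for a corrected AClass, and the final modifiers are my addition.

```java
// Sketch only: ItemList is a hypothetical name for the corrected AClass.
final class ItemList {
    private final int[] items;

    ItemList(int[] items) {
        this.items = items.clone(); // copy on the way in
    }

    int[] getItems() {
        return items.clone(); // copy on the way out
    }
}
```

With both copies in place, neither the array the caller passed in nor the array the caller got back can be used to reach the object's internal state.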
Cloning is not always an easy answer, either. Many mutable classes do not support a safe clone; even after cloning, changes to the original object can damage the cloned object, or vice versa. For example, Vector.clone() is not safe. You have to look at each class in question to make sure that the clone method is safe. If it is not safe, then you can either avoid using the class in construction or getters, or write your own safe clone (where possible). This touches on the topics of deep-cloning and object ownership, which I'll take up in a future column.
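One way to read "not safe" here is that Vector.clone() is shallow: the clone gets its own list structure, but the elements themselves are shared. The sketch below, with the hypothetical name ShallowCloneDemo, shows a change made through the original becoming visible through the clone when the elements are mutable.

```java
// Sketch only: ShallowCloneDemo is a hypothetical name.
import java.util.Vector;

class ShallowCloneDemo {
    // Returns true when a change made through the original shows up in the clone.
    @SuppressWarnings("unchecked")
    static boolean cloneSharesElements() {
        Vector<int[]> original = new Vector<>();
        original.add(new int[] {1, 2, 3});
        Vector<int[]> copy = (Vector<int[]>) original.clone();
        original.get(0)[1] = 0;       // mutate an element via the original
        return copy.get(0)[1] == 0;   // the clone holds the very same array
    }

    public static void main(String[] args) {
        System.out.println(cloneSharesElements());
    }
}
```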
|
|
Notice that partial immutability is like bad home insurance--worse than nothing! If a class purports to be immutable but is not, then you don't take the precautions that you would with a mutable object, producing nasty, intermittent bugs. You depend upon it, but it fails you just when you need it. |
|---|
If Java had the const keyword, as in C or C++, immutability would not be much of an issue. However, it doesn't, it probably never will, and there is no use whining about it. We just have to make do.
|
|
For those of you familiar with C or C++, an immutable class is actually much stronger than const. With a const parameter, for example, you can't change the parameter, but you have no idea whether someone else can behind your back. Moreover, const can always be cast away, making your code fragile. Warning: static final is not the same as const. Look at the following!
public static final Point ORIGIN = new Point(0,0);
void someMethod() {
    ORIGIN.x = 3; // alters ORIGIN--no problem!
}
|
|---|
For the clients of your class, the best way to address the issue is for you to supply both kinds of objects (mutable and immutable), with a common base class. This can supply the advantages of both to your clients. For example, suppose we had the following.
public class ColorBase {
    public int getRed() {...}
    ...
}
public class Color extends ColorBase {
    // current API, plus...
    public Color(ColorBase other) {...}
    ...
}
public class ColorBuffer extends ColorBase {...
    public void setRed(int red) {...}
    ...
    public void set(ColorBase color) {...}
    ...
}
|
|
Important! Don't make the mistake of supplying a base class that purports to be immutable with a subclass that is not. For example, suppose that String were not final, and had subclasses like StringBuffer. If I got passed an object of type String, I wouldn't know whether the actual object was immutable or some mutable subclass that someone else could change behind my back. I don't know whether I need to clone the object for thread-safety or not! |
|---|
Providing this structure involves adding a constructor to Color that takes a ColorBase. The new class ColorBuffer should supply setters for all the fields that you were able to set with the constructor for Color. In addition, it should supply set(ColorBase color). It may optionally supply overloads of set that take some of the sets of constructor arguments for Color, such as:
set(int red, int green, int blue);
set(float red, float green, float blue);
With these three classes, clients could then return (or convert to) Color when they needed immutability, and use ColorBuffer if not. Methods in other classes that didn't care about immutability or mutability could take parameters of type ColorBase. This framework allows for using immutables where necessary for safety, mutables where necessary for speed, and avoiding unnecessary object creation and conversion. The Java class library itself could allow programs to be written with much better performance if this were done: String and StringBuffer are the key example. We'll go into this in more detail in the future.
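As a rough illustration of the three-class layout, here is one way it might be filled in. This is a sketch under assumptions: the Color here is a simplified stand-in rather than java.awt.Color, the base class is made abstract so that it promises nothing about mutability, and the setters return this to allow chaining. None of these choices is dictated by the design described above.

```java
// Sketch only: method bodies and the abstract getters are assumptions.
abstract class ColorBase {
    abstract int getRed();
    abstract int getGreen();
    abstract int getBlue();
}

// Immutable: final class, final fields, no setters.
final class Color extends ColorBase {
    private final int red;
    private final int green;
    private final int blue;

    Color(int red, int green, int blue) {
        this.red = red;
        this.green = green;
        this.blue = blue;
    }

    // Conversion constructor: copies the values out of any ColorBase.
    Color(ColorBase other) {
        this(other.getRed(), other.getGreen(), other.getBlue());
    }

    @Override int getRed() { return red; }
    @Override int getGreen() { return green; }
    @Override int getBlue() { return blue; }
}

// Mutable counterpart: a reusable scratch object, in the spirit of StringBuffer.
final class ColorBuffer extends ColorBase {
    private int red;
    private int green;
    private int blue;

    ColorBuffer set(int red, int green, int blue) {
        this.red = red;
        this.green = green;
        this.blue = blue;
        return this; // returning this allows chaining
    }

    ColorBuffer set(ColorBase color) {
        return set(color.getRed(), color.getGreen(), color.getBlue());
    }

    ColorBuffer setBlue(int blue) {
        this.blue = blue;
        return this;
    }

    @Override int getRed() { return red; }
    @Override int getGreen() { return green; }
    @Override int getBlue() { return blue; }
}
```

A caller can then work in a ColorBuffer, take a Color snapshot with new Color(buffer) when it needs a value that will not change, and keep mutating the buffer without affecting the snapshot.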
|
|
I recommend the simple name "set" for setters corresponding to a constructor. The Java API names for these setters are all over the map. For example: myPoint.setLocation(otherPoint); myRectangle.setBounds(otherRectangle); myDimension.setSize(otherDimension); |
|---|
"What? You've been saying all along that you can't modify them!"
That's right. You don't really modify them. But often you want to produce the effect of modifying them by creating a new object based on them, with some modification. For example, to add a character to a string, I set the variable to point to a new string created by using the old one:
str = str + 'a';
In this simple (and common) case, I don't care that the object formerly pointed to by the variable str has not changed. As far as I am concerned, str now has an 'a' at the end, where it didn't before. Similarly, with BigDecimal I replace the value of a variable with a newly created one:
aBigDecimal = aBigDecimal.divide(otherBigDecimal);
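A short sketch makes the distinction concrete. The class and method names are hypothetical; the code keeps a second reference to the original BigDecimal so you can see that the object itself is untouched while the variable moves on to a new one.

```java
// Sketch only: ProduceModificationDemo and halve are hypothetical names.
import java.math.BigDecimal;

class ProduceModificationDemo {
    // Returns {the original object, the newly produced object}.
    static BigDecimal[] halve(BigDecimal value) {
        BigDecimal before = value;                  // second reference to the same object
        value = value.divide(new BigDecimal("2"));  // rebinds the variable to a new object
        return new BigDecimal[] { before, value };
    }

    public static void main(String[] args) {
        BigDecimal[] result = halve(new BigDecimal("10"));
        System.out.println(result[0] + " -> " + result[1]);
    }
}
```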
We really need a new term for "producing the effect of modifying". Our trusty online Webster's gives us quite a range of choices for such a term:
alter, modify, mutate, refashion, turn, vary;
convert, metamorphose, transform, transmute; diversify, variegate;
transform, commute, convert, metamorphose, transfigure, translate, transmogrify, transmute, transpose, transubstantiate;
sterilize, castrate, desexualize, fix, geld, mutilate, neuter, unsex;
exchange, replace, substitute, swap, switch, trade, interchange.
None of them really fits, though a few would be pretty interesting to toss around in design discussions! So for now, until some helpful reader supplies a better term, I'll use the phrase "produce a modification".
If you don't want the extra overhead or hassle of providing dual classes, then you have to decide whether to make your class mutable or immutable. The trade-offs are not always clear-cut. The factors that will tend to push you towards or away from immutability depend on the common usage patterns.
Unfortunately, there is one other complication. The Java language and class libraries have no direct support for immutability. Surprisingly--for a concept that is crucial to effective use of the Java class libraries--classes are immutable only by convention. This means that as a class designer, you must depend on documentation to show that your class is immutable; you cannot mark your class as implementing Immutable, as you can with Cloneable, for example.
One way to address this, at least within your own classes, is to introduce your own Immutable interface, allowing users of your classes to test for immutability at runtime with instanceof.
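Such a marker interface might look like the following minimal sketch. The names Immutable, Money, and Snapshots are hypothetical. Note the limit of the approach: library classes such as String cannot be retrofitted with your interface, so the test only recognizes classes you control.

```java
// Sketch only: Immutable, Money, and Snapshots are hypothetical names.
interface Immutable { } // marker interface: no methods, it only records a promise

final class Money implements Immutable {
    private final long cents;

    Money(long cents) {
        this.cents = cents;
    }

    long getCents() { return cents; }
}

class Snapshots {
    // True when the value is not marked Immutable, so a caller who wants to
    // keep it safely should make a private copy first.
    static boolean needsDefensiveCopy(Object value) {
        return !(value instanceof Immutable);
    }
}
```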
Unless you document it clearly or use the interface mechanism, your subclassers may come to depend on a class being immutable simply because it looks immutable. That appearance might only be a coincidence: perhaps you just didn't supply setters in the first release. But once you have shipped it, you are stuck. Your clients may be depending on immutable behavior from your object, so you shouldn't make any changes in the future that make it mutable, since that might break their subclasses.
On the other hand, if you are a subclasser, drive defensively. Try not to depend on a class being immutable unless it is clearly documented that it is so. If you have to make an assumption in the absence of documentation, ensure that there are no setters (any methods that can change state) and check that all fields are private (and not exposed through getters): no public, protected, or package-private fields. If the class is not final, remember that when you get an object of that type, it might actually be a subclass that is mutable.
Let's summarize some of the main points about immutables.
Immutables represent a powerful technique in Java for making safe, easy-to-use code, and increasing the durability of your API overall. But consider the points above when deciding whether this technique is appropriate for your class design, and whether it fits in with the design of your whole class library.
Resources
If this topic is of interest to you, you might also look at:
This is a new column, so feedback is especially welcome: new topics you'd like to see covered, questions about this column, or objections to what I've written. I can be contacted at mark@macchiato.com. |
Copyright (c) 1999, Mark Davis, All rights reserved.