toString() on value objects
Tom Wetjens
toString() can be implemented on Java types to return a human-readable textual representation useful for debugging,
but should it also be used to return a portable string suitable for serialization and persistence?
In my team we use a lot of value-based classes (also known as value objects) to represent things like identifiers, e-mail addresses, phone numbers. We had also implemented toString() on each of these types
for serialization to JSON and for persisting in a database.
An example of an “identifier” value object representing a ContractId:
class ContractId {
    private final String value;

    private ContractId(String value) {
        this.value = value;
    }

    public static ContractId fromString(String str) {
        return new ContractId(str);
    }

    @Override
    public String toString() {
        return value;
    }

    // ... equals and hashCode
}
However, when we used ContractId from Kotlin, we ran into a problem: changing a non-nullable variable to nullable does not produce compile errors at existing toString() call sites.
Calling toString() on a null value simply returns the string "null", which can lead to unexpected and undetected bugs, as we experienced! This is because of the Any?.toString extension function in the Kotlin Standard Library.
It triggered an interesting discussion in my team about how to properly implement and use toString().
Should we use toString() strictly for debugging purposes and offer an alternative method (such as asString()) on each value object to get the portable string representation, which would trigger compile errors if called on a nullable receiver?
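A minimal sketch of that alternative in Java. The asString() name comes from the discussion above; the bracketed debug format and the AsStringDemo class are assumptions made up for illustration, not what our team actually shipped.

```java
import java.util.Objects;

// Sketch: toString() is kept for debugging, asString() is the portable form.
final class ContractId {
    private final String value;

    private ContractId(String value) {
        this.value = Objects.requireNonNull(value, "value");
    }

    static ContractId fromString(String str) {
        return new ContractId(str);
    }

    // Portable representation, intended for JSON and database columns.
    String asString() {
        return value;
    }

    // Debug representation; the format is deliberately not a contract.
    @Override
    public String toString() {
        return "ContractId[" + value + "]";
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ContractId && ((ContractId) o).value.equals(value);
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }
}

class AsStringDemo {
    public static void main(String[] args) {
        ContractId id = ContractId.fromString("C-123");
        System.out.println(id.asString()); // C-123
        System.out.println(id);            // ContractId[C-123]
    }
}
```

From Kotlin, a call such as id.asString() on a variable declared as ContractId? no longer compiles without a safe call or a null check, because asString() is an ordinary member and no nullable-receiver extension of that name exists.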
The toString() method
Everything in Java is an Object (except primitives) and therefore inherits a toString method. From the JavaDoc:
Returns a string representation of the object. In general, the toString method returns a string that “textually represents” this object. The result should be a concise but informative representation that is easy for a person to read. It is recommended that all subclasses override this method.
If a type does not provide its own implementation of toString, it defaults to a string consisting of the name of the class of which the object is an instance, the at-sign character '@', and the unsigned hexadecimal representation of the hash code of the object.
See also: Objects.toIdentityString.
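The default format is easy to check. The sketch below uses a hypothetical empty class named Plain and rebuilds the expected string by hand from the class name and hash code.

```java
// Sketch: a class that does not override toString() falls back to Object's default.
class Plain {
}

class DefaultToStringDemo {
    // Rebuilds the default format described in the JavaDoc by hand.
    static String expectedDefault(Object o) {
        return o.getClass().getName() + "@" + Integer.toHexString(o.hashCode());
    }

    public static void main(String[] args) {
        Plain p = new Plain();
        // Prints the class name, an at-sign and a hex hash code that varies per run.
        System.out.println(p);
        System.out.println(p.toString().equals(expectedDefault(p))); // true
    }
}
```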
The JavaDoc does not specify any further constraints on the return value of toString, leaving inexperienced developers in the dark about its exact intent.
Most importantly: I was always taught not to depend on the implementation of toString as its behavior is not well-defined and can change without notice.
Value-based classes in the Java Standard Library
However, there are several types in the Java Standard Library that override toString to be used not only for debugging purposes, but as a way to get a portable string representation.
In all these cases the exact return value of toString is extensively documented and can be seen as a well-defined “contract” that callers can depend on.
java.util.UUID
The JavaDoc on UUID#toString says:
Returns a String object representing this UUID. The UUID string representation is as described by this BNF:
There are other methods on UUID to access the underlying value, such as getLeastSignificantBits and getMostSignificantBits, which could be used for serialization.
It even provides a UUID#fromString method that returns a UUID instance from the standard string representation, as returned by the toString() method.
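Both routes can be tried side by side. This is a small sketch; the UUID literal and the helper names viaString and viaBits are arbitrary choices for the example.

```java
import java.util.UUID;

class UuidRoundTripDemo {
    // Round trip through the documented string form.
    static UUID viaString(UUID id) {
        return UUID.fromString(id.toString());
    }

    // Round trip through the two 64-bit halves.
    static UUID viaBits(UUID id) {
        return new UUID(id.getMostSignificantBits(), id.getLeastSignificantBits());
    }

    public static void main(String[] args) {
        UUID original = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");
        System.out.println(original);                              // 123e4567-e89b-12d3-a456-426614174000
        System.out.println(original.equals(viaString(original)));  // true
        System.out.println(original.equals(viaBits(original)));    // true
    }
}
```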
java.net.URI
The JavaDoc on URI#toString says:
Returns the content of this URI as a string.
If this URI was created by invoking one of the constructors in this class then a string equivalent to the original input string, or to the string computed from the originally-given components, as appropriate, is returned.
Otherwise this URI was created by normalization, resolution, or relativization, and so a string is constructed from this URI’s components according to the rules specified in RFC 2396, section 5.2, step 7.
What is interesting about the URI#toString specification is that it may return a different value than what was used to construct the instance. This means it could be considered “lossy”: original information is lost.
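A short sketch of where the returned string starts to differ from the input. The example.com URLs are placeholders chosen for the demonstration.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriToStringDemo {
    public static void main(String[] args) throws URISyntaxException {
        // Single-argument constructor: toString() returns the original input.
        URI parsed = new URI("http://example.com/a/../b");
        System.out.println(parsed); // http://example.com/a/../b

        // normalize() yields a new URI whose string is rebuilt from its components.
        System.out.println(parsed.normalize()); // http://example.com/b

        // Component constructor: illegal characters are quoted, so the string
        // differs from the path that was passed in.
        URI fromParts = new URI("http", "example.com", "/a b", null);
        System.out.println(fromParts);           // http://example.com/a%20b
        System.out.println(fromParts.getPath()); // /a b
    }
}
```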
java.time.Duration
The JavaDoc on Duration#toString says:
A string representation of this duration using ISO-8601 seconds based representation
In this case the ISO format was chosen as there is no way to know what locale or formatting style is desired. It is portable and lossless. However, since it does not output days, months, or years, can it really be considered “easy for a person to read”?
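To make the readability question concrete, here is a small sketch with an arbitrary duration of two days, three hours and thirty minutes.

```java
import java.time.Duration;

class DurationToStringDemo {
    public static void main(String[] args) {
        Duration d = Duration.ofDays(2).plusHours(3).plusMinutes(30);

        // Days are folded into hours in the seconds-based form.
        String text = d.toString();
        System.out.println(text); // PT51H30M

        // parse() accepts the same format, so the round trip is lossless.
        System.out.println(Duration.parse(text).equals(d)); // true
    }
}
```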
Other languages
Rust
A modern language like Rust has separated the concerns into different traits, Debug (which can be derived) and Display (which has to be implemented by hand):
- Debug: output in a programmer-facing, debugging context
- Display: is for user-facing output
- Display: might not necessarily be a lossless or complete representation of the type
- Display: preferable to only implement when there is a single most “obvious” way that values can be formatted as text, according to the “invariant” culture and “undefined” locale
I like this approach and it could be added to Java as interfaces similar to Comparable.
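A rough sketch of what such interfaces might look like. Debuggable, Displayable and EmailAddress are hypothetical names invented for this example; nothing like them exists in the JDK.

```java
// Hypothetical interfaces, not part of the JDK; the names are made up here.
interface Debuggable {
    // Programmer-facing representation, free to change between releases.
    String toDebugString();
}

interface Displayable {
    // User-facing representation with a documented, stable format.
    String toDisplayString();
}

final class EmailAddress implements Debuggable, Displayable {
    private final String value;

    EmailAddress(String value) {
        this.value = value;
    }

    @Override
    public String toDebugString() {
        return "EmailAddress{value='" + value + "'}";
    }

    @Override
    public String toDisplayString() {
        return value;
    }
}

class TraitLikeInterfacesDemo {
    public static void main(String[] args) {
        EmailAddress email = new EmailAddress("jane@example.com");
        System.out.println(email.toDebugString());   // EmailAddress{value='jane@example.com'}
        System.out.println(email.toDisplayString()); // jane@example.com
    }
}
```

As with Comparable, a type would opt in explicitly, so a missing implementation shows up at compile time instead of silently falling back to a default.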
.NET
https://learn.microsoft.com/en-us/dotnet/api/system.object.tostring?view=net-10.0
- string representation so that it is suitable for display
- should be friendly and readable by humans.
- should be as short as possible so that it is suitable for display by a debugger
- should have no observable side effects to avoid complications in debugging
It leaves the possibility open for returning a string value that can be parsed back into an object instance.
So we have no guarantee that a type actually overrides the default toString with something sensible.
And the textual representation is intended for a person to read, not necessarily a String that can be stored in a database or sent via an API. In addition, the implementation of toString might change, causing unexpected breakages.
Conclusion
toString() can be (and, as we have seen, is) used to return a portable string suitable for serialization and persistence. But should we?
If we are intentional and its output is well-defined, then I think it provides a consistent interface and ease of use.
However, there are some exceptions:
- No sensitive data in toString() output (e.g. passwords, client secrets, personal information).
- Output format of toString() not well-defined by a specification.
- If there already are more explicit ways to get a reliable output for the given situation, such as DateTimeFormatter.ISO_INSTANT.
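For the last point, a small sketch of using the explicit formatter instead of relying on toString(). The epoch instant is just a convenient fixed value.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;

class IsoInstantDemo {
    public static void main(String[] args) {
        Instant epoch = Instant.ofEpochSecond(0);

        // The formatter names the format explicitly instead of relying on toString().
        String text = DateTimeFormatter.ISO_INSTANT.format(epoch);
        System.out.println(text); // 1970-01-01T00:00:00Z

        // Instant.parse reads the same ISO-8601 instant format back.
        System.out.println(Instant.parse(text).equals(epoch)); // true
    }
}
```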
For example, you could provide an alternative method to get the underlying value:
class ContractId {
    // ...
    public String getValue() {
        return value;
    }
}
Or in case of a sensitive value a method to explicitly get the plaintext value:
class ClientSecret {
    // ...
    public String toPlaintext() {
        return value;
    }
}
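A fuller sketch of such a sensitive value object, assuming toString() should stay safe to log. The "ClientSecret[REDACTED]" text and the package-private constructor are illustrative choices only.

```java
final class ClientSecret {
    private final String value;

    ClientSecret(String value) {
        this.value = value;
    }

    // Explicit accessor: call sites make it obvious that the secret is exposed.
    String toPlaintext() {
        return value;
    }

    // Safe for logs and debuggers: never includes the secret itself.
    @Override
    public String toString() {
        return "ClientSecret[REDACTED]";
    }
}

class ClientSecretDemo {
    public static void main(String[] args) {
        ClientSecret secret = new ClientSecret("s3cr3t");
        System.out.println(secret);               // ClientSecret[REDACTED]
        System.out.println(secret.toPlaintext()); // s3cr3t
    }
}
```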