Increase Cohesion
The
academic definition of cohesion is that it is a measure of how closely
related all the responsibilities, data, and methods of a class are to
each other. I like to think of cohesion as a measure of whether a class
has a well-defined role within the system. We generally consider high
cohesion to be a good thing and repeat the words "highly cohesive" like a
mantra. But why?
Let's
think of coding as having a conversation with the computer. More
accurately, we're having several simultaneous conversations with the
computer. We're having conversations about how security is carried out,
how the infrastructure concerns are supposed to behave, and what the
business rules are.
When
you're at a loud party with a lot of different conversations going on
at once, it can be difficult to focus on the one conversation you're
trying to have. It's much easier to have a conversation in a quiet
setting where there's only one conversation going on.
An
easy test for cohesion is to look at a class and decide whether all the
contents of the class are directly related to and described by the name
of the class—vague class names, such as InvoiceManager, do not count.
If the class has responsibilities that don't relate to its name, those
responsibilities probably belong to a different class. If you find a
subset of methods and fields that could easily be grouped separately
under another class name, then maybe you should extract these methods
and fields to a new class.
To
illustrate, if you find methods and data in your TrainStation, Train,
and Conductor classes that seem to most accurately fit the theme of the
TrainSchedule class, move those methods and data into the TrainSchedule.
One of my favorite sayings about design is applicable here: put code
where you'd expect to find it. It's most logical to me that
functionality involving a train schedule would be in the TrainSchedule
class.
To
stretch my earlier conversation analogy, a system consisting of
cohesive classes and subsystems is like a well-designed online
discussion group. Each area in the online group is narrowly focused on
one specific topic so the discussion is easy to follow, and if you're
looking for a dialog on a certain subject, there's only one room you
have to visit.
Eliminate Inappropriate Intimacy
Inappropriate
intimacy refers to a method in a class that has too much intimate
knowledge of another class. Inappropriate intimacy is a sign of harmful,
tight coupling between classes. Let's say that we have a business logic
class that calls an instance of class DataServer1 to get the data it
needs for its business-logic processing. Figure 2 shows
an example. In this case, the Process method has to know a lot of the
inner workings of DataServer1 and a bit about the SqlDataReader class.
Figure 2 Inappropriate Intimacy
public void Process() {
string connectionString = getConnectionString();
SqlConnection connection = new SqlConnection(connectionString);
DataServer1 server = new DataServer1(connection);
int daysOld = 5;
using (SqlDataReader reader = server.GetWorkItemData(daysOld)) {
while (reader.Read()) {
string name = reader.GetString(0);
string location = reader.GetString(1);
processItem(name, location);
}
}
}
Now, let's rewrite the code in Figure 2 to remove the inappropriate intimacy:
public void Process() {
DataServer2 server = new DataServer2();
foreach (DataItem item in server.GetWorkItemData(5)) {
processItem(item);
}
}
As
you can see in this version of the code, I've encapsulated all the
SqlConnection and SqlDataReader object manipulation inside the
DataServer2 class. DataServer2 is also assumed to be taking care of its
own configuration, so the new Process method doesn't have to know
anything about setting up DataServer2. The DataItem objects returned
from GetWorkItemData are also strongly typed objects.
Now,
let's analyze the two versions of Process method against some of the
goals of loose coupling. First, what about making the code easier to
read? The first and second versions of Process perform the same basic
task, but which is easier to read and understand? Personally, I can more
easily read and understand the business logic processing without wading
through data-access code.
How
about making our classes easier to consume? Consumers of DataServer1
have to know how to create a SqlConnection object, understand the
structure of the DataReader returned, and iterate through and clean up
the DataReader. A consumer of DataServer2 only has to call a no-argument
constructor and then call a single method that returns an array of
strongly typed objects. DataServer2 is taking care of its own ADO.NET
connection setup and cleaning up open DataReaders.
And
as for isolating potential changes to a small area of code? In the
first version of the code, almost any change in the way DataServer works
would impact the Process method. In the second, more encapsulated
version of DataServer, you could switch the data store to an Oracle
database or to an XML file without any effect on the Process method.
The Law of Demeter
The
Law of Demeter is a design rule of thumb. The terse definition of the
law is: only talk to your immediate friends. The Law of Demeter is a
warning about the potential dangers of code, as in Figure 3.
Figure 3 Breaking the Law
public interface DataService {
InsuranceClaim[] FindClaims(Customer customer);
}
public class Repository {
public DataService InnerService { get; set; }
}
public class ClassThatNeedsInsuranceClaim {
private Repository _repository;
public ClassThatNeedsInsuranceClaim(Repository repository) {
_repository = repository;
}
public void TallyAllTheOutstandingClaims(Customer customer) {
// This line of code violates the Law of Demeter
InsuranceClaim[] claims =
_repository.InnerService.FindClaims(customer);
}
}
The
ClassThatNeedsInsuranceClaim class needs to get InsuranceClaim data. It
has a reference to the Repository class, which itself has a DataService
object. ClassThatNeedsInsuranceClaim reaches inside Repository to grab
the inner DataService object, then calls Repository.FindClaims to get
its data. Note that the call to
_repository.InnerService.FindClaims(customer) is a clear violation of
the Law of Demeter because ClassThatNeedsInsuranceClaim is directly
calling a method on a property of its Repository field. Now please turn
your attention to Figure 4, which shows another sample of the same code, but this time it is following the Law of Demeter.
Figure 4 Better Decoupling
public class Repository2 {
private DataService _service;
public Repository2(DataService service) {
_service = service;
}
public InsuranceClaim[] FindClaims(Customer customer) {
// we're simply going to delegate to the inner
// DataService for now, but who knows what
// we want to do in the future?
return _service.FindClaims(customer);
}
}
public class ClassThatNeedsInsuranceClaim2 {
private Repository2 _repository;
public ClassThatNeedsInsuranceClaim2(
Repository2 repository) {
_repository = repository;
}
public void TallyAllTheOutstandingClaims(
Customer customer) {
// This line of code now follows the Law of Demeter
InsuranceClaim[] claims = _repository.FindClaims(customer);
}
}
What
did we accomplish? Repository2 is easier to use than Repository because
you have a direct method to call for the InsuranceClaim information.
When we were violating the Law of Demeter, the consumers of Repository
were tightly coupled to the implementation of Repository. With the
revised code, I have more ability to change the Repository
implementation to add more caching or swap out the underlying
DataService with something completely different.
The
Law of Demeter is a powerful tool to help you spot potential coupling
problems, but don't follow the Law of Demeter blindly. Violating the Law
of Demeter does add tighter coupling to your system, but in some cases
you may judge that the potential cost of coupling to a stable element of
your code is less expensive than writing a lot of delegation code to
eliminate the Law of Demeter violations.
Tell, Don't Ask
The
Tell, Don't Ask design principle urges you to tell objects what to do.
What you don't want to do is ask an object about its internal state,
make some decisions about that state, then tell that object what to do.
Following the Tell, Don't Ask style of object interaction is a good way
to ensure that responsibilities are put in the right places.
Figure 5
illustrates a violation of Tell, Don't Ask. The code is taking a
purchase of some sort, checking for the possibility of a discount for
purchases of more than $10,000, and finally examining the account data
to decide whether there are sufficient funds. The previous DumbPurchase
and DumbAccount classes are, well, dumb. The account and purchase
business rules are both encoded in ClassThatUsesDumbEntities.
Figure 5 Asking Too Much
public class DumbPurchase {
public double SubTotal { get; set; }
public double Discount { get; set; }
public double Total { get; set; }
}
public class DumbAccount {
public double Balance { get; set;}
}
public class ClassThatUsesDumbEntities {
public void MakePurchase(
DumbPurchase purchase, DumbAccount account) {
purchase.Discount = purchase.SubTotal > 10000 ? .10 : 0;
purchase.Total =
purchase.SubTotal*(1 - purchase.Discount);
if (purchase.Total < account.Balance) {
account.Balance -= purchase.Total;
}
else {
rejectPurchase(purchase,
"You don't have enough money.");
}
}
}
This
type of code can be problematic in a couple of ways. In a system like
this, you potentially have duplication because the business rules for an
entity are scattered in procedural code outside of the entities. You
might duplicate logic unknowingly because it's not obvious where the
previously written business logic would be.
Figure 6
shows the same code, but this time following the Tell, Don't Ask
pattern. In this code, I moved the business rules for purchasing and
accounts into the Purchase and Account classes themselves. When we go to
make a purchase, we just tell the Account class to deduct the purchase
from itself. Account and Purchase know about themselves and their
internal rules. All a consumer of Account has to know is to call the
Account.Deduct(Purchase, PurchaseMessenger) method.
Figure 6 Telling Your App What to Do
public class Purchase {
private readonly double _subTotal;
public Purchase(double subTotal) {
_subTotal = subTotal;
}
public double Total {
get {
double discount = _subTotal > 10000 ? .10 : 0;
return _subTotal*(1 - discount);
}
}
}
public class Account {
private double _balance;
public void Deduct(
Purchase purchase, PurchaseMessenger messenger) {
if (purchase.Total < _balance) {
_balance -= purchase.Total;
}
else {
messenger.RejectPurchase(purchase, this);
}
}
}
public class ClassThatObeysTellDontAsk {
public void MakePurchase(
Purchase purchase, Account account) {
PurchaseMessenger messenger = new PurchaseMessenger();
account.Deduct(purchase, messenger);
}
}
The
Account and Purchase objects are easier to use because you don't have
to know quite so much about these classes to execute our business logic.
We've also potentially reduced duplication in the system. The business
rules for Accounts and Purchases can be effortlessly reused throughout
the system because those rules are inside the Account and Purchase
classes instead of being buried inside the code that uses those classes.
In addition, it should be easier to change the business rules for
Purchases and Accounts as those rules are found in just one place in the
system.
Closely
related to Tell, Don't Ask is the Information Expert pattern. If you
have a new responsibility for your system, in what class should the new
responsibility go? The Information Expert pattern asks, who knows the
information necessary to fulfill this responsibility? In other words,
your first candidate for any new responsibility is the class that
already has the data fields affected by that responsibility. In the
purchasing sample, the Purchase class knows the information about a
purchase that you would use to determine any possible discount rates, so
the Purchase class itself is the immediate candidate for calculating
discount rates.