Sunday, 21 December 2014

SOLID: Part 2 - The Open/Closed Principle

This post is part of a series called The SOLID Principles.
SOLID: Part 1 - The Single Responsibility Principle
SOLID: Part 3 - Liskov Substitution & Interface Segregation Principles
Single Responsibility (SRP), Open/Closed (OCP), Liskov's Substitution, Interface Segregation, and Dependency Inversion. Five agile principles that should guide you every time you need to write code.
Software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification.
The Open/Closed Principle, OCP in short, is credited to Bertrand Mayer, a French programmer, who first published it in his book n Object-Oriented Software Construction in 1988.
The principle rose in popularity in the early 2000s when it became one of the SOLID principles defined by Robert C. Martin in his book Agile Software Development, Principles, Patterns, and Practices and later republished in the C# version of the book Agile Principles, Patterns, and Practices in C#.
What we are basically talking about here is to design our modules, classes and functions in a way that when a new functionality is needed, we should not modify our existing code but rather write new code that will be used by existing code. This sounds a little bit strange, especially if we are working in languages like Java, C, C++ or C# where it applies not only to the source code itself but to the binary also. We want to create new features in ways that will not require us to redeploy existing binaries, executables or DLLs.
As we progress with these tutorials, we can put each new principle in the context of the already discussed ones. We already discussed the Single Responsibility (SRP) that stated that a module should have only one reason to change. If we think about OCP and SRP, we can observe that they are complementary. Code specifically designed with SRP in mind will be close to OCP principles or easy to make it respect those principles. When we have code that has a single reason to change, introducing a new feature will create a secondary reason for that change. So both SRP and OCP would be violated. In the same way, if we have code that should only change when its main function changes and should remain unchanged when a new feature is added to it, thus respecting OCP, will mostly respect SRP also.
This does not mean that SRP always leads to OCP or vice versa, but in most cases if one of them is respected, achieving the second one is quite simple.
From a purely technical point of view, the Open/Closed Principle is very simple. A simple relationship between two classes, like the one below violates the OCP.
violate1
The User class uses the Logic class directly. If we need to implement a second Logic class in a way that will allow us to use both the current one and the new one, the existing Logic class will need to be changed. User is directly tied to the implementation of Logic, there is no way for us to provide a new Logic without affecting the current one. And when we are talking about statically typed languages, it is very possible that the User class will also require changes. If we are talking about compiled languages, most certainly both the User executable and the Logic executable or dynamic library will require recompilation and redeployment to our clients, a process we want to avoid whenever possible.
Based only on the schema above, one can deduce that any class directly using another class would actually violate the Open/Closed Principle. And that is right, strictly speaking. I found it quite interesting to find the limits, the moment when you draw the line and decide that it is more difficult to respect OCP than modify existing code, or the architectural cost does not justify the cost of changing existing code.
Let's say we want to write a class that can provide progress as a percent for a file that is downloaded through our application. We will have two main classes, a Progress and a File, and I imagine we will want to use them like in the test below.
1
2
3
4
5
6
7
8
9
function testItCanGetTheProgressOfAFileAsAPercent() {
    $file = new File();
    $file->length = 200;
    $file->sent = 100;
 
    $progress = new Progress($file);
 
    $this->assertEquals(50, $progress->getAsPercent());
}
In this test we are a user of Progress. We want to obtain a value as a percent, regardless of the actual file size. We use File as the source of information for our Progress. A file has a length in bytes and a field called sent representing the amount of data sent to the one doing the download. We do not care about how these values are updated in the application. We can assume there is some magical logic doing it for us, so in a test we can set them explicitly.
1
2
3
4
class File {
    public $length;
    public $sent;
}
The File class is just a simple data object containing the two fields. Of course in real life, it would probably contain other information and behavior also, like file name, path, relative path, current directory, type, permissions and so on.
01
02
03
04
05
06
07
08
09
10
11
12
13
class Progress {
 
    private $file;
 
    function __construct(File $file) {
        $this->file = $file;
    }
 
    function getAsPercent() {
        return $this->file->sent * 100 / $this->file->length;
    }
 
}
Progress is simply a class taking a File in its constructor. For clarity, we specified the type of the variable in the constructor's parameters. There is a single useful method on Progress, getAsPercent(), which will take the values sent and length from File and transform them into a percent. Simple, and it works.
1
2
3
4
5
Testing started at 5:39 PM ...
PHPUnit 3.7.28 by Sebastian Bergmann.
.
Time: 15 ms, Memory: 2.50Mb
OK (1 test, 1 assertion)
This code seems to be right, however it violates the Open/Closed Principle. But why? And How?
Every application that is expected to evolve in time will need new features. One new feature for our application could be to allow streaming of music, instead of just downloading files. File's length is represented in bytes, the music's duration in seconds. We want to offer a nice progress bar to our listeners, but can we reuse the one we already have?
No, we can not. Our progress is bound to File. It understands only files, even though it could be applied to music content also. But in order to do that we have to modify it, we have to make Progress know about Music and File. If our design would respect OCP, we would not need to touch File or Progress. We could just simply reuse the existing Progress and apply it to Music.
Dynamically typed languages have the advantages of guessing the types of objects at runtime. This allows us to remove the typehint from Progress' constructor and the code will still work.
01
02
03
04
05
06
07
08
09
10
11
12
13
class Progress {
 
    private $file;
 
    function __construct($file) {
        $this->file = $file;
    }
 
    function getAsPercent() {
        return $this->file->sent * 100 / $this->file->length;
    }
 
}
Now we can throw anything at Progress. And by anything, I mean literally anything:
01
02
03
04
05
06
07
08
09
10
11
12
13
class Music {
 
    public $length;
    public $sent;
 
    public $artist;
    public $album;
    public $releaseDate;
 
    function getAlbumCoverFile() {
        return 'Images/Covers/' . $this->artist . '/' . $this->album . '.png';
    }
}
And a Music class like the one above will work just fine. We can test it easily with a very similar test to File.
1
2
3
4
5
6
7
8
9
function testItCanGetTheProgressOfAMusicStreamAsAPercent() {
    $music = new Music();
    $music->length = 200;
    $music->sent = 100;
 
    $progress = new Progress($music);
 
    $this->assertEquals(50, $progress->getAsPercent());
}
So basically, any measurable content can be used with the Progress class. Maybe we should express this in code by changing the variable's name also:
01
02
03
04
05
06
07
08
09
10
11
12
13
class Progress {
 
    private $measurableContent;
 
    function __construct($measurableContent) {
        $this->measurableContent = $measurableContent;
    }
 
    function getAsPercent() {
        return $this->measurableContent->sent * 100 / $this->measurableContent->length;
    }
 
}
Good, but we have a huge problem with this approach. When we had File specified as a typehint, we were positive about what our class can handle. It was explicit and if something else came in, a nice error told us so.
1
2
3
Argument 1 passed to Progress::__construct()
must be an instance of File,
instance of Music given.
But without the typehint, we must rely on the fact that whatever comes in will have two public variables of some exact names like "length" and "sent". Otherwise we will have a refused bequest.
Refused bequest: a class that overrides a method of a base class in such a way that the contract of the base class is not honored by the derived class. ~Source Wikipedia.
This is one of the code smells presented in much more detail in the Detecting Code Smells premium course. In short, we do not want to end up trying to call methods or access fields on objects that do not conform to our contract. When we had a typehint, the contract was specified by it. The fields and methods of the File class. Now that we have nothing, we can send in anything, even a string and it would result in an ugly error.
1
2
3
4
function testItFailsWithAParameterThatDoesNotRespectTheImplicitContract() {
    $progress = new Progress('some string');
    $this->assertEquals(50, $progress->getAsPercent());
}
A test like this, where we send in a simple string, will produce a refused bequest:
1
Trying to get property of non-object.
While the end result is the same in both cases, meaning the code breaks, the first one produced a nice message. This one, however, is very obscure. There is no way of knowing what the variable is - a string in our case - and what properties were looked for and not found. It is difficult to debug and to solve the problem. A programmer needs to open the Progress class and read it and understand it. The contract, in this case, when we do not explicitly specify the typehint, is defined by the behavior of Progress. It is an implicit contract, known only to Progress. In our example, it is defined by the access to the two fields, sent and length, in the getAsPercent() method. In real life the implicit contract can be very complex and hard to discover by just looking for a few seconds at the class.
This solution is recommended only if none of the other suggestions below can easily be implemented or if they would inflict serious architectural changes that do not justify the effort.
This is the most common and probably the most appropriate solution to respect OCP. It is simple and effective.
strategy
The Strategy Pattern simply introduces the use of an interface. An interface is a special type of entity in Object Oriented Programming (OOP) which defines a contract between a client and a server class. Both classes will adhere to the contract to ensure the expected behavior. There may be several, unrelated, server classes that respect the same contract thus being capable of serving the same client class.
1
2
3
4
interface Measurable {
    function getLength();
    function getSent();
}
In an interface we can define only behavior. That is why instead of directly using public variables we will have to think about using getters and setters. Adapting the other classes will not be difficult at this point. Our IDE can do most of the job.
1
2
3
4
5
6
7
8
9
function testItCanGetTheProgressOfAFileAsAPercent() {
    $file = new File();
    $file->setLength(200);
    $file->setSent(100);
 
    $progress = new Progress($file);
 
    $this->assertEquals(50, $progress->getAsPercent());
}
As usual, we start with our tests. We will need to use setters to set the values. If considered mandatory, these setters may also be defined in the Measurable interface. However, be careful what you put there. The interface is to define the contract between the client class Progress and the different server classes like File and Music. Does Progress need to set the values? Probably not. So the setters are highly unlikely to be needed to be defined in the interface. Also, if you would define the setters there, you would force all of the server classes to implement setters. For some of them, it may be logical to have setters, but others may behave totally differently. What if we want to use our Progress class to show the temperature of our oven? The OvenTemperature class may be initialized with the values in the constructor, or obtain the information from a third class. Who knows? To have setters on that class would be odd.
01
02
03
04
05
06
07
08
09
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
class File implements Measurable {
 
    private $length;
    private $sent;
 
    public $filename;
    public $owner;
 
    function setLength($length) {
        $this->length = $length;
    }
 
    function getLength() {
        return $this->length;
    }
 
    function setSent($sent) {
        $this->sent = $sent;
    }
 
    function getSent() {
        return $this->sent;
    }
 
    function getRelativePath() {
        return dirname($this->filename);
    }
 
    function getFullPath() {
        return realpath($this->getRelativePath());
    }
 
}
The File class is modified slightly to accommodate the requirements above. It now implements the Measurable interface and has setters and getters for the fields we are interested in. Music is very similar, you can check its content in the attached source code. We are almost done.
01
02
03
04
05
06
07
08
09
10
11
12
13
class Progress {
 
    private $measurableContent;
 
    function __construct(Measurable $measurableContent) {
        $this->measurableContent = $measurableContent;
    }
 
    function getAsPercent() {
        return $this->measurableContent->getSent() * 100 / $this->measurableContent->getLength();
    }
 
}
Progress also needed a small update. We can now specify a type, using typehinting, in the constructor. The expected type is Measurable. Now we have an explicit contract. Progress can be sure the accessed methods will be always present because they are defined in the Measurable interface. File and Music can also be sure they can provide all that is needed for Progress by simply implementing all the methods on the interface, a requirement when a class implements an interface.
This design pattern is explained in greater detail in the Agile Design Patterns course.
People tend to name interfaces with a capital I in front of them, or with the word "Interface" attached at the end, like IFile or FileInterface. This is an old-style notation imposed by some outdated standards. We are so much past the Hungarian notations or the need to specify the type of a variable or object in its name in order to easier identify it. IDEs identify anything in a split second for us. This allows us to concentrate on what we actually want to abstract.
Interfaces belong to their clients. Yes. When you want to name an interface you must think of the client and forget about the implementation. When we named our interface Measurable we did so thinking about Progress. If I would be a progress, what would I need to be able to provide the percent? The answer is simple, something we can measure. Thus the name Measurable.
Another reason is that the implementation can be from various domains. In our case, there are files and music. But we may very well reuse our Progress in a racing simulator. In that case, the measured classes would be Speed, Fuel, etc. Nice, isn't it?
The Template Method design pattern is very similar to the strategy, but instead of an interface it uses an abstract class. It is recommended to use a Template Method pattern when we have a client very specific to our application, with reduced reusability and when the server classes have common behavior.
template_method
This design pattern is explained in greater detail in the Agile Design Patterns course.
Advertisement
So, how is all of this affecting our high level architecture?
HighLevelDesign
If the image above represents the current architecture of our application, adding a new module with five new classes (the blue ones) should affect our design in a moderate way (red class).
HighLevelDesignWithNewClasses
In most systems you can't expect absolutely no effect on the existing code when new classes are introduced. However, respecting the Open/Closed Principle will considerably reduce the classes and modules that require constant change.
As with any other principle, try not to think about everything from before. If you do so, you will end up with an interface for each of your classes. Such a design will be hard to maintain and understand. Usually the safest way to go is to think about the possibilities and if you can determine whether there will be other types of server classes. Many times you can easily imagine a new feature or you can find one on the project's backlog that will produce another server class. In those cases, add the interface from the beginning. If you can not determine, or if you are unsure - most of the time - simply omit it. Let the next programmer, or maybe even yourself, to add the interface when you need a second implementation.
If you follow your discipline and add interfaces as soon as a second server is needed, modifications will be few and easy. Remember, if code required changes once, there is a high possibility it will require change again. When that possibility turns into reality, OCP will save you a lot of time and effort.
Thank you for reading.

No comments:

Post a Comment