Blog: Don't Copy Code. Oh, and Inheritance and Composition are Bad, Too

The Propel Team – 03 March 2011

I often see, inside the code produced by some of our junior devs, entire blocks of code copied from one class to another. I'm always shocked by this very bad practice. But it's hard to get these developers back into the right path, because there is no perfect alternative.

Don't Copy Code

Ctrl+C and Ctrl+V are probably my favorite computer tools, just after email. That's a great invention and an incredible time saver. The problem is that it should never, ever be used by a developer. It's like giving food to a Gremlin: no matter if you can do it without a second thought during daytime, you mustn't do it after midnight. That's the rule.

Why is it bad to copy code? Because each occurrence of the code will need to be tested and maintained separately. It multiplies the costs by the number of times Ctrl+V was hit. Plus it is a source of confusion and of a false security feeling. You find a bug, you fix it once, and you think you're done. Except that piece of code was copied 17 times across the application, but you won't remember it until the bug reappears all of a sudden.

So there should be an alarm bell ringing in your mind every time you see something like this:

class Book
{
  // some methods

  public function persist()
  {
    apc_store(get_class($this) . $this->id, serialize($this));
  }

  public static function restore($id)
  {
    return unserialize(apc_fetch(get_class($this) . $id));
  }
}

class Author
{
  // some methods

  public function persist()
  {
    apc_store(get_class($this) . $this->id, serialize($this));
  }

  public static function restore($id)
  {
    return unserialize(apc_fetch(get_class($this) . $id));
  }
}

Obviously, in this case the developer needed to give the same ability to two classes: the ability to be persisted in the APC cache, and then restored. And since it was done once in the Book class, the easiest way to give the same ability to the Author class was to copy the code. Right? Wrong.

Don't Repeat Yourself. I won't write it twice.

Tip: If you want to check if a PHP application contains some duplicated code blocks, I recommend phpcd, a Copy/Paste detector library by Sebastian Bergmann. It's dead simple to install, and very efficient.

Don't Use Inheritance for Horizontal Code Reuse

Junior devs quickly understand their mistake once they fix a bug duplicated 17 times. And since they learned Object-Oriented Programming, they often turn up to Inheritance as a great pattern to avoid code duplication. Therefore, the previous piece of code ends up looking like:

class Persistable
{ 
  public function persist()
  {
    apc_store(get_class($this) . $this->id, serialize($this));
  }

  public static function restore($id)
  {
    return unserialize(apc_fetch(get_class($this) . $id));
  }
}

class Book extends Persistable
{
  // some methods
}

class Author extends Persistable
{
  // some methods
}

That's much better: there is no more code duplication. A bug in one of the Persistable methods only needs to be fixed once, and this behavior class can be further reused with only two words (extends Persistable).

Except that's an abuse of the inheritance concept. From the business point of view, a Book is not a Persistable. It's a Publication, or a StoreItem. It has a persistable ability, but that's not the same verb. In fact, the Book class needs to reuse an ability that is not specific to its parent. That's called "horizontal code reuse", as opposed to the "vertical code reuse" of inheritance.

And to better distinguish the two types of reuse, a good rule of thumb is that a class often needs several horizontal reuses, while it can only have at most one vertical reuse. For instance, the Book class needs to be persistable, but it also needs to be sellable (so it must have a price), storable (so it must have a stock quantity), etc. But inheritance only accepts one parent (at least in PHP), so you'll have to choose one.

There is a pattern here: every time you create a class with a name ending with '-able', that's an ability, or "behavior" class, and that's a class that you should not extend. Or you won't be able to extend anything else. And a class isn't mostly distinguished by what it can do, but by what it is.

Composition to The Rescue

In PHP, a good workaround for taking the ability of multiple parent objects is composition. Transform a "behavior" class into a "service" class, and inject that class to all the classes that need it. Now that's clean:

class PersistenceService
{ 
  public function persist($object, $class)
  {
    apc_store($class . $object->id, serialize($object));
  }

  public static function restore($id, $class)
  {
    return unserialize(apc_fetch($class . $id));
  }
}

class Book
{
  // some methods

  protected $persistence;

  public function setPersistence(PersistenceService $persistence)
  {
    $this->persistence = $persistence;
  }

  public function persist()
  {
    $this->persistence->persist($this, get_class($this));
  }

  public static function restore($id)
  {
    return $this->persistence::restore($id, get_class($this));
  }
}

class Author
{
  // some methods

  protected $persistence;

  public function setPersistence(PersistenceService $persistence)
  {
    $this->persistence = $persistence;
  }

  public function persist()
  {
    $this->persistence->persist($this, get_class($this));
  }

  public static function restore($id)
  {
    return $this->persistence::restore($id, get_class($this));
  }
}

All the persistence logic is now encapsulated into one unique class. It's testable, it's isolated, and it's easy to reuse. Cherry on the cake, the Book and Author classes are Plain Old PHP Objects (or POPO) again, meaning they don't extend anything. Unit test gurus advocate that everything should be POPO, and despise inheritance, so that is probably a good thing.

Notice that the end user isn't allowed to chain method calls and has no knowledge of the actual persistence layer. She manipulates the Book::persist() and the Book::restore() proxy methods, and there is no public way to access the persistence layer. That's following the "Principle of Least Knowledge", also called "Law of Demeter".

You may complain that the Book and Author classes are now bigger than in the first code snippet, and that they show a slight code duplication. That's a valid complaint, that I will address shortly. But before that, let me make things a little bit more complex.

Manage Dependencies with a Dependency Injection Container

So inheritance was replaced by composition, that's fine. If the Book class needs to reuse code from several service classes, then you'll add new service setters, and new proxy methods for the service abilities.

But the above code doesn't work out of the box: the service class must be initialized. Before, you could just write:

$book = new Book();
$book->persist();

Now you must write:

$book = new Book();
$book->setPersistence(new PersistenceService());
$book->persist();

Besides, you may not want to duplicate the PersistenceService class, because it may be initialized with some configuration settings, or because it's expensive in terms of memory consumption. So you should rather keep a service container somewhere with access to public services:

// $sc is a service container
$book = new Book();
$book->setPersistence($sc->getService('PersistenceService'));
$book->persist();

There is a smarter way to do this - and to avoid the pitfall of creating a registry class, often plagued by the static keyword. A Dependency Injection Container (or DIC) lazy-loads services on demand, manages dependencies, and allows to subclass a service easily. There are a few good implementations of DIC in PHP, I recommend the one from the Symfony Components library, available in PHP 5.3 or in PHP 5.2.

With a DIC, you can manage services into a configuration file. This file is then compiled into a PHP class for a faster execution.

<?xml version="1.0" ?>
<container xmlns="http://symfony-project.org/2.0/container">
  <services>
    <service id="persistence" class="PersistenceService" shared="false">
    </service>
  </services>
</container>

Code Generation: A Good Alternative

If you're from a PHP background and not from a Java background, you may be overwhelmed by the complexity of a dependency injection container. I tend to agree - seeing that classes like Compiler\ResolveDefinitionTemplatesPass or ContainerAwareInterface are necessary for simple horizontal code reuse feels like using a sledgehammer to crack a nut.

And yet the classes using the services need a lot of duplicated code written by hand, just for the composition and interfaces to the injected services. In addition, the service class may execute expensive code to adapt the service to the class it's used in. For instance, a better persistence service class should check the class of the object passed to the persist() method instead of requiring it as an argument. That's not an expensive execution, but in the real world this kind of runtime introspection really penalizes performance.

One important thing to notice is that the complexity behind a DIC implies a compilation pass, which generates PHP code for a better runtime performance.

Since code generation is used, why not extend it to directly add the necessary code to the end class? What if you could just generate the following code?

class Book
{
  // some methods

  public function persist()
  {
    apc_store(get_class($this) . $this->id, serialize($this));
  }

  public static function restore($id)
  {
    return unserialize(apc_fetch(get_class($this) . $id));
  }
}

class Author
{
  // some methods

  public function persist()
  {
    apc_store(get_class($this) . $this->id, serialize($this));
  }

  public static function restore($id)
  {
    return unserialize(apc_fetch(get_class($this) . $id));
  }
}

Yep, that's exactly the same code as in the first snippet. But if it's generated, code duplication isn't a bad thing anymore. There is still a central place where the code lies - in the generator. The public interface is mostly the same as in the DIC version. The service doesn't need to be injected - it's "baked in" by code generation.

And it's blazingly fast. No runtime introspection, no need to instantiate multiple service objects.

And it offers IDE completion for "composed" services.

Propel uses code generation to offer a fast and configurable way to do horizontal code reuse. Just like with a DIC, you need a few lines of configuration to enable a behavior on a class:

<table phpName="Book" name="book" >
  <behavior name="persistable" />
</table>

Then, Propel generates all the necessary code in a 'Base' class that you only need to extend. That's right, horizontal reuse through inheritance becomes possible again once you use code generation. And it doesn't prevent true model inheritance - Propel supports it out of the box.

Since you'll probably be using code generation to manage horizontal code reuse, use it for good. Skip the DIC and generate the code directly in the final classes.

Mixins: The Only Good Solution

Dependency injection and code generation are just workarounds for a limit of the PHP language. PHP doesn't offer a native way to handle horizontal code reuse. Other languages, like Ruby, support the concept of "Mixin", which offers some kind of multiple inheritance. Here is an example taken from the Ruby documentation:

module Debug
  def whoAmI?
    "#{self.type.name} (\##{self.id}): #{self.to_s}"
  end
end

class Phonograph
  include Debug
  # ...
end

class EightTrack
  include Debug
  # ...
end

The Phonograph and EightTrack classes can reuse the code from the Debug behavior without extending it. That completely removes the need for code duplication, and for all the workaround that were exposed in this article.

Fortunately, PHP will soon have some sort of Mixin support - only they're called "Traits". Traits would allow the Book and Author to get the "persistable" behavior in a clean way:

trait Persistable
{
  public function persist()
  {
    apc_store(get_class($this) . $this->id, serialize($this));
  }

  public static function restore($id)
  {
    return unserialize(apc_fetch(get_class($this) . $id));
  }
}

class Book
{
  use Persistable;

  // some methods
}

class Author
{
  use Persistable;

  // some methods
}

Notice the return of the "-able" suffix in this example. Classes are getting access to a behavior, not a service.

Conclusion

Traits are already implemented in the development version of PHP (probably called PHP 5.4). Until this version is released (no date for now), you're stuck with just workarounds. Choose the one that better fits your needs, and don't get too attached to it. Dependency Injection Containers and Code Generators will soon become overkill once you can do mixins in PHP.

But one thing will always remain valid: You do no copy code. This is the First Rule of the Developer. And do you know what the Second Rule is?