Class extensions in D?

One interesting thing about Objective-C is that you can extend a class by adding new methods to it at runtime. This is done through categories. There has been some demand for something like it in D too. Here is my take on how it should work in D, and what kind of problem it would solve.

Say that a library provides you with some classes for accessing the filesystem:

class Node
{
    string path();
    string name();
}

class File : Node
{
    ...
}

class Directory : Node
{
    Node[] children();
    ...
}

And a couple of functions giving you directories and files:

File getConfigFile();
Directory getResourceDirectory();
Directory getDataDirectory();

As you don’t have control over these functions, you can’t change the class returned by them. Even if you subclass File or Directory to do what you want, it won’t cause these functions to magically create instances of your class.
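To make the problem concrete, here is a minimal sketch in plain D. The method bodies, the `BackupFile` subclass and the stand-in `getConfigFile` are assumptions made up for illustration; the real library would of course have its own implementation.

```d
// Stand-ins for the library's classes; the bodies are assumptions for this sketch.
class Node
{
    string path() { return "/etc/app.conf"; }
    string name() { return "app.conf"; }
}

class File : Node { }

// Stand-in for the library function: it always constructs a plain File.
File getConfigFile() { return new File; }

// A hypothetical subclass carrying the behaviour you wish File had.
class BackupFile : File
{
    string backup(string backupPath) { return backupPath ~ "/" ~ name; }
}

void main()
{
    File f = getConfigFile();
    // The downcast yields null: the object was created as a File,
    // so the extra method in BackupFile is out of reach.
    assert(cast(BackupFile) f is null);
}
```

The subclass compiles fine, but nothing the library hands back will ever be an instance of it.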

It has been suggested many times that D could allow functions whose first argument is a class to be called as if they were members of that class. For instance, this function:

void backup(Node node, string backupPath);

could be called this way:

backup(node, backupPath);

or this way:

node.backup(backupPath);

The simplicity of this is interesting, but faking member functions like this has a major drawback: contrary to true methods in classes, that function is not part of the virtual table, and thus cannot be dispatched dynamically based on the runtime type. If you wanted to do different things depending on the type of node, the ideal way would be to add a true member function in Node, which you’ll override in the File and Directory subclasses.
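Current D compilers accept this member-style call for free functions (uniform function call syntax), so the drawback can be observed directly. The sketch below is only an illustration: the class bodies and the string return values are assumptions, chosen so the result can be checked.

```d
// Stand-ins for the library's classes; the bodies are assumptions for this sketch.
class Node { string name() { return "node"; } }
class File : Node { override string name() { return "file"; } }

// Two free functions, overloaded on the type of the first argument.
string backup(Node node, string backupPath) { return "generic backup"; }
string backup(File file, string backupPath) { return "file backup"; }

void main()
{
    Node n = new File;   // static type Node, runtime type File

    // The overload is picked from the static type at compile time.
    assert(n.backup("/backup_disk") == "generic backup");

    // A true member function is dispatched on the runtime type.
    assert(n.name() == "file");

    // Only when the static type is File does the File overload get chosen.
    File f = new File;
    assert(f.backup("/backup_disk") == "file backup");
}
```

Overloading on the argument type looks like overriding, but the choice is frozen at compile time, which is exactly what a virtual call avoids.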

Unfortunately, since Node comes from an external library, adding a member function to it and overriding it in the subclasses is not an option… well, that’s what I’m suggesting we add to the language through class extensions. Class extensions are somewhat akin to categories in Objective-C, although safer with regard to accidental name clashes.

Here’s how it works.

The idea of an extension is that you can add member functions to a class:

extension NodeBackup : Node
{
    void backup(string backupPath) { }
}

With this syntax, you say that the NodeBackup extension applies to class Node. Since it’s an extension, it’s not a type you can instantiate. Since it applies to class Node, you can call its functions as if they were part of Node whenever you have imported the extension’s module.

Then you can override that function in another extension. You do that by deriving the new extension from the first one (NodeBackup) and applying it to a subclass (File) of the first extension’s base class:

extension FileBackup : NodeBackup, File
{
    override void backup(string backupPath)
    {
        copy(this.path, backupPath ~ "/" ~ this.name);
    }
}

The function backup defined here overrides NodeBackup.backup and will be called whenever the Node is a File at runtime.

You can then do the same for Directory.

extension DirectoryBackup : NodeBackup, Directory
{
    override void backup(string backupPath)
    {
        foreach(child; children)
            child.backup(backupPath ~ "/" ~ this.name);
    }
}

And now you can use it like this:

getDataDirectory.backup("/backup_disk");

Special Considerations

How can the compiler generate code that does this?

There are several ways. One way is using dynamic vtable offsets and constructing the vtables at runtime, before first using the class. There are others. I’ll leave the details to another post.
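Until that post, here is a rough library-level emulation of the idea in plain D. It is not the mechanism a compiler would use: instead of vtable offsets it keeps a side table, filled at runtime, keyed by class name. The names `backupTable`, `registerBackup` and `setupBackupTable`, the class bodies and the string return values are all assumptions for this sketch.

```d
// Stand-ins for the library's classes; the bodies are assumptions for this sketch.
class Node { string name() { return "node"; } }
class File : Node { override string name() { return "a.txt"; } }
class Directory : Node { override string name() { return "docs"; } }

alias BackupImpl = string function(Node self, string backupPath);

// Hypothetical side table playing the role of the extra vtable slot.
BackupImpl[string] backupTable;

// Attach an implementation to class T, as an extension would.
void registerBackup(T : Node)(BackupImpl impl)
{
    backupTable[typeid(T).name] = impl;
}

// Dispatch on the runtime class, falling back to base classes.
string backup(Node self, string backupPath)
{
    for (TypeInfo_Class ci = typeid(self); ci !is null; ci = ci.base)
    {
        if (auto impl = ci.name in backupTable)
            return (*impl)(self, backupPath);
    }
    assert(false, "no backup implementation registered");
}

string nodeBackup(Node self, string backupPath)
{
    return "skip " ~ self.name;
}

string fileBackup(Node self, string backupPath)
{
    return "copy " ~ self.name ~ " to " ~ backupPath;
}

// Build the table before first use.
void setupBackupTable()
{
    registerBackup!Node(&nodeBackup);
    registerBackup!File(&fileBackup);
}

void main()
{
    setupBackupTable();
    Node a = new File;
    Node b = new Directory;
    assert(a.backup("/backup_disk") == "copy a.txt to /backup_disk");
    // Directory has no entry of its own, so the Node entry is used.
    assert(b.backup("/backup_disk") == "skip docs");
}
```

A hash lookup per call is much slower than an indexed vtable slot, which is why a compiler-level solution would want real offsets instead.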

Say you already have a `backup` function in the `Node` or `File` class, what happens?

That should be flagged as ambiguous at the call site, and you’d have to manually specify which version of the function you want, with something like this (invented syntax):

    (&Node.backup)("/path");

    (&NodeBackup.backup)("/path");

If you don’t like this syntax, avoid defining duplicate names, or avoid importing the module containing that annoying extension.

Say you update the library containing `Node` and it suddenly adds a `backup` function after your code was compiled, which one gets used?

Since your code was compiled to call NodeBackup’s backup function, it should continue to do so. The dispatch mechanism should be good enough to tell that you were calling the extension’s function and not the one newly added to Node.

When you recompile, it’ll get flagged as ambiguous (see previous point).

Say I want to override `backup` and use private variables of the subclass?

Either define a new extension in the same module as the class (private protection doesn’t apply to code in the same module), or merge the extension directly into your subclass by declaring the extension of the base class as an ancestor:

class DirectoryWithSpecialBackupName : Directory, DirectoryBackup
{
    override void backup(string path) { ... }
}

Doesn’t that pose some of the same issues as multiple inheritance?

Yes, it does. In the preceding example, if both the Directory class and the DirectoryBackup extension implement the backup function, a call to backup on a DirectoryWithSpecialBackupName will be ambiguous. I suggest we forbid deriving a class from an extension that implements a function with the same name as one the base class already has.

You can still override the extension’s function from another extension if necessary. And of course, in this case, a call to the class’s version is going to be ambiguous.

Can extensions access private and protected members of the attached class?

No. This would break encapsulation.

Extensions are designed for adding functionality, not changing existing behaviour, and are therefore not granted any more rights than any function in the same scope as the extension. (That last sentence would make it a yes if the extension is defined in the same module as the class, though.)
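The same-module exception follows from how `private` already works in D: it restricts access to the module, not to the class. A small sketch, where the `rawPath` field and the `backupSource` function are made-up names for illustration:

```d
class Node
{
    // Hypothetical private field, invisible to other modules.
    private string rawPath = "/data/report.txt";
}

// A free function in the same module as Node: private does not stop it.
string backupSource(Node node)
{
    return node.rawPath;
}

void main()
{
    auto node = new Node;
    assert(node.backupSource() == "/data/report.txt");
}
```

Move `backupSource` to another module and the access to `rawPath` no longer compiles, which is the behaviour an extension defined outside the class's module would get.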