Saturday 28 February 2009

Managing object ownership in C++ with auto_ptr

I really like garbage-collected languages: they make you feel safe and confident because you do not have to manage memory explicitly. That comes at a price, though: you lose expressiveness about object ownership and control over object lifecycles. When an instance of a class is created, no one but the garbage collector is responsible for taking care of that object.

We are fine as long as we are dealing only with memory, but being unable to communicate responsibilities becomes a pain when other resources (e.g. files, locks, connections to external systems) come into play.

C++ has mechanisms to express the responsibilities a class has over other objects and to control when objects are destroyed. This is the first in a series of posts about such mechanisms; today it is the turn of std::auto_ptr.

Smart pointers are classes that mimic "classic" pointers by overloading operator* and operator->, but provide "extra features" (e.g. memory management, locking) to their pointees.

std::auto_ptr is a smart pointer included in the C++ Standard Library (header <memory>); its main features are:
  • it owns its pointee, which means that when an auto_ptr is destroyed, so is its pointee.

  • it can transfer the ownership of its pointee to another auto_ptr.


This is a simplified version of auto_ptr:

namespace std {
    template <class X>
    class auto_ptr {
    public:
        // Take ownership of p (or of nothing, by default).
        explicit auto_ptr(X* p = 0) : pointee_(p) { }

        // "Copying" transfers ownership: rhs is left empty.
        auto_ptr(auto_ptr& rhs) : pointee_(rhs.pointee_) {
            rhs.pointee_ = 0; // relinquish ownership
        }

        ~auto_ptr() { delete pointee_; }

        X& operator*() const { return *pointee_; }
        X* operator->() const { return pointee_; }
    private:
        X* pointee_;
    };
}

Notice how pointee_ is deleted in the destructor of auto_ptr and how ownership of pointee_ is transferred in the copy constructor.
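To watch the transfer happen, here is a minimal sketch. It is not the real library class: it assumes a cut-down copy living in a hypothetical namespace called sketch (so it cannot clash with std::auto_ptr), an extra get() accessor like the one the real class offers, and a made-up Tracked pointee that counts how many instances are alive.

```cpp
#include <cassert>

namespace sketch {

// Cut-down auto_ptr for illustration only (hypothetical, not the library class).
template <class X>
class auto_ptr {
public:
    explicit auto_ptr(X* p = 0) : pointee_(p) { }
    auto_ptr(auto_ptr& rhs) : pointee_(rhs.pointee_) {
        rhs.pointee_ = 0;  // the source gives up ownership
    }
    ~auto_ptr() { delete pointee_; }

    X& operator*() const { return *pointee_; }
    X* operator->() const { return pointee_; }
    X* get() const { return pointee_; }  // observe without transferring
private:
    auto_ptr& operator=(const auto_ptr&);  // assignment left out of the sketch
    X* pointee_;
};

}  // namespace sketch

// Hypothetical pointee that counts how many instances are alive.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Returns true if the pointee was deleted exactly once after a "copy".
bool copyTransfersOwnership() {
    {
        sketch::auto_ptr<Tracked> first(new Tracked);
        Tracked* raw = first.get();

        sketch::auto_ptr<Tracked> second(first);  // "copy" = transfer
        assert(first.get() == 0);      // the source is now empty
        assert(second.get() == raw);   // the destination owns the pointee
        assert(Tracked::alive == 1);   // still exactly one object
    }
    return Tracked::alive == 0;  // deleted once, by `second`
}
```

Calling copyTransfersOwnership() should return true: after the copy only the second auto_ptr points at the object, so only one delete ever runs.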

auto_ptr's power arises when it is used with value semantics, that is:
  • as a local variable (stack allocated): when the auto_ptr goes out of scope, it is destroyed and its pointee is deleted with it, so no explicit memory management is needed on our side. This still holds true in the presence of exceptions:

    void doStuff1 () {
        MyClass *myClass (new MyClass);

        // some dangerous stuff here that may
        // throw an exception, leaking *myClass

        delete myClass;
    }

    void doStuff2 () {
        std::auto_ptr<MyClass> myClass (new MyClass);

        // some dangerous stuff here that, even
        // upon exceptions, does not cause leaks

        // no explicit delete needed
    }

    This is a form of the RAII (Resource Acquisition Is Initialization) C++ idiom, which I plan to post about soon...


  • as a function parameter passed by value: when the function is called, the auto_ptr argument is copied to build the parameter, so the caller relinquishes ownership of the pointee in favor of the callee. The pointee will be destroyed when the callee returns, unless the callee in turn hands ownership to someone else within its body. This idiom is called the "auto_ptr sink".


  • as the return value of a function: when a function returns an auto_ptr by value, it relinquishes ownership of the pointee in favor of the caller. This idiom is called the "auto_ptr source".


  • as a class member: this is often used to tie the lifetime of the auto_ptr's pointee to that of the object the auto_ptr is a member of; when that object is destroyed, so is the auto_ptr, and with it the pointee.
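The first three uses above (local variable, sink, source) can be sketched together. This is only an illustration under a few assumptions: Resource is a made-up class that counts live instances, the function names are hypothetical, and the toolchain still ships std::auto_ptr in <memory>.

```cpp
#include <cassert>
#include <memory>     // std::auto_ptr
#include <stdexcept>

// Hypothetical resource that counts how many instances are alive.
struct Resource {
    static int alive;
    Resource() { ++alive; }
    ~Resource() { --alive; }
};
int Resource::alive = 0;

// Local variable: the pointee is deleted even when the scope exits by throwing.
void throwWhileOwning() {
    std::auto_ptr<Resource> r(new Resource);
    throw std::runtime_error("boom");  // unwinding destroys r, which deletes the Resource
}

// Source: returning by value hands the new object to the caller.
std::auto_ptr<Resource> makeResource() {
    return std::auto_ptr<Resource>(new Resource);
}

// Sink: the by-value parameter takes ownership; the Resource dies when the call returns.
void consume(std::auto_ptr<Resource> r) {
    assert(r.get() != 0);
}

// Returns the number of Resources still alive after exercising all three uses.
int aliveAfterDemo() {
    try { throwWhileOwning(); } catch (const std::runtime_error&) { }
    assert(Resource::alive == 0);  // nothing leaked despite the exception

    std::auto_ptr<Resource> mine = makeResource();  // the caller owns it now
    assert(Resource::alive == 1);

    consume(mine);                 // ownership moves to the callee
    assert(mine.get() == 0);       // the caller's auto_ptr is empty
    return Resource::alive;        // the sink already deleted the Resource
}
```

aliveAfterDemo() should return 0. Note that nothing at the consume(mine) call site hints that mine is empty afterwards; you have to read the signature of the sink to know.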



These uses of auto_ptr enable the programmer to communicate how variables are meant to be used. Let's see some examples:

  • Object creation by factories: factories are meant to encapsulate object creation. They are often implemented this way:


    class MyFactory {
    public:
        MyObject* createMyObject();
    };

    Seeing that code, you cannot be sure whether the caller is responsible for destroying the newly created object. One possible solution would be to add a comment stating who is responsible, but instead let's replace the dumb pointer with a smart one:

    class MyFactory {
    public:
        std::auto_ptr<MyObject> createMyObject();
    };

    This way, using the auto_ptr source idiom, the factory relinquishes ownership of the newly created object in favour of the caller, so we can tell who is responsible for the object just by looking at the code.


  • Decorator pattern: with this design pattern, an object wraps another one to provide extra features; both implement the same interface, so the wrapped version is used seamlessly, as if it were not wrapped. In the usual implementation of this pattern, the decorator class receives a pointer to the wrapped object in its constructor, stores it as a member variable, and destroys it in its own destructor. As in the previous example, let's use a smart pointer instead of a dumb one to better communicate who is responsible for the wrapped object's lifetime.


    #include <iostream>
    #include <memory>
    #include <string>

    class IStuff {
    public:
        // virtual destructor: instances are deleted through IStuff pointers
        virtual ~IStuff () { }
        virtual std::string getStuff () = 0;
    };

    class Foo : public IStuff {
    public:
        Foo () { }
        std::string getStuff () {
            return "foo";
        }
    };

    class Decorator : public IStuff {
    public:
        // auto_ptr sink: the decorator takes ownership of the decorated object
        explicit Decorator (std::auto_ptr<IStuff> decorated)
        : decorated_ (decorated) { }
        std::string getStuff () {
            return "decorated " + decorated_->getStuff();
        }
    private:
        std::auto_ptr<IStuff> decorated_;
    };

    int main (void) {
        std::auto_ptr<IStuff> foo (new Foo);
        std::auto_ptr<IStuff> decoratedFoo (new Decorator(foo));

        std::cout << decoratedFoo->getStuff() << std::endl;

        return 0;
    }

    Note how the decorated object is passed inside an auto_ptr to Decorator's constructor, which expresses that Decorator is responsible for managing it. Also note how the auto_ptr is stored in a member variable, which ties the lifetime of the decorated object to that of the decorator.


Some final thoughts...

  • A variation in auto_ptr's behavior arises when it is declared const: a const auto_ptr cannot transfer the ownership of its pointee. In this case, the pointee is tied to that very auto_ptr, no matter what happens. This is a somewhat degenerate use of auto_ptr; for this purpose, Boost's scoped_ptr is more appropriate.

  • Beware of auto_ptr when dealing with STL containers. They are not auto_ptr friendly because they make internal copies of their elements, sometimes keeping more than one copy at the same time, which goes against auto_ptr's allowed usage: copying an auto_ptr empties the source. So just avoid using lists (or vectors, or any other STL container) of auto_ptr.

  • I intentionally avoided mentioning a member function of auto_ptr called "reset", which enables reusing the auto_ptr for another pointee. I tend not to use it because I think it is quite misleading, as auto_ptr is often used to communicate durable ownership semantics.



Preparing for the future: unique_ptr
Since its inclusion in the C++ standard, auto_ptr has stirred much controversy because of both its semantics and its implementation. In the new C++ standard, known as C++0x, auto_ptr is deprecated. It is to be replaced with the class unique_ptr, which basically works like auto_ptr but without its deficiencies. To do so, it makes use of rvalue references, one of the goodies C++0x will bring, which make it possible to express "move semantics".
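
As a rough sketch of what the sink idiom looks like with unique_ptr, assuming a compiler and library with rvalue-reference support (std::unique_ptr in <memory>, std::move in <utility>); Widget and the function names are made up for the illustration:

```cpp
#include <cassert>
#include <memory>   // std::unique_ptr
#include <utility>  // std::move

// Hypothetical pointee used only for this illustration.
struct Widget {
    explicit Widget(int v) : value(v) { }
    int value;
};

// Sink: same idea as the auto_ptr sink, but the parameter is a unique_ptr.
int consume(std::unique_ptr<Widget> w) {
    return w->value;  // the Widget is deleted when `w` goes out of scope
}

// Returns true if ownership left `w` once std::move was spelled out.
bool transferIsExplicit() {
    std::unique_ptr<Widget> w(new Widget(42));
    assert(w.get() != nullptr);

    // consume(w);  // would not compile: unique_ptr cannot be copied
    const int v = consume(std::move(w));  // the transfer is visible at the call site

    return v == 42 && w.get() == nullptr;  // the caller holds nothing afterwards
}
```

The difference from auto_ptr is that ownership can no longer move through something that looks like a plain copy: the caller has to write std::move, so the hand-over shows up in the calling code as well as in the signature.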
