| Copyright David Abrahams 2006. Distributed under the Boost |
| Software License, Version 1.0. (See accompanying |
| file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) |
| |
| .. This is a comment. Note how any initial comments are moved by |
| transforms to after the document title, subtitle, and docinfo. |
| |
| .. Need intro and conclusion |
| .. Exposing classes |
| .. Constructors |
| .. Overloading |
| .. Properties and data members |
| .. Inheritance |
| .. Operators and Special Functions |
| .. Virtual Functions |
| .. Call Policies |
| |
| ++++++++++++++++++++++++++++++++++++++++++++++ |
| Introducing Boost.Python (Extended Abstract) |
| ++++++++++++++++++++++++++++++++++++++++++++++ |
| |
| |
| .. bibliographic fields (which also require a transform): |
| |
| :Author: David Abrahams |
| :Address: 45 Walnut Street |
| Somerville, MA 02143 |
| :Contact: dave@boost-consulting.com |
| :organization: `Boost Consulting`_ |
| :date: $Date: 2006-09-11 18:27:29 -0400 (Mon, 11 Sep 2006) $ |
| :status: This is a "work in progress" |
| :version: 1 |
| :copyright: Copyright David Abrahams 2002. All rights reserved |
| |
| :Dedication: |
| |
| For my girlfriend, wife, and partner Luann |
| |
| :abstract: |
| |
| This paper describes the Boost.Python library, a system for |
| C++/Python interoperability. |
| |
| .. meta:: |
| :keywords: Boost,python,Boost.Python,C++ |
| :description lang=en: C++/Python interoperability with Boost.Python |
| |
| .. contents:: Table of Contents |
| .. section-numbering:: |
| |
| |
| .. _`Boost Consulting`: http://www.boost-consulting.com |
| |
| ============== |
| Introduction |
| ============== |
| |
| Python and C++ are in many ways as different as two languages could |
| be: while C++ is usually compiled to machine-code, Python is |
| interpreted. Python's dynamic type system is often cited as the |
| foundation of its flexibility, while in C++ static typing is the |
| cornerstone of its efficiency. C++ has an intricate and difficult |
| meta-language to support compile-time polymorphism, while Python is |
| a uniform language with convenient runtime polymorphism. |
| |
| Yet for many programmers, these very differences mean that Python and |
| C++ complement one another perfectly. Performance bottlenecks in |
| Python programs can be rewritten in C++ for maximal speed, and |
| authors of powerful C++ libraries choose Python as a middleware |
| language for its flexible system integration capabilities. |
| Furthermore, the surface differences mask some strong similarities: |
| |
| * 'C'-family control structures (if, while, for...) |
| |
| * Support for object-orientation, functional programming, and generic |
| programming (these are both *multi-paradigm* programming languages.) |
| |
| * Comprehensive operator overloading facilities, recognizing the |
| importance of syntactic variability for readability and |
| expressivity. |
| |
| * High-level concepts such as collections and iterators. |
| |
| * High-level encapsulation facilities (C++: namespaces, Python: modules) |
| to support the design of re-usable libraries. |
| |
| * Exception-handling for effective management of error conditions. |
| |
| * C++ idioms in common use, such as handle/body classes and |
| reference-counted smart pointers mirror Python reference semantics. |
| |
| Python provides a rich 'C' API for writers of 'C' extension modules. |
| Unfortunately, using this API directly for exposing C++ type and |
| function interfaces to Python is much more tedious than it should be. |
| This is mainly due to the limitations of the 'C' language. Compared to |
| C++ and Python, 'C' has only very rudimentary abstraction facilities. |
| Support for exception-handling is completely missing. One important |
| undesirable consequence is that 'C' extension module writers are |
| required to manually manage Python reference counts. Another unpleasant |
| consequence is a very high degree of repetition of similar code in 'C' |
| extension modules. Of course highly redundant code does not only cause |
| frustration for the module writer, but is also very difficult to |
| maintain. |
| |
| The limitations of the 'C' API have lead to the development of a |
| variety of wrapping systems. SWIG_ is probably the most popular package |
| for the integration of C/C++ and Python. A more recent development is |
| the SIP_ package, which is specifically designed for interfacing Python |
| with the Qt_ graphical user interface library. Both SWIG and SIP |
| introduce a new specialized language for defining the inter-language |
| bindings. Of course being able to use a specialized language has |
| advantages, but having to deal with three different languages (Python, |
| C/C++ and the interface language) also introduces practical and mental |
| difficulties. The CXX_ package demonstrates an interesting alternative. |
| It shows that at least some parts of Python's 'C' API can be wrapped |
| and presented through a much more user-friendly C++ interface. However, |
| unlike SWIG and SIP, CXX does not include support for wrapping C++ |
| classes as new Python types. CXX is also no longer actively developed. |
| |
| In some respects Boost.Python combines ideas from SWIG and SIP with |
| ideas from CXX. Like SWIG and SIP, Boost.Python is a system for |
| wrapping C++ classes as new Python "built-in" types, and C/C++ |
| functions as Python functions. Like CXX, Boost.Python presents Python's |
| 'C' API through a C++ interface. Boost.Python goes beyond the scope of |
| other systems with the unique support for C++ virtual functions that |
| are overrideable in Python, support for organizing extensions as Python |
| packages with a central registry for inter-language type conversions, |
| and a convenient mechanism for tying into Python's serialization engine |
| (pickle). Importantly, all this is achieved without introducing a new |
| syntax. Boost.Python leverages the power of C++ meta-programming |
| techniques to introspect about the C++ type system, and presents a |
| simple, IDL-like C++ interface for exposing C/C++ code in extension |
| modules. Boost.Python is a pure C++ library, the inter-language |
| bindings are defined in pure C++, and other than a C++ compiler only |
| Python itself is required to get started with Boost.Python. Last but |
| not least, Boost.Python is an unrestricted open source library. There |
| are no strings attached even for commercial applications. |
| |
| .. _SWIG: http://www.swig.org/ |
| .. _SIP: http://www.riverbankcomputing.co.uk/sip/index.php |
| .. _Qt: http://www.trolltech.com/ |
| .. _CXX: http://cxx.sourceforge.net/ |
| |
| =========================== |
| Boost.Python Design Goals |
| =========================== |
| |
| The primary goal of Boost.Python is to allow users to expose C++ |
| classes and functions to Python using nothing more than a C++ |
| compiler. In broad strokes, the user experience should be one of |
| directly manipulating C++ objects from Python. |
| |
| However, it's also important not to translate all interfaces *too* |
| literally: the idioms of each language must be respected. For |
| example, though C++ and Python both have an iterator concept, they are |
| expressed very differently. Boost.Python has to be able to bridge the |
| interface gap. |
| |
| It must be possible to insulate Python users from crashes resulting |
| from trivial misuses of C++ interfaces, such as accessing |
| already-deleted objects. By the same token the library should |
| insulate C++ users from low-level Python 'C' API, replacing |
| error-prone 'C' interfaces like manual reference-count management and |
| raw ``PyObject`` pointers with more-robust alternatives. |
| |
| Support for component-based development is crucial, so that C++ types |
| exposed in one extension module can be passed to functions exposed in |
| another without loss of crucial information like C++ inheritance |
| relationships. |
| |
| Finally, all wrapping must be *non-intrusive*, without modifying or |
| even seeing the original C++ source code. Existing C++ libraries have |
| to be wrappable by third parties who only have access to header files |
| and binaries. |
| |
| ========================== |
| Hello Boost.Python World |
| ========================== |
| |
| And now for a preview of Boost.Python, and how it improves on the raw |
| facilities offered by Python. Here's a function we might want to |
| expose:: |
| |
| char const* greet(unsigned x) |
| { |
| static char const* const msgs[] = { "hello", "Boost.Python", "world!" }; |
| |
| if (x > 2) |
| throw std::range_error("greet: index out of range"); |
| |
| return msgs[x]; |
| } |
| |
| To wrap this function in standard C++ using the Python 'C' API, we'd |
| need something like this:: |
| |
| extern "C" // all Python interactions use 'C' linkage and calling convention |
| { |
| // Wrapper to handle argument/result conversion and checking |
| PyObject* greet_wrap(PyObject* args, PyObject * keywords) |
| { |
| int x; |
| if (PyArg_ParseTuple(args, "i", &x)) // extract/check arguments |
| { |
| char const* result = greet(x); // invoke wrapped function |
| return PyString_FromString(result); // convert result to Python |
| } |
| return 0; // error occurred |
| } |
| |
| // Table of wrapped functions to be exposed by the module |
| static PyMethodDef methods[] = { |
| { "greet", greet_wrap, METH_VARARGS, "return one of 3 parts of a greeting" } |
| , { NULL, NULL, 0, NULL } // sentinel |
| }; |
| |
| // module initialization function |
| DL_EXPORT init_hello() |
| { |
| (void) Py_InitModule("hello", methods); // add the methods to the module |
| } |
| } |
| |
| Now here's the wrapping code we'd use to expose it with Boost.Python:: |
| |
| #include <boost/python.hpp> |
| using namespace boost::python; |
| BOOST_PYTHON_MODULE(hello) |
| { |
| def("greet", greet, "return one of 3 parts of a greeting"); |
| } |
| |
| and here it is in action:: |
| |
| >>> import hello |
| >>> for x in range(3): |
| ... print hello.greet(x) |
| ... |
| hello |
| Boost.Python |
| world! |
| |
| Aside from the fact that the 'C' API version is much more verbose than |
| the BPL one, it's worth noting that it doesn't handle a few things |
| correctly: |
| |
| * The original function accepts an unsigned integer, and the Python |
| 'C' API only gives us a way of extracting signed integers. The |
| Boost.Python version will raise a Python exception if we try to pass |
| a negative number to ``hello.greet``, but the other one will proceed |
| to do whatever the C++ implementation does when converting an |
| negative integer to unsigned (usually wrapping to some very large |
| number), and pass the incorrect translation on to the wrapped |
| function. |
| |
| * That brings us to the second problem: if the C++ ``greet()`` |
| function is called with a number greater than 2, it will throw an |
| exception. Typically, if a C++ exception propagates across the |
| boundary with code generated by a 'C' compiler, it will cause a |
| crash. As you can see in the first version, there's no C++ |
| scaffolding there to prevent this from happening. Functions wrapped |
| by Boost.Python automatically include an exception-handling layer |
| which protects Python users by translating unhandled C++ exceptions |
| into a corresponding Python exception. |
| |
| * A slightly more-subtle limitation is that the argument conversion |
| used in the Python 'C' API case can only get that integer ``x`` in |
| *one way*. PyArg_ParseTuple can't convert Python ``long`` objects |
| (arbitrary-precision integers) which happen to fit in an ``unsigned |
| int`` but not in a ``signed long``, nor will it ever handle a |
| wrapped C++ class with a user-defined implicit ``operator unsigned |
| int()`` conversion. The BPL's dynamic type conversion registry |
| allows users to add arbitrary conversion methods. |
| |
| ================== |
| Library Overview |
| ================== |
| |
| This section outlines some of the library's major features. Except as |
| necessary to avoid confusion, details of library implementation are |
| omitted. |
| |
| ------------------------------------------- |
| The fundamental type-conversion mechanism |
| ------------------------------------------- |
| |
| XXX This needs to be rewritten. |
| |
| Every argument of every wrapped function requires some kind of |
| extraction code to convert it from Python to C++. Likewise, the |
| function return value has to be converted from C++ to Python. |
| Appropriate Python exceptions must be raised if the conversion fails. |
| Argument and return types are part of the function's type, and much of |
| this tedium can be relieved if the wrapping system can extract that |
| information through introspection. |
| |
| Passing a wrapped C++ derived class instance to a C++ function |
| accepting a pointer or reference to a base class requires knowledge of |
| the inheritance relationship and how to translate the address of a base |
| class into that of a derived class. |
| |
| ------------------ |
| Exposing Classes |
| ------------------ |
| |
| C++ classes and structs are exposed with a similarly-terse interface. |
| Given:: |
| |
| struct World |
| { |
| void set(std::string msg) { this->msg = msg; } |
| std::string greet() { return msg; } |
| std::string msg; |
| }; |
| |
| The following code will expose it in our extension module:: |
| |
| #include <boost/python.hpp> |
| BOOST_PYTHON_MODULE(hello) |
| { |
| class_<World>("World") |
| .def("greet", &World::greet) |
| .def("set", &World::set) |
| ; |
| } |
| |
| Although this code has a certain pythonic familiarity, people |
| sometimes find the syntax bit confusing because it doesn't look like |
| most of the C++ code they're used to. All the same, this is just |
| standard C++. Because of their flexible syntax and operator |
| overloading, C++ and Python are great for defining domain-specific |
| (sub)languages |
| (DSLs), and that's what we've done in BPL. To break it down:: |
| |
| class_<World>("World") |
| |
| constructs an unnamed object of type ``class_<World>`` and passes |
| ``"World"`` to its constructor. This creates a new-style Python class |
| called ``World`` in the extension module, and associates it with the |
| C++ type ``World`` in the BPL type conversion registry. We might have |
| also written:: |
| |
| class_<World> w("World"); |
| |
| but that would've been more verbose, since we'd have to name ``w`` |
| again to invoke its ``def()`` member function:: |
| |
| w.def("greet", &World::greet) |
| |
| There's nothing special about the location of the dot for member |
| access in the original example: C++ allows any amount of whitespace on |
| either side of a token, and placing the dot at the beginning of each |
| line allows us to chain as many successive calls to member functions |
| as we like with a uniform syntax. The other key fact that allows |
| chaining is that ``class_<>`` member functions all return a reference |
| to ``*this``. |
| |
| So the example is equivalent to:: |
| |
| class_<World> w("World"); |
| w.def("greet", &World::greet); |
| w.def("set", &World::set); |
| |
| It's occasionally useful to be able to break down the components of a |
| Boost.Python class wrapper in this way, but the rest of this paper |
| will tend to stick to the terse syntax. |
| |
| For completeness, here's the wrapped class in use: |
| |
| >>> import hello |
| >>> planet = hello.World() |
| >>> planet.set('howdy') |
| >>> planet.greet() |
| 'howdy' |
| |
| Constructors |
| ============ |
| |
| Since our ``World`` class is just a plain ``struct``, it has an |
| implicit no-argument (nullary) constructor. Boost.Python exposes the |
| nullary constructor by default, which is why we were able to write: |
| |
| >>> planet = hello.World() |
| |
| However, well-designed classes in any language may require constructor |
| arguments in order to establish their invariants. Unlike Python, |
| where ``__init__`` is just a specially-named method, In C++ |
| constructors cannot be handled like ordinary member functions. In |
| particular, we can't take their address: ``&World::World`` is an |
| error. The library provides a different interface for specifying |
| constructors. Given:: |
| |
| struct World |
| { |
| World(std::string msg); // added constructor |
| ... |
| |
| we can modify our wrapping code as follows:: |
| |
| class_<World>("World", init<std::string>()) |
| ... |
| |
| of course, a C++ class may have additional constructors, and we can |
| expose those as well by passing more instances of ``init<...>`` to |
| ``def()``:: |
| |
| class_<World>("World", init<std::string>()) |
| .def(init<double, double>()) |
| ... |
| |
| Boost.Python allows wrapped functions, member functions, and |
| constructors to be overloaded to mirror C++ overloading. |
| |
| Data Members and Properties |
| =========================== |
| |
| Any publicly-accessible data members in a C++ class can be easily |
| exposed as either ``readonly`` or ``readwrite`` attributes:: |
| |
| class_<World>("World", init<std::string>()) |
| .def_readonly("msg", &World::msg) |
| ... |
| |
| and can be used directly in Python: |
| |
| >>> planet = hello.World('howdy') |
| >>> planet.msg |
| 'howdy' |
| |
| This does *not* result in adding attributes to the ``World`` instance |
| ``__dict__``, which can result in substantial memory savings when |
| wrapping large data structures. In fact, no instance ``__dict__`` |
| will be created at all unless attributes are explicitly added from |
| Python. BPL owes this capability to the new Python 2.2 type system, |
| in particular the descriptor interface and ``property`` type. |
| |
| In C++, publicly-accessible data members are considered a sign of poor |
| design because they break encapsulation, and style guides usually |
| dictate the use of "getter" and "setter" functions instead. In |
| Python, however, ``__getattr__``, ``__setattr__``, and since 2.2, |
| ``property`` mean that attribute access is just one more |
| well-encapsulated syntactic tool at the programmer's disposal. BPL |
| bridges this idiomatic gap by making Python ``property`` creation |
| directly available to users. So if ``msg`` were private, we could |
| still expose it as attribute in Python as follows:: |
| |
| class_<World>("World", init<std::string>()) |
| .add_property("msg", &World::greet, &World::set) |
| ... |
| |
| The example above mirrors the familiar usage of properties in Python |
| 2.2+: |
| |
| >>> class World(object): |
| ... __init__(self, msg): |
| ... self.__msg = msg |
| ... def greet(self): |
| ... return self.__msg |
| ... def set(self, msg): |
| ... self.__msg = msg |
| ... msg = property(greet, set) |
| |
| Operators and Special Functions |
| =============================== |
| |
| The ability to write arithmetic operators for user-defined types that |
| C++ and Python both allow the definition of has been a major factor in |
| the popularity of both languages for scientific computing. The |
| success of packages like NumPy attests to the power of exposing |
| operators in extension modules. In this example we'll wrap a class |
| representing a position in a large file:: |
| |
| class FilePos { /*...*/ }; |
| |
| // Linear offset |
| FilePos operator+(FilePos, int); |
| FilePos operator+(int, FilePos); |
| FilePos operator-(FilePos, int); |
| |
| // Distance between two FilePos objects |
| int operator-(FilePos, FilePos); |
| |
| // Offset with assignment |
| FilePos& operator+=(FilePos&, int); |
| FilePos& operator-=(FilePos&, int); |
| |
| // Comparison |
| bool operator<(FilePos, FilePos); |
| |
| The wrapping code looks like this:: |
| |
| class_<FilePos>("FilePos") |
| .def(self + int()) // __add__ |
| .def(int() + self) // __radd__ |
| .def(self - int()) // __sub__ |
| |
| .def(self - self) // __sub__ |
| |
| .def(self += int()) // __iadd__ |
| .def(self -= int()) // __isub__ |
| |
| .def(self < self); // __lt__ |
| ; |
| |
| The magic is performed using a simplified application of "expression |
| templates" [VELD1995]_, a technique originally developed by for |
| optimization of high-performance matrix algebra expressions. The |
| essence is that instead of performing the computation immediately, |
| operators are overloaded to construct a type *representing* the |
| computation. In matrix algebra, dramatic optimizations are often |
| available when the structure of an entire expression can be taken into |
| account, rather than processing each operation "greedily". |
| Boost.Python uses the same technique to build an appropriate Python |
| callable object based on an expression involving ``self``, which is |
| then added to the class. |
| |
| Inheritance |
| =========== |
| |
| C++ inheritance relationships can be represented to Boost.Python by adding |
| an optional ``bases<...>`` argument to the ``class_<...>`` template |
| parameter list as follows:: |
| |
| class_<Derived, bases<Base1,Base2> >("Derived") |
| ... |
| |
| This has two effects: |
| |
| 1. When the ``class_<...>`` is created, Python type objects |
| corresponding to ``Base1`` and ``Base2`` are looked up in the BPL |
| registry, and are used as bases for the new Python ``Derived`` type |
| object [#mi]_, so methods exposed for the Python ``Base1`` and |
| ``Base2`` types are automatically members of the ``Derived`` type. |
| Because the registry is global, this works correctly even if |
| ``Derived`` is exposed in a different module from either of its |
| bases. |
| |
| 2. C++ conversions from ``Derived`` to its bases are added to the |
| Boost.Python registry. Thus wrapped C++ methods expecting (a |
| pointer or reference to) an object of either base type can be |
| called with an object wrapping a ``Derived`` instance. Wrapped |
| member functions of class ``T`` are treated as though they have an |
| implicit first argument of ``T&``, so these conversions are |
| necessary to allow the base class methods to be called for derived |
| objects. |
| |
| Of course it's possible to derive new Python classes from wrapped C++ |
| class instances. Because Boost.Python uses the new-style class |
| system, that works very much as for the Python built-in types. There |
| is one significant detail in which it differs: the built-in types |
| generally establish their invariants in their ``__new__`` function, so |
| that derived classes do not need to call ``__init__`` on the base |
| class before invoking its methods : |
| |
| >>> class L(list): |
| ... def __init__(self): |
| ... pass |
| ... |
| >>> L().reverse() |
| >>> |
| |
| Because C++ object construction is a one-step operation, C++ instance |
| data cannot be constructed until the arguments are available, in the |
| ``__init__`` function: |
| |
| >>> class D(SomeBPLClass): |
| ... def __init__(self): |
| ... pass |
| ... |
| >>> D().some_bpl_method() |
| Traceback (most recent call last): |
| File "<stdin>", line 1, in ? |
| TypeError: bad argument type for built-in operation |
| |
| This happened because Boost.Python couldn't find instance data of type |
| ``SomeBPLClass`` within the ``D`` instance; ``D``'s ``__init__`` |
| function masked construction of the base class. It could be corrected |
| by either removing ``D``'s ``__init__`` function or having it call |
| ``SomeBPLClass.__init__(...)`` explicitly. |
| |
| Virtual Functions |
| ================= |
| |
| Deriving new types in Python from extension classes is not very |
| interesting unless they can be used polymorphically from C++. In |
| other words, Python method implementations should appear to override |
| the implementation of C++ virtual functions when called *through base |
| class pointers/references from C++*. Since the only way to alter the |
| behavior of a virtual function is to override it in a derived class, |
| the user must build a special derived class to dispatch a polymorphic |
| class' virtual functions:: |
| |
| // |
| // interface to wrap: |
| // |
| class Base |
| { |
| public: |
| virtual int f(std::string x) { return 42; } |
| virtual ~Base(); |
| }; |
| |
| int calls_f(Base const& b, std::string x) { return b.f(x); } |
| |
| // |
| // Wrapping Code |
| // |
| |
| // Dispatcher class |
| struct BaseWrap : Base |
| { |
| // Store a pointer to the Python object |
| BaseWrap(PyObject* self_) : self(self_) {} |
| PyObject* self; |
| |
| // Default implementation, for when f is not overridden |
| int f_default(std::string x) { return this->Base::f(x); } |
| // Dispatch implementation |
| int f(std::string x) { return call_method<int>(self, "f", x); } |
| }; |
| |
| ... |
| def("calls_f", calls_f); |
| class_<Base, BaseWrap>("Base") |
| .def("f", &Base::f, &BaseWrap::f_default) |
| ; |
| |
| Now here's some Python code which demonstrates: |
| |
| >>> class Derived(Base): |
| ... def f(self, s): |
| ... return len(s) |
| ... |
| >>> calls_f(Base(), 'foo') |
| 42 |
| >>> calls_f(Derived(), 'forty-two') |
| 9 |
| |
| Things to notice about the dispatcher class: |
| |
| * The key element which allows overriding in Python is the |
| ``call_method`` invocation, which uses the same global type |
| conversion registry as the C++ function wrapping does to convert its |
| arguments from C++ to Python and its return type from Python to C++. |
| |
| * Any constructor signatures you wish to wrap must be replicated with |
| an initial ``PyObject*`` argument |
| |
| * The dispatcher must store this argument so that it can be used to |
| invoke ``call_method`` |
| |
| * The ``f_default`` member function is needed when the function being |
| exposed is not pure virtual; there's no other way ``Base::f`` can be |
| called on an object of type ``BaseWrap``, since it overrides ``f``. |
| |
| Admittedly, this formula is tedious to repeat, especially on a project |
| with many polymorphic classes; that it is necessary reflects |
| limitations in C++'s compile-time reflection capabilities. Several |
| efforts are underway to write front-ends for Boost.Python which can |
| generate these dispatchers (and other wrapping code) automatically. |
| If these are successful it will mark a move away from wrapping |
| everything directly in pure C++ for many of our users. |
| |
| --------------- |
| Serialization |
| --------------- |
| |
| *Serialization* is the process of converting objects in memory to a |
| form that can be stored on disk or sent over a network connection. The |
| serialized object (most often a plain string) can be retrieved and |
| converted back to the original object. A good serialization system will |
| automatically convert entire object hierarchies. Python's standard |
| ``pickle`` module is such a system. It leverages the language's strong |
| runtime introspection facilities for serializing practically arbitrary |
| user-defined objects. With a few simple and unintrusive provisions this |
| powerful machinery can be extended to also work for wrapped C++ objects. |
| Here is an example:: |
| |
| #include <string> |
| |
| struct World |
| { |
| World(std::string a_msg) : msg(a_msg) {} |
| std::string greet() const { return msg; } |
| std::string msg; |
| }; |
| |
| #include <boost/python.hpp> |
| using namespace boost::python; |
| |
| struct World_picklers : pickle_suite |
| { |
| static tuple |
| getinitargs(World const& w) { return make_tuple(w.greet()); } |
| }; |
| |
| BOOST_PYTHON_MODULE(hello) |
| { |
| class_<World>("World", init<std::string>()) |
| .def("greet", &World::greet) |
| .def_pickle(World_picklers()) |
| ; |
| } |
| |
| Now let's create a ``World`` object and put it to rest on disk:: |
| |
| >>> import hello |
| >>> import pickle |
| >>> a_world = hello.World("howdy") |
| >>> pickle.dump(a_world, open("my_world", "w")) |
| |
| In a potentially *different script* on a potentially *different |
| computer* with a potentially *different operating system*:: |
| |
| >>> import pickle |
| >>> resurrected_world = pickle.load(open("my_world", "r")) |
| >>> resurrected_world.greet() |
| 'howdy' |
| |
| Of course the ``cPickle`` module can also be used for faster |
| processing. |
| |
| Boost.Python's ``pickle_suite`` fully supports the ``pickle`` protocol |
| defined in the standard Python documentation. There is a one-to-one |
| correspondence between the standard pickling methods (``__getinitargs__``, |
| ``__getstate__``, ``__setstate__``) and the functions defined by the |
| user in the class derived from ``pickle_suite`` (``getinitargs``, |
| ``getstate``, ``setstate``). The ``class_::def_pickle()`` member function |
| is used to establish the Python bindings for all user-defined functions |
| simultaneously. Correct signatures for these functions are enforced at |
| compile time. Non-sensical combinations of the three pickle functions |
| are also rejected at compile time. These measures are designed to |
| help the user in avoiding obvious errors. |
| |
| Enabling serialization of more complex C++ objects requires a little |
| more work than is shown in the example above. Fortunately the |
| ``object`` interface (see next section) greatly helps in keeping the |
| code manageable. |
| |
| ------------------ |
| Object interface |
| ------------------ |
| |
| Experienced extension module authors will be familiar with the 'C' view |
| of Python objects, the ubiquitous ``PyObject*``. Most if not all Python |
| 'C' API functions involve ``PyObject*`` as arguments or return type. A |
| major complication is the raw reference counting interface presented to |
| the 'C' programmer. E.g. some API functions return *new references* and |
| others return *borrowed references*. It is up to the extension module |
| writer to properly increment and decrement reference counts. This |
| quickly becomes cumbersome and error prone, especially if there are |
| multiple execution paths. |
| |
| Boost.Python provides a type ``object`` which is essentially a high |
| level wrapper around ``PyObject*``. ``object`` automates reference |
| counting as much as possible. It also provides the facilities for |
| converting arbitrary C++ types to Python objects and vice versa. |
| This significantly reduces the learning effort for prospective |
| extension module writers. |
| |
| Creating an ``object`` from any other type is extremely simple:: |
| |
| object o(3); |
| |
| ``object`` has templated interactions with all other types, with |
| automatic to-python conversions. It happens so naturally that it's |
| easily overlooked. |
| |
| The ``extract<T>`` class template can be used to convert Python objects |
| to C++ types:: |
| |
| double x = extract<double>(o); |
| |
| All registered user-defined conversions are automatically accessible |
| through the ``object`` interface. With reference to the ``World`` class |
| defined in previous examples:: |
| |
| object as_python_object(World("howdy")); |
| World back_as_c_plus_plus_object = extract<World>(as_python_object); |
| |
| If a C++ type cannot be converted to a Python object an appropriate |
| exception is thrown at runtime. Similarly, an appropriate exception is |
| thrown if a C++ type cannot be extracted from a Python object. |
| ``extract<T>`` provides facilities for avoiding exceptions if this is |
| desired. |
| |
| The ``object::attr()`` member function is available for accessing |
| and manipulating attributes of Python objects. For example:: |
| |
| object planet(World()); |
| planet.attr("set")("howdy"); |
| |
| ``planet.attr("set")`` returns a callable ``object``. ``"howdy"`` is |
| converted to a Python string object which is then passed as an argument |
| to the ``set`` method. |
| |
| The ``object`` type is accompanied by a set of derived types |
| that mirror the Python built-in types such as ``list``, ``dict``, |
| ``tuple``, etc. as much as possible. This enables convenient |
| manipulation of these high-level types from C++:: |
| |
| dict d; |
| d["some"] = "thing"; |
| d["lucky_number"] = 13; |
| list l = d.keys(); |
| |
| This almost looks and works like regular Python code, but it is pure C++. |
| |
| ================= |
| Thinking hybrid |
| ================= |
| |
| For many applications runtime performance considerations are very |
| important. This is particularly true for most scientific applications. |
| Often the performance considerations dictate the use of a compiled |
| language for the core algorithms. Traditionally the decision to use a |
| particular programming language is an exclusive one. Because of the |
| practical and mental difficulties of combining different languages many |
| systems are written in just one language. This is quite unfortunate |
| because the price payed for runtime performance is typically a |
| significant overhead due to static typing. For example, our experience |
| shows that developing maintainable C++ code is typically much more |
| time-consuming and requires much more hard-earned working experience |
| than developing useful Python code. A related observation is that many |
| compiled packages are augmented by some type of rudimentary scripting |
| layer. These ad hoc solutions clearly show that many times a compiled |
| language alone does not get the job done. On the other hand it is also |
| clear that a pure Python implementation is too slow for numerically |
| intensive production code. |
| |
| Boost.Python enables us to *think hybrid* when developing new |
| applications. Python can be used for rapidly prototyping a |
| new application. Python's ease of use and the large pool of standard |
| libraries give us a head start on the way to a first working system. If |
| necessary, the working procedure can be used to discover the |
| rate-limiting algorithms. To maximize performance these can be |
| reimplemented in C++, together with the Boost.Python bindings needed to |
| tie them back into the existing higher-level procedure. |
| |
| Of course, this *top-down* approach is less attractive if it is clear |
| from the start that many algorithms will eventually have to be |
| implemented in a compiled language. Fortunately Boost.Python also |
| enables us to pursue a *bottom-up* approach. We have used this approach |
| very successfully in the development of a toolbox for scientific |
| applications (scitbx) that we will describe elsewhere. The toolbox |
| started out mainly as a library of C++ classes with Boost.Python |
| bindings, and for a while the growth was mainly concentrated on the C++ |
| parts. However, as the toolbox is becoming more complete, more and more |
| newly added functionality can be implemented in Python. We expect this |
| trend to continue, as illustrated qualitatively in this figure: |
| |
| .. image:: python_cpp_mix.png |
| |
| This figure shows the ratio of newly added C++ and Python code over |
| time as new algorithms are implemented. We expect this ratio to level |
| out near 70% Python. The increasing ability to solve new problems |
| mostly with the easy-to-use Python language rather than a necessarily |
| more arcane statically typed language is the return on the investment |
| of learning how to use Boost.Python. The ability to solve some problems |
| entirely using only Python will enable a larger group of people to |
| participate in the rapid development of new applications. |
| |
| ============= |
| Conclusions |
| ============= |
| |
| The examples in this paper illustrate that Boost.Python enables |
| seamless interoperability between C++ and Python. Importantly, this is |
| achieved without introducing a third syntax: the Python/C++ interface |
| definitions are written in pure C++. This avoids any problems with |
| parsing the C++ code to be interfaced to Python, yet the interface |
| definitions are concise and maintainable. Freed from most of the |
| development-time penalties of crossing a language boundary, software |
| designers can take full advantage of two rich and complimentary |
| language environments. In practice it turns out that some things are |
| very difficult to do with pure Python/C (e.g. an efficient array |
| library with an intuitive interface in the compiled language) and |
| others are very difficult to do with pure C++ (e.g. serialization). |
| If one has the luxury of being able to design a software system as a |
| hybrid system from the ground up there are many new ways of avoiding |
| road blocks in one language or the other. |
| |
| .. I'm not ready to give up on all of this quite yet |
| |
| .. Perhaps one day we'll have a language with the simplicity and |
| expressive power of Python and the compile-time muscle of C++. Being |
| able to take advantage of all of these facilities without paying the |
| mental and development-time penalties of crossing a language barrier |
| would bring enormous benefits. Until then, interoperability tools |
| like Boost.Python can help lower the barrier and make the benefits of |
| both languages more accessible to both communities. |
| |
| =========== |
| Footnotes |
| =========== |
| |
| .. [#mi] For hard-core new-style class/extension module writers it is |
| worth noting that the normal requirement that all extension classes |
| with data form a layout-compatible single-inheritance chain is |
| lifted for Boost.Python extension classes. Clearly, either |
| ``Base1`` or ``Base2`` has to occupy a different offset in the |
| ``Derived`` class instance. This is possible because the wrapped |
| part of BPL extension class instances is never assumed to have a |
| fixed offset within the wrapper. |
| |
| =========== |
| Citations |
| =========== |
| |
| .. [VELD1995] T. Veldhuizen, "Expression Templates," C++ Report, |
| Vol. 7 No. 5 June 1995, pp. 26-31. |
| http://osl.iu.edu/~tveldhui/papers/Expression-Templates/exprtmpl.html |