Python bindings
The Python bindings are implemented using pybind11, which handles most of the work of exposing C++ classes and members to Python.
Global Interpreter Lock
The Python Global Interpreter Lock (GIL) is both good news and bad news for the bindings. The good news is that it ensures that (by default) only one Python thread can be calling the spead2 API at a time, which provides some basic thread safety. Here’s the bad news (but also read the pybind11 documentation on the topic):
Without additional steps, blocking calls to the spead2 API would prevent other Python threads from making progress. Mostly this just hurts performance, but it can even lead to deadlocks. This is mostly handled by judicious use of
pybind11::gil_scoped_release
in functions that wait for events or take C++ locks.When a thread does not hold the GIL (either because it released it, or because it is a C++ thread created internally) it cannot safely call Python APIs. While
pybind11::object
is a very convenient interface for reference counting, it can cause accidental use of the reference counting APIs through destructors or copy constructors. For any class that embeds a Python object, it is important to know from which threads it could be destroyed (and in some cases the core spead2 design has had to be modified to account for this). Some pieces of code that do not expect to do any reference counting choose to usepybind11::handle
instead to eliminate the risk.If a worker thread tries to acquire the GIL, this could block for a significant period of time, which in turn could cause high latency for time-critical work (such as processing packets).
Logging to the Python logging system is one area where this used to be problematic. That has been solved by putting log messages (as C++ strings) on a ringbuffer and using a dedicated worker thread to pass the log messages to Python. If this worker thread is blocked, log messages simply accumulate in the ringbuffer, and eventually log messages get dropped, but the thread that is doing the logging is not blocked.
A similar issue is notifying Python code that some asynchronous work has completed. This is covered in a later section.
Note that once PEP 703 is implemented, the spead2 bindings may need significant rework to support no-GIL mode.
Semaphores
spead2 uses semaphores extensively for producer/consumer relationships between threads. Dropping the GIL before calling a blocking function will partially help, but has two potential issues:
If the blocking function waits on a semaphore and then takes action that uses the Python API, it will need to re-acquire the GIL before taking that action.
If the user presses Ctrl-C, it will not interrupt the program: it will interrupt the low-level semaphore wait (with
EINTR
), but the default semaphore implementation in spead2 will simply retry that wait. Graceful handling of Ctrl-C requires usingPyErr_CheckSignals()
to give Python a chance to handle the interrupt.
To address both concerns, several functions (particularly in
ringbuffer
) take a variadic number of extra arguments, which are
passed on to semaphore_get()
. The Python bindings provide an
overload of semaphore_get()
, selected by passing an instance of the
empty class gil_release_tag
, that handles both releasing the GIL
and checking PyErr_CheckSignals()
each time the semaphore wait is
interrupted.
Asynchronous callbacks
The C++ interface for sending heaps asynchronously takes a callback which is
executed when the heap has been transmitted. While the high-level Python
interface exposed to users wraps this into an asyncio.Future
, the
low-level interface exposed from C++ to Python still takes a callback.
However, the callback is Python code and can only be called with the GIL held.
Furthermore, since asyncio
is not thread-safe, it should be run in the
thread that’s running the asyncio event loop, rather than the C++ thread
that’s doing the networking.
To solve these problems, the C++-level callback doesn’t directly invoke the
Python callback. Instead, it puts it in a vector of deferred callbacks. To
wake up the event loop without locking the GIL, it writes data to a file
descriptor (eventfd in Linux, but it’s abstracted by
semaphore_fd
), which the asyncio event loop watches. The
event loop thread then requests the stream to execute all its deferred
callbacks.
As an optimisation, the file descriptor is only touched if the callback list was empty. Thus, a rapid burst of complete heaps only requires one wakeup. This requires some careful use of a mutex to correctly handle the case where new callbacks are added while the callback list is being processed.
Interpreter shutdown
If spead2 objects are still present when the Python interpreter is shut down,
their destructors may try to interact with the Python API after it is too late
to safely do so. For example, the logging system may try to interact with the
GIL after the GIL has already been destroyed. Thus, some objects need to be
shut down earlier in the interpreter shutdown process, and this is achieved
using the atexit
module. An exit_stopper
class simplifies
the process.