Support for ibverbs

Support for libibverbs in C++ is essentially the same as in Python, with the same limitations. The programmatic interface is provided by the spead2::recv::udp_ibv_reader and spead2::send::udp_ibv_stream classes:

class udp_ibv_config : public spead2::detail::udp_ibv_config_base<udp_ibv_config>

Configuration for spead2::recv::udp_ibv_reader.

Subclassed by spead2::recv::udp_ibv_config_wrapper

Public Functions

inline std::size_t get_max_size() const

Get maximum packet size to accept.

udp_ibv_config &set_max_size(std::size_t max_size)

Set maximum packet size to accept.

inline const std::vector<boost::asio::ip::udp::endpoint> &get_endpoints() const

Get the configured endpoints.

udp_ibv_config &set_endpoints(const std::vector<boost::asio::ip::udp::endpoint> &endpoints)

Set the endpoints (replacing any previous).

Throws:

std::invalid_argument – if any element of endpoints is invalid.

udp_ibv_config &add_endpoint(const boost::asio::ip::udp::endpoint &endpoint)

Append a single endpoint.

Throws:

std::invalid_argument – if endpoint is invalid.

inline const boost::asio::ip::address get_interface_address() const

Get the currently set interface address.

udp_ibv_config &set_interface_address(const boost::asio::ip::address &interface_address)

Set the interface address.

Throws:

std::invalid_argument – if interface_address is not an IPv4 address.

inline std::size_t get_buffer_size() const

Get the currently configured buffer size.

udp_ibv_config &set_buffer_size(std::size_t buffer_size)

Set the buffer size.

The value 0 is special and resets the buffer size to the default. The buffer size actually used may differ slightly, because it is rounded to a whole number of packet-sized slots.
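The rounding to packet-sized slots can be pictured with a small standalone sketch. The helper round_to_slots is hypothetical and not part of spead2, and it assumes that the size is rounded down and that at least one slot is always kept; the library may round differently.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper (not part of spead2): round a requested buffer size
// down to a whole number of slots of slot_size bytes, keeping at least one
// slot. slot_size must be non-zero.
std::size_t round_to_slots(std::size_t buffer_size, std::size_t slot_size)
{
    std::size_t n_slots = std::max<std::size_t>(1, buffer_size / slot_size);
    return n_slots * slot_size;
}
```

Under these assumptions, a 16 MiB request with 9000-byte slots yields 1864 slots, slightly less than the requested size.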

inline int get_comp_vector() const

Get the completion channel vector (see set_comp_vector).

udp_ibv_config &set_comp_vector(int comp_vector)

Set the completion channel vector (interrupt) for asynchronous operation.

Use a negative value to poll continuously. Polling should not be used if there are other users of the thread pool. If a non-negative value is provided, it is taken modulo the number of available completion vectors. This allows a number of streams to be assigned sequential completion vectors and have them load-balanced, without concern for the number available.
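The mapping from the requested value to the vector actually used can be sketched as below. The function effective_comp_vector and the constant poll_continuously are hypothetical names used only for illustration; in a real program the number of available vectors would come from the device (for example the num_comp_vectors field of a libibverbs ibv_context).

```cpp
// Hypothetical sentinel meaning "no interrupts, poll continuously".
constexpr int poll_continuously = -1;

// Hypothetical helper (not part of spead2): negative requests select polling,
// non-negative requests wrap around the number of available vectors.
// num_comp_vectors must be positive.
int effective_comp_vector(int comp_vector, int num_comp_vectors)
{
    if (comp_vector < 0)
        return poll_continuously;
    return comp_vector % num_comp_vectors;
}
```

With four vectors available, streams that request 0 through 7 end up on vectors 0, 1, 2, 3, 0, 1, 2, 3, which is the load-balancing behaviour described above.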

inline int get_max_poll() const

Get maximum number of times to poll in a row (see set_max_poll).

udp_ibv_config &set_max_poll(int max_poll)

Set maximum number of times to poll in a row.

If interrupts are enabled (default), it is the maximum number of times to poll before waiting for an interrupt; if they are disabled by set_comp_vector, it is the number of times to poll before letting other code run on the thread.

Throws:

std::invalid_argument – if max_poll is zero.
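The role of max_poll can be illustrated with a simplified model of a bounded polling loop. Everything here (next_step, poll_bounded, the poll_once callback) is hypothetical and only mirrors the description above; it is not the library's actual event loop.

```cpp
#include <functional>

// What the caller should do once the bounded polling round has finished.
enum class next_step { got_data, wait_for_interrupt, yield_to_other_code };

// Hypothetical model (not part of spead2): call poll_once at most max_poll
// times. If nothing arrives, either wait for an interrupt or hand the thread
// back to other code, depending on whether interrupts are enabled.
next_step poll_bounded(const std::function<bool()> &poll_once,
                       int max_poll, bool interrupts_enabled)
{
    for (int i = 0; i < max_poll; i++)
        if (poll_once())
            return next_step::got_data;
    return interrupts_enabled ? next_step::wait_for_interrupt
                              : next_step::yield_to_other_code;
}
```

A larger max_poll keeps the thread spinning for longer before it gives up the CPU, which in this model trades CPU time for a lower chance of paying interrupt or rescheduling latency.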

Public Static Attributes

static constexpr std::size_t default_buffer_size = 16 * 1024 * 1024

Receive buffer size, if none is explicitly set.

static constexpr std::size_t default_max_size = udp_reader_base::default_max_size

Maximum packet size to accept, if none is explicitly set.

static constexpr int default_max_poll = 10

Number of times to poll in a row, if none is explicitly set.

class udp_ibv_reader : public spead2::recv::detail::udp_ibv_reader_base<udp_ibv_reader>

Synchronous or asynchronous stream reader that reads UDP packets using the InfiniBand verbs API.

It currently only supports IPv4, with no fragmentation, IP header options, or VLAN tags.

Public Functions

udp_ibv_reader(stream &owner, const udp_ibv_config &config)

Constructor.

Parameters:
  • owner – Owning stream

  • config – Configuration

Throws:
  • std::invalid_argument – If no endpoints are set.

  • std::invalid_argument – If no interface address is set.

class udp_ibv_config : public spead2::detail::udp_ibv_config_base<udp_ibv_config>

Configuration for spead2::send::udp_ibv_stream.

Subclassed by spead2::send::udp_ibv_config_wrapper

Public Functions

inline std::uint8_t get_ttl() const

Get the IP TTL.

udp_ibv_config &set_ttl(std::uint8_t ttl)

Set the IP TTL.

inline const std::vector<memory_region> &get_memory_regions() const

Get currently registered memory regions.

udp_ibv_config &set_memory_regions(const std::vector<memory_region> &memory_regions)

Register a set of memory regions (replacing any previous).

Items stored inside such pre-registered memory regions can (in most cases) be transmitted without making a copy. A memory region is defined by a start pointer and a size in bytes.

Memory regions must not overlap; this is only validated when constructing the stream.
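Since overlap is only detected when the stream is constructed, it can be convenient to check a set of regions beforehand. The following is a minimal sketch of such a check; the region struct and the any_overlap function are hypothetical stand-ins and are not the library's own validation code.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a memory region: start pointer and size in bytes.
struct region
{
    const void *ptr;
    std::size_t size;
};

// Returns true if any two regions share at least one byte. The regions are
// sorted by start address so that only neighbours need to be compared.
bool any_overlap(std::vector<region> regions)
{
    auto start = [](const region &r) {
        return reinterpret_cast<std::uintptr_t>(r.ptr);
    };
    std::sort(regions.begin(), regions.end(),
              [&](const region &a, const region &b) { return start(a) < start(b); });
    for (std::size_t i = 1; i < regions.size(); i++)
        if (start(regions[i - 1]) + regions[i - 1].size > start(regions[i]))
            return true;
    return false;
}
```

Regions that merely touch (one ends exactly where the next begins) are treated as non-overlapping in this sketch.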

udp_ibv_config &add_memory_region(const void *ptr, std::size_t size)

Append a memory region (see set_memory_regions).

inline const std::vector<boost::asio::ip::udp::endpoint> &get_endpoints() const

Get the configured endpoints.

udp_ibv_config &set_endpoints(const std::vector<boost::asio::ip::udp::endpoint> &endpoints)

Set the endpoints (replacing any previous).

Throws:

std::invalid_argument – if any element of endpoints is invalid.

udp_ibv_config &add_endpoint(const boost::asio::ip::udp::endpoint &endpoint)

Append a single endpoint.

Throws:

std::invalid_argument – if endpoint is invalid.

inline const boost::asio::ip::address get_interface_address() const

Get the currently set interface address.

udp_ibv_config &set_interface_address(const boost::asio::ip::address &interface_address)

Set the interface address.

Throws:

std::invalid_argument – if interface_address is not an IPv4 address.

inline std::size_t get_buffer_size() const

Get the currently configured buffer size.

udp_ibv_config &set_buffer_size(std::size_t buffer_size)

Set the buffer size.

The value 0 is special and resets the buffer size to the default. The buffer size actually used may differ slightly, because it is rounded to a whole number of packet-sized slots.

inline int get_comp_vector() const

Get the completion channel vector (see set_comp_vector).

udp_ibv_config &set_comp_vector(int comp_vector)

Set the completion channel vector (interrupt) for asynchronous operation.

Use a negative value to poll continuously. Polling should not be used if there are other users of the thread pool. If a non-negative value is provided, it is taken modulo the number of available completion vectors. This allows a number of streams to be assigned sequential completion vectors and have them load-balanced, without concern for the number available.

inline int get_max_poll() const

Get maximum number of times to poll in a row (see set_max_poll).

udp_ibv_config &set_max_poll(int max_poll)

Set maximum number of times to poll in a row.

If interrupts are enabled (default), it is the maximum number of times to poll before waiting for an interrupt; if they are disabled by set_comp_vector, it is the number of times to poll before letting other code run on the thread.

Throws:

std::invalid_argument – if max_poll is zero.

Public Static Attributes

static constexpr std::size_t default_buffer_size = 512 * 1024

Default send buffer size.

static constexpr int default_max_poll = 10

Default number of times to poll in a row.

class udp_ibv_stream : public spead2::send::stream

Public Functions

udp_ibv_stream(io_service_ref io_service, const stream_config &config, const udp_ibv_config &ibv_config)

Constructor.

Parameters:
  • io_service – I/O service for sending data

  • config – Common stream configuration

  • ibv_config – Class-specific stream configuration

Throws:
  • std::invalid_argument – if ibv_config does not have an interface address set.

  • std::invalid_argument – if ibv_config does not have any endpoints set.

  • std::invalid_argument – if memory regions overlap.

PeerDirect

The pointer given to spead2::send::udp_ibv_config::add_memory_region() is passed to ibv_reg_mr(). When using an NVIDIA NIC, this can be a pointer that is handled by PeerDirect, such as a GPU device pointer. This can be used to transfer data directly from a GPU to the network without passing through the CPU.

This approach does need some care, because the spead2 implementation will fall back to copying if a packet contains too many discontiguous pieces of memory. It will be safe as long as there is only one item in a heap that uses a registered memory region, or as long as all such items are at least as big as the packet size.

For an example of this, see examples/gpudirect_example.cu in the spead2 source distribution.
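The safety condition described above can be written as a simple predicate over the items of a heap. This is only a sketch of the stated rule: avoids_copy_fallback is a hypothetical name, and packet_size is assumed to be the maximum packet size configured for the stream.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical check (not part of spead2). registered_item_sizes holds the
// sizes in bytes of the heap items that live in registered memory regions.
// Returns true if the heap meets the sufficient condition quoted above: at
// most one such item, or every such item at least as big as a packet.
bool avoids_copy_fallback(const std::vector<std::size_t> &registered_item_sizes,
                          std::size_t packet_size)
{
    if (registered_item_sizes.size() <= 1)
        return true;
    return std::all_of(registered_item_sizes.begin(), registered_item_sizes.end(),
                       [packet_size](std::size_t size) { return size >= packet_size; });
}
```

The condition is sufficient rather than necessary, so a heap that fails this check is not guaranteed to trigger the copying fallback; it is simply outside the case the text above describes as safe.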