Support for ibverbs

Receiver performance can be significantly improved by using the InfiniBand Verbs API instead of the BSD sockets API. This has currently been tested only on Linux with Mellanox ConnectX®-3 NICs. It depends on device-managed flow steering (DMFS), which may require using the Mellanox OFED version of libibverbs.

There are a number of limitations in the current implementation:

  • Only IPv4 is supported
  • VLAN tagging, IP header options, and IP fragmentation are not supported
  • Only multicast is supported
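
The IPv4 and multicast restrictions can be checked up front, before a reader or stream is created. The following is a minimal sketch using only the Python standard library. The helper name is_ipv4_multicast is hypothetical and not part of spead2, and it only understands literal IP addresses, so a hostname would have to be resolved first.

```python
import ipaddress


def is_ipv4_multicast(address):
    """Return True if `address` is a literal IPv4 multicast address."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        # Not a literal IP address (for example, a hostname)
        return False
    return ip.version == 4 and ip.is_multicast
```

A unicast address such as 192.168.1.1 or an IPv6 group such as ff02::1 is rejected by this check, matching the limitations listed above.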

Within these limitations, it is quite easy to take advantage of this faster code path. The main difficulty is that one must specify the IP address of the interface that will send or receive the packets. The netifaces module can help find the IP address for an interface by name.
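
As a rough illustration of the netifaces lookup, the sketch below picks the first IPv4 address of a named interface. The helper names first_ipv4_address and address_of_interface are hypothetical, and the interface name passed in is whatever the system uses. netifaces is a third-party package, so it is imported lazily inside the function that needs it.

```python
import socket


def first_ipv4_address(addresses, family=socket.AF_INET):
    """Pick the first IPv4 address from a netifaces-style mapping.

    `addresses` maps address families to lists of dicts with an 'addr'
    key, which is the structure returned by netifaces.ifaddresses().
    """
    entries = addresses.get(family, [])
    if not entries:
        raise ValueError("interface has no IPv4 address")
    return entries[0]["addr"]


def address_of_interface(name):
    """Look up the IPv4 address of the interface called `name`."""
    import netifaces  # third-party; only needed for the real lookup

    return first_ipv4_address(netifaces.ifaddresses(name), netifaces.AF_INET)
```

The returned string can then be passed as the interface_address argument described in the sections below.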

System configuration

It is likely that some system configuration will be needed to allow this mode to work correctly. For ConnectX®-3, add the following to /etc/modprobe.d/mlnx.conf:

options ib_uverbs disable_raw_qp_enforcement=1
options mlx4_core fast_drop=1
options mlx4_core log_num_mgm_entry_size=-1

For more information, see the libvma documentation.
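
To confirm that the configuration file really contains the three options listed above, something like the following sketch could be used. The function names are hypothetical, and the check only looks at the file contents: it does not show whether the kernel modules have been reloaded with those settings.

```python
REQUIRED_OPTIONS = {
    ("ib_uverbs", "disable_raw_qp_enforcement"): "1",
    ("mlx4_core", "fast_drop"): "1",
    ("mlx4_core", "log_num_mgm_entry_size"): "-1",
}


def parse_modprobe_options(text):
    """Parse 'options <module> key=value ...' lines into a dict."""
    found = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, comments and directives other than 'options'
        if len(fields) < 3 or fields[0] != "options":
            continue
        module = fields[1]
        for setting in fields[2:]:
            key, sep, value = setting.partition("=")
            if sep:
                found[(module, key)] = value
    return found


def missing_options(text, required=REQUIRED_OPTIONS):
    """Return the (module, option) pairs that are absent or set differently."""
    found = parse_modprobe_options(text)
    return sorted(key for key, value in required.items() if found.get(key) != value)
```

Calling missing_options on the text of /etc/modprobe.d/mlnx.conf returns an empty list when all three options are present with the values shown above.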

Receiving

The ibverbs API can be used programmatically through an additional method of spead2.recv.Stream.

spead2.recv.Stream.add_udp_ibv_reader(endpoints, interface_address, max_size=DEFAULT_UDP_IBV_MAX_SIZE, buffer_size=DEFAULT_UDP_IBV_BUFFER_SIZE, comp_vector=0, max_poll=DEFAULT_UDP_IBV_MAX_POLL)

Feed data from multicast IPv4 traffic. For backwards compatibility, one can also pass a single address and port as two separate arguments in place of endpoints.

Parameters:
  • endpoints (list) – List of 2-tuples, each containing a hostname/IP address of the multicast group and the UDP port number.
  • interface_address (str) – Hostname/IP address of the interface which will be subscribed
  • max_size (int) – Maximum packet size that will be accepted
  • buffer_size (int) – Requested memory allocation for work requests. Note that this is used to determine the number of packets to buffer; if the packets are smaller than max_size, then fewer bytes will be buffered.
  • comp_vector (int) – Completion channel vector (interrupt) for asynchronous operation, or a negative value to poll continuously. Polling should not be used if there are other users of the thread pool. If a non-negative value is provided, it is taken modulo the number of available completion vectors. This allows a number of readers to be assigned sequential completion vectors and have them load-balanced, without concern for the number available.
  • max_poll (int) – Maximum number of times to poll in a row, without waiting for an interrupt (if comp_vector is non-negative) or letting other code run on the thread (if comp_vector is negative).
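
The sketch below shows one way the method might be used to subscribe to several groups, giving each reader the next completion vector in sequence so that the modulo behaviour described for comp_vector spreads them out. The helper add_ibv_readers is hypothetical. The demo function assumes a spead2 version in which spead2.recv.Stream can be constructed from just a thread pool, and its multicast groups, port and interface address are placeholders.

```python
def add_ibv_readers(stream, endpoints, interface_address, first_comp_vector=0):
    """Add one ibverbs reader per endpoint, each on its own completion vector.

    `stream` is expected to be a spead2.recv.Stream (anything with an
    add_udp_ibv_reader method works), and `endpoints` a list of
    (multicast group, port) tuples.
    """
    for i, endpoint in enumerate(endpoints):
        # Sequential values are fine: the library takes the value modulo
        # the number of completion vectors actually available.
        stream.add_udp_ibv_reader(
            [endpoint], interface_address, comp_vector=first_comp_vector + i
        )


def demo_receive():
    """Not called automatically: needs spead2 and a supported NIC."""
    import spead2
    import spead2.recv

    thread_pool = spead2.ThreadPool()
    stream = spead2.recv.Stream(thread_pool)  # assumed constructor form
    add_ibv_readers(
        stream,
        [("239.1.2.3", 8888), ("239.1.2.4", 8888)],  # placeholder groups
        "192.168.1.10",  # placeholder interface address
    )
    for heap in stream:
        print("received heap", heap.cnt)
```

Passing all endpoints in a single call to add_udp_ibv_reader is the simpler alternative when a single completion vector is enough.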

Environment variables

An existing application can be forced to use ibverbs for all multicast IPv4 readers by setting the environment variable SPEAD2_IBV_INTERFACE to the IP address of the interface that will receive the packets. Note that calls to spead2.recv.Stream.add_udp_reader() that pass an explicit interface will use that interface, overriding SPEAD2_IBV_INTERFACE; in this case, SPEAD2_IBV_INTERFACE serves only to enable the override.

It is also possible to specify SPEAD2_IBV_COMP_VECTOR to override the completion channel vector from the default.

Note that this environment variable currently has no effect on senders.
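
One way to apply these variables without editing the application is to launch it with a modified environment. The helper below is a hypothetical sketch that builds such an environment. It assumes the variables should be in place before the application starts, since the text above does not say when spead2 reads them.

```python
import os


def ibv_environment(interface_address, comp_vector=None, base=None):
    """Return a copy of an environment with the ibverbs overrides set."""
    env = dict(os.environ if base is None else base)
    env["SPEAD2_IBV_INTERFACE"] = interface_address
    if comp_vector is not None:
        # Environment values must be strings
        env["SPEAD2_IBV_COMP_VECTOR"] = str(comp_vector)
    return env
```

The result can be passed as the env argument of subprocess.run when starting the existing receiver program.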

Sending

Sending is done by using the class spead2.send.UdpIbvStream instead of spead2.send.UdpStream. It has a different constructor, but the same methods. There is also a spead2.send.trollius.UdpIbvStream class, analogous to spead2.send.trollius.UdpStream.

class spead2.send.UdpIbvStream(thread_pool, multicast_group, port, config, interface_address, buffer_size, ttl=1, comp_vector=0, max_poll=DEFAULT_MAX_POLL)

Create a multicast IPv4 UDP stream using the ibverbs API

Parameters:
  • thread_pool (spead2.ThreadPool) – Thread pool handling the I/O
  • multicast_group (str) – Multicast group hostname/IP address
  • port (int) – Destination port
  • config (spead2.send.StreamConfig) – Stream configuration
  • interface_address (str) – Hostname/IP address of the interface that will send the packets
  • buffer_size (int) – Requested memory allocation for work requests.
  • ttl (int) – Multicast TTL
  • comp_vector (int) – Completion channel vector (interrupt) for asynchronous operation, or a negative value to poll continuously. Polling should not be used if there are other users of the thread pool. If a non-negative value is provided, it is taken modulo the number of available completion vectors. This allows a number of streams to be assigned sequential completion vectors and have them load-balanced, without concern for the number available.
  • max_poll (int) – Maximum number of times to poll in a row, without waiting for an interrupt (if comp_vector is non-negative) or letting other code run on the thread (if comp_vector is negative).
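
As an illustration of the argument order in the constructor signature above, the following sketch wraps the call in a small helper. The helper make_ibv_sender is hypothetical, and the group, port, interface address and buffer size in the demo are arbitrary placeholders. Note that buffer_size has no default in the signature, so a value must always be supplied.

```python
def make_ibv_sender(stream_cls, thread_pool, config, multicast_group, port,
                    interface_address, buffer_size, ttl=1, comp_vector=0):
    """Construct an ibverbs sender following the documented signature.

    `stream_cls` is expected to be spead2.send.UdpIbvStream; it is passed
    in so that this helper does not itself need to import spead2.
    """
    return stream_cls(thread_pool, multicast_group, port, config,
                      interface_address, buffer_size,
                      ttl=ttl, comp_vector=comp_vector)


def demo_send():
    """Not called automatically: needs spead2 and a supported NIC."""
    import spead2
    import spead2.send

    return make_ibv_sender(
        spead2.send.UdpIbvStream,
        spead2.ThreadPool(),
        spead2.send.StreamConfig(),
        "239.1.2.3",        # placeholder multicast group
        8888,               # placeholder port
        "192.168.1.10",     # placeholder interface address
        64 * 1024 * 1024,   # arbitrary buffer size, not a recommendation
    )
```

The returned stream is then used in the same way as a spead2.send.UdpStream, since the two classes share the same methods.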