Command-line tools
spead2_bench
A benchmarking tool is provided to estimate the maximum throughput for UDP. There are two versions: one implemented in Python 3 (spead2_bench.py) and one in C++ (spead2_bench), which are installed by the corresponding installers. The examples show the Python version, but the C++ version functions very similarly. However, they cannot be mixed: use the same version on each end of the connection.
On the receiver, pick a port number (which must be free for both TCP and UDP) and run
spead2_bench.py agent <port>
Then, on the sender, run
spead2_bench.py master [options] <host> <port>
where host is the hostname of the receiver. This script will run tests at a variety of speeds to determine the maximum speed at which the connection seems reliable most of the time. This speed is right at the edge of stability: for a totally reliable setup, you should use a lower speed.
spead2_send/spead2_recv
There are also separate spead2_send and spead2_recv programs (and Python 3 equivalents). The former generates a stream of meaningless data, while the latter consumes an existing stream and reports the heaps and items that it finds. Besides being useful for debugging a stream, spead2_recv has a similarly wide range of command-line tuning options, which makes it suitable for experimenting with receiver settings.
mcdump
mcdump is a tool similar to tcpdump, but specialised for high-speed capture of UDP traffic using hardware that supports the Infiniband Verbs API. It has only been tested on NVIDIA ConnectX NICs. Like gulp, it uses a separate thread for disk I/O and CPU core affinity to achieve reliable performance. With a sufficiently fast disk subsystem, it is able to capture at line rate from a 40 Gb/s adapter.
It is not limited to capturing SPEAD data. It is included with spead2 rather than released separately because it reuses a lot of the spead2 code.
Installation
The tool is automatically compiled and installed with spead2, provided that libibverbs support is detected at configure time.
It may also be necessary to configure the system to work with ibverbs. See Support for ibverbs for more information.
Usage
The simplest incantation is
mcdump -i xx.xx.xx.xx output.pcap yy.yy.yy.yy:zzzz
which will capture on the interface with IP address xx.xx.xx.xx, for the multicast group yy.yy.yy.yy on UDP port zzzz. mcdump will take care of subscribing to the multicast group. Note that only IPv4 is supported. Capture continues until interrupted by Ctrl-C. You can also list more group:port pairs, which will all be stored in the same pcap file.
While originally written for multicast, mcdump also supports unicast. An IP address must still be provided; usually it will be the same as the interface address, but it could be a different address if the interface has multiple IP addresses.
You can also specify - in place of the filename to suppress the write to file. This is useful to simply count the bytes/packets received without being limited by disk throughput.
Unfortunately, unlike tcpdump, it is not possible to directly tell whether packets were dropped. NIC counters (on Linux, accessed with ethtool -S) can give an indication, although sometimes packets are dropped during the shutdown process.
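One way to use the NIC counters is to snapshot them before and after a capture and look at which drop-related counters increased. The following is a minimal sketch, not part of spead2: it assumes the usual "name: value" layout printed by ethtool -S, and the keyword list used to pick out drop counters is a guess, since counter names are driver-specific. The helper names are hypothetical.

```python
import re
import subprocess


def parse_ethtool_stats(text):
    """Parse the text printed by `ethtool -S <iface>` into {name: int}."""
    stats = {}
    for line in text.splitlines():
        # Counter lines look like "     rx_packets: 12345". The header line
        # ("NIC statistics:") has no number after the colon and is skipped.
        match = re.match(r"\s*(\S.*?):\s*(-?\d+)\s*$", line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def read_stats(interface):
    """Run ethtool -S on a named interface (e.g. "eth0") and parse the result."""
    result = subprocess.run(
        ["ethtool", "-S", interface],
        capture_output=True, text=True, check=True,
    )
    return parse_ethtool_stats(result.stdout)


def changed_counters(before, after,
                     keywords=("drop", "discard", "out_of_buffer")):
    """Return {name: increase} for counters that rose between two snapshots.

    Only counters whose name contains one of `keywords` are reported.
    The keyword list is an assumption; adjust it for your NIC driver.
    """
    changes = {}
    for name, value in after.items():
        delta = value - before.get(name, 0)
        if delta > 0 and any(key in name for key in keywords):
            changes[name] = delta
    return changes
```

Call read_stats before starting mcdump and again after it exits, then pass both snapshots to changed_counters. As noted above, a small increase may only reflect packets dropped during shutdown.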
These options are important for performance:
- -N <cpu>, -C <cpu>, -D <cpu>
Set CPU core IDs for various threads. The -D option can be repeated multiple times to use multiple threads for disk I/O. By default, the threads are not bound to any particular core. It is recommended that these cores be on the same CPU socket as the NIC.
- --direct-io
Use the O_DIRECT flag to open the file. This bypasses the kernel page cache, and can in some cases yield higher performance. However, not all filesystems support it, and it can also reduce performance when capturing a small enough amount of data that it will fit into RAM.
- --count <count>
Stop after <count> packets have been received. Without this option, mcdump will run until SIGINT (Ctrl-C) is received.
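When mcdump is launched from a script, the options above can be assembled programmatically. The sketch below is only an illustration: the function name and its parameters are hypothetical, and it uses only the flags described in this section, placing options before the filename as in the example command. The core numbers and addresses in the test call are placeholders.

```python
import shlex


def mcdump_command(interface, output, endpoints,
                   n_cpu=None, c_cpu=None, d_cpus=(),
                   direct_io=False, count=None):
    """Build an argument list for mcdump from the options in this section.

    interface -- IP address of the capture interface (passed to -i)
    output    -- pcap filename, or "-" to suppress writing to file
    endpoints -- one or more "address:port" strings
    n_cpu, c_cpu, d_cpus -- core IDs for -N, -C and (repeatable) -D
    """
    if not endpoints:
        raise ValueError("at least one address:port pair is required")
    cmd = ["mcdump", "-i", interface]
    if n_cpu is not None:
        cmd += ["-N", str(n_cpu)]
    if c_cpu is not None:
        cmd += ["-C", str(c_cpu)]
    for cpu in d_cpus:
        # -D may be given several times, once per disk I/O thread.
        cmd += ["-D", str(cpu)]
    if direct_io:
        cmd.append("--direct-io")
    if count is not None:
        cmd += ["--count", str(count)]
    cmd.append(output)
    cmd.extend(endpoints)
    return cmd


if __name__ == "__main__":
    # Placeholder addresses and core IDs, for illustration only.
    example = mcdump_command(
        "10.0.0.1", "output.pcap", ["239.1.2.3:7148"],
        n_cpu=1, c_cpu=2, d_cpus=[3, 4], direct_io=True,
    )
    print(shlex.join(example))
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting issues.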
Limitations
Only IPv4 is supported.
It is not optimised for small packets (below about 1 KB). Packet capture rates top out at around 6 Mpps on current hardware.
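To see what the packet-rate ceiling means for throughput, the following back-of-the-envelope sketch combines the roughly 6 Mpps figure with the 40 Gb/s line rate mentioned earlier. It is a simplification under stated assumptions: it treats 6 Mpps as a hard limit, counts only the bytes in each packet, and ignores Ethernet framing overhead. The function names are hypothetical.

```python
def max_capture_gbps(packet_bytes, max_pps=6e6, line_rate_gbps=40.0):
    """Estimate achievable capture rate in Gb/s for a given packet size.

    The result is the smaller of the line rate and the rate implied by
    the packet-per-second ceiling (both taken from the text above).
    """
    packet_limited_gbps = packet_bytes * 8 * max_pps / 1e9
    return min(packet_limited_gbps, line_rate_gbps)


def min_packet_bytes_for_line_rate(max_pps=6e6, line_rate_gbps=40.0):
    """Smallest packet size (bytes) at which the pps ceiling no longer binds."""
    return line_rate_gbps * 1e9 / (8 * max_pps)


if __name__ == "__main__":
    for size in (256, 512, 1024, 4096):
        print(f"{size:5d} bytes: about {max_capture_gbps(size):.1f} Gb/s")
    print(f"break-even size: about {min_packet_bytes_for_line_rate():.0f} bytes")
```

Under these assumptions, packets smaller than roughly 830 bytes cannot fill a 40 Gb/s link at 6 Mpps, which is consistent with the "below about 1 KB" guidance.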
spead2_net_raw
When using ibverbs, it is necessary to have the CAP_NET_RAW capability on Linux. While this can be achieved by running as root, doing so may be undesirable. The spead2_net_raw utility program can be used to simplify running ibverbs applications. To use it, the program must first be given the capability. After installation, this can be done by running
sudo setcap cap_net_raw+p /usr/local/bin/spead2_net_raw
Adjust the path as necessary to match your installation. If spead2_net_raw did not get installed, check that you have the libcap development headers installed (for example, libcap-dev in Ubuntu), and rerun configure to detect it.
Now you can prefix any command with spead2_net_raw and it will have the CAP_NET_RAW capability. It is an “ambient” capability, so all child processes will have the capability too, which can be useful if the process you run is a shell.
Warning
After doing the above, any user on the system that can run spead2_net_raw will be able to intercept any incoming network traffic or generate arbitrary outgoing traffic. You should not do this blindly if there are untrusted users on your system, or if the system allows untrusted code to run outside of a secure sandbox.
This is not the only way to give spead2_net_raw the capability (you can, for example, make it an “inherited” capability), but a full discussion of the Linux capabilities model is beyond the scope of this manual.
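To confirm that a process started through spead2_net_raw really holds the capability, you can inspect the capability bitmasks that Linux exposes in /proc/<pid>/status. The script below is a minimal sketch and not part of spead2: the function names are hypothetical, and it relies on CAP_NET_RAW being capability number 13, as defined in the Linux kernel headers.

```python
CAP_NET_RAW = 13  # bit number of CAP_NET_RAW in <linux/capability.h>

CAPABILITY_FIELDS = ("CapInh", "CapPrm", "CapEff", "CapBnd", "CapAmb")


def capability_sets(status_text):
    """Extract capability bitmasks from the text of /proc/<pid>/status."""
    sets = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in CAPABILITY_FIELDS:
            # Each mask is printed as a hexadecimal number.
            sets[key] = int(value.strip(), 16)
    return sets


def has_cap_net_raw(status_text, which="CapEff"):
    """Return True if CAP_NET_RAW is set in the named capability set."""
    mask = capability_sets(status_text).get(which, 0)
    return bool(mask & (1 << CAP_NET_RAW))


if __name__ == "__main__":
    try:
        with open("/proc/self/status") as status_file:
            text = status_file.read()
    except OSError as exc:
        print(f"cannot read /proc/self/status: {exc}")
    else:
        for name in ("CapEff", "CapAmb"):
            print(f"{name}: CAP_NET_RAW set = {has_cap_net_raw(text, name)}")
```

Saved under a name of your choice (say check_caps.py), running it as spead2_net_raw python3 check_caps.py should report the capability in both the effective and ambient sets, while running it without the prefix as an unprivileged user should not.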