network performance test wrt. 100G
==================================

Receiver uses: iperf -si1 -t610
Sender uses:   iperf -i1 -t600 -c ${receiver} -P8
All:           MCX515A-CCAT (Mellanox ConnectX-5) via QSFP28 + OM4

NOTE:
If the receiver runs Linux with 3 GHz CPUs, 4 iperf sender threads on
CPUs >= 2.4 GHz are usually sufficient to reach ~ 94 Gbps, which seems to be
the max. on Linux without any tuning. If the client has CPUs >= 2.8 GHz,
3 threads can saturate the 94 Gbps. However, the tests use 8 threads to give
Solaris a chance to ketchup ;-).

For the tests made, the Linux side isn't tuned at all - Ubuntu minimal server
is used as is. Bionic comes with iperf 2.0.10, focal with 2.0.13.
Solaris 11.4 doesn't ship iperf v2 anymore, so 2.0.13 got compiled with the
same patch set Ubuntu is using. The version 2.0.5 shipped with Solaris 11.3
was not an option, because it dumped core pretty often or hung when the
sender used more than 8 threads.

Also, unless explicitly mentioned (FW), neither sender nor receiver has the
firewall (Solaris: pf, Linux: iptables) enabled.

So the tunings mentioned below refer to Solaris.

slowlaris.svg
-------------
No drv or kernel tuning, just changing the rcv_buf.

slowlaris-16rx.svg
------------------
Changed /kernel/drv/mlxne.conf:

    max_rings_enable=1;
    rx_ring_num = 32;
    tx_ring_num = 32;
    rx_ring_size=2048;
    tx_ring_size=2048;
    mlxne_rx_ring_num = 32;
    mlxne_tx_ring_num = 32;
    mlxne_rx_ring_size=2048;
    mlxne_tx_ring_size=2048;

slowlaris-tuned.svg
-------------------
Changed in addition to the above /etc/system.d/netperf:

    set mac:mac_bm_rings_adj=32
        ** allow doubling the number of mac layer buffers with up to
        ** 32 RX/TX rings
    set mac:mac_poll_enable=0
        ** disable polling mode
    set mac:mac_rx_fanout_hiwat=100000
        ** high water mark for mac RX queuing, default 10,000
    set mac:mac_rx_fanout_inline_max=0
        ** disable inline (in interrupt context) RX stack processing;
        ** use worker threads.
In addition, the tcp parameters were changed to
max-buf = rcv-buf = cwnd-max = 16 MiB, and the rcv-buf size was adjusted
during the tests.

slowlaris-threads.svg
---------------------
Same settings as above, but rcv-buf reset to the default (256 KB).
The number of sending (and thus receiving) threads is varied.
The first 3 tests run from Linux@3GHz to Solaris@2.8 GHz.
The last 4-1 tests are vice versa, i.e. from Solaris@2.8 GHz to Linux@3GHz.
The last test is with the Solaris /etc/system.d/netperf and
/kernel/drv/mlxne.conf changes removed.

Linux 100G perf
===============
fwNN - no firewall on either side
fwSN - firewall enabled on sending side, no firewall on receiving side
fwNR - no firewall on sending side, firewall enabled on receiving side
fwSR - firewall enabled on both sides
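
Side note on the ~ 94 Gbps ceiling from the NOTE at the top: the figure is
close to what plain Ethernet/IP/TCP framing overhead allows on a 100G link.
The following is only a rough sketch to make that arithmetic explicit. It
assumes things these notes do not state: a standard MTU of 1500 bytes, IPv4
without options, no VLAN tag, and TCP timestamps enabled (12 option bytes per
segment). The function name tcp_goodput_gbps is made up for this
illustration.

```python
# Per-frame Ethernet overhead on the wire, in bytes:
# preamble + SFD (8) + MAC header (14) + FCS (4) + inter-frame gap (12).
ETH_OVERHEAD = 8 + 14 + 4 + 12


def tcp_goodput_gbps(line_rate_gbps=100.0, mtu=1500,
                     ip_hdr=20, tcp_hdr=20, tcp_opts=12):
    """Upper bound for TCP payload throughput on an Ethernet link.

    Assumptions (not taken from the test notes): IPv4 without options,
    no VLAN tag, and tcp_opts bytes of TCP options in every segment
    (12 corresponds to the timestamp option plus padding).
    """
    payload = mtu - ip_hdr - tcp_hdr - tcp_opts   # TCP payload bytes per frame
    wire = mtu + ETH_OVERHEAD                     # bytes occupied on the wire
    return line_rate_gbps * payload / wire


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: {tcp_goodput_gbps(mtu=mtu):.1f} Gbps")
```

Under these assumptions an MTU of 1500 yields about 94.1 Gbps, and jumbo
frames (MTU 9000) would yield about 99.0 Gbps. If the MTU assumption holds for
this setup, the observed Linux maximum is probably the framing limit rather
than a CPU limit.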