42. Appendix 1 – DPDK Configuration

Wanguard 8.4 is compatible with DPDK 24.11 running on Ubuntu 18+, Debian 10+, and Rocky/AlmaLinux 8+. The code is currently optimized for the Broadwell microarchitecture and runs on all Intel microarchitectures from Sandy Bridge onward (Ivy Bridge, Haswell, Broadwell, Skylake, etc.), as well as AMD Zen processors. Consult the comparison table in Choosing a Method of DDoS Mitigation for other DPDK limitations.
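
To verify which instruction sets the CPU exposes (not part of the original guide, but a quick sanity check), inspect the CPU model and flags; the avx2 flag indicates Haswell/Broadwell or newer, while Sandy Bridge and Ivy Bridge expose only avx:

[root@localhost ~]# lscpu | grep 'Model name'
[root@localhost ~]# grep -m1 -o 'avx2' /proc/cpuinfo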

42.1. DPDK Installation

To use DPDK 24.11, follow the installation guide from https://www.dpdk.org and reserve at least 8 x 1GB hugepages for best performance. For reference:

Install packages required for building DPDK:

[root@localhost ~]# apt install build-essential meson ninja-build libpcap-dev libnuma-dev python3-pyelftools

If you use a Mellanox or NVIDIA network card, install the OFED driver first:

[root@localhost ~/MLNX_OFED_LINUX]# ./mlnxofedinstall --add-kernel-support --dpdk

Download and build the latest version of the stable DPDK 24.11 branch:

[root@localhost ~]# wget https://fast.dpdk.org/rel/dpdk-24.11.1.tar.xz
[root@localhost ~]# tar xf ./dpdk-24.11.1.tar.xz
[root@localhost ~]# cd dpdk-24.11
[root@localhost ~/dpdk-24.11]# meson setup build
[root@localhost ~/dpdk-24.11]# ninja -C build
[root@localhost ~/dpdk-24.11]# ninja -C build install
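
If applications such as dpdk-testpmd later fail to find the installed DPDK shared libraries, refreshing the dynamic linker cache usually resolves it:

[root@localhost ~]# ldconfig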

Configure the OS to reserve 1GB hugepages. Edit /etc/default/grub to pass these options to the kernel:

default_hugepagesz=1G hugepagesz=1G hugepages=8
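
For example, if GRUB_CMDLINE_LINUX_DEFAULT contains no other options, the resulting line in /etc/default/grub could look as follows (append the options to any existing ones rather than replacing them):

GRUB_CMDLINE_LINUX_DEFAULT="default_hugepagesz=1G hugepagesz=1G hugepages=8"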

Update grub and create a mount point:

[root@localhost ~]# update-grub
[root@localhost ~]# mkdir /mnt/huge
[root@localhost ~]# mount -t hugetlbfs -o pagesize=1GB nodev /mnt/huge

To make this mount permanent, add the following line to /etc/fstab:

nodev /mnt/huge hugetlbfs pagesize=1GB 0 0

Reboot and verify the hugepage configuration and driver status:

[root@localhost ~]# ~/dpdk-24.11/usertools/dpdk-hugepages.py --show
[root@localhost ~]# ~/dpdk-24.11/usertools/dpdk-devbind.py --status

Mellanox or NVIDIA NICs don’t require special binding for DPDK, but other NICs do:

[root@localhost ~]# modprobe vfio-pci
[root@localhost ~]# ~/dpdk-24.11/usertools/dpdk-devbind.py -b vfio-pci <pci_id> #replace <pci_id> with the number shown in the first column from the --status output
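
For example, assuming the NIC appears at the hypothetical PCI address 0000:03:00.0 in the --status output (use the address reported on your own system):

[root@localhost ~]# ~/dpdk-24.11/usertools/dpdk-devbind.py -b vfio-pci 0000:03:00.0
[root@localhost ~]# ~/dpdk-24.11/usertools/dpdk-devbind.py --status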

Finally, check the port status with dpdk-testpmd to confirm that DPDK recognizes and manages the interfaces:

[root@localhost ~]# dpdk-testpmd -c 0xff -- -i
testpmd> show port info all

42.2. Application Workflow

The architecture is depicted in the diagram below, illustrating a specific scenario with: 2 I/O RX lcores and 2 I/O TX lcores, handling packet I/O for 4 NIC ports (2 ports per I/O lcore), 2 Distributor lcores, and 6 Worker lcores for CPU-intensive tasks.

[Figure: DPDK application architecture with I/O RX, Distributor, Worker, and I/O TX lcores]

I/O RX Lcore receives packets from the assigned NIC RX ring(s) and dispatches them to one or more Distributor lcores (logical CPU cores).

Distributor Lcore reads packets from one or more I/O RX lcores, extracts packet metadata, possibly applies the Dataplane firewall functionality, then sends metadata to one or more Worker lcores.

Worker Lcore performs heavier CPU tasks like traffic analysis and attack detection.

I/O TX Lcore handles packet TX for a predefined set of NIC ports. Packets are forwarded in batches of at least 4, causing high latency (>50 ms) if only a few packets/s are forwarded. At thousands of packets/s, latency drops under 1 ms.

A single Master Lcore is used to aggregate data from the workers.

42.3. DPDK Capture Engine Options

DPDK Driver – If using a Mellanox/NVIDIA NIC, select the second option; otherwise, pick the first one
EAL Options – A mandatory parameter. Consult the DPDK Getting Started Guide or the Configuration Example below for details
RX Parameters – A mandatory parameter specifying NIC RX ports/queues for I/O RX lcores, thus defining the set of I/O RX lcores. Syntax: “(PORT,QUEUE,LCORE)..”
Distributor Mode – Choose the algorithm for dispatching packets from RX to the Distributor lcores:
Round Robin – Evenly shares load across Distributor lcores. Best option if packets are not forwarded
Receive Side Scaling (RSS) – All packets with the same RSS hash go to the same Distributor lcore. Best for forwarding; preserves packet order
Custom – Manually assign Distributor lcores per RX port. In that case, the RX Parameters syntax is “(PORT,QUEUE,LCORE,DISTRIBUTOR_LCORE)..”
Distributor Lcores – Mandatory parameter listing the lcore(s) used by the Distributor thread(s). Can be a single lcore or multiple lcores separated by commas
Worker Lcores – Mandatory parameter specifying the set of worker lcores. Each worker lcore handles traffic analysis tasks
Master Lcore – Use a single lcore for thread management. This is mandatory
Forwarding Mode – Specifies TX functionality:
Disabled – No packet forwarding, application acts as a passive sniffer
Transparent Bridge – Ethernet frames are either dropped (filtered) or forwarded unchanged, acting as a transparent bridge. Fastest forwarding method
IP Forwarding – For each IPv4 packet (IPv6 is not supported), the application responds to ARP queries for the IPs assigned to its interfaces, rewrites the SRC MAC with the MAC of the output interface, and rewrites the DST MAC with the address configured in the Destination MACs field below. RFC 1812 checks are not performed and the TTL is not decremented. Use this mode when the server is deployed out-of-line with traffic redirected to it by BGP. Packets matching filtering rules are not forwarded
TX Parameters – Mandatory if Forwarding Mode ≠ Disabled. Defines which lcores handle TX and which NIC TX ports are handled by the I/O TX lcores. Syntax: “(PORT,LCORE)..”
Forwarding Table – Determines the output interface based on the input interface. Syntax: “(PORT_IN,PORT_OUT)..”
Interface IPs – Mandatory when Forwarding Mode = IP Forwarding. Gives each port an IP, letting the application respond to ARP requests. The application does not provide a complete TCP/IP stack, so it’s recommended to configure the ARP table manually on the router, because the application’s ARP responses may be too slow due to bulk processing. Syntax: “(PORT,IPV4)..”
Destination MACs – Specifies the gateway MAC address for each port, used when Forwarding Mode = IP Forwarding. Syntax: “(PORT,MAC_ADDRESS)..”
Maximum Frame Size – If jumbo frames are used, enter the maximum size (often 9000). Default: 1518 for standard Ethernet frames
IP Hash Table Size – By default, IPs are tracked in a hash table with 1,048,576 entries per traffic direction, separate for IPv4 and IPv6
Int. IP Mempool Size – Default: 1,048,576 (split equally across worker lcores). Pre-allocated RAM to track up to 1 million internal IPs within the IP Zone, assuming they all exchange traffic within 1–5 seconds. RAM usage per IP is listed in Sensor Graphs under IP Structure RAM
Ext. IP Mempool Size – For recording traffic data on external IPs. Default: 1,048,576 (also split among worker lcores)
Ring Sizes – The format is “A, B, C, D”, defining the number of descriptors or elements in each ring:
◦ A = Size (descriptors) of the NIC RX rings used by I/O RX lcores
◦ B = Size (elements) of the software rings from I/O RX to worker lcores
◦ C = Size (elements) of the software rings from worker lcores to I/O TX lcores
◦ D = Size (descriptors) of the NIC TX rings used by I/O TX lcores
The default is “1024, 1024, 1024, 1024”, which is optimal for the Intel ixgbe driver; other NICs or drivers might need different values
Burst Sizes – The format is “(A, B), (C, D), (E, F)”, describing how many packets are processed in each read/write operation:
◦ A = I/O RX lcore read burst from NIC RX
◦ B = I/O RX lcore write burst to software rings
◦ C = Worker lcore read burst from software rings
◦ D = Worker lcore write burst to software rings
◦ E = I/O TX lcore read burst from software rings
◦ F = I/O TX lcore write burst to the NIC TX
The default is “(144,144),(144,144),(144,144)” when Forwarding Mode is Disabled, and “(8,8),(8,8),(8,8)” otherwise. A burst size of 8 means the software processes at least 8 packets simultaneously. For very low packet rates, this can introduce considerable latency

42.4. DPDK Configuration Example

The following configuration assumes a 14-core Xeon CPU with the layout obtained from usertools/cpu_layout.py (in the DPDK source):

Core 1 [0, 14]
Core 2 [1, 15]
Core 3 [2, 16]
Core 4 [3, 17]
Core 5 [4, 18]
Core 6 [5, 19]
Core 7 [6, 20]
Core 8 [7, 21]
Core 9 [8, 22]
Core 10 [9, 23]
Core 11 [10, 24]
Core 12 [11, 25]
Core 13 [12, 26]
Core 14 [13, 27]

These core mappings guide how lcores are assigned to I/O RX, Distributor, Worker, I/O TX threads, etc. Be sure to align your actual CPU layout with the configuration.
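
The listing above can be reproduced on the target machine by running the script from the DPDK source tree:

[root@localhost ~]# ~/dpdk-24.11/usertools/cpu_layout.py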

[Screenshot: DPDK Capture Engine configuration used in this example]

EAL Options

  • -l 1-27 → DPDK uses lcores 1–27, leaving lcore 0 for the OS (the 14-core CPU with Hyper-threading exposes 28 lcores in total)

  • -n 4 → Uses 4 memory channels, matching the reference 14-core Broadwell CPU’s capability

  • --log-level=user*.info --syslog → Logs DPDK engine activity to syslog with user-level info messages
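
Put together, the EAL Options field for this example would contain something along these lines (illustrative; the exact value is the one shown in the screenshot above):

-l 1-27 -n 4 --log-level=user*.info --syslog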

RX parameters

  • Configures the application to listen on interfaces 0 and 1 (DPDK-enabled), on two NIC queues (0 and 1)

  • Uses lcores 15 and 16 (hyper-threads of CPU cores 1 and 2) for these RX tasks

Distributor Mode

  • Ensures packets are forwarded in the same order they are received

Dataplane Firewall

  • Runs on three CPU cores: 4, 5, 6 (with hyper-threads 18, 19, and 20)

Worker Cores

  • Seven CPU cores: 7 through 13 (with hyper-threads 21 through 27) handle packet analysis and attack detection

Master Lcore

  • Lcore 14, the hyper-thread sibling of lcore 0, which is left to the OS

  • DPDK’s “master” or “manager” lcore for orchestrating tasks

TX parameters

  • Configured to use a single CPU core for TX

  • Lcore 3 transmits over port 0, while lcore 17 (hyper-thread of CPU core 3) transmits over port 1

Forwarding Table

  • In this example, the DPDK Engine acts as an L3 pseudo-router for an out-of-line topology via BGP

  • Incoming packets on port 0 go to port 1, and vice versa

Interface IPs

  • DPDK Engine responds to ARP on port 0 with IP 192.168.1.1, and on port 1 with IP 192.168.0.1

  • If a router wants to route traffic to 192.168.1.1 or 192.168.0.1, DPDK Engine replies with the MAC of the correct interface

  • Once the router learns these MACs, it can start forwarding packets to the appropriate interface

Destination MACs

  • For port 0, packets are forwarded to a gateway with MAC 00:07:52:12:25:f1

  • For port 1, packets are forwarded to a gateway with MAC 68:05:ca:23:5c:40

  • Since DPDK Engine doesn’t provide a full IP stack or ARP table, the gateway MACs must be configured manually

  • Because there’s no true IP stack, the DPDK Engine behaves more like a forwarding bridge at Layer 3, without dynamic ARP resolution
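
To summarize the example, the configuration fields described above would hold values along the following lines. This is an illustrative reconstruction based solely on the description in this section; the authoritative values, including the exact RX queue-to-lcore pairing and the Distributor/Worker lcore lists, are the ones shown in the configuration screenshot:

RX Parameters:      (0,0,15)(0,1,16)(1,0,15)(1,1,16)
Distributor Lcores: 4,5,6,18,19,20
Worker Lcores:      7,8,9,10,11,12,13,21,22,23,24,25,26,27
Master Lcore:       14
TX Parameters:      (0,3)(1,17)
Forwarding Table:   (0,1)(1,0)
Interface IPs:      (0,192.168.1.1)(1,192.168.0.1)
Destination MACs:   (0,00:07:52:12:25:f1)(1,68:05:ca:23:5c:40)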

Note

The distribution of lcores can be optimized by observing the performance-related statistics from Reports » Devices » Overview » Dataplane.