
Low DPDK performance


Hi

 

I'm trying to measure DPDK throughput on physical ports (using 64-byte UDP packets), but the result seems very low. My test setup is shown below:


+──────────────────────────────+
│ 82599ES 10-Gigabit SFI/SFP+  │
+──────[p0]────────[p1]────────+
         ^           ^
         │           │
         v           v
+──────[p0]────────[p1]────────+
│ NAPATECH Adapter - 2 port    │
+──────────────────────────────+
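
(Presumably the two 82599 ports are bound to a UIO driver for DPDK before testpmd is run; the binding can be checked roughly as follows, using the dpdk_nic_bind.py script shipped in the same DPDK tree.)

$ ./dpdk/tools/dpdk_nic_bind.py --status    # lists which NICs use a DPDK-compatible driver and which are still on kernel drivers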


I isolated the cores (the master core and the hyper-threaded logical cores) from the Linux kernel scheduler according to the output of dpdk/tools/cpu_layout.py, iommu is set to pt, intel_iommu is off, and 1G hugepages are allocated on both NUMA nodes as follows:

 

$ cat /sys/devices/system/node/node*/meminfo | fgrep Huge

Node 0 AnonHugePages:    137216 kB

Node 0 HugePages_Total:     3

Node 0 HugePages_Free:      2

Node 0 HugePages_Surp:      0

Node 1 AnonHugePages:    227328 kB

Node 1 HugePages_Total:     2

Node 1 HugePages_Free:      1

Node 1 HugePages_Surp:      0

 

$ ./cpu_layout.py

cores =  [0, 1, 2, 8, 9, 10]

sockets =  [0, 1]

 

        Socket 0        Socket 1       

        --------        --------       

Core 0  [0, 12]         [6, 18]        

Core 1  [1, 13]         [7, 19]        

Core 2  [2, 14]         [8, 20]        

Core 8  [3, 15]         [9, 21]        

Core 9  [4, 16]         [10, 22]       

Core 10 [5, 17]         [11, 23]


$ cat /etc/default/grub

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash iommu=pt intel_iommu=off default_hugepagesz=1G hugepagesz=1G hugepages=5 isolcpus=1,2,3,4,5,7,8,9,10,11,13,14,15,16,17,19,20,21,22,23"
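
(For reference, this is roughly how the hugepage mount and the NUMA node of the 82599 ports can be double-checked; the PCI addresses are the ones from the EAL log below, and the exact hugetlbfs mount point is system-specific.)

$ mount | grep hugetlbfs                              # confirm a hugetlbfs mount exists
$ cat /sys/devices/system/node/node*/hugepages/hugepages-1048576kB/free_hugepages   # free 1G pages per NUMA node
$ cat /sys/bus/pci/devices/0000:60:00.0/numa_node     # NUMA node of the NIC port (-1 means the kernel/BIOS did not report it)
$ cat /sys/bus/pci/devices/0000:60:00.1/numa_node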

 

After that, I started forwarding (fwd) with the testpmd application as below:


$ sudo ./testpmd  -c 0xFBEFBE -n2 -v -m 1024M -- --burst=512 -i --rxq 1 --txq 1 --rxd 64 --txd 64
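
(For comparison, a variant of the same invocation that reserves hugepage memory per NUMA socket and sets the forwarding core count on the command line might look like the sketch below; the descriptor and burst values are only illustrative, not a verified configuration.)

$ sudo ./testpmd -c 0xFBEFBE -n 2 --socket-mem 1024,1024 -- -i --burst=32 --rxq=1 --txq=1 --rxd=512 --txd=512 --nb-cores=2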


testpmd> set nbcore 10

Number of forwarding cores set to 10


testpmd> show config fwd

io packet forwarding - ports=2 - cores=2 - streams=2 - NUMA support disabled, MP over anonymous pages disabled

Logical Core 2 (socket 0) forwards packets on 1 streams:

  RX P=0/Q=0 (socket 0) -> TX P=1/Q=0 (socket 0) peer=02:00:00:00:00:01

Logical Core 3 (socket 0) forwards packets on 1 streams:

  RX P=1/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00


testpmd> start

  io packet forwarding - CRC stripping disabled - packets/burst=512

  nb forwarding cores=10 - nb forwarding ports=2

  RX queues=1 - RX desc=64 - RX free threshold=32

  RX threshold registers: pthresh=8 hthresh=8 wthresh=0

  TX queues=1 - TX desc=64 - TX free threshold=32

  TX threshold registers: pthresh=32 hthresh=0 wthresh=0

  TX RS bit threshold=32 - TXQ flags=0xf01


testpmd> clear port stats all

  NIC statistics for port 0 cleared

  NIC statistics for port 1 cleared


testpmd> show port stats all

 

  ######################## NIC statistics for port 0  ########################

  RX-packets: 716315071  RX-missed: 0          RX-bytes:  42978906308

  RX-errors: 5229181298

  RX-nombuf:  0       

  TX-packets: 0          TX-errors: 0          TX-bytes:  0

  ############################################################################

 

  ######################## NIC statistics for port 1  ########################

  RX-packets: 0          RX-missed: 0          RX-bytes:  0

  RX-errors: 0

  RX-nombuf:  0       

  TX-packets: 704317861  TX-errors: 0          TX-bytes:  42259067720

  ############################################################################

 

According to the NAPATECH counters, the DPDK forwarding rate is ~4200 Mbps (~6 Mpps), which is very low compared to Intel's benchmark documents; line rate for 64-byte frames on a 10 GbE link is about 14.88 Mpps. Also, the RX-errors counter is increasing rapidly. Moreover, EAL reports "PCI device 0000:60:00.1 on NUMA socket -1" in the testpmd output, shown in full below.
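
(The 14.88 Mpps figure is just the standard line-rate calculation for 64-byte frames, counting the extra 20 bytes of preamble, start-of-frame delimiter and inter-frame gap per frame on the wire:)

$ echo $(( 10**10 / ((64 + 20) * 8) ))   # 10 Gbit/s divided by 672 bits per frame -> 14880952 packets/s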


Do you have any idea where the problem originates? What would you suggest, and how should I proceed to track down the performance problem?


Thanks in advance


- Volkan


argela@argela-HP-Z800-Workstation:~/ovs_dpdk/dpdk/app/test-pmd/build/app$ sudo ./testpmd  -c 0xFBEFBE -n2 -v -m 1024M -- --burst=512 -i --rxq 1 --txq 1 --rxd 64 --txd 64

[sudo] password for argela:

EAL: Detected lcore 0 as core 0 on socket 0

EAL: Detected lcore 1 as core 1 on socket 0

EAL: Detected lcore 2 as core 2 on socket 0

EAL: Detected lcore 3 as core 8 on socket 0

EAL: Detected lcore 4 as core 9 on socket 0

EAL: Detected lcore 5 as core 10 on socket 0

EAL: Detected lcore 6 as core 0 on socket 1

EAL: Detected lcore 7 as core 1 on socket 1

EAL: Detected lcore 8 as core 2 on socket 1

EAL: Detected lcore 9 as core 8 on socket 1

EAL: Detected lcore 10 as core 9 on socket 1

EAL: Detected lcore 11 as core 10 on socket 1

EAL: Detected lcore 12 as core 0 on socket 0

EAL: Detected lcore 13 as core 1 on socket 0

EAL: Detected lcore 14 as core 2 on socket 0

EAL: Detected lcore 15 as core 8 on socket 0

EAL: Detected lcore 16 as core 9 on socket 0

EAL: Detected lcore 17 as core 10 on socket 0

EAL: Detected lcore 18 as core 0 on socket 1

EAL: Detected lcore 19 as core 1 on socket 1

EAL: Detected lcore 20 as core 2 on socket 1

EAL: Detected lcore 21 as core 8 on socket 1

EAL: Detected lcore 22 as core 9 on socket 1

EAL: Detected lcore 23 as core 10 on socket 1

EAL: Support maximum 128 logical core(s) by configuration.

EAL: Detected 24 lcore(s)

EAL: RTE Version: 'RTE 2.2.0-rc2'

EAL: Setting up physically contiguous memory...

EAL: Ask a virtual area of 0xc0000000 bytes

EAL: Virtual area found at 0x7f4700000000 (size = 0xc0000000)

EAL: Ask a virtual area of 0x80000000 bytes

EAL: Virtual area found at 0x7f4640000000 (size = 0x80000000)

EAL: Requesting 1 pages of size 1024MB from socket 0

EAL: Requesting 1 pages of size 1024MB from socket 1

EAL: TSC frequency is ~2664050 KHz

EAL: Master lcore 1 is ready (tid=c504e900;cpuset=[1])

EAL: lcore 13 is ready (tid=be947700;cpuset=[13])

EAL: lcore 14 is ready (tid=be146700;cpuset=[14])

EAL: lcore 8 is ready (tid=c094b700;cpuset=[8])

EAL: lcore 17 is ready (tid=bc943700;cpuset=[17])

EAL: lcore 2 is ready (tid=c3150700;cpuset=[2])

EAL: lcore 3 is ready (tid=c294f700;cpuset=[3])

EAL: lcore 4 is ready (tid=c214e700;cpuset=[4])

EAL: lcore 11 is ready (tid=bf148700;cpuset=[11])

EAL: lcore 16 is ready (tid=bd144700;cpuset=[16])

EAL: lcore 22 is ready (tid=ba93f700;cpuset=[22])

EAL: lcore 21 is ready (tid=bb140700;cpuset=[21])

EAL: lcore 23 is ready (tid=ba13e700;cpuset=[23])

EAL: lcore 19 is ready (tid=bc142700;cpuset=[19])

EAL: lcore 15 is ready (tid=bd945700;cpuset=[15])

EAL: lcore 10 is ready (tid=bf949700;cpuset=[10])

EAL: lcore 9 is ready (tid=c014a700;cpuset=[9])

EAL: lcore 5 is ready (tid=c194d700;cpuset=[5])

EAL: lcore 7 is ready (tid=c114c700;cpuset=[7])

EAL: lcore 20 is ready (tid=bb941700;cpuset=[20])

EAL: PCI device 0000:60:00.0 on NUMA socket -1

EAL:   probe driver: 8086:10fb rte_ixgbe_pmd

EAL:   PCI memory mapped at 0x7f4740000000

EAL: Trying to map BAR 4 that contains the MSI-X table. Trying offsets: 0x40000000000:0x0000, 0x1000:0x3000

EAL:   PCI memory mapped at 0x7f4740081000

PMD: eth_ixgbe_dev_init(): MAC: 2, PHY: 12, SFP+: 3

PMD: eth_ixgbe_dev_init(): port 0 vendorID=0x8086 deviceID=0x10fb

EAL: PCI device 0000:60:00.1 on NUMA socket -1

EAL:   probe driver: 8086:10fb rte_ixgbe_pmd

EAL:   PCI memory mapped at 0x7f4740084000

EAL: Trying to map BAR 4 that contains the MSI-X table. Trying offsets: 0x40000000000:0x0000, 0x1000:0x3000

EAL:   PCI memory mapped at 0x7f4740105000

PMD: eth_ixgbe_dev_init(): MAC: 2, PHY: 15, SFP+: 6

PMD: eth_ixgbe_dev_init(): port 1 vendorID=0x8086 deviceID=0x10fb

Interactive-mode selected

Configuring Port 0 (socket 0)

PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x7f471533cd40 hw_ring=0x7f471533d180 dma_addr=0x1d533d180

PMD: ixgbe_set_tx_function(): Using simple tx code path

PMD: ixgbe_set_tx_function(): Vector tx enabled.

PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x7f471532c640 sw_sc_ring=0x7f471532c300 hw_ring=0x7f471532c980 dma_addr=0x1d532c980

PMD: ixgbe_set_rx_function(): Vector rx enabled, please make sure RX burst size no less than 4 (port=0).

Port 0: 00:1B:21:65:42:FC

Configuring Port 1 (socket 0)

PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x7f471531bc40 hw_ring=0x7f471531c080 dma_addr=0x1d531c080

PMD: ixgbe_set_tx_function(): Using simple tx code path

PMD: ixgbe_set_tx_function(): Vector tx enabled.

PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x7f471530b540 sw_sc_ring=0x7f471530b200 hw_ring=0x7f471530b880 dma_addr=0x1d530b880

PMD: ixgbe_set_rx_function(): Vector rx enabled, please make sure RX burst size no less than 4 (port=1).

Port 1: 00:1B:21:65:42:FD

Checking link statuses...

Port 0 Link Up - speed 10000 Mbps - full-duplex

Port 1 Link Up - speed 10000 Mbps - full-duplex

Done

