Single Root I/O Virtualization (SR-IOV) is an I/O acceleration technique for virtualized environments. This post is a setup guide for using SR-IOV CNI on Kubernetes.
Comparison with DPDK
See DPDK vs SR-IOV for NFV? – Why a wrong decision can impact performance!
Usage
Please refer to Intel - SR-IOV Configuration Guide and GitHub - intel/sriov-network-device-plugin# workflow
SR-IOV Configuration
Enable I/O Memory Management Unit (IOMMU) support
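Whether the running kernel was actually booted with `intel_iommu=on` can be read from `/proc/cmdline`, which is useful both before editing Grub and after the reboot. The following is a minimal sketch, assuming a POSIX shell; the function name `has_kernel_param` and its optional file argument are hypothetical helpers, not part of any tool mentioned in this post.

```shell
#!/bin/sh
# has_kernel_param PARAM [CMDLINE_FILE]
# Succeeds if PARAM appears as a whole token in the kernel command line.
# CMDLINE_FILE defaults to /proc/cmdline; the override exists only so the
# function can be tried against a sample file (an assumption of this sketch).
has_kernel_param() {
    param=$1
    cmdline_file=${2:-/proc/cmdline}
    # Put each whitespace-separated token on its own line, then look for
    # an exact, fixed-string, whole-line match.
    tr -s '[:space:]' '\n' < "$cmdline_file" | grep -qxF -- "$param"
}
```

For example, `has_kernel_param intel_iommu=on && echo "flag present"` prints the message only when the flag is on the command line of the booted kernel.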
# Enable IOMMU support for the Linux kernel
# Update Grub `GRUB_CMDLINE_LINUX` with `intel_iommu=on`
$ cat /etc/default/grub
...
GRUB_CMDLINE_LINUX="intel_iommu=on"
$ sudo update-grub
$ sudo reboot
# Check if IOMMU is correctly enabled
$ dmesg | grep IOMMU
[ 0.000000] DMAR: IOMMU enabled
Load device’s kernel module
For the Intel® Ethernet Server Adapter I350-T2, the VF driver is included in the Ubuntu Xenial distribution, so one can simply load it.
# Check which driver is used by the adapter
$ ethtool -i enp1s0f0
driver: igb
version: 5.4.0-k
firmware-version: 1.67, 0x80000d6a, 15.0.27
expansion-rom-version:
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
# Check available drivers, see https://unix.stackexchange.com/a/184880
$ find /lib/modules/$(uname -r) -type f -name '*.ko' | grep igb
/lib/modules/4.15.0-76-generic/kernel/drivers/net/ethernet/intel/igb/igb.ko
/lib/modules/4.15.0-76-generic/kernel/drivers/net/ethernet/intel/igbvf/igbvf.ko
# Load the kernel modules: PF driver (igb) and VF driver (igbvf)
$ sudo modprobe igb
# It is not necessary to load the VF driver,
# since it is loaded automatically when VFs are created later
$ sudo modprobe igbvf
If you want to use the latest driver, download, compile, and then load it.
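After loading or replacing a driver, the `ethtool -i` check has a sysfs equivalent: the `driver` symlink under the interface's device directory points at the bound driver. This is a minimal sketch; `bound_driver` and its optional `SYSFS_ROOT` argument are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# bound_driver IFACE [SYSFS_ROOT]
# Prints the name of the kernel driver bound to a network interface by
# resolving the `driver` symlink in its sysfs device directory.
# SYSFS_ROOT defaults to /sys (the override is an assumption of this sketch,
# useful for trying the function on a mock directory tree).
bound_driver() {
    iface=$1
    sysfs_root=${2:-/sys}
    link="$sysfs_root/class/net/$iface/device/driver"
    # Purely virtual interfaces (lo, bridges) have no device/driver link.
    [ -L "$link" ] || { echo "no driver bound to $iface" >&2; return 1; }
    basename "$(readlink "$link")"
}
```

On the host used in this post, `bound_driver enp1s0f0` would be expected to print the same driver name that `ethtool -i` reports.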
Create VFs
# Check maximum number of VFs supported by the adapter
$ cat /sys/class/net/enp1s0f0/device/sriov_totalvfs
7
# Create VFs by writing the desired count to `sriov_numvfs` (requires root)
$ echo 4 > /sys/class/net/enp1s0f0/device/sriov_numvfs
# From a non-root shell, use `tee` so that the write itself runs under sudo:
# echo 4 | sudo tee /sys/class/net/enp1s0f0/device/sriov_numvfs
To ensure the VFs are created each time the server is power-cycled (not verified)
# Append the VF creation command to the `rc.local` file
$ cat /etc/rc.local
echo 4 > /sys/class/net/<device_name>/device/sriov_numvfs
Confirm VFs are created
# Confirm VFs are created
# Each PCI device is identified by a unique slot name ([domain:]bus:device.function)
# See http://manpages.ubuntu.com/manpages/trusty/man8/lspci.8.html for more information
$ lspci | grep Ethernet
01:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
01:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
# The four devices below are created VFs
02:10.0 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
02:10.4 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
02:11.0 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
02:11.4 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
# Check VF MAC address assignment
$ ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq state UP mode DEFAULT group default qlen 1000
link/ether 4c:ed:fb:cb:bc:28 brd ff:ff:ff:ff:ff:ff
3: enp1s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether a0:36:9f:39:be:32 brd ff:ff:ff:ff:ff:ff
vf 0 MAC 66:6e:2d:05:55:12, spoof checking on, link-state auto
vf 1 MAC 6a:8b:4a:98:64:7e, spoof checking on, link-state auto
vf 2 MAC 46:6b:c9:b5:dc:a8, spoof checking on, link-state auto
vf 3 MAC 96:12:1f:38:8b:78, spoof checking on, link-state auto
4: enp1s0f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
link/ether a0:36:9f:39:be:33 brd ff:ff:ff:ff:ff:ff
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
link/ether 02:42:8b:85:39:32 brd ff:ff:ff:ff:ff:ff
6: enp2s16: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq state UP mode DEFAULT group default qlen 1000
link/ether 66:6e:2d:05:55:12 brd ff:ff:ff:ff:ff:ff
7: enp2s16f4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq state UP mode DEFAULT group default qlen 1000
link/ether 6a:8b:4a:98:64:7e brd ff:ff:ff:ff:ff:ff
8: enp2s17: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq state UP mode DEFAULT group default qlen 1000
link/ether 46:6b:c9:b5:dc:a8 brd ff:ff:ff:ff:ff:ff
9: enp2s17f4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq state UP mode DEFAULT group default qlen 1000
link/ether 96:12:1f:38:8b:78 brd ff:ff:ff:ff:ff:ff
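The mapping from VF index to PCI address seen in the `lspci` and `ip link` output can also be read from sysfs, where each VF appears as a `virtfnN` symlink under the PF's device directory. This is a minimal sketch; `list_vfs` and its optional `SYSFS_ROOT` argument are hypothetical names used only for illustration.

```shell
#!/bin/sh
# list_vfs PF_IFACE [SYSFS_ROOT]
# Prints one line per VF in the form "<index> <PCI address>", taken from
# the virtfnN symlinks in the PF's sysfs device directory.
# SYSFS_ROOT defaults to /sys (the override is an assumption of this sketch).
list_vfs() {
    pf=$1
    sysfs_root=${2:-/sys}
    dev_dir="$sysfs_root/class/net/$pf/device"
    i=0
    # Walk virtfn0, virtfn1, ... until the next index does not exist.
    while [ -L "$dev_dir/virtfn$i" ]; do
        printf '%s %s\n' "$i" "$(basename "$(readlink "$dev_dir/virtfn$i")")"
        i=$((i + 1))
    done
}
```

Running `list_vfs enp1s0f0` on a PF with four VFs prints four lines; a PF with no VFs prints nothing.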
Additional Information
# List network hardware
# Display PCI device in verbose format using slot name
# A list of all known PCI IDs (vendors, devices, classes and subclasses)
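For the last item, the vendor and device IDs that appear in the ConfigMap selectors later in this post (`8086`, `1521`) can be resolved to names with a small lookup over a `pci.ids` file. This is a sketch under stated assumptions: the function name `pci_id_lookup` is hypothetical, the default path `/usr/share/misc/pci.ids` is where Debian and Ubuntu usually install the file and may differ elsewhere, and IDs must be given in lowercase hex as they appear in the file.

```shell
#!/bin/sh
# pci_id_lookup VENDOR_ID DEVICE_ID [PCI_IDS_FILE]
# Prints "<vendor name>: <device name>" for the given IDs, or fails with a
# non-zero status if the pair is not listed.
# File layout relied on: vendor lines start at column 1, device lines start
# with one tab, subsystem lines start with two tabs.
pci_id_lookup() {
    vendor=$1
    device=$2
    file=${3:-/usr/share/misc/pci.ids}   # assumed default location
    awk -v v="$vendor" -v d="$device" '
        /^#/ || /^$/ { next }
        # Unindented line: a vendor (or class) entry starts a new section.
        substr($0, 1, 1) != "\t" {
            in_vendor = (($1 "") == (v "")); vname = substr($0, 7); next
        }
        # Single-tab line inside the wanted vendor: a device entry.
        # IDs are compared as strings so hex such as 0e11 is not read as a number.
        in_vendor && substr($0, 2, 1) != "\t" && (($1 "") == (d "")) {
            print vname ": " substr($0, 8); found = 1; exit
        }
        END { exit !found }
    ' "$file"
}
```

With a standard `pci.ids` installed, `pci_id_lookup 8086 1521` is a quick way to check which device a selector value refers to before putting it into a resource pool.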
Work with Kubernetes
Please refer to GitHub - intel/sriov-network-device-plugin#Quick Start
Build SR-IOV CNI
$ git clone https://github.com/intel/sriov-cni.git
$ cd sriov-cni
$ make
$ cp build/sriov /opt/cni/bin
Build and run SR-IOV network device plugin
$ git clone https://github.com/intel/sriov-network-device-plugin.git
$ cd sriov-network-device-plugin
$ make
$ make image
Create a ConfigMap that defines the SR-IOV resource pool configuration
# `vendors` and `devices` in `selectors` have not been verified yet
# See https://github.com/intel/sriov-network-device-plugin#configurations
$ cat deployments/configMap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: sriovdp-config
  namespace: kube-system
data:
  config.json: |
    {
        "resourceList": [{
                "resourceName": "intel_sriov",
                "selectors": {
                    "vendors": ["8086"],
                    "devices": ["1521"],
                    "pfName": ["enp1s0f0"]
                }
            }
        ]
    }
$ kubectl create -f deployments/configMap.yaml
Deploy SRIOV network device plugin Daemonset
$ kubectl create -f deployments/k8s-v1.16/sriovdp-daemonset.yaml
Check the allocatable resource for the node
$ kubectl get node <node> -o json | jq '.status.allocatable'
{
  "cpu": "6",
  "ephemeral-storage": "424282646236",
  "hugepages-1Gi": "0",
  "hugepages-2Mi": "0",
  "intel.com/intel_sriov": "5",
  "memory": "32676196Ki",
  "pods": "110"
}
Install one compatible CNI meta plugin, which is Multus here
Create the SRIOV Network CRD
$ cat deployments/sriov-crd.yaml
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: sriov-net1
  annotations:
    k8s.v1.cni.cncf.io/resourceName: intel.com/intel_sriov
spec:
  config: '{
    "type": "sriov",
    "cniVersion": "0.3.1",
    "name": "sriov-network",
    "ipam": {
      "type": "host-local",
      "subnet": "192.168.3.0/24"
    },
    "args": {
      "cni": {
        "ips": ["192.168.3.10"]
      }
    }
    }'
$ kubectl create -f deployments/sriov-crd.yaml
Create deployment/pod using the SR-IOV interface
$ cat deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-deployment
  labels:
    app: test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: test
  template:
    metadata:
      labels:
        app: test
      annotations:
        k8s.v1.cni.cncf.io/networks: sriov-net1
    spec:
      containers:
      - name: test
        image: ubuntu:16.04
        imagePullPolicy: IfNotPresent
        command: ["/bin/bash"]
        stdin: true
        tty: true
        resources:
          requests:
            intel.com/intel_sriov: '1'
          limits:
            intel.com/intel_sriov: '1'
$ kubectl apply -f deployment.yaml
Troubleshooting
Cannot Configure VFs
Problem
$ echo 4 > /sys/class/net/enp2s0f1/device/sriov_numvfs
Solution
# Add `pci=assign-busses` to Grub
Also see
- Bug 1223376 - not enough MMIO resources for SR-IOV
- SRIOV fails with “SR-IOV: bus number out of range”
- 创建vf报错,问题求助 (Error when creating VFs, request for help)
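Before changing kernel parameters, it may be worth ruling out two simpler causes of a failed write to `sriov_numvfs`: a requested count above `sriov_totalvfs`, and a PF that already has a different non-zero number of VFs, in which case the kernel expects the count to be reset to 0 first. This is a minimal sketch that checks both before writing; `set_numvfs` and its optional `SYSFS_ROOT` argument are hypothetical names, and the function does not address the bus-number or MMIO problems discussed in the links above.

```shell
#!/bin/sh
# set_numvfs PF_IFACE COUNT [SYSFS_ROOT]
# Validates COUNT against sriov_totalvfs, resets an existing non-zero VF
# count to 0 if it differs, then writes COUNT to sriov_numvfs.
# SYSFS_ROOT defaults to /sys (the override is an assumption of this sketch).
set_numvfs() {
    pf=$1
    count=$2
    sysfs_root=${3:-/sys}
    dev_dir="$sysfs_root/class/net/$pf/device"

    [ -r "$dev_dir/sriov_totalvfs" ] || {
        echo "$pf: sriov_totalvfs not found (device does not expose SR-IOV?)" >&2
        return 1
    }
    total=$(cat "$dev_dir/sriov_totalvfs")
    if [ "$count" -gt "$total" ]; then
        echo "$pf: requested $count VFs, adapter supports at most $total" >&2
        return 1
    fi

    current=$(cat "$dev_dir/sriov_numvfs")
    # Nothing to do if the requested count is already in effect.
    [ "$current" -eq "$count" ] && return 0
    # A non-zero count cannot be changed directly; disable the VFs first.
    if [ "$current" -ne 0 ]; then
        echo 0 > "$dev_dir/sriov_numvfs" || return 1
    fi
    echo "$count" > "$dev_dir/sriov_numvfs"
}
```

On a real host the function has to run as root, for example from a root shell or from the `rc.local` script mentioned earlier, because the sysfs file is writable only by root.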