ID KCS009163
How to set up and test Jumbo Frames on Linux (SLES)
Written Nov 17, 2017

 
  

Product Categories
Operating System

 Objective
How to set up and test Jumbo Frames on Linux (SLES)

Environment/Conditions/Configuration
Linux (SLES)


Procedure
Note: Use of MTU 9198 is now obsolete. MTU 9000 is recommended for Jumbo frames.
 
First make sure ALL routers and switches in your network path are configured for jumbo frames. If even one device in the path is not configured for jumbo frames, packets will either be dropped or be limited to the smallest MTU on the path. There is no benefit to configuring a node for jumbo frames unless ALL devices on the network path are configured for them.
 
1) Setting up Linux (SLES) Jumbo Frames:
 
(i) Edit the /etc/sysconfig/network/ifcfg-ethX file (replace ethX with your interface: eth0, eth1, etc.) to add the MTU value.
 
From:
# To override the default MTU change the following line.
MTU=''
 
To:
# To override the default MTU change the following line.
MTU='9000'
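
The edit in step (i) can also be scripted. The following is a minimal sketch, assuming the file already follows the MTU='...' layout shown above; set_ifcfg_mtu is a hypothetical helper name, not a SLES tool.

```shell
# Hypothetical helper: set the MTU= line in an ifcfg file.
#   $1 = path to the ifcfg file, $2 = MTU value (digits only, e.g. 9000)
set_ifcfg_mtu() {
    cfg=$1
    mtu=$2
    # Refuse empty or non-numeric values so the file is never corrupted.
    case $mtu in
        ''|*[!0-9]*) echo "invalid MTU: $mtu" >&2; return 1 ;;
    esac
    if grep -q '^MTU=' "$cfg"; then
        # Rewrite the existing MTU= line through a temporary file.
        tmp=$(mktemp) || return 1
        sed "s/^MTU=.*/MTU='$mtu'/" "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
        rm -f "$tmp"
    else
        # No MTU= line yet: append one.
        printf "MTU='%s'\n" "$mtu" >> "$cfg"
    fi
}

# Example (as root):
#   set_ifcfg_mtu /etc/sysconfig/network/ifcfg-eth4 9000
```

Writing back with cat rather than mv keeps the original file's ownership and permissions.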
 
(ii) To make the change effective without bouncing the interface, run the command below:
 
# ifconfig ethX mtu 9000
 
 
MTU='' means do not change the MTU value in the NIC when the interface comes up. Therefore, if you ever want to revert the MTU to 1500 without rebooting, you must explicitly specify 1500 for the MTU value.
MTU for virtual interfaces (bonding and vlan) should be set according to the instructions in the template files ifcfg-bond0, ifcfg-slave and ifcfg-vlan which are under /etc/gsctools.
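
After step (ii) it is worth confirming that the kernel actually applied the new value. A minimal sketch, assuming the standard Linux sysfs attribute /sys/class/net/<interface>/mtu; the function names and the SYSFS_NET variable are illustrative only.

```shell
# Directory holding per-interface attributes (overridable, mainly for testing).
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

# Print the MTU the kernel currently uses for interface $1.
current_mtu() {
    cat "$SYSFS_NET/$1/mtu"
}

# Succeed only if interface $1 currently has MTU $2.
mtu_is() {
    [ "$(current_mtu "$1")" = "$2" ]
}

# Example:
#   mtu_is eth4 9000 && echo "eth4 is at MTU 9000"
#
# On systems without ifconfig, the iproute2 equivalent of step (ii) is:
#   ip link set dev ethX mtu 9000
```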

2) Test whether Jumbo Frames are passing through the network:
 
The following Linux example uses an interface MTU of 9000 and walks through the different scenarios you may encounter.
 
A ping packet is an ICMP packet carried inside an IP packet. The ICMP header is 8 bytes and the IP header is 20 bytes. Therefore we run ping with a size of MTU - 28. In this case 9000 - 28 = 8972. When you ping with 8972 bytes you will see 8980 bytes reported because the 8-byte ICMP header is included.
192.168.50.58 is the Linux TPA node. 192.168.50.57 is a Windows BAR server.
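
The MTU - 28 arithmetic can be captured in a one-line helper. This is only a sketch for IPv4 without IP options; ping_payload_for_mtu is a hypothetical name.

```shell
# Largest ICMP payload (the value passed to "ping -s") that fits in a single
# unfragmented IPv4 packet for the MTU given in $1.
ping_payload_for_mtu() {
    # 20-byte IPv4 header (no options) + 8-byte ICMP header = 28 bytes
    echo $(( $1 - 20 - 8 ))
}

ping_payload_for_mtu 9000    # prints 8972
ping_payload_for_mtu 1500    # prints 1472
```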
 
Scenario 1:
 
This is the ideal scenario: it shows that the MTU along the entire network path is 9000.
 
Ping response with no fragmentation. ping is run in one window and tcpdump in another window. You can see there is no fragmentation because each ICMP request and reply occupy one line of output. That is, the lines are either "ICMP echo request" or "ICMP echo reply". There is no line "icmp".
 
# tcpdump -i eth4 host 192.168.50.57
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth4, link-type EN10MB (Ethernet), capture size 96 bytes
18:44:08.898126 IP 192.168.50.58 > 192.168.50.57: ICMP echo request, id 14640, seq 6, length 8980
18:44:08.898572 IP 192.168.50.57 > 192.168.50.58: ICMP echo reply, id 14640, seq 6, length 8980
18:44:09.898155 IP 192.168.50.58 > 192.168.50.57: ICMP echo request, id 14640, seq 7, length 8980
18:44:09.898713 IP 192.168.50.57 > 192.168.50.58: ICMP echo reply, id 14640, seq 7, length 8980
 
# ping -s 8972 192.168.50.57
PING 192.168.50.57 (192.168.50.57) 8972(9000) bytes of data.
8980 bytes from 192.168.50.57: icmp_seq=1 ttl=128 time=0.516 ms
8980 bytes from 192.168.50.57: icmp_seq=2 ttl=128 time=0.345 ms
 
Scenario 2:
 
Ping response with fragmentation. ping is run in one window and tcpdump in another window. You can see there is fragmentation in the reply because there are lines containing "icmp" following the "ICMP echo reply". Any "icmp" line indicates a fragment of the preceding packet. There may be more than one. For best performance you want to eliminate fragmentation. See the section below "Solution to Scenario 2 and Scenario 3".
 
# tcpdump -i eth4 host 192.168.50.57
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth4, link-type EN10MB (Ethernet), capture size 96 bytes
18:59:09.729118 IP 192.168.50.58 > 192.168.50.57: ICMP echo request, id 59698, seq 1, length 8980
18:59:09.730768 IP 192.168.50.57 > 192.168.50.58: ICMP echo reply, id 59698, seq 1, length 1496
18:59:09.730780 IP 192.168.50.57 > 192.168.50.58: icmp
18:59:09.730780 IP 192.168.50.57 > 192.168.50.58: icmp
18:59:09.730780 IP 192.168.50.57 > 192.168.50.58: icmp
18:59:09.730780 IP 192.168.50.57 > 192.168.50.58: icmp
18:59:10.730129 IP 192.168.50.58 > 192.168.50.57: ICMP echo request, id 59698, seq 2, length 8980
18:59:10.730651 IP 192.168.50.57 > 192.168.50.58: ICMP echo reply, id 59698, seq 2, length 1496
18:59:10.730662 IP 192.168.50.57 > 192.168.50.58: icmp
18:59:10.730662 IP 192.168.50.57 > 192.168.50.58: icmp
18:59:10.730662 IP 192.168.50.57 > 192.168.50.58: icmp
18:59:10.730662 IP 192.168.50.57 > 192.168.50.58: icmp
 
# ping -s 8972 192.168.50.57
PING 192.168.50.57 (192.168.50.57) 8972(9000) bytes of data.
8980 bytes from 192.168.50.57: icmp_seq=1 ttl=128 time=0.516 ms
8980 bytes from 192.168.50.57: icmp_seq=2 ttl=128 time=0.345 ms
 
Scenario 3:
 
No ping response (no ICMP echo reply). This means something between the Linux node and BAR server has a smaller MTU and the packet is getting dropped. It may be an intermediate router or switch or the BAR server. See the section below "Solution to Scenario 2 and Scenario 3".
 
# tcpdump -i eth4 host 192.168.50.57
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth4, link-type EN10MB (Ethernet), capture size 96 bytes
18:54:11.618467 IP 192.168.50.58 > 192.168.50.57: ICMP echo request, id 63793, seq 1, length 8980
18:54:12.617808 IP 192.168.50.58 > 192.168.50.57: ICMP echo request, id 63793, seq 2, length 8980
18:54:13.619377 IP 192.168.50.58 > 192.168.50.57: ICMP echo request, id 63793, seq 3, length 8980
18:54:14.620097 IP 192.168.50.58 > 192.168.50.57: ICMP echo request, id 63793, seq 4, length 8980
 
# ping -s 8972 192.168.50.57
PING 192.168.50.57 (192.168.50.57) 8972(9000) bytes of data.
From 192.168.50.57: icmp_seq=1 Destination Host Unreachable
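
The three scenarios can be told apart mechanically from saved tcpdump text. The following is a minimal sketch, assuming the default (non-verbose) output format shown in the captures above, where a fragment appears as a line ending in ": icmp"; other tcpdump versions may label fragments differently. classify_capture is a hypothetical helper name, not part of tcpdump.

```shell
# Read tcpdump text output on stdin and print one of:
#   ok          replies seen, no fragments      (Scenario 1)
#   fragmented  at least one fragment line seen (Scenario 2)
#   no-reply    requests seen but no replies    (Scenario 3)
#   no-icmp     no ICMP echo traffic in the input
classify_capture() {
    awk '
        /ICMP echo request/ { req++ }
        /ICMP echo reply/   { rep++ }
        /: icmp[ \t]*$/     { frag++ }
        END {
            if (frag > 0)     print "fragmented"
            else if (rep > 0) print "ok"
            else if (req > 0) print "no-reply"
            else              print "no-icmp"
        }'
}

# Example, with the capture saved to a file first:
#   tcpdump -i eth4 host 192.168.50.57 > capture.txt   (stop with Ctrl-C)
#   classify_capture < capture.txt
```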

Solution to Scenario 2 and Scenario 3:
  • If the switches and BAR server are managed by Teradata then set the MTU of the BAR interfaces on the TPA nodes, BAR servers and switch ports to 9000.
  • If the switches and BAR server are not managed by Teradata then discover the lowest MTU along the network path.
  • ping -s 1472 should always work because that would be the test for an MTU of 1500.
  • If the network MTU is 1500 then you will need to inform the customer that Jumbo Frames will not work because of the MTU of 1500 on their end. If they can enable Jumbo Frames on their end then change the BAR interfaces on the TPA nodes to MTU=9000.
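
One way to discover the lowest MTU along the path is to ping with fragmentation prohibited and search for the largest size that still gets a reply. The following is a sketch only, assuming the iputils ping found on Linux (its -M do option sets the Don't Fragment bit) and a target that answers ICMP echo; probe_df and find_path_mtu are hypothetical helper names.

```shell
# Succeed if host $1 answers one ping carrying $2 payload bytes with the
# Don't Fragment bit set (iputils ping: -M do).
probe_df() {
    ping -M do -c 1 -W 1 -s "$2" "$1" >/dev/null 2>&1
}

# Binary-search the largest MTU between $2 (default 1500) and $3 (default 9000)
# that reaches host $1 unfragmented, and print it.
find_path_mtu() {
    host=$1
    low=${2:-1500}
    high=${3:-9000}
    # The lower bound must work, otherwise the search is meaningless.
    if ! probe_df "$host" $((low - 28)); then
        echo "no reply from $host even at MTU $low" >&2
        return 1
    fi
    while [ "$low" -lt "$high" ]; do
        mid=$(( (low + high + 1) / 2 ))
        if probe_df "$host" $((mid - 28)); then
            low=$mid            # mid fits: the answer is mid or larger
        else
            high=$((mid - 1))   # mid does not fit: the answer is smaller
        fi
    done
    echo "$low"
}

# Example:
#   find_path_mtu 192.168.50.57
```

A result of 1500 corresponds to the case where ping -s 1472 works but larger sizes do not, meaning jumbo frames are not enabled somewhere on the path.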

Special Considerations
Use of MTU 9198 is now obsolete. This was the historic reason:

Linux BAR documentation states that the MTU on BAR interfaces should be 9198 and that the ports on the Ethernet switch should be set to an MTU of 9216. The only exception is that BAR interfaces connected to an EMC Data Domain device should be set to 9014. The maximum jumbo frame size on Cisco switches is 9198. This value includes the 802.1q tag or ISL VLAN tag, but does not include the Ethernet header and CRC trailer. Thus, the maximum Ethernet frame size, including the Ethernet header/trailer, is 9198 + 18 = 9216 bytes.

Additional Information