Implementation Framework and Application of ethtool in Linux

Source: Internet
Author: User
Tags network function ibm developerworks

A notable feature of Linux is its powerful networking. Linux supports almost all network protocols and provides a wide range of applications built on them, so network management matters a great deal on Linux. That management depends on network tools such as ifconfig, route, ip, and ethtool. Of these, ethtool provides powerful management of NICs and NIC drivers. Its implementation framework is closely tied to the network driver and the network hardware and is easy to modify and extend, which lets Linux network developers and administrators set, view, and debug NIC hardware, drivers, and the network protocol stack.

To see how Linux NIC drivers support and implement ethtool, we start with a typical Ethernet controller.

The network adapter works at the lowest two layers of the OSI model: the physical layer and the data link layer. The physical layer defines the electrical and optical signals, line states, clock references, data encoding, and circuitry required to transmit and receive data, and provides a standard interface to data link layer devices. The physical layer chip is called the PHY. The data link layer provides the addressing mechanism, frame construction, error checking, transmission control, and a standard data interface to the network layer. The data link layer chip in an Ethernet card is called the MAC controller. In many NICs the two are integrated. They are connected as follows: the PCI bus connects to the MAC, the MAC connects to the PHY, and the PHY connects to the network cable (not directly, of course; there is a transformer in between).

The basic structure of a typical Ethernet controller is shown in Figure 1:

Figure 1. A typical Ethernet controller that complies with the IEEE 802.3 standard

In the data link layer, MAC is short for Media Access Control, that is, the media access control sublayer protocol. It sits in the lower half of the data link layer of the seven-layer OSI model and is mainly responsible for controlling access to the physical medium of the physical layer. When sending data, the MAC protocol first determines whether data can be sent. If it can, the MAC adds control information to the data and passes the data and control information to the physical layer in the specified format. When receiving data, the MAC protocol first checks the incoming information for transmission errors. If there are none, it removes the control information and passes the data to the LLC layer. The Ethernet MAC is defined by the IEEE 802.3 Ethernet standard.

The physical layer PHY is the physical interface transceiver that implements the physical layer. It comprises the MII/GMII (media independent interface) sublayer, the PCS (physical coding sublayer), the PMA (physical medium attachment) sublayer, the PMD (physical medium dependent) sublayer, and the MDI (medium dependent interface).

MII stands for Media Independent Interface. "Media independent" means that any type of PHY device can be used without redesigning or replacing the MAC hardware. The transmitter and the receiver use two independent channels, each with its own data, clock, and control signals. The MII data interface requires 16 signals in total: TX_ER, TXD<3:0>, TX_EN, TX_CLK, COL, RXD<3:0>, RX_ER, RX_CLK, CRS, and RX_DV.

RMII (Reduced Media Independent Interface) is a simplified MII. It uses half as many data lines as MII for sending and receiving (two bits in each direction instead of four), so it generally requires a 50 MHz reference clock. RMII is typically used in multi-port switches. Instead of separate transmit and receive clocks for each port, one clock is shared by all ports for both directions, which reduces the pin count. One RMII port requires seven signal lines, roughly half as many as MII, so a switch can connect more ports. Like MII, RMII supports 10 Mbps and 100 Mbps.

GMII (Gigabit MII) is the MII for Gigabit Ethernet. It has a corresponding reduced variant, RGMII (Reduced GMII). GMII uses an 8-bit data interface with a 125 MHz clock, so the transfer rate reaches 1000 Mbps. It is also backward compatible with the 10/100 Mbps modes of MII.
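
The relationship between data width, clock, and raw data rate for these interfaces is simple arithmetic. The sketch below is illustrative only: the function name interface_rate_mbps is hypothetical and does not come from any driver, and the figures used in the note after it (4 data bits at 25 MHz for 100 Mbps MII, 2 data bits at 50 MHz for RMII, 8 data bits at 125 MHz for GMII) are the usual values for these interfaces.

```c
#include <stdint.h>

/*
 * Hypothetical helper: raw data rate, in Mbps, of a parallel MAC-PHY
 * interface that moves data_bits bits per clock cycle in one direction.
 * The clock is given in kHz so that 2.5 MHz can be expressed exactly.
 */
uint32_t interface_rate_mbps(uint32_t data_bits, uint32_t clock_khz)
{
    return data_bits * clock_khz / 1000u;
}
```

With these assumptions, interface_rate_mbps(4, 25000) and interface_rate_mbps(2, 50000) both give 100, and interface_rate_mbps(8, 125000) gives 1000.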

The MII management interface is a two-signal interface: the clock signal MDC and the data signal MDIO. Through it, the upper layer can monitor and control the PHY's registers, some of which are defined by IEEE. The PHY reflects its current state in these registers, and the MAC repeatedly reads the PHY's status register through the management interface to learn the current PHY state, such as link speed and duplex capability. The PHY's registers can also be written through the management interface for control purposes, for example to enable or disable flow control, or to select auto-negotiation or forced mode. This is also the working principle behind ethtool.

MDIO/MDC is the serial communication bus of the PHY management interface, defined in several clauses of the IEEE 802.3 Ethernet standard. MDIO is a simple two-wire serial interface that connects a management device (such as a MAC controller or microprocessor) to a management-capable transceiver (such as a multi-port Gigabit Ethernet transceiver or a 10GbE XAUI transceiver) in order to control the transceiver and collect status information from it. The information that can be collected or controlled includes link status, transmission speed and its selection, power-down and low-power sleep states, TX/RX mode selection, auto-negotiation control, and loopback mode control. Beyond the features required by IEEE, transceiver vendors can add further functions.

MDC is the management data clock input. IEEE 802.3 Clause 22 specifies a minimum MDC period of 400 ns (2.5 MHz), although some devices accept a faster clock, up to 8.3 MHz. MDIO is the bidirectional management data line, and its data is synchronized to the MDC clock. The MDIO workflow is as follows:

  • When idle, with no data being transferred, the MDIO data line is in a high-impedance state.
  • A 2-bit start code (01) appears on MDIO and a read or write operation begins.
  • A 2-bit operation code follows, identifying a read (10) or a write (01).
  • A 5-bit PHY address follows.
  • A 5-bit PHY register address follows.
  • Two clock cycles of turnaround time follow.
  • The 16-bit register data is then read or written serially.
  • MDIO returns to the idle state and the line goes back to high impedance.

Note: The above content is excerpted from the Internet.
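
The MDIO workflow listed above can be sketched as a function that packs the fields into one 32-bit word, with the most significant bit being the first bit on the wire. This is a minimal illustration, not code from any driver: the name mdio_c22_frame and the MDIO_* macros are hypothetical. For a read, only the first 14 bits are driven by the management device, so the turnaround and data bits are left as zero in this sketch.

```c
#include <stdint.h>

/* Field encodings taken from the workflow above. Names are hypothetical. */
#define MDIO_ST_C22   0x1u  /* start code: 01b */
#define MDIO_OP_WRITE 0x1u  /* write: 01b */
#define MDIO_OP_READ  0x2u  /* read: 10b */
#define MDIO_TA_WRITE 0x2u  /* turnaround driven by the master on a write: 10b */

/*
 * Pack start code, operation code, PHY address, register address,
 * turnaround, and data into one word (bit 31 is sent first).
 * On a read, the PHY supplies the turnaround and data bits, so they
 * are left as zero here.
 */
uint32_t mdio_c22_frame(unsigned op, unsigned phy, unsigned reg, uint16_t data)
{
    uint32_t frame = 0;

    frame |= (uint32_t)MDIO_ST_C22 << 30;       /* 2-bit start code   */
    frame |= (uint32_t)(op & 0x3u) << 28;       /* 2-bit operation    */
    frame |= (uint32_t)(phy & 0x1fu) << 23;     /* 5-bit PHY address  */
    frame |= (uint32_t)(reg & 0x1fu) << 18;     /* 5-bit register     */
    if (op == MDIO_OP_WRITE) {
        frame |= (uint32_t)MDIO_TA_WRITE << 16; /* 2-bit turnaround   */
        frame |= data;                          /* 16-bit register data */
    }
    return frame;
}
```

For example, mdio_c22_frame(MDIO_OP_WRITE, 1, 0, 0x1200) yields 0x50821200: a write of 0x1200 to register 0 of the PHY at address 1.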

Support for ethtool in Linux Device Drivers

Currently, almost all NIC drivers support ethtool. As Figure 2 shows, the ethtool framework consists of a kernel-space part and a user-space part. User space is responsible for sending ethtool commands to the kernel and receiving the results. Kernel space, based on the command word, reads and writes the MII registers through MDIO/MDC to manage the NIC, and returns the result to user space. Because the Linux network driver stack is large and complex, this article covers only the definition of the MII registers in the driver, the support for MDIO/MDC, and how ethtool is implemented in the driver.

Figure 2. Implementation Framework of ethtool in Linux

MII registers stipulated by IEEE 802.3

The PHY registers of the MII/GMII interface are defined in section 22.2.4, Management functions, of IEEE 802.3, as shown in Tables 22-6 and 22-7 (Figures 3 and 4 in this article are taken from http://standards.ieee.org/getieee802/download/802.3-2008_section2.pdf).

Figure 3. MII management register set defined in 802.3

The registers are divided into a basic set and an extended set. The basic set differs between MII and GMII: for MII it comprises register 0 (control register) and register 1 (status register), while for GMII it comprises registers 0, 1, and 15. Control register 0 and status register 1 are defined as follows:

Figure 4. Control register (register 0) and status register (register 1) defined in IEEE 802.3

You can manage the NIC by reading and writing registers 0 and 1. Listing 1 shows some of the management registers and the bit definitions of the control and status registers.

Listing 1. kernel/include/linux/mii.h: PHY management register definitions
    #define MII_BMCR            0x00        /* Basic mode control register */
    #define MII_BMSR            0x01        /* Basic mode status register  */
    #define MII_PHYSID1         0x02        /* PHYS ID 1                   */
    #define MII_PHYSID2         0x03        /* PHYS ID 2                   */
    #define MII_ADVERTISE       0x04        /* Advertisement control reg   */
    #define MII_LPA             0x05        /* Link partner ability reg    */
    #define MII_EXPANSION       0x06        /* Expansion register          */
    #define MII_CTRL1000        0x09        /* 1000BASE-T control          */
    ...
    /* Basic mode control register. */
    #define BMCR_RESV               0x003f  /* Unused...                   */
    #define BMCR_SPEED1000          0x0040  /* MSB of Speed (1000)         */
    #define BMCR_CTST               0x0080  /* Collision test              */
    #define BMCR_FULLDPLX           0x0100  /* Full duplex                 */
    #define BMCR_ANRESTART          0x0200  /* Auto negotiation restart    */
    #define BMCR_ISOLATE            0x0400  /* Disconnect DP83840 from MII */
    #define BMCR_PDOWN              0x0800  /* Powerdown the DP83840       */
    #define BMCR_ANENABLE           0x1000  /* Enable auto negotiation     */
    #define BMCR_SPEED100           0x2000  /* Select 100Mbps              */
    #define BMCR_LOOPBACK           0x4000  /* TXD loopback bits           */
    #define BMCR_RESET              0x8000  /* Reset the DP83840           */
    /* Basic mode status register. */
    #define BMSR_ERCAP              0x0001  /* Ext-reg capability          */
    #define BMSR_JCD                0x0002  /* Jabber detected             */
    #define BMSR_LSTATUS            0x0004  /* Link status                 */
    #define BMSR_ANEGCAPABLE        0x0008  /* Able to do auto-negotiation */
    #define BMSR_RFAULT             0x0010  /* Remote fault detected       */
    #define BMSR_ANEGCOMPLETE       0x0020  /* Auto-negotiation complete   */
    #define BMSR_RESV               0x00c0  /* Unused...                   */
    #define BMSR_ESTATEN            0x0100  /* Extended Status in R15      */
    #define BMSR_100FULL2           0x0200  /* Can do 100BASE-T2 HDX       */
    #define BMSR_100HALF2           0x0400  /* Can do 100BASE-T2 FDX       */
    #define BMSR_10HALF             0x0800  /* Can do 10mbps, half-duplex  */
    #define BMSR_10FULL             0x1000  /* Can do 10mbps, full-duplex  */
    #define BMSR_100HALF            0x2000  /* Can do 100mbps, half-duplex */
    #define BMSR_100FULL            0x4000  /* Can do 100mbps, full-duplex */
    #define BMSR_100BASE4           0x8000  /* Can do 100mbps, 4k packets  */
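
To show how these bit masks are used, the following sketch decodes a raw value of the status register (register 1) into a few flags. It is a user-space illustration under stated assumptions: the three BMSR_* values are copied from Listing 1, while struct bmsr_summary and decode_bmsr are hypothetical names that do not exist in the kernel.

```c
#include <stdint.h>

/* Bit masks copied from Listing 1. */
#define BMSR_LSTATUS       0x0004  /* Link status                 */
#define BMSR_ANEGCAPABLE   0x0008  /* Able to do auto-negotiation */
#define BMSR_ANEGCOMPLETE  0x0020  /* Auto-negotiation complete   */

/* Hypothetical summary of the status bits of interest. */
struct bmsr_summary {
    int link_up;
    int autoneg_capable;
    int autoneg_complete;
};

/* Test each mask against the raw 16-bit register value. */
struct bmsr_summary decode_bmsr(uint16_t bmsr)
{
    struct bmsr_summary s;

    s.link_up          = (bmsr & BMSR_LSTATUS) ? 1 : 0;
    s.autoneg_capable  = (bmsr & BMSR_ANEGCAPABLE) ? 1 : 0;
    s.autoneg_complete = (bmsr & BMSR_ANEGCOMPLETE) ? 1 : 0;
    return s;
}
```

For a register value of 0x782d, all three flags come out as 1; for 0x7809, only autoneg_capable is set.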

How to read and write MII registers through MDC/MDIO

Earlier sections of this article described the MDC/MDIO workflow. The MDIO read and write functions, mdio_read and mdio_write, are implemented in each NIC's driver file; that is, each driver supplies the concrete implementation behind these function pointers (used, for example, in Listing 4), and all of them follow the MDIO frame format of IEEE 802.3. The typical frame format is the one defined in Clause 22:

Figure 5. The MDIO frame format defined in Clause 22

Field     Length     Description
ST        2 bits     Start of frame, 01b
OP        2 bits     Operation code: 01b for a write, 10b for a read
PHYADR    5 bits     PHY address
REGADR    5 bits     Register address
TA        2 bits     Turnaround field: Z0b for a read, 10b for a write
DATA      16 bits    Register data

Implement ethtool in the driver

kernel/include/linux/ethtool.h defines struct ethtool_ops. All members of this structure are function pointers, and together they define the operations that ethtool can perform. The structure has many members, so the code is not listed here. struct net_device in turn contains an ethtool_ops member, as shown in Listing 2.

Listing 2. kernel/include/linux/netdevice.h: the ethtool_ops member in net_device
    struct net_device
    {
        ...
        const struct ethtool_ops *ethtool_ops;
        ...
    };
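
The function-pointer pattern behind ethtool_ops can be sketched in user space as follows. Everything here is a simplified stand-in: struct mock_dev, struct mock_ethtool_ops, and the mock_* functions are hypothetical names and not kernel APIs. The point is only that a device carries a pointer to a table of operations, and a caller checks whether an operation is provided before invoking it.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct mock_dev;

/* Simplified stand-in for struct ethtool_ops: a table of function pointers. */
struct mock_ethtool_ops {
    uint32_t (*get_link)(struct mock_dev *dev);
    int (*nway_reset)(struct mock_dev *dev);
};

/* Simplified stand-in for struct net_device. */
struct mock_dev {
    const struct mock_ethtool_ops *ethtool_ops;
    int carrier;    /* pretend hardware state */
};

/* The "driver" implements only the operations it supports. */
uint32_t mock_get_link(struct mock_dev *dev)
{
    return dev->carrier ? 1 : 0;
}

const struct mock_ethtool_ops mock_ops = {
    .get_link = mock_get_link,
    /* .nway_reset is left NULL: this mock driver does not support it */
};

/* Generic code calls through the table and reports unsupported operations. */
int mock_query_link(struct mock_dev *dev, uint32_t *link)
{
    if (!dev->ethtool_ops || !dev->ethtool_ops->get_link)
        return -EOPNOTSUPP;
    *link = dev->ethtool_ops->get_link(dev);
    return 0;
}

int mock_restart_autoneg(struct mock_dev *dev)
{
    if (!dev->ethtool_ops || !dev->ethtool_ops->nway_reset)
        return -EOPNOTSUPP;
    return dev->ethtool_ops->nway_reset(dev);
}
```

A driver that leaves a member unset simply does not offer that operation, which is why the generic code checks each pointer before calling it.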

To support ethtool, a NIC driver must initialize ethtool_ops and implement the functions it defines. Take dm9000.c as an example.

Listing 3. kernel/drivers/net/dm9000.c: ethtool support in the DM9000 driver
    static const struct ethtool_ops dm9000_ethtool_ops = {
        .get_drvinfo    = dm9000_get_drvinfo,
        .get_settings   = dm9000_get_settings,
        .set_settings   = dm9000_set_settings,
        .get_msglevel   = dm9000_get_msglevel,
        .set_msglevel   = dm9000_set_msglevel,
        .nway_reset     = dm9000_nway_reset,
        .get_link       = dm9000_get_link,
        .get_eeprom_len = dm9000_get_eeprom_len,
        .get_eeprom     = dm9000_get_eeprom,
        .set_eeprom     = dm9000_set_eeprom,
        .get_rx_csum    = dm9000_get_rx_csum,
        .set_rx_csum    = dm9000_set_rx_csum,
        .get_tx_csum    = ethtool_op_get_tx_csum,
        .set_tx_csum    = dm9000_set_tx_csum,
    };
    ...
    ndev->ethtool_ops = &dm9000_ethtool_ops;
    ...

Each function in Listing 3 is implemented in the DM9000 driver. For example, the current link status can be obtained through dm9000_get_link, whose implementation is shown in Listing 4:

Listing 4. dm9000_get_link
    static u32 dm9000_get_link(struct net_device *dev)
    {
        board_info_t *dm = to_dm9000_board(dev);
        u32 ret;

        if (dm->flags & DM9000_PLATF_EXT_PHY)
            ret = mii_link_ok(&dm->mii);
        else
            ret = dm9000_read_locked(dm, DM9000_NSR) & NSR_LINKST ? 1 : 0;

        return ret;
    }

    /* kernel/drivers/net/mii.c */
    int mii_link_ok (struct mii_if_info *mii)
    {
        /* first, a dummy read, needed to latch some MII phys */
        mii->mdio_read(mii->dev, mii->phy_id, MII_BMSR);
        if (mii->mdio_read(mii->dev, mii->phy_id, MII_BMSR) & BMSR_LSTATUS)
            return 1;
        return 0;
    }

We can see that, in the external PHY case, the result ultimately comes from reading the PHY register through MDIO/MDC.
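
The dummy read in mii_link_ok can be illustrated with a small user-space model. The sketch assumes a PHY whose link status bit is latching-low, meaning that after a link drop the bit stays 0 until the status register has been read once. struct mock_phy and the mock_* functions are hypothetical and only mimic the two-read pattern of Listing 4; they are not kernel code.

```c
#include <stdint.h>

#define MII_BMSR      0x01    /* from Listing 1 */
#define BMSR_LSTATUS  0x0004  /* from Listing 1 */

/* Hypothetical model of a PHY with a latching-low link status bit. */
struct mock_phy {
    int link_now;      /* current physical link state              */
    int latched_down;  /* link dropped since the last BMSR read    */
};

/* Simulated register read: the first read after a drop reports "down"
 * and clears the latch; later reads report the current state. */
int mock_mdio_read(struct mock_phy *phy, int reg)
{
    uint16_t val = 0;

    if (reg != MII_BMSR)
        return 0;
    if (phy->link_now && !phy->latched_down)
        val |= BMSR_LSTATUS;
    phy->latched_down = 0;
    return val;
}

/* Same shape as mii_link_ok in Listing 4: dummy read, then real read. */
int mock_link_ok(struct mock_phy *phy)
{
    mock_mdio_read(phy, MII_BMSR);  /* discard the latched value */
    return (mock_mdio_read(phy, MII_BMSR) & BMSR_LSTATUS) ? 1 : 0;
}
```

In this model, a link that dropped briefly and came back reads as down on the first access and as up on the second, which is why the first value is discarded.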

Besides NIC management commands, ethtool offers other, extended functions. The ethtool framework lends itself to extension, and developers can add functions beyond NIC management within it. In fact, ethtool already provides some of these, for example updating the NIC firmware and controlling the log level of the network driver. Such functions are very helpful for debugging and troubleshooting.

Listing 5. Some ethtool extensions: updating firmware and changing the log level
    ethtool -f|--flash DEVNAME FILENAME
    ethtool -s|--change DEVNAME msglvl %d

Use ethtool to configure and manage NICs

The previous section covered the foundations and implementation of ethtool. This section describes how to use ethtool, focusing on configuring and managing network interfaces.

The best way to learn ethtool usage is to read its help information with "ethtool -h" or "man ethtool". Because the help text is long, it is not reproduced here; some practical examples follow instead.

Example 1. Use ethtool to view information about the NIC interface eth4.

Listing 6. Viewing NIC interface information
    root@IMMV2-DEV4:~# ethtool eth4
    Settings for eth4:
            Supported ports: [ TP ]
            Supported link modes:   10baseT/Half 10baseT/Full
                                    100baseT/Half 100baseT/Full
                                    1000baseT/Full
            Supports auto-negotiation: Yes
            Advertised link modes:  10baseT/Half 10baseT/Full
                                    100baseT/Half 100baseT/Full
                                    1000baseT/Full
            Advertised auto-negotiation: Yes
            Speed: 100Mb/s
            Duplex: Full
            Port: Twisted Pair
            PHYAD: 1
            Transceiver: internal
            Auto-negotiation: on
            Supports Wake-on: g
            Wake-on: g
            Link detected: yes

Example 2. Disable NIC auto-negotiation and view the result.

Listing 7. Disabling NIC auto-negotiation and viewing the result
    root@IMMV2-DEV4:~# ethtool -s eth4 autoneg off
    root@IMMV2-DEV4:~# ethtool eth4
    Settings for eth4:
            Supported ports: [ TP ]
            Supported link modes:   10baseT/Half 10baseT/Full
                                    100baseT/Half 100baseT/Full
                                    1000baseT/Full
            Supports auto-negotiation: Yes
            Advertised link modes:  Not reported
            Advertised auto-negotiation: No
            Speed: 100Mb/s
            Duplex: Full
            Port: Twisted Pair
            PHYAD: 1
            Transceiver: internal
            Auto-negotiation: off
            Supports Wake-on: g
            Wake-on: g
            Link detected: yes

Example 3. Disable NIC auto-negotiation and change the NIC speed to 10 Mb/s.

Listing 8. Disabling NIC auto-negotiation and changing the NIC speed to 10 Mb/s
    root@IMMV2-DEV4:~# ethtool -s eth4 autoneg off speed 10
    root@IMMV2-DEV4:~# ethtool eth4
    Settings for eth4:
            Supported ports: [ TP ]
            Supported link modes:   10baseT/Half 10baseT/Full
                                    100baseT/Half 100baseT/Full
                                    1000baseT/Full
            Supports auto-negotiation: Yes
            Advertised link modes:  Not reported
            Advertised auto-negotiation: No
            Speed: 10Mb/s
            Duplex: Full
            Port: Twisted Pair
            PHYAD: 1
            Transceiver: internal
            Auto-negotiation: off
            Supports Wake-on: g
            Wake-on: g
            Link detected: yes
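
At the register level, forcing speed and duplex with auto-negotiation off amounts to a read-modify-write of the control register (register 0). The sketch below shows one plausible way to compute the new value using the BMCR_* masks from Listing 1. It is an assumption-laden illustration: bmcr_force_mode is a hypothetical name, and the source does not show how the driver behind eth4 actually handles the request.

```c
#include <stdint.h>

/* Bit masks copied from Listing 1. */
#define BMCR_SPEED1000  0x0040  /* MSB of Speed (1000)     */
#define BMCR_FULLDPLX   0x0100  /* Full duplex             */
#define BMCR_ANENABLE   0x1000  /* Enable auto negotiation */
#define BMCR_SPEED100   0x2000  /* Select 100Mbps          */

/*
 * Hypothetical helper: given the current control register value, return
 * the value that disables auto-negotiation and forces the requested
 * speed and duplex. All unrelated bits are preserved.
 */
uint16_t bmcr_force_mode(uint16_t bmcr, int speed_mbps, int full_duplex)
{
    /* Clear auto-negotiation, both speed-select bits, and duplex. */
    bmcr &= (uint16_t)~(BMCR_ANENABLE | BMCR_SPEED100 |
                        BMCR_SPEED1000 | BMCR_FULLDPLX);

    if (speed_mbps == 1000)
        bmcr |= BMCR_SPEED1000;
    else if (speed_mbps == 100)
        bmcr |= BMCR_SPEED100;
    /* 10 Mbps: both speed bits stay cleared */

    if (full_duplex)
        bmcr |= BMCR_FULLDPLX;

    return bmcr;
}
```

Starting from a register value of 0x1000 (auto-negotiation enabled), bmcr_force_mode(0x1000, 10, 1) gives 0x0100: auto-negotiation off, 10 Mbps, full duplex, matching the settings reported in Listing 8.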

Other functions of ethtool can be used by following the syntax given in the help information.

Extending ethtool

ethtool can be extended to support special functions of particular NICs. A typical extension is adding SideBand support to ethtool. For an introduction to SideBand, refer to the IBM developerWorks article on NCSI and its implementation on Linux. As Figure 6 shows, SideBand functions such as select_channel, enable_channel, and disable_channel are implemented by adding custom commands and the corresponding handler functions. Taking select_channel as an example, the steps are as follows.

Figure 6. Extending ethtool with sideband management functions

  1. Add the command word ETHTOOL_SELCHANNEL to both the user-space and the kernel-space parts of ethtool;
  2. Add the handler ethtool_select_channel corresponding to ETHTOOL_SELCHANNEL to ethtool_ops;
  3. Implement the ethtool_select_channel() call in the dev_ethtool function. This function uses the packet-sending interface of the protocol stack to send the assembled NCSI command packet to the NIC's MAC layer and to receive the corresponding response.

ethtool_enable_channel() and ethtool_disable_channel() can be added in the same way. The ethtool framework is thus highly extensible, which helps developers customize it to their needs.
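
The three steps above can be sketched in user space as a command dispatcher that gains one new case. This is a mock under explicit assumptions: the command value SB_ETHTOOL_SELCHANNEL, struct sb_dev, struct sb_ops, and the sb_* functions are all hypothetical, and the handler merely records the selected channel where a real implementation would send an NCSI command packet and wait for the response.

```c
#include <errno.h>
#include <stdint.h>

/* Step 1: a new command word (hypothetical value, shared by both sides). */
#define SB_ETHTOOL_SELCHANNEL 0x00001001u

/* Mock device state; a real driver would talk to the NIC instead. */
struct sb_dev {
    int selected_channel;
    int channel_count;
};

/* Step 2: a new handler slot in the operations table. */
struct sb_ops {
    int (*select_channel)(struct sb_dev *dev, uint32_t channel);
};

/* Mock handler: validates and records the channel. A real handler would
 * build and send the NCSI command packet here. */
int sb_select_channel(struct sb_dev *dev, uint32_t channel)
{
    if (channel >= (uint32_t)dev->channel_count)
        return -EINVAL;
    dev->selected_channel = (int)channel;
    return 0;
}

/* Step 3: the dispatcher routes the new command word to the handler. */
int sb_dev_ethtool(struct sb_dev *dev, const struct sb_ops *ops,
                   uint32_t cmd, uint32_t arg)
{
    switch (cmd) {
    case SB_ETHTOOL_SELCHANNEL:
        if (!ops->select_channel)
            return -EOPNOTSUPP;
        return ops->select_channel(dev, arg);
    default:
        return -EOPNOTSUPP;
    }
}
```

Commands for enabling and disabling a channel would add further case labels and handler slots in the same way.
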
Summary

ethtool is a powerful network management tool in Linux. This article first introduced the principles and implementation of the tool, covering in detail the MII management registers and the MDIO/MDC interface defined in Clause 22 of IEEE 802.3, as well as ethtool support in Linux network drivers. It then showed through examples how to use the tool to manage network adapters. Finally, it presented an example of extending the ethtool framework with SideBand management, which developers can use as a reference.
