WO2017071667A1 - Packet forwarding - Google Patents

Packet forwarding

Info

Publication number
WO2017071667A1
Authority
WO
WIPO (PCT)
Prior art keywords
pci
board
packet
ethernet
interface board
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2016/103943
Other languages
English (en)
French (fr)
Inventor
赵志宇
慕长林
左彦峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Technologies Co Ltd
Original Assignee
New H3C Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by New H3C Technologies Co Ltd
Priority to EP16859102.2A (patent EP3370377B1)
Priority to US15/771,963 (patent US10430364B2)
Priority to JP2018521989A (patent JP6592599B2)
Publication of WO2017071667A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00 Packet switching elements
    • H04L49/40 Constructional details, e.g. power supply, mechanical construction or backplane
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/40 Bus structure
    • G06F13/4004 Coupling between buses
    • G06F13/4027 Coupling between buses using bus bridges
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/42 Bus transfer protocol, e.g. handshake; Synchronisation
    • G06F13/4282 Bus transfer protocol, e.g. handshake; Synchronisation on a serial bus, e.g. I2C bus, SPI bus
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/74 Address processing for routing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00 Packet switching elements
    • H04L49/35 Switches specially adapted for specific applications
    • H04L49/351 Switches specially adapted for specific applications for local area network [LAN], e.g. Ethernet switches
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2213/00 Indexing scheme relating to interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F2213/0026 PCI express
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00 Packet switching elements
    • H04L49/70 Virtual switches

Definitions

  • the network device can include a forwarding board and at least two interface boards.
  • the forwarding board is a board responsible for network packet forwarding processing.
  • the interface board provides the external physical network interfaces and is responsible for receiving and sending network packets.
  • the forwarding board and each interface board can be interconnected by using a PCI-E (Peripheral Component Interconnect Express) bus.
  • PCI-E Peripheral Component Interconnect Express
  • FIG. 1 is an architectural diagram of a network device in an example;
  • FIG. 2 is a mapping diagram of a PCI-E memory space in the network device shown in FIG. 1;
  • FIGS. 3a and 3b are schematic diagrams of packet forwarding principles based on PCI-E memory space mapping as shown in FIG. 2;
  • FIG. 4 is an extended map of a PCI-E memory space in the network device shown in FIG. 1;
  • FIGS. 5a and 5b are schematic diagrams of packet forwarding principles based on PCI-E memory space mapping as shown in FIG. 4;
  • FIGS. 6a and 6b are schematic diagrams showing the configuration of entries in the network device shown in FIG. 1;
  • FIG. 7 is a schematic diagram of an extended architecture of the network device shown in FIG. 1;
  • FIG. 8 is a map of a PCI-E memory space in the network device shown in FIG. 7;
  • FIG. 9 is a schematic diagram of a packet forwarding principle based on PCI-E memory space mapping as shown in FIG. 8;
  • FIG. 10 is a logical structural diagram of a first logic device in the network device shown in FIG. 1 or FIG. 7;
  • FIG. 11 is a schematic diagram showing the operation of the first logic device shown in FIG. 10;
  • FIG. 12 is a mapping diagram of a PCI-E memory space based on a fragmentation mechanism in the network device shown in FIG. 1;
  • FIGS. 13a and 13b are schematic diagrams of packet forwarding principles based on PCI-E memory space mapping as shown in FIG. 12;
  • FIG. 14 is a logical structural diagram of a first logic device in the network device shown in FIGS. 13a and 13b;
  • FIGS. 15a and 15b are schematic diagrams showing the operation of the first logic device shown in FIG. 14;
  • FIG. 16 is a flow chart of a message forwarding method in an example.
  • the network device 10 includes a forwarding board 20, a first interface board 30, and a second interface board 40.
  • the forwarding board 20 includes a CPU 21, a system memory 22, a system memory controller 23, and a first PCI-E RC (Root Complex) 241 and a second PCI-E RC 242.
  • the CPU 21, the system memory controller 23, and the first PCI-E RC 241 and the second PCI-E RC 242 are interconnected with each other, and the system memory 22 and the system memory controller 23 are connected to each other.
  • the CPU 21 of the forwarding board, the system memory controller 23, and the first PCI-E RC 241 and the second PCI-E RC 242 may be integrated into an SoC (System-on-a-Chip) CPU;
  • alternatively, the system memory controller 23 of the forwarding board may be integrated in the CPU 21, and the first PCI-E RC 241 and the second PCI-E RC 242 of the forwarding board may be integrated into a PCH (Platform Controller Hub) device independent of the CPU 21.
  • PCH Platform Controller Hub
  • the first PCI-E RC 241 and the second PCI-E RC 242 are shown in FIG. 1 as independent of each other, but this does not mean that they are necessarily independent physical entities; that is, the first PCI-E RC 241 and the second PCI-E RC 242 shown in FIG. 1 may be two equivalent PCI-E RCs logically formed by a single entity having a PCI-E RC function through an external PCI-E Switch.
  • switch PCI-E Switch
  • the first interface board 30 includes a first Ethernet switch chip 31, a first logic device 32, a first card memory 33, and a first PCI-E Endpoint 34 connected to the first PCI-E RC 241 through the first PCI-E bus 11.
  • the first Ethernet switch chip 31 and the first logic device 32 are connected to each other (for example, through an Ethernet bus), and the first logic device 32, the first card memory 33, and the first PCI-E Endpoint 34 may be interconnected with each other.
  • the second interface board 40 includes a second Ethernet switch chip 41, a second logic device 42, a second card memory 43, and a second PCI-E Endpoint 44 connected to the second PCI-E RC 242 through the second PCI-E bus 12.
  • the second Ethernet switch chip 41 and the second logic device 42 may be connected to each other (for example, through an Ethernet bus), and the second logic device 42, the second card memory 43, and the second PCI-E Endpoint 44 may be interconnected with each other.
  • the first card memory 33 and the second card memory 43 are both mapped in the PCI-E memory space 60 of the network device 10, and the address section 60b in which the first card memory 33 is mapped in the PCI-E memory space 60 is different from the address section 60a in which the second board memory 43 is mapped in the PCI-E memory space 60.
  • the first card memory 33 and the second card memory 43 are visible to the first PCI-E RC 241 and the second PCI-E RC 242, and to the first PCI-E Endpoint 34 and the second PCI-E Endpoint 44.
  • when the first logic device 32 in the first interface board 30 receives the Ethernet data message 51 from the first Ethernet switch chip 31, the first logic device 32 can determine the destination board of the Ethernet data message 51.
  • the Ethernet data packet 51 may be received by the first Ethernet switch chip 31 from the outside of the network device 10.
  • the first logic device 32 may determine the destination board of the Ethernet data packet 51 according to the mapping table of the Ethernet MAC address and the board identifier that it maintains.
  • if the destination board is the forwarding board 20, the Ethernet data message 51 will be forwarded in the PCI-E format to the system memory 22 in the forwarding board 20, and is then read by the CPU 21 from the system memory 22 for subsequent processing.
  • if the destination board is the second interface board 40, the first logic device 32 may encapsulate the Ethernet data packet 51 as a PCI-E message 61 whose destination address is the PCI-E memory space address of the second card memory 43 (i.e., an address in the address range 60a shown in FIG. 2), so that the first PCI-E Endpoint 34 forwards the PCI-E message 61 to the forwarding board 20, and the forwarding board 20, according to the destination address of the PCI-E message 61, forwards the PCI-E message 61 to the second card memory 43 of the second interface board 40, which is the destination board. ("PCI-E message" in this document may be regarded as short for "PCI-E write message".)
  • the first PCI-E Endpoint 34 sends the PCI-E message 61 to the first PCI-E RC 241, and the first PCI-E RC 241, according to the destination address of the PCI-E message 61, forwards the PCI-E message 61 to the second PCI-E RC 242; the PCI-E message 61 encapsulating the Ethernet data message 51 is forwarded by the second PCI-E RC 242 to the second PCI-E Endpoint 44, and is written by the second PCI-E Endpoint 44 to the second board memory 43.
  • the second logic device 42 can obtain the PCI-E message 61 from the second card memory 43, parse the Ethernet data message 51 out of the PCI-E message 61, and send it to the second Ethernet switch chip 41, which then sends it to the outside of the network device 10.
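The address-based forwarding step described above can be illustrated with a minimal Python sketch. It only models the decision "which root complex owns the address section that contains the destination address". The `Window` type, the `route` function, and all numeric base addresses and sizes are hypothetical; the source only requires that sections 60a and 60b be different.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Window:
    name: str   # label of the address section, e.g. "60a"
    base: int   # first address of the section (hypothetical value)
    size: int   # section length in bytes (hypothetical value)
    port: str   # root complex behind which the mapped memory sits

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size

# Hypothetical layout of the PCI-E memory space 60.
MEMORY_SPACE = [
    Window("60a", 0x1000_0000, 0x0010_0000, "RC 242"),  # second card memory 43
    Window("60b", 0x2000_0000, 0x0010_0000, "RC 241"),  # first card memory 33
]

def route(dest_addr: int) -> str:
    """Return the root complex that should receive a write to dest_addr."""
    for window in MEMORY_SPACE:
        if window.contains(dest_addr):
            return window.port
    raise ValueError(f"address {dest_addr:#x} is not mapped")
```

In this sketch, a write whose destination falls inside section 60a is handed to the second root complex without any CPU involvement, which mirrors the forwarding path of PCI-E message 61.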
  • when the second logic device 42 in the second interface board 40 receives the Ethernet data message 52 from the second Ethernet switch chip 41, the second logic device 42 can determine the destination board of the Ethernet data message 52.
  • the Ethernet data message 52 may be received by the second Ethernet switch chip 41 from the outside of the network device 10.
  • if the destination board is the first interface board 30, the second logic device 42 may encapsulate the Ethernet data packet 52 as a PCI-E message 62 whose destination address is the PCI-E memory space address of the first board memory 33 (i.e., an address in the address range 60b shown in FIG. 2), so that the second PCI-E Endpoint 44 forwards the PCI-E message 62 to the forwarding board 20, and the forwarding board 20, according to the destination address of the PCI-E message 62, forwards the PCI-E message 62 to the first board memory 33 of the first interface board 30, which is the destination board.
  • the second PCI-E Endpoint 44 sends the PCI-E message 62 to the second PCI-E RC 242, and the second PCI-E RC 242, according to the destination address of the PCI-E message 62, forwards the PCI-E message 62 to the first PCI-E RC 241;
  • the PCI-E message 62 encapsulating the Ethernet data message 52 is forwarded by the first PCI-E RC 241 to the first PCI-E Endpoint 34, and is written by the first PCI-E Endpoint 34 to the first board memory 33.
  • the first logic device 32 can obtain the PCI-E message 62 from the first card memory 33, parse the Ethernet data message 52 out of the PCI-E message 62, and send it to the first Ethernet switch chip 31, which can then send it to the outside of the network device 10.
  • the forwarding of the Ethernet data message 51 or 52 between the first interface board 30 and the second interface board 40 can be performed without passing through the CPU 21 of the forwarding board 20, so the forwarding performance of the Ethernet data message 51 or 52 between the first interface board 30 and the second interface board 40 can be improved to some extent.
  • the system memory 22 in the forwarding board 20 is mapped in the PCI-E memory space 60 of the network device 10; that is, the system memory 22 is visible to the first PCI-E RC 241 and the second PCI-E RC 242, and to the first PCI-E Endpoint 34 and the second PCI-E Endpoint 44.
  • the address interval 60c mapped by the system memory 22 in the PCI-E memory space 60 is different from the address interval 60b mapped in the PCI-E memory space 60 by the first card memory 33 and from the address interval 60a mapped in the PCI-E memory space 60 by the second card memory 43.
  • the mapping of the system memory 22 may be that all or part of the memory space of the system memory 22 is mapped in the PCI-E memory space 60.
  • the first logic device 32 in the first interface board 30 may determine the destination board of the Ethernet data packet 53; for example, the first logic device 32 can determine the destination board of the Ethernet data message 53 according to an entry, configured by the CPU 21 of the forwarding board 20, in the mapping table of the Ethernet MAC address and the board identifier.
  • if the destination board is the forwarding board 20, the first logic device 32 may encapsulate the Ethernet data packet 53 as a PCI-E message 63 whose destination address is the PCI-E memory space address of the system memory 22 (i.e., an address in the address range 60c), so that the first PCI-E Endpoint 34 sends the PCI-E message 63 to the first PCI-E RC 241, and the first PCI-E RC 241 can write the PCI-E message 63 into the system memory 22 according to the destination address of the PCI-E message 63.
  • the CPU 21 can then read the PCI-E message 63 from the system memory 22 and parse the Ethernet data message 53 out of the PCI-E message 63.
  • when the second logic device 42 in the second interface board 40 receives the Ethernet data message 54 from the second Ethernet switch chip 41, the second logic device 42 can determine the destination board of the Ethernet data message 54.
  • if the destination board is the forwarding board 20, the second logic device 42 may encapsulate the Ethernet data packet 54 as a PCI-E message 64 whose destination address is the PCI-E memory space address of the system memory 22 (i.e., an address in the address interval 60c), so that the second PCI-E Endpoint 44 sends the PCI-E message 64 to the second PCI-E RC 242, and the second PCI-E RC 242 can write the PCI-E message 64 into the system memory 22 according to the destination address of the PCI-E message 64.
  • the CPU 21 can then read the PCI-E message 64 from the system memory 22 and parse the Ethernet data message 54 out of the PCI-E message 64.
  • the forwarding between the first interface board 30 and the second interface board 40 used in this example may not affect the forwarding between each of the first interface board 30 and the second interface board 40 and the forwarding board 20.
  • mapping tables 71a and 71b of the Ethernet MAC address and the board identifier may be respectively maintained in the first logic device 32 and the second logic device 42, and are used to determine the destination card of the Ethernet data message 51, 52, 53, or 54 from its destination MAC address.
  • the mapping table 72 of the board identifier and the PCI-E memory space address may also be maintained in the first logic device 32 and the second logic device 42, and is used to determine, from the destination card of the Ethernet data packet 51, 52, 53, or 54, the destination address of the PCI-E message 61, 62, 63, or 64 in which that Ethernet data message is encapsulated.
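The two-stage lookup described above (destination MAC address to board identifier, then board identifier to PCI-E memory space address) can be sketched as two dictionaries. This is only an illustration: the board names, the address values, and the function name `pcie_destination` are assumptions, not taken from the source.

```python
# Illustrative contents only: board names and addresses are hypothetical.
MAC_TO_BOARD = {    # mapping table 71a: Ethernet MAC address -> board identifier
    "MAC0": "forwarding-board-20",
    "MAC2": "interface-board-40",
}
BOARD_TO_ADDR = {   # mapping table 72: board identifier -> PCI-E memory space address
    "forwarding-board-20": 0x3000_0000,  # assumed to lie in section 60c
    "interface-board-40": 0x1000_0000,   # assumed to lie in section 60a
}

def pcie_destination(dst_mac):
    """Return the PCI-E destination address for a frame, or None if no board matches."""
    board = MAC_TO_BOARD.get(dst_mac)
    if board is None:
        return None  # destination board query failed
    return BOARD_TO_ADDR[board]
```

Keeping the two tables separate means a learned MAC entry only needs to record a board identifier, while the address layout is delivered once per board.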
  • after the network device 10 is initialized, the mapping tables 71a and 71b of the Ethernet MAC address and the board identifier and the mapping table 72 of the board identifier and the PCI-E memory space address are as shown in FIG. 6a.
  • the entries corresponding to the forwarding board 20 in the mapping tables 71a and 71b of the Ethernet MAC address and the board identifier, and all the entries in the mapping table 72 of the board identifier and the PCI-E memory space address, may be delivered by the CPU 21 using the PCI-E packet 60 when the network device 10 is initialized.
  • "MAC0" shown in FIGS. 6a and 6b represents the MAC address of the forwarding board 20.
  • the PCI-E message 60 used for delivering entries is forwarded by the CPU 21 over the first PCI-E bus 11 and the second PCI-E bus 12; that is, the CPU 21 multiplexes the first PCI-E bus 11 and the second PCI-E bus 12 as management buses for delivering entries. It can be understood that a management bus that is independent of the first PCI-E bus 11 and the second PCI-E bus 12 and dedicated to delivering entries can also be provided in the network device 10, in which case the entries are carried in protocol messages of the independent management bus.
  • the entry corresponding to the second interface board 40 in the mapping table 71a of the Ethernet MAC address and the board identifier shown in FIG. 6b may be learned by the first logic device 32 from the Ethernet data packet 52 encapsulated in the PCI-E message 62 from the second interface board 40, and "MAC2" shown in FIG. 6b represents the source MAC address of the Ethernet data message 52.
  • the entry corresponding to the first interface board 30 in the mapping table 71b of the Ethernet MAC address and the board identifier shown in FIG. 6b can be learned by the second logic device 42 from the Ethernet data packet 51 encapsulated in the PCI-E message 61 from the first interface board 30, and "MAC1" shown in FIG. 6b represents the source MAC address of the Ethernet data packet 51.
  • the first logic device 32 can obtain the PCI-E message 62 from the first board memory 33.
  • the PCI-E message 62 encapsulates the Ethernet data packet 52 received by the second logic device 42 from the second Ethernet switch chip 41 and carries the board identifier of the second interface board 40.
  • the first logic device 32 can create the entry corresponding to the second interface board 40 in the mapping table 71a of the Ethernet MAC address and the board identifier based on the source MAC address of the Ethernet data packet 52 parsed from the PCI-E message 62 and the board identifier carried in the PCI-E packet 62.
  • the second logic device 42 can obtain the PCI-E message 61 from the second board memory 43.
  • the PCI-E packet 61 encapsulates the Ethernet data packet 51 received by the first logic device 32 from the first Ethernet switch chip 31 and carries the board identifier of the first interface board 30.
  • the second logic device 42 may create the entry corresponding to the first interface board 30 in the mapping table 71b of the Ethernet MAC address and the board identifier based on the source MAC address of the Ethernet data packet 51 parsed from the PCI-E message 61 and the board identifier carried in the PCI-E packet 61.
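The learning step described above can be sketched as follows. The `PcieMessage` container and its field names are hypothetical (the source does not give the encapsulation format); the only facts relied on are that the message carries the sender's board identifier and an Ethernet frame, and that bytes 6 to 11 of an Ethernet frame hold the source MAC address.

```python
from dataclasses import dataclass

@dataclass
class PcieMessage:
    dest_addr: int   # PCI-E memory space address the message was written to
    src_board: str   # board identifier carried in the message (hypothetical field)
    payload: bytes   # encapsulated Ethernet frame

def source_mac(frame: bytes) -> str:
    """Extract the source MAC address (bytes 6..11 of an Ethernet frame)."""
    return frame[6:12].hex(":")

def learn(mac_to_board: dict, msg: PcieMessage) -> None:
    """Create or refresh the MAC -> board entry from a received message."""
    mac_to_board[source_mac(msg.payload)] = msg.src_board
```

For example, after a receiving board calls `learn` on a message sent by board 40, later frames addressed to that source MAC can be encapsulated directly toward board 40 instead of being flooded.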
  • the first logical device 32 or the second logical device 42 may be unable to find the destination card of an Ethernet data packet according to the mapping table of the Ethernet MAC address and the board identifier.
  • for example, if the Ethernet data packet 51 is a broadcast packet, its destination MAC address is a broadcast address, so the destination card of the Ethernet data packet 51 cannot be uniquely determined through the mapping table of the Ethernet MAC address and the board identifier, and the destination board query fails. In this case, PCI-E packets can be forwarded one by one to all interface boards except the local board.
  • the identifier "51" is still used to indicate the Ethernet data packet received by the first Ethernet switch chip 31, but it can be understood that the Ethernet data message represented by the identifier "51" in the following example may be different from the Ethernet data message 51 described above.
  • the network device 10' may further include a third interface board 80 including a third PCI-E Endpoint 84 and a third board memory 83.
  • the forwarding board 20' may further include a third PCI-E RC 243 that connects the third PCI-E Endpoint 84 through the third PCI-E bus 13.
  • the third PCI-E RC 243 may be a separate entity with a PCI-E RC function, or may be a logically formed equivalent PCI-E RC.
  • the third board memory 83 can be mapped in the PCI-E memory space 60, and the address area 60d mapped by the third board memory 83 in the PCI-E memory space 60 is different from the address sections 60c, 60b, and 60a to which the system memory 22, the first board memory 33, and the second board memory 43 are respectively mapped in the PCI-E memory space.
  • the entries in the mapping table 72 of the board identifier and the PCI-E memory space address that are sent by the CPU 21 during the initialization further include entries corresponding to the third interface board 80.
  • the third PCI-E RC 243 and the third PCI-E Endpoint 84 can also be mapped in the PCI-E memory space 60.
  • the first logic device 32 determines the destination board of the Ethernet data packet 51 by matching the entries in the mapping table 71a of the Ethernet MAC address and the board identifier;
  • if the first logic device 32 determines that the destination board of the Ethernet data packet 51 is not the forwarding board 20 and cannot otherwise be determined (for example, the mapping table 71a contains only the initial entry for the MAC address of the forwarding board 20, or the destination MAC address may be a broadcast address), the first logical device 32 may use the mapping table 72 of the board identifier and the PCI-E memory space address to encapsulate the Ethernet data message 51 into PCI-E messages 61 and 61' whose destination addresses are, respectively, the PCI-E memory space address of the second board memory 43 (i.e., an address in the address section 60a shown in FIG. 8) and the PCI-E memory space address of the third board memory 83 (i.e., an address in the address section 60d);
  • the first PCI-E Endpoint 34 then sends the PCI-E messages 61 and 61' to the first PCI-E RC 241, and the first PCI-E RC 241, according to the destination addresses of the PCI-E messages 61 and 61', forwards the PCI-E messages 61 and 61' to the second PCI-E RC 242 and the third PCI-E RC 243, respectively.
  • the PCI-E message 61 encapsulating the Ethernet data packet 51 can reach the second interface board 40, and the second logical device 42 in the second interface board 40 creates an entry corresponding to the first interface board 30 in the mapping table 71b of the Ethernet MAC address and the board identifier.
  • the PCI-E message 61' may reach the third interface board 80, and the third interface board 80 may have a device similar to the first logic device 32 or the second logic device 42 (not shown in FIGS. 7 and 9); that device can parse the Ethernet data packet 51 and create an entry corresponding to the first interface board 30 in its mapping table of the Ethernet MAC address and the board identifier.
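The fallback for an unknown or broadcast destination can be sketched as below: a successful lookup yields one target board, while a failed lookup yields every interface board other than the local one, with one encapsulated copy per target. All function names, board names, and addresses are hypothetical, and the tuple `(dest_addr, src_board, frame)` merely stands in for a PCI-E message.

```python
def destination_boards(dst_mac, mac_to_board, local_board, interface_boards):
    """Return the boards that should receive a frame with this destination MAC."""
    board = mac_to_board.get(dst_mac)
    if board is not None:
        return [board]
    # Query failed (unknown or broadcast MAC): all interface boards but the local one.
    return [b for b in interface_boards if b != local_board]

def encapsulate(frame, dst_mac, mac_to_board, board_to_addr, local_board, interface_boards):
    """Build one (dest_addr, src_board, frame) tuple per target board."""
    targets = destination_boards(dst_mac, mac_to_board, local_board, interface_boards)
    return [(board_to_addr[b], local_board, frame) for b in targets]
```

With only the forwarding board's entry present after initialization, a broadcast frame received on board 30 would produce two copies, one addressed into the memory of board 40 and one into the memory of board 80, matching messages 61 and 61' in the text.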
  • the first logic device 32 may include: an Ethernet bus controller 320, an Ethernet packet receiving processing module 321, a first mapping table maintenance module 322, a PCI-E message sending processing module 323, a second mapping table maintenance module 324, a PCI-E message receiving processing module 325, an Ethernet message sending processing module 326, and a CPU switching register 327.
  • the first logic device 32 operates as follows (since the Ethernet data messages 51 and 53 are processed in the first logic device 32 in substantially the same manner, and the PCI-E messages 61 and 63 are processed in the first logic device 32 in substantially the same manner, the Ethernet data message 53 and the PCI-E message 63 are omitted in FIG. 11 to simplify the view):
  • the Ethernet bus controller 320 connects to the first Ethernet switch chip 31 of the local board so that the Ethernet data messages 51 and 52 can be exchanged between the first logic device 32 and the first Ethernet switch chip 31.
  • the Ethernet message reception processing module 321 receives the Ethernet data message 51 from the first Ethernet switch chip 31 from the Ethernet bus controller 320.
  • the first mapping table maintenance module 322 maintains a mapping table 71a of the Ethernet MAC address and the board identifier, and the Ethernet packet receiving processing module 321 determines the destination board of the Ethernet data message 51 by matching the entries in the mapping table 71a of the Ethernet MAC address and the board identifier.
  • the Ethernet data packet 51 is sent to the PCI-E packet transmission processing module 323 together with the matching result.
  • the matching result described herein may be the board identifier of the successfully matched destination board, or a matching failure; in FIG. 11, matching the board identifier of the second interface board 40 is taken as an example (shown as S1111 in FIG. 11).
  • the PCI-E message transmission processing module 323 encapsulates the received Ethernet data message 51.
  • the second mapping table maintenance module 324 maintains a mapping table 72 of the board identifier and the PCI-E memory space address, and the PCI-E packet sending processing module 323 determines the destination address of the PCI-E message 61 formed by the encapsulation by matching the entries in the mapping table 72 of the board identifier and the PCI-E memory space address.
  • the Ethernet data packet 51 is encapsulated in the PCI-E packet 61, the destination address of the PCI-E packet 61 is determined according to the matching result, and then the PCI-E message 61 is provided to the first PCI-E Endpoint 34.
  • the matching result described herein may be the board identifier of a single determined destination board, or the board identifiers of all other interface boards (when the destination board fails to be determined); in FIG. 11, matching the address section 60a of the second board memory 43 is taken as an example (as shown by S1112 and S1113 in FIG. 11).
  • entries corresponding to all the other interface boards of the network device 10 except the local board 30 (i.e., the second interface board 40 and the third interface board 80) can be maintained in the first mapping table maintenance module 322.
  • the Ethernet packet receiving and processing module 321 can report the matching result to the PCI-E packet sending and processing module 323 as "all other interface boards of the network device 10 except the local board 30", so that the PCI-E packet sending and processing module 323 can match the entries corresponding to all the other interface boards except the local board 30; or the Ethernet packet receiving processing module 321 can report a matching failure to the PCI-E packet sending processing module 323, so that the PCI-E message transmission processing module 323 can poll the entries corresponding to all other interface boards except the local board 30.
  • the CPU switch register 327 can store the board identifier of the local board sent by the CPU 21, and the PCI-E packet sending processing module 323 can encapsulate the board identifier of the local board in the PCI-E message 61, so that the other interface boards that receive the PCI-E message 61 (including but not limited to the second interface board 40) can use the board identifier of the local board and the source MAC address of the Ethernet data packet 51 to learn the mapping entry between the Ethernet MAC address and the board identifier of the local board.
  • the PCI-E message receiving processing module 325 can read the PCI-E messages 60 and 62 destined for the local board from the first board memory 33.
  • the PCI-E packet receiving and processing module 325 can parse, from the PCI-E packet 60, the entries sent by the CPU 21 when the network device 10 is initialized, and write them to the first mapping table maintenance module 322 and the second mapping table maintenance module 324 (shown as S1121 in FIG. 11).
  • the PCI-E message receiving and processing module 325 can parse, from the PCI-E message 60, the board identifier of the local board that is sent by the CPU 21 when the network device 10 is initialized, and write it into the CPU exchange register 327 (as shown by S1122 in FIG. 11).
  • the PCI-E packet receiving processing module 325 can parse the Ethernet data packet 52 from the second interface board 40 and the board identifier of the second interface board 40 from the PCI-E packet 62, and forward the parsed Ethernet data message 52 and the board identifier to the Ethernet message transmission processing module 326.
  • the Ethernet packet sending processing module 326 extracts the source MAC address from the received Ethernet data packet 52, and the first mapping table maintenance module 322 creates an entry corresponding to the second interface board 40 according to the extracted source MAC address and the received board identifier of the second interface board 40.
  • the Ethernet message transmission processing module 326 also transmits the received Ethernet data message 52 to the Ethernet bus controller 320.
  • the second logic device 42 can have substantially the same structure as the first logic device 32.
  • the address interval 60a in which the second card memory 43 is mapped in the PCI-E memory space 60 may include a plurality of data message buffers 60a_1 to 60a_m (m is a positive integer greater than 1) and at least one control message buffer 60a_ctr.
  • the address interval 60b in which the first card memory 33 is mapped in the PCI-E memory space 60 may include a plurality of data message buffers 60b_1 to 60b_n (n is a positive integer greater than 1) and at least one control message buffer 60b_ctr.
  • the entry 91 corresponding to the second interface board 40 in the mapping table 72 of the board identifier and the PCI-E memory space address maintained by the first logical device 32 is divided into multiple sub-entries respectively corresponding to the data message buffers 60a_1 to 60a_m, and each sub-entry has a flag bit Flag indicating whether the data packet buffer 60a_i (i is a positive integer greater than or equal to 1 and less than or equal to m) corresponding to the sub-entry is occupied or idle.
  • the entry 92 corresponding to the first interface board 30 in the mapping table 72 of the board identifier and the PCI-E memory space address maintained by the second logical device 42 is likewise divided into multiple sub-entries respectively corresponding to the data message buffers 60b_1 to 60b_n, and each sub-entry has a flag bit Flag indicating whether the data packet buffer 60b_j (j is a positive integer greater than or equal to 1 and less than or equal to n) corresponding to the sub-entry is occupied or idle.
  • the control message buffers 60a_ctr and 60b_ctr may be recorded as sub-entries, without the flag bit Flag, of the mapping table 72 of the board identifier and the PCI-E memory space address, or may be stored separately, independent of the mapping table 72 of the board identifier and the PCI-E memory space address.
  • When the first logic device 32 in the first interface board 30 receives the Ethernet data packet 51 from the first Ethernet switch chip 31, the first logic device 32 can determine the destination board of the Ethernet data packet 51.
  • When the destination board is the second interface board 40, the first logic device 32 selects an idle data packet buffer slice 60a_i from the data packet buffer slices 60a_1 to 60a_m to which the second board memory 43 is mapped in the PCI-E memory space 60, sets the flag bit Flag of the corresponding sub-entry to the occupied state, and then encapsulates the Ethernet data packet 51 into a PCI-E packet 61 whose destination address is the PCI-E memory space address of the data packet buffer slice 60a_i. The first PCI-E Endpoint 34 sends the PCI-E packet 61 to the first PCI-E RC 241, which can forward it to the second PCI-E RC 242 according to its destination address.
  • That is, the destination address set by the first logic device 32 for the PCI-E packet 61 is the PCI-E memory space address of an idle data packet buffer slice 60a_i mapped by the second board memory 43.
  • The PCI-E packet 61 encapsulating the Ethernet data packet 51 is forwarded by the second PCI-E RC 242 to the second PCI-E Endpoint 44 and written by the second PCI-E Endpoint 44 into the address interval of the second board memory 43 that corresponds to the data packet buffer slice 60a_i. The second logic device 42 can then obtain the PCI-E packet 61 from the second board memory 43, parse the Ethernet data packet 51 out of it, and send it to the second Ethernet switch chip 41, which can send it out of the network device 10.
  • The second logic device 42 also constructs a PCI-E packet 65 whose destination address is the control packet buffer slice 60b_ctr, so that the second PCI-E Endpoint 44 sends the PCI-E packet 65 to the second PCI-E RC 242, which can forward it to the first PCI-E RC 241 according to its destination address.
  • The PCI-E packet 65 also carries data packet buffer slice release information indicating that the data packet buffer slice 60a_i is released.
  • When the PCI-E packet 65 carrying the release information is forwarded by the first PCI-E RC 241 to the first PCI-E Endpoint 34 and written by the first PCI-E Endpoint 34 into the address interval of the first board memory 33 that corresponds to the control packet buffer slice 60b_ctr, the first logic device 32 can obtain the PCI-E packet 65 from the first board memory 33, parse the release information out of it, and set the flag bit Flag of the sub-entry corresponding to the data packet buffer slice 60a_i to the idle state.
  • In the reverse direction, the destination address set by the second logic device 42 for the PCI-E packet 62 may be the PCI-E memory space address of an idle data packet buffer slice 60b_j mapped by the first board memory 33, and the flag bit Flag of the sub-entry corresponding to the data packet buffer slice 60b_j is set to the occupied state.
  • The first logic device 32 can likewise construct a PCI-E packet 66 whose destination address is the control packet buffer slice 60a_ctr, so that the second logic device 42 sets the flag bit Flag of the sub-entry corresponding to the data packet buffer slice 60b_j to the idle state.
  • By dividing the PCI-E memory address space into slices in this way, backpressure flow control can be applied to the forwarding of Ethernet data packets between the first interface board 30 and the second interface board 40.
  • the first logic device 32 may further include a release execution module 328 and a release notification module 329.
  • When the PCI-E packet sending processing module 323 encapsulates the Ethernet data packet 51 into the PCI-E packet 61 whose destination address is the PCI-E memory space address of the data packet buffer slice 60a_i, it can set the flag bit Flag of the sub-entry corresponding to the data packet buffer slice 60a_i to the occupied state (as shown by S1411 in Fig. 15a).
  • When the PCI-E packet receiving processing module 325 reads the PCI-E packet 65 from the address interval of the first board memory 33 that corresponds to the local board's control packet buffer slice 60b_ctr, it can parse the data packet buffer slice release information out of the PCI-E packet 65 and provide it to the release execution module 328.
  • The release execution module 328 can then set the flag bit Flag of the sub-entry corresponding to the data packet buffer slice 60a_i to the idle state according to the release information provided by the PCI-E packet receiving processing module 325 (as shown by S1412 in Fig. 15a).
  • When the PCI-E packet receiving processing module 325 reads the PCI-E packet 62 from the address interval of the first board memory 33 that corresponds to the data packet buffer slice 60b_j, it parses the Ethernet data packet 52 out of the PCI-E packet 62, extracts the destination address of the PCI-E packet 62 and the board identifier of the second interface board 40 carried in it, and provides them to the Ethernet packet sending processing module 326.
  • When the Ethernet packet sending processing module 326 sends out the Ethernet data packet 52, it can provide the destination address of the PCI-E packet 62 and the board identifier of the second interface board 40 to the release notification module 329.
  • Based on the destination address of the PCI-E packet 62 and the board identifier of the second interface board 40, the release notification module 329 notifies the PCI-E packet sending processing module 323 to construct a PCI-E packet 66 addressed to the control packet buffer slice 60a_ctr of the second interface board 40, with the destination address of the PCI-E packet 62 carried in the PCI-E packet 66 as the data packet buffer slice release information.
  • When the packet forwarding method is applied to the first interface board 30 or the second interface board 40 in the network device 10 shown in Fig. 1, it may include the following steps performed by the first logic device 32 or the second logic device 42:
  • S1601: when an Ethernet data packet from outside the network device is received from the Ethernet switch chip, the destination board of the Ethernet data packet is determined;
  • S1602: when the destination board is successfully determined to be another interface board of the network device, the Ethernet data packet is encapsulated into a PCI-E packet whose destination address is the PCI-E memory space address of the board memory of that other interface board, so that the PCI-E Endpoint forwards the PCI-E packet to the forwarding board and the forwarding board forwards it to the board memory of the destination board according to its destination address;
  • S1603: when a PCI-E packet from another interface board is obtained from the board memory, the Ethernet data packet is parsed out of it and forwarded to the Ethernet switch chip.
  • When the destination board is determined to be the forwarding board, the packet forwarding method may further include: encapsulating the Ethernet data packet into a PCI-E packet whose destination address is the PCI-E memory space address of the system memory, so that the PCI-E Endpoint forwards the PCI-E packet to the forwarding board, where it is written into the system memory of the forwarding board.
  • The packet forwarding method may further maintain a mapping table of board identifiers and PCI-E memory space addresses, which is used to determine, from the destination board of an Ethernet data packet received from the Ethernet switch chip, the destination address of the PCI-E packet that encapsulates that Ethernet data packet; entries in this table are created according to the configuration of the CPU of the forwarding board.
  • The packet forwarding method may further maintain a mapping table of Ethernet MAC addresses and board identifiers. From a PCI-E packet from another interface board, the method can parse the Ethernet data packet that the other interface board received from outside the network device, together with that board's identifier, and create an entry for that board in the table from the source MAC address of the parsed Ethernet data packet and the carried board identifier. The method can also create the entry corresponding to the forwarding board in this table according to the configuration of the CPU when the network device is initialized.
  • In that case, S1601 may determine the destination board of the Ethernet data packet according to the destination MAC address of the Ethernet data packet.
  • When determining the destination board of an Ethernet data packet received from the Ethernet switch chip fails, the packet forwarding method may further encapsulate the Ethernet data packet into more than one PCI-E packet, with the PCI-E memory space address of the board memory of each other interface board of the network device as the respective destination address, so that the PCI-E Endpoint forwards these PCI-E packets to the forwarding board and the forwarding board forwards each of them to the corresponding other interface board according to its destination address.
  • The packet forwarding method in this example can adopt a backpressure flow-control mechanism on the same principle as Fig. 12. Correspondingly:
  • The address interval to which the board memory of each interface board is mapped in the PCI-E memory space includes a plurality of data packet buffer slices and at least one control packet buffer slice; and the packet forwarding method further includes:
  • when it is determined that the Ethernet data packet received from the Ethernet switch chip is destined for another interface board, selecting the PCI-E memory space address of an idle data packet buffer slice mapped by the board memory of that other interface board as the destination address of the PCI-E packet, and changing the status record of the corresponding data packet buffer slice of that other interface board from idle to occupied according to the destination address of the PCI-E packet;
  • when the Ethernet data packet parsed out of a PCI-E packet from another interface board is forwarded to the Ethernet switch chip, constructing a PCI-E packet that carries data packet buffer slice release information and whose destination address is the control packet buffer slice mapped by the board memory of that other interface board, so that the PCI-E Endpoint forwards the constructed PCI-E packet to the forwarding board and the forwarding board forwards it to that other interface board according to its destination address;
  • when data packet buffer slice release information is parsed out of a PCI-E packet from another interface board, resetting the status record of the corresponding data packet buffer slice of that other interface board from occupied to idle according to the parsed release information.
  • When an interface board receives an Ethernet data packet from its Ethernet switch chip and determines that it is destined for another interface board, the logic device of the interface board can encapsulate the Ethernet data packet into a PCI-E packet whose destination address is the PCI-E memory space address of the board memory in the other interface board, so that the PCI-E packet can be forwarded through the forwarding board to the board memory of the other interface board.
  • When the interface board obtains, from its board memory, a PCI-E packet from another interface board, its logic device can parse the Ethernet data packet out of the PCI-E packet and forward it. Forwarding Ethernet data packets between interface boards therefore does not require the CPU of the forwarding board to participate, and forwarding performance can be improved.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Information Transfer Systems (AREA)
  • Small-Scale Networks (AREA)

Abstract

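When a first interface board receives a first Ethernet data packet from its Ethernet switch chip and determines that the packet is destined for a second interface board, the logic device of the first interface board encapsulates the first Ethernet data packet into a first PCI-E packet whose destination address is a PCI-E memory space address of the board memory in the second interface board, so that the PCI-E endpoint forwards the first PCI-E packet to the forwarding board of the network device. When the first interface board obtains, from its board memory, a second PCI-E packet from a third interface board, the logic device of the first interface board parses a second Ethernet data packet out of the second PCI-E packet and sends it to the Ethernet switch chip, where the third interface board may be the same as or different from the second interface board.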

Description

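Packet Forwarding
Background
A network device may include a forwarding board and at least two interface boards. The forwarding board is the board responsible for the forwarding processing of network packets. An interface board provides the external physical network interfaces and is responsible for receiving and sending network packets. The forwarding board and each interface board may be interconnected by a PCI-E (Peripheral Component Interconnect Express) bus.
When a PCI-E bus is chosen to interconnect the forwarding board and the interface boards, two goals need to be considered: improving forwarding performance between different interface boards, and reducing the CPU utilization of the forwarding board.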
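Brief Description of the Drawings
Fig. 1 is an architecture diagram of a network device in one example;
Fig. 2 is a mapping diagram of the PCI-E memory space in the network device shown in Fig. 1;
Figs. 3a and 3b are schematic diagrams of the packet forwarding principle based on the PCI-E memory space mapping shown in Fig. 2;
Fig. 4 is an extended mapping diagram of the PCI-E memory space in the network device shown in Fig. 1;
Figs. 5a and 5b are schematic diagrams of the packet forwarding principle based on the PCI-E memory space mapping shown in Fig. 4;
Figs. 6a and 6b are schematic diagrams of the table entry configuration in the network device shown in Fig. 1;
Fig. 7 is a schematic diagram of an extended architecture of the network device shown in Fig. 1;
Fig. 8 is a mapping diagram of the PCI-E memory space in the network device shown in Fig. 7;
Fig. 9 is a schematic diagram of the packet forwarding principle based on the PCI-E memory space mapping shown in Fig. 8;
Fig. 10 is a logical structure diagram of the first logic device in the network device shown in Fig. 1 or Fig. 7;
Fig. 11 is a schematic diagram of the working principle of the first logic device shown in Fig. 10;
Fig. 12 is a mapping diagram, based on a slicing mechanism, of the PCI-E memory space in the network device shown in Fig. 1;
Figs. 13a and 13b are schematic diagrams of the packet forwarding principle based on the PCI-E memory space mapping shown in Fig. 12;
Fig. 14 is a logical structure diagram of the first logic device in the network device shown in Figs. 13a and 13b;
Figs. 15a and 15b are schematic diagrams of the working principle of the first logic device shown in Fig. 14;
Fig. 16 is a flowchart of a packet forwarding method in one example.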
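Detailed Description
Referring to Fig. 1, in one example a network device 10 includes a forwarding board 20, a first interface board 30, and a second interface board 40.
The forwarding board 20 includes a CPU 21, a system memory 22, a system memory controller 23, a first PCI-E RC (Root Complex) 241, and a second PCI-E RC 242. The CPU 21, the system memory controller 23, the first PCI-E RC 241, and the second PCI-E RC 242 are interconnected with one another, and the system memory 22 is connected to the system memory controller 23.
In this example, the CPU 21, the system memory controller 23, the first PCI-E RC 241, and the second PCI-E RC 242 of the forwarding board may be integrated as an SOC (System-on-a-Chip) CPU. Alternatively, the system memory controller 23 may be integrated in the CPU 21, and the first PCI-E RC 241 and the second PCI-E RC 242 may be integrated as a PCH (Platform Controller Hub) that is independent of the CPU 21.
Although Fig. 1 shows the first PCI-E RC 241 and the second PCI-E RC 242 as independent of each other, this does not mean that they are necessarily two independent entities: they may be two equivalent PCI-E RCs formed logically by a single entity with PCI-E RC functionality through an external PCI-E Switch. When the first PCI-E RC 241 and the second PCI-E RC 242 are two independent entities, each of them may also be connected to an external PCI-E Switch.
The first interface board 30 includes a first Ethernet switch chip 31, a first logic device 32, a first board memory 33, and a first PCI-E Endpoint 34 connected to the first PCI-E RC 241 through a first PCI-E bus 11. The first Ethernet switch chip 31 and the first logic device 32 are connected to each other (for example, through an Ethernet bus), and the first logic device 32, the first board memory 33, and the first PCI-E Endpoint 34 may be interconnected with one another.
The second interface board 40 includes a second Ethernet switch chip 41, a second logic device 42, a second board memory 43, and a second PCI-E Endpoint 44 connected to the second PCI-E RC 242 through a second PCI-E bus 12. The second Ethernet switch chip 41 and the second logic device 42 may be connected to each other (for example, through an Ethernet bus), and the second logic device 42, the second board memory 43, and the second PCI-E Endpoint 44 may be interconnected with one another.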
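Referring to Fig. 2, in the network device 10 both the first board memory 33 and the second board memory 43 are mapped in the PCI-E memory space 60 of the network device 10, and the address interval 60b to which the first board memory 33 is mapped differs from the address interval 60a to which the second board memory 43 is mapped. As a result, the first board memory 33 and the second board memory 43 are visible to the first PCI-E RC 241 and the second PCI-E RC 242 as well as to the first PCI-E Endpoint 34 and the second PCI-E Endpoint 44.
Referring to Fig. 3a in combination with Fig. 2, forwarding from the first interface board 30 to the second interface board 40 proceeds as follows:
When the first logic device 32 in the first interface board 30 receives an Ethernet data packet 51 from the first Ethernet switch chip 31, the first logic device 32 can determine the destination board of the Ethernet data packet 51. The Ethernet data packet 51 may have been received by the first Ethernet switch chip 31 from outside the network device 10, and the first logic device 32 can determine its destination board from a mapping table of Ethernet MAC addresses and board identifiers that it maintains.
In one example, regardless of whether the destination board of the Ethernet data packet 51 is the second interface board 40 or the forwarding board 20, the Ethernet data packet 51 is forwarded in PCI-E format to the system memory 22 in the forwarding board 20 and is processed after the CPU 21 reads it from the system memory 22.
In the present example, when it is determined that the Ethernet data packet 51 received from the first Ethernet switch chip 31 is destined for the second interface board 40, the first logic device 32 can encapsulate the Ethernet data packet 51 into a PCI-E packet 61 (in this document, "PCI-E packet" may be shorthand for a PCI-E write packet) whose destination address is a PCI-E memory space address of the second board memory 43 (that is, an address within the address interval 60a shown in Fig. 2). The first PCI-E Endpoint 34 then forwards the PCI-E packet 61 to the forwarding board 20, and the forwarding board 20 forwards it, according to its destination address, to the second board memory 43 of the second interface board 40, which is the destination board. For example, the first PCI-E Endpoint 34 sends the PCI-E packet 61 to the first PCI-E RC 241, which forwards it to the second PCI-E RC 242 according to its destination address; the PCI-E packet 61 encapsulating the Ethernet data packet 51 is then forwarded by the second PCI-E RC 242 to the second PCI-E Endpoint 44 and written by the second PCI-E Endpoint 44 into the second board memory 43.
Correspondingly, the second logic device 42 can obtain the PCI-E packet 61 from the second board memory 43, parse the Ethernet data packet 51 out of it, and send it to the second Ethernet switch chip 41, which can then send it out of the network device 10.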
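The address-based forwarding described above can be illustrated with a small model. This is only a sketch under stated assumptions: the base addresses, interval sizes, and the names `Region`, `MEMORY_MAP`, `encapsulate`, and `route` are hypothetical and do not come from the patent, which only requires that the intervals 60a and 60b differ. A PCI-E write packet is modelled as a destination address plus a payload.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str   # memory that owns this interval
    base: int   # first PCI-E memory space address (inclusive)
    size: int   # interval length in bytes

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size


# Hypothetical bases and sizes; the text only requires 60a and 60b to differ.
MEMORY_MAP = [
    Region("second_board_memory_43 (60a)", 0x1000_0000, 0x0010_0000),
    Region("first_board_memory_33 (60b)", 0x2000_0000, 0x0010_0000),
]


def check_disjoint(regions):
    """Raise ValueError if any two mapped intervals overlap."""
    ordered = sorted(regions, key=lambda r: r.base)
    for left, right in zip(ordered, ordered[1:]):
        if left.base + left.size > right.base:
            raise ValueError(f"{left.name} overlaps {right.name}")


def encapsulate(eth_frame: bytes, dest_addr: int) -> dict:
    """Model a PCI-E write packet as a destination address plus payload."""
    return {"dest_addr": dest_addr, "payload": eth_frame}


def route(packet: dict, regions=MEMORY_MAP) -> str:
    """Pick the target memory purely from the destination address."""
    for region in regions:
        if region.contains(packet["dest_addr"]):
            return region.name
    raise LookupError(hex(packet["dest_addr"]))
```

Because `route` looks only at the destination address, nothing in this model corresponds to the CPU 21 inspecting the payload, which mirrors the point made in the text.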
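Referring to Fig. 3b in combination with Fig. 2, forwarding from the second interface board 40 to the first interface board 30 works in the same way as forwarding from the first interface board 30 to the second interface board 40:
When the second logic device 42 in the second interface board 40 receives an Ethernet data packet 52 from the second Ethernet switch chip 41, the second logic device 42 can determine the destination board of the Ethernet data packet 52. The Ethernet data packet 52 may have been received by the second Ethernet switch chip 41 from outside the network device 10.
When it is determined that the Ethernet data packet 52 received from the second Ethernet switch chip 41 is destined for the first interface board 30, the second logic device 42 can encapsulate the Ethernet data packet 52 into a PCI-E packet 62 whose destination address is a PCI-E memory space address of the first board memory 33 (that is, an address within the address interval 60b shown in Fig. 2). The second PCI-E Endpoint 44 then forwards the PCI-E packet 62 to the forwarding board 20, and the forwarding board 20 forwards it, according to its destination address, to the first board memory 33 of the first interface board 30, which is the destination board. For example, the second PCI-E Endpoint 44 sends the PCI-E packet 62 to the second PCI-E RC 242, which forwards it to the first PCI-E RC 241 according to its destination address; the PCI-E packet 62 encapsulating the Ethernet data packet 52 is then forwarded by the first PCI-E RC 241 to the first PCI-E Endpoint 34 and written by the first PCI-E Endpoint 34 into the first board memory 33.
Correspondingly, the first logic device 32 can obtain the PCI-E packet 62 from the first board memory 33, parse the Ethernet data packet 52 out of it, and forward it to the first Ethernet switch chip 31, which can then send it out of the network device 10.
As can be seen, forwarding the Ethernet data packet 51 or 52 between the first interface board 30 and the second interface board 40 does not require the CPU 21 of the forwarding board 20 to participate, and the forwarding performance between the two interface boards can be improved to some extent.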
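In addition, referring to Fig. 4, the system memory 22 in the forwarding board 20 is mapped in the PCI-E memory space 60 of the network device 10; that is, the system memory 22 mapped in the memory space 60 is visible to the first PCI-E RC 241 and the second PCI-E RC 242 as well as to the first PCI-E Endpoint 34 and the second PCI-E Endpoint 44. The address interval 60c to which the system memory 22 is mapped differs from the address interval 60b of the first board memory 33 and from the address interval 60a of the second board memory 43. Either all or part of the memory space of the system memory 22 may be mapped in the PCI-E memory space 60.
Referring to Fig. 5a in combination with Fig. 4, forwarding from the first interface board 30 to the forwarding board 20 proceeds as follows:
When the first logic device 32 in the first interface board 30 receives an Ethernet data packet 53 from the first Ethernet switch chip 31, the first logic device 32 can determine the destination board of the Ethernet data packet 53. For example, it can do so from the entry configured by the CPU 21 of the forwarding board 20 in the mapping table of Ethernet MAC addresses and board identifiers.
When it is determined that the Ethernet data packet 53 received from the first Ethernet switch chip 31 is destined for the forwarding board 20, the first logic device 32 can encapsulate the Ethernet data packet 53 into a PCI-E packet 63 whose destination address is a PCI-E memory space address of the system memory 22 (that is, an address within the address interval 60c shown in Fig. 4). The first PCI-E Endpoint 34 sends the PCI-E packet 63 to the first PCI-E RC 241, which can write it into the system memory 22 according to its destination address. When the CPU 21 accesses the system memory 22 through the system memory controller 23, it can then read the PCI-E packet 63 and parse the Ethernet data packet 53 out of it.
Referring to Fig. 5b in combination with Fig. 4, forwarding from the second interface board 40 to the forwarding board 20 works in the same way as forwarding from the first interface board 30 to the forwarding board 20:
When the second logic device 42 in the second interface board 40 receives an Ethernet data packet 54 from the second Ethernet switch chip 41, the second logic device 42 can determine the destination board of the Ethernet data packet 54.
When it is determined that the Ethernet data packet 54 received from the second Ethernet switch chip 41 is destined for the forwarding board 20, the second logic device 42 can encapsulate the Ethernet data packet 54 into a PCI-E packet 64 whose destination address is a PCI-E memory space address of the system memory 22 (that is, an address within the address interval 60c shown in Fig. 4). The second PCI-E Endpoint 44 sends the PCI-E packet 64 to the second PCI-E RC 242, which can write it into the system memory 22 according to its destination address. When the CPU 21 accesses the system memory 22 through the system memory controller 23, it can then read the PCI-E packet 64 and parse the Ethernet data packet 54 out of it.
As can be seen, the forwarding between the first interface board 30 and the second interface board 40 used in this example need not affect the forwarding between either interface board and the forwarding board 20.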
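Referring to Figs. 6a and 6b, the first logic device 32 and the second logic device 42 may respectively maintain mapping tables 71a and 71b of Ethernet MAC addresses and board identifiers, which are used to determine the destination board of the Ethernet data packet 51, 52, 53, or 54 from its destination MAC address.
Still referring to Figs. 6a and 6b, the first logic device 32 and the second logic device 42 may also maintain a mapping table 72 of board identifiers and PCI-E memory space addresses, which is used to determine, from the destination board of the Ethernet data packet 51, 52, 53, or 54, the destination address of the PCI-E packet 61, 62, 63, or 64 that encapsulates it.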
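The two tables just described amount to a two-stage lookup: destination MAC address to board identifier (table 71a or 71b), then board identifier to PCI-E memory space address (table 72). A minimal sketch follows; the board identifier values, the MAC address standing in for "MAC0", and the addresses are all hypothetical placeholders, as is the function name `resolve_destination`.

```python
FORWARDING_BOARD, FIRST_BOARD, SECOND_BOARD = 0, 1, 2  # hypothetical board IDs

# Table 71a as held by the first logic device: MAC address -> board identifier.
# The single entry stands in for "MAC0", the forwarding board's MAC address.
mac_to_board = {"00:00:5e:00:53:00": FORWARDING_BOARD}

# Table 72: board identifier -> PCI-E memory space address (placeholder values).
board_to_addr = {
    FORWARDING_BOARD: 0x3000_0000,  # an address in interval 60c
    FIRST_BOARD: 0x2000_0000,       # an address in interval 60b
    SECOND_BOARD: 0x1000_0000,      # an address in interval 60a
}


def resolve_destination(dest_mac, mac_table=mac_to_board, addr_table=board_to_addr):
    """Return (board_id, pcie_addr) for a destination MAC, or None if unknown."""
    board = mac_table.get(dest_mac.lower())
    if board is None:
        return None  # no entry matched; the caller decides what to do next
    return board, addr_table[board]
```

Keeping the tables separate means a MAC entry learned at run time never needs to know an address; only table 72, which the text says is configured by the CPU 21, ties boards to addresses.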
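Fig. 6a shows the state of the mapping tables 71a and 71b of Ethernet MAC addresses and board identifiers and of the mapping table 72 of board identifiers and PCI-E memory space addresses after the network device 10 has been initialized. As shown in Fig. 6a, the entries corresponding to the forwarding board 20 in the mapping tables 71a and 71b, and all entries in the mapping table 72, may be issued by the CPU 21 in a PCI-E packet 60 when the network device 10 is initialized. "MAC0" in Figs. 6a and 6b denotes the MAC address of the forwarding board 20.
In this example, the PCI-E packet 60 used to issue the entries is forwarded by the CPU 21 over the first PCI-E bus 11 and the second PCI-E bus 12; that is, the CPU 21 reuses the first PCI-E bus 11 and the second PCI-E bus 12 as a management bus for issuing entries. It can be understood that the network device 10 may instead be provided with a management bus that is independent of the first PCI-E bus 11 and the second PCI-E bus 12 and dedicated to issuing entries, in which case the issued entries may be carried in protocol packets of that independent management bus.
In addition, the entry corresponding to the second interface board 40 in the mapping table 71a shown in Fig. 6b may be learned by the first logic device 32 from the Ethernet data packet 52 encapsulated in the PCI-E packet 62 from the second interface board 40. "MAC2" in Fig. 6b denotes the source MAC address of the Ethernet data packet 52.
Similarly, the entry corresponding to the first interface board 30 in the mapping table 71b shown in Fig. 6b may be learned by the second logic device 42 from the Ethernet data packet 51 encapsulated in the PCI-E packet 61 from the first interface board 30. "MAC1" in Fig. 6b denotes the source MAC address of the Ethernet data packet 51.
When the PCI-E packet 62 encapsulated by the second logic device 42 is written into the first board memory 33, the first logic device 32 can obtain the PCI-E packet 62 from the first board memory 33. The PCI-E packet 62 encapsulates the Ethernet data packet 52 that the second logic device 42 received from the second Ethernet switch chip 41 and carries the board identifier of the second interface board 40. If the mapping table 71a contains no entry corresponding to the second interface board 40 at that time, the first logic device 32 can create one in the mapping table 71a from the source MAC address of the Ethernet data packet 52 parsed out of the PCI-E packet 62 and the board identifier carried in the PCI-E packet 62.
Similarly, when the PCI-E packet 61 encapsulated by the first logic device 32 is written into the second board memory 43, the second logic device 42 can obtain the PCI-E packet 61 from the second board memory 43. The PCI-E packet 61 encapsulates the Ethernet data packet 51 that the first logic device 32 received from the first Ethernet switch chip 31 and carries the board identifier of the first interface board 30. If the mapping table 71b contains no entry corresponding to the first interface board 30 at that time, the second logic device 42 can create one in the mapping table 71b from the source MAC address of the Ethernet data packet 51 parsed out of the PCI-E packet 61 and the board identifier carried in the PCI-E packet 61.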
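The learning step above can be sketched as follows. The packet is modelled as a dictionary with hypothetical fields `src_board` (the carried board identifier) and `payload` (the encapsulated Ethernet frame); the patent does not specify a field layout. The sketch assumes the standard Ethernet frame layout, in which bytes 0 to 5 hold the destination MAC address and bytes 6 to 11 the source MAC address, and it keys the "entry already exists" check on the source MAC address, which is a simplification of the per-board check in the text.

```python
def source_mac(eth_frame: bytes) -> str:
    """Extract the source MAC (bytes 6..11 of an Ethernet frame) as text."""
    if len(eth_frame) < 12:
        raise ValueError("frame too short to contain two MAC addresses")
    return ":".join(f"{b:02x}" for b in eth_frame[6:12])


def learn_from_packet(mac_table: dict, packet: dict) -> bool:
    """Create a MAC -> board entry if none exists; return True if one was added."""
    mac = source_mac(packet["payload"])
    if mac in mac_table:
        return False          # entry already present, nothing to learn
    mac_table[mac] = packet["src_board"]
    return True
```

For example, a first call with a frame whose source MAC is `02:00:00:00:00:2a` and `src_board` equal to 2 adds one entry, and a repeated call leaves the table unchanged.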
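The first logic device 32 or the second logic device 42 may be unable to find the destination board of an Ethernet data packet in the mapping table of Ethernet MAC addresses and board identifiers.
For example, if the Ethernet data packet 51 is a broadcast packet, its destination MAC address is the broadcast address, so the mapping table of Ethernet MAC addresses and board identifiers cannot uniquely determine its destination board and the destination board lookup fails. In this case, PCI-E packets can be forwarded one by one to all interface boards other than the local board. In the following example, the label "51" continues to denote the Ethernet data packet received by the first Ethernet switch chip 31, but it can be understood that this Ethernet data packet may differ from the Ethernet data packet 51 described above.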
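Referring to Fig. 7, the network device 10' may further include a third interface board 80, which includes a third PCI-E Endpoint 84 and a third board memory 83. Correspondingly, the forwarding board 20' may further include a third PCI-E RC 243, which is connected to the third PCI-E Endpoint 84 through a third PCI-E bus 13. As with the first PCI-E RC 241 and the second PCI-E RC 242, the third PCI-E RC 243 may be an independent entity with PCI-E RC functionality or a logically formed equivalent PCI-E RC.
As shown in Fig. 8, the third board memory 83 may be mapped in the PCI-E memory space 60, and the address interval 60d to which it is mapped differs from the address intervals 60c, 60b, and 60a to which the system memory 22, the first board memory 33, and the second board memory 43 are respectively mapped. The entries of the mapping table 72 of board identifiers and PCI-E memory space addresses issued by the CPU 21 at initialization further include an entry corresponding to the third interface board 80. The third PCI-E RC 243 and the third PCI-E Endpoint 84 may also be mapped in the PCI-E memory space 60.
Referring to Fig. 9 in combination with Fig. 8:
When the Ethernet data packet 51 is received from the first Ethernet switch chip 31, the first logic device 32 determines its destination board by matching entries in the mapping table 71a of Ethernet MAC addresses and board identifiers;
If matching an entry in the mapping table 71a fails, that is, the first logic device 32 determines that the Ethernet data packet 51 is not destined for the forwarding board 20 (for example, the mapping table 71a contains only the initial MAC0 entry of the forwarding board 20), in which case the destination MAC address of the Ethernet data packet 51 may be the broadcast address, then the first logic device 32 can use the mapping table 72 of board identifiers and PCI-E memory space addresses to encapsulate the Ethernet data packet 51 into more than one PCI-E packet, 61 and 61', whose destination addresses are respectively a PCI-E memory space address of the second board memory 43 (an address within the address interval 60a shown in Fig. 8) and a PCI-E memory space address of the third board memory 83 (an address within the address interval 60d shown in Fig. 8). The first PCI-E Endpoint 34 sends the PCI-E packets 61 and 61' to the first PCI-E RC 241, which forwards them to the second PCI-E RC 242 and the third PCI-E RC 243 respectively according to their destination addresses.
The PCI-E packet 61 encapsulating the Ethernet data packet 51 can thus reach the second interface board 40 and cause the second logic device 42 in the second interface board 40 to create an entry corresponding to the first interface board 30 in the mapping table 71b of Ethernet MAC addresses and board identifiers. Similarly, the PCI-E packet 61' can reach the third interface board 80, which may contain a device similar to the first logic device 32 or the second logic device 42 (not shown in Fig. 7 or Fig. 9); that device can parse out the Ethernet data packet 51 and create an entry corresponding to the first interface board 30 in its mapping table of Ethernet MAC addresses and board identifiers.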
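The fallback for a failed lookup can be sketched as one encapsulation per other interface board. In this sketch the function name `flood`, the packet fields, and the addresses are hypothetical; the table passed in is assumed to contain interface boards only (the forwarding board's system memory is not a flooding target in the text).

```python
def flood(eth_frame: bytes, local_board: int, board_memory_addr: dict) -> list:
    """Build one PCI-E packet per interface board other than the local one.

    board_memory_addr maps an interface board identifier to a PCI-E memory
    space address inside that board's memory (placeholder values in tests).
    """
    return [
        {"dest_addr": addr, "src_board": local_board, "payload": eth_frame}
        for board, addr in sorted(board_memory_addr.items())
        if board != local_board
    ]
```

Each resulting packet carries the sender's board identifier, so every receiving board can learn the sender's MAC address from the flooded frame, as the paragraph above describes for tables 71b and its counterpart on the third interface board.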
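Based on the principle described above, and referring to Fig. 10, the first logic device 32 may include: an Ethernet bus controller 320, an Ethernet packet receiving processing module 321, a first mapping table maintenance module 322, a PCI-E packet sending processing module 323, a second mapping table maintenance module 324, a PCI-E packet receiving processing module 325, an Ethernet packet sending processing module 326, and a CPU exchange register 327.
Referring to Fig. 11, based on the structure shown in Fig. 10, the first logic device 32 works as follows (because the Ethernet data packets 51 and 53 are processed in essentially the same way in the first logic device 32, as are the PCI-E packets 61 and 63, the Ethernet data packet 53 and the PCI-E packet 63 are omitted from Fig. 11 to simplify the view):
The Ethernet bus controller 320 is connected to the first Ethernet switch chip 31 of the local board, so that the Ethernet data packets 51 and 52 can be exchanged between the first logic device 32 and the first Ethernet switch chip 31.
The Ethernet packet receiving processing module 321 receives, from the Ethernet bus controller 320, the Ethernet data packet 51 from the first Ethernet switch chip 31.
The first mapping table maintenance module 322 maintains the mapping table 71a of Ethernet MAC addresses and board identifiers, which the Ethernet packet receiving processing module 321 uses to determine the destination board of the Ethernet data packet 51 by matching entries in the table.
After completing the match, the Ethernet packet receiving processing module 321 sends the Ethernet data packet 51 together with the matching result to the PCI-E packet sending processing module 323. The matching result may be the board identifier of a successfully matched destination board, or a match failure; Fig. 11 uses a match to the board identifier of the second interface board 40 as the example (as shown by S1111 in Fig. 11).
The PCI-E packet sending processing module 323 encapsulates the received Ethernet data packet 51.
The second mapping table maintenance module 324 maintains the mapping table 72 of board identifiers and PCI-E memory space addresses, which the PCI-E packet sending processing module 323 uses to determine the destination address of the encapsulated PCI-E packet 61 by matching entries in the table.
After completing the match, the PCI-E packet sending processing module 323 encapsulates the Ethernet data packet 51 in the PCI-E packet 61, determines the destination address of the PCI-E packet 61 from the matching result, and then provides the PCI-E packet 61 to the first PCI-E Endpoint 34.
Here the input to the match may be the board identifier of a single determined destination board, or the board identifiers of all other interface boards (when determining the destination board has failed); Fig. 11 uses a match to the address interval 60a of the second board memory 43 as the example (as shown by S1112 and S1113 in Fig. 11).
When determining the destination board fails, there are two options. The first mapping table maintenance module 322 may maintain an entry covering all interface boards of the network device 10 other than the local board 30 (that is, the second interface board 40 and the third interface board 80), and the Ethernet packet receiving processing module 321 may report to the PCI-E packet sending processing module 323 that the matching result is all interface boards other than the local board 30, so that the PCI-E packet sending processing module 323 can match the entries of those boards one by one. Alternatively, the Ethernet packet receiving processing module 321 may report a match-failure notification to the PCI-E packet sending processing module 323, so that the PCI-E packet sending processing module 323 can poll the entries of all interface boards other than the local board 30.
In addition, the CPU exchange register 327 may store the board identifier of the local board as issued by the CPU 21. The PCI-E packet sending processing module 323 can encapsulate the local board's identifier in the PCI-E packet 61 (as shown by S1114 in Fig. 11), so that other interface boards that receive the PCI-E packet 61 (including but not limited to the second interface board 40) can use the local board's identifier and the source MAC address of the Ethernet data packet 51 to learn the entry for the local board in their mapping tables of Ethernet MAC addresses and board identifiers.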
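The PCI-E packet receiving processing module 325 can read, from the first board memory 33, the PCI-E packets 60 and 62 destined for the local board.
The PCI-E packet receiving processing module 325 can parse, from the PCI-E packet 60, the entries issued by the CPU 21 when the network device 10 is initialized and write them into the first mapping table maintenance module 322 and the second mapping table maintenance module 324 (as shown by S1121 in Fig. 11). It can also parse, from the PCI-E packet 60, the board identifier of the local board issued by the CPU 21 at initialization and write it into the CPU exchange register 327 (as shown by S1122 in Fig. 11).
The PCI-E packet receiving processing module 325 can parse, from the PCI-E packet 62, the Ethernet data packet 52 from the second interface board 40 and the board identifier of the second interface board 40, and forward the parsed Ethernet data packet 52 and board identifier to the Ethernet packet sending processing module 326.
The Ethernet packet sending processing module 326 extracts the source MAC address from the received Ethernet data packet 52 and, based on the extracted source MAC address and the received board identifier of the second interface board 40, creates an entry corresponding to the second interface board 40 in the first mapping table maintenance module 322.
The Ethernet packet sending processing module 326 also sends the received Ethernet data packet 52 to the Ethernet bus controller 320.
The second logic device 42 may have essentially the same structure as the first logic device 32.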
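Referring to Fig. 12, in another example the address interval 60a to which the second board memory 43 is mapped in the PCI-E memory space 60 may include a plurality of data packet buffer slices 60a_1 to 60a_m (m is a positive integer greater than 1) and at least one control packet buffer slice 60a_ctr. Similarly, the address interval 60b to which the first board memory 33 is mapped in the PCI-E memory space 60 may include a plurality of data packet buffer slices 60b_1 to 60b_n (n is a positive integer greater than 1) and at least one control packet buffer slice 60b_ctr.
Correspondingly, the entry 91 for the second interface board 40 in the mapping table 72 of board identifiers and PCI-E memory space addresses maintained by the first logic device 32 is divided into sub-entries that correspond respectively to the data packet buffer slices 60a_1 to 60a_m, and each sub-entry has a flag bit Flag indicating whether the corresponding data packet buffer slice 60a_i (i is a positive integer with 1 ≤ i ≤ m) is occupied or idle. Similarly, the entry 92 for the first interface board 30 in the mapping table 72 maintained by the second logic device 42 is divided into sub-entries that correspond respectively to the data packet buffer slices 60b_1 to 60b_n, and each sub-entry has a flag bit Flag indicating whether the corresponding data packet buffer slice 60b_j (j is a positive integer with 1 ≤ j ≤ n) is occupied or idle.
In addition, the control packet buffer slices 60a_ctr and 60b_ctr may each be kept as a sub-entry without a flag bit Flag in the mapping table 72 of board identifiers and PCI-E memory space addresses, or may be stored separately, independent of that mapping table.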
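Referring to Fig. 13a in combination with Fig. 12, forwarding from the first interface board 30 to the second interface board 40 proceeds as follows:
When the first logic device 32 in the first interface board 30 receives the Ethernet data packet 51 from the first Ethernet switch chip 31, the first logic device 32 can determine the destination board of the Ethernet data packet 51.
When it is determined that the Ethernet data packet 51 received from the first Ethernet switch chip 31 is destined for the second interface board 40, the first logic device 32 selects an idle data packet buffer slice 60a_i from the data packet buffer slices 60a_1 to 60a_m to which the second board memory 43 is mapped in the PCI-E memory space 60 and sets the flag bit Flag of the corresponding sub-entry to the occupied state. It then encapsulates the Ethernet data packet 51 into a PCI-E packet 61 whose destination address is the PCI-E memory space address of the data packet buffer slice 60a_i. The first PCI-E Endpoint 34 sends the PCI-E packet 61 to the first PCI-E RC 241, which can forward it to the second PCI-E RC 242 according to its destination address.
That is, the destination address set by the first logic device 32 for the PCI-E packet 61 is the PCI-E memory space address of an idle data packet buffer slice 60a_i mapped by the second board memory 43.
Correspondingly, when the PCI-E packet 61 encapsulating the Ethernet data packet 51 is forwarded by the second PCI-E RC 242 to the second PCI-E Endpoint 44 and written by the second PCI-E Endpoint 44 into the address interval of the second board memory 43 that corresponds to the data packet buffer slice 60a_i, the second logic device 42 can obtain the PCI-E packet 61 from the second board memory 43, parse the Ethernet data packet 51 out of it, and send it to the second Ethernet switch chip 41, which can then send it out of the network device 10.
The second logic device 42 also constructs a PCI-E packet 65 whose destination address is the control packet buffer slice 60b_ctr. The second PCI-E Endpoint 44 sends the PCI-E packet 65 to the second PCI-E RC 242, which can forward it to the first PCI-E RC 241 according to its destination address. The PCI-E packet 65 also carries data packet buffer slice release information indicating that the data packet buffer slice 60a_i is released.
Correspondingly, when the PCI-E packet 65 carrying the release information is forwarded by the first PCI-E RC 241 to the first PCI-E Endpoint 34 and written by the first PCI-E Endpoint 34 into the address interval of the first board memory 33 that corresponds to the control packet buffer slice 60b_ctr, the first logic device 32 can obtain the PCI-E packet 65 from the first board memory 33, parse the release information out of it, and set the flag bit Flag of the sub-entry corresponding to the data packet buffer slice 60a_i to the idle state.
Referring to Fig. 13b in combination with Fig. 12, the reverse direction works in the same way: the destination address set by the second logic device 42 for the PCI-E packet 62 may be the PCI-E memory space address of an idle data packet buffer slice 60b_j mapped by the first board memory 33, and the flag bit Flag of the sub-entry corresponding to the data packet buffer slice 60b_j is set to the occupied state. Correspondingly, the first logic device 32 can construct a PCI-E packet 66 whose destination address is the control packet buffer slice 60a_ctr, so that the second logic device 42 sets the flag bit Flag of the sub-entry corresponding to the data packet buffer slice 60b_j to the idle state.
As can be seen, by dividing the PCI-E memory address space into slices, backpressure flow control can be applied to the forwarding of Ethernet data packets between the first interface board 30 and the second interface board 40.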
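The sender-side bookkeeping behind this flow control can be sketched as a table of per-slice flags. The class name `SliceTable`, the equal slice size, and the address arithmetic are assumptions made for illustration; the patent only states that each sub-entry has a Flag that is either occupied or idle.

```python
class SliceTable:
    """Sender-side view of a peer board's data packet buffer slices.

    Assumes, for illustration only, that the slices are contiguous and of
    equal size starting at `base`.
    """

    def __init__(self, base: int, slice_size: int, count: int):
        self.base = base
        self.slice_size = slice_size
        self.occupied = [False] * count   # one Flag per sub-entry

    def acquire(self):
        """Mark the first idle slice occupied and return its address, or None."""
        for index, busy in enumerate(self.occupied):
            if not busy:
                self.occupied[index] = True
                return self.base + index * self.slice_size
        return None  # every slice is occupied: the sender must hold the frame

    def release(self, slice_addr: int) -> None:
        """Reset the Flag of the slice that starts at slice_addr to idle."""
        index, remainder = divmod(slice_addr - self.base, self.slice_size)
        if remainder or not 0 <= index < len(self.occupied):
            raise ValueError(f"not a slice address: {slice_addr:#x}")
        self.occupied[index] = False
```

Backpressure arises because `acquire` returns `None` once all slices of the peer are occupied, and only a release notification from the peer makes a slice available again.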
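Referring to Fig. 14, when the PCI-E memory address space is divided into slices, the first logic device 32 may further include a release execution module 328 and a release notification module 329.
Referring to Fig. 15a, when the PCI-E packet sending processing module 323 encapsulates the Ethernet data packet 51 into the PCI-E packet 61 whose destination address is the PCI-E memory space address of the data packet buffer slice 60a_i, it can set the flag bit Flag of the sub-entry corresponding to the data packet buffer slice 60a_i to the occupied state (as shown by S1411 in Fig. 15a). When the PCI-E packet receiving processing module 325 reads the PCI-E packet 65 from the address interval of the first board memory 33 that corresponds to the local board's control packet buffer slice 60b_ctr, it can parse the data packet buffer slice release information out of the PCI-E packet 65 and provide it to the release execution module 328. The release execution module 328 can then set the flag bit Flag of the sub-entry corresponding to the data packet buffer slice 60a_i to the idle state according to the release information provided by the PCI-E packet receiving processing module 325 (as shown by S1412 in Fig. 15a).
Referring to Fig. 15b, when the PCI-E packet receiving processing module 325 reads the PCI-E packet 62 from the address interval of the first board memory 33 that corresponds to the data packet buffer slice 60b_j, it parses the Ethernet data packet 52 out of the PCI-E packet 62, extracts the destination address of the PCI-E packet 62 and the board identifier of the second interface board 40 carried in it, and provides them to the Ethernet packet sending processing module 326. When the Ethernet packet sending processing module 326 sends out the Ethernet data packet 52, it can provide the destination address of the PCI-E packet 62 and the board identifier of the second interface board 40 to the release notification module 329. Based on that destination address and board identifier, the release notification module 329 notifies the PCI-E packet sending processing module 323 to construct a PCI-E packet 66 addressed to the control packet buffer slice 60a_ctr of the second interface board 40, with the destination address of the PCI-E packet 62 (that is, the PCI-E memory space address of the data packet buffer slice 60b_j) carried in the PCI-E packet 66 as the data packet buffer slice release information.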
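The receiver-side half of the exchange, in which the destination address of the received data packet is echoed back as release information, can be sketched as below. The dictionary fields, the function names, and the use of a set of occupied slice addresses are hypothetical modelling choices, not the patent's data layout.

```python
def build_release_notice(data_packet: dict, local_board: int,
                         control_slice_addr: dict) -> dict:
    """Build the control packet that tells the sender its slice is free again.

    control_slice_addr maps a board identifier to the PCI-E memory space
    address of that board's control packet buffer slice (placeholder values).
    """
    return {
        "dest_addr": control_slice_addr[data_packet["src_board"]],
        "src_board": local_board,
        # The slice address the data packet was written to is the release info.
        "release_info": data_packet["dest_addr"],
    }


def apply_release_notice(notice: dict, occupied_slices: set) -> None:
    """Sender side: mark the released slice idle (no-op if already idle)."""
    occupied_slices.discard(notice["release_info"])
```

Echoing the address rather than a slice index means the receiver needs no knowledge of how the sender numbers its sub-entries; that is one plausible reading of why the text reuses the destination address.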
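The above is a detailed description of the network device in the foregoing examples. The following example additionally provides a packet forwarding method for use in a network device.
Referring to Fig. 16, when the packet forwarding method is applied to the first interface board 30 or the second interface board 40 in the network device 10 shown in Fig. 1, it may include the following steps performed by the first logic device 32 or the second logic device 42:
S1601: when an Ethernet data packet from outside the network device is received from the Ethernet switch chip, determine the destination board of the Ethernet data packet;
S1602: when the destination board of the Ethernet data packet received from the Ethernet switch chip is successfully determined to be another interface board of the network device, encapsulate the Ethernet data packet into a PCI-E packet whose destination address is a PCI-E memory space address of the board memory of that other interface board, so that the PCI-E Endpoint forwards the PCI-E packet to the forwarding board and the forwarding board forwards it to the board memory of the destination board according to its destination address;
S1603: when a PCI-E packet from another interface board is obtained from the board memory, parse the Ethernet data packet out of the obtained PCI-E packet and forward it to the Ethernet switch chip.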
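Steps S1601 to S1603 can be condensed into two handlers, one per direction. This is a minimal sketch: the handler names, the dictionary-based packet model, and the use of raw 6-byte MAC addresses as table keys are assumptions, and the failed-lookup case simply returns `None` here.

```python
def on_frame_from_switch(frame: bytes, mac_table: dict, addr_table: dict):
    """S1601 + S1602: look up the destination board, then encapsulate."""
    dest_mac = frame[0:6]                 # S1601: destination MAC of the frame
    board = mac_table.get(dest_mac)
    if board is None:
        return None                       # lookup failure is handled separately
    # S1602: the destination address is taken from the board -> address table.
    return {"dest_addr": addr_table[board], "payload": frame}


def on_packet_from_board_memory(packet: dict) -> bytes:
    """S1603: recover the Ethernet frame to hand to the Ethernet switch chip."""
    return packet["payload"]
```

A frame passed through `on_frame_from_switch` on one board and `on_packet_from_board_memory` on the other comes out byte for byte unchanged, which is the property the method relies on.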
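In addition, when S1601 determines that the Ethernet data packet received from the Ethernet switch chip is destined for the forwarding board, the packet forwarding method may further include: encapsulating the Ethernet data packet into a PCI-E packet whose destination address is a PCI-E memory space address of the system memory, so that the PCI-E Endpoint forwards the PCI-E packet to the forwarding board, where it is written into the system memory of the forwarding board.
In this example, the packet forwarding method may further maintain a mapping table of board identifiers and PCI-E memory space addresses, which is used to determine, from the destination board of the Ethernet data packet received from the Ethernet switch chip, the destination address of the PCI-E packet that encapsulates it; entries in this table are created according to the configuration of the CPU of the forwarding board.
In this example, the packet forwarding method may further maintain a mapping table of Ethernet MAC addresses and board identifiers. From a PCI-E packet from another interface board, the method can parse the Ethernet data packet that the other interface board received from outside the network device, together with that board's identifier, and create an entry for that board in the table from the source MAC address of the parsed Ethernet data packet and the board identifier carried in the PCI-E packet. The method can also create the entry corresponding to the forwarding board in this table according to the configuration of the CPU when the network device is initialized. In that case, S1601 may determine the destination board of the Ethernet data packet from its destination MAC address.
In this example, when determining the destination board of an Ethernet data packet received from the Ethernet switch chip fails, the packet forwarding method may further encapsulate the Ethernet data packet into more than one PCI-E packet, with the PCI-E memory space address of the board memory of each other interface board of the network device as the respective destination address, so that the PCI-E Endpoint forwards these PCI-E packets to the forwarding board and the forwarding board forwards each of them to the corresponding other interface board according to its destination address.
In addition, the packet forwarding method in this example may adopt a backpressure flow-control mechanism on the same principle as Fig. 12. Correspondingly: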
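The address interval to which the board memory of each interface board is mapped in the PCI-E memory space includes a plurality of data packet buffer slices and at least one control packet buffer slice; and the packet forwarding method further includes:
when it is determined that the Ethernet data packet received from the Ethernet switch chip is destined for another interface board, selecting the PCI-E memory space address of an idle data packet buffer slice mapped by the board memory of that other interface board as the destination address of the PCI-E packet, and changing the status record of the corresponding data packet buffer slice of that other interface board from idle to occupied according to the destination address of the PCI-E packet;
when the Ethernet data packet parsed out of a PCI-E packet from another interface board is forwarded to the Ethernet switch chip, constructing a PCI-E packet that carries data packet buffer slice release information and whose destination address is the control packet buffer slice mapped by the board memory of that other interface board, so that the PCI-E Endpoint forwards the constructed PCI-E packet to the forwarding board and the forwarding board forwards it to that other interface board according to its destination address;
when data packet buffer slice release information is parsed out of a PCI-E packet from another interface board, resetting the status record of the corresponding data packet buffer slice of that other interface board from occupied to idle according to the parsed release information.
As can be seen from the examples above, when an interface board receives an Ethernet data packet from its Ethernet switch chip and determines that it is destined for another interface board, the logic device of the interface board can encapsulate the Ethernet data packet into a PCI-E packet whose destination address is a PCI-E memory space address of the board memory in the other interface board, so that the PCI-E packet can be forwarded through the forwarding board to the board memory of the other interface board. When the interface board obtains, from its board memory, a PCI-E packet from another interface board, its logic device can parse the Ethernet data packet out of the PCI-E packet and forward it. Forwarding Ethernet data packets between interface boards therefore does not require the CPU of the forwarding board to participate, and forwarding performance can be improved.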

Claims (15)

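  1. A first interface board of a network device, characterized by comprising a PCI-E endpoint, an Ethernet switch chip, a board memory, and a logic device;
    the PCI-E endpoint is connected through a PCI-E bus to the PCI-E root complex that corresponds to the first interface board in the forwarding board of the network device;
    the Ethernet switch chip receives a first Ethernet data packet from outside the network device;
    the board memory is mapped in the PCI-E memory space of the network device, and the address interval to which the board memory is mapped in the PCI-E memory space differs from the address intervals to which the board memories of other interface boards and the system memory of the forwarding board are mapped in the PCI-E memory space;
    the logic device receives the first Ethernet data packet from the Ethernet switch chip and determines the destination board of the first Ethernet data packet; wherein, when it is determined that the first Ethernet data packet received from the Ethernet switch chip is destined for a second interface board of the network device, the logic device encapsulates the first Ethernet data packet into a first PCI-E packet whose destination address is a PCI-E memory space address of the board memory of the second interface board, so that the PCI-E endpoint forwards the first PCI-E packet to the forwarding board of the network device;
    the logic device further obtains, from the board memory, a second PCI-E packet from a third interface board, parses a second Ethernet data packet out of the obtained second PCI-E packet, and sends it to the Ethernet switch chip, wherein the third interface board is the same as or different from the second interface board.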
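  2. The first interface board according to claim 1, characterized in that the logic device maintains a first mapping table of Ethernet MAC addresses and board identifiers, used to determine the destination board of the first Ethernet data packet from the destination MAC address of the first Ethernet data packet;
    and the second PCI-E packet from the third interface board encapsulates the second Ethernet data packet received by the third interface board from outside the network device and carries the board identifier of the third interface board;
    and the logic device creates an entry corresponding to the third interface board in the first mapping table of Ethernet MAC addresses and board identifiers according to the source MAC address of the second Ethernet data packet parsed out of the second PCI-E packet and the board identifier carried in the second PCI-E packet.
  3. The first interface board according to claim 1, characterized in that the logic device maintains a second mapping table of board identifiers and PCI-E memory space addresses, used to determine, from the destination board of the first Ethernet data packet received from the Ethernet switch chip, the destination address of the first PCI-E packet that encapsulates the first Ethernet data packet; and entries are created in the second mapping table of board identifiers and PCI-E memory space addresses according to the configuration of the CPU of the forwarding board.
  4. The first interface board according to claim 1, characterized in that, when determining the destination board of the first Ethernet data packet received from the Ethernet switch chip fails, the logic device encapsulates the first Ethernet data packet into PCI-E packets whose destination addresses are respectively the PCI-E memory space addresses of the board memory of each other interface board of the network device, so that the PCI-E endpoint forwards them to the forwarding board.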
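  5. The first interface board according to claim 1, characterized in that the address intervals to which the board memories of the first interface board and of the other interface boards are mapped in the PCI-E memory space include a plurality of data packet buffer slices and at least one control packet buffer slice;
    when it is determined that the first Ethernet data packet received from the Ethernet switch chip is destined for the second interface board, the destination address that the logic device sets for the first PCI-E packet is the PCI-E memory space address of an idle data packet buffer slice mapped by the board memory of the second interface board, and the logic device further changes the status record of the corresponding data packet buffer slice of the second interface board from idle to occupied according to the destination address of the first PCI-E packet;
    when the second Ethernet data packet parsed out of the second PCI-E packet from the third interface board is sent to the Ethernet switch chip, the logic device constructs a third PCI-E packet that carries data packet buffer slice release information and whose destination address is the control packet buffer slice mapped by the board memory of the third interface board, so that the PCI-E endpoint forwards the constructed third PCI-E packet to the forwarding board, wherein the data packet buffer slice to which the release information corresponds is the destination address set for the second PCI-E packet;
    when data packet buffer slice release information is parsed out of a fourth PCI-E packet from the third interface board, the logic device resets the status record of the corresponding data packet buffer slice of the third interface board from occupied to idle according to the parsed release information.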
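  6. The first interface board according to claim 1, characterized in that, when it is determined that the destination board of the first Ethernet data packet received from the Ethernet switch chip is the forwarding board of the network device, the logic device encapsulates the first Ethernet data packet into a PCI-E packet whose destination address is a PCI-E memory space address of the system memory, so that the PCI-E endpoint forwards the PCI-E packet to the forwarding board, to be written by the forwarding board into the system memory of the forwarding board.
  7. The first interface board according to claim 6, characterized in that the logic device maintains a first mapping table of Ethernet MAC addresses and board identifiers, used to determine the destination board of the first Ethernet data packet from the destination MAC address of the first Ethernet data packet;
    wherein the logic device creates an entry corresponding to the forwarding board in the first mapping table of Ethernet MAC addresses and board identifiers according to the configuration of the CPU when the network device is initialized;
    and the second PCI-E packet from the third interface board that the logic device obtains from the board memory encapsulates the second Ethernet data packet received by the third interface board from outside the network device and carries the board identifier of the third interface board;
    and the logic device creates an entry corresponding to the third interface board in the first mapping table of Ethernet MAC addresses and board identifiers according to the source MAC address of the second Ethernet data packet parsed out of the second PCI-E packet and the board identifier carried in the second PCI-E packet.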
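  8. A network device, characterized by comprising a forwarding board and at least two interface boards, wherein the forwarding board comprises a CPU, a system memory, a system memory controller, and PCI-E root complexes respectively corresponding to the interface boards, and each interface board comprises an Ethernet switch chip, a logic device, a board memory, and a PCI-E endpoint connected through a PCI-E bus to the corresponding PCI-E root complex in the forwarding board;
    the Ethernet switch chip of each interface board receives a first Ethernet data packet from outside the network device;
    the board memory of each interface board and the system memory of the forwarding board are mapped in the PCI-E memory space of the network device, and the address intervals to which the board memory of each interface board and the system memory of the forwarding board are mapped in the PCI-E memory space differ from one another;
    the logic device of each interface board receives the first Ethernet data packet from the Ethernet switch chip of that interface board and determines the destination board of the first Ethernet data packet; wherein, when it is determined that the first Ethernet data packet received from the Ethernet switch chip is destined for a second interface board of the network device, the logic device encapsulates the first Ethernet data packet into a first PCI-E packet whose destination address is a PCI-E memory space address of the board memory of the second interface board, so that the PCI-E endpoint of that interface board forwards the first PCI-E packet to the corresponding PCI-E root complex of the forwarding board and the corresponding PCI-E root complex of the forwarding board forwards the first PCI-E packet to the board memory of the destination board according to the destination address of the first PCI-E packet;
    the logic device of each interface board further obtains, from the board memory of that interface board, a second PCI-E packet from a third interface board, parses a second Ethernet data packet out of the obtained second PCI-E packet, and sends it to the Ethernet switch chip of that interface board, wherein the third interface board is the same as or different from the second interface board.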
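  9. The network device according to claim 8, characterized in that, when the logic device of each interface board determines that the destination board of the first Ethernet data packet received from the Ethernet switch chip is the forwarding board, it encapsulates the first Ethernet data packet into a PCI-E packet whose destination address is a PCI-E memory space address of the system memory, so that the PCI-E endpoint forwards the PCI-E packet to the corresponding PCI-E root complex of the forwarding board and the corresponding PCI-E root complex of the forwarding board writes it into the system memory of the forwarding board.
  10. The network device according to claim 9, characterized in that the logic device of each interface board maintains a first mapping table of Ethernet MAC addresses and board identifiers, used to determine the destination board of the first Ethernet data packet from the destination MAC address of the first Ethernet data packet;
    wherein the logic device of each interface board creates an entry corresponding to the forwarding board in the first mapping table of Ethernet MAC addresses and board identifiers according to the configuration of the CPU when the network device is initialized;
    and the second PCI-E packet from the third interface board that the logic device of each interface board obtains from the board memory of that interface board encapsulates the second Ethernet data packet received by the third interface board from outside the network device and carries the board identifier of the third interface board;
    and the logic device of each interface board creates an entry corresponding to the third interface board in the first mapping table of Ethernet MAC addresses and board identifiers according to the source MAC address of the second Ethernet data packet parsed out of the second PCI-E packet and the board identifier carried in the second PCI-E packet.
  11. The network device according to claim 8, characterized in that the logic device of each interface board maintains a second mapping table of board identifiers and PCI-E memory space addresses, used to determine, from the destination board of the first Ethernet data packet received from the Ethernet switch chip, the destination address of the first PCI-E packet that encapsulates the first Ethernet data packet; and the logic device of each interface board creates entries corresponding to the forwarding board and the other interface boards in the second mapping table of board identifiers and PCI-E memory space addresses according to the configuration of the CPU when the network device is initialized.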
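  12. A packet forwarding method for use in a network device, characterized in that the network device comprises a forwarding board and at least two interface boards; each interface board comprises a PCI-E endpoint, an Ethernet switch chip, and a board memory; the board memory of each interface board and the system memory of the forwarding board are jointly mapped in the PCI-E memory space of the network device, and the address intervals to which the board memory of each interface board and the system memory of the forwarding board are mapped in the PCI-E memory space differ from one another;
    the packet forwarding method is applied on any one of the interface boards and comprises:
    when a first Ethernet data packet from outside the network device is received from the Ethernet switch chip, determining the destination board of the first Ethernet data packet;
    when it is determined that the destination board of the first Ethernet data packet received from the Ethernet switch chip is a second interface board of the network device, encapsulating the first Ethernet data packet into a first PCI-E packet whose destination address is a PCI-E memory space address of the board memory of the second interface board, so that the PCI-E endpoint forwards the first PCI-E packet to the forwarding board and the forwarding board forwards the first PCI-E packet to the board memory of the destination board according to the destination address of the first PCI-E packet;
    when a second PCI-E packet from a third interface board is obtained from the board memory, parsing a second Ethernet data packet out of the obtained second PCI-E packet and forwarding it to the Ethernet switch chip.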
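  13. The packet forwarding method according to claim 12, characterized in that the packet forwarding method further comprises:
    maintaining a first mapping table of Ethernet MAC addresses and board identifiers, used to determine the destination board of the first Ethernet data packet from the destination MAC address of the first Ethernet data packet;
    parsing, from the second PCI-E packet from the third interface board, the second Ethernet data packet received by the third interface board from outside the network device and the board identifier of the third interface board;
    creating an entry corresponding to the third interface board in the first mapping table of Ethernet MAC addresses and board identifiers according to the source MAC address of the parsed second Ethernet data packet and the board identifier carried in the second PCI-E packet.
  14. The packet forwarding method according to claim 12, characterized in that the packet forwarding method further comprises: maintaining a second mapping table of board identifiers and PCI-E memory space addresses, used to determine, from the destination board of the first Ethernet data packet received from the Ethernet switch chip, the destination address of the first PCI-E packet that encapsulates the first Ethernet data packet; and creating entries in the second mapping table of board identifiers and PCI-E memory space addresses according to the configuration of the CPU of the forwarding board.
  15. The packet forwarding method according to claim 12, characterized in that the packet forwarding method further comprises: when determining the destination board of the first Ethernet data packet received from the Ethernet switch chip fails, encapsulating the first Ethernet data packet into more than one PCI-E packet whose destination addresses are respectively the PCI-E memory space addresses of the board memory of each other interface board of the network device, so that the PCI-E endpoint forwards the more than one PCI-E packet to the forwarding board and the forwarding board forwards each of them to the corresponding other interface board according to its destination address.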
PCT/CN2016/103943 2015-10-30 2016-10-31 Packet forwarding Ceased WO2017071667A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP16859102.2A EP3370377B1 (en) 2015-10-30 2016-10-31 Packet forwarding
US15/771,963 US10430364B2 (en) 2015-10-30 2016-10-31 Packet forwarding
JP2018521989A JP6592599B2 (ja) 2015-10-30 2016-10-31 Packet forwarding

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510721081.9A CN106713183B (zh) 2015-10-30 2015-10-30 Interface board of a network device, the network device, and packet forwarding method
CN201510721081.9 2015-10-30

Publications (1)

Publication Number Publication Date
WO2017071667A1 (zh) 2017-05-04

Family

ID=58631397

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/103943 Ceased WO2017071667A1 (zh) 2015-10-30 2016-10-31 Packet forwarding

Country Status (5)

Country Link
US (1) US10430364B2 (zh)
EP (1) EP3370377B1 (zh)
JP (1) JP6592599B2 (zh)
CN (1) CN106713183B (zh)
WO (1) WO2017071667A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021098602A1 (zh) * 2019-11-19 2021-05-27 中兴通讯股份有限公司 Packet forwarding method and apparatus, and distributed device
CN117499346A (zh) * 2023-12-28 2024-02-02 苏州元脑智能科技有限公司 Method and apparatus for delivering access control information

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108737296B (zh) 2017-09-27 2020-12-04 新华三技术有限公司 Data transmission method and apparatus, and network device
US11803503B2 (en) 2021-07-08 2023-10-31 Mediatek Inc. Chip having dual-mode device that switches between root complex mode and endpoint mode in different system stages and associated computer system
CN114553797B (zh) * 2022-02-25 2023-05-09 星宸科技股份有限公司 Multi-chip system with command forwarding mechanism and address generation method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101277195A (zh) * 2007-03-30 2008-10-01 杭州华三通信技术有限公司 Switching fabric communication system, implementation method, and switching apparatus
CN102393838A (zh) * 2011-07-04 2012-03-28 华为技术有限公司 Data processing method and apparatus, PCI-E bus system, and server
US20130031288A1 (en) * 2011-07-27 2013-01-31 Agilent Technologies, Inc. Pci-e system having reconfigurable link architecture
CN103490961A (zh) * 2013-09-05 2014-01-01 杭州华三通信技术有限公司 Network device

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5566170A (en) * 1994-12-29 1996-10-15 Storage Technology Corporation Method and apparatus for accelerated packet forwarding
US7830882B2 (en) * 2006-11-17 2010-11-09 Intel Corporation Switch scaling for virtualized network interface controllers
US8503468B2 (en) * 2008-11-05 2013-08-06 Fusion-Io, Inc. PCI express load sharing network interface controller cluster
US9306849B2 (en) * 2010-05-03 2016-04-05 Pluribus Networks, Inc. Methods and systems for managing distribute media access control address tables
JP5613517B2 (ja) 2010-09-30 2014-10-22 京セラドキュメントソリューションズ株式会社 Information processing apparatus
JP2013088879A (ja) 2011-10-13 2013-05-13 Kyocera Document Solutions Inc Information processing apparatus
WO2013157256A1 (ja) 2012-04-18 2013-10-24 日本電気株式会社 Interworking device, method, and non-transitory computer-readable medium storing a program
US9178815B2 (en) * 2013-03-05 2015-11-03 Intel Corporation NIC flow switching
JP6070357B2 (ja) * 2013-03-28 2017-02-01 富士通株式会社 Storage apparatus
US9244874B2 (en) * 2013-06-14 2016-01-26 National Instruments Corporation Selectively transparent bridge for peripheral component interconnect express bus systems
US9135200B2 (en) * 2013-06-28 2015-09-15 Futurewei Technologies, Inc. System and method for extended peripheral component interconnect express fabrics
US10684973B2 (en) * 2013-08-30 2020-06-16 Intel Corporation NUMA node peripheral switch
KR101670342B1 (ko) * 2013-10-29 2016-10-28 후아웨이 테크놀러지 컴퍼니 리미티드 Data processing system and data processing method
JP6197674B2 (ja) 2014-01-31 2017-09-20 富士通株式会社 Communication method, relay device, and communication program
US9753883B2 (en) * 2014-02-04 2017-09-05 Netronome Systems, Inc. Network interface device that maps host bus writes of configuration information for virtual NIDs into a small transactional memory
US9886410B1 (en) * 2015-02-04 2018-02-06 Amazon Technologies, Inc. Dynamic function assignment to I/O device interface port
CN107241913B (zh) * 2015-02-25 2020-06-19 株式会社日立制作所 Information processing apparatus
US10073805B2 (en) * 2015-09-03 2018-09-11 Avago Technologies General Ip (Singapore) Pte. Ltd. Virtual expansion ROM in a PCIe environment

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
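WO2021098602A1 (zh) * 2019-11-19 2021-05-27 中兴通讯股份有限公司 Packet forwarding method and apparatus, and distributed device
CN117499346A (zh) * 2023-12-28 2024-02-02 苏州元脑智能科技有限公司 Method and apparatus for delivering access control information
CN117499346B (zh) * 2023-12-28 2024-03-01 苏州元脑智能科技有限公司 Method and apparatus for delivering access control information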

Also Published As

Publication number Publication date
EP3370377B1 (en) 2019-11-27
CN106713183B (zh) 2020-03-17
US10430364B2 (en) 2019-10-01
CN106713183A (zh) 2017-05-24
EP3370377A1 (en) 2018-09-05
JP6592599B2 (ja) 2019-10-16
JP2018533794A (ja) 2018-11-15
EP3370377A4 (en) 2018-11-07
US20180225247A1 (en) 2018-08-09

Similar Documents

Publication Publication Date Title
US20240291750A1 (en) System and method for facilitating efficient event notification management for a network interface controller (nic)
US9106570B2 (en) 50 Gb/s Ethernet using serializer/deserializer lanes
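WO2017071667A1 (zh) Packet forwarding
CN108055202B (zh) Packet processing device and method
CN101656676B (zh) Method and apparatus for updating media access control (MAC) address table entries
CN103828332B (zh) Data processing method and apparatus, storage controller, and cabinet
WO2015184706A1 (zh) Statistics counting device and implementation method thereof, and system having the statistics counting device
WO2018192587A1 (zh) Table lookup method and apparatus, and computer storage medium
US11558315B2 (en) Converged network interface card, message coding method and message transmission method thereof
US8472420B2 (en) Gateway device
CN103036817A (zh) Server board, server board implementation method, and host processor
WO2020073907A1 (zh) Method and apparatus for updating forwarding table entries
CN105634788A (zh) Board, and board management method and system
CN112912809B (zh) Intelligent controller and sensor network bus, system, and method including a generic encapsulation mode
CN116633911B (zh) Data processing method, device, and system
CN108259348A (zh) Packet transmission method and apparatus
CN105704023B (zh) Packet forwarding method and apparatus for a stacking system, and stacking device
EP4304144A1 (en) Communication method and apparatus
CN115914087A (zh) Packet forwarding method, apparatus, device, system, and storage medium
CN105515856B (zh) Channel-based FC network redundancy design method
KR100496988B1 (ko) Method for communicating inter-processor communication messages in a router system that transmits inter-processor communication (IPC) messages using an Ethernet switch
CN105282058A (zh) Path configuration method and apparatus
CN115309338A (zh) Processor storage-type link management method and apparatus
CN119210936A (zh) Method and apparatus for implementing tunneling technology through a field-programmable gate array
CN116996592A (zh) Network interface card, data transmission processing method, and data reception processing method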

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16859102

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2018521989

Country of ref document: JP

Ref document number: 15771963

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2016859102

Country of ref document: EP