WO2012139477A1 - Method and apparatus for determining fault elimination based on the OAM protocol - Google Patents

Info

Publication number
WO2012139477A1
WO2012139477A1 PCT/CN2012/073609 CN2012073609W WO2012139477A1 WO 2012139477 A1 WO2012139477 A1 WO 2012139477A1 CN 2012073609 W CN2012073609 W CN 2012073609W WO 2012139477 A1 WO2012139477 A1 WO 2012139477A1
Authority
WO
WIPO (PCT)
Prior art keywords
fault
session
information corresponding
oam
identification information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2012/073609
Other languages
English (en)
French (fr)
Inventor
陈春雷
钱勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp
Priority to EP12771254.5A (EP2698948A4, en)
Priority to BR112013026226-5A (BR112013026226A2, pt)
Priority to RU2013147733/08A (RU2598794C2, ru)
Publication of WO2012139477A1 (zh)
Anticipated expiration
Current legal status: Ceased

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H04L43/0811 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking connectivity
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/10 Active monitoring, e.g. heartbeat, ping or trace-route

Definitions

  • The present invention relates to the field of data communication technologies, and in particular to a method and apparatus for determining fault elimination based on an Operation, Administration, and Maintenance (OAM) protocol.
  • OAM: Operation, Administration, and Maintenance
  • CFM: Connectivity Fault Management
  • T-MPLS: Transport Multi-Protocol Label Switching
  • MPLS-TP: MPLS Transport Profile
  • Link faults can be detected through a configured Maintenance association End Point (MEP).
  • The detection process is as follows: the MEP periodically sends and receives detection packets.
  • The information carried in each received detection packet is compared with the locally configured MEP information. According to the comparison result, link faults such as cross-connect faults, MEP ID faults, and period faults are detected.
  • When a link fault is detected, a timer must be started for the fault. If the fault is not eliminated within the timer's duration, the timer restarts; if the timer expires, the fault is determined to have been eliminated. The timer duration is determined from the period parameter interval carried in the packet and is generally 3.5 times the interval.
  • The detection period is generally at the millisecond level, for example 3.3 ms, 10 ms, or 100 ms. The timer precision must therefore also reach the millisecond level, and millisecond-level timers can only be implemented in hardware.
  • Multiple kinds of faults can be detected, usually 3 to 5, and each kind of fault needs its own timer. For example:
  • A given OAM protocol can detect four types of faults.
  • Each OAM session must then be configured with four hardware timers.
  • Hardware resources are limited, so the number of timers limits the number of OAM sessions.
  • The hardware table also needs to allocate an index for each timer, which increases the physical storage space required and the implementation difficulty.
  • The embodiments of the present invention provide a method and an apparatus for determining fault elimination based on the OAM protocol, to solve the prior-art problems that using hardware timers to determine fault elimination limits the number of OAM sessions and occupies a large amount of physical storage.
  • The method for determining fault elimination based on the OAM protocol provided by the embodiment of the present invention includes: determining whether a link is faulty by detecting an OAM session;
  • when a link fault is detected, setting the identification information of the fault corresponding to the OAM session;
  • when a session scan time arrives, setting the identification information of the scan count corresponding to the OAM session according to the set identification information, and clearing the fault identification information.
  • The apparatus for determining fault elimination based on the OAM protocol includes: a detecting module, configured to determine whether a link is faulty by detecting an OAM session;
  • a first setting module, configured to set the identification information of the fault corresponding to the OAM session when a link fault is detected;
  • a second setting module, configured to, when a session scan time arrives, set the identification information of the scan count corresponding to the OAM session according to the set identification information, and clear the fault identification information; and a determining module, configured to, when the next session scan time arrives, determine whether the fault has been eliminated according to the identification information corresponding to the fault and the identification information corresponding to the scan count.
  • The embodiment of the invention provides a method and an apparatus for determining fault elimination based on the OAM protocol.
  • When a link fault is determined by detecting an OAM session, the identification information of the fault corresponding to the session is set. When the scan time arrives, the identification information of the scan count corresponding to the OAM session is set according to the set identification information. When the next session scan time arrives, whether the fault has been eliminated is determined from the scanned identification information corresponding to the fault and the identification information corresponding to the scan count.
  • In the present invention, the identification information corresponding to the fault and the identification information corresponding to the scan count are checked at each session scan time to determine whether the fault has been eliminated. No hardware timer is needed for this, which reduces the storage space occupied by hardware timer index entries.
  • Since the present invention does not need to set a corresponding number of timers for each OAM session, the number of OAM sessions can be adjusted freely, which increases the flexibility of the system.
  • FIG. 1 is a flowchart of a method for determining fault elimination based on an OAM protocol according to an embodiment of the present invention.
  • FIG. 2 is a schematic structural diagram of an apparatus for determining fault elimination based on an OAM protocol according to an embodiment of the present invention.
  • FIG. 3 is a schematic flowchart of a method for determining fault elimination based on an OAM protocol according to an embodiment of the present invention.
  • To reduce the storage space occupied by timer index entries and improve the flexibility of the system, the embodiment of the present invention provides a method and apparatus for determining fault elimination based on the OAM protocol.
  • At each session scan time, the identification information corresponding to the fault and the identification information corresponding to the scan count are checked to determine whether the fault has been eliminated. No hardware timer is needed for this, which reduces the storage space occupied by hardware timer index entries.
  • The number of OAM sessions can be adjusted freely, which increases the flexibility of the system.
  • FIG. 1 is a flowchart of a method for determining fault elimination based on an OAM protocol according to an embodiment of the present invention, where the process includes the following steps:
  • S102: Determine, from each detected OAM session, whether the link is faulty. When the link is determined to be faulty, proceed to step S103; otherwise, return to step S101.
  • The configured MEP can determine whether the link is faulty.
  • The operation of setting the identification information of the OAM session corresponding to the fault may be implemented by a hardware module.
  • S105: When the next session scan time arrives, determine whether the fault has been eliminated according to the identification information corresponding to the fault and the identification information corresponding to the scan count.
  • Identification information corresponding to each kind of link fault is provided in each OAM session, and when a link fault is determined to exist, the identification information corresponding to that fault is set.
  • A session scan period is set for each OAM session, and when the session scan time arrives, the OAM session information is scanned. When the identification information corresponding to a fault of the OAM session is set, the identification information corresponding to the scan count of the OAM session is set and the fault identification information is cleared.
  • For each session, identification information corresponding to each fault and identification information for the session's scan count are provided. At a session scan time, checking the identification information corresponding to the fault and the identification information corresponding to the scan count is enough to determine whether the fault has been eliminated.
  • The present invention therefore does not need a hardware timer to determine whether the fault has been eliminated, which reduces the storage space occupied by hardware timer index entries. Because a corresponding number of timers need not be set for each OAM session, the number of OAM sessions can be adjusted freely, which increases the flexibility of the system.
  • FIG. 2 is a schematic structural diagram of an apparatus for determining fault elimination based on an OAM protocol according to an embodiment of the present invention, where the apparatus includes: a detecting module, a first setting module, a second setting module, and a determining module;
  • the detecting module 21 is configured to determine whether the link is faulty by detecting the OAM session; the first setting module 22 is configured to set the identification information of the fault corresponding to the OAM session when a link fault is detected;
  • the second setting module 23 is configured to, when a session scan time arrives, set the identification information of the scan count corresponding to the OAM session according to the set identification information, and clear the fault identification information; the determining module 24 is configured to, when the next session scan time arrives, determine whether the fault has been eliminated according to the identification information corresponding to the fault and the identification information corresponding to the scan count.
  • The detecting module 21 can use a hardware module to implement the function of detecting an OAM session and determining whether the link is faulty;
  • the first setting module 22 can also use the hardware module to implement the function of setting the identification information of the fault corresponding to the OAM session.
  • The apparatus also includes:
  • a response module 25, configured to report the detected fault event to the alarm response process, and to respond to the fault with an alarm through the alarm response process.
  • The detecting module 21 is further configured so that, when the fault is detected again through the OAM session and the identification information corresponding to the fault is already set, the fault event is no longer reported to the alarm response process.
  • The determining module 24 is specifically configured to: determine that the fault has not been eliminated when the identification information corresponding to the fault is set and the identification information corresponding to the scan count is set; and determine that the fault has been eliminated when the identification information corresponding to the fault is not set and the identification information corresponding to the scan count is set.
  • The apparatus for determining fault elimination based on OAM may be located in the MEP or in another network device.
  • The apparatus uses a combination of software and hardware when determining fault elimination.
  • The specific implementation includes: setting up an alarm response process and a session scanning process in a software module, where the session scanning process scans the OAM sessions once every configured scan period T, that is, the OAM sessions are scanned at each session scan time.
  • For each OAM session, a corresponding piece of identification information is allocated for each link fault that needs to be detected.
  • For example, the identification information allocated for the link cross-connect fault may be xconErr.
  • Identification information for the scan count also needs to be allocated for each OAM session; for example, the identification information of the scan count may be roundNum.
  • FIG. 3 is a schematic flowchart of a method for determining fault elimination based on an OAM protocol according to an embodiment of the present invention, where the process includes the following steps:
  • S301: A hardware module detects the OAM session and determines whether the link is faulty. When the link is determined to be faulty, step S302 is performed; otherwise, step S301 continues.
  • S302: The hardware module sends the detected alarm event to the alarm response process, and sets the identification information of the fault corresponding to the session according to the detected link fault.
  • Specifically, the hardware module sends an alarm notification to the alarm response process.
  • At this point the hardware module has already reported the alarm notification of the link cross-connect fault to the alarm response process, that is, the alarm event has been reported, and the hardware module keeps detecting the OAM session to determine whether a link fault has occurred. Therefore, when the hardware module detects the fault again for the OAM session and the identification information corresponding to the fault is already set, the hardware module no longer reports the fault event to the alarm response process.
  • S303: The software module, for example a CPU, scans the OAM session information according to the configured session scan period.
  • A session scan period is set for each OAM session; the same scan period may also be set for all sessions.
  • When the session scan time arrives, the OAM session information is scanned.
  • S304: The software module sets the identification information of the scan count corresponding to the OAM session according to the set identification information, and clears the fault identification information.
  • Whether the fault has been eliminated is determined from the identification information corresponding to the fault and the identification information corresponding to the scan count as follows: when the identification information corresponding to the fault is set and the identification information corresponding to the scan count is set, the fault has not been eliminated.
  • When the identification information corresponding to the fault is not set and the identification information corresponding to the scan count is set, the fault is determined to have been eliminated.
  • In the specific example that follows, the detected link fault is a link cross-connect fault.
  • The identification information corresponding to the link cross-connect fault of the OAM session is xconErr.
  • The identification information corresponding to the scan count is roundNum.
  • The hardware module is a network processor (NP) in this example.
  • The software module is a CPU in this example.
  • When the NP detects the OAM session and the detected link fault is a link cross-connect fault, the NP sends an alarm notification to the alarm response process, and the alarm response process responds to the alarm.
  • The identification information xconErr of the link cross-connect fault corresponding to the OAM session is set. After that, when the NP detects the link cross-connect fault on the OAM session and the identification information xconErr of the link cross-connect fault corresponding to the OAM session is already set, the NP does not send an alarm notification to the alarm response process.
  • According to the configured session scan period T, the CPU scans the OAM session information when the scan time arrives.
  • When the CPU finds that the identification information xconErr of the link cross-connect fault corresponding to an OAM session is set,
  • it sets the identification information roundNum corresponding to the scan count of that OAM session and attempts to treat the link fault as eliminated, that is, it clears xconErr.
  • When the CPU scans the OAM session again: if the identification information xconErr corresponding to the link cross-connect fault is not set and the identification information roundNum corresponding to the scan count is set, the link cross-connect fault is determined to have been eliminated; if xconErr is set and roundNum is set, the link cross-connect fault is determined not to have been eliminated; if xconErr is not set and roundNum is not set, no operation is performed.
  • The session scan period may be set to at least 3.5 times the largest commonly used OAM packet sending period.
  • For example, the session scan period may be set to 350 ms or more, which does not add excessive load to the CPU. The number of session scans performed by the CPU can also be set flexibly as needed.
  • When detecting an OAM session, after the NP receives an OAM packet it checks the packet content and determines whether a fault has occurred.
  • If no fault is detected, all further processing is skipped, and the NP waits for the next packet to arrive and checks it.
  • If a fault is detected, the software module is notified that the current OAM session has detected a fault.
  • The hardware module then waits for the arrival of the next packet and checks it.
  • When the software module receives the notification sent by the hardware module, it sets the corresponding fault status flag of the OAM session in the hardware, indicating that the OAM session is in the fault state.
  • The CPU scans the OAM session information, checks the fault status of each OAM session, and determines whether the OAM session is in the fault state.
  • If the session is in the fault state, the fault status flag of that OAM session in the hardware table is cleared and the scan count is set to 1; if the session is not in the fault state but the scan count is nonzero, the fault is judged to have been eliminated and the scan count is cleared.
  • The fault elimination criterion can be strengthened by increasing the number of scans. That is, when the session is determined not to be in the fault state and the scan count is nonzero, the scan count is incremented by one; when the scan count reaches the set value, the fault is judged to have been eliminated.
  • In the present invention, the identification information corresponding to the fault and the identification information corresponding to the scan count are checked at each session scan time to determine whether the fault has been eliminated. No hardware timer is needed for this, which reduces the storage space occupied by hardware timer index entries, and because a corresponding number of timers need not be set for each OAM session, the number of OAM sessions can be adjusted freely. In addition, no hardware timers need to be added for other types of link faults, which increases the flexibility of the system.
  • The embodiment of the invention provides a method and an apparatus for determining fault elimination based on the OAM protocol.
  • When the method detects an OAM session and determines that the link is faulty, the identification information of the corresponding fault is set. When the scan time arrives, the identification information of the scan count corresponding to the OAM session is set according to the set identification information, and the fault identification information is cleared. When the next session scan time arrives, whether the fault has been eliminated is determined from the scanned identification information corresponding to the fault and the identification information corresponding to the scan count.
  • In the present invention, the identification information corresponding to the fault and the identification information corresponding to the scan count are checked at each session scan time to determine whether the fault has been eliminated. No hardware timer is needed for this, which reduces the storage space occupied by hardware timer index entries.
  • Since the present invention does not need to set a corresponding number of hardware timers for each OAM session, the number of OAM sessions can be adjusted freely, which increases the flexibility of the system.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Cardiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Environmental & Geological Engineering (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Maintenance And Management Of Digital Transmission (AREA)

Description

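Method and apparatus for determining fault elimination based on the OAM protocol

Technical Field
The present invention relates to the field of data communication technologies, and in particular to a method and apparatus for determining fault elimination based on an Operations, Administration and Maintenance (OAM) protocol.

Background
With the development of carrier-grade Ethernet technology, the importance of carrier-grade OAM technology has become increasingly evident, and the industry has proposed a variety of OAM protocols, for example the Connectivity Fault Management (CFM) protocol, the Transport Multi-Protocol Label Switching (T-MPLS) OAM protocol, and the MPLS Transport Profile (MPLS-TP) OAM protocol.
Based on the OAM protocols above, link faults can be detected by a configured Maintenance association End Point (MEP). The detection process is as follows: the MEP periodically sends and receives detection packets, compares the information carried in each received detection packet with the locally configured MEP information, and, according to the comparison result, detects link faults such as cross-connect faults, MEP ID faults, and period faults. Under the OAM protocol, when a link fault is detected, a timer must be started for that fault. If the fault has not been eliminated within the timer's duration, the timer restarts; if the timer expires, the fault is determined to have been eliminated. The timer duration is determined from the period parameter interval carried in the packet and is generally 3.5 times the interval.
Under the OAM protocol, the detection period is generally at the millisecond level, for example 3.3 ms, 10 ms, or 100 ms, so the timer precision must also reach the millisecond level, and millisecond-level timers can only be implemented in hardware. At the same time, an OAM protocol can detect multiple kinds of faults, typically 3 to 5, and each kind of fault needs its own timer. For example, if a given OAM protocol can detect 4 kinds of faults, each OAM session must be configured with 4 hardware timers. Hardware resources are limited, however, so the number of timers limits the number of OAM sessions. An index must also be allocated in the hardware table for each timer, which increases both the physical storage space required and the implementation difficulty.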
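The resource pressure described in the background can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names and the pool size of 4096 hardware timers are assumptions chosen for the example, not figures from the source.

```python
def timers_required(sessions: int, fault_types: int) -> int:
    """Timer-based scheme: one hardware timer per fault type per OAM session."""
    return sessions * fault_types


def max_sessions(timer_pool: int, fault_types: int) -> int:
    """Sessions that fit when every session reserves one timer per fault type."""
    return timer_pool // fault_types


# Hypothetical device: 4096 hardware timers, 4 detectable fault types.
print(timers_required(1000, 4))   # timers consumed by 1000 sessions
print(max_sessions(4096, 4))      # ceiling on the number of sessions
```

Under these assumed numbers the timer pool, not the protocol, caps the session count, which is the limitation the invention sets out to remove.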
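In practice, an alarm must be reported promptly when a fault is detected, because the fault means that the monitored link has failed, and to prevent service interruption the services on that link need to be switched to a protection link. The alarm for fault elimination, by contrast, does not need to be as timely, because by then the services have already been switched to the protection link or other measures have been taken to keep them running normally. Using millisecond-level hardware timers for this purpose therefore adds load to the system, limits the number of OAM sessions, and occupies a large amount of physical storage.

Summary of the Invention
In view of this, embodiments of the present invention provide a method and apparatus for determining fault elimination based on the OAM protocol, to solve the problems in the prior art that using hardware timers to determine fault elimination limits the number of OAM sessions and occupies a large amount of physical storage.
A method for determining fault elimination based on the OAM protocol provided by an embodiment of the present invention includes:
determining whether a link is faulty by detecting an OAM session;
when a link fault is detected, setting the identification information of the fault corresponding to the OAM session;
when a session scan time arrives, setting, according to the set identification information, the identification information of the scan count corresponding to the OAM session, and clearing the fault identification information; and
when the next session scan time arrives, determining whether the fault has been eliminated according to the identification information corresponding to the fault and the identification information corresponding to the scan count.
An apparatus for determining fault elimination based on the OAM protocol provided by an embodiment of the present invention includes:
a detecting module, configured to determine whether a link is faulty by detecting an OAM session;
a first setting module, configured to set the identification information of the fault corresponding to the OAM session when a link fault is detected;
a second setting module, configured to, when a session scan time arrives, set the identification information of the scan count corresponding to the OAM session according to the set identification information, and clear the fault identification information; and
a determining module, configured to, when the next session scan time arrives, determine whether the fault has been eliminated according to the identification information corresponding to the fault and the identification information corresponding to the scan count.
Embodiments of the present invention provide a method and apparatus for determining fault elimination based on the OAM protocol. When a link fault is determined by detecting an OAM session, the identification information of that fault corresponding to the session is set. When the scan time arrives, the identification information of the scan count corresponding to the OAM session is set according to the set identification information. When the next session scan time arrives, whether the fault has been eliminated is determined from the scanned identification information corresponding to the fault and the identification information corresponding to the scan count. In embodiments of the present invention, the identification information corresponding to the fault and the identification information corresponding to the scan count are checked at each session scan time to determine whether the fault has been eliminated, so no hardware timer is needed for this determination, which reduces the storage space occupied by hardware timer index entries. In addition, because the present invention does not need to set a corresponding number of timers for each OAM session, the number of OAM sessions can be adjusted freely, which improves the flexibility of the system.

Brief Description of the Drawings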
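The drawings described here provide a further understanding of the present invention and form a part of it. The illustrative embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it.
FIG. 1 is a flowchart of a method for determining fault elimination based on the OAM protocol according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of an apparatus for determining fault elimination based on the OAM protocol according to an embodiment of the present invention;
FIG. 3 is a schematic flowchart of a method for determining fault elimination based on the OAM protocol according to an embodiment of the present invention.

Detailed Description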
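To reduce the dependence on hardware timers, reduce the storage space occupied by timer index entries, and improve system flexibility, embodiments of the present invention provide a method and apparatus for determining fault elimination based on the OAM protocol. At each session scan time, the method checks the identification information corresponding to the fault and the identification information corresponding to the scan count to determine whether the fault has been eliminated. No hardware timer is needed for this determination, which reduces the storage space occupied by hardware timer index entries; and because a corresponding number of timers need not be set for each OAM session, the number of OAM sessions can be adjusted freely, which improves system flexibility.
To make the technical problem to be solved, the technical solution, and the beneficial effects of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the present invention and do not limit it.
FIG. 1 is a flowchart of a method for determining fault elimination based on the OAM protocol according to an embodiment of the present invention. The process includes the following steps:
S101: Detect each OAM session.
S102: Determine, from each detected OAM session, whether the link is faulty. If the link is faulty, go to step S103; otherwise, return to step S101.
Here, whether the link is faulty can be determined by the configured MEP.
S103: Set the identification information of the OAM session that corresponds to the fault.
Here, the operation of setting the identification information of the OAM session that corresponds to the fault can be implemented by a hardware module.
S104: When a session scan time arrives, set the identification information of the scan count corresponding to the OAM session according to the set identification information, and clear the fault identification information.
S105: When the next session scan time arrives, determine whether the fault has been eliminated according to the identification information corresponding to the fault and the identification information corresponding to the scan count.
In embodiments of the present invention, identification information corresponding to each kind of link fault is provided in each OAM session, and when a link fault is determined to exist, the identification information corresponding to that fault is set. In addition, a session scan period is set for each OAM session, and when a session scan time arrives, the OAM session information is scanned. If the identification information corresponding to a fault of the OAM session is set, the identification information of the scan count corresponding to the OAM session is set and the fault identification information is cleared. At the next scan time of the session, whether the fault has been eliminated is determined according to the identification information corresponding to the fault and the identification information corresponding to the scan count.
Because, for each session, embodiments of the present invention provide identification information corresponding to each fault and identification information for the session's scan count, checking these two pieces of identification information at a session scan time is enough to determine whether the fault has been eliminated. The present invention therefore does not need a hardware timer to determine whether the fault has been eliminated, which reduces the storage space occupied by hardware timer index entries, and because a corresponding number of timers need not be set for each OAM session, the number of OAM sessions can be adjusted freely, which improves the flexibility of the system.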
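FIG. 2 is a schematic structural diagram of an apparatus for determining fault elimination based on the OAM protocol according to an embodiment of the present invention. The apparatus includes a detecting module, a first setting module, a second setting module, and a determining module, where:
the detecting module 21 is configured to determine whether a link is faulty by detecting an OAM session;
the first setting module 22 is configured to set the identification information of the fault corresponding to the OAM session when a link fault is detected;
the second setting module 23 is configured to, when a session scan time arrives, set the identification information of the scan count corresponding to the OAM session according to the set identification information, and clear the fault identification information;
the determining module 24 is configured to, when the next session scan time arrives, determine whether the fault has been eliminated according to the identification information corresponding to the fault and the identification information corresponding to the scan count.
The detecting module 21 can use a hardware module to implement the function of detecting the OAM session and determining whether the link is faulty;
the first setting module 22 can also use the hardware module to implement the function of setting the identification information of the fault corresponding to the OAM session.
The apparatus further includes:
a response module 25, configured to report the detected fault event to the alarm response process, and to respond to the fault with an alarm through the alarm response process.
After the response module has responded to the fault with an alarm,
the detecting module 21 is further configured so that, when the fault is detected again through the OAM session and the identification information corresponding to the fault is already set, the fault event is no longer reported to the alarm response process.
The determining module 24 is specifically configured to: determine that the fault has not been eliminated when it detects that the identification information corresponding to the fault is set and the identification information corresponding to the scan count is set; and determine that the fault has been eliminated when it detects that the identification information corresponding to the fault is not set and the identification information corresponding to the scan count is set.
Specifically, when embodiments of the present invention determine fault elimination based on OAM, the apparatus for determining fault elimination based on OAM may be located in the MEP or in another network device. The apparatus uses a combination of software and hardware to determine fault elimination. The specific implementation is as follows.
An alarm response process and a session scanning process are set up in a software module. The session scanning process scans the OAM sessions once every configured scan period T, that is, the OAM sessions are scanned at each session scan time. In addition, for each OAM session, embodiments of the present invention allocate a corresponding piece of identification information for each kind of link fault that needs to be detected; for example, the identification information allocated for the link cross-connect fault may be xconErr. To determine fault elimination, embodiments of the present invention also allocate, for each OAM session, a piece of identification information for the scan count; for example, this identification information may be roundNum.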
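The per-session state described above amounts to one bit per monitored fault type plus a single scan marker. The following is a minimal sketch of such a record, not the patent's actual table layout: only `xconErr` and `roundNum` are named in the source, so `MEP_ID_ERR`, `PERIOD_ERR`, `OamSession`, and the helper functions are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from enum import IntFlag


class FaultFlag(IntFlag):
    """One bit per link fault type."""
    NONE = 0
    XCON_ERR = 1      # link cross-connect fault ("xconErr" in the text)
    MEP_ID_ERR = 2    # hypothetical name for the MEP ID fault
    PERIOD_ERR = 4    # hypothetical name for the period fault


@dataclass
class OamSession:
    session_id: int
    fault_flags: FaultFlag = FaultFlag.NONE  # set on detection, cleared by the scan
    round_num: bool = False                  # "roundNum": scan marker for the session


def set_fault(session: OamSession, flag: FaultFlag) -> None:
    session.fault_flags |= flag


def clear_fault(session: OamSession, flag: FaultFlag) -> None:
    session.fault_flags &= ~flag


def has_fault(session: OamSession, flag: FaultFlag) -> bool:
    return bool(session.fault_flags & flag)
```

Compared with one timer index entry per fault type per session, this record needs only a few bits per session, which is the storage saving the text claims.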
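FIG. 3 is a schematic flowchart of a method for determining fault elimination based on the OAM protocol according to an embodiment of the present invention. The process includes the following steps:
S301: A hardware module detects the OAM session and determines whether the link is faulty. When the link is determined to be faulty, go to step S302; otherwise, continue with step S301.
S302: The hardware module sends the detected alarm event to the alarm response process and, according to the detected link fault, sets the identification information of that fault corresponding to the session.
Specifically, the hardware module sends an alarm notification to the alarm response process.
At this point the hardware module has already reported the alarm notification of the link cross-connect fault to the alarm response process, that is, the alarm event has been reported, and the hardware module keeps detecting the OAM session to determine whether a link fault has occurred. Therefore, when the hardware module detects the fault again for the OAM session and the identification information corresponding to the fault is already set, the hardware module no longer reports the fault event to the alarm response process.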
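The alarm suppression rule in step S302 can be sketched as follows. This is an illustration under stated assumptions: `Session`, `on_fault_detected`, and the `report_alarm` callback are hypothetical names, and the real detection runs in a hardware module rather than in Python.

```python
class Session:
    def __init__(self):
        self.xcon_err = False   # "xconErr" fault flag for this OAM session


def on_fault_detected(session, report_alarm):
    """Handle one detection of the cross-connect fault.

    Returns True if an alarm was reported, False if it was suppressed
    because the fault flag was already set.
    """
    if session.xcon_err:
        return False             # flag already set: do not report again
    report_alarm("xconErr")      # first detection: notify the alarm response process
    session.xcon_err = True
    return True


alarms = []
s = Session()
on_fault_detected(s, alarms.append)   # reported
on_fault_detected(s, alarms.append)   # suppressed
print(alarms)
```

Note that once the scanner clears the flag, this sketch would report the same fault again on the next detection. The source does not say how that case is handled, so any extra check (for example, also consulting the scan marker) would be a further assumption.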
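S303: The software module, for example a CPU, scans the OAM session information according to the configured session scan period.
In embodiments of the present invention, a session scan period is set for each OAM session; the same scan period may also be set for all sessions. When a session scan time arrives, the OAM session information is scanned.
S304: The software module sets the identification information of the scan count corresponding to the OAM session according to the set identification information, and clears the fault identification information.
S305: When the next session scan time arrives, determine whether the fault has been eliminated according to the identification information corresponding to the fault and the identification information corresponding to the scan count.
Specifically, whether the fault has been eliminated is determined from the identification information corresponding to the fault and the identification information corresponding to the scan count as follows: when the identification information corresponding to the fault is found to be set and the identification information corresponding to the scan count is set, the fault is determined not to have been eliminated; when the identification information corresponding to the fault is found not to be set and the identification information corresponding to the scan count is set, the fault is determined to have been eliminated.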
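The decision in step S305 depends only on two bits, so it can be written as a small pure function. This is a sketch: the function name and the use of `None` for combinations where the text specifies no decision are assumptions.

```python
from typing import Optional


def fault_eliminated(fault_flag: bool, scan_flag: bool) -> Optional[bool]:
    """Decision taken at a session scan time.

    Returns False when the fault is judged not eliminated, True when it is
    judged eliminated, and None when the scan marker is not set, where the
    text above does not specify a decision.
    """
    if scan_flag and fault_flag:
        return False    # fault flag set again since the last scan: still faulty
    if scan_flag and not fault_flag:
        return True     # a full scan period passed with no new detection
    return None


for fault in (True, False):
    for scanned in (True, False):
        print(fault, scanned, fault_eliminated(fault, scanned))
```

The key point is that the fault flag is cleared at every scan, so finding it set again means the fault was detected during the most recent scan period.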
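A specific example follows. In this example the detected link fault is a link cross-connect fault, the identification information corresponding to the link cross-connect fault of the OAM session is xconErr, the identification information corresponding to the scan count is roundNum, the hardware module is a network processor (NP), and the software module is a CPU.
The NP detects the OAM session. When the detected link fault is a link cross-connect fault, the NP sends an alarm notification to the alarm response process, the alarm response process responds to the alarm, and the identification information xconErr of the link cross-connect fault corresponding to the OAM session is set. After that, when the NP detects the link cross-connect fault on the OAM session and the identification information xconErr corresponding to the OAM session is already set, it no longer sends an alarm notification to the alarm response process.
According to the configured session scan period T, the CPU scans the OAM session information when a scan time arrives. When it finds that the identification information xconErr of the link cross-connect fault corresponding to an OAM session is set, it sets the identification information roundNum corresponding to the scan count of that OAM session and attempts to treat the link fault as eliminated, that is, it clears xconErr.
After one session scan period T, when the next session scan time arrives, the CPU scans the OAM session again. If the identification information xconErr corresponding to the link cross-connect fault is not set and the identification information roundNum corresponding to the scan count is set, the link cross-connect fault is determined to have been eliminated. If xconErr is set and roundNum is set, the link cross-connect fault is determined not to have been eliminated. If xconErr is not set and roundNum is not set, no operation is performed.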
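The NP/CPU example above can be traced with a short simulation of consecutive scans. This is a sketch under assumptions: the session is modeled as a plain dict, the result strings are invented labels, and resetting `roundNum` once elimination is declared is an assumption, since the text does not state when that marker is cleared in this embodiment.

```python
def scan_session(session):
    """One CPU scan of a session dict with boolean keys 'xconErr' and 'roundNum'."""
    if session["xconErr"]:
        # The fault was detected since the previous scan.
        result = "not eliminated" if session["roundNum"] else "pending"
        session["roundNum"] = True
        session["xconErr"] = False    # cleared; the NP sets it again if the fault persists
        return result
    if session["roundNum"]:
        session["roundNum"] = False   # assumption: reset the marker after declaring elimination
        return "eliminated"
    return "no-op"


session = {"xconErr": True, "roundNum": False}   # NP has just detected the fault
timeline = [scan_session(session)]               # first scan after detection
session["xconErr"] = True                        # NP detects the fault again within T
timeline.append(scan_session(session))
timeline.append(scan_session(session))           # no detection during this period
timeline.append(scan_session(session))
print(timeline)
```

The trace shows why no timer is needed: elimination is declared at the first scan where a whole period T has elapsed without the NP setting the flag again.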
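In embodiments of the present invention, the session scan period can be set to at least 3.5 times the largest commonly used OAM packet sending period. For example, the session scan period can be set to 350 ms or more, which does not add excessive load to the CPU. The number of session scans performed by the CPU can also be set flexibly as needed.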
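The 350 ms figure follows from the largest sending period mentioned in the background (100 ms) multiplied by 3.5. A minimal sketch of that calculation, with a hypothetical function name:

```python
def min_scan_period_ms(send_periods_ms, factor=3.5):
    """Lower bound for the session scan period: factor times the largest sending period."""
    return factor * max(send_periods_ms)


# Sending periods mentioned in the background section: 3.3 ms, 10 ms, 100 ms.
print(min_scan_period_ms([3.3, 10, 100]))
```

Using the largest period means a single scan period is long enough for every session, including the slowest one, to have had its detection window.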
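Specifically, when detecting an OAM session in embodiments of the present invention, after the NP receives an OAM packet it checks the packet content to determine whether a fault has occurred. If no fault is detected, all further processing is skipped, and the NP waits for the next packet to arrive and checks it. If a fault is detected, the software module is notified that the current OAM session has detected a fault. The hardware module then waits for the next packet to arrive and checks it.
After the software module receives the notification sent by the hardware module, it sets the corresponding fault status flag of the OAM session in the hardware, indicating that the OAM session is in the fault state.
To determine fault elimination, the CPU scans the OAM session information and checks the fault status of each OAM session to determine whether the session is in the fault state. If the OAM session is in the fault state, the fault status flag of that OAM session in the hardware table is cleared and the scan count is set to 1. If the session is not in the fault state but the scan count is nonzero, the fault is judged to have been eliminated and the scan count is cleared.
Whether the fault has been eliminated is determined from the scan count and the fault status. When the fault status flag is set, the scan count is cleared.
Further, the fault elimination criterion can be strengthened by increasing the number of scans. That is, when the session is determined not to be in the fault state and the scan count is nonzero, the scan count is incremented by 1; when the scan count reaches a set value, the fault is judged to have been eliminated.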
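The strengthened criterion can be sketched as a counter that restarts whenever the fault reappears. The class and function names and the default threshold of 3 are assumptions; the sketch follows the variant in which a scan that finds the fault flag set clears the flag and sets the scan count to 1.

```python
class ScanState:
    def __init__(self):
        self.fault = False   # fault status flag, set when a fault is detected
        self.scans = 0       # scan count; 0 means no fault is being tracked


def scan(state, threshold=3):
    """One scan pass. Returns True when the fault is judged eliminated in this pass."""
    if state.fault:
        state.fault = False      # clear the flag; detection must set it again
        state.scans = 1          # (re)start counting from this scan
        return False
    if state.scans == 0:
        return False             # nothing being tracked
    state.scans += 1             # one more scan without a new detection
    if state.scans >= threshold:
        state.scans = 0          # judged eliminated: stop tracking
        return True
    return False
```

With `threshold=2` this reduces to the basic scheme (eliminated at the first clean scan after a faulty one); larger values require more consecutive clean scan periods, trading a slower fault-cleared report for fewer false clears.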
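Because embodiments of the present invention determine whether a fault has been eliminated by checking, at each session scan time, the identification information corresponding to the fault and the identification information corresponding to the scan count, no hardware timer is needed for this determination, which reduces the storage space occupied by hardware timer index entries; and because a corresponding number of timers need not be set for each OAM session, the number of OAM sessions can be adjusted freely. In addition, no hardware timers need to be added for other types of link faults, which improves the flexibility of the system.
Embodiments of the present invention provide a method and apparatus for determining fault elimination based on the OAM protocol. When the method detects an OAM session and determines that the link is faulty, it sets the identification information of the fault corresponding to the session. When the scan time arrives, it sets the identification information of the scan count corresponding to the OAM session according to the set identification information and clears the fault identification information. When the next session scan time arrives, it determines whether the fault has been eliminated from the scanned identification information corresponding to the fault and the identification information corresponding to the scan count.
In embodiments of the present invention, the identification information corresponding to the fault and the identification information corresponding to the scan count are checked at each session scan time to determine whether the fault has been eliminated, so no hardware timer is needed for this determination, which reduces the storage space occupied by hardware timer index entries. In addition, because the present invention does not need to set a corresponding number of hardware timers for each OAM session, the number of OAM sessions can be adjusted freely, which improves the flexibility of the system.
The above description shows and describes preferred embodiments of the present invention. As stated earlier, however, it should be understood that the present invention is not limited to the forms disclosed here, and these should not be regarded as excluding other embodiments. The invention can be used in various other combinations, modifications, and environments, and can be altered within the scope of the inventive concept described here through the above teachings or the skill or knowledge of the related art. Alterations and changes made by those skilled in the art that do not depart from the spirit and scope of the present invention shall all fall within the protection scope of the appended claims.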

Claims

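1. A method for determining fault elimination based on an Operations, Administration and Maintenance (OAM) protocol, characterized in that the method comprises:
determining whether a link is faulty by detecting an OAM session;
when a link fault is detected, setting the identification information of the fault corresponding to the OAM session;
when a session scan time arrives, setting, according to the set identification information, the identification information of the scan count corresponding to the OAM session, and clearing the fault identification information; and
when the next session scan time arrives, determining whether the fault has been eliminated according to the identification information corresponding to the fault and the identification information corresponding to the scan count.
2. The method according to claim 1, characterized in that, after a link fault is detected, the method further comprises: reporting the detected fault event to an alarm response process, and responding to the fault with an alarm through the alarm response process.
3. The method according to claim 2, characterized in that, after the alarm response to the fault, the method further comprises:
when the fault is detected again through the OAM session and the identification information corresponding to the fault is already set, no longer reporting the fault event to the alarm response process.
4. The method according to claim 1, 2, or 3, characterized in that determining whether the fault has been eliminated according to the identification information corresponding to the fault and the identification information corresponding to the scan count comprises:
when it is detected that the identification information corresponding to the fault is set and the identification information corresponding to the scan count is set, determining that the fault has not been eliminated; and
when it is detected that the identification information corresponding to the fault is not set and the identification information corresponding to the scan count is set, determining that the fault has been eliminated.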
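5. An apparatus for determining fault elimination based on the OAM protocol, characterized in that the apparatus comprises a detecting module, a first setting module, a second setting module, and a determining module, wherein:
the detecting module is configured to determine whether a link is faulty by detecting an OAM session;
the first setting module is configured to set the identification information of the fault corresponding to the OAM session when a link fault is detected;
the second setting module is configured to, when a session scan time arrives, set the identification information of the scan count corresponding to the OAM session according to the set identification information, and clear the fault identification information; and
the determining module is configured to, when the next session scan time arrives, determine whether the fault has been eliminated according to the identification information corresponding to the fault and the identification information corresponding to the scan count.
6. The apparatus according to claim 5, characterized in that the apparatus further comprises:
a response module, configured to report the detected fault event to an alarm response process, and to respond to the fault with an alarm through the alarm response process.
7. The apparatus according to claim 6, characterized in that, after the response module has responded to the fault with an alarm, the detecting module is further configured so that, when the fault is detected again through the OAM session and the identification information corresponding to the fault is already set, the fault event is no longer reported to the alarm response process.
8. The apparatus according to claim 5, 6, or 7, characterized in that the operation by which the determining module determines from the identification information whether the fault has been eliminated is:
when it is detected that the identification information corresponding to the fault is set and the identification information corresponding to the scan count is set, determining that the fault has not been eliminated; and
when it is detected that the identification information corresponding to the fault is not set and the identification information corresponding to the scan count is set, determining that the fault has been eliminated.
PCT/CN2012/073609 2011-04-14 2012-04-06 Method and apparatus for determining fault elimination based on the OAM protocol Ceased WO2012139477A1 (zh)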

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP12771254.5A EP2698948A4 (en) 2011-04-14 2012-04-06 METHOD AND DEVICE FOR DETERMINING FAULT ELIMINATION BASED ON OAM PROTOCOL
BR112013026226-5A BR112013026226A2 (pt) 2011-04-14 2012-04-06 Method and device for determining fault elimination based on OAM protocol
RU2013147733/08A RU2598794C2 (ru) 2011-04-14 2012-04-06 Method and device for determining fault elimination based on the operation, administration and maintenance (OAM) protocol

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201110093908.8A CN102143005B (zh) 2011-04-14 2011-04-14 Method and device for determining fault elimination based on OAM protocol
CN201110093908.8 2011-04-14

Publications (1)

Publication Number Publication Date
WO2012139477A1 (zh) 2012-10-18

Family

ID=44410246

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2012/073609 Ceased WO2012139477A1 (zh) 2011-04-14 2012-04-06 Method and device for determining fault elimination based on OAM protocol

Country Status (5)

Country Link
EP (1) EP2698948A4 (zh)
CN (1) CN102143005B (zh)
BR (1) BR112013026226A2 (zh)
RU (1) RU2598794C2 (zh)
WO (1) WO2012139477A1 (zh)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
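CN102143005B (zh) * 2011-04-14 2015-01-28 中兴通讯股份有限公司 Method and device for determining fault elimination based on OAM protocol
CN103167539B (zh) * 2011-12-13 2015-12-02 华为技术有限公司 Fault handling method, device, and system
CN103095526B (zh) * 2013-01-06 2015-09-23 盛科网络(苏州)有限公司 Scan-based OAM event reporting method and system
CN105187278B (zh) * 2015-09-21 2018-09-04 盛科网络(苏州)有限公司 Chip implementation method for lossless detection of OAM errors
CN106941414A (zh) * 2016-01-04 2017-07-11 中兴通讯股份有限公司 Method and device for synchronizing link protection decision results, and link protection system
US10476763B2 (en) 2017-04-05 2019-11-12 Ciena Corporation Scaling operations, administration, and maintenance sessions in packet networks
CN108880842A (zh) * 2017-05-11 2018-11-23 上海宏时数据系统有限公司 Fault root-cause analysis and localization system and analysis method for an automated operations and maintenance platform
CN110690983B (zh) * 2018-07-05 2022-04-08 中兴通讯股份有限公司 Alarm method, device, and computer-readable storage medium
CN109639802B (zh) * 2018-12-18 2021-11-02 杭州迪普科技股份有限公司 Link statistics management method and device
CN110661705B (zh) * 2019-09-29 2022-06-28 北京物芯科技有限责任公司 Hardware network switching engine, and network fault handling system and method
CN112511382B (zh) * 2020-11-24 2022-03-29 中盈优创资讯科技有限公司 Method and device for creating a Flexible Ethernet (FlexE) channel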

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
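CN101232406A (zh) * 2007-01-26 2008-07-30 华为技术有限公司 OAM fast detection method, device, and system
CN101729282A (zh) * 2008-10-30 2010-06-09 中兴通讯股份有限公司 Method and device for processing board alarms
CN101771583A (zh) * 2009-12-30 2010-07-07 中兴通讯股份有限公司 Method and device for detecting whether a network fault has been eliminated
CN102143005A (zh) * 2011-04-14 2011-08-03 中兴通讯股份有限公司 Method and device for determining fault elimination based on OAM protocol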

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3430074B2 (ja) * 1999-07-05 2003-07-28 日本電気株式会社 Operation and maintenance cell detection device and method
US7855968B2 (en) * 2004-05-10 2010-12-21 Alcatel Lucent Alarm indication and suppression (AIS) mechanism in an ethernet OAM network
US7813263B2 (en) * 2004-06-30 2010-10-12 Conexant Systems, Inc. Method and apparatus providing rapid end-to-end failover in a packet switched communications network
CN101119245B (zh) * 2007-08-31 2010-08-18 杭州华三通信技术有限公司 Method and device for link monitoring using the OAM protocol
US8291267B2 (en) * 2008-04-22 2012-10-16 Honeywell International Inc. System for determining real time network up time
US8243608B2 (en) * 2008-12-30 2012-08-14 Rockstar Bidco, LP Metro Ethernet connectivity fault management acceleration
CN101931549A (zh) * 2009-06-19 2010-12-29 中兴通讯股份有限公司 Method and device for detecting link anomalies based on 802.3ah OAM

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
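CN112187566A (zh) * 2017-09-21 2021-01-05 中国移动通信有限公司研究院 OAM message transmission method, transmission device, and storage medium
CN112187566B (zh) * 2017-09-21 2022-07-29 中国移动通信有限公司研究院 OAM message transmission method, transmission device, and storage medium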

Also Published As

Publication number Publication date
RU2013147733A (ru) 2015-05-20
EP2698948A1 (en) 2014-02-19
CN102143005A (zh) 2011-08-03
BR112013026226A2 (pt) 2020-10-27
RU2598794C2 (ru) 2016-09-27
EP2698948A4 (en) 2015-01-14
CN102143005B (zh) 2015-01-28

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application (Ref document number: 12771254; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
WWE WIPO information: entry into national phase (Ref document number: 2012771254; Country of ref document: EP)
ENP Entry into the national phase (Ref document number: 2013147733; Country of ref document: RU; Kind code of ref document: A)
REG Reference to national code (Ref country code: BR; Ref legal event code: B01A; Ref document number: 112013026226; Country of ref document: BR)
ENP Entry into the national phase (Ref document number: 112013026226; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20131011)