WO2023160699A1 - Board management system, method, apparatus, and device - Google Patents

Board management system, method, apparatus, and device

Info

Publication number
WO2023160699A1
Authority
WO
WIPO (PCT)
Prior art keywords
management
type
information
board
bus
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2023/078408
Other languages
English (en)
French (fr)
Inventor
胡仁劼
牛元君
李琴
居海强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to EP23759320.7A (EP4474996A4)
Publication of WO2023160699A1
Priority to US18/816,341 (US20240419618A1)
Anticipated expiration
Current legal status: Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3031 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a motherboard or an expansion card
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/36 Handling requests for interconnection or transfer for access to common bus or bus system
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2213/00 Indexing scheme relating to interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F2213/40 Bus coupling

Definitions

  • the present application relates to the technical field of servers, and in particular to a board management system, method, device and equipment.
  • Standardization of the components of the enhanced server is required for this.
  • the standardization of the server's components involves all aspects of the server.
  • Out-of-band management of servers refers to the maintenance of servers and other equipment through independent management channels.
  • Out-of-band management of servers allows system administrators to monitor and manage servers remotely.
  • Out-of-band management of the server mainly involves managing and monitoring the working environment, power supply status, and other information of the devices on the server board (such as processors, memory, and hard disks), to ensure that the components of the server work in a suitable working environment.
  • the out-of-band management of the server is usually implemented by a baseboard management controller (BMC).
  • the baseboard management controller needs to be connected to the server single board, so as to be connected to various devices on the server single board.
  • Server boards deployed with different processors have different architectures, and the interfaces through which these boards connect to the baseboard management controller are not uniform. As a result, the baseboard management controller needs a lot of adaptation work to perform out-of-band management of each server board, and the reuse rate of baseboard management controllers across different types of server boards is low.
  • the present application provides a single board management system, method, device and equipment, which are used to provide an out-of-band management BMC and method with higher adaptability.
  • an embodiment of the present application provides a board management system, where the board management system includes a baseboard management controller and a computing device board.
  • the single board management system can be deployed in a computing device, and the computing device can be a server, a personal computer, and the like.
  • the baseboard management controller can be connected to the single board of the computing device through the management bus.
  • The computing device board includes a memory and a device manager, and the memory records the management information of the computing device board.
  • the memory and the device manager can be connected with the baseboard management controller through the management bus.
  • the baseboard management controller can obtain management information from the memory through the management bus, and interact with the device manager to manage the single board of the computing device based on the management information.
  • The connection relationship between the baseboard management controller and the computing device board is simple and adapts to computing device boards with different structures, which effectively simplifies the management of the computing device board.
  • the management method of computing device boards is also more efficient.
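The flow described above (load the management information from the board memory over the management bus, then interact with the device manager based on it) can be sketched as follows. This is a minimal simulation, not the patent's implementation: the `SimulatedManagementBus` class, the length-prefixed JSON memory layout, and the `managed_devices` key are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, field


@dataclass
class SimulatedManagementBus:
    """Stand-in for the management bus (for example I2C or SPI)."""
    memory_image: bytes                                  # contents of the board memory
    manager_state: dict = field(default_factory=dict)    # data held by the device manager

    def read_memory(self, offset: int, length: int) -> bytes:
        return self.memory_image[offset:offset + length]

    def query_manager(self, key: str):
        return self.manager_state[key]


def make_memory_image(info: dict) -> bytes:
    """Hypothetical layout: 2-byte big-endian length, then a JSON payload."""
    payload = json.dumps(info).encode("utf-8")
    return len(payload).to_bytes(2, "big") + payload


def load_management_info(bus: SimulatedManagementBus) -> dict:
    """Load the pre-stored management information from the board memory."""
    length = int.from_bytes(bus.read_memory(0, 2), "big")
    return json.loads(bus.read_memory(2, length).decode("utf-8"))


def manage_board(bus: SimulatedManagementBus) -> dict:
    """Use the loaded information to decide what to ask the device manager."""
    info = load_management_info(bus)
    return {name: bus.query_manager(name) for name in info["managed_devices"]}


info = {"board": "example-board", "managed_devices": ["temp_sensor", "fan"]}
bus = SimulatedManagementBus(make_memory_image(info),
                             {"temp_sensor": 41, "fan": 5200})
print(manage_board(bus))
```

Because the board describes itself through its own memory, the controller-side code stays the same when a different board is attached; only the stored information changes.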
  • the single board of the computing device further includes a first-type device, and the first-type device is connected to a device manager, and the device manager can obtain working information of the first-type device.
  • the baseboard management controller can acquire the working information of the first type of device from the device manager through the management bus.
  • the baseboard management controller can easily obtain the working information of the first type of device through the device manager without being connected to the first type of device.
  • The method of obtaining the working information of the first type of device is simple and efficient. It avoids the large amount of adaptation work that the baseboard management controller would otherwise need in order to match different computing device boards, and it simplifies the out-of-band management process.
  • The single board of the computing device further includes a second type of device. The second type of device need not be connected to the baseboard management controller through the device manager; it may be directly connected to the baseboard management controller through the management bus. The baseboard management controller can directly interact with the second type of device through the management bus and obtain the working information of the second type of device.
  • the management bus can not only connect the memory and the device manager, but also connect the second type of device.
  • This connection method is relatively simple, and the baseboard management controller does not need to perform much adaptation work, which effectively expands the application scenarios.
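The two access paths described above (first-type devices reached through the device manager, second-type devices reached directly on the management bus) amount to a routing decision. A minimal sketch, assuming a hypothetical topology table whose values are the strings `"device_manager"` and `"management_bus"`:

```python
from typing import Callable, Dict


def read_working_info(
    device: str,
    topology: Dict[str, str],
    via_manager: Callable[[str], int],
    direct: Callable[[str], int],
) -> int:
    """Route a read according to how the device is attached (assumed encoding)."""
    attachment = topology[device]
    if attachment == "device_manager":      # first type of device
        return via_manager(device)
    if attachment == "management_bus":      # second type of device
        return direct(device)
    raise ValueError(f"unknown attachment for {device}: {attachment}")


# Hypothetical board: one sensor behind the device manager, one PSU on the bus.
topology = {"temp_sensor": "device_manager", "psu": "management_bus"}
manager_values = {"temp_sensor": 41}
bus_values = {"psu": 12}

for name in topology:
    print(name, read_working_info(name, topology,
                                  manager_values.__getitem__,
                                  bus_values.__getitem__))
```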
  • The management information is the information required by the baseboard management controller to manage the single board of the computing device. That is, the management information can be stored in the memory in advance.
  • the specific content of the management information is not limited in the embodiment of the present application, and all information required for managing the single board of the computing device is applicable to the embodiment of the present application.
  • the management information includes part or all of the following: attribute information of the single board of the computing device, topology information of the single board of the computing device, attribute information of the first type of device, and attribute information of the second type of device.
  • the management information is pre-stored in the memory, and the baseboard management controller only needs to perform a simple loading operation to obtain the management information, and the method of obtaining the management information is simpler.
  • the baseboard management controller may interact with the device manager, and this embodiment of the present application does not limit the interaction manner between the baseboard management controller and the device manager.
  • the baseboard management controller can interact with the device manager based on command words, which can ensure high interaction efficiency. Different computing device boards can be configured with common command words. In this way, the baseboard management controller can be adapted to different computing device single boards, and the adaptability of the baseboard management controller and the management method is improved.
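To illustrate how a set of command words shared across boards could work, the sketch below encodes and decodes a request frame. The opcode values, the three-byte header (command word, device identifier, payload length), and all names are assumptions for illustration; the actual command word structure is the one defined in FIG. 3 of the application, which is not reproduced in this text.

```python
import struct
from enum import IntEnum


class CommandWord(IntEnum):
    """Hypothetical command words common to all boards."""
    READ_INFO = 0x01
    CONTROL = 0x02
    UPGRADE = 0x03


def encode_request(cmd: CommandWord, device_id: int, payload: bytes = b"") -> bytes:
    """Build a frame: command word, device id, payload length, payload."""
    if len(payload) > 255:
        raise ValueError("payload too long for a one-byte length field")
    return struct.pack("BBB", cmd, device_id, len(payload)) + payload


def decode_request(frame: bytes):
    """Parse a frame produced by encode_request."""
    cmd, device_id, length = struct.unpack("BBB", frame[:3])
    payload = frame[3:3 + length]
    if len(payload) != length:
        raise ValueError("truncated frame")
    return CommandWord(cmd), device_id, payload


frame = encode_request(CommandWord.CONTROL, 5, b"\x01")
print(frame.hex(), decode_request(frame))
```

Since the device manager on every board interprets the same command words, the baseboard management controller only needs one encoder regardless of which board it is attached to.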
  • the baseboard management controller may control the first type of device.
  • the baseboard management controller may issue a control command to the device manager to instruct the device manager to control the first type of device.
  • The baseboard management controller can directly control the second type of device: it can issue control commands to the second type of device through the management bus.
  • the baseboard management controller upgrades the first type of device or device manager. For example, the baseboard management controller may transmit the upgrade file of the first type of device to the device manager, indicating to upgrade the first type of device. After obtaining the upgrade file of the device of the first type, the device manager uses the upgrade file of the device of the first type to upgrade the device of the first type. The baseboard management controller may also transmit the upgrade file of the device manager to the device manager, instructing to upgrade the device manager. Of course, the baseboard management controller can also directly upgrade the second type of device through the management bus.
  • The baseboard management controller controls or upgrades devices through the management bus or the device manager, which simplifies the control and upgrade methods and ensures efficient management of the single board of the computing device.
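The upgrade path for a first-type device (the controller transmits the upgrade file to the device manager, which then applies it) can be sketched as a chunked transfer. The chunk size, the CRC-32 integrity check, and the `SimulatedDeviceManager` class are assumptions added for illustration and are not stated in the application.

```python
import zlib


class SimulatedDeviceManager:
    """Stand-in for the device manager receiving an upgrade file."""

    def __init__(self) -> None:
        self._buffer = bytearray()
        self.installed = {}          # target name -> installed image

    def receive_chunk(self, chunk: bytes) -> None:
        self._buffer.extend(chunk)

    def finish_upgrade(self, target: str, expected_crc: int) -> bool:
        image = bytes(self._buffer)
        self._buffer.clear()
        if zlib.crc32(image) != expected_crc:   # reject a corrupted transfer
            return False
        self.installed[target] = image          # "upgrade" the target device
        return True


def send_upgrade(manager: SimulatedDeviceManager, target: str,
                 image: bytes, chunk_size: int = 16) -> bool:
    """Transmit the upgrade file in chunks, then ask the manager to apply it."""
    for start in range(0, len(image), chunk_size):
        manager.receive_chunk(image[start:start + chunk_size])
    return manager.finish_upgrade(target, zlib.crc32(image))


manager = SimulatedDeviceManager()
print(send_upgrade(manager, "fan_controller", bytes(range(50))))
```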
  • The embodiment of the present application does not limit the type of the memory. For example, the memory may be an electrically erasable programmable read-only memory (EEPROM), which is small in size and has a higher degree of integration.
  • the embodiment of the present application does not limit the specific structure of the device manager, and any module capable of implementing device management is applicable to the embodiment of the present application.
  • For example, the device manager is a complex programmable logic device (CPLD) or a microcontroller unit (MCU).
  • the specific structure of the device manager is diverse, applicable to different computing device boards, effectively expanding the application scenarios.
  • the management bus may be an inter-integrated circuit bus, a serial peripheral interface bus, or other types of buses.
  • the type of the management bus is relatively flexible, so that the baseboard management controller can be connected to different types of computing device boards through the management bus, and the degree of adaptation between the baseboard management controller and different types of computing devices is improved.
  • the embodiment of the present application provides a single board management method, which is used to manage the single board of the computing device.
  • the computing device single board includes a memory and a device manager, and the memory records management information of the computing device single board.
  • the baseboard management controller can obtain the management information from the memory through a management bus. After acquiring the management information, the baseboard management controller can interact with the device manager through the management bus based on the management information to manage the single board of the computing device.
  • The computing device board includes a first type of device. The first type of device may be connected to the device manager, and the baseboard management controller may obtain working information of the first type of device from the device manager through the management bus.
  • The computing device board includes a second type of device. The second type of device can be directly connected to the baseboard management controller through the management bus, and the baseboard management controller can obtain the working information of the second type of device through the management bus.
  • the management information includes part or all of the following: attribute information of the single board of the computing device, topology information of the single board of the computing device, attribute information of the first type of device, and attribute information of the second type of device.
  • When the baseboard management controller interacts with the device manager through the management bus, it may do so based on a command word.
  • The baseboard management controller controls the first type of device through the device manager, and may also upgrade the first type of device through the device manager. For example, the baseboard management controller transmits the upgrade file of the first type of device to the device manager, instructing it to upgrade the first type of device. After receiving the upgrade file, the device manager can use it to upgrade the first type of device. The baseboard management controller can also upgrade or control the device manager or the second type of device.
  • the management bus is an I2C bus or an SPI bus.
  • The embodiment of the present application also provides a board management device, which has the function of realizing the behavior in the method example of the second aspect above. For the beneficial effects, refer to the description of the first aspect; details are not repeated here.
  • the functions may be implemented by hardware, or may be implemented by executing corresponding software through hardware.
  • Hardware or software includes one or more modules corresponding to the above-mentioned functions.
  • the structure of the board management device includes a request acquisition unit, a management unit, and optionally an upgrade unit. These units can perform the corresponding functions in the method example of the second aspect above, for details, refer to the detailed description in the method example, and details are not repeated here.
  • The embodiment of the present application also provides a baseboard management controller, which has the function of realizing the behavior in the method example of the second aspect above. For the beneficial effects, refer to the description of the second aspect; details are not repeated here.
  • the structure of the device includes a processor.
  • a memory may also be included.
  • the processor is configured to support the single board management device to execute the corresponding method in the method of the second aspect above.
  • the baseboard management controller may further include a memory.
  • The memory is coupled to the processor and holds the computer program instructions necessary for the baseboard management controller. The processor can execute the corresponding method of the second aspect above by invoking the computer program instructions.
  • the embodiment of the present application further provides a computing device, the computing device includes a baseboard management controller and a computing device single board, and the computing device single board may include devices such as a processor and a memory.
  • the baseboard management controller has the function of implementing the behavior in the method example of the second aspect above, and the beneficial effects can be referred to the description of the first aspect, which will not be repeated here.
  • The present application also provides a computer-readable storage medium in which instructions are stored. When the instructions are run on a computer, the computer executes the method described in the above second aspect and each possible implementation manner of the second aspect.
  • the present application further provides a computer program product including instructions, which, when run on a computer, cause the computer to execute the method described in the above first aspect and each possible implementation manner of the first aspect.
  • The present application also provides a computer chip. The chip is connected to a memory and is used to read and execute the software program stored in the memory, so as to implement the method described in the above second aspect and each possible implementation manner of the second aspect.
  • FIG. 1 is a schematic diagram of the architecture of a single board management system provided by the present application.
  • FIG. 2 is a schematic structural diagram of another board management system provided by the present application.
  • FIG. 3 is a schematic structural diagram of a command word provided by the present application.
  • FIG. 4A is a schematic structural diagram of a read request provided by the present application.
  • FIG. 4B is a schematic structural diagram of a read response provided by the present application.
  • FIG. 4C is a schematic structural diagram of a write request provided by the present application.
  • FIGS. 5A to 5B are schematic structural diagrams of an expansion board provided by the present application.
  • FIGS. 6A to 6C are schematic diagrams of the architecture of a board management system provided by the present application.
  • FIG. 7 is a schematic diagram of a board management method provided by the present application.
  • FIG. 8 is a schematic structural diagram of a BCU management system provided by the present application.
  • FIG. 9 is a schematic structural diagram of a board management device provided by the present application.
  • FIG. 10 is a schematic structural diagram of a computing device provided by the present application.
  • the technical threshold for the development of traditional server motherboards is high.
  • CPU central processing unit
  • it also includes functions such as bus fan-out, power supply fan-out, and maintenance management.
  • the CPU-related circuits on these motherboards are provided by CPU manufacturers.
  • the reference design provided by different CPU manufacturers is completely different, which makes the development and design of the motherboard require a lot of resources and time.
  • Complete machine manufacturers need to invest more energy in differentiated innovation, but often focus only on competing over low-level hardware specifications. This fails to meet customers' diverse scenarios and computing power needs, and it also forces machine manufacturers into inefficient homogeneous competition.
  • TTM time to market
  • TCO total Cost of operation
  • This application proposes an innovative peer-to-peer interconnection architecture (also called a new server architecture or a new architecture).
  • the traditional motherboard is first split into a basic board (Basic Computing Unit, BCU) and an extension board (Extension Unit, EXU). and form support.
  • the same computing device may include one base board and one expansion board, the same computing device may also include multiple base boards and one expansion board, and the same computing device may also include one base board and multiple expansion boards.
  • The basic board includes a CPU, double data rate (DDR) memory, and related power supplies, and provides general-purpose computing capabilities as well as expansion interfaces for peripheral storage, input/output (IO), acceleration, and the like. The basic board supports different series of CPUs.
  • The base board supports heterogeneous processors, that is, the base board can support different types of processors. For example, the base board supports any one of, or any combination of, a CPU, an application-specific integrated circuit (ASIC), a programmable logic device (PLD), a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), a system on chip (SoC), a software-defined infrastructure (SDI) chip, an artificial intelligence (AI) chip, and other processors.
  • the embodiment of the present application provides at least 6 different forms of basic boards, respectively targeting different computing performance and memory configurations.
  • these six basic boards are called A1, A2, B1, B2, C1, and C2 respectively.
  • P represents the number of processors and is an integer greater than 0. DPC stands for DIMM Per Channel, that is, the number of dual in-line memory modules (DIMMs) per memory channel.
  • The A1 form basic board supports one processor with one DIMM per channel (abbreviated as 1P1DPC).
  • The A2 form basic board supports one processor with one or two DIMMs per channel (abbreviated as 1P2DPC or 1P1DPC).
  • The B1 form basic board supports two processors with one DIMM per channel (abbreviated as 2P1DPC), or one processor with one or two DIMMs per channel (abbreviated as 1P2DPC or 1P1DPC).
  • The B2 form basic board supports two processors with one or two DIMMs per channel (abbreviated as 2P2DPC or 2P1DPC), or one processor with one or two DIMMs per channel (abbreviated as 1P2DPC or 1P1DPC).
  • The C1 form basic board supports four processors with one DIMM per channel (abbreviated as 4P1DPC), or two processors with one or two DIMMs per channel (abbreviated as 2P2DPC or 2P1DPC).
  • The C2 form basic board supports four processors with one or two DIMMs per channel (abbreviated as 4P2DPC or 4P1DPC), or two processors with one or two DIMMs per channel (abbreviated as 2P2DPC or
  • With the current 8 DDR channels per CPU, the B2 form basic board supports 2P2DPC (2P32DIMM). After the number of memory channels per CPU is increased to 12, 2P2DPC (2P48DIMM) cannot be realized in the B2 form. The B2 form can then support 2P1DPC (2P24DIMM), and 2P2DPC (2P48DIMM) can be realized with another form such as C1. Because the mounting-hole positions and the base board dimensions are standardized, the board can be replaced and installed directly.
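The DIMM counts quoted above follow from multiplying the number of processors by the memory channels per processor and by the DIMMs per channel. A small worked check (the function names are illustrative only):

```python
def total_dimms(processors: int, channels_per_cpu: int, dimms_per_channel: int) -> int:
    """Total DIMM slots = processors x channels per CPU x DIMMs per channel."""
    if min(processors, channels_per_cpu, dimms_per_channel) < 1:
        raise ValueError("all arguments must be positive integers")
    return processors * channels_per_cpu * dimms_per_channel


def config_label(processors: int, channels_per_cpu: int, dimms_per_channel: int) -> str:
    """Format a configuration in the xPyDPC (xPnDIMM) notation used in the text."""
    total = total_dimms(processors, channels_per_cpu, dimms_per_channel)
    return f"{processors}P{dimms_per_channel}DPC ({processors}P{total}DIMM)"


# Two processors with 8 channels at 2 DPC, then 12 channels at 2 DPC and 1 DPC.
for channels, dpc in [(8, 2), (12, 2), (12, 1)]:
    print(config_label(2, channels, dpc))
```

This reproduces 2P32DIMM for 8 channels at 2DPC, and 2P48DIMM and 2P24DIMM for 12 channels at 2DPC and 1DPC respectively.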
  • the expansion board includes a Baseboard Management Controller (BMC) chip, which is a management extension of the base board. As the management center of the entire system, it provides management functions such as equipment, security, energy efficiency, and reliability. Wherein, the BMC may also be referred to as a baseboard management controller.
  • the expansion board may also include a management system and a bridge (for example, a platform controller hub (platform controller hub, PCH) of an Intel system).
  • The basic board communicates with components through high-speed buses such as Peripheral Component Interconnect Express (PCIe), Compute Express Link (CXL), or a unified bus (UB or Ubus), and connects to expansion boards through management interfaces.
  • the specific connection methods of the above-mentioned base board and components, as well as the base board and the expansion board include: the soft connection method of realizing the above-mentioned connection by cables, or the hard connection method of realizing the above-mentioned connection by connectors.
  • a component is a general term for a class of electronic devices or electronic equipment.
  • Components are divided by function into storage components (STorage Unit, STU), IO components (Input Output Unit, IOU), acceleration components (ACceleration Unit, ACU), memory expansion components (Memory Expansion Unit, MEU), cooling components, computing components, management components, and so on.
  • the basic board supports Kunpeng, and other different series of CPUs, and the expansion board provides management functions and power supply for the basic board and various expansion components. With the support of the expansion board, the power supply and heat sink can have various options.
  • The storage components include a hard disk backplane, an expansion board (Expander), a PCIe switch, and the like, and are used for system storage expansion. They support media and forms such as mechanical hard disk drives (HDD), solid-state drives (SSD), Non-Volatile Memory Express (NVMe), and storage class memory (SCM).
  • The IO components include risers and other components that expand system IO, and support PCIe standard cards and Open Compute Project (OCP) cards.
  • the acceleration components include risers, carrier boards, accelerator card interconnection switches (switches), etc., providing system acceleration component expansion and interconnection functions.
  • Memory expansion components include carrier boards, memory expansion chips, dual in-line memory modules (DIMMs), SCM media, etc., providing the system with the function of expanding memory bandwidth and memory capacity.
  • The heat dissipation component is used to dissipate heat from the computing device or the hardware in the computing device, by air cooling, liquid cooling, or a combination of the two. It should be understood that the structure, type and quantity of the heat dissipation components do not constitute a limitation on the technical solution to be protected in this application.
  • The computing component includes devices that provide general computing capabilities, such as a central processing unit (CPU) and memory.
  • a base board including devices such as a processor, a memory, and a baseboard management controller, or an expansion board may also serve as a type of component.
  • the socket (Socket) of the processor for example, CPU
  • The main board provided by this application can expose external interfaces in a standardized way and use soft connections such as cables for various external expansions, which shields processor-related power supply, differences between different processors and components, and the interconnection between components.
  • the changes of memory and other components are only included in the motherboard, and the function of cross-generation compatibility of the motherboard is realized.
  • the supporting complete machine and components do not need to be replaced, so the supporting components have a longer life cycle.
  • the latest components can be replaced at any time without changing the chassis or increasing the workload of hardware development, and the fastest use of the latest computing power in the industry.
  • the upgrade of the processor or the replacement of different processor manufacturers only needs to simply replace the basic board, which subverts the original development model and derives a new industrial model.
  • the new server architecture in order to support diverse computing power and diverse devices, also realizes hardware standardization, including standardization of basic boards and standardization of component interfaces.
  • the standardization of the base board includes the standardization of size, installation hole position, interface electrical characteristics, management interface protocol and parameters, etc.
  • Table 1 is an example of a basic board interface description table provided in this application.
  • The power supply adopts a unified 12V input, which is converted inside the basic board by DC/DC conversion into the various types of power required.
  • this embodiment defines a Flexible I/O interface based on the UBC and UBCDD connectors to replace the original PCIe interface.
  • The Flexible I/O interface can be flexibly configured as a PCIe/HCCS/SAS/SATA/Ethernet interface according to requirements.
  • the BCU management interface mainly includes common low-speed maintenance interfaces, such as I2C, UART, JTAG and other interfaces, which are compatible with the management of common processor platforms.
  • Components include expansion boards, power supply components, cooling components, storage components, IO components, acceleration components, memory components, and so on. Standardizing the electrical interfaces, management interfaces, and parameters of the components, without defining or constraining their physical size, installation, or location, leaves broad room for innovation and supports differentiation and flexible expansion.
  • In addition to the power supply and high-speed signal external interfaces of the components, other low-speed management interfaces are defined as shown in Table 2 below:
  • the content of the above Tables 1 to 2 is only an example provided to assist in explaining the technical solution of the present application.
  • the new architecture of the server, the interface of the basic board, and the low-speed interface of the functional components may include more or less content.
  • This application also provides intelligent management software, which implements management object templates according to the standardization requirements of computing equipment. After the server is powered on, the management software automatically detects the components through the standard management bus and obtains their self-description information, and then creates management object instances according to the management object templates. This makes the management software self-adaptive and supports automatic discovery and automatic adaptation of components.
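The template-driven discovery described above can be sketched as a lookup from each component's self-description to a management object template. The template contents, the self-description fields (`type`, `slot`), and the `ManagedObject` class are hypothetical; the application does not specify their format in this text.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ManagedObject:
    kind: str
    slot: int
    monitored: Tuple[str, ...]   # properties the management software will poll


# Hypothetical management object templates, keyed by component type.
TEMPLATES: Dict[str, Tuple[str, ...]] = {
    "fan": ("speed_rpm",),
    "psu": ("input_voltage", "output_power"),
}


def create_instances(self_descriptions: List[dict]) -> List[ManagedObject]:
    """Create one management object per discovered component with a known template."""
    instances = []
    for desc in self_descriptions:
        template = TEMPLATES.get(desc["type"])
        if template is None:
            continue             # no template for this type: skip the component
        instances.append(ManagedObject(desc["type"], desc["slot"], template))
    return instances


# Self-descriptions as they might be read over the management bus after power-on.
discovered = [{"type": "fan", "slot": 1}, {"type": "psu", "slot": 0},
              {"type": "unknown_card", "slot": 3}]
for obj in create_instances(discovered):
    print(obj)
```

Supporting a new component type then means adding a template rather than changing the discovery code.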
  • Because a server carries a large volume of services and performs a large amount of data computation, more components need to be deployed in it, such as a large number of processors, large memory, and more hard drives.
  • The working status of the main components of the server determines the running status of the server.
  • The server is also equipped with temperature sensors (to measure the temperature of a device), voltage sensors (to measure the operating voltage of a device), different types of power supplies (to provide different voltages), fans (to cool devices), and other devices.
  • An important part of the out-of-band management of the server is monitoring and managing the working environment of the main components in the server, to ensure that they work in a suitable environment: for example, that the temperature is within the operating temperature range of the device, the voltage conforms to the working voltage of the device, the power supplies of the different models are normal, and the fans are running normally.
  • Embodiments of the present application provide a board management system, method, apparatus, and device.
  • The baseboard management controller can be connected to the server board through a management bus over a unified interface.
  • The baseboard management controller can acquire the management information required for managing the server board from the memory deployed on the server board, and can also interact with the device manager on the server board through the management bus to manage the server board.
  • Through the device manager, the baseboard management controller can acquire the working information of the devices connected to the device manager on the server board, so as to realize out-of-band management. In this manner, the baseboard management controller can implement out-of-band management for different server boards without a large amount of adaptation work, which simplifies the whole out-of-band management process.
  • FIG. 1 is a schematic structural diagram of a board management system provided in an embodiment of the present application. The board management system can be deployed in a server, and includes a server board 100 and a baseboard management controller 200.
  • The server board 100 may be a basic board in the new architecture described above, or any component. It can also be a motherboard in a traditional server.
  • The embodiment of the present application does not limit the number of server boards 100, which may be one or more.
  • When the board management system includes multiple server boards 100, they may be of the same type (for example, all basic boards) or of different types (for example, a basic board, an IO component, and a storage component).
  • the baseboard management controller 200 is connected to the server board 100 through a management bus 300 .
  • The management bus 300 can be an inter-integrated circuit (I2C) bus or a serial peripheral interface (SPI) bus.
  • the management bus 300 can also be other types of buses.
  • The management bus 300 can be understood as the root management bus for board management, and the root management bus can serve as a root management link.
  • the baseboard management controller 200 can acquire management information and working information of devices on the server board 100 through the root management link, so as to realize management of the server board 100 .
  • Server components are deployed on the server single board 100.
  • The components deployed on the server board 100 include but are not limited to: processor, memory, temperature sensor, analog-to-digital converter (ADC), power interface, peripheral component interconnect express (PCIe) slot, hard disk interface, fan, power supply, and so on.
  • Different types of server boards 100 may have different types and quantities of components deployed on the server boards 100 .
  • The embodiment of the present application does not limit the way components are deployed on the server board 100.
  • For example, the components of the server can be soldered directly onto the server board 100; for another example, the components can be connected to the server board 100 through interfaces (such as high-speed interfaces).
  • Some components, such as IO components and storage components, may be connected to the basic board through interfaces. In this case, these components may also be considered devices deployed on the server board 100.
  • a device manager 120 and a memory 110 are further deployed on the server board 100 .
  • the baseboard management controller 200 is respectively connected to the device manager 120 and the memory 110 through the management bus 300 .
  • the device manager 120 may be connected to some or all of the devices on the server board 100 .
  • The devices on the server board 100 fall into two types. Devices that connect to the baseboard management controller 200 through the device manager 120 are called first-type devices for convenience of description.
  • Devices that connect directly to the baseboard management controller 200 through the management bus 300 are called second-type devices.
  • The devices involved in out-of-band management may all be first-type devices, that is, the devices on the server board 100 are all connected to the device manager 120.
  • FIG. 1 is drawn for the example in which all devices on the server board 100 are first-type devices.
  • The devices involved in out-of-band management may also include both first-type and second-type devices.
  • A second-type device is not connected to the device manager 120, but can be connected to the baseboard management controller 200 through the management bus 300.
  • For how this type of server board 100 connects to the baseboard management controller 200, see the description of FIG. 2 below.
  • The device manager 120 may interact with the first-type devices to acquire their working information. For example, the device manager 120 can obtain the temperature from a temperature sensor, the voltage value from a voltage sensor, whether a power supply is connected to the power interface, the voltage provided by the power supply (obtained through the ADC, which converts analog signals such as voltage into digital signals), whether a PCIe interface component (such as an accelerator card) is inserted into the PCIe slot, whether a hard disk is connected to the hard disk interface, and whether the fan is running.
  • the memory 110 stores management information of the server board 100 , which is necessary information for the baseboard management controller 200 to implement out-of-band management. Related descriptions about the management information of the server board 100 will be described below.
  • The baseboard management controller 200 is connected to the device manager 120 and the memory 110 through the management bus 300. It can obtain the management information of the server board 100 from the memory 110 and thereby learn the attributes of the server board 100, the attributes of its devices, the topology information of the server board 100, and so on.
  • The baseboard management controller 200 may also acquire the working information of the first-type devices through interaction with the device manager 120.
  • The baseboard management controller 200 then manages the server board 100 based on the management information (and the working information of the first-type devices).
  • FIG. 2 is a schematic structural diagram of another board management system provided by an embodiment of the present application. The board management system can be deployed in a server, and includes a server board 100 and a baseboard management controller 200.
  • the BMC 200 and the server board 100 need only be connected through one management bus 300 , and the BMC 200 is connected to the device manager 120 , the memory 110 , and the second type of devices through one management bus 300 .
  • the device manager 120 is connected to the first type of devices on the server board 100 .
  • For the device manager 120, the memory 110, the management bus 300, the first-type devices, and the second-type devices, refer to the foregoing content. The difference from the board management system shown in FIG. 1 is that, in the system shown in FIG. 2, some devices on the server board 100 may also be connected directly to the baseboard management controller 200 through the management bus 300 and be managed directly by the baseboard management controller 200.
  • the memory 110 is used to store the management information of the server single board 100 required for out-of-band management.
  • The embodiment of the present application does not limit the type of the memory 110. For example, the memory 110 can be an electrically erasable programmable read-only memory (EEPROM).
  • The memory 110 is used as a field replaceable unit description (FRUD), and the management information required for managing the server board 100 is stored in the FRUD.
  • the management information includes attribute information of the server board 100, information of devices to be managed, topology information, and alarm information.
  • the management information includes attribute information of the server board 100, topology information of the server board 100, and attribute information of devices.
  • The attribute information of the server board 100 describes the hardware information of the server board 100. It includes but is not limited to: the board type, the board identification (ID), the version number of the board's printed circuit board (PCB), and the version number of the board's bill of materials (BOM).
  • After acquiring the attribute information of the server board 100, the baseboard management controller 200 knows the basic information of the server board 100.
  • the topology information of the server board 100 describes the connection relationship of devices on the server board 100, and the topology information of the server board 100 may include an in-band management topology and a management bus topology.
  • the in-band management topology can also be called the service bus topology.
  • The in-band management topology describes the topology information of the service plane of the server board 100, that is, the connection relationships between the devices in the server board 100 that carry server services (processors, hard disks, and memory). This includes but is not limited to: the connection relationships of devices on the basic board, between the basic board and components, and between components.
  • the in-band management topology includes but is not limited to: component signals, processor information (such as port number, type, quantity, bit width, etc.), memory information (such as port number, type, quantity, bit width, etc.), hard disk information (such as hard disk interface, type, quantity, bit width, etc.), and the connection mode between processors, memory, and hard disks, etc. All information related to devices on the service plane can be recorded in the in-band management topology.
  • the management bus topology may also be called an out-of-band management topology, and the management bus topology describes the topology information of devices involved in the out-of-band management of the server board 100 . That is, the connection relationship between devices (temperature sensor, voltage sensor, ADC, power supply, fan) involved in out-of-band management in the server single board 100 .
  • The management bus topology includes, but is not limited to: information about devices connected to the management bus 300 (for example, the device manager 120 or second-type devices), and information about devices connected to the device manager 120 (that is, first-type devices). All device information related to out-of-band management can be recorded in the management bus topology.
  • By obtaining the topology information of the server board 100, the baseboard management controller 200 learns the connection relationships of the devices on the server board 100. Based on this topology information, it can determine which device's working environment is described by the working information (such as temperature, voltage, or whether the power supply is working) that the device manager 120 subsequently reads from first-type devices, and by the working information read directly from second-type devices. It can then determine whether that device's working environment meets the requirements, whether the device is faulty, and whether an alarm is required.
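  • As a rough illustration of how topology information lets a raw reading be attributed to a device and checked, consider the sketch below. The channel numbers, device names, and thresholds are all hypothetical; the application does not specify how the topology or the limits are encoded.

```python
# Hypothetical management bus topology: device manager channel -> device and limits.
TOPOLOGY = {
    0: {"device": "cpu0_temp_sensor", "unit": "C", "min": 0.0, "max": 85.0},
    1: {"device": "p12v_adc", "unit": "V", "min": 11.4, "max": 12.6},
}


def evaluate(channel, value):
    """Attribute a reading to a device via the topology and decide whether to alarm."""
    entry = TOPOLOGY[channel]
    within_range = entry["min"] <= value <= entry["max"]
    return {"device": entry["device"], "value": value, "alarm": not within_range}


hot = evaluate(0, 92.5)      # temperature above the assumed 85 C limit
nominal = evaluate(1, 12.0)  # voltage inside the assumed window
```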
  • The attribute information of devices includes the attribute information of first-type devices and the attribute information of second-type devices.
  • devices include chips (such as processor chips, etc.), connectors, buses, and slots (slots refer to slots where input/output devices are inserted, such as PCIe slots, hard disk slots, etc.).
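  • The categories of management information listed above could be modeled along the following lines. This is a sketch only: the field names, the flat dictionary used for each topology, and the `first_type` flag are assumptions made for illustration, not the storage format of the FRUD.

```python
from dataclasses import dataclass, field


@dataclass
class BoardAttributes:
    board_type: str
    board_id: int
    pcb_version: str
    bom_version: str


@dataclass
class DeviceAttributes:
    name: str
    kind: str         # for example "chip", "connector", "bus", or "slot"
    first_type: bool  # True if the device is reached through the device manager


@dataclass
class ManagementInfo:
    board: BoardAttributes
    in_band_topology: dict = field(default_factory=dict)
    management_bus_topology: dict = field(default_factory=dict)
    devices: list = field(default_factory=list)

    def second_type_devices(self):
        """Devices attached directly to the management bus."""
        return [d for d in self.devices if not d.first_type]


info = ManagementInfo(
    board=BoardAttributes("basic_board", 1, "A", "001"),
    devices=[
        DeviceAttributes("temp0", "chip", first_type=True),
        DeviceAttributes("vrd0", "chip", first_type=False),
    ],
)
```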
  • The information included in the management information can be found in Table 3. It should be noted that the above description and Table 3 show only part of the management information. The embodiment of the present application does not limit how the management information is divided or what it contains; all information required for out-of-band management can be stored in the memory 110 as management information.
  • The management information required for out-of-band management is stored in the memory 110, and the address of the memory 110 may be a preset address.
  • When the baseboard management controller 200 is connected to the memory 110 through the management bus 300, it can interact with the memory 110 at that address and read the management information from the memory 110, so as to implement subsequent out-of-band management.
  • In this way, the baseboard management controller 200 can obtain the management information simply and quickly, which simplifies the out-of-band management process.
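  • The fixed-address read described above can be illustrated with a simulated bus. Everything concrete here is an assumption: the address value `0x50`, the JSON encoding of the stored data, and the `FakeManagementBus` class are hypothetical stand-ins, since the application states neither an address nor a storage format.

```python
import json

FRUD_ADDRESS = 0x50  # assumed preset address; the application gives no value


class FakeManagementBus:
    """In-memory stand-in for a management bus: maps device address to stored bytes."""

    def __init__(self, devices):
        self._devices = devices

    def read(self, address):
        return self._devices[address]


def load_management_info(bus, address=FRUD_ADDRESS):
    """Read the memory at its preset address and decode the management information."""
    raw = bus.read(address)
    return json.loads(raw.decode("utf-8"))  # JSON is an assumed encoding


stored = json.dumps({"board_type": "basic_board", "board_id": 1}).encode("utf-8")
bus = FakeManagementBus({FRUD_ADDRESS: stored})
management_info = load_management_info(bus)
```

  • Because the address is known in advance, the controller needs no board-specific discovery step before it can read the description.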
  • The out-of-band management interface of the server board 100 is unified as a root management bus. A memory 110 with a fixed address (such as an EEPROM) can be attached to the root management link as the FRUD, and the management information of the server board 100 is described in the FRUD. The baseboard management controller 200 can automatically load the management configuration of the board by reading the information in the FRUD.
  • The device manager 120 can also be called a satellite management center (SMC), and manages the first-type devices connected to it.
  • Working information can be reported using command words, with one type of working information corresponding to one command word.
  • The SMC serves as the board-level management center on the server board 100. It collects the working information of the first-type devices on the board, such as sensor information and alarm information, and handles the upgrade requirements of the server board 100 and the management requirements of other devices on the board.
  • The SMC communicates with the baseboard management controller 200 through the root management bus interface using command words.
  • The baseboard management controller 200 does not need to connect to each device that requires out-of-band management. Instead, it obtains the working information of these devices through the device manager 120, and then determines the working environment of the main devices in the server.
  • The baseboard management controller 200 only needs to be connected to the device manager 120, which greatly simplifies the connection between the baseboard management controller 200 and the server board 100 and makes the out-of-band management of the server board 100 more intelligent.
  • This connection mode is also suitable for different server boards 100.
  • The device manager 120 may be a complex programmable logic device (CPLD) or a microcontroller unit (MCU). After the device manager 120 collects the working information of each device connected to it, it can report the collected information to the BMC 200 through the management bus 300.
  • the embodiment of the present application does not limit the interaction manner between the device manager 120 and the baseboard management controller 200 .
  • the device manager 120 and the baseboard management controller 200 may interact in a command word manner.
  • One type of work information corresponds to one command word.
  • the format of the command word can be shared by different server boards 100, so that the baseboard management controller 200 can interact with the device managers 120 on different server boards 100 in the same way, reducing unnecessary adaptation work.
  • The command word format defined between the device manager 120 and the BMC 200 mainly includes two parts: an operation code (OP code) and a device parameter.
  • The embodiment of the present application does not limit the specific size of the command word.
  • For example, the command word can occupy 4 bytes (that is, 32 bits), of which the device parameter can occupy 1 byte and the operation code can occupy 3 bytes.
  • The operation code describes the operation to be performed on the device.
  • the operation may include reading the working information of the device and sending commands to the device (sending commands to the device can be understood as writing information to the device).
  • Device parameters are used to indicate the device that needs to be operated.
  • the device parameter can be the serial number or identification of the device.
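  • A 4-byte command word made of a 3-byte operation code and a 1-byte device parameter might be packed as in the sketch below. The byte order (operation code in the three high-order bytes, big-endian on the wire) is an assumption; the application only gives the field sizes.

```python
def pack_command_word(opcode, device_param):
    """Pack a 24-bit operation code and an 8-bit device parameter into 4 bytes."""
    if not 0 <= opcode < (1 << 24):
        raise ValueError("operation code must fit in 3 bytes")
    if not 0 <= device_param < (1 << 8):
        raise ValueError("device parameter must fit in 1 byte")
    # Assumed layout: operation code first, device parameter in the last byte.
    return ((opcode << 8) | device_param).to_bytes(4, "big")


def unpack_command_word(word):
    """Split a 4-byte command word back into (operation code, device parameter)."""
    if len(word) != 4:
        raise ValueError("a command word is exactly 4 bytes")
    value = int.from_bytes(word, "big")
    return value >> 8, value & 0xFF


word = pack_command_word(0x0C0042, 0x07)
decoded = unpack_command_word(word)
```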
  • The operation code consists of four fields: the function field, the command field, the read count field (represented by MS in FIG. 3), and the read/write identification field (represented by RW in FIG. 3).
  • The function field indicates the server board 100 targeted by the command word. When there are multiple boards in the board management system, the function field cannot be omitted. When there is only one board in the board management system, the function field can be set to a default value or left empty. The function field may occupy 6 bits.
  • For example, in the function field, 1 may indicate an expansion component (a component used to add interfaces or slots in the server).
  • 2 indicates a storage component (a component used to connect hard disks and implement the data storage function in the server).
  • 3 indicates the basic board.
  • 4 indicates a memory expansion component (a component that undertakes the memory function in the server). 0 represents a common command, that is, the command word is for all server boards 100.
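  • The example function field values above map naturally onto an enumeration. The sketch below uses the five values given in the text; the enum and helper names are hypothetical.

```python
from enum import IntEnum


class BoardFunction(IntEnum):
    COMMON = 0            # command word applies to all server boards
    EXPANSION = 1
    STORAGE = 2
    BASIC_BOARD = 3
    MEMORY_EXPANSION = 4


def addresses_board(function_value, board):
    """Return True if a command word with this function value targets the given board."""
    function = BoardFunction(function_value)
    return function == BoardFunction.COMMON or function == board


common_hits_storage = addresses_board(0, BoardFunction.STORAGE)
storage_hits_basic = addresses_board(2, BoardFunction.BASIC_BOARD)
```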
  • the command field is used to describe the type of operation, such as indicating which kind of working information to read (such as temperature, voltage, whether the power supply is normal, fault or alarm, etc.).
  • the command field needs to be defined in advance to distinguish different operations.
  • the command field may occupy 16 bits.
  • The read count field distinguishes whether the operation is a multiple read or a single read, that is, whether the working information of multiple devices or of one device is read at one time. For example, 0 represents a multiple read and 1 represents a single read.
  • The read count field may occupy 1 bit.
  • the read-write identification field is used to distinguish whether this operation is a read operation or a write operation. For example, when the field is 0, it means that this operation is a read operation, and when it is 1, it means a write operation.
  • the read-write identification field can occupy 1 bit.
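  • With the example widths given above (function 6 bits, command 16 bits, read count 1 bit, read/write 1 bit), the four fields add up to the 24 bits of a 3-byte operation code. The sketch below packs them under an assumed bit order, with the function field in the highest bits and the read/write bit lowest; FIG. 3 defines the actual order, which is not reproduced in this text.

```python
FUNCTION_BITS = 6
COMMAND_BITS = 16


def pack_opcode(function, command, single_read, write):
    """Pack the four fields into a 24-bit operation code (assumed bit order)."""
    if not 0 <= function < (1 << FUNCTION_BITS):
        raise ValueError("function must fit in 6 bits")
    if not 0 <= command < (1 << COMMAND_BITS):
        raise ValueError("command must fit in 16 bits")
    # Bits 23..18: function, 17..2: command, 1: MS (1 = single read), 0: RW (1 = write).
    return (function << 18) | (command << 2) | (int(single_read) << 1) | int(write)


def unpack_opcode(opcode):
    """Recover the four fields from a 24-bit operation code."""
    return {
        "function": (opcode >> 18) & 0x3F,
        "command": (opcode >> 2) & 0xFFFF,
        "single_read": bool((opcode >> 1) & 1),
        "write": bool(opcode & 1),
    }


# A single read (MS = 1, RW = 0) of hypothetical command 0x0010 on the basic board (3).
opcode = pack_opcode(function=3, command=0x0010, single_read=True, write=False)
fields = unpack_opcode(opcode)
```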
  • When the baseboard management controller 200 needs to read the working information of a device, the interaction process between the device manager 120 and the baseboard management controller 200 includes: the baseboard management controller 200 initiates a read request to the device manager 120, and the device manager 120 feeds back a read response to the BMC 200.
  • FIG. 4A is a schematic diagram of the format of a read request provided by an embodiment of the present application.
  • FIG. 4B is a schematic diagram of the format of a read response provided by an embodiment of the present application.
  • the first row in FIG. 4A and FIG. 4B is the name of each field, and the second row is the number of bits occupied by each field.
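  • The read request and read response exchange can be pictured with a simulated device manager. This sketch does not reproduce the frame formats of FIG. 4A and FIG. 4B, which are not included in this text; the command values, the `status` and `data` keys, and the class name are hypothetical.

```python
CMD_TEMPERATURE = 0x0010  # hypothetical command values
CMD_VOLTAGE = 0x0011


class ReadableDeviceManager:
    """Simulated device manager that answers read requests from stored readings."""

    def __init__(self, readings):
        self._readings = readings  # (command, device parameter) -> value

    def handle_read(self, command, device_param):
        key = (command, device_param)
        if key not in self._readings:
            return {"status": "error", "data": None}
        return {"status": "ok", "data": self._readings[key]}


smc = ReadableDeviceManager({(CMD_TEMPERATURE, 0): 47, (CMD_VOLTAGE, 1): 12})
temperature_response = smc.handle_read(CMD_TEMPERATURE, 0)
missing_response = smc.handle_read(CMD_VOLTAGE, 9)
```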
  • When the baseboard management controller 200 needs to write information to a device, that is, when it issues commands to the device (such as controlling the device to start, stop, or upgrade), the interaction process between the device manager 120 and the baseboard management controller 200 includes: the baseboard management controller 200 initiates a write request to the device manager 120, and the write request carries the commands (such as control commands) or data (such as upgrade files) that need to be written.
  • FIG. 4C is a schematic diagram of a format of a write request provided by the embodiment of the present application.
  • the first row in FIG. 4C is the name of each field, and the second row is the number of bits occupied by each field.
  • each field in the foregoing FIG. 4A to FIG. 4C is just an example. In practical applications, when designing each field in the read request, write request, and read response, the fields can be increased or decreased according to actual needs.
  • the device manager 120 and the baseboard management controller 200 exchange work information of the first type of device.
  • The baseboard management controller 200 can also issue control commands to first-type devices through interaction with the device manager 120, so as to control their working status. For example, a control command can make one or several first-type devices stop working or start working.
  • the control command may be carried as data in the data field shown in FIG. 4C.
  • the device manager 120 can identify the control command therein, and control the corresponding first-type device according to the control command, such as controlling the first-type device to stop working or start working.
  • the baseboard management controller 200 may also issue an upgrade command to the first type of device through interaction with the device manager 120, so as to instruct the first type of device to perform an upgrade.
  • the upgrade file required for the device upgrade of the first type can be carried as data in the data field shown in FIG. 4C .
  • After receiving the write request, the device manager 120 may identify the upgrade file in it and send the upgrade file to the corresponding first-type device, instructing that device to upgrade.
  • The baseboard management controller 200 may also directly instruct the device manager 120 to upgrade, and the upgrade file required by the device manager 120 may be carried as data in the data field shown in FIG. 4C. After receiving the write request, the device manager 120 can identify the upgrade file in it and use the upgrade file to perform the upgrade.
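  • The three write cases described above (a control command for a first-type device, an upgrade file for a first-type device, and an upgrade file for the device manager itself) can be sketched as a dispatch inside a simulated device manager. The command values, the `start` and `stop` payloads, and the class names are hypothetical.

```python
CMD_CONTROL = 0x0100         # hypothetical command values
CMD_UPGRADE_DEVICE = 0x0101
CMD_UPGRADE_SELF = 0x0102


class FakeDevice:
    def __init__(self):
        self.running = True
        self.firmware = b"device-v1"


class WritableDeviceManager:
    """Simulated device manager that applies write requests from the BMC."""

    def __init__(self, devices):
        self.devices = devices  # device parameter -> FakeDevice
        self.firmware = b"smc-v1"

    def handle_write(self, command, device_param, data):
        if command == CMD_CONTROL:
            if data not in (b"start", b"stop"):
                raise ValueError("unknown control command")
            self.devices[device_param].running = data == b"start"
        elif command == CMD_UPGRADE_DEVICE:
            self.devices[device_param].firmware = data  # forward the upgrade file
        elif command == CMD_UPGRADE_SELF:
            self.firmware = data  # the device manager upgrades itself
        else:
            raise ValueError("unsupported write command")


smc = WritableDeviceManager({0: FakeDevice(), 1: FakeDevice()})
smc.handle_write(CMD_CONTROL, 0, b"stop")
smc.handle_write(CMD_UPGRADE_DEVICE, 1, b"device-v2")
smc.handle_write(CMD_UPGRADE_SELF, 0, b"smc-v2")
```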
  • The baseboard management controller 200 can read management information from the memory 110 through the management bus 300, and can also implement out-of-band management of the first-type devices through interaction with the device manager 120.
  • Second-type devices connect directly to the root management bus, and describing them in the FRUD allows the BMC 200 to automatically load the management features of this type of device.
  • That is, second-type devices are allowed to exist on the server board 100. A second-type device can be connected directly to the baseboard management controller 200 through the management bus 300, and the baseboard management controller 200 can interact with it directly through the management bus 300 to obtain its working information and implement out-of-band management of it.
  • the baseboard management controller 200 may determine the second-type devices deployed on the server board 100 according to the management information, that is, obtain information about the second-type devices directly connected to the management bus 300 . Based on the management information, the baseboard management controller 200 may pre-load a management driver for the second-type device (the management driver refers to a software program required to manage the second-type device), so as to manage the second-type device.
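  • Pre-loading a management driver for each second-type device named in the management information might look like the registry lookup below. The driver classes, device type strings, addresses, and the `second_type_devices` key are all hypothetical.

```python
class VoltageRegulatorDriver:
    def __init__(self, address):
        self.address = address


class PcieCardDriver:
    def __init__(self, address):
        self.address = address


# Hypothetical registry: device type in the management information -> driver class.
DRIVER_REGISTRY = {
    "voltage_regulator": VoltageRegulatorDriver,
    "pcie_card": PcieCardDriver,
}


def load_drivers(management_info):
    """Instantiate a management driver for each known second-type device."""
    drivers = {}
    for device in management_info["second_type_devices"]:
        driver_cls = DRIVER_REGISTRY.get(device["type"])
        if driver_cls is None:
            continue  # no driver registered for this device type
        drivers[device["name"]] = driver_cls(device["address"])
    return drivers


management_info = {
    "second_type_devices": [
        {"name": "vrd_cpu0", "type": "voltage_regulator", "address": 0x60},
        {"name": "slot1_card", "type": "pcie_card", "address": 0x32},
        {"name": "unknown0", "type": "mystery", "address": 0x10},
    ]
}
drivers = load_drivers(management_info)
```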
  • The baseboard management controller 200 can be deployed on a board to form a BMC management board (that is, the expansion board mentioned above). The BMC management board can serve as the management center of the server and is used to implement out-of-band management of the server.
  • the appearance of the BMC management single board may be as shown in FIG. 5A.
  • The BMC management board provides external management interfaces, including a debugging serial port, a unit identification (UID) indicator light, a management network port, a video graphics array (VGA) interface, a universal serial bus (USB) interface, and so on. Refer to FIG. 5B for the external management interfaces provided by the BMC management board.
  • the BMC management board provides the management interface required for board management through the 4C+ connector, including the out-of-band management bus interface. If the management bus is an I2C bus, the out-of-band management bus interface is an I2C interface.
  • the BMC management board may also provide other management interfaces, and this embodiment of the present application does not limit the type of the other management interfaces.
  • Other management interfaces include some or all of the following: a joint test action group (JTAG) interface, an SPI interface, a network controller sideband interface (NCSI), a platform environment control interface (PECI), a debugging serial port, a UID button indicator light, a management network port, and a VGA port.
  • The BMC management board also provides the low pin count (LPC) interface, USB interface, and PECI interface required for in-band management.
  • The power supply, clock circuit, stray signal circuit, and other circuits required for the baseboard management controller to work are also deployed on the BMC management board.
  • the definition of the management interface pins provided by the BMC management board is shown in Table 6 below:
  • Power/GND indicates a power signal or a ground signal.
  • USB3 indicates support for USB 3.0.
  • Input indicates signal input, and output indicates signal output.
  • VGA refers to the VGA signal.
  • The VGA signal in the above table includes three signals, which are the red, green, and blue signals.
  • HCSL refers to high-speed current steering logic.
  • the signal definition is only exemplary, and in actual use, different signal definitions can also be set according to actual needs.
  • the following describes the board management system provided in the embodiment of the present application by taking the structure of the board management system to which three different types of server boards 100 belong as an example.
  • the server board 100 is a basic board (Basic Computer Unit, BCU).
  • the BMC is connected to the EEPROM and CPLD of the BCU through an I2C bus.
  • the EEPROM is used to realize the function of the memory 110 in the above embodiments, and stores management information of the computing processing unit, such as attribute information of the computing processing unit, and the like.
  • the CPLD is used to implement the functions of the SMC in the above embodiments, such as implementing management and control of devices, processing upgrade commands or control commands, and the like.
  • CPLD is connected to devices such as ADC, temperature sensor, clock circuit, and flash memory.
  • Three kinds of signals can be obtained: the power good (PG) signal, the present signal, and the fault signal.
  • The PG signal indicates whether the power supply is connected.
  • The present signal indicates whether a device is inserted into the connector.
  • the fault signal can be used to indicate whether a device is faulty, for example, the device can be a CPU or a power controller.
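  • The three signals could be decoded from a raw status value as sketched below. The bit positions and the active-high polarity are assumptions made for illustration; the application does not define how the signals are presented to the CPLD.

```python
# Assumed bit positions within a raw status value (active-high).
PG_BIT = 0
PRESENT_BIT = 1
FAULT_BIT = 2


def decode_signals(raw):
    """Decode the power good, present, and fault signals from a raw status value."""
    return {
        "power_good": bool((raw >> PG_BIT) & 1),
        "present": bool((raw >> PRESENT_BIT) & 1),
        "fault": bool((raw >> FAULT_BIT) & 1),
    }


healthy = decode_signals(0b011)  # powered, device present, no fault
faulty = decode_signals(0b111)   # powered, device present, fault raised
```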
  • the CPU may be directly connected to the first conversion chip (for example, 9555 chip) through a low-speed signal line to provide a CPU alarm signal indicating that an error occurs in the CPU.
  • the first conversion chip is used to increase the number of connected devices.
  • the CPLD can obtain the working information of the ADC (the working information of the ADC is the digital signal converted from the voltage signal by the ADC), temperature, CPU alarm signal, power supply information of the power supply and other working information.
  • The CPLD can also load the frequency of the clock circuit and implement the flash memory upgrade function.
  • The second conversion chip (for example, a 9545 chip) can provide multiple I2C interfaces, and multiple voltage regulator controllers on the computing processing unit are connected directly to the I2C bus after fan-out through the second conversion chip.
  • The topology information of the computing processing unit stored in the EEPROM describes the connection relationship of the voltage regulator controllers that are directly connected to the I2C bus.
  • The voltage regulator controllers are used to supply power to the CPU.
  • The BMC can directly manage the voltage regulator controllers.
  • the CPLD interacts with the BMC through the I2C bus based on command words, and transmits the working information of the devices connected to the CPLD. It can also accept the control of the BMC and perform operations such as upgrading and loading some devices. BMC can also upgrade CPLD through I2C bus.
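As a rough illustration of the PG, present, and fault signals described above, the following Python sketch decodes the three signals from one status byte such as an I/O expander input register might hold. The bit positions and the names `PG_BIT`, `PRESENT_BIT`, `FAULT_BIT`, `BoardSignals`, and `decode_signals` are assumptions made for this example; the source does not define a bit layout.

```python
from dataclasses import dataclass

# Hypothetical bit positions; the source does not specify any layout.
PG_BIT = 0x01       # power good: power supply connected
PRESENT_BIT = 0x02  # present: a device is inserted into the connector
FAULT_BIT = 0x04    # fault: a device (e.g. CPU or power controller) is faulty


@dataclass(frozen=True)
class BoardSignals:
    power_good: bool
    present: bool
    fault: bool


def decode_signals(raw: int) -> BoardSignals:
    """Decode one status byte into the three signals described in the text."""
    return BoardSignals(
        power_good=bool(raw & PG_BIT),
        present=bool(raw & PRESENT_BIT),
        fault=bool(raw & FAULT_BIT),
    )
```

A device manager such as the CPLD could report the decoded values to the BMC as working information; how the raw byte is actually read depends on the conversion chip and is outside this sketch.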
  • the server single board 100 is an IO component (input output unit, IOU).
  • FIG. 6B shows a board management system provided by an embodiment of the present application, which can be used to implement out-of-band management of an IO expansion unit.
  • the BMC is connected to the EEPROM and MCU of the IOU through an I2C bus.
  • the EEPROM implements the function of the memory 110 in the above embodiments and stores the management information of the IOU; the MCU implements the functions of the SMC in the above embodiments, such as managing and controlling devices and processing upgrade commands or control commands.
  • the MCU is connected to devices such as the temperature sensor, the power supply, and the PCIe slot.
  • the MCU can obtain working information such as temperature, the PG signal passed through the first conversion chip, and the in-position signal (the in-position signal can indicate whether there is a device inserted into the connector).
  • the MCU interacts with the BMC through the I2C bus based on the command word, and transmits the working information of the devices connected to the MCU.
  • the BMC implements the upgrade function for the MCU through the I2C bus.
  • the PCIe standard card inserted in the PCIe slot is connected directly to the I2C bus through the second conversion chip, and the topology information of the IOU, stored in the EEPROM, describes the connection relationship of the PCIe standard card directly connected to the I2C bus.
  • the BMC can directly manage PCIe standard cards.
  • the server single board 100 is a storage unit (Storage Unit, STU).
  • the BMC is connected to the EEPROM and CPLD of the BCU through an I2C bus.
  • the EEPROM is used to realize the function of the memory 110 in the above embodiments, and stores management information of the computing processing unit, such as attribute information of the computing processing unit, and the like.
  • the CPLD is used to implement the functions of the SMC in the above embodiments, such as implementing management and control of devices, processing upgrade commands or control commands, and the like.
  • CPLD is connected to temperature sensor, ADC, hard disk and other devices.
  • CPLD can obtain voltage, temperature, whether hard disk is connected, and can also obtain working information such as PG signal, in-position signal, CPU alarm signal through the fifth conversion chip.
  • the CPLD can also realize the management function of the hard disk, and obtain the working information of each hard disk through the sixth conversion chip.
  • the CPLD interacts with the BMC through the I2C bus based on the command word, and transmits the working information of the device connected to the CPLD.
  • the BMC can obtain the working information of each hard disk on the board through command words.
  • BMC can also upgrade CPLD through I2C bus.
  • the board management method includes the following steps:
  • Step 701: after the BMC 200 starts up, it scans, through the management bus 300, for the memory 110 with a preset address under the management bus 300.
  • after the server is powered on, the baseboard management controller 200 starts up and can find the memory 110 with the preset address among the devices attached to the management bus 300.
  • Step 702: after finding the memory 110, the BMC 200 reads the management information of the server board 100 from the memory 110 through the management bus 300.
  • the information contained in the management information can refer to the description of the foregoing content.
  • after reading the management information, the baseboard management controller 200 knows the hardware information of the server board 100, the topology information of the server board 100, and the attribute information of the devices on the server board 100.
  • Step 703: after the server board 100 is powered on, the device manager 120 on the server board 100 collects the working information of the first type of device.
  • the device manager 120 can interact with the first type of device connected to it to obtain the working information of the first type of device, such as the temperature detected by the temperature sensor and the voltage converted by the ADC.
  • Step 704: the baseboard management controller 200 obtains the working information of the first type of device from the device manager 120. If the server board 100 includes a second-type device, the baseboard management controller 200 may also acquire the working information of the second-type device from the second-type device through the management bus 300.
  • the baseboard management controller 200 may collect the working information of the first type of device of the server through the device manager 120, or obtain the working information of the second type of device through direct interaction.
  • the baseboard management controller 200 does not need to be connected to each device on the server single board 100 , and the method for the baseboard management controller 200 to obtain the working information of the server's devices is relatively simple.
  • Step 705: the baseboard management controller 200 manages the server board 100 based on the management information and the acquired working information of the devices (such as the working information of the first type of device and the working information of the second type of device).
  • based on the management information, the baseboard management controller 200 can understand the connection relationships of the devices on the server board 100; based on the working information of the devices, it can determine the working environment (such as temperature, voltage, power supply, and fault information) of the main devices on the server board 100; and based on that working environment it can determine whether to control devices on the server board 100, for example by turning on a fan or restarting a power supply.
  • the baseboard management controller 200 may send a control command to the device manager 120 to control the first type of device.
  • the baseboard management controller 200 may also directly issue control commands to the second-type devices through the management bus 300 to control the second-type devices. For the manner of issuing the control command, reference may be made to the foregoing content, which will not be repeated here.
  • the baseboard management controller 200 can also upgrade the device.
  • the baseboard management controller 200 may send an upgrade command to the device manager 120 to upgrade the first type of device.
  • the baseboard management controller 200 may also send an upgrade command to the device manager 120 to upgrade the device manager 120 .
  • the baseboard management controller 200 may also issue an upgrade command to the second-type device directly through the management bus 300 to upgrade the second-type device.
  • for the manner of issuing the upgrade command, reference may be made to the foregoing content, which will not be repeated here.
  • the baseboard management controller 200 can also determine whether to issue an alarm prompting the user that a device is faulty, that a temperature is too high, or that a power supply error has occurred. In this way the baseboard management controller 200 manages the server board 100 so that the server board 100 works normally, or so that the user learns the status of the server board 100 in time.
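Steps 701 to 705 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: `FakeBus` is an in-memory stand-in for the management bus 300 rather than a real I2C or SPI driver, and the addresses `FRUD_ADDR` and `SMC_ADDR`, the key `second_type_addrs`, and the temperature limit are all hypothetical.

```python
FRUD_ADDR = 0x50  # hypothetical preset address of the memory 110
SMC_ADDR = 0x30   # hypothetical address of the device manager 120


class FakeBus:
    """In-memory stand-in for the management bus 300 (not a real bus driver)."""

    def __init__(self, devices):
        self.devices = devices  # address -> dict of data held by that device

    def probe(self, addr):
        return addr in self.devices

    def read(self, addr):
        return self.devices[addr]


def manage_board(bus, temp_limit_c=85.0):
    """Run steps 701-705 once; return the names of over-temperature readings."""
    # Step 701: scan for the memory at its preset address.
    if not bus.probe(FRUD_ADDR):
        raise LookupError("no memory found at the preset address")
    # Step 702: read the management information from the memory.
    info = bus.read(FRUD_ADDR)
    # Steps 703-704: the device manager has collected the working information
    # of first-type devices; the BMC reads it over the same bus.
    working = dict(bus.read(SMC_ADDR))
    # Step 704, second-type devices: read them directly over the bus.
    for addr in info.get("second_type_addrs", []):
        working.update(bus.read(addr))
    # Step 705: decide which readings call for an alarm or a control action.
    return sorted(
        name for name, value in working.items()
        if name.endswith("_temp_c") and value > temp_limit_c
    )
```

In this sketch the management decision is reduced to a single temperature threshold; a real controller would use the topology and attribute information to choose per-device limits and actions such as turning on a fan.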
  • FIG. 8 shows a BCU module management system provided by an embodiment of the present application.
  • the management system of the BCU module is used to ensure the management characteristics of the BCU module.
  • the management features of the BCU module include the external management interface provided by the BCU module, and the management features of the BCU module by the management module.
  • the low-speed signals on the outgoing high-speed connector of the BCU module include management signals, which can be used for out-of-band management of a riser card attached to the BCU module. The advantage of this design is that low-speed management signal lines can be avoided on the riser card.
  • the management module of the BCU module is divided into out-of-band management and in-band management.
  • Tianchi management architecture recommends that the independent management features on the BCU module be terminated directly on the BCU module.
  • the frequency synthesizer configuration on the BCU module is loaded directly on the BCU module and does not need to be managed separately by the management module.
  • the BMC on the management module provides an intelligent platform management bus (IPMB) interface to connect to the CPU of the BCU module as an intelligent platform management interface (Intelligent Platform Management Interface, IPMI) bus channel;
  • the BMC on the management module provides an LPC interface to connect to the CPU of the BCU module as a BT bus channel;
  • the BMC on the management module provides an I2C interface to connect to the CPLD and FRUD of the BCU module.
  • through this I2C interface, which serves as the SMC bus channel, the BMC implements the basic out-of-band management of the BCU module, including reading information in the FRUD and accessing the CPLD registers of the BCU module;
  • the CPLD chip on the management module provides two hisport interfaces to connect to the CPLD of the BCU module, one of which hisport0 is used as a logical register interaction channel between the BCU module and the management module, and the other is used as a hisport over I2C interface for the external expansion management interface of the BCU module;
  • the CPLD chip on the BCU module provides multiple I2C interfaces for information reading and configuration of the ADC chip, clock frequency synthesizer chip, and temperature sensor chip of the BCU module, that is, the CPLD on the BCU module realizes reading basic information such as temperature and voltage , and report to the BMC chip through a unified SMC interface, so that the independent management feature can be terminated inside the module.
  • the CPLD chip on the BCU module provides multiple I2C interfaces to connect to the UBC high-speed connector as a management channel for external expansion modules.
  • the management I2C provided externally comes from the hisport over I2C feature provided by the management module.
  • This management channel can be connected to out-of-band management devices such as the FRU chip and temperature sensor on the riser to realize the out-of-band management feature of components.
  • the embodiment of the present application also provides a board management device, which is used to execute the method performed by the baseboard management controller in the method embodiment shown in FIG. 7 above.
  • the board management apparatus 900 includes an acquisition unit 901 and a management unit 902.
  • the acquisition unit 901 is configured to acquire management information from a memory through a management bus.
  • the management unit 902 is configured to, based on the management information, interact with the device manager through the management bus to manage the single board of the computing device.
  • the computing single board includes a first type of device connected to the device manager, and the acquisition unit 901 can obtain the working information of the first type of device from the device manager through the management bus.
  • the computing single board includes a second type of device connected to the baseboard management controller through the management bus, and the acquisition unit 901 can obtain the working information of the second type of device from the second type of device through the management bus.
  • the management information includes part or all of the following: attribute information of the single board of the computing device, topology information of the single board of the computing device, attribute information of the first type of device, and attribute information of the second type of device.
  • the interaction may be based on a command word.
  • the device further includes an upgrade unit 903 .
  • the upgrade unit 903 may transfer the upgrade file of the first type of device to the device manager, indicating to upgrade the first type of device.
  • the upgrade file of the device manager may also be passed to the device manager to instruct the device manager to be upgraded.
  • the management bus is an I2C bus or an SPI bus.
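The role of the upgrade unit 903 can be sketched as below: it only forwards an upgrade file to the device manager, which then upgrades either a first-type device or itself. The class names `DeviceManagerStub` and `UpgradeUnit`, the method names, and the `"self"` target label are hypothetical and chosen for this example only.

```python
class DeviceManagerStub:
    """Hypothetical stand-in for the device manager (SMC)."""

    def __init__(self, first_type_devices):
        # Firmware image currently held by each first-type device.
        self.firmware = {name: b"" for name in first_type_devices}
        self.own_firmware = b""

    def apply_upgrade(self, target, image):
        if target == "self":
            self.own_firmware = image      # upgrade the device manager itself
        elif target in self.firmware:
            self.firmware[target] = image  # upgrade a first-type device
        else:
            raise KeyError(f"unknown upgrade target: {target}")


class UpgradeUnit:
    """Sketch of upgrade unit 903: it passes upgrade files to the manager."""

    def __init__(self, manager):
        self.manager = manager

    def upgrade_device(self, name, image):
        self.manager.apply_upgrade(name, image)

    def upgrade_manager(self, image):
        self.manager.apply_upgrade("self", image)
```

Keeping the upgrade logic inside the device manager means the baseboard management controller does not need a board-specific path to each device, which matches the motivation given in the text.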
  • the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist separately, or two or more units may be integrated into one module.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software function modules.
  • the present application also provides a computing device 1000 as shown in FIG. 10 .
  • the computing device 1000 includes a computer single board and a baseboard management controller 1500 , and the computer single board may include a bus 1100 , a processor 1200 , a communication interface 1300 , and a memory 1400 .
  • the processor 1200 , the memory 1400 and the communication interface 1300 communicate through the bus 1100 .
  • the processor 1200 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), an artificial intelligence (AI) chip, a system on chip (SoC), a complex programmable logic device (CPLD), a graphics processing unit (GPU), or the like.
  • the memory 1400 may include volatile memory, such as random access memory (RAM).
  • the memory 1400 may also include non-volatile memory, such as read-only memory (ROM), flash memory, an HDD, or an SSD.
  • the memory 1400 may also include the memory 110 mentioned above, that is, management information may be stored therein.
  • the memory 1400 may also store operating system and other software modules required for running processes.
  • the operating system can be Linux™, UNIX™, Windows™, and so on.
  • the baseboard management controller 1500 includes a processor 1510 and a memory 1520, where computer program codes are stored in the memory 1520, and the processor 1510 executes the computer program codes to perform the method described in FIG. 7 above.
  • the baseboard management controller 1500 may also only include a processor 1510, on which computer program codes are programmed, and the processor 1510 may execute the method described in FIG. 7 above.
  • all or part of them may be implemented by software, hardware, firmware or any combination thereof.
  • when implemented using software, it may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes computer program instructions, and when the computer program instructions are loaded and executed on the computer, all or part of the process or function described in FIG. 7 according to the embodiment of the present invention will be generated.
  • the above-mentioned embodiments may be implemented in whole or in part by software, hardware, firmware or other arbitrary combinations.
  • the above-described embodiments may be implemented in whole or in part in the form of computer program products.
  • the computer program product includes one or more computer instructions. When the computer program instructions are loaded or executed on the computer, the processes or functions according to the embodiments of the present application will be generated in whole or in part.
  • the computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (eg, coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (eg, infrared, radio, microwave) means.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center that includes one or more sets of available media.
  • the available media may be magnetic media (eg, floppy disk, hard disk, magnetic tape), optical media (eg, DVD), or semiconductor media.
  • the semiconductor medium may be a solid state drive (SSD).

Abstract

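A board management system, method, apparatus, and device. A baseboard management controller can be connected to a computing device board through a management bus. The computing device board includes a memory and a device manager, and the memory records management information of the computing device board. Within the computing device board, the memory and the device manager can be connected to the baseboard management controller through the management bus. The baseboard management controller obtains the management information from the memory through the management bus and, based on the management information, interacts with the device manager to manage the computing device board. The connection between the baseboard management controller and the computing device is simple and adapts to boards of computing devices with different structures, which effectively simplifies the management of the computing device board and makes it more efficient.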

Description

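Board management system, method, apparatus, and device
This application claims priority to Chinese patent application No. 202210188470.X, filed with the China Patent Office on February 28, 2022 and entitled "Board management system, method, apparatus, and device", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of server technologies, and in particular to a board management system, method, apparatus, and device.
Background
Since the 1980s, Microsoft and Intel have formed the Wintel alliance to promote the development of the personal computer (PC) industry. The two companies have cooperated closely within the PC industry to drive faster development of the computing industry, and have gradually influenced other computing devices such as servers. Computing devices such as servers have many application scenarios and configuration types and relatively high reliability requirements. In addition, servers represent a huge commercial volume and are a focus of building an open industry ecosystem.
Taking traditional servers as an example, the current industry ecosystem of traditional servers has the following characteristics:
Low degree of standardization: traditional servers already have a certain basis of component standardization. For example, components such as memory modules, solid-state drives (SSD), and Peripheral Component Interconnect Express (PCIe) cards have their own standards. Component standardization has contributed greatly to the industry ecosystem and to resource sharing, and has reduced part of the development work of server vendors. However, standardized components account for a small proportion of the whole server, so developing a server mainboard still requires considerable manpower to adapt standardized and non-standardized components to each other.
The standardization of server components therefore needs to be strengthened, and it involves every aspect of the server. Out-of-band management of a server refers to maintaining devices such as servers through an independent management channel. It allows a system administrator to monitor and manage servers remotely. Out-of-band management mainly concerns managing and monitoring the working environment of the devices on a server board (such as processors, memory, and hard disks). The working environment of a device includes, but is not limited to, temperature, working voltage, fans, and power supply status; managing it ensures that the server's devices work in a suitable environment.
Out-of-band management of a server is usually implemented by a baseboard management controller (BMC). To implement out-of-band management, the baseboard management controller needs to be connected to the server board so as to connect to the devices on it. However, as the types and varieties of processors keep increasing, server boards carrying different processors have different architectures, and the interfaces through which these boards connect to the baseboard management controller are not unified. As a result, the baseboard management controller requires a large amount of adaptation work for the out-of-band management of each kind of server board, and the reuse rate of baseboard management controllers across different types of server boards is low.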
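Summary
This application provides a board management system, method, apparatus, and device, to provide an out-of-band management BMC and method with a higher degree of adaptability.
In a first aspect, an embodiment of this application provides a board management system that includes a baseboard management controller and a computing device board. The board management system can be deployed in a computing device, which may be a server, a personal computer, or the like.
The baseboard management controller can be connected to the computing device board through a management bus. The computing device board includes a memory and a device manager, and the memory records management information of the computing device board. Within the computing device board, the memory and the device manager can be connected to the baseboard management controller through the management bus.
The baseboard management controller can obtain the management information from the memory through the management bus and, based on the management information, interact with the device manager to manage the computing device board.
With the above system, the connection between the baseboard management controller and the computing device is simple and adapts to boards of computing devices with different structures, which effectively simplifies the management of the computing device board and makes it more efficient.
In a possible implementation, the computing device board further includes a first-type device. The first-type device is connected to the device manager, and the device manager can obtain working information of the first-type device. The baseboard management controller can obtain the working information of the first-type device from the device manager through the management bus.
With the above system, the baseboard management controller can conveniently obtain the working information of the first-type device through the device manager without being connected to the first-type device. This way of obtaining the working information is simple and efficient, avoids the large amount of adaptation work the baseboard management controller would otherwise need in order to match different computing device boards, and simplifies the out-of-band management process.
In a possible implementation, the computing device board further includes a second-type device. The second-type device need not be connected to the baseboard management controller through the device manager; it can be connected to the baseboard management controller directly through the management bus. The baseboard management controller can interact with the second-type device directly through the management bus to obtain working information of the second-type device.
With the above system, the management bus can attach not only the memory and the device manager but also the second-type device. This connection is relatively simple, likewise spares the baseboard management controller excessive adaptation work, and effectively broadens the application scenarios.
In a possible implementation, the management information is the information the baseboard management controller needs to manage the computing device board; that is, the management information can be stored in the memory in advance. The embodiments of this application do not limit the specific content of the management information; any information needed to manage the computing device board is applicable. For example, the management information includes some or all of the following: attribute information of the computing device board, topology information of the computing device board, attribute information of the first-type device, and attribute information of the second-type device.
With the above system, the management information is stored in the memory in advance, and the baseboard management controller can obtain it with a simple load operation, which makes obtaining the management information simpler.
In a possible implementation, the baseboard management controller can interact with the device manager, and the embodiments of this application do not limit how. For example, the two can interact based on command words, which keeps the interaction efficient. Different computing device boards can use a common command word scheme, so that the baseboard management controller can adapt to different computing device boards, improving the adaptability of the baseboard management controller and of the management method.
In a possible implementation, the baseboard management controller can control the first-type device. For example, the baseboard management controller can issue a control command to the device manager to instruct the device manager to control the first-type device. The baseboard management controller can also control the second-type device directly, by issuing a control command to the second-type device through the management bus.
The baseboard management controller can upgrade the first-type device or the device manager. For example, the baseboard management controller can pass an upgrade file of the first-type device to the device manager, instructing it to upgrade the first-type device. After obtaining the upgrade file, the device manager uses it to upgrade the first-type device. The baseboard management controller can also pass an upgrade file of the device manager to the device manager, instructing it to upgrade itself. The baseboard management controller can likewise upgrade the second-type device directly through the management bus.
With the above system, the baseboard management controller controls or upgrades devices through the management bus or the device manager, which simplifies control and upgrade and ensures that the board of the computing device can be managed efficiently.
In a possible implementation, the embodiments of this application do not limit the type of the memory. For example, the memory may be an electrically erasable programmable read-only memory, which is small and highly integrated.
In a possible implementation, the embodiments of this application do not limit the specific structure of the device manager; any module that can implement device management is applicable. For example, the device manager is a complex programmable logic device or a microcontroller unit. Because the device manager can take various forms, it suits different computing device boards and effectively broadens the application scenarios.
In a possible implementation, the management bus may be an inter-integrated circuit bus, a serial peripheral interface bus, or another type of bus. The flexible choice of management bus lets the baseboard management controller connect to different types of computing device boards and improves its adaptability to different types of computing devices.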
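In a second aspect, an embodiment of this application provides a board management method for managing a computing device board; for the beneficial effects, refer to the description of the first aspect, which is not repeated here. The computing device board includes a memory and a device manager, and the memory records management information of the computing device board. In the method, the baseboard management controller can obtain the management information from the memory through the management bus. After obtaining the management information, the baseboard management controller can, based on it, interact with the device manager through the management bus to manage the computing device board.
In a possible implementation, the computing device board includes a first-type device, which can be connected to the device manager; the baseboard management controller can obtain working information of the first-type device from the device manager through the management bus.
In a possible implementation, the computing device board includes a second-type device, which can be connected to the baseboard management controller directly through the management bus; the baseboard management controller can obtain working information of the second-type device from the second-type device through the management bus.
In a possible implementation, the management information includes some or all of the following: attribute information of the computing device board, topology information of the computing device board, attribute information of the first-type device, and attribute information of the second-type device.
In a possible implementation, when the baseboard management controller interacts with the device manager through the management bus, it can do so based on command words.
In a possible implementation, the baseboard management controller controls the first-type device through the device manager, and can also upgrade the first-type device through the device manager. For example, the baseboard management controller passes an upgrade file of the first-type device to the device manager, instructing it to upgrade the first-type device. After receiving the upgrade file, the device manager can use it to upgrade the first-type device. The baseboard management controller can also upgrade or control the device manager or the second-type device.
In a possible implementation, the management bus is an I2C bus or an SPI bus.
In a third aspect, an embodiment of this application further provides a board management apparatus that has the function of implementing the behavior in the method examples of the second aspect; for the beneficial effects, refer to the description of the first aspect, which is not repeated here. The function can be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the function. In a possible design, the structure of the board management apparatus includes an acquisition unit and a management unit and, optionally, an upgrade unit. These units can perform the corresponding functions in the method examples of the second aspect; see the detailed description in the method examples, which is not repeated here.
In a fourth aspect, an embodiment of this application further provides a baseboard management controller that has the function of implementing the behavior in the method examples of the second aspect; for the beneficial effects, refer to the description of the second aspect, which is not repeated here. The structure of the apparatus includes a processor and, optionally, a memory. The processor is configured to support the board management apparatus in performing the corresponding method of the second aspect. The memory is coupled to the processor and stores the computer program instructions necessary for the apparatus. The processor can invoke the computer program instructions to perform the corresponding method of the second aspect.
In a fifth aspect, an embodiment of this application further provides a computing device that includes a baseboard management controller and a computing device board; the computing device board may carry devices such as a processor and a memory. The baseboard management controller has the function of implementing the behavior in the method examples of the second aspect; for the beneficial effects, refer to the description of the first aspect, which is not repeated here.
In a sixth aspect, this application further provides a computer-readable storage medium storing instructions that, when run on a computer, cause the computer to perform the method described in the second aspect and in each possible implementation of the second aspect.
In a seventh aspect, this application further provides a computer program product containing instructions that, when run on a computer, cause the computer to perform the method described in the first aspect and in each possible implementation of the first aspect.
In an eighth aspect, this application further provides a computer chip connected to a memory; the chip is configured to read and execute a software program stored in the memory to perform the method described in the second aspect and in each possible implementation of the second aspect.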
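Brief Description of the Drawings
FIG. 1 is a schematic architecture diagram of a board management system provided by this application;
FIG. 2 is a schematic architecture diagram of another board management system provided by this application;
FIG. 3 is a schematic structural diagram of a command word provided by this application;
FIG. 4A is a schematic structural diagram of a read request provided by this application;
FIG. 4B is a schematic structural diagram of a read response provided by this application;
FIG. 4C is a schematic structural diagram of a write request provided by this application;
FIG. 5A to FIG. 5B are schematic structural diagrams of an extension board provided by this application;
FIG. 6A to FIG. 6C are schematic architecture diagrams of a board management system provided by this application;
FIG. 7 is a schematic diagram of a board management method provided by this application;
FIG. 8 is a schematic structural diagram of a BCU management system provided by this application;
FIG. 9 is a schematic structural diagram of a board management apparatus provided by this application;
FIG. 10 is a schematic structural diagram of a computing device provided by this application.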
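Detailed Description
Developing a traditional server mainboard has a high technical threshold. Besides the central processing unit (CPU), the mainboard also carries functions such as bus fan-out, power fan-out, and maintenance management. The CPU-related circuits on these mainboards all come from reference designs given by CPU vendors, and the reference designs of different CPU vendors are completely different, so mainboard development and design require large investments of resources and time. To meet the demand for rapid generational updates of computing products such as servers, system vendors need to invest heavily in differentiated innovation, but can often focus only on low-level contests over hardware specifications. This neither meets customers' needs for diverse scenarios and computing power nor spares vendors from inefficient, homogeneous competition. As computing power diversifies, more processor vendors emerge and launch processors with more different architectures, and the iteration speed of all kinds of processors rises rapidly. At the same time, processor power consumption keeps increasing, and the heat dissipation technology of traditional servers can no longer meet the demand. In addition, to improve system performance, the industry has introduced new media types (for example, Intel's 3D XPoint non-volatile medium) and forms, which also require support and adaptation from new architectures. To develop servers adapted to these trends, system vendors must invest enormous development effort, yet differences between products prevent the same mainboard or system design from being reused. The industry therefore places higher requirements on cross-architecture sharing of server parts, cross-generation evolution, shorter time to market (TTM), and lower total cost of operation (TCO). Further development of the industry requires a more open and standardized server architecture that improves development efficiency and part reuse and provides more flexibility and differentiation.
This application proposes an innovative peer-to-peer interconnect architecture (also called a new server architecture, or the new architecture). In this architecture, the traditional mainboard is first split into a basic board (Basic Computing Unit, BCU) and an extension board (Extension Unit, EXU); the basic board together with the extension board supports the mainboard specifications and forms required by different scenarios. One computing device may include one basic board and one extension board, multiple basic boards and one extension board, or one basic board and multiple extension boards. The basic board includes the CPU, double data rate (DDR) memory, and the related power supplies, and provides general-purpose computing capability and extension interfaces for peripheral storage, input/output (IO), acceleration, and so on. The basic board supports different series of CPUs. Optionally, the basic board supports heterogeneous processors, that is, different types of processors. For example, the basic board supports a CPU together with any one or any combination of processors such as an application-specific integrated circuit (ASIC), a programmable logic device (PLD), a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), a system on chip (SoC), a software-defined infrastructure (SDI) chip, and an artificial intelligence (AI) chip.
Further, based on service requirements and hardware attributes, the embodiments of this application provide at least six forms of basic board, targeting different computing performance and memory configurations. For ease of description, these six forms are called A1, A2, B1, B2, C1, and C2. In this embodiment, "P" denotes the number of processors (P is an integer greater than 0), and "DPC" denotes dual in-line memory modules (DIMMs) per channel. For example, an A1 basic board supports one processor with one DIMM per channel (1P1DPC). An A2 basic board supports one processor with one or two DIMMs per channel (1P2DPC or 1P1DPC). A B1 basic board supports two processors with one DIMM per channel (2P1DPC), or one processor with one or two DIMMs per channel (1P2DPC or 1P1DPC). A B2 basic board supports two processors with one or two DIMMs per channel (2P2DPC or 2P1DPC), or one processor with one or two DIMMs per channel (1P2DPC or 1P1DPC). A C1 basic board supports four processors with one DIMM per channel (4P1DPC), or two processors with one or two DIMMs per channel (2P2DPC or 2P1DPC). A C2 basic board supports four processors with one or two DIMMs per channel (4P2DPC or 4P1DPC), or two processors with one or two DIMMs per channel (2P2DPC or 2P1DPC). As technology develops, the CPU package size, the number of memory channels, and the number of DIMMs may change, but the standard dimensions and mounting hole positions of the mainboard stay the same, which ensures that basic boards can evolve compatibly across generations and series. For example, with the current 8 DDR channels per CPU, a B2 basic board supports 2P2DPC (2P32DIMM). Once the number of CPU memory channels rises to 12, a B2 board can no longer implement 2P2DPC (2P48DIMM). The B2 form can then support 2P1DPC (2P24DIMM), while 2P2DPC (2P48DIMM) can be implemented with another form such as C1; because the mounting hole positions and the basic board dimensions are standard, the board can simply be swapped and installed.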
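The DIMM counts quoted above (2P32DIMM, 2P48DIMM, 2P24DIMM) follow from multiplying the number of processors, the memory channels per CPU, and the DIMMs per channel. The short Python check below reproduces that arithmetic; the function name `dimm_slots` is a hypothetical helper introduced only for this illustration.

```python
def dimm_slots(processors: int, channels_per_cpu: int, dimms_per_channel: int) -> int:
    """Total DIMM slots for an xPyDPC configuration (hypothetical helper)."""
    if min(processors, channels_per_cpu, dimms_per_channel) < 1:
        raise ValueError("all arguments must be positive integers")
    return processors * channels_per_cpu * dimms_per_channel


# 2P2DPC with 8 DDR channels per CPU: the 2P32DIMM case in the text.
current_b2 = dimm_slots(2, 8, 2)
# 2P2DPC with 12 channels per CPU: the 2P48DIMM case that B2 cannot hold.
future_2dpc = dimm_slots(2, 12, 2)
# 2P1DPC with 12 channels per CPU: the 2P24DIMM case that B2 can support.
future_1dpc = dimm_slots(2, 12, 1)
```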
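The extension board includes a mainboard management controller (Baseboard Management Controller, BMC) chip and is the management extension of the basic board. As the management center of the whole system, it provides management functions such as device, security, energy efficiency, and reliability management. The BMC may also be called a baseboard management controller. Optionally, the extension board may further include a management system and a bridge chip (for example, the platform controller hub (PCH) of an Intel system).
In the new architecture, the basic board communicates with components through high-speed buses such as PCIe, Compute Express Link (CXL), or a unified bus (UB or Ubus), and is connected to the extension board through a management interface. In specific implementations, the connections between the basic board and components, and between the basic board and the extension board, include soft connections made with cables and hard connections made with connectors. Further, "component" is a general term for a class of electronic devices or equipment. By function, components include storage components (STorage Unit, STU), IO components (Input Output Unit, IOU), acceleration components (ACceleration Unit, ACU), memory expansion components (Memory Expansion Unit, MEU), heat dissipation components, computing components, management components, and so on. The basic board supports Kunpeng and other series of CPUs, and the extension board provides management functions and power for the basic board and the extension components. With the support of the extension board, power supplies and heat sinks can be chosen from a variety of options.
Storage components include hard disk backplanes, expanders, PCIe switches, and the like. They extend system storage and support multiple media and forms such as hard disk drives (HDD), solid-state drives (SSD), Non-Volatile Memory express (NVMe), and storage class memory (SCM).
IO components include risers and similar parts. They extend system IO and support PCIe standard cards and Open Compute Project (OCP) cards.
Acceleration components include risers, carrier boards, accelerator-card interconnect switches, and the like. They provide extension and interconnection of system acceleration components.
Memory expansion components include carrier boards, memory expansion chips, dual in-line memory modules (DIMM), SCM media, and the like. They extend system memory bandwidth and memory capacity.
Heat dissipation components cool the computing device or the hardware in it, using air cooling, liquid cooling, or a combination of the two. It should be understood that the structure, type, and number of heat dissipation components do not limit the technical solutions to be protected by this application.
Computing components are devices that provide general-purpose computing capability, such as the central processing unit (CPU) and memory.
Management components are devices that provide device management, such as the baseboard management controller.
It is worth noting that a basic board or extension board containing devices such as a processor, memory, or a baseboard management controller can also be regarded as a kind of component.
On the other hand, in traditional server architectures, because of the evolution of power delivery, memory channel count, IO count, speed, and so on, a processor (for example, CPU) socket can generally be compatible only within each generation (the two minor Tick/Tock upgrades) and is hard to make compatible across generations. The mainboard provided by this application can expose external interfaces in a standardized way and perform external extension through soft connections such as cables, which shields the differences caused by processor-related power delivery and by interconnection between different processors and components and among components. Changes in components such as memory are thus contained within the mainboard, achieving cross-generation compatibility of the mainboard. For vendors, when processors are updated, the matching systems and components need not be replaced, so those components have a longer life cycle. For customers, the latest components can be swapped in at any time, and the industry's latest computing power used as early as possible, without replacing the chassis or adding hardware development effort. For system vendors, once cross-generation upgrade and cross-series evolution of the new server architecture are achieved, upgrading the processor or switching processor vendors only requires replacing the basic board, which overturns the original development model and gives rise to a new industry model.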
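Besides providing a new server architecture, this embodiment also standardizes hardware in order to support diverse computing power and diverse devices, including standardization of the basic board and of component interfaces.
Standardization of the basic board covers dimensions, mounting hole positions, interface electrical characteristics, management interface protocols and parameters, and so on. Table 1 is an example of a basic board interface description table provided by this application.
Table 1

Power is supplied through a unified 12 V input, which the basic board converts internally, through DC/DC conversion, into the various supplies required. Considering the future evolution of I/O and the differences between CPUs, this embodiment defines a Flexible I/O interface, based on UBC and UBCDD connectors, to replace the original PCIe interface. The Flexible I/O interface can be flexibly configured as a PCIe, HCCS, SAS, SATA, or Ethernet interface as required. The BCU management interface mainly includes common low-speed maintenance interfaces such as I2C, UART, and JTAG, and is compatible with the management of common processor platforms.
Standardization of component interfaces inside the computing system: components include extension boards, power supply components, heat dissipation components, storage components, IO components, acceleration components, memory components, and so on. The electrical interfaces, management interfaces, and parameters of components are standardized, while their physical dimensions, mounting, and position are neither defined nor constrained, which leaves broad room for innovation and supports differentiation and flexible extension. Apart from power and high-speed signals, the remaining low-speed management interfaces of a component are defined in Table 2 below:
Table 2
Except for the interface between the EXU and the BCU, the other interfaces connect to the components through the EXU. Note that this embodiment defines only the functions of these interfaces and does not limit the specific pin layout (PINMAP); any implementation that achieves these functions falls within the protection scope of this embodiment.
It is worth noting that the contents of Table 1 and Table 2 are only an example provided to help explain the technical solutions of this application. In specific implementations, the new server architecture, the basic board interfaces, and the low-speed interfaces of functional components may each include more or less content.
In addition, this application provides intelligent management software that implements management object templates according to the standardization requirements of the computing device. After the server is powered on, the management software automatically detects components through the standard management bus and obtains their self-description information, then creates management object instances from the management object templates. The management software thereby adapts its management automatically and supports automatic discovery and automatic adaptation of components.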
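For ease of description, the following embodiments take a server as the computing device; the solutions provided by this application also apply to other computing devices such as edge servers and personal computers (PC).
A server has to carry a large number of services and perform a large amount of data computation, which requires many components to be deployed in the server, and a large number of processors, large memory, and many hard disks to be deployed on or attached to the basic board. The working state of the processors, memory, hard disks, and the various components, as the main devices of the server, determines the running state of the server. To ensure that the main devices work normally, the server also deploys devices such as temperature sensors (to measure device temperature), voltage sensors (to measure device working voltage), power supplies of different models (to provide different voltages), and fans (to cool devices). An important part of the out-of-band management of a server is monitoring and managing the working environment of its main devices so that they work in a suitable environment: for example, the temperature is within the device's operating range, the voltage matches the device's working voltage, the power supplies of different models supply power normally, and the fans run normally.
Because different server boards have different structures and expose no unified interface for out-of-band management, the baseboard management controller needs a large amount of adaptation work to manage any server board out of band, and flexibility is poor. The embodiments of this application therefore provide a board management system, method, apparatus, and device. In these embodiments, the baseboard management controller can connect to the server board through a unified interface over a management bus. That is, a single management bus suffices to connect the baseboard management controller and the server board. Through this management bus, the baseboard management controller can obtain, from the memory deployed on the server board, the management information needed to manage the server board, and can interact with the device manager on the server board to manage the board. The baseboard management controller can obtain the working information of the devices connected to the device manager and thereby implement out-of-band management. In this way, the baseboard management controller can manage different server boards out of band without a large amount of adaptation work, which simplifies the whole out-of-band management process.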
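FIG. 1 is a schematic structural diagram of a board management system provided by an embodiment of this application. The board management system can be deployed in a server and includes a server board 100 and a baseboard management controller 200.
Note that the server board 100 may be the basic board or any component of the new architecture described above, or the mainboard of a traditional server. The embodiments of this application do not limit the number of server boards 100; there may be one or more. When the board management system includes multiple server boards 100, they may be of the same type, for example all basic boards, or of different types, for example one basic board, one IO component, and one storage component.
In the embodiments of this application, the baseboard management controller 200 and the server board 100 are connected by a single management bus 300. The management bus 300 may be an I2C (Inter-Integrated Circuit) bus, a serial peripheral interface (SPI) bus, or another type of bus. The management bus 300 can be understood as the root management bus of board management and can serve as the root management link. Through this root management link, the baseboard management controller 200 can obtain the management information and the working information of the devices on the server board 100, so as to manage the server board 100.
The server's devices are deployed on the server board 100 and include, but are not limited to: processors, memory, temperature sensors, analog-to-digital converters (ADC), power interfaces, Peripheral Component Interconnect Express (PCIe) slots, hard disk interfaces, fans, and power supplies. The types and numbers of devices deployed may differ between types of server board 100.
Note that the embodiments of this application do not limit how devices are deployed on the server board 100. For example, a device may be soldered directly onto the server board 100, or connected to it through an interface (such as a high-speed interface like UBC). In practice, some components (such as IO components and storage components) can be connected to the basic board through interfaces; in this case these components can also be regarded as devices deployed on the server board 100.
To enable out-of-band management of the server board 100, a device manager 120 and a memory 110 are also deployed on the server board 100. The baseboard management controller 200 is connected to the device manager 120 and to the memory 110 through the management bus 300.
The device manager 120 can be connected to some or all of the devices on the server board 100. In the embodiments of this application, the devices on the server board 100 fall into two classes. Devices that connect to the baseboard management controller 200 through the device manager 120 are called first-type devices for ease of description. Devices that connect to the baseboard management controller 200 directly through the management bus 300 are called second-type devices. For any server board 100, the devices involved in out-of-band management (that is, the devices that affect the working environment of the main devices on the server board 100) may all be first-type devices, in which case all of them are connected to the device manager 120; FIG. 1 is drawn for this case. The devices involved in out-of-band management may also include both first-type and second-type devices. A second-type device is not connected to the device manager 120 but can connect to the baseboard management controller 200 through the management bus 300; for how this type of server board 100 connects to the baseboard management controller 200, see the description of FIG. 2 below.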
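The device manager 120 can interact with the first-type devices to obtain their working information. For example, the device manager 120 can obtain the temperature from a temperature sensor, the voltage value from a voltage sensor, whether a power interface is connected to a power supply, the voltage provided by a power supply (obtained through an ADC, which converts an analog signal such as a voltage into a digital signal), whether a PCIe component (such as an accelerator card) is inserted in a PCIe slot, whether a hard disk is connected to a hard disk interface, and whether a fan is running.
The memory 110 stores the management information of the server board 100, which is the information the baseboard management controller 200 needs to implement out-of-band management. The management information is described below.
The baseboard management controller 200 is connected to the device manager 120 and the memory 110 through the management bus 300. From the memory 110, the baseboard management controller 200 can obtain the management information of the server board 100 and learn the attributes of the server board, the device attributes, and the topology information of the server board 100. The baseboard management controller 200 can also obtain the working information of the first-type devices by interacting with the device manager 120, and manages the server board 100 based on the management information (and the working information of the first-type devices).
FIG. 2 is a schematic structural diagram of another board management system provided by an embodiment of this application. The board management system can be deployed in a server and includes a server board 100 and a baseboard management controller 200. The baseboard management controller 200 and the server board 100 need only a single management bus 300, through which the baseboard management controller 200 connects to the device manager 120, the memory 110, and the second-type devices. The device manager 120 is connected to the first-type devices on the server board 100. For the baseboard management controller, the device manager 120, the memory 110, the management bus 300, the first-type devices, and the second-type devices, see the description above. Unlike the board management system of FIG. 1, in the board management system of FIG. 2 devices on the server board 100 can also be connected directly to the baseboard management controller 200 through the management bus 300 and managed directly by the baseboard management controller 200.
In this board management system, too, a single management bus 300 suffices to connect the baseboard management controller 200 and the server board 100; the connection is simple, and the baseboard management controller 200 likewise suits different server boards 100.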
下面对单板管理系统中的各个组成部分进行说明:
(1)、存储器110。
在本申请实施例中存储器110用于存储带外管理所需的服务器单板100的管理信息,本申请实施例并不限定该存储器110的类型,该存储器110可以为带电可擦可编程只读存储器(electrically erasable programmable read only memory,EEPROM),还可以为其他非易失性内存。存储器110作为现场可更换单元说明(field replaceable unit description,FRUD),FRUD中存储了管理该服务器单板100所需的管理信息。管理信息包括服务器单板100的属性信息、需要管理的器件的信息以及拓扑信息,告警信息等。
管理信息包括服务器单板100的属性信息、服务器单板100的拓扑信息、器件的属性信息。
其中,服务器单板100的属性信息用于描述服务器单板100的硬件信息,服务端单板的属性信息包括但不限于:单板类型、单板标识(identification,ID)、单板的印制电路板(printed circuit board,PCB)版本号、单板的物料清单(bill of material,BOM)版本号。
基板管理控制器200获取服务器单板100的属性信息后,能够了解该服务器单板100的基本信息。
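The topology information of the server board 100 describes the connection relationships of the devices on the server board 100. It may include an in-band management topology and a management bus topology.
The in-band management topology, also called the service bus topology, describes the topology of the service plane of the server board 100, that is, the connection relationships among the devices that carry the server's services (processor, hard disk, memory). These include but are not limited to: the connection relationships of devices on the base board, between the base board and components, and between components. The in-band management topology includes but is not limited to: component signals, information about processors (such as port number, type, quantity, and bit width), information about memory (such as port number, type, quantity, and bit width), information about hard disks (such as hard disk interface, type, quantity, and bit width), and how processors, memory, and hard disks are connected. Any information about a device involved in the service plane can be recorded in the in-band management topology.
The management bus topology, also called the out-of-band management topology, describes the topology of the devices involved in out-of-band management of the server board 100, that is, the connection relationships among the devices involved in out-of-band management (temperature sensor, voltage sensor, ADC, power supply, fan). The management bus topology includes but is not limited to: information about the devices attached under the management bus 300 (such as the device manager 120 or second-type devices) and information about the devices connected to the device manager 120 (that is, first-type devices). Any information about a device involved in out-of-band management can be recorded in the management bus topology.
By obtaining the topology information of the server board 100, the baseboard management controller 200 learns the connection relationships of the devices on the server board 100. Based on this topology information, it can determine which device's working environment is described by the working information subsequently read from first-type devices through the device manager 120 (such as temperature, voltage, and whether the power supply is working) and by the working information read directly from second-type devices. It can then judge whether that device's working environment meets requirements, whether the device is faulty, and whether an alarm is needed.
Attribute information of devices: this covers the attribute information of first-type devices and the attribute information of second-type devices. In terms of device type, devices include chips (such as processor chips), connectors, buses, and slots (a slot is where an input/output device is inserted, such as a PCIe slot or a hard disk slot).
For the information included in the management information, see Table 3. It should be noted that the description above and Table 3 show only part of the management information. The embodiments of this application do not limit how the management information is divided or what it contains; any information required for out-of-band management may be treated as management information and stored in the memory 110.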
Table 3


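In the embodiments of this application, the management information required for out-of-band management is stored in the memory 110, and the address of the memory 110 may be a preset address. When the baseboard management controller 200 is connected to the memory 110 through the management bus 300, it can interact with the memory 110 at that address and read the management information from it in order to carry out subsequent out-of-band management. The baseboard management controller 200 can thus obtain the management information simply and quickly, which simplifies the out-of-band management procedure.
In the embodiments of this application, the out-of-band management interface of the server board 100 is unified into one root management bus. A memory 110 with a fixed address (such as an EEPROM) can be attached under the root management link as the FRUD, and the FRUD describes the management information of the server board 100. The baseboard management controller 200 can automatically load the board's management configuration by reading the information in the FRUD.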
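As an illustration only, the management information described above could be modelled as follows. The application does not define a storage layout for the FRUD, so every class name, field name, and sample value here is a hypothetical choice made for this sketch; only the grouping into board attributes and device information, and the split into first-type and second-type devices, comes from the description.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BoardAttributes:
    """Board attribute information (field names are assumptions)."""
    board_type: str
    board_id: int
    pcb_version: str
    bom_version: str


@dataclass
class DeviceInfo:
    """One managed device as it might be described in the FRUD."""
    name: str
    kind: str                  # e.g. "temperature_sensor", "vr_controller"
    via_device_manager: bool   # True: first-type device, False: second-type


@dataclass
class ManagementInfo:
    """Management information the BMC reads from the memory (FRUD)."""
    attributes: BoardAttributes
    devices: List[DeviceInfo] = field(default_factory=list)

    def first_type(self) -> List[DeviceInfo]:
        # Devices reached through the device manager.
        return [d for d in self.devices if d.via_device_manager]

    def second_type(self) -> List[DeviceInfo]:
        # Devices attached directly under the management bus.
        return [d for d in self.devices if not d.via_device_manager]


info = ManagementInfo(
    attributes=BoardAttributes("BCU", 1, "A", "001"),
    devices=[
        DeviceInfo("temp0", "temperature_sensor", True),
        DeviceInfo("vr0", "vr_controller", False),
    ],
)
```

With such a structure, the baseboard management controller could decide from the topology alone which devices to query through the device manager and which to address directly on the bus.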
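(2) Device manager 120.
The device manager 120 may also be called a satellite manager centre (SMC). After the SMC collects the working information of the first-type devices on the board, it reports the information to the baseboard management controller 200 through the root management bus interface. Reporting may use command words, with one type of working information corresponding to one command word.
On the server board 100, the SMC serves as the board-level management center. It collects the working information of the first-type devices on the board, such as sensor information and alarm information, and handles the upgrade requirements of the server board 100 and the management requirements of other devices on the board. The SMC communicates with the baseboard management controller 200 through the root management bus interface using command words.
In the embodiments of this application, the baseboard management controller 200 does not need to connect to each device that requires out-of-band management. Instead, it obtains the working information of these devices through the device manager 120 and from that determines the working environment of the main devices in the server. The baseboard management controller 200 only needs to connect to the device manager 120, which greatly simplifies the connection between the baseboard management controller 200 and the server board 100 and enables intelligent out-of-band management of the server board 100. This connection scheme also suits different server boards 100.
The embodiments of this application do not limit the specific structure of the device manager 120. For example, the device manager 120 may be a complex programmable logic device (CPLD) or a microcontroller unit (MCU). After collecting the working information of the devices connected to it, the device manager 120 can report the collected information to the baseboard management controller 200 through the management bus 300.
The embodiments of this application do not limit how the device manager 120 and the baseboard management controller 200 interact. For example, they may interact using command words, with one type of working information corresponding to one command word. The command word format can be shared by different server boards 100, so that the baseboard management controller 200 can interact in the same way with the device managers 120 on different server boards 100, which avoids unnecessary adaptation work.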
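A command word design is described below with reference to FIG. 3. The command word format defined between the device manager 120 and the baseboard management controller 200 has two main parts: an operation code (OP code) and a device parameter (parameter). The embodiments of this application do not limit the size of the command word. In one possible implementation, the command word occupies 4 bytes (32 bits), of which the device parameter occupies 1 byte and the operation code occupies 3 bytes.
The operation code describes the operation to be performed on a device. In the embodiments of this application, operations include reading a device's working information and issuing a command to a device (which can be understood as writing information to the device). The device parameter indicates the device to operate on; it may be the device's number or identifier.
The operation code includes four fields: a function field, a command field, a read-count field (denoted MS in FIG. 3), and a read/write flag field (denoted RW in FIG. 3).
The function field indicates the server board 100 that the command word targets. When the board management system includes multiple boards, the function field cannot be omitted; when the system has only one board, the field may be set to a default value or a null value. The function field may occupy 6 bits.
When there are different types of server board 100, different numbers can indicate the different types. In FIG. 3, for example, 1 indicates an expansion component (a component used in the server to add interfaces or slots), 2 indicates a storage component (a component used in the server to connect hard disks and provide data storage), 3 indicates the base board, and 4 indicates a memory expansion component (a component that provides memory in the server). 0 denotes a general command, that is, a command word that applies to all server boards 100.
The command field describes the type of operation, for example which kind of working information to read (such as temperature, voltage, whether the power supply is normal, or fault or alarm information). Command field values must be defined in advance so that different operations can be distinguished. The command field may occupy 16 bits.
The read-count field distinguishes whether the operation is a multiple read or a single read, that is, whether one operation reads the working information of multiple devices or of one device. For example, a value of 0 indicates a multiple read and a value of 1 indicates a single read. The read-count field may occupy 1 bit.
The read/write flag field distinguishes whether the operation is a read or a write. For example, a value of 0 indicates a read operation and a value of 1 indicates a write operation. The read/write flag field may occupy 1 bit.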
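The field widths above (6 + 16 + 1 + 1 bits of operation code, plus an 8-bit device parameter) fill a 32-bit word exactly. The sketch below packs and unpacks such a word. The text here does not state the order of the fields within the word (that is shown in FIG. 3), so this sketch assumes the operation code occupies the upper three bytes with the function field most significant, the device parameter occupies the lowest byte, and the word is sent big-endian. The names `CommandWord` and `BoardFunction` are hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class BoardFunction(IntEnum):
    """Function-field values listed in the description for FIG. 3."""
    GENERAL = 0
    EXPANSION = 1
    STORAGE = 2
    BASE_BOARD = 3
    MEMORY_EXPANSION = 4


@dataclass(frozen=True)
class CommandWord:
    function: int      # 6 bits: which board the command targets
    command: int       # 16 bits: predefined operation type
    single_read: bool  # MS bit: 0 = multiple read, 1 = single read
    write: bool        # RW bit: 0 = read, 1 = write
    parameter: int     # 8 bits: number or identifier of the device

    def pack(self) -> bytes:
        # Assumed layout: [function | command | MS | RW | parameter].
        if not 0 <= self.function < (1 << 6):
            raise ValueError("function must fit in 6 bits")
        if not 0 <= self.command < (1 << 16):
            raise ValueError("command must fit in 16 bits")
        if not 0 <= self.parameter < (1 << 8):
            raise ValueError("parameter must fit in 8 bits")
        opcode = (
            (self.function << 18)
            | (self.command << 2)
            | (int(self.single_read) << 1)
            | int(self.write)
        )
        return ((opcode << 8) | self.parameter).to_bytes(4, "big")

    @classmethod
    def unpack(cls, raw: bytes) -> "CommandWord":
        if len(raw) != 4:
            raise ValueError("a command word is 4 bytes")
        word = int.from_bytes(raw, "big")
        opcode, parameter = word >> 8, word & 0xFF
        return cls(
            function=opcode >> 18,
            command=(opcode >> 2) & 0xFFFF,
            single_read=bool((opcode >> 1) & 1),
            write=bool(opcode & 1),
            parameter=parameter,
        )


# Single read of hypothetical command 0x0001 from device 5 on the base board.
word = CommandWord(BoardFunction.BASE_BOARD, 0x0001, True, False, 5)
raw = word.pack()
```

Because the format is fixed and shared, the same packing routine would serve every board type; only the function-field value and the predefined command values change.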
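When the baseboard management controller 200 needs to read a device's working information, the interaction between the device manager 120 and the baseboard management controller 200 is as follows: the baseboard management controller 200 sends a read request to the device manager 120, and the device manager 120 returns a read response to the baseboard management controller 200.
FIG. 4A is a schematic diagram of the format of a read request provided by an embodiment of this application, and FIG. 4B is a schematic diagram of the format of a read response provided by an embodiment of this application. In FIG. 4A and FIG. 4B, the first row gives the name of each field and the second row gives the number of bits each field occupies.
When the baseboard management controller 200 needs to write information to a device, that is, to issue a command to the device (for example, to start, stop, or upgrade it), the interaction between the device manager 120 and the baseboard management controller 200 is as follows: the baseboard management controller 200 sends a write request to the device manager 120, and the write request carries the command (such as a control command) or the data (such as an upgrade file) to be written.
FIG. 4C is a schematic diagram of the format of a write request provided by an embodiment of this application. In FIG. 4C, the first row gives the name of each field and the second row gives the number of bits each field occupies.
For the meaning of each field in FIG. 4A to FIG. 4C, see Table 4.
Table 4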

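It should be noted that the fields in FIG. 4A to FIG. 4C are only examples. In practice, fields may be added to or removed from the read request, write request, and read response as required.
In the embodiments of this application, besides exchanging the working information of first-type devices with the device manager 120, the baseboard management controller 200 can also issue control commands to first-type devices through the device manager 120 in order to control their working state. For example, a control command may stop or start one or more first-type devices. The control command can be carried as data in the data field shown in FIG. 4C. On receiving the write request, the device manager 120 identifies the control command in it and controls the corresponding first-type device accordingly, for example by stopping or starting it.
The baseboard management controller 200 can also issue an upgrade command to a first-type device through the device manager 120 to instruct the device to upgrade. The upgrade file that the first-type device needs can be carried as data in the data field shown in FIG. 4C. On receiving the write request, the device manager 120 identifies the upgrade file in it, sends the upgrade file to the corresponding first-type device, and instructs that device to upgrade.
The baseboard management controller 200 can also instruct the device manager 120 itself to upgrade. The upgrade file that the device manager 120 needs can be carried as data in the data field shown in FIG. 4C. On receiving the write request, the device manager 120 identifies the upgrade file in it and upgrades itself using that file.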
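(3) Baseboard management controller 200.
As the description of the memory 110 and the device manager 120 shows, the baseboard management controller 200 can read the management information from the memory 110 through the management bus 300 and can perform out-of-band management of first-type devices by interacting with the device manager 120.
If a device on the server board 100 (that is, a second-type device) cannot be managed through the SMC, it can be attached directly under the root management bus that leads out of the baseboard management controller 200. A description in the FRUD allows the baseboard management controller 200 to load the management features of such a device automatically.
The embodiments of this application allow second-type devices to exist on the server board 100. A second-type device can be connected directly to the baseboard management controller 200 through the management bus 300, and the baseboard management controller 200 can interact with it directly over the management bus 300 to obtain its working information and perform out-of-band management of it.
Based on the management information, the baseboard management controller 200 can determine which second-type devices are deployed on the server board 100, that is, learn about the second-type devices attached directly under the management bus 300. Based on the management information, the baseboard management controller 200 can preload the management driver for each second-type device (the software program needed to manage that device) in order to manage it.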
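One way to picture the preloading of management drivers is a registry keyed by device kind, consulted for each second-type device found in the management information. This is a sketch under assumptions: the application does not describe how drivers are organised, and the registry, the `register` decorator, the device kinds, and the bus addresses below are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical registry: device kind -> factory that binds a driver
# to the device's address on the management bus.
DRIVER_REGISTRY: Dict[str, Callable[[int], str]] = {}


def register(kind: str):
    """Decorator that records a driver factory for one device kind."""
    def wrap(factory: Callable[[int], str]) -> Callable[[int], str]:
        DRIVER_REGISTRY[kind] = factory
        return factory
    return wrap


@register("vr_controller")
def vr_controller_driver(address: int) -> str:
    # A real driver would return an object that talks to the device;
    # a descriptive string keeps this sketch self-contained.
    return f"vr_controller driver bound at 0x{address:02x}"


def load_drivers(second_type_devices: List[dict]) -> List[str]:
    """Bind a driver for every second-type device with a known kind."""
    loaded = []
    for dev in second_type_devices:
        factory = DRIVER_REGISTRY.get(dev["kind"])
        if factory is None:
            continue  # unknown kind: nothing to preload in this sketch
        loaded.append(factory(dev["address"]))
    return loaded


# Entries as they might be derived from the topology in the FRUD.
devices = [
    {"kind": "vr_controller", "address": 0x40},
    {"kind": "unknown_part", "address": 0x41},
]
loaded = load_drivers(devices)
```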
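In the embodiments of this application, the baseboard management controller 200 may be deployed on a board to form a BMC management board (that is, the expansion board mentioned above). The BMC management board can serve as the management center of the server and implements out-of-band management of the server. Its appearance may be as shown in FIG. 5A. The BMC management board provides external management interfaces, including a debug serial port, a unit identification (UID) indicator, a management network port, a video graphics array (VGA) interface, and a universal serial bus (USB) interface. For these external management interfaces, see FIG. 5B.
Table 5 gives the function definitions and descriptions of the external management interfaces of the BMC management board.
Table 5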
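Through a 4C+ connector, the BMC management board provides internally the management interfaces needed for board management, including the out-of-band management bus interface. If the management bus is an I2C bus, the out-of-band management bus interface is an I2C interface.
The BMC management board may also provide other management interfaces, and the embodiments of this application do not limit their type. The other management interfaces include some or all of the following: a joint test action group (JTAG) interface, an SPI interface, a network controller sideband interface (NCSI), a platform environment control interface (PECI), a debug serial port, a UID button indicator, a management network port, and a VGA interface. These types are only examples; the embodiments of this application do not limit the number or type of the other management interfaces.
The BMC management board also provides the low pin count (LPC) interface, USB interface, and PECI interface needed for in-band management. The power supply, clock circuit, miscellaneous signal circuits, and so on that the baseboard management controller 200 itself needs are also deployed on the BMC management board. Table 6 below gives the pin definitions of the management interfaces that the BMC management board provides internally:
Table 6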





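Here, Power/GND indicates a power signal or a ground signal; USB3 means that USB 3.0 is supported; input indicates signal input and output indicates signal output. VGA refers to the VGA signal; in the table above, the VGA signal comprises three signals: red, green, and blue. HCSL stands for high-speed current steering logic. These signal definitions are only examples; in practice, different signal definitions may be set as needed.
The board management system provided by the embodiments of this application is described below, taking as examples the structures of the board management systems of three different types of server board 100.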
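Type 1: the server board 100 is a base board (Basic Computer Unit, BCU).
FIG. 6A shows a board management system provided by an embodiment of this application, which can be used for out-of-band management of the computing processing unit. The BMC is connected to the EEPROM and to the CPLD of the BCU through one I2C bus. The EEPROM implements the function of the memory 110 in the foregoing embodiments and stores the management information of the computing processing unit, such as its attribute information. The CPLD implements the function of the SMC in the foregoing embodiments, for example managing and controlling devices and handling upgrade commands or control commands. The CPLD is connected to devices such as an ADC, a temperature sensor, a clock circuit, and flash memory. In FIG. 6A, the CPLD can obtain three signals from some devices on the server board 100 through a first conversion chip: a power good (PG) signal, a present signal, and a fault signal.
The power good signal indicates whether a power supply is connected. The present signal can indicate whether a device is attached to a connector. The fault signal can indicate whether a device is faulty; the device may be, for example, a CPU or a power controller. For example, the CPU can be wired directly to the first conversion chip (for example, a 9555 chip) through a low-speed signal line to provide a CPU alarm signal, which indicates that the CPU has encountered an error. The first conversion chip is used to increase the number of devices that can be connected.
The CPLD can obtain working information such as the working information of the ADC (that is, the digital signal into which the ADC converts the voltage signal), temperature, the CPU alarm signal, and power supply information. The CPLD can also load the frequency of the clock circuit and upgrade the flash memory.
A second conversion chip (for example, a 9545 chip) can provide multiple I2C interfaces. Multiple voltage regulator controllers on the computing processing unit are attached directly under the I2C bus after expansion through the second conversion chip. The topology information of the computing processing unit stored in the EEPROM describes this direct attachment of the voltage regulator controllers to the I2C bus. The voltage regulator controllers supply power to the CPU.
The BMC can manage the voltage regulator controllers directly. The CPLD interacts with the BMC over the I2C bus using command words to pass on the working information of the devices connected to the CPLD; it can also accept control from the BMC to perform operations such as upgrading and loading on some devices. The BMC can also upgrade the CPLD over the I2C bus.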
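Type 2: the server board 100 is an IO component (input output unit, IOU).
FIG. 6B shows a board management system provided by an embodiment of this application, which can be used for out-of-band management of the IO expansion unit. The BMC is connected to the EEPROM and to the MCU of the IOU through one I2C bus.
The EEPROM implements the function of the memory 110 in the foregoing embodiments and stores the management information of the IOU. The MCU implements the function of the SMC in the foregoing embodiments, for example managing and controlling devices and handling upgrade commands or control commands. The MCU is connected to devices such as a temperature sensor, a power supply, and PCIe slots. The MCU can obtain working information such as temperature and, through a first conversion chip, the PG signal and the present signal (the present signal can indicate whether a device is inserted in a connector).
The MCU interacts with the BMC over the I2C bus using command words to pass on the working information of the devices connected to the MCU. The BMC upgrades the MCU over the I2C bus. A standard PCIe card inserted in a PCIe slot is attached directly under the I2C bus through a second conversion chip, and the topology information of the IOU stored in the EEPROM describes this direct attachment. The BMC can manage the standard PCIe card directly.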
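Type 3: the server board 100 is a storage component (Storage Unit, STU).
FIG. 6C shows a board management system provided by an embodiment of this application, which can be used for out-of-band management of the storage expansion unit. The BMC is connected to the EEPROM and to the CPLD of the STU through one I2C bus. The EEPROM implements the function of the memory 110 in the foregoing embodiments and stores the management information of the storage expansion unit, such as its attribute information. The CPLD implements the function of the SMC in the foregoing embodiments, for example managing and controlling devices and handling upgrade commands or control commands. The CPLD is connected to devices such as a temperature sensor, an ADC, and hard disks. The CPLD can obtain the voltage, the temperature, and whether a hard disk is attached, and can also obtain working information such as the PG signal, the present signal, and the CPU alarm signal through a fifth conversion chip. The CPLD can also manage the hard disks, obtaining the working information of each hard disk through a sixth conversion chip. The CPLD interacts with the BMC over the I2C bus using command words to pass on the working information of the devices connected to the CPLD. The BMC can obtain the working information of each hard disk on the board using command words. The BMC can also upgrade the CPLD over the I2C bus.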
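Based on the board management system described above, the board management method provided by the embodiments of this application is described below with reference to FIG. 7. The method includes the following steps:
Step 701: After starting, the baseboard management controller 200 scans, through the management bus 300, for the memory 110 at the preset address under the management bus 300.
After the server is powered on, the baseboard management controller 200 starts and can find the memory 110 at the preset address among the devices attached under the management bus 300.
Step 702: After finding the memory 110, the baseboard management controller 200 reads the management information of the server board 100 from the memory 110 through the management bus 300. For what the management information includes, see the foregoing description. By reading the management information, the baseboard management controller 200 learns the hardware information of the server board 100, its topology information, and the attribute information of its devices.
Step 703: After the server board 100 is powered on, the device manager 120 on the server board 100 collects the working information of the first-type devices.
After the server board 100 is powered on, the device manager 120 can interact with the first-type devices connected to it to obtain their working information, for example the temperature detected by a temperature sensor, the voltage detected by an ADC, power good information from a voltage regulator controller, and device fault information (such as CPU alarm information).
Step 704: The baseboard management controller 200 obtains the working information of the first-type devices from the device manager 120. If the server board 100 includes second-type devices, the baseboard management controller 200 can also obtain their working information from them through the management bus 300.
In step 704, the baseboard management controller 200 can collect the working information of the server's first-type devices through the device manager 120 and can obtain the working information of second-type devices by interacting with them directly. The baseboard management controller 200 does not need a connection to each device on the server board 100, so the way it obtains the working information of the server's devices is relatively simple.
Step 705: The baseboard management controller 200 manages the server board 100 based on the management information and the obtained working information of the devices (such as the working information of first-type devices and of second-type devices).
Based on the management information, the baseboard management controller 200 learns the connection relationships of the devices on the server board 100; based on the working information of the devices, it can determine the working environment of some main devices on the server board 100 (such as temperature, voltage, whether power is supplied, and whether a fault has occurred). On this basis, the baseboard management controller 200 can decide whether to control devices on the server board 100, for example by turning on a fan or restarting a power supply. The baseboard management controller 200 can send a control command to the device manager 120 to control a first-type device, or it can issue a control command directly to a second-type device through the management bus 300. For how control commands are issued, see the foregoing description; details are not repeated here.
Besides controlling devices, the baseboard management controller 200 can also upgrade them. For example, the baseboard management controller 200 can send an upgrade command to the device manager 120 to upgrade a first-type device, or to upgrade the device manager 120 itself. It can also issue an upgrade command directly to a second-type device through the management bus 300 to upgrade that device. For how upgrade commands are issued, see the foregoing description; details are not repeated here.
The baseboard management controller 200 can also decide whether to raise an alarm to the user, for example to report that a device has failed, that a temperature is high, or that the power supply has an error. In this way, the baseboard management controller 200 manages the server board 100 so that the board can work normally, and the user can learn the state of the server board 100 in time.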
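The flow of steps 701 to 705 can be sketched with a simulated board in place of real bus traffic. Everything concrete in this sketch is an assumption made for illustration: the `FakeBoard` class, the address 0x50, the dictionary layout of the management information, and the simple upper-limit rule used to decide on an alarm. The application itself leaves the alarm criteria open.

```python
class FakeBoard:
    """Stand-in for a server board reachable over one management bus."""

    FRUD_ADDRESS = 0x50  # hypothetical preset address of the memory

    def __init__(self):
        # Management information as it might be stored in the FRUD.
        self.frud = {
            "board_type": "BCU",
            "first_type": ["temp0"],   # reached via the device manager
            "second_type": ["vr0"],    # attached directly to the bus
            "limits": {"temp0": 85.0, "vr0": 1.2},
        }
        self.readings = {"temp0": 91.5, "vr0": 1.0}

    def scan(self, address: int) -> bool:
        return address == self.FRUD_ADDRESS

    def read_frud(self) -> dict:
        return dict(self.frud)

    def read_via_device_manager(self, name: str) -> float:
        return self.readings[name]

    def read_direct(self, name: str) -> float:
        return self.readings[name]


def manage(board: FakeBoard) -> list:
    """Run one management pass and return the devices needing an alarm."""
    # Step 701: look for the memory at the preset address.
    if not board.scan(FakeBoard.FRUD_ADDRESS):
        raise RuntimeError("no memory found at the preset address")
    # Step 702: read the management information.
    info = board.read_frud()
    # Steps 703 and 704: gather working information by both routes.
    work = {n: board.read_via_device_manager(n) for n in info["first_type"]}
    work.update({n: board.read_direct(n) for n in info["second_type"]})
    # Step 705: compare against limits (an assumed, simplified rule).
    return sorted(n for n, v in work.items() if v > info["limits"][n])


alarms = manage(FakeBoard())
```

In this simulated pass the temperature reading exceeds its assumed limit while the voltage does not, so only the temperature sensor is flagged.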
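FIG. 8 shows a management system for a BCU module provided by an embodiment of this application. The management system ensures the management features of the BCU module.
The management features of the BCU module include the management interfaces that the BCU module provides externally and the management features of the management module with respect to the BCU module.
The low-speed signals on the high-speed connectors leaving the BCU module include management signals, which can be used for out-of-band management of the Riser cards attached to the BCU module. The advantage of this design is that the Riser cards do not need separate low-speed management signal lines.
The management module manages the BCU module both out of band and in band. The Tianchi management architecture recommends that management features local to the BCU module be terminated directly on the BCU module. For example, the configuration of the frequency synthesizer on the BCU module is loaded directly on the BCU module and does not need to be managed separately by the management module.
As shown in FIG. 8, the BMC on the management module provides one intelligent platform management bus (IPMB) interface to the CPU of the BCU module, as the Intelligent Platform Management Interface (IPMI) bus channel;
the BMC on the management module provides one LPC interface to the CPU of the BCU module, as the BT bus channel;
the BMC on the management module provides one I2C interface to the CPLD and the FRUD of the BCU module. Through this I2C interface the BMC performs basic out-of-band management of the BCU module, including reading information from the FRUD and, with the interface serving as the SMC bus channel, accessing the CPLD registers of the BCU module;
the CPLD chip on the management module provides two hisport interfaces to the CPLD of the BCU module. One of them, hisport0, serves as the logic register exchange channel between the BCU module and the management module; the other serves as a hisport over I2C interface, which the BCU module uses to extend management interfaces externally;
the CPLD chip on the BCU module provides multiple I2C interfaces for reading and configuring the ADC chip, the clock frequency synthesizer chip, and the temperature sensor chip of the BCU module. That is, the CPLD on the BCU module reads basic information such as temperature and voltage and reports it to the BMC chip through the unified SMC interface, so that independent management features terminate inside the module.
The CPLD chip on the BCU module provides multiple I2C interfaces to the UBC high-speed connectors, as management channels for external expansion modules. These externally provided management I2C interfaces derive from the hisport over I2C feature of the management module. The management channel can connect to out-of-band management devices on the Riser, such as the FRU chip and temperature sensors, to implement out-of-band management of the component.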
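Based on the same inventive concept as the method embodiment, an embodiment of this application further provides a board management apparatus, which performs the method performed by the baseboard management controller in the method embodiment shown in FIG. 7. For the related features, see the method embodiment above; details are not repeated here. As shown in FIG. 9, the board management apparatus 900 includes an obtaining unit 901 and a management unit 902.
The obtaining unit 901 is configured to obtain the management information from the memory through the management bus.
The management unit 902 is configured to manage the computing device board, based on the management information, by interacting with the device manager through the management bus.
In one possible implementation, the computing device board includes first-type devices, the device manager is connected to the first-type devices, and the obtaining unit 901 can obtain the working information of the first-type devices from the device manager through the management bus.
In one possible implementation, the computing device board includes second-type devices, the second-type devices are connected to the baseboard management controller through the management bus, and the obtaining unit 901 can obtain the working information of the second-type devices from the second-type devices through the management bus.
In one possible implementation, the management information includes some or all of the following: attribute information of the computing device board, topology information of the computing device board, attribute information of the first-type devices, and attribute information of the second-type devices.
In one possible implementation, when the management unit 902 interacts with the device manager through the management bus, it may do so using command words.
In one possible implementation, the apparatus further includes an upgrade unit 903. The upgrade unit 903 can pass an upgrade file for a first-type device to the device manager to instruct that the first-type device be upgraded. It can also pass an upgrade file for the device manager to the device manager to instruct that the device manager be upgraded.
In one possible implementation, the management bus is an I2C bus or an SPI bus.
It should be noted that the division into units in the embodiments of this application is illustrative and is only a division by logical function; other divisions are possible in an actual implementation. The functional units in the embodiments of this application may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one module. The integrated unit may be implemented in hardware or as a software functional module.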
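This application further provides the computing device 1000 shown in FIG. 10. The computing device 1000 includes a computer board and a baseboard management controller 1500. The computer board may include a bus 1100, a processor 1200, a communication interface 1300, and a memory 1400. The processor 1200, the memory 1400, and the communication interface 1300 communicate through the bus 1100.
The processor 1200 may be a central processing unit (CPU), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), an artificial intelligence (AI) chip, a system on chip (SoC), a complex programmable logic device (CPLD), a graphics processing unit (GPU), or the like.
The memory 1400 may include volatile memory, such as random access memory (RAM). The memory 1400 may also include non-volatile memory, such as read-only memory (ROM), flash memory, an HDD, or an SSD. The memory 1400 may also include the memory 110 mentioned above, that is, it may store the management information. The memory 1400 may also store other software modules needed to run processes, such as an operating system. The operating system may be Linux™, UNIX™, Windows™, or the like.
The baseboard management controller 1500 includes a processor 1510 and a memory 1520. The memory 1520 stores computer program code, and the processor 1510 executes the computer program code to perform the method described with reference to FIG. 7. Alternatively, the baseboard management controller 1500 may include only the processor 1510, with the computer program code burned into the processor 1510, and the processor 1510 can perform the method described with reference to FIG. 7.
The descriptions of the procedures corresponding to the drawings above each have their own emphasis. For parts not detailed in one procedure, see the related descriptions of the other procedures.
The foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When software is used, the embodiments may be implemented in whole or in part as a computer program product. The computer program product includes computer program instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions described with reference to FIG. 7 in the embodiments of this application are produced in whole or in part.
The foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any other combination. When software is used, the foregoing embodiments may be implemented in whole or in part as a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded or executed on a computer, the procedures or functions described in the embodiments of this application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device such as a server or data center that contains one or more sets of usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium. The semiconductor medium may be a solid state drive (SSD).
Clearly, a person skilled in the art can make various changes and variations to this application without departing from its scope. If these modifications and variations fall within the scope of the claims of this application and their technical equivalents, this application is intended to cover them as well.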

Claims (25)

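  1. A board management system, wherein the system comprises a baseboard management controller and a computing device board;
    the computing device board comprises a memory and a device manager, and the memory records management information of the computing device board; the memory and the device manager are connected to the baseboard management controller through a management bus;
    the baseboard management controller is configured to obtain the management information from the memory and, based on the management information, manage the computing device board by interacting with the device manager.
  2. The system according to claim 1, wherein the computing device board further comprises a first-type device, the first-type device is connected to the device manager, and the baseboard management controller is configured to:
    obtain working information of the first-type device from the device manager through the management bus.
  3. The system according to claim 1 or 2, wherein the computing device board further comprises a second-type device, and the second-type device is connected to the baseboard management controller through the management bus;
    the baseboard management controller is further configured to obtain working information of the second-type device through the management bus.
  4. The system according to any one of claims 1 to 3, wherein the management information comprises some or all of the following:
    attribute information of the computing device board, topology information of the computing device board, attribute information of the first-type device, and attribute information of the second-type device.
  5. The system according to any one of claims 1 to 4, wherein the baseboard management controller and the device manager interact using command words.
  6. The system according to any one of claims 1 to 5, wherein the baseboard management controller is further configured to pass an upgrade file of the first-type device to the device manager to instruct that the first-type device be upgraded;
    the device manager is configured to obtain the upgrade file of the first-type device and upgrade the first-type device using the upgrade file of the first-type device.
  7. The system according to any one of claims 1 to 6, wherein the memory is an electrically erasable programmable read-only memory (EEPROM).
  8. The system according to any one of claims 1 to 7, wherein the device manager is a complex programmable logic device (CPLD) or a microcontroller unit (MCU).
  9. The system according to any one of claims 1 to 8, wherein the management bus is an inter-integrated circuit (I2C) bus or a serial peripheral interface (SPI) bus.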
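  10. A board management method, wherein the method is used to manage a computing device board, the computing device board comprises a memory and a device manager, and the memory records management information of the computing device board; the method comprises:
    obtaining, by the baseboard management controller, the management information from the memory through a management bus;
    managing, by the baseboard management controller, the computing device board based on the management information by interacting with the device manager through the management bus.
  11. The method according to claim 10, wherein the computing device board comprises a first-type device, and the method comprises:
    obtaining, by the baseboard management controller, working information of the first-type device from the device manager through the management bus.
  12. The method according to claim 10 or 11, wherein the computing device board comprises a second-type device, and the method comprises:
    obtaining, by the baseboard management controller, working information of the second-type device from the second-type device through the management bus.
  13. The method according to any one of claims 10 to 12, wherein the management information comprises some or all of the following:
    attribute information of the computing device board, topology information of the computing device board, attribute information of the first-type device, and attribute information of the second-type device.
  14. The method according to any one of claims 10 to 12, wherein the interacting, by the baseboard management controller, with the device manager through the management bus comprises:
    interacting, by the baseboard management controller, with the device manager through the management bus using command words.
  15. The method according to any one of claims 10 to 14, wherein the method further comprises:
    passing, by the baseboard management controller, an upgrade file of the first-type device to the device manager to instruct that the first-type device be upgraded.
  16. The method according to any one of claims 10 to 15, wherein the management bus is an I2C bus or a serial peripheral interface (SPI) bus.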
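  17. A board management apparatus, wherein the apparatus is used to manage a computing device board, the computing device board comprises a memory and a device manager, and the memory records management information of the computing device board; the apparatus comprises an obtaining unit and a management unit;
    the obtaining unit is configured to obtain the management information from the memory through a management bus;
    the management unit is configured to manage the computing device board based on the management information by interacting with the device manager through the management bus.
  18. The apparatus according to claim 17, wherein the computing device board comprises a first-type device, and the obtaining unit is further configured to:
    obtain working information of the first-type device from the device manager through the management bus.
  19. The apparatus according to claim 17 or 18, wherein the computing device board comprises a second-type device, and the obtaining unit is further configured to:
    obtain working information of the second-type device from the second-type device through the management bus.
  20. The apparatus according to any one of claims 17 to 19, wherein the management information comprises some or all of the following:
    attribute information of the computing device board, topology information of the computing device board, attribute information of the first-type device, and attribute information of the second-type device.
  21. The apparatus according to any one of claims 17 to 19, wherein, when interacting with the device manager through the management bus, the management unit is configured to:
    interact with the device manager through the management bus using command words.
  22. The apparatus according to any one of claims 17 to 21, wherein the apparatus further comprises an upgrade unit;
    the upgrade unit is configured to pass an upgrade file of the first-type device to the device manager to instruct that the first-type device be upgraded.
  23. The apparatus according to any one of claims 17 to 22, wherein the management bus is an I2C bus or a serial peripheral interface (SPI) bus.
  24. A baseboard management controller, wherein the baseboard management controller comprises a processor and a memory, and the processor is configured to invoke program instructions in the memory to perform the method according to any one of claims 10 to 16.
  25. A computing device, wherein the computing device comprises a computing device board and a baseboard management controller, and the baseboard management controller is configured to perform the method according to any one of claims 10 to 16.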
PCT/CN2023/078408 2022-02-28 2023-02-27 Single-board management system, method and apparatus, and device Ceased WO2023160699A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP23759320.7A EP4474996A4 (en) 2022-02-28 2023-02-27 Single-board management system, method and apparatus, and device
US18/816,341 US20240419618A1 (en) 2022-02-28 2024-08-27 Board management system, method, and apparatus, and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210188470.X 2022-02-28
CN202210188470.XA CN116701094A (zh) 2022-02-28 2022-02-28 Single-board management system, method and apparatus, and device

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/816,341 Continuation US20240419618A1 (en) 2022-02-28 2024-08-27 Board management system, method, and apparatus, and device

Publications (1)

Publication Number Publication Date
WO2023160699A1 true WO2023160699A1 (zh) 2023-08-31

Family

ID=87764884

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/078408 Ceased WO2023160699A1 (zh) 2022-02-28 2023-02-27 Single-board management system, method and apparatus, and device

Country Status (4)

Country Link
US (1) US20240419618A1 (zh)
EP (1) EP4474996A4 (zh)
CN (2) CN119493703B (zh)
WO (1) WO2023160699A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119782020B (zh) * 2024-12-18 2025-11-11 联想长风科技(北京)有限公司 一种bmc获取服务器错误信息的方法
CN119718038B (zh) * 2024-12-30 2025-10-14 苏州元脑智能科技有限公司 一种设备供电系统及服务器
CN120596411B (zh) * 2025-07-31 2025-10-31 苏州元脑智能科技有限公司 针对图像处理器的数据采集方法、服务器、装置和介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6389464B1 (en) * 1997-06-27 2002-05-14 Cornet Technology, Inc. Device management system for managing standards-compliant and non-compliant network elements using standard management protocols and a universal site server which is configurable from remote locations via internet browser technology
CN1783799A (zh) * 2004-11-29 2006-06-07 中兴通讯股份有限公司 电信传输系统单元软硬件版本自动获取方法
CN102355365A (zh) * 2011-08-15 2012-02-15 中兴通讯股份有限公司 机框管理器的上电方法及机框管理器
CN102707976A (zh) * 2012-05-14 2012-10-03 中兴通讯股份有限公司 一种atca系统及其管理固件版本的方法
US20160291654A1 (en) * 2015-04-06 2016-10-06 Dell Products L.P. Systems and methods for thermal adaptation for virtual thermal inputs in a chassis infrastructure
CN106850286A (zh) * 2014-09-25 2017-06-13 烽火通信科技股份有限公司 单板上的基板管理控制器及网元管理盘的基板管理控制器

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9367419B2 (en) * 2013-01-08 2016-06-14 American Megatrends, Inc. Implementation on baseboard management controller of single out-of-band communication access to multiple managed computer nodes
US20140344431A1 (en) * 2013-05-16 2014-11-20 Aspeed Technology Inc. Baseboard management system architecture
US9946552B2 (en) * 2016-09-21 2018-04-17 American Megatrends, Inc. System and method for detecting redundant array of independent disks (RAID) controller state from baseboard management controller (BMC)
US10810085B2 (en) * 2017-06-30 2020-10-20 Western Digital Technologies, Inc. Baseboard management controllers for server chassis
US10761858B2 (en) * 2018-04-24 2020-09-01 Dell Products, L.P. System and method to manage a server configuration profile of an information handling system in a data center
CN109471770B (zh) * 2018-09-11 2021-09-03 华为技术有限公司 一种系统管理方法和装置
US11119876B2 (en) * 2018-10-09 2021-09-14 Super Micro Computer, Inc. Device and method for testing computer system
CN111611124B (zh) * 2019-02-22 2023-06-20 富联精密电子(天津)有限公司 监控设备分析方法、装置、计算机装置及存储介质
CN110825204A (zh) * 2019-11-06 2020-02-21 深圳宝龙达信创科技股份有限公司 电子设备的主板及电源信息管理方法
CN111459863B (zh) * 2020-03-08 2021-09-28 苏州浪潮智能科技有限公司 一种基于nvme-mi的机箱管理系统及方法
US11210252B1 (en) * 2020-06-09 2021-12-28 Hewlett Packard Enterprise Development Lp Directing control data between semiconductor packages
CN111984292A (zh) * 2020-08-14 2020-11-24 苏州浪潮智能科技有限公司 一种多板卡进行cpld固件升级的方法及系统
US11914492B2 (en) * 2020-10-13 2024-02-27 Dell Products, L.P. System and method for highly granular power/thermal control in information handling systems
CN113360165A (zh) * 2021-05-28 2021-09-07 浪潮电子信息产业股份有限公司 一种bios更新方法及装置

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4474996A4

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2025060492A1 (zh) * 2023-09-19 2025-03-27 华为技术有限公司 设备管理系统、计算机系统及数据传输方法
US12547340B2 (en) 2024-02-26 2026-02-10 Cisco Technology, Inc. Multi-type disk drive support using a common server storage backplane
CN120578612A (zh) * 2025-08-01 2025-09-02 湖南天冠电子信息技术有限公司 一种基于mcu的ipmb实现系统
CN120578612B (zh) * 2025-08-01 2025-10-03 湖南天冠电子信息技术有限公司 一种基于mcu的ipmb实现系统

Also Published As

Publication number Publication date
US20240419618A1 (en) 2024-12-19
CN116701094A (zh) 2023-09-05
CN119493703B (zh) 2025-11-21
EP4474996A1 (en) 2024-12-11
EP4474996A4 (en) 2025-06-04
CN119493703A (zh) 2025-02-21

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23759320

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2023759320

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2023759320

Country of ref document: EP

Effective date: 20240906

NENP Non-entry into the national phase

Ref country code: DE