WO2025067366A1 - Method and apparatus for managing accelerators, and device and storage medium - Google Patents
Method and apparatus for managing accelerators, and device and storage medium
- Publication number
- WO2025067366A1 (application PCT/CN2024/121559, CN2024121559W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- accelerators
- accelerator
- computing device
- interface information
- group
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
Definitions
- Example embodiments of the present disclosure generally relate to the field of computers, and more particularly to methods, devices, apparatuses, and computer-readable storage media for managing accelerators.
- processors need to be extremely efficient when handling complex data analysis, machine learning, and deep learning tasks, so it is expected that processors can be configured in a more efficient way.
- a method for managing an accelerator comprises: obtaining interface information associated with a group of accelerators in a computing device, the interface information at least indicating corresponding identification information and corresponding mapping information of the group of accelerators, wherein the mapping information of one accelerator indicates hardware resources for the accelerator in the computing device; using a driver of the computing device to initialize the group of accelerators according to the interface information; and using at least a portion of the initialized group of accelerators to perform a task.
- a device for managing accelerators includes: an interface information acquisition module configured to acquire interface information associated with a group of accelerators in a computing device, the interface information at least indicating corresponding identification information and corresponding mapping information of a group of accelerators, wherein the mapping information of one accelerator indicates hardware resources for the accelerator in the computing device; an accelerator initialization module configured to use a driver of the computing device to initialize the group of accelerators according to the interface information; and a task execution module configured to use at least a portion of the initialized group of accelerators to execute a task.
- in a third aspect of the present disclosure, an electronic device includes at least one processing unit and at least one memory. The at least one memory is coupled to the at least one processing unit and stores instructions for execution by the at least one processing unit. When the instructions are executed by the at least one processing unit, the device executes the method of the first aspect.
- a computer-readable storage medium is provided.
- a computer program is stored thereon, and the computer program can be executed by a processor to implement the method of the first aspect.
- FIG. 1 is a schematic diagram showing an example environment in which embodiments of the present disclosure can be implemented.
- FIG. 2 shows a schematic diagram of an example architecture of a processing unit including an accelerator according to some embodiments of the present disclosure.
- FIG. 3 is a schematic diagram showing an example mapping relationship between an accelerator and other hardware resources according to some embodiments of the present disclosure.
- FIG. 4 is a schematic diagram showing an example of interface information according to some embodiments of the present disclosure.
- FIG. 5 shows a schematic diagram of an example for managing an accelerator according to some embodiments of the present disclosure.
- FIG. 6 shows a flowchart of a method for managing an accelerator according to some embodiments of the present disclosure.
- FIG. 7 shows a block diagram of an apparatus for managing an accelerator according to some embodiments of the present disclosure.
- FIG. 8 shows a block diagram of a device capable of implementing various embodiments of the present disclosure.
- a prompt message is sent to the user to clearly prompt the user that the operation requested to be performed will require obtaining and using the user's personal information.
- the user can autonomously choose whether to provide personal information to software or hardware such as an electronic device, application, server, or storage medium that performs the operation of the technical solution of the present disclosure according to the prompt message.
- in response to receiving an active request from the user, the prompt information may be sent to the user in the form of a pop-up window, in which the prompt information may be presented in text form.
- the pop-up window may also carry a selection control for the user to select "agree" or "disagree" to provide personal information to the electronic device.
- the term "in response to" as used herein refers to a state in which a corresponding event occurs or a condition is satisfied. It will be understood that the timing of a subsequent action executed in response to the event or condition is not necessarily strongly related to the time when the event occurs or the condition is satisfied. For example, in some cases, the subsequent action may be executed immediately when the event occurs or the condition is satisfied; in other cases, the subsequent action may be executed some time after the event occurs or the condition is satisfied.
- a graphics processing unit can work more efficiently than a central processing unit (CPU) in graphics rendering.
- a tensor processing unit (TPU) designed for deep learning tasks can significantly increase the speed of machine learning model training and inference.
- accelerators supported by application platforms usually have programmable features, allowing developers to customize and optimize them as needed. Therefore, accelerators become an important solution to meet application needs.
- interface information associated with a group of accelerators in a computing device is obtained.
- the interface information at least indicates the corresponding identification information and corresponding mapping information of these accelerators.
- the mapping information of each accelerator indicates the hardware resources used for the accelerator in the computing device, such as an address translation unit, an interrupt service unit, etc.
- the driver of the computing device then initializes the group of accelerators according to the interface information, for example, enumerates these accelerators. Furthermore, in subsequent business processing, at least a part of this group of initialized accelerators is used to perform tasks according to actual needs.
- the accelerator is treated as a platform device: PCIe enumeration is not required, and the enumeration of the accelerator is instead completed by the driver. In this way, the configuration flexibility and utilization rate of the accelerator can be improved.
- a computing device 110 includes hardware for performing computing-related tasks.
- a computing device 110 is, for example, a personal computer, a server, a mobile device, and the like.
- the computing device 110 includes one or more accelerators 130, such as accelerator 130-1, accelerator 130-2, ... accelerator 130-N, etc. These accelerators may be individually or collectively referred to as accelerators 130. Such accelerators 130 may be utilized to increase the execution speed of a particular type of task (such as graphics processing, machine learning, etc.) or a particular type of data.
- the operating system 120 may be software running on the computing device 110 for allocating resources to applications running on the computing device 110. Furthermore, the operating system 120 may utilize the computing device 110 to perform tasks, and may also utilize the accelerator 130 to accelerate specific types of tasks or specific types of data.
- the architecture 200 may include multiple processing units, such as processing unit 210-1, processing unit 210-2, ... processing unit 210-X, etc. These processing units may be referred to as processing units 210 individually or collectively.
- Each processing unit 210 may include multiple processing cores, such as processing core 220-1, processing core 220-2, ... processing core 220-Y, etc. These processing cores may be individually or collectively referred to as processing cores 220. As a basic component of processing unit 210, processing core 220 may execute various instructions and coordinate system resources in a coordinated or independent manner. Processing core 220 may run multiple threads or processes simultaneously and share certain resources such as cache, registers, etc.
- Each processing unit 210 may also include a system memory management unit (SMMU) 230.
- the SMMU 230 may be used for address translation between the interface device and the bus, memory attribute translation, permission checking, etc.
- Each processing unit 210 may further include an accelerator 130.
- the accelerator 130 may include multiple acceleration units, such as an acceleration unit 250-1, an acceleration unit 250-2, ..., an acceleration unit 250-Z, etc. These acceleration units may be individually or collectively referred to as acceleration units 250.
- An accelerator 130 including multiple acceleration units 250 is also referred to as a first accelerator.
- an accelerator 130 can be regarded as an independent platform device, as a parent device of multiple acceleration units 250 attached to it. Therefore, unified acceleration unit resource management, device overall error handling and recovery and other operational support can be provided for the accelerator.
- one acceleration unit 250 can be regarded as an independent platform device, or a combination of multiple acceleration units 250 can be regarded as a platform device.
- one or more acceleration units 250 can be used as a sub-device attached to an accelerator 130.
- although each processing unit 210 is shown as including the same number of processing cores and acceleration units, these processing units 210 may each include any appropriate number of processing cores and acceleration units, or may include only an appropriate number of processing cores and one accelerator, and the present disclosure is not limited in this respect.
- FIG. 3 shows a schematic diagram of an example mapping relationship 300 between an accelerator and other hardware resources according to some embodiments of the present disclosure.
- the mapping relationship 300 generally involves the accelerator 130, the SMMU 230, and the interrupt controller 320.
- the SMMU 230 and the interrupt controller 320 can be regarded as hardware resources available to the accelerator 130 in the computing device 110.
- the accelerator 130 is mounted to the bus 310 through the SMMU 230.
- the SMMU 230 is used to achieve security isolation between the accelerators 130 or between the multiple acceleration units 250 contained in each accelerator 130.
- the SMMU 230 can assign a stream ID (SID) to each attached or mounted accelerator 130.
- the SID can be used to identify different accelerators 130, thereby achieving security isolation between accelerators.
- the SMMU 230 can also assign a substream ID (SSID) to each attached or mounted acceleration unit 250.
- the SSID can be used to identify different acceleration units 250, thereby achieving isolation of process address space.
- the SMMU 230 may convert the stream identifier into a device identifier (device ID) recognizable by the interrupt controller 320, and communicate with the interrupt controller 320 through the bus 310.
- the interrupt controller 320 is used to manage and distribute interrupt signals generated by hardware devices. For example, the accelerator 130 or the acceleration unit 250 triggers an interrupt signal, and such an interrupt signal is sent to the interrupt controller 320 via the SMMU 230.
- the interrupt controller 320 provides an interrupt translation service (ITS) so that the operating system 120 can determine which accelerator 130 or which acceleration unit 250 the interrupt signal is triggered by, and then execute the corresponding interrupt handler.
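- the stream-ID and interrupt-translation scheme described above can be sketched as follows. This is a minimal illustrative model, not the disclosed implementation: the classes `Smmu` and `InterruptTranslationService`, their method names, the sequential integer IDs, and the identity mapping from SID to device ID are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Smmu:
    """Toy SMMU that hands out stream and substream IDs (hypothetical model)."""
    next_sid: int = 0
    sids: dict = field(default_factory=dict)    # accelerator name -> SID
    ssids: dict = field(default_factory=dict)   # (SID, unit name) -> SSID

    def attach_accelerator(self, name: str) -> int:
        # One SID per attached accelerator keeps accelerators isolated.
        sid = self.next_sid
        self.next_sid += 1
        self.sids[name] = sid
        return sid

    def attach_unit(self, accel: str, unit: str) -> int:
        # One SSID per acceleration unit, numbered within its accelerator.
        sid = self.sids[accel]
        ssid = sum(1 for (s, _) in self.ssids if s == sid)
        self.ssids[(sid, unit)] = ssid
        return ssid

    def device_id(self, accel: str) -> int:
        # Assumption: the device ID seen by the interrupt controller equals the SID.
        return self.sids[accel]


class InterruptTranslationService:
    """Toy ITS that maps a device ID back to the source accelerator."""

    def __init__(self):
        self.sources = {}

    def register(self, device_id: int, accel: str) -> None:
        self.sources[device_id] = accel

    def resolve(self, device_id: int) -> str:
        # Lets the operating system pick the interrupt handler to run.
        return self.sources[device_id]
```

In this sketch the operating system would call `resolve` with the device ID carried by an interrupt to find which accelerator raised it.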
- the above describes an example architecture for managing accelerators from a hardware level.
- the example architecture 200 and the example mapping relationship 300 may be implemented in the environment 100.
- the operating system 120 may obtain interface information associated with a set of accelerators 130 in the computing device 110.
- Such interface information may include corresponding identification information of a set of accelerators 130, and may also include corresponding mapping information.
- the mapping information of each accelerator 130 may indicate the hardware resources used for the accelerator in the computing device 110, such as indicating the SMMU and ITS used for the accelerator.
- the interface information may be reported by the firmware to the operating system 120.
- the firmware may report the interface information used by the accelerator 130 to the operating system 120.
- the interface information may include a description table and a mapping table to help the operating system 120 identify the accelerator and determine the hardware resources for the accelerator.
- Figure 4 shows a schematic diagram of an example 400 of interface information according to some embodiments of the present disclosure.
- Example 400 may include at least one of a mapping table 410, a description table 420 (also referred to as a first description table), or a description table 430 (also referred to as a second description table).
- the mapping table 410 indicates the hardware resources for each accelerator in the computing device 110, such as SMMU, interrupt translation services provided by the interrupt controller, etc.
- the mapping table 410 includes an ITS node 412, an SMMU node 414, and a component node 416.
- the component node 416 indicates the mapping relationship between the accelerator 130 (for example, accelerator 1, accelerator 2, accelerator 3, etc. shown in Figure 4) and the SMMU and ITS.
- the SMMU node 414 indicates the SMMU hardware information in the computing device 110.
- the ITS node 412 indicates the ITS-related hardware information in the computing device 110.
- the mapping table 410 can be, for example, an input-output remapping table (IORT).
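- the three-node layout of the mapping table 410 can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names, the field names, and the resource name `SMMU0` are hypothetical, and the sketch does not reproduce the binary layout of a real IORT.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentNode:
    """One component-node entry: an accelerator and the resources it uses."""
    accelerator: str
    smmu: str   # must name an entry in the SMMU node
    its: str    # must name an entry in the ITS node


@dataclass(frozen=True)
class MappingTable:
    its_node: tuple        # ITS-related hardware present in the device
    smmu_node: tuple       # SMMU hardware present in the device
    component_node: tuple  # accelerator -> (SMMU, ITS) mappings

    def resources_for(self, accelerator: str) -> dict:
        """Look up the SMMU and ITS that serve a given accelerator."""
        for entry in self.component_node:
            if entry.accelerator == accelerator:
                # Reject mappings that point at hardware the table does not list.
                if entry.smmu not in self.smmu_node or entry.its not in self.its_node:
                    raise ValueError(f"dangling mapping for {accelerator}")
                return {"smmu": entry.smmu, "its": entry.its}
        raise KeyError(accelerator)
```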
- the description table 420 includes a set of corresponding hardware identifiers (HIDs) and corresponding register addresses of the accelerator 130 to ensure that the operating system can normally access the registers of the accelerator 130.
- a description table 420 may include, for example, a differentiated system description table (DSDT) as defined by the Advanced Configuration and Power Interface (ACPI) specification.
- the DSDT supports matching the defined and described hardware identifiers to the corresponding accelerators.
- a hardware identifier, also known as a hardware ID, is a unique identification symbol used to mark and distinguish a hardware device, and usually consists of a string of digits and letters. Generally speaking, a hardware identifier is assigned to each device by the device manufacturer during the manufacturing process. However, the ACPI specification does not describe an accelerator 130 that is treated as a platform device.
- a set of hardware identifiers of accelerators can be customized to ensure that the operating system 120 can identify and run the accelerators.
- the accelerator 130 includes three types of accelerators, such as CDA accelerators, DTE accelerators, and DLA accelerators.
- for CDA accelerators, the hardware identifiers are, for example, BCDA0000, BCDA0001, BCDA0002, etc.
- for DTE accelerators, the hardware identifiers are, for example, BDTE0000, BDTE0001, BDTE0002, etc.
- for DLA accelerators, the hardware identifiers are, for example, DBLA0000, DBLA0001, DBLA0002, etc. In this way, it is possible to use the accelerator as a platform device and assign a hardware identifier to it.
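- matching reported hardware identifiers against a driver's table of known prefixes can be sketched as below. The prefixes are taken from the examples above; the function names, the prefix table, and the reading of the four-character suffix as a decimal instance index are assumptions for illustration only.

```python
# Hypothetical match table: HID prefix -> accelerator type.
HID_PREFIXES = {"BCDA": "CDA", "BDTE": "DTE", "DBLA": "DLA"}


def match_hid(hid: str):
    """Return (accelerator type, instance index) for a reported HID, or None."""
    if len(hid) != 8:
        return None
    prefix, suffix = hid[:4], hid[4:]
    kind = HID_PREFIXES.get(prefix)
    # Assumption: the suffix is a plain decimal instance number.
    if kind is None or not all(c in "0123456789" for c in suffix):
        return None
    return kind, int(suffix)


def enumerate_accelerators(reported_hids):
    """Keep only the HIDs the driver recognises, grouped by accelerator type."""
    found = {}
    for hid in reported_hids:
        match = match_hid(hid)
        if match is not None:
            kind, index = match
            found.setdefault(kind, []).append(index)
    return found
```

Identifiers that the table does not know are skipped rather than treated as errors, since firmware typically reports many devices that a given driver does not own.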
- one or some of the accelerators 130 may include multiple acceleration units 250, or multiple acceleration units of the first accelerator may need to be exposed to the operating system as platform devices.
- the description table 420 may include a hardware identifier of the first accelerator and a register address corresponding to the first accelerator, and may also include register addresses corresponding to each acceleration unit.
- it is thus possible to use the acceleration unit as a sub-device and assign a register address to it. In this way, flexible configuration of the acceleration unit can be achieved.
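- the parent/sub-device relationship between a first accelerator and its acceleration units can be sketched as follows. The entry layout (`hid`, `reg`, `unit_regs`), the class names, and the register addresses used in any call are hypothetical; a real description table is encoded quite differently.

```python
from dataclasses import dataclass, field


@dataclass
class AccelerationUnit:
    index: int          # position of the unit within its parent accelerator
    register_base: int  # register address reported for this unit


@dataclass
class Accelerator:
    hid: str            # hardware identifier of the parent platform device
    register_base: int  # register address of the accelerator itself
    units: list = field(default_factory=list)

    def add_unit(self, register_base: int) -> AccelerationUnit:
        # Each unit becomes a sub-device attached to this accelerator.
        unit = AccelerationUnit(len(self.units), register_base)
        self.units.append(unit)
        return unit


def build_from_description(entry: dict) -> Accelerator:
    """Build a parent device and its sub-devices from one table entry."""
    accel = Accelerator(entry["hid"], entry["reg"])
    for unit_reg in entry.get("unit_regs", []):
        accel.add_unit(unit_reg)
    return accel
```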
- the description table 430 includes corresponding identifiers of hardware resources for a group of accelerators 130 in the computing device 110 to ensure that the group of accelerators 130 can normally access the interrupt controller.
- the description table 430 includes, for example, a Multiple APIC Description Table (MADT).
- the MADT includes the relevant information of the interrupt controller, such as the ITS indexes shown in Figure 4 (ITS0, ITS1, ITS2, etc.).
- FIG5 shows a schematic diagram of an example 500 for managing accelerators according to some embodiments of the present disclosure.
- the firmware side reports interface information associated with the accelerator, such as IORT, MADT, DSDT, etc. in ACPI shown in FIG4.
- the driver side matches the accelerator according to the reported interface information and performs accelerator enumeration. That is, the driver side searches for all available accelerators in the computing device 110 and performs hardware enablement on the found accelerators.
- the user side uses at least a portion of the enumerated accelerators to perform specific tasks, such as parallel computing tasks. Depending on the specific implementation, the user side may use one or more of the enumerated accelerators, or one or more acceleration units.
- the accelerator or acceleration unit can be used as an independent device.
- one or more acceleration units can be allocated to a virtual machine for use. Referring to FIG. 2 , for a processing unit 210, the operating system 120 can also allocate a first number of processing cores 220 and a second number of acceleration units 250 therein to a virtual machine, and perform tasks through the virtual machine. In this way, the flexibility of using the acceleration unit in different application scenarios can be achieved.
- the operating system 120 can use the SMMU 230 to assign an SSID to each acceleration unit 250.
- an appropriate number of processing cores 220 and an appropriate number of acceleration units 250 are bound for use, for example, four processing cores 220 and four acceleration units 250 are assigned to one virtual machine.
- the operating system 120 can also dynamically configure the first number and the second number. Additionally or alternatively, the first number of processing cores 220 and the second number of acceleration units 250 can also be used as an independent device in non-virtualization deployment.
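- binding a first number of processing cores and a second number of acceleration units to a virtual machine, and reconfiguring those numbers later, can be sketched as simple bookkeeping. The class `ProcessingUnit` and its methods are hypothetical names for this illustration; the sketch does not model SSID assignment or any hypervisor interface.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessingUnit:
    free_cores: list
    free_units: list
    vms: dict = field(default_factory=dict)  # VM name -> held cores and units

    def assign(self, vm: str, n_cores: int, n_units: int) -> None:
        """Bind n_cores processing cores and n_units acceleration units to a VM."""
        if n_cores > len(self.free_cores) or n_units > len(self.free_units):
            raise RuntimeError("not enough free cores or acceleration units")
        cores = [self.free_cores.pop(0) for _ in range(n_cores)]
        units = [self.free_units.pop(0) for _ in range(n_units)]
        held = self.vms.setdefault(vm, {"cores": [], "units": []})
        held["cores"] += cores
        held["units"] += units

    def release(self, vm: str) -> None:
        """Return everything a VM holds so the numbers can be reconfigured."""
        held = self.vms.pop(vm)
        self.free_cores += held["cores"]
        self.free_units += held["units"]
```

Calling `assign("vm0", 4, 4)` on a unit with eight cores and eight acceleration units corresponds to the four-plus-four binding mentioned above.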
- the computing device 110 is based on an advanced reduced instruction set architecture, and as shown in FIG2 , the computing device 110 may include multiple processing units 210, which may further include a group of accelerators 130. In this way, the requirements for the computing power, energy efficiency, and power consumption of the advanced reduced instruction set architecture, as well as the diversified application scenarios and the requirements for programmability can be met.
- the present disclosure proposes to use the accelerator as a platform device and define its hardware identification. Furthermore, according to the interface information associated with the accelerator obtained, the driver can be used to complete the initialization of the accelerator and use the initialized accelerator to perform tasks.
- the accelerator management solution for non-PCIe bus standards can be implemented without PCIe bus enumeration. In this way, the configuration flexibility and utilization rate of the accelerator can be improved.
- FIG. 6 shows a flow chart of a method 600 for managing an accelerator according to some embodiments of the present disclosure.
- the method 600 may be implemented at the operating system 120.
- the method 600 is described below with reference to FIG. 1 .
- the operating system 120 obtains interface information associated with a set of accelerators in the computing device.
- the interface information indicates at least corresponding identification information and corresponding mapping information of a set of accelerators, wherein the mapping information of an accelerator indicates hardware resources for the accelerator in the computing device.
- the operating system 120 utilizes a driver of the computing device to initialize a set of accelerators according to the interface information.
- the operating system 120 utilizes at least a portion of the initialized set of accelerators to execute a task.
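- the three steps of the method 600 can be sketched end to end as below. The function names, the dictionary layout of the firmware tables, and the `count` parameter are assumptions made for the example; the sketch only mirrors the order of the steps, not any real driver interface.

```python
def obtain_interface_info(firmware_tables: dict) -> list:
    """Step 1: collect identification and mapping information per accelerator."""
    return [
        {"id": hid, "resources": firmware_tables["mapping"][hid]}
        for hid in firmware_tables["ids"]
    ]


def initialize(interface_info: list) -> list:
    """Step 2: the driver initializes every accelerator it was told about."""
    return [dict(entry, initialized=True) for entry in interface_info]


def execute(task, accelerators: list, count: int) -> list:
    """Step 3: run the task on at least a portion of the initialized accelerators."""
    chosen = [a for a in accelerators if a["initialized"]][:count]
    return [task(a["id"]) for a in chosen]
```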
- the interface information is reported by firmware to an operating system of the computing device.
- the interface information includes at least one of the following: a first description table including corresponding hardware identifications and corresponding register addresses of a set of accelerators, a mapping table indicating, for each accelerator in the set of accelerators, at least one hardware resource in the computing device for the accelerator, or a second description table including corresponding identifications of hardware resources in the computing device for the set of accelerators.
- At least a first accelerator in a group of accelerators includes multiple acceleration units
- the first description table includes at least any of the following: a hardware identification of the first accelerator, a register address corresponding to the first accelerator, or register addresses corresponding to multiple acceleration units respectively.
- the at least one hardware resource includes at least one of: a system memory management unit, or an interrupt translation service.
- initializing a group of accelerators includes: using a driver to set corresponding sharing modes of a plurality of acceleration units, wherein the sharing mode of an acceleration unit indicates whether the acceleration unit is shared by a plurality of processes.
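- a sharing mode of this kind can be sketched as a per-unit flag checked when a process requests the unit. The enum values, the class, and the `acquire`/`release` methods are hypothetical names; the disclosure only states that the mode indicates whether a unit is shared by multiple processes.

```python
from enum import Enum


class SharingMode(Enum):
    EXCLUSIVE = "exclusive"  # at most one process uses the unit at a time
    SHARED = "shared"        # several processes may use the unit


class AccelerationUnit:
    def __init__(self, mode: SharingMode):
        self.mode = mode     # set by the driver during initialization
        self.users = set()   # process IDs currently holding the unit

    def acquire(self, pid: int) -> bool:
        """Grant access unless the unit is exclusive and held by another process."""
        if self.mode is SharingMode.EXCLUSIVE and self.users and pid not in self.users:
            return False
        self.users.add(pid)
        return True

    def release(self, pid: int) -> None:
        self.users.discard(pid)
```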
- the computing device is based on an advanced reduced instruction set architecture, and a set of accelerators is included in a processing unit of the computing device.
- FIG. 7 shows a schematic structural block diagram of an apparatus 700 for managing an accelerator according to some embodiments of the present disclosure.
- the apparatus 700 may be implemented as or included in the operating system 120.
- Each module/component in the apparatus 700 may be implemented by hardware, software, firmware, or any combination thereof.
- the apparatus 700 includes an interface information acquisition module 710, which is configured to acquire interface information associated with a group of accelerators in a computing device.
- the interface information at least indicates corresponding identification information and corresponding mapping information of a group of accelerators, wherein the mapping information of one accelerator indicates the hardware resources used for the accelerator in the computing device.
- the apparatus 700 also includes an accelerator initialization module 720, which is configured to use a driver of the computing device to initialize a group of accelerators according to the interface information.
- the apparatus 700 also includes a task execution module 730, which is configured to use at least a part of the initialized group of accelerators to execute a task.
- the interface information is reported by firmware to an operating system of the computing device.
- the interface information includes at least one of the following: a first description table including corresponding hardware identifications and corresponding register addresses of a set of accelerators, a mapping table indicating, for each accelerator in the set of accelerators, at least one hardware resource in the computing device for the accelerator, or a second description table including corresponding identifications of hardware resources in the computing device for the set of accelerators.
- At least a first accelerator in a group of accelerators includes multiple acceleration units
- the first description table includes at least any of the following: a hardware identification of the first accelerator, a register address corresponding to the first accelerator, or register addresses corresponding to multiple acceleration units respectively.
- the computing device is based on an advanced reduced instruction set architecture, and a set of accelerators is included in a processing unit of the computing device.
- the computing device includes multiple processing cores, a second accelerator in a group of accelerators includes multiple acceleration units, and the task execution module is also configured to assign a first number of processing cores in the multiple processing cores and a second number of acceleration units in the multiple acceleration units to the virtual machine; and execute the task through the virtual machine.
- FIG8 shows a block diagram of an electronic device 800 in which one or more embodiments of the present disclosure may be implemented. It should be understood that the electronic device 800 shown in FIG8 is merely exemplary and should not constitute any limitation on the functionality and scope of the embodiments described herein. The electronic device 800 shown in FIG8 may be used to implement the operating system 120 of FIG1 .
- the electronic device 800 is in the form of a general electronic device.
- the components of the electronic device 800 may include, but are not limited to, one or more processors or processing units 810, a memory 820, a storage device 830, one or more communication units 840, one or more input devices 850, and one or more output devices 860.
- the processing unit 810 may be an actual or virtual processor and is capable of performing various processes according to a program stored in the memory 820. In a multi-processor system, multiple processing units execute computer executable instructions in parallel to improve the parallel processing capability of the electronic device 800.
- the electronic device 800 typically includes a plurality of computer storage media. Such media may be any accessible media that is accessible to the electronic device 800, including but not limited to volatile and non-volatile media, removable and non-removable media.
- the memory 820 may be a volatile memory (e.g., a register, a cache, a random access memory (RAM)), a non-volatile memory (e.g., a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), flash memory), or some combination thereof.
- the storage device 830 may be a removable or non-removable medium, and may include a machine-readable medium, such as a flash drive, a disk, or any other medium, which may be capable of being used to store information and/or data (e.g., training data for training) and may be accessed within the electronic device 800.
- the electronic device 800 may further include additional removable/non-removable, volatile/non-volatile storage media.
- a disk drive for reading from or writing to a removable, non-volatile disk (e.g., a "floppy disk") and an optical drive for reading from or writing to a removable, non-volatile optical disk may be provided.
- each drive may be connected to a bus (not shown) by one or more data media interfaces.
- the memory 820 may include a computer program product 825 having one or more program modules configured to perform various methods or actions of various embodiments of the present disclosure.
- the communication unit 840 implements communication with other electronic devices through a communication medium. Additionally, the functions of the components of the electronic device 800 can be implemented with a single computing cluster or multiple computing machines that can communicate through a communication connection. Therefore, the electronic device 800 can operate in a networked environment using a logical connection with one or more other servers, a network personal computer (PC), or another network node.
- the input device 850 may be one or more input devices, such as a mouse, a keyboard, a tracking ball, etc.
- the output device 860 may be one or more output devices, such as a display, a speaker, a printer, etc.
- the electronic device 800 may also communicate with one or more external devices (not shown) through the communication unit 840 as needed, such as a storage device, a display device, etc., communicate with one or more devices that allow a user to interact with the electronic device 800, or communicate with any device that allows the electronic device 800 to communicate with one or more other electronic devices (e.g., a network card, a modem, etc.). Such communication may be performed via an input/output (I/O) interface (not shown).
- a computer-readable storage medium on which computer-executable instructions are stored, wherein the computer-executable instructions are executed by a processor to implement the method described above.
- a computer program product is also provided, wherein the computer program product is tangibly stored on a non-transitory computer-readable medium and includes computer-executable instructions, and the computer-executable instructions are executed by a processor to implement the method described above.
- These computer-readable program instructions can be provided to a processing unit of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processing unit of the computer or other programmable data processing apparatus, create an apparatus that implements the functions/actions specified in one or more blocks of the flowchart and/or block diagram.
- These computer-readable program instructions can also be stored in a computer-readable storage medium, where they cause a computer, a programmable data processing apparatus, and/or other devices to operate in a specific manner, so that the computer-readable medium storing the instructions constitutes an article of manufacture that includes instructions implementing various aspects of the functions/actions specified in one or more blocks of the flowchart and/or block diagram.
- Computer-readable program instructions can be loaded onto a computer, other programmable data processing apparatus, or other device so that a series of operational steps are performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, so that the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions/actions specified in one or more blocks of the flowchart and/or block diagram.
- each block in the flowchart or block diagram can represent a module, a program segment, or a portion of instructions, which includes one or more executable instructions for implementing the specified logical function.
- the functions noted in the blocks can also occur in an order different from that shown in the accompanying drawings. For example, two consecutive blocks can actually be executed substantially in parallel, and they can sometimes be executed in the reverse order, depending on the functions involved.
- each block in the block diagram and/or flowchart, and any combination of blocks in the block diagram and/or flowchart, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
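The remark that two consecutive blocks may run in parallel or in reverse order holds only when neither block depends on the other's result. The following is a minimal sketch of that idea, not taken from the application: the functions `block_a` and `block_b` and the three runners are hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def block_a(data):
    # First flowchart block (hypothetical): sum of the inputs.
    return sum(data)


def block_b(data):
    # Second flowchart block (hypothetical): largest input.
    # It does not consume block_a's result, so the two are independent.
    return max(data)


def run_in_drawn_order(data):
    # Execute the blocks in the order shown in the drawing.
    a = block_a(data)
    b = block_b(data)
    return a, b


def run_in_reverse_order(data):
    # Execute the same blocks in the opposite order.
    b = block_b(data)
    a = block_a(data)
    return a, b


def run_in_parallel(data):
    # Submit both blocks to a thread pool so they may overlap in time.
    with ThreadPoolExecutor(max_workers=2) as pool:
        future_a = pool.submit(block_a, data)
        future_b = pool.submit(block_b, data)
        return future_a.result(), future_b.result()
```

Because the two blocks share no intermediate state, all three runners return the same pair of values for the same input. If `block_b` read the output of `block_a`, only the drawn order would remain valid.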
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Advance Control (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the present disclosure provide a method and apparatus for managing accelerators, as well as a device and a storage medium. The method for managing accelerators comprises: acquiring interface information associated with a group of accelerators in a computing device, the interface information indicating at least respective identification information and respective mapping information of the group of accelerators, wherein the mapping information of an accelerator indicates a hardware resource in the computing device that is used for the accelerator; initializing the group of accelerators based on the interface information by using a driver of the computing device; and executing a task by using at least some accelerators of the initialized group of accelerators. In this way, the configuration flexibility and utilization rate of accelerators can be improved.
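The three steps named in the abstract (acquire interface information, initialize the group through a driver, run a task on some of the initialized accelerators) can be sketched as below. This is an illustrative model only, not the claimed implementation: the classes `InterfaceEntry` and `Driver`, the function `manage_accelerators`, and the mapping keys `mmio_base` and `irq` are all assumptions introduced here, and the "accelerator" simply calls a Python function.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class InterfaceEntry:
    """Interface information for one accelerator (hypothetical layout)."""
    accel_id: str   # identification information
    mapping: dict   # mapping information: hardware resources used by the accelerator


@dataclass
class Driver:
    """Stand-in for the computing device's driver (hypothetical)."""
    initialized: dict = field(default_factory=dict)

    def initialize(self, entries):
        # Step 2: initialize every accelerator in the group from its interface information.
        for entry in entries:
            self.initialized[entry.accel_id] = dict(entry.mapping)
        return list(self.initialized)

    def run(self, accel_id, task, *args):
        # Refuse to dispatch work to an accelerator that was never initialized.
        if accel_id not in self.initialized:
            raise KeyError(f"accelerator {accel_id!r} is not initialized")
        return task(*args)


def manage_accelerators(interface_info, task, inputs, use_count):
    # Step 1 is modeled by the caller passing in the acquired interface information.
    driver = Driver()
    ready = driver.initialize(interface_info)
    # Step 3: use only some of the initialized accelerators for the task.
    selected = ready[:use_count]
    results = [driver.run(accel_id, task, x) for accel_id, x in zip(selected, inputs)]
    return selected, results
```

A usage example under the same assumptions: given three entries `acc0`, `acc1`, `acc2`, calling `manage_accelerators(entries, lambda x: x * x, [2, 3], use_count=2)` initializes all three but dispatches work only to the first two. Keeping the interface information as plain data separate from the driver is what would let the group be reconfigured without changing driver code, which is the flexibility the abstract points to.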
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202311270375.5A CN119718534A (zh) | 2023-09-27 | 2023-09-27 | Method, apparatus, device and storage medium for managing accelerators |
| CN202311270375.5 | 2023-09-27 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025067366A1 (fr) | 2025-04-03 |
Family
ID=95075508
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2024/121559 Pending WO2025067366A1 (fr) | 2023-09-27 | 2024-09-26 | Method and apparatus for managing accelerators, and device and storage medium |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN119718534A (fr) |
| WO (1) | WO2025067366A1 (fr) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20100077179A1 (en) * | 2007-12-17 | 2010-03-25 | Stillwell Jr Paul M | Method and apparatus for coherent device initialization and access |
| US20130007762A1 (en) * | 2011-06-30 | 2013-01-03 | International Business Machines Corporation | Processing workloads using a processor hierarchy system |
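| CN105579961A (zh) * | 2013-09-25 | 2016-05-11 | Arm Limited | Data processing system |
| CN107710161A (zh) * | 2015-06-09 | 2018-02-16 | Microsoft Technology Licensing, LLC | Independently networkable hardware accelerators for increased workflow optimization |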
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10740257B2 (en) * | 2018-07-02 | 2020-08-11 | International Business Machines Corporation | Managing accelerators in application-specific integrated circuits |
- 2023-09-27: CN application CN202311270375.5A (patent CN119718534A, zh), active, pending
- 2024-09-26: WO application PCT/CN2024/121559 (patent WO2025067366A1, fr), active, pending
Also Published As
| Publication number | Publication date |
|---|---|
| CN119718534A (zh) | 2025-03-28 |
Similar Documents
| Publication | Title |
|---|---|
| US10514931B2 (en) | Computing platform interface with memory management |
| US8972991B2 (en) | Systems and methods for exposing processor topology for virtual machines |
| EP3798835B1 (fr) | Method, device and system for implementing hardware acceleration processing |
| US8914606B2 (en) | System and method for soft partitioning a computer system |
| US8595723B2 (en) | Method and apparatus for configuring a hypervisor during a downtime state |
| US8443376B2 (en) | Hypervisor scheduler |
| US8762999B2 (en) | Guest-initiated resource allocation request based on comparison of host hardware information and projected workload requirement |
| US8918568B2 (en) | PCI express SR-IOV/MR-IOV virtual function clusters |
| US9043562B2 (en) | Virtual machine trigger |
| WO2017114283A1 (fr) | Method and apparatus for processing a read/write request in a physical host |
| US9131031B2 (en) | Virtual computer system, virtual computer management program, and MAC address management method |
| US10983847B2 (en) | Dynamically loadable unikernel binaries |
| CN105264506A (zh) | Assigning processors to memory-mapped configurations |
| CN108255598A (zh) | Performance-guaranteed resource allocation system and method for a virtualization management platform |
| US10013199B2 (en) | Translation bypass by host IOMMU for systems with virtual IOMMU |
| CN113778612A (zh) | Method for implementing an embedded virtualization system based on a microkernel mechanism |
| US10459771B2 (en) | Lightweight thread synchronization using shared memory state |
| WO2025020602A1 (fr) | Method for running a baseboard management controller system, and baseboard management controller |
| CN112306669B (zh) | Task processing method and apparatus based on a multi-core system |
| CN117827449B (zh) | Physical memory expansion architecture for a server, server, method, device and medium |
| US9280493B2 (en) | Method and device for enumerating input/output devices |
| WO2025067366A1 (fr) | Method and apparatus for managing accelerators, and device and storage medium |
| CN113485789B (zh) | Resource configuration method and apparatus, and computer architecture |
| CN116069451B (zh) | Virtualization method, apparatus, device, medium, accelerator and system |
| US12405884B1 (en) | Context-aware firmware-mapped host memory buffer (HMB) management system |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 24870905; Country of ref document: EP; Kind code of ref document: A1 |