WO2022170769A1 - Communication method, apparatus and system - Google Patents
- Publication number
- WO2022170769A1 (application PCT/CN2021/120844)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- memory
- host
- network device
- memory pool
- pool
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/0223—User address space allocation, e.g. contiguous or non contiguous base addressing
- G06F12/023—Free address space management
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/0223—User address space allocation, e.g. contiguous or non contiguous base addressing
- G06F12/0284—Multiple user address space allocation, e.g. using different base addresses
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
- G06F13/20—Handling requests for interconnection or transfer for access to input/output bus
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0877—Cache access modes
- G06F12/0882—Page mode
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
- G06F13/16—Handling requests for interconnection or transfer for access to memory bus
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/38—Information transfer, e.g. on bus
- G06F13/42—Bus transfer protocol, e.g. handshake; Synchronisation
- G06F13/4282—Bus transfer protocol, e.g. handshake; Synchronisation on a serial bus, e.g. I2C bus, SPI bus
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
- G06F15/17306—Intercommunication techniques
- G06F15/17331—Distributed shared memory [DSM], e.g. remote direct memory access [RDMA]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0813—Multiuser, multiprocessor or multiprocessing cache systems with a network or matrix configuration
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/10—Address translation
- G06F12/109—Address translation for multiple virtual address spaces, e.g. segmentation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
- G06F13/20—Handling requests for interconnection or transfer for access to input/output bus
- G06F13/28—Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/10—Providing a specific technical effect
- G06F2212/1016—Performance improvement
- G06F2212/1024—Latency reduction
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/15—Use in a specific computing environment
- G06F2212/154—Networked environment
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/25—Using a specific main memory architecture
- G06F2212/251—Local memory within processor subsystem
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/25—Using a specific main memory architecture
- G06F2212/254—Distributed memory
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/30—Providing cache or TLB in specific location of a processing system
- G06F2212/306—In system interconnect, e.g. between two buses
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/65—Details of virtual memory and virtual address translation
- G06F2212/657—Virtual address space management
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2213/00—Indexing scheme relating to interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F2213/0026—PCI express
Definitions
- the present application relates to the field of communication, and more particularly, to a communication method, apparatus and system.
- a host (e.g., a server) typically includes resources such as computing, memory, and storage. All applications on the host need to run in memory.
- the memory in the host is generally configured in advance, which may cause insufficient memory during the running of the application, thereby affecting the running performance of the application.
- the present application provides a communication method, device and system, which can improve the running performance of an application in a host.
- a communication method is provided.
- the first host receives a memory access address sent by the network device, where the memory access address points to a storage unit in the first memory pool, the network device is connected to the first host, the network device is used to exchange and forward the services of the first host, and the network device is further configured to manage the first memory pool; when the memory of the first host satisfies a preset condition, the first host accesses the storage unit according to the memory access address.
- the existing network equipment connected to the host and capable of exchanging and forwarding services of the host usually does not include the memory management function, which means that in the present application, the memory management function needs to be deployed in the network equipment in advance.
- the management of the first memory pool by the network device includes implementing functions such as address isolation, access control, message distribution, flow control, and access conflict handling.
- the first host can receive a memory access address sent by a network device, where the memory access address points to a storage unit in the first memory pool, so that when the memory of the first host meets a preset condition, the first host can access the storage unit in the first memory pool according to the memory access address. This expands the memory of the first host and improves the running performance of the application program in the first host. In addition, because the first memory pool is managed by the network device, the management difficulty and cost of the first memory pool can be reduced.
- the network device is further connected to at least one second host, the network device is configured to exchange and forward services of the at least one second host, the at least one second host provides a second memory pool, and the first memory pool includes the second memory pool.
- the at least one second host provides a second memory pool, in other words, the second memory pool may be a logical memory pool formed by the memory of one or more second hosts.
- the first memory pool is managed by the network device, and the first memory pool includes the second memory pool, which means that the second memory pool in this application is also managed by the network device.
- the first memory pool may include a second memory pool, which is a logical memory pool formed by the memory of one or more second hosts, so that when the memory of the first host satisfies a preset condition, the first host can access the memory of the second host according to the memory access address. This expands the memory of the first host and improves the running performance of the application program in the first host, and it also improves the memory utilization rate of the second host. In addition, because the network device manages the second memory pool of the second host, the difficulty and cost of managing the second memory pool can be reduced.
- the network device includes a third memory pool, and the first memory pool includes the third memory pool.
- the existing network device connected to the host and capable of exchanging and forwarding services of the host usually does not include a memory pool, which means that in this implementation, a memory pool needs to be deployed in the network device in advance.
- the first memory pool may include a third memory pool, which is a memory pool of the network device, so that when the memory of the first host meets a preset condition, the first host can access the memory of the network device according to the memory access address. This expands the memory of the first host and improves the running performance of the applications in the first host, and it can shorten memory access compared to accessing the memory of the second host (that is, accessing the memory of the second memory pool). In addition, because the network device manages the third memory pool, the difficulty and cost of managing the third memory pool can be reduced.
- the first memory pool may include only the second memory pool, only the third memory pool, or both the second memory pool and the third memory pool (that is, the first memory pool is a logical memory pool comprising the second memory pool and the third memory pool).
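The composition described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the first memory pool presents one contiguous address range assembled from the regions contributed by the network device (third memory pool) and by second hosts (second memory pool). The names `Region`, `LogicalPool`, and `resolve` are hypothetical and do not come from the application.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Region:
    """A block of memory contributed to the logical pool (hypothetical model)."""
    provider: str  # e.g. "network-device" or "second-host-A"
    size: int      # size of the contributed region in bytes


class LogicalPool:
    """First memory pool modeled as contiguous regions laid end to end (assumption)."""

    def __init__(self, regions: List[Region]) -> None:
        self._regions: List[Tuple[int, Region]] = []
        base = 0
        for region in regions:
            self._regions.append((base, region))  # remember where each region starts
            base += region.size
        self.size = base

    def resolve(self, address: int) -> Tuple[str, int]:
        """Map a pool-wide address to (provider, offset inside that provider's region)."""
        for base, region in self._regions:
            if base <= address < base + region.size:
                return region.provider, address - base
        raise ValueError(f"address {address:#x} is outside the pool")


pool = LogicalPool([Region("network-device", 1024), Region("second-host-A", 2048)])
print(pool.resolve(1024))
```

In this sketch the first host only sees pool-wide addresses, and the managing device decides which physical provider backs each one.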
- managing the first memory pool by the network device can not only expand the memory of the first host, but also reduce the difficulty and cost of managing the memory pool.
- before the first host receives the memory access address sent by the network device, the method further includes: when the memory of the first host satisfies the preset condition, the first host sends a request message to the network device, where the request message is used to request memory in the first memory pool.
- the preset condition is any one of the following: the memory usage rate of the first host is greater than the first threshold; the remaining memory space of the first host is less than the second threshold; the remaining memory space of the first host is less than the memory space required by the first host to process services in the future target time period; or, the memory usage policy of the first host is to preferentially use the first memory pool.
- the above-mentioned first threshold or second threshold may be a specific value or a percentage.
- the first threshold may be a percentage, for example, 80%, 90%, or 98%, which is not limited in this application. It should be understood that when the memory usage rate of the first host is greater than the first threshold, the application program in the first host has already occupied a large amount of memory space.
- when the preset condition is that the remaining memory space of the first host is less than the second threshold, the second threshold may be a specific value, for example, 0 GB, 5 GB, or 8 GB; it may also be a percentage, for example, 0%, 10%, or 20%, which is not limited in this application. It should be understood that when the remaining memory space of the first host is less than the second threshold, little memory space remains in the first host for use by the application.
- when the preset condition is that the remaining memory space of the first host is less than the memory space required by the first host to process services in the future target time period, the above method further includes: predicting the memory space required for the first host to process services (i.e., run the application program) in the future target time period.
- the present application can predict in advance the memory space required by the first host to process services in the future target time period, and then, when the memory of the first host meets the preset condition (that is, the remaining memory space of the first host is less than the memory space required by the first host to process services in the future target time period), access the first memory pool in advance according to the memory access address. This avoids the delay caused by the first host running out of memory before requesting memory from the network device, and thus improves the running performance of the application in the first host.
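The four alternative preset conditions can be sketched as a simple check. This is an illustrative sketch only: the names `PresetCondition`, `HostMemory`, and `condition_met`, and the default threshold values (90% and 5 GiB), are assumptions chosen from the example ranges above, not values fixed by the application.

```python
from dataclasses import dataclass
from enum import Enum, auto

GIB = 2**30


class PresetCondition(Enum):
    """The four alternatives listed in the text (names are hypothetical)."""
    USAGE_ABOVE_FIRST_THRESHOLD = auto()
    REMAINING_BELOW_SECOND_THRESHOLD = auto()
    REMAINING_BELOW_PREDICTED_NEED = auto()
    PREFER_FIRST_MEMORY_POOL = auto()


@dataclass
class HostMemory:
    total: int               # total local memory in bytes
    used: int                # memory currently in use, in bytes
    predicted_need: int = 0  # forecast demand for the future target time period

    @property
    def remaining(self) -> int:
        return self.total - self.used

    @property
    def usage_rate(self) -> float:
        return self.used / self.total


def condition_met(condition: PresetCondition, mem: HostMemory,
                  first_threshold: float = 0.9,
                  second_threshold: int = 5 * GIB) -> bool:
    """Return True when the configured preset condition holds for this host."""
    if condition is PresetCondition.USAGE_ABOVE_FIRST_THRESHOLD:
        return mem.usage_rate > first_threshold
    if condition is PresetCondition.REMAINING_BELOW_SECOND_THRESHOLD:
        return mem.remaining < second_threshold
    if condition is PresetCondition.REMAINING_BELOW_PREDICTED_NEED:
        return mem.remaining < mem.predicted_need
    # PREFER_FIRST_MEMORY_POOL: the usage policy itself selects the pool.
    return True


busy = HostMemory(total=64 * GIB, used=60 * GIB, predicted_need=8 * GIB)
print(condition_met(PresetCondition.REMAINING_BELOW_PREDICTED_NEED, busy))
```

The prediction itself is out of scope here; `predicted_need` is simply supplied as an input.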
- the first host includes a network card, and the first host communicates with the network device through the remote direct memory access (RDMA) protocol.
- the first host includes a bus interface, and the first host communicates with the network device through a peripheral component interconnect express (PCIe) bus or a compute express link (CXL).
- when the first host no longer needs to use the storage unit, the method further includes: the first host sends a notification message to the network device, where the notification message includes the memory access address, so that the network device releases the storage unit.
- the first memory pool can also be used as a shared memory pool of multiple hosts.
- the first host may send a notification message to the network device, so that the network device releases the storage unit for use by other hosts.
- releasing a storage unit includes modifying the storage unit from a used state to an idle state.
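The host-side exchange described above (request, receive address, later notify for release) can be sketched as follows. The message classes, the `FirstHost` class, and the `send` callable standing in for the RDMA, PCIe, or CXL transport are all hypothetical; the application does not define message formats.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class RequestMessage:
    """Requests memory in the first memory pool (hypothetical format)."""
    host_id: str


@dataclass(frozen=True)
class NotificationMessage:
    """Tells the network device which storage unit can be released (hypothetical format)."""
    host_id: str
    memory_access_address: int


class FirstHost:
    def __init__(self, host_id: str, send: Callable[[object], Optional[int]]) -> None:
        self.host_id = host_id
        self._send = send  # stand-in for the transport to the network device
        self.memory_access_address: Optional[int] = None

    def request_pool_memory(self, condition_met: bool) -> Optional[int]:
        """Send a request only when the preset condition is satisfied."""
        if not condition_met:
            return None
        self.memory_access_address = self._send(RequestMessage(self.host_id))
        return self.memory_access_address

    def release_pool_memory(self) -> None:
        """Notify the network device once the storage unit is no longer needed."""
        if self.memory_access_address is None:
            return
        self._send(NotificationMessage(self.host_id, self.memory_access_address))
        self.memory_access_address = None
```

Passing the transport in as a callable keeps the sketch independent of any particular interconnect.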
- a communication method is provided, including: the network device receives a request message sent by the first host, where the request message is used to request memory in the first memory pool, the network device is connected to the first host, the network device is used for exchanging and forwarding services of the first host, and the network device is also used for managing the first memory pool; the network device sends a memory access address to the first host, where the memory access address points to a storage unit in the first memory pool.
- the sending of the memory access address by the network device to the first host specifically refers to that the network device sends the memory access address to the first host according to the request message.
- the network device is further connected to at least one second host, the network device is configured to exchange and forward services of the at least one second host, the at least one second host provides a second memory pool, and the first memory pool includes the second memory pool.
- the network device includes a third memory pool, and the first memory pool includes the third memory pool.
- the first host communicates with the network device through the remote direct memory access (RDMA) protocol.
- the first host communicates with the network device through a peripheral component interconnect express (PCIe) bus or a compute express link (CXL).
- sending the memory access address by the network device to the first host includes: the network device determines an idle storage unit in the first memory pool; the network device sends, to the first host, the memory access address corresponding to the idle storage unit, so that the first host uses the idle storage unit.
- the method further includes: the network device records the status of each storage unit in the first memory pool, where the status includes idle or used.
- the used status means that the memory space of the storage unit is occupied, and the idle status means that the memory space of the storage unit is not occupied.
- the method further includes: the network device receives a notification message sent by the first host, where the notification message includes the memory access address; the network device releases the storage unit according to the notification message.
- the first memory pool can also be used as a shared memory pool of multiple hosts.
- the network device may receive a notification message sent by the first host, where the notification message includes a memory access address, and then release a storage unit corresponding to the memory access address according to the notification message for use by other hosts.
- releasing a storage unit includes modifying the storage unit from a used state to an idle state.
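The network-device side (recording the idle or used status of each storage unit, handing out an idle unit on request, and releasing it on notification) can be sketched as below. The class and method names are hypothetical, and the per-unit owner check is an added assumption that loosely stands in for the access control mentioned earlier; the application does not specify how ownership is tracked.

```python
from enum import Enum
from typing import Dict, Optional


class UnitStatus(Enum):
    IDLE = "idle"
    USED = "used"


class MemoryPoolManager:
    """Tracks the status of each storage unit in the first memory pool (sketch)."""

    def __init__(self, base_address: int, unit_size: int, unit_count: int) -> None:
        self.unit_size = unit_size
        self.status: Dict[int, UnitStatus] = {
            base_address + i * unit_size: UnitStatus.IDLE for i in range(unit_count)
        }
        self.owner: Dict[int, str] = {}  # assumption: remember which host holds a unit

    def handle_request(self, host_id: str) -> Optional[int]:
        """Pick the first idle unit, mark it used, and return its memory access address."""
        for address, status in self.status.items():
            if status is UnitStatus.IDLE:
                self.status[address] = UnitStatus.USED
                self.owner[address] = host_id
                return address
        return None  # no idle unit is available

    def handle_notification(self, host_id: str, address: int) -> bool:
        """Release a unit: change its status from used back to idle."""
        if self.owner.get(address) != host_id:
            return False  # ignore releases from a host that does not hold the unit
        self.status[address] = UnitStatus.IDLE
        del self.owner[address]
        return True
```

Once a unit returns to the idle state, a later request from any host can receive the same address, which is how the pool is shared among multiple hosts in this sketch.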
- a communication device applied to a first host, where the communication device includes: a receiving module configured to receive a memory access address sent by a network device, where the memory access address points to a storage unit in the first memory pool, the network device is connected to the first host, the network device is used to exchange and forward services of the first host, and the network device is also used to manage the first memory pool; and a processing module configured to access the storage unit according to the memory access address when the memory of the first host satisfies the preset condition.
- the network device is further connected to at least one second host, the network device is configured to exchange and forward services of the at least one second host, the at least one second host provides a second memory pool, and the first memory pool includes the second memory pool.
- the network device includes a third memory pool, and the first memory pool includes the third memory pool.
- the communication device further includes: a sending module, configured to send a request message to the network device when the memory of the first host satisfies the preset condition, where the request message is used to request memory in the first memory pool.
- the preset condition is any one of the following: the memory usage rate of the first host is greater than the first threshold; the remaining memory space of the first host is less than the second threshold; the remaining memory space of the first host is less than the memory space required by the first host to process services in the future target time period; or, the memory usage policy of the first host is to preferentially use the first memory pool.
- the first host includes a network card, and the first host communicates with the network device through the remote direct memory access (RDMA) protocol.
- the first host includes a bus interface, and the first host communicates with the network device through a peripheral component interconnect express (PCIe) bus or a compute express link (CXL).
- when the first host no longer needs to use the storage unit, the sending module is further configured to send a notification message to the network device, where the notification message includes the memory access address, so that the network device releases the storage unit.
- a communication apparatus applied to a network device, where the communication apparatus includes: a receiving module, configured to receive a request message sent by a first host, where the request message is used to request memory in the first memory pool, the network device is connected to the first host, the network device is used to exchange and forward the services of the first host, and the network device is also used to manage the first memory pool; and a sending module, configured to send a memory access address to the first host, where the memory access address points to a storage unit in the first memory pool.
- the network device is further connected to at least one second host, the network device is configured to exchange and forward services of the at least one second host, the at least one second host provides a second memory pool, and the first memory pool includes the second memory pool.
- the network device includes a third memory pool, and the first memory pool includes the third memory pool.
- the first host communicates with the network device through the remote direct memory access (RDMA) protocol.
- the first host and the network device communicate via a peripheral component interconnect express (PCIe) bus or a compute express link (CXL).
- the communication device further includes: a processing module configured to determine an idle storage unit in the first memory pool; the sending module is further configured to send, to the first host, the memory access address corresponding to the idle storage unit, so that the first host uses the idle storage unit.
- the processing module is further configured to record the status of each storage unit in the first memory pool, where the status includes idle or used.
- the receiving module is further configured to receive a notification message sent by the first host, where the notification message includes the memory access address; the processing module is further configured to release the storage unit according to the notification message.
- a communication system comprising: a communication device as in the third aspect or any possible implementation manner of the third aspect, and a communication device as in the fourth aspect or any possible implementation manner of the fourth aspect.
- a communication apparatus including a processor and a memory; the processor executes the instructions in the memory, so that the communication apparatus performs the communication method in the first aspect or any possible implementation manner of the first aspect; and/or performs the communication method in the second aspect or any possible implementation manner of the second aspect.
- a computing device comprising: at least one processor and a memory, the at least one processor is coupled to the memory for reading and executing instructions in the memory to execute the first aspect or the communication method in any possible implementation manner of the first aspect; and/or, performing the communication method in the second aspect or any possible implementation manner of the second aspect.
- a computer program product comprising instructions that, when the computer program product runs on a computer, cause the computer to perform the communication method in the first aspect or any possible implementation manner of the first aspect; and/or perform the communication method in the second aspect or any possible implementation manner of the second aspect.
- a computer-readable storage medium comprising instructions, where the instructions are used to implement the communication method in the first aspect or any possible implementation manner of the first aspect; and/or to implement the communication method in the second aspect or any possible implementation manner of the second aspect.
- a tenth aspect provides a chip, where the chip includes a processor and a data interface, and the processor reads an instruction stored in a memory through the data interface and performs the communication method in the first aspect or any possible implementation manner of the first aspect; and/or performs the communication method in the second aspect or any possible implementation manner of the second aspect.
- the chip may further include a memory in which instructions are stored; the processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the processor is configured to perform the communication method in the first aspect or any possible implementation manner of the first aspect; and/or perform the communication method in the second aspect or any possible implementation manner of the second aspect.
- an eleventh aspect provides a chip system, where the chip system includes at least one processor for supporting the functions involved in implementing the above first aspect or some implementations of the first aspect; and/or the functions involved in implementing the above second aspect or some implementations of the second aspect, such as receiving or processing data and/or information involved in the above methods.
- the chip system further includes a memory for storing program instructions and data, and the memory is located inside the processor or outside the processor.
- the chip system may be composed of chips, or may include chips and other discrete devices.
- FIG. 1 is an example diagram of a traditional DCN architecture.
- FIG. 2 is an example diagram of a memory access flow of an application in a server provided by an embodiment of the present application.
- FIG. 3 is an example diagram of a communication method provided by an embodiment of the present application.
- FIG. 4 is an exemplary diagram of the composition of a first memory pool provided by an embodiment of the present application.
- FIG. 5 is an example diagram of a DCN architecture provided by an embodiment of the present application.
- FIG. 6 is an example diagram of a TOR-based memory pooling architecture provided by an embodiment of the present application.
- FIG. 7 is an example diagram of another TOR-based memory pooling architecture provided by an embodiment of the present application.
- FIG. 8 is an example diagram of a communication apparatus 800 provided by an embodiment of the present application.
- FIG. 9 is an example diagram of a communication apparatus 900 provided by an embodiment of the present application.
- FIG. 10 is an example diagram of a communication system 1000 provided by an embodiment of the present application.
- FIG. 11 is an exemplary block diagram of a hardware structure of a communication apparatus 1100 provided by an embodiment of the present application.
- a host (e.g., a server) typically includes resources such as computing, memory, and storage. All applications running on the host need to be in memory.
- the memory in the host is generally configured in advance, which may cause insufficient memory during the running of the application, thereby affecting the running performance of the application.
- data center network (DCN)
- FIG. 1 is an example diagram of a traditional DCN architecture.
- traditional data centers are mainly based on a server-centric architecture, in which each server has a fixed amount of computing (i.e., central processing unit (CPU)), memory, and storage (for example, solid state drive (SSD), hard disk drive (HDD), etc.) resources.
- Servers in the same rack are not directly interconnected, but are interconnected through the corresponding top of rack (TOR) switch.
- a TOR switch (hereinafter referred to as TOR for short) may also be referred to as an access switch or a leaf switch.
- TORs communicate with each other through aggregation switches. Therefore, under this architecture, servers in different racks can communicate through TORs and aggregation switches.
- Each server in DCN can be regarded as an independent data processing unit. The following describes the memory access process of the application in the server with reference to FIG. 2 .
- the memory access process of an application in the server mainly includes the following steps:
- 1) the operating system in the server assigns a virtual address, and the virtual address contains page number and address offset information.
- 2) the memory management unit (MMU) in the CPU converts the virtual address into a physical address according to the address mapping information in the page table, so as to realize access to the physical memory during application processing.
- 3) if, in step 2), the page table contains no physical address information for the allocated virtual address (for example, because the physical memory is insufficient), the system generates a page fault and obtains address space from the server's storage by swapping.
- the system also has a simple hot and cold data replacement function, which moves part of the cold data in memory into storage so that memory can be freed for more applications.
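The translation and page-fault steps above can be illustrated with a minimal sketch. It is not taken from the application: the 4 KiB page size, the dictionary page table, and the names `translate`, `access`, and `swap_out_oldest` are assumptions made only for illustration.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class PageFault(Exception):
    """Raised when a virtual page has no entry in the page table."""


def translate(virtual_address, page_table):
    """Split the virtual address into page number and offset, then map it."""
    page_number, offset = divmod(virtual_address, PAGE_SIZE)
    if page_number not in page_table:
        raise PageFault(page_number)
    return page_table[page_number] * PAGE_SIZE + offset


def swap_out_oldest(page_table):
    """Hypothetical cold-data policy: evict the earliest-mapped page."""
    victim = next(iter(page_table))
    return page_table.pop(victim)  # the freed physical frame number


def access(virtual_address, page_table, free_frames, swap_out):
    """Translate; on a page fault, map a free frame, swapping out if needed."""
    try:
        return translate(virtual_address, page_table)
    except PageFault as fault:
        if not free_frames:
            free_frames.append(swap_out(page_table))
        page_table[fault.args[0]] = free_frames.pop()
        return translate(virtual_address, page_table)
```

In this sketch the swap step simply reuses the evicted frame; a real system would also write the evicted page to storage.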
- the present application proposes a communication method, so that when the memory of the host meets a preset condition, the host can access the memory in the memory pool managed and controlled by the network device. Therefore, not only the memory expansion of the host can be realized, but also the fast access to the memory can be realized, thereby improving the running performance of the application program in the host; and the management difficulty and cost of the memory pool can be reduced.
- FIG. 3 is an example diagram of a communication method provided by an embodiment of the present application. As shown in FIG. 3, the method 300 may include S310 and S320, and each step in the method 300 will be described in detail below.
- the network device sends the memory access address to the first host.
- the first host receives the memory access address sent by the network device.
- the memory access address points to a storage unit in the first memory pool
- the network device is connected to the first host, the network device is used to exchange and forward services of the first host, and the network device is also used to manage the first memory pool.
- the host (including the first host and the second host hereinafter) involved in the embodiments of the present application may be any computing device, such as a server, a computer, a desktop computer, a virtual machine, or other user equipment, which is not limited in this application. It should be understood that the host may communicate with the distributed storage system through a network, and that an operating system and other application programs are installed in the host. For ease of description, the following embodiments describe the functions of the host by taking a server as an example, with reference to FIG. 5 to FIG. 7.
- the network device involved in the embodiments of the present application is connected to the host (may be a direct connection or an indirect connection), and can exchange and forward services of the host.
- the network device may be any one of an access switch, an intelligent switch, an aggregation switch, or other network device forms with switching functions, which is not limited in this application.
- TOR will be used as an example for description.
- the network device and the host in the embodiments of the present application can perform end-to-end data interaction, and the communication protocol can be terminated at the network device (that is, the network device performs protocol-related processing on received data packets rather than merely forwarding them).
- an existing network device connected to a host and capable of exchanging and forwarding services of the host usually does not include a memory management function, which means that in the present application, a memory management function needs to be deployed in the network device in advance.
- the management of the first memory pool by the network device includes implementing functions such as address isolation, access control, message distribution, flow control, and access conflict handling.
- the first memory pool may be a terabyte-level (T-level) memory pool or a gigabyte-level (G-level) memory pool, which is not limited in this application.
- T and G here denote units of memory capacity (terabytes and gigabytes).
- the network device may also record the status of each storage unit in the first memory pool, where the status is either idle or used. After a storage unit is allocated to a host, the network device may set the state of the storage unit to used; when the storage unit is not allocated, the network device may set its state to idle. For example, the network device may set a status flag for each storage unit, and different values of the status flag indicate different states of the storage unit.
- sending the memory access address by the network device to the first host includes: the network device determines an idle storage unit in the first memory pool; and the network device sends the memory access address corresponding to the idle storage unit to the first host, so that the first host uses the idle storage unit.
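The status recording and idle-unit selection described above might look like the following sketch. The class `FirstMemoryPool`, the fixed unit size, and the first-fit search are illustrative assumptions; the application does not prescribe a data structure or a selection policy.

```python
from enum import Enum


class UnitState(Enum):
    IDLE = "idle"
    USED = "used"


class FirstMemoryPool:
    """Hypothetical status table kept by the network device, one flag per unit."""

    def __init__(self, base_address, unit_size, unit_count):
        self.base_address = base_address
        self.unit_size = unit_size
        self.states = [UnitState.IDLE] * unit_count

    def allocate(self):
        """Find an idle unit, mark it used, and return its memory access address."""
        for index, state in enumerate(self.states):
            if state is UnitState.IDLE:
                self.states[index] = UnitState.USED
                return self.base_address + index * self.unit_size
        return None  # no idle storage unit is available
```

The returned address is what the network device would send to the first host; `None` models the case in which every unit is already in use.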
- the method 300 may further include step S330: when the memory of the first host satisfies the preset condition, the first host sends a request message to the network device, where the request message is used to request memory in the first memory pool.
- the network device receives the request message sent by the first host.
- the network device sending the memory access address to the first host actually means that the network device sends the memory access address to the first host according to the request message.
- step S330 can also be replaced with the following implementation process: the network device learns the memory usage of the first host through the first host or another memory monitoring device, and then sends the memory access address to the first host according to that memory usage. For example, the network device may send the memory access address to the first host when the memory usage rate of the first host is relatively high, when the remaining memory of the first host is small, or when the remaining memory of the first host cannot meet later requirements. It should be understood that the present application does not limit the conditions that trigger the network device to send the memory access address to the first host.
- the above preset conditions include but are not limited to any one of the following: the memory usage rate of the first host is greater than a first threshold; the remaining memory space of the first host is less than a second threshold; the remaining memory space of the first host is less than the memory space required by the first host to process services in a future target time period; or the memory usage policy of the first host is to use the first memory pool preferentially.
- the first threshold and the second threshold may be specific numerical values or percentages.
- when the preset condition is that the memory usage rate of the first host is greater than the first threshold, the first threshold may be a percentage, for example, 80%, 90%, or 98%, which is not limited in this application. It should be understood that when the memory usage rate of the first host is greater than the first threshold, the applications running in the first host already occupy a large share of the memory space.
- when the preset condition is that the remaining memory space of the first host is less than the second threshold, the second threshold may be a specific value, for example, 0G, 5G, or 8G; it may also be a percentage, for example, 0%, 10%, or 20%, which is not limited in this application. It should be understood that when the remaining memory space of the first host is less than the second threshold, little memory space remains in the first host for running applications.
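As a rough illustration of how a host could evaluate the preset conditions listed above, consider the sketch below. The `HostMemory` fields, the function name, and the default thresholds (90% usage, 5G remaining) are assumptions chosen from the example values in the text, not requirements of the application.

```python
from dataclasses import dataclass


@dataclass
class HostMemory:
    total_gb: float
    used_gb: float
    predicted_need_gb: float = 0.0  # memory needed in the future target period
    prefer_pool: bool = False       # policy: use the first memory pool first


def meets_preset_condition(mem, usage_threshold=0.9, remaining_threshold_gb=5.0):
    """Return True if any one of the four example conditions holds."""
    remaining = mem.total_gb - mem.used_gb
    return (
        mem.used_gb / mem.total_gb > usage_threshold  # first threshold
        or remaining < remaining_threshold_gb         # second threshold
        or remaining < mem.predicted_need_gb          # predicted future need
        or mem.prefer_pool                            # usage policy
    )
```

For example, a host with 100G total and 50G used does not trigger a request by itself, but does once the predicted need for the target time period rises to 60G.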
- the method 300 may further include: predicting the memory space required by the first host to process services (i.e., run applications) in a future target time period. That is, the present application can predict this requirement in advance, and when the memory of the first host meets the preset condition (that is, the remaining memory space of the first host is less than the memory space required by the first host to process services in the future target time period), the first host accesses the memory of the first memory pool in advance according to the memory access address. This avoids the delay caused by the first host requesting memory from the network device only after its own memory runs out, so the running performance of the applications in the first host can be further improved.
- the network device may send a memory access address to the first host, where the memory access address points to a storage unit in the first memory pool, so that when the memory of the first host meets a preset condition, the first host can access the storage unit in the first memory pool according to the memory access address. In this way, the memory of the first host can be expanded and the running performance of the applications in the first host can be improved; and because the network device manages the first memory pool, the management difficulty and cost of the first memory pool can be reduced.
- the network device may also be connected with at least one second host (two are shown in FIG. 4 ), and the network device is used for exchanging and forwarding services of the at least one second host.
- the at least one second host provides a second memory pool, and the first memory pool includes the second memory pool.
- the first memory pool may include only the second memory pool.
- the at least one second host provides a second memory pool, in other words, the second memory pool may be a logical memory pool formed by the memories of one or more second hosts.
- the memory in the second memory pool changes dynamically with the usage of the memory on each host. Therefore, when there are few applications running on the first host, the remaining memory on the first host may also belong to the second memory pool.
- the first memory pool is managed by the network device, and the first memory pool includes the second memory pool, which means that the second memory pool in this application is also managed by the network device.
- the memory on each host may include multiple storage units, which means that the logical second memory pool provided by one or more second hosts can also include multiple storage units.
- the plurality of storage units on each host may be composed entirely of dynamic random access memory (DRAM), may be composed entirely of storage class memory (SCM) (for example, non-volatile memory (NVM), phase-change memory (PCM), or Intel persistent memory (Apache Pass, AEP)), or may be composed of a mixture of high bandwidth memory (HBM), DRAM, and SCM.
- the application does not limit the type and composition of the storage units in each host, which means that the types and composition of the multiple storage units included in the logical second memory pool formed by one or more second hosts are also not limited.
- each host connected to the network device needs to register memory with the network device in real time or at intervals, that is, to provide the network device with information about its own available memory space.
- the network device collects all of this information and maintains, manages, and allocates it.
- when the first host (which can be any host connected to the network device) needs to obtain memory from the memory pool (that is, when the preset condition is met), the first host can apply to the network device; the network device allocates a memory address to the first host according to the memory pool information it has collected, and the first host then directly accesses the allocated memory.
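The registration and allocation flow for the second memory pool can be sketched as follows. The class and method names, the gigabyte granularity, and the first-fit choice of a donor host are hypothetical; a real grant would carry a memory access address rather than just the donor's identifier.

```python
class NetworkDevice:
    """Hypothetical bookkeeping for the logical second memory pool."""

    def __init__(self):
        self.available = {}  # host id -> free GB most recently registered

    def register(self, host_id, free_gb):
        """Called by each host in real time or at intervals."""
        self.available[host_id] = free_gb

    def pool_size(self):
        """Total memory currently offered by all registered hosts."""
        return sum(self.available.values())

    def allocate(self, requester, size_gb):
        """Grant memory from another host that has enough free space."""
        for host_id, free_gb in self.available.items():
            if host_id != requester and free_gb >= size_gb:
                self.available[host_id] = free_gb - size_gb
                return host_id, size_gb
        return None  # no single host can satisfy the request
```

Because each registration overwrites the previous value for that host, the pool size follows the hosts' actual memory usage over time, as the text describes.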
- the access path of the memory is: second host memory-(second host network card)-network device-(first host network card)-first host memory-first host CPU.
- the first memory pool may include a second memory pool, where the second memory pool is a logical memory pool formed by the memory of one or more second hosts. When the memory of the first host satisfies a preset condition, the first host can access the memory of the second host according to the memory access address, so that the memory of the first host can be expanded, the running performance of the applications in the first host can be improved, and the memory utilization of the second host can be improved; and because the network device manages the second memory pool of the second host, the difficulty and cost of managing the second memory pool can be reduced.
- the network device may include a third memory pool, and in this case, the first memory pool may include a third memory pool.
- the first memory pool may only include the third memory pool.
- the existing network device connected to the host and capable of exchanging and forwarding services of the host generally does not include a memory pool, which means that in this implementation, a memory pool needs to be deployed in the network device in advance.
- the access path of the memory is: network device memory-(first host network card)-first host memory-first host CPU.
- the third memory pool provided by the network device may include multiple storage units.
- the plurality of storage units in the third memory pool may be composed entirely of DRAM, may be composed entirely of SCM (for example, NVM, PCM, AEP, etc.), or may be composed of a mixture of HBM, DRAM, SCM, etc.
- the application does not limit the type and composition of the storage units in the third memory pool. It should also be understood that in the case of the above mixed composition, the network device also needs to perform hierarchical management on the third memory pool.
- the plurality of storage units in the third memory pool may be deployed in (i.e., attached to) the network device in various ways.
- for example, the multiple storage units can be attached through a memory interface provided directly by a chip in the network device (for example, an application-specific integrated circuit (ASIC) chip), with the chip performing memory management on the multiple storage units; alternatively, the network device may have a built-in field-programmable gate array (FPGA) that performs memory management and provides a memory interface used to attach the multiple storage units; alternatively, an external FPGA may be attached to the network device, with that FPGA performing memory management and providing the memory interface used to attach the multiple storage units.
- this application does not limit the access method of multiple storage units.
- the memory management can be implemented in the above manner, or can also be implemented by other newly added memory management modules or processing modules, which is not limited in this application.
- the first memory pool may include a third memory pool, where the third memory pool is a memory pool of the network device. When the memory of the first host meets a preset condition, the first host can access the memory of the network device according to the memory access address. This expands the memory of the first host and improves the running performance of the applications in the first host; compared with accessing the memory of the second host (that is, the memory of the second memory pool), it also shortens the memory access path; and because the network device manages the third memory pool, the difficulty and cost of managing the third memory pool can be reduced.
- the first memory pool may include the second memory pool and the third memory pool (that is, the first memory pool is a logical memory pool including the second memory pool and the third memory pool), and the first memory pool is managed by the network device.
- when the first host accesses the memory in the first memory pool, it may first access the memory in the second memory pool (that is, the memory of the second host), or it may first access the memory in the third memory pool in the network device.
- this application does not limit the access order of the memory.
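Since the access order is left open, one way to picture it is as a configurable list of sub-pools to try in turn. The function below is only a sketch under that assumption; the pool names and the unit counts are hypothetical.

```python
def allocate_from_first_pool(size, pools, order):
    """Try each sub-pool in the configured order and take the first that fits.

    pools: dict mapping a sub-pool name to its number of free units.
    order: sequence of sub-pool names, e.g. ("third", "second").
    """
    for name in order:
        if pools[name] >= size:
            pools[name] -= size
            return name
    return None  # neither sub-pool can satisfy the request
```

Passing `("third", "second")` prefers the network device's own memory, while `("second", "third")` prefers the memory of the second hosts.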
- for ease of description, the case in which the first memory pool includes only the third memory pool in the network device is used as an example.
- the first memory pool can be used as a shared memory pool of multiple hosts, so that when the memory of any host connected to the network device meets a preset condition, that host can request access to the memory of the first memory pool.
- the method 300 may further include: the first host sends a notification message to the network device.
- the notification message includes the memory access address of the storage unit, so that the network device releases the storage unit.
- the network device receives the notification message sent by the first host; the network device releases the storage unit corresponding to the memory access address according to the notification message. It should be understood that releasing a storage unit includes modifying the storage unit from a used state to an idle state.
- the first host may send a notification message to the network device, so that the network device releases the corresponding storage unit according to the notification message for use by other hosts.
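The release step can be sketched as a small handler on the network device. The message layout (a dictionary with `host` and `address` keys) and the ownership check are assumptions added for illustration; the text only requires that the unit's state change from used to idle.

```python
def handle_notification(notification, unit_state, unit_owner):
    """Release the storage unit named in a notification message.

    unit_state: dict mapping a memory access address to "idle" or "used".
    unit_owner: dict mapping a used address to the host it was allocated to.
    Returns True if the unit was released, False if the message is rejected.
    """
    address = notification["address"]
    if unit_state.get(address) != "used":
        return False  # unknown address, or the unit is already idle
    if unit_owner.get(address) != notification["host"]:
        return False  # assumed access control: only the owner may release
    unit_state[address] = "idle"
    del unit_owner[address]
    return True
```

After a successful release the address can be handed to any other host that requests memory.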
- the communication between the first host and the network device may be through a remote direct memory access (Remote Direct Memory Access, RDMA) protocol or the like.
- the communication can be performed through the simplified RDMA protocol or the standard RDMA protocol.
- the simplified RDMA protocol is a communication protocol obtained by removing or optimizing functions of the existing RDMA protocol, which is not limited in this embodiment. It should be understood that because the standard RDMA protocol is complex, the preferred communication mode in the following embodiments is the simplified RDMA protocol, as shown in FIG. 6 . It should be understood that when the first host communicates with the network device through the RDMA protocol, the communication is implemented through a network card on the first host and a communication module on the network device.
- the first host and the network device may also communicate via a Peripheral Component Interconnect Express (PCIe) bus or a Compute Express Link (CXL) bus, as shown in FIG. 7 .
- the memory of the first memory pool can be accessed directly by means of synchronous memory semantics or direct memory access (DMA), which speeds up remote access.
- the communication between the first host and the network device needs to be implemented by a bus interface on the first host and a communication module on the network device.
- engines such as a PCIe/CXL engine and a DMA engine need to be added to the network device.
- the present application does not limit the application scenarios of the method 300 .
- the method 300 may be applied to the DCN architecture shown in FIG. 5 .
- FIG. 5 is an example diagram of a DCN architecture provided by an embodiment of the present application. As shown in FIG. 5, this architecture is mainly based on the traditional DCN architecture shown in FIG. 1. A memory pool is attached to the TOR switch (i.e., the network device); the memory pool is the third memory pool, and in this example the first memory pool includes only the third memory pool. A corresponding function module (a memory management module) is also deployed in the TOR switch to manage the memory in the memory pool.
- a communication module (not shown) is also deployed in the TOR switch to realize end-to-end data interaction between the TOR switch and the server (i.e., the first host) and to terminate the communication protocol at the TOR switch (i.e., to perform protocol-related processing on received messages), so that the server can access the memory on the TOR switch.
- the DCN architecture provided by the embodiments of the present application and the application processing flow based on the architecture are exemplarily introduced below with reference to FIG. 6 and FIG. 7 .
- in these figures, a DRAM pool is used as an example of the memory pool of the TOR switch, and an SSD is used as an example of the storage.
- FIG. 6 and FIG. 7 are only an example, and do not constitute a limitation to the present application.
- FIG. 6 is an example diagram of a TOR-based memory pooling architecture provided by an embodiment of the present application.
- TOR includes a DRAM pool, and TOR can also provide memory management functions. It should be understood that, regarding the access manner of the DRAM and the implementation modules of the memory management function, reference may be made to the above description.
- the standard/simplified RDMA protocol stack needs to be implemented on TOR. This enables the server to access the memory on TOR through a high-speed network when the memory on the server side is insufficient.
- the data transmission process can be divided into two types. The first is: when the server wants to access the memory on its corresponding TOR, it can do so directly through simplified RDMA.
- in this case, the communication path is: TOR memory - server network card - server local memory - server local CPU.
- the simplified RDMA is a communication protocol obtained by deleting or optimizing the functions of the existing RDMA, which is not limited in this embodiment.
- the other is: TOR and TOR can communicate through standard RDMA.
- the server may not include a prefetch module.
- the application processing flow is as follows:
- 1) the operating system in the server allocates a virtual address, and the virtual address contains page number and address offset information.
- 2) the MMU in the CPU converts the virtual address into a physical address according to the address mapping information in the page table, so as to realize access to the physical memory during application processing.
- in step 2), the system preferentially uses the local memory (that is, the memory in the server), and when the local memory is insufficient, the system tries to access the memory on the corresponding TOR.
- the specific process is as follows:
- when the server's local memory meets the preset condition (for example, the remaining memory is insufficient), the server requests the TOR to use the TOR memory; after receiving the request, the memory management module on the TOR allocates a part of the memory space to the server and sends the memory access address of the allocated memory space to the server.
- after the server obtains the memory access address, it can access the memory on the TOR by means such as RDMA (for example, the simplified RDMA shown in FIG. 6 ).
- the communication between the server and the corresponding TOR is implemented by the network card on the server and the communication module on the TOR.
- the system can also access the memory on a TOR in another rack through standard RDMA or other methods, or obtain address space from the SSD in the local server by swapping, which is not limited.
- a cluster refers to a system consisting of servers and TORs.
- the memory configuration requirements in the server can be reduced through statistical multiplexing of cluster memory. With the total amount of cluster memory unchanged (the memory configured on the servers is reduced, and the reduced memory is deployed on the TORs), memory utilization and application performance are improved; with the same application performance, the total amount of cluster memory is reduced (through statistical multiplexing of the memory on the TORs), thereby saving costs.
- for example, in the traditional architecture of FIG. 1, each server is configured with 100G of memory.
- in the architecture of this application, each server can be configured with 60G of memory, and each TOR can be configured with 80G of memory (assuming that one TOR is connected to two servers), so that the total amount of cluster memory in the architecture of the present application is the same as that of the traditional architecture.
- the memory configured in the TOR can be used as the shared memory of all servers, so that when the memory in some servers and the total memory of the corresponding TOR are insufficient, the memory on other TORs can be used, so that in the cluster When the total amount of memory remains unchanged, memory utilization and application performance can be improved.
- as another example, in the architecture of FIG. 1 each server is configured with 100G of memory, but some servers need 100G to run their applications, some need 60G, and some need 80G; in short, the demand is not fixed. In the architecture of this application, each server can then be configured with 60G of memory, and each TOR can be configured with memory (0 to 80G, again assuming that one TOR is connected to two servers) according to the memory occupied by the applications running on each server in the architecture of FIG. 1, so that with the same application performance the total amount of cluster memory is reduced, thereby saving costs.
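The arithmetic behind the two examples above can be checked with a short sketch. The helper names are hypothetical, and sizing the TOR as the sum of each server's demand beyond its local 60G is an assumed policy consistent with the 0 to 80G range quoted in the text.

```python
SERVERS_PER_TOR = 2  # as assumed in the examples


def cluster_total(server_gb, tor_gb, servers_per_tor=SERVERS_PER_TOR):
    """Total memory of one TOR plus the servers connected to it."""
    return servers_per_tor * server_gb + tor_gb


def tor_memory_needed(demands_gb, server_gb):
    """Memory the TOR must supply to cover demand beyond local memory."""
    return sum(max(0, demand - server_gb) for demand in demands_gb)


traditional = cluster_total(100, 0)        # 2 x 100G, no memory on the TOR
pooled_same_total = cluster_total(60, 80)  # 2 x 60G plus 80G on the TOR
```

Both totals are 200G, matching the first example. For the second example, two servers that need 100G and 60G require only 40G on the TOR, giving a 160G cluster instead of 200G; two servers that each need 100G require the full 80G, and two that each need 60G require none.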
- a prefetching module can be introduced into the server to reduce the delay generated when accessing the memory of TOR through the network in actual operation.
- a memory access address monitoring module (not shown in the figure) may also be added to the server, which is not limited in this embodiment.
- the prefetching module and the memory access address monitoring module may exist independently, or may exist in the kernel of the operating system in the server, which is not limited in this embodiment.
- the application processing flow is as follows:
- the memory access address monitoring module analyzes the application's memory accesses in real time and, by tracking them, predicts the memory space the application will require in a future target time period. With this information, if the system determines that the remaining local memory of the server will soon be unable to meet future needs, the prefetch module can trigger memory pre-access between the server and the TOR to obtain the address space needed in the future in advance, thereby preventing future page faults.
- obtaining the address space required in the future in advance may mean obtaining the memory access address sent by the TOR in advance and, according to that address, transferring already-processed data from the local memory to the TOR in advance, so as to reserve local memory for the running application.
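A minimal sketch of the monitoring and prefetch trigger described above is given below. The application does not specify a prediction algorithm; the linear extrapolation over a sliding window, the class `AccessMonitor`, and the function `should_prefetch` are assumptions used purely for illustration.

```python
from collections import deque


class AccessMonitor:
    """Hypothetical monitor that samples an application's memory usage."""

    def __init__(self, window=4):
        self.samples = deque(maxlen=window)  # most recent usage samples in GB

    def record(self, used_gb):
        self.samples.append(used_gb)

    def predict(self, steps_ahead=1):
        """Extrapolate linearly from the oldest to the newest sample."""
        if len(self.samples) < 2:
            return self.samples[-1] if self.samples else 0.0
        growth = (self.samples[-1] - self.samples[0]) / (len(self.samples) - 1)
        return self.samples[-1] + growth * steps_ahead


def should_prefetch(monitor, local_total_gb, steps_ahead=2):
    """Trigger pre-access when predicted usage exceeds local memory."""
    return monitor.predict(steps_ahead) > local_total_gb
```

With samples of 40G, 45G, 50G, and 55G, the predicted usage two steps ahead is 65G, so a server with 60G of local memory would request TOR memory before it actually runs out.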
- if the server does not include a prefetch module, the required address space is obtained from the TOR switch only after a page fault occurs; application processing stops at that point and continues only once memory space is available.
- with prefetching, the address space required in the future is obtained in advance, and this can take place while the application is being processed, which can greatly reduce the memory access delay and the incidence of page faults and can also improve application performance.
- instead of communicating through the simplified RDMA shown in FIG. 6 , the server and the TOR may be interconnected and communicate through a bus such as PCIe/CXL, as shown in FIG. 7 .
- the communication between the server and the TOR needs to be implemented by the bus interface on the server and the communication module on the TOR.
- the specific communication process in this case is: TOR memory - server local memory - server local CPU.
- the communication between the server and the TOR is realized through PCIe/CXL, and the memory on the TOR can be accessed directly by means of synchronous memory semantics or DMA, which speeds up remote access. It should be understood that in this case, engines such as a PCIe/CXL engine and a DMA engine also need to be added to the TOR.
- the present application further provides a communication apparatus 800 and a communication apparatus 900 .
- FIG. 8 is an example diagram of a communication apparatus 800 provided by an embodiment of the present application.
- the communication device 800 is applied to the above-mentioned first host.
- the apparatus 800 includes a receiving module 810 and a processing module 820 .
- the receiving module 810 is configured to receive the memory access address sent by the network device.
- the memory access address points to a storage unit in the first memory pool, the network device is connected to the first host, the network device is used to exchange and forward services of the first host, and the network device is also used to manage the first memory pool.
- the processing module 820 is configured to access the storage unit according to the memory access address when the memory of the first host satisfies the preset condition.
- the network device may also be connected to at least one second host, the network device is used to exchange and forward services of the at least one second host, the at least one second host provides a second memory pool, and the first memory pool includes the second memory pool.
- the network device may include a third memory pool, and the first memory pool may include a third memory pool.
- the communication apparatus 800 may further include: a sending module 830 .
- the sending module 830 may be configured to send a request message to the network device when the memory of the first host satisfies a preset condition.
- the request message is used to request memory in the first memory pool.
- the preset condition can be any one of the following: the memory usage rate of the first host is greater than the first threshold; the remaining memory space of the first host is less than the second threshold; the remaining memory space of the first host is less than the memory space required by the first host to process services in the future target time period; or the memory usage policy of the first host is to preferentially use the first memory pool.
- the first host may include a network card, and the first host and the network device may communicate through the remote direct memory access (RDMA) protocol.
- the first host may include a bus interface, and the first host and the network device may communicate via a Peripheral Component Interconnect Express (PCIe) bus or a Compute Express Link (CXL) bus.
- the sending module 830 may be further configured to send a notification message to the network device, where the notification message includes the memory access address, so that the network device releases the storage unit.
- FIG. 9 is an example diagram of a communication apparatus 900 provided by an embodiment of the present application.
- the communication apparatus 900 is applied to the above-mentioned network device.
- the communication device 900 includes a receiving module 910 and a sending module 920 .
- the receiving module 910 is configured to receive the request message sent by the first host, where the request message is used to request memory in the first memory pool, the network device is connected to the first host, the network device is used to exchange and forward services of the first host, and the network device is also used to manage the first memory pool.
- the sending module 920 is configured to send a memory access address to the first host, where the memory access address points to a storage unit in the first memory pool.
- the network device may also be connected to at least one second host, the network device is used to exchange and forward services of the at least one second host, the at least one second host provides a second memory pool, and the first memory pool includes the second memory pool.
- the network device may include a third memory pool, and the first memory pool may include the third memory pool.
- the first host and the network device may communicate through the remote direct memory access (RDMA) protocol.
- the first host and the network device may communicate through a Peripheral Component Interconnect Express (PCIe) bus or Compute Express Link (CXL).
- the communication apparatus 900 may further include: a processing module 930 .
- the processing module 930 may be configured to determine an idle storage unit in the first memory pool; the sending module 920 may also be configured to send the memory access address corresponding to the idle storage unit to the first host, so that the first host uses the idle storage unit.
- the processing module 930 may also be referred to as a memory management module, which is not limited.
- the processing module 930 may be further configured to record the status of each storage unit in the first memory pool.
- the status includes idle or in use.
- the receiving module 910 may be further configured to receive a notification message sent by the first host, where the notification message includes a memory access address.
- the processing module 930 may also be configured to release the storage unit corresponding to the memory access address according to the notification message.
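The bookkeeping described for the processing module 930 (determining an idle storage unit, recording each unit as idle or in use, and releasing a unit on notification) can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the class `MemoryPoolManager`, its method names, the fixed-size units, and the first-fit selection are all hypothetical.

```python
from enum import Enum
from typing import Dict, Optional


class UnitStatus(Enum):
    IDLE = "idle"
    IN_USE = "in use"


class MemoryPoolManager:
    """Tracks the status of each storage unit in the first memory pool (hypothetical)."""

    def __init__(self, base_address: int, unit_size: int, unit_count: int) -> None:
        # Assumption: the pool is divided into equally sized units, keyed by address.
        self._status: Dict[int, UnitStatus] = {
            base_address + i * unit_size: UnitStatus.IDLE for i in range(unit_count)
        }

    def handle_request(self) -> Optional[int]:
        """On a request message, pick the first idle unit and return its address."""
        for address, status in self._status.items():
            if status is UnitStatus.IDLE:
                self._status[address] = UnitStatus.IN_USE
                return address
        return None  # no idle storage unit is available

    def handle_notification(self, address: int) -> bool:
        """On a notification message, release the unit at the given address."""
        if self._status.get(address) is UnitStatus.IN_USE:
            self._status[address] = UnitStatus.IDLE
            return True
        return False  # unknown address or unit already idle

    def status_of(self, address: int) -> UnitStatus:
        return self._status[address]
```

In this sketch, the address returned by `handle_request` plays the role of the memory access address sent to the first host, and the same address is carried back in the notification message so that the unit can be marked idle again.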
- FIG. 10 is an example diagram of a communication system 1000 provided by an embodiment of the present application. As shown in FIG. 10, the communication system 1000 includes the communication apparatus 800 and the communication apparatus 900.
- FIG. 11 is an exemplary block diagram of a hardware structure of a communication apparatus 1100 provided by an embodiment of the present application.
- the communication apparatus 1100 may specifically be a computer device.
- the communication apparatus 1100 includes a memory 1110, a processor 1120, a communication interface 1130 and a bus 1140.
- the memory 1110 , the processor 1120 , and the communication interface 1130 are connected to each other through the bus 1140 for communication.
- the memory 1110 may be a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM).
- the memory 1110 may store a program, and when the program stored in the memory 1110 is executed by the processor 1120, the processor 1120 is configured to execute each step of the communication method of the embodiment of the present application.
- the processor 1120 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits, and is configured to execute related programs so as to implement the communication method of the method embodiments of the present application.
- the processor 1120 can also be an integrated circuit chip with signal processing capability.
- the communication method of the present application may be implemented by an integrated logic circuit of hardware in the processor 1120 or an instruction in the form of software.
- the above-mentioned processor 1120 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
- a general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
- the steps of the method disclosed in conjunction with the embodiments of the present application may be directly executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
- the software modules may be located in a storage medium that is well established in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or a register.
- the storage medium is located in the memory 1110; the processor 1120 reads the information in the memory 1110 and, in combination with its hardware, completes the functions required to be performed by the modules included in the apparatus of the embodiments of the present application, or executes the communication methods of the method embodiments of the present application.
- the communication interface 1130 uses a transceiving apparatus such as, but not limited to, a transceiver to implement communication between the apparatus 1100 and other devices or a communication network.
- the bus 1140 may include a pathway for communicating information between the various components of the apparatus 1100 (e.g., the memory 1110, the processor 1120, the communication interface 1130).
- An embodiment of the present application further provides a computing device, including at least one processor and a memory, wherein the at least one processor is coupled to the memory and configured to read and execute instructions in the memory to execute the communication method of the method embodiments of the present application.
- the embodiments of the present application also provide a computer program product containing instructions, when the computer program product is run on a computer, the computer program product enables the computer to execute the communication method of the method embodiments of the present application.
- the embodiments of the present application further provide a computer-readable storage medium, including instructions; the instructions are used to implement the communication method of the method embodiments of the present application.
- An embodiment of the present application further provides a chip, the chip includes a processor and a data interface, the processor reads an instruction stored in a memory through the data interface, and executes the communication method of the method embodiment of the present application.
- the chip may further include a memory in which instructions are stored; the processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the processor is configured to execute the communication method of the method embodiments of the present application.
- the disclosed system, apparatus and method may be implemented in other manners.
- the apparatus embodiments described above are only illustrative.
- the division of the units is only a logical function division. In actual implementation, there may be other division methods.
- multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
- the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections implemented through some interfaces, apparatuses, or units, and may be in electrical, mechanical, or other forms.
- the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
- each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
- Computer And Data Communications (AREA)
Claims (35)
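- A communication method, comprising: receiving, by a first host, a memory access address sent by a network device, wherein the memory access address points to a storage unit in a first memory pool, the network device is connected to the first host, the network device is used to exchange and forward services of the first host, and the network device is also used to manage the first memory pool; and when memory of the first host satisfies a preset condition, accessing, by the first host, the storage unit according to the memory access address.
- The communication method according to claim 1, wherein the network device is further connected to at least one second host, the network device is used to exchange and forward services of the at least one second host, the at least one second host provides a second memory pool, and the first memory pool includes the second memory pool.
- The communication method according to claim 1 or 2, wherein the network device includes a third memory pool, and the first memory pool includes the third memory pool.
- The communication method according to any one of claims 1 to 3, wherein before the first host receives the memory access address sent by the network device, the method further comprises: when the memory of the first host satisfies the preset condition, sending, by the first host, a request message to the network device, wherein the request message is used to request memory in the first memory pool.
- The communication method according to any one of claims 1 to 4, wherein the preset condition is any one of the following: the memory usage rate of the first host is greater than a first threshold; the remaining memory space of the first host is less than a second threshold; the remaining memory space of the first host is less than the memory space required by the first host to process services in a future target time period; or the memory usage policy of the first host is to preferentially use the first memory pool.
- The communication method according to any one of claims 1 to 5, wherein the first host includes a network card, and the first host and the network device communicate through the remote direct memory access (RDMA) protocol.
- The communication method according to any one of claims 1 to 5, wherein the first host includes a bus interface, and the first host and the network device communicate through a Peripheral Component Interconnect Express (PCIe) bus or Compute Express Link (CXL).
- The communication method according to any one of claims 1 to 7, wherein when the first host no longer needs to use the storage unit, the method further comprises: sending, by the first host, a notification message to the network device, wherein the notification message includes the memory access address, so that the network device releases the storage unit.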
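- A communication method, comprising: receiving, by a network device, a request message sent by a first host, wherein the request message is used to request memory in a first memory pool, the network device is connected to the first host, the network device is used to exchange and forward services of the first host, and the network device is also used to manage the first memory pool; and sending, by the network device, a memory access address to the first host, wherein the memory access address points to a storage unit in the first memory pool.
- The communication method according to claim 9, wherein the network device is further connected to at least one second host, the network device is used to exchange and forward services of the at least one second host, the at least one second host provides a second memory pool, and the first memory pool includes the second memory pool.
- The communication method according to claim 9 or 10, wherein the network device includes a third memory pool, and the first memory pool includes the third memory pool.
- The communication method according to any one of claims 9 to 11, wherein the first host and the network device communicate through the remote direct memory access (RDMA) protocol.
- The communication method according to any one of claims 9 to 11, wherein the first host and the network device communicate through a Peripheral Component Interconnect Express (PCIe) bus or Compute Express Link (CXL).
- The communication method according to any one of claims 9 to 13, wherein the sending, by the network device, of the memory access address to the first host comprises: determining, by the network device, an idle storage unit in the first memory pool; and sending, by the network device, the memory access address corresponding to the idle storage unit to the first host, so that the first host uses the idle storage unit.
- The communication method according to claim 14, wherein the method further comprises: recording, by the network device, the status of each storage unit in the first memory pool, wherein the status includes idle or in use.
- The communication method according to claim 14, wherein the method further comprises: receiving, by the network device, a notification message sent by the first host, wherein the notification message includes the memory access address; and releasing, by the network device according to the notification message, the storage unit corresponding to the memory access address.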
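- A communication apparatus, wherein the communication apparatus is applied to a first host and comprises: a receiving module, configured to receive a memory access address sent by a network device, wherein the memory access address points to a storage unit in a first memory pool, the network device is connected to the first host, the network device is used to exchange and forward services of the first host, and the network device is also used to manage the first memory pool; and a processing module, configured to access the storage unit according to the memory access address when memory of the first host satisfies a preset condition.
- The communication apparatus according to claim 17, wherein the network device is further connected to at least one second host, the network device is used to exchange and forward services of the at least one second host, the at least one second host provides a second memory pool, and the first memory pool includes the second memory pool.
- The communication apparatus according to claim 17 or 18, wherein the network device includes a third memory pool, and the first memory pool includes the third memory pool.
- The communication apparatus according to any one of claims 17 to 19, wherein the communication apparatus further comprises: a sending module, configured to send a request message to the network device when the memory of the first host satisfies the preset condition, wherein the request message is used to request memory in the first memory pool.
- The communication apparatus according to any one of claims 17 to 20, wherein the preset condition is any one of the following: the memory usage rate of the first host is greater than a first threshold; the remaining memory space of the first host is less than a second threshold; the remaining memory space of the first host is less than the memory space required by the first host to process services in a future target time period; or the memory usage policy of the first host is to preferentially use the first memory pool.
- The communication apparatus according to any one of claims 17 to 21, wherein the first host includes a network card, and the first host and the network device communicate through the remote direct memory access (RDMA) protocol.
- The communication apparatus according to any one of claims 17 to 21, wherein the first host includes a bus interface, and the first host and the network device communicate through a Peripheral Component Interconnect Express (PCIe) bus or Compute Express Link (CXL).
- The communication apparatus according to claim 20, wherein when the first host no longer needs to use the storage unit, the sending module is further configured to send a notification message to the network device, wherein the notification message includes the memory access address, so that the network device releases the storage unit.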
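- A communication apparatus, wherein the communication apparatus is applied to a network device and comprises: a receiving module, configured to receive a request message sent by a first host, wherein the request message is used to request memory in a first memory pool, the network device is connected to the first host, the network device is used to exchange and forward services of the first host, and the network device is also used to manage the first memory pool; and a sending module, configured to send a memory access address to the first host, wherein the memory access address points to a storage unit in the first memory pool.
- The communication apparatus according to claim 25, wherein the network device is further connected to at least one second host, the network device is used to exchange and forward services of the at least one second host, the at least one second host provides a second memory pool, and the first memory pool includes the second memory pool.
- The communication apparatus according to claim 25 or 26, wherein the network device includes a third memory pool, and the first memory pool includes the third memory pool.
- The communication apparatus according to any one of claims 25 to 27, wherein the first host and the network device communicate through the remote direct memory access (RDMA) protocol.
- The communication apparatus according to any one of claims 25 to 27, wherein the first host and the network device communicate through a Peripheral Component Interconnect Express (PCIe) bus or Compute Express Link (CXL).
- The communication apparatus according to any one of claims 25 to 29, wherein the communication apparatus further comprises: a processing module, configured to determine an idle storage unit in the first memory pool; and the sending module is further configured to send the memory access address corresponding to the idle storage unit to the first host, so that the first host uses the idle storage unit.
- The communication apparatus according to claim 30, wherein the processing module is further configured to record the status of each storage unit in the first memory pool, wherein the status includes idle or in use.
- The communication apparatus according to claim 30, wherein the receiving module is further configured to receive a notification message sent by the first host, wherein the notification message includes the memory access address; and the processing module is further configured to release, according to the notification message, the storage unit corresponding to the memory access address.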
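- A communication system, comprising: the communication apparatus according to any one of claims 17 to 24 and the communication apparatus according to any one of claims 25 to 32.
- A communication apparatus, comprising a processor and a memory, wherein the processor runs instructions in the memory so that the communication apparatus executes the communication method according to any one of claims 1 to 8, and/or executes the communication method according to any one of claims 9 to 16.
- A computer-readable storage medium, comprising instructions, wherein the instructions are used to implement the communication method according to any one of claims 1 to 8, and/or to implement the communication method according to any one of claims 9 to 16.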
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP21925420.8A EP4276643B1 (en) | 2021-02-10 | 2021-09-27 | Communication method, apparatus, and system |
| US18/447,046 US12380018B2 (en) | 2021-02-10 | 2023-08-09 | Method for host access to network device-managed memory pool |
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110184060 | 2021-02-10 | ||
| CN202110184060.3 | 2021-02-10 | ||
| CN202110656360.7A CN114911725A (zh) | 2021-02-10 | 2021-06-11 | Communication method, apparatus, and system |
| CN202110656360.7 | 2021-06-11 |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US18/447,046 Continuation US12380018B2 (en) | 2021-02-10 | 2023-08-09 | Method for host access to network device-managed memory pool |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2022170769A1 (zh) | 2022-08-18 |
Family
ID=82761468
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2021/120844 Ceased WO2022170769A1 (zh) | 2021-02-10 | 2021-09-27 | Communication method, apparatus, and system |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US12380018B2 (zh) |
| EP (1) | EP4276643B1 (zh) |
| CN (1) | CN114911725A (zh) |
| WO (1) | WO2022170769A1 (zh) |
Families Citing this family (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN117667300A (zh) * | 2022-08-31 | 2024-03-08 | 阿里巴巴(中国)有限公司 | Computing system and related method |
| CN118244972A (zh) * | 2022-12-24 | 2024-06-25 | 华为技术有限公司 | Data storage method, apparatus, and system |
| CN119496792A (zh) * | 2023-08-21 | 2025-02-21 | 超聚变数字技术有限公司 | Data transmission method and computing node |
| CN119402566B (zh) * | 2024-12-30 | 2025-04-29 | 苏州元脑智能科技有限公司 | Memory management system, method, program product, and storage medium |
Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN104239222A (zh) * | 2013-06-20 | 2014-12-24 | 华为技术有限公司 | Memory access method, device, and system |
| US20150026780A1 (en) * | 2012-03-07 | 2015-01-22 | Ntt Docomo, Inc. | Host providing system and communication control method |
| US20170134225A1 (en) * | 2015-11-05 | 2017-05-11 | Accelstor, Inc. | Network apparatus for temporarily accessing network setting and method using thereof |
| CN106776048A (zh) * | 2017-01-24 | 2017-05-31 | 郑州云海信息技术有限公司 | Real-time virtual machine memory scheduling method and apparatus |
| CN108023914A (zh) * | 2016-11-03 | 2018-05-11 | 阿里巴巴集团控股有限公司 | In-memory data sharing system, and methods for writing and reading in-memory data |
| CN110008140A (zh) * | 2019-03-11 | 2019-07-12 | 深圳市广和通无线股份有限公司 | Memory management method and apparatus, computer device, and storage medium |
| CN112291086A (zh) * | 2020-10-16 | 2021-01-29 | 苏州浪潮智能科技有限公司 | Memory expansion method, system, and apparatus for a switch |
Family Cites Families (15)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2001002965A2 (en) * | 1999-06-30 | 2001-01-11 | Broadcom Corporation | Memory management unit for a network switch |
| US8429652B2 (en) * | 2009-06-22 | 2013-04-23 | Citrix Systems, Inc. | Systems and methods for spillover in a multi-core system |
| US8537613B2 (en) * | 2011-03-31 | 2013-09-17 | Sandisk Technologies Inc. | Multi-layer memory system |
| US9128843B2 (en) * | 2012-10-11 | 2015-09-08 | Industrial Technology Research Institute | Method and computer system for memory management on virtual machine system |
| US9692820B2 (en) * | 2013-04-06 | 2017-06-27 | Citrix Systems, Inc. | Systems and methods for cluster parameter limit |
| CN107135189B (zh) * | 2016-02-26 | 2020-02-14 | 华为技术有限公司 | Packet sending method and physical machine |
| US20190044809A1 (en) * | 2017-08-30 | 2019-02-07 | Intel Corporation | Technologies for managing a flexible host interface of a network interface controller |
| US11636014B2 (en) * | 2017-10-31 | 2023-04-25 | SK Hynix Inc. | Memory system and data processing system including the same |
| US12135876B2 (en) * | 2018-02-05 | 2024-11-05 | Micron Technology, Inc. | Memory systems having controllers embedded in packages of integrated circuit memory |
| US11792307B2 (en) * | 2018-03-28 | 2023-10-17 | Apple Inc. | Methods and apparatus for single entity buffer pool management |
| KR102769757B1 (ko) * | 2018-12-21 | 2025-02-20 | 에스케이하이닉스 주식회사 | Memory system and operating method of memory system |
| US20200322287A1 (en) * | 2020-06-18 | 2020-10-08 | Intel Corporation | Switch-managed resource allocation and software execution |
| US11841793B2 (en) * | 2021-01-27 | 2023-12-12 | Rambus Inc. | Switch-based free memory tracking in data center environments |
| US12511071B2 (en) * | 2022-03-31 | 2025-12-30 | Intel Corporation | Advanced interleaving techniques for fabric based pooling architectures |
| US12498974B2 (en) * | 2022-03-31 | 2025-12-16 | Intel Corporation | Adaptive collaborative memory with the assistance of programmable networking devices |
- 2021-06-11: CN application CN202110656360.7A, published as CN114911725A (zh), status: pending
- 2021-09-27: WO application PCT/CN2021/120844, published as WO2022170769A1 (zh), status: ceased
- 2021-09-27: EP application EP21925420.8A, published as EP4276643B1 (en), status: active
- 2023-08-09: US application US18/447,046, published as US12380018B2 (en), status: active
Patent Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20150026780A1 (en) * | 2012-03-07 | 2015-01-22 | Ntt Docomo, Inc. | Host providing system and communication control method |
| CN104239222A (zh) * | 2013-06-20 | 2014-12-24 | 华为技术有限公司 | Memory access method, device, and system |
| US20170134225A1 (en) * | 2015-11-05 | 2017-05-11 | Accelstor, Inc. | Network apparatus for temporarily accessing network setting and method using thereof |
| CN108023914A (zh) * | 2016-11-03 | 2018-05-11 | 阿里巴巴集团控股有限公司 | In-memory data sharing system, and methods for writing and reading in-memory data |
| CN106776048A (zh) * | 2017-01-24 | 2017-05-31 | 郑州云海信息技术有限公司 | Real-time virtual machine memory scheduling method and apparatus |
| CN110008140A (zh) * | 2019-03-11 | 2019-07-12 | 深圳市广和通无线股份有限公司 | Memory management method and apparatus, computer device, and storage medium |
| CN112291086A (zh) * | 2020-10-16 | 2021-01-29 | 苏州浪潮智能科技有限公司 | Memory expansion method, system, and apparatus for a switch |
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP4276643A4 |
Also Published As
| Publication number | Publication date |
|---|---|
| EP4276643A1 (en) | 2023-11-15 |
| EP4276643A4 (en) | 2024-07-10 |
| CN114911725A (zh) | 2022-08-16 |
| EP4276643B1 (en) | 2025-04-23 |
| US12380018B2 (en) | 2025-08-05 |
| US20230385190A1 (en) | 2023-11-30 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP7752489B2 (ja) | メモリリソースを管理するためのシステム及び方法 | |
| US11841814B2 (en) | System with cache-coherent memory and server-linking switch | |
| US11748278B2 (en) | Multi-protocol support for transactions | |
| US11507426B2 (en) | Resource pool management method and apparatus, resource pool control unit, and communications device | |
| EP4276643B1 (en) | Communication method, apparatus, and system | |
| EP4439312A1 (en) | Data storage method and system, storage access configuration method and related device | |
| US20150261698A1 (en) | Memory system, memory module, memory module access method, and computer system | |
| WO2022271239A1 (en) | Queue scaling based, at least, in part, on processing load | |
| AU2015402888B2 (en) | Computer device and method for reading/writing data by computer device | |
| WO2024179298A1 (zh) | 跨机柜服务器内存池化方法、装置、设备、服务器及介质 | |
| CN116126742A (zh) | 内存访问方法、装置、服务器及存储介质 | |
| CN118779280B (zh) | 降低总线负载的方法、cxl模组、处理系统和处理器芯片 | |
| TW202020674A (zh) | 數據處理系統 | |
| CN113722110B (zh) | 计算机系统、内存访问方法及设备 | |
| CN118349399A (zh) | 一种数据迁移方法、控制器及扩展存储箱 | |
| CN119718707B (zh) | 缓存管理方法、装置及计算机可读存储介质 | |
| HK40062664A (zh) | 计算机系统、内存访问方法及设备 | |
| HK40062664B (zh) | 计算机系统、内存访问方法及设备 | |
| WO2026066110A1 (zh) | 存储节点、存储阵列和数据访问方法 | |
| CN117499511A (zh) | 报文处理方法、芯片和计算机设备 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21925420 Country of ref document: EP Kind code of ref document: A1 |
|
| WWE | Wipo information: entry into national phase |
Ref document number: 202337051609 Country of ref document: IN |
|
| ENP | Entry into the national phase |
Ref document number: 2021925420 Country of ref document: EP Effective date: 20230807 |
|
| NENP | Non-entry into the national phase |
Ref country code: DE |
|
| WWG | Wipo information: grant in national office |
Ref document number: 2021925420 Country of ref document: EP |