WO2018040629A1 - Key value storage method, apparatus, and system - Google Patents

Key value storage method, apparatus, and system

Info

Publication number
WO2018040629A1
Authority
WO
WIPO (PCT)
Prior art keywords
storage
request
key
address information
host
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2017/085983
Other languages
English (en)
French (fr)
Inventor
高峰
袁慧琴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to EP17844936.9A (patent EP3495970B1)
Publication of WO2018040629A1
Priority to US16/287,826 (patent US11048642B2)
Anticipated expiration
Legal status: Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2255 Hash tables
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10 Address translation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0659 Command handling arrangements, e.g. command buffers, queues, command scheduling
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/65 Details of virtual memory and virtual address translation
    • G06F2212/657 Virtual address space management

Definitions

  • the present application relates to the field of storage technologies, and in particular, to a key value storage method, apparatus, and system.
  • Key-value (KV) storage is a storage approach used by non-relational (NoSQL) databases. Its data is organized, indexed, and stored in the form of key-value pairs. It is characterized by simple storage semantics, good storage system scalability, fast data queries, and large data storage capacity.
  • the non-volatile memory express (NVMe) protocol is an efficient storage protocol among the storage protocols used by current storage systems.
  • Compared with a storage system that uses the traditional small computer system interface (SCSI) protocol, a storage system that uses the NVMe protocol omits the general-purpose input/output (IO) scheduling layer, the SCSI upper layer, and the SCSI middle layer. It therefore has a short IO path, low latency, and strong concurrent processing capability.
  • the embodiment of the present invention provides a key value storage method, device, and system for solving the problem that the key value storage and some efficient storage protocols (such as the NVMe protocol) cannot be combined in the prior art.
  • the embodiment of the present application provides the following technical solutions:
  • an embodiment of the present application provides a key value storage method. The method includes: after detecting a key value storage request, the host writes the first instruction code and the address information carried by the key value storage request into the fields defined by a protocol to form a first storage request instruction sequence, where the first instruction code is an instruction code defined according to a reserved extension field of the protocol; the host then interacts with the storage controller so that the storage controller obtains the first storage request instruction sequence.
  • Because the host interacts with the storage controller through the storage protocol itself to implement key value storage, no conversion at the block layer or the file system is needed. This reduces the IO path latency of the storage system.
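  • As a rough illustration of forming a storage request instruction sequence, the following Python sketch packs an instruction code and address information into one fixed-size command. The 64-byte size matches the size of an NVMe command; the instruction code values, the field layout, and the names `build_kv_command`, `KV_WRITE`, `KV_GET`, and `KV_DELETE` are assumptions made for this example and are not defined by the application.

```python
import struct

CMD_SIZE = 64  # assumed fixed command size (NVMe commands are 64 bytes)

# Hypothetical instruction codes, assumed to come from a range that the
# protocol reserves for extensions.
KV_WRITE = 0x81
KV_GET = 0x82
KV_DELETE = 0x83

# Hypothetical layout: opcode (1 byte), 3 pad bytes, key address (8 bytes),
# key length (4 bytes), value address (8 bytes), value length (4 bytes).
_LAYOUT = struct.Struct("<B3xQIQI")


def build_kv_command(opcode, key_addr, key_len, value_addr=0, value_len=0):
    """Write the instruction code and address information into one command."""
    body = _LAYOUT.pack(opcode, key_addr, key_len, value_addr, value_len)
    # Zero-fill the unused remainder of the command.
    return body.ljust(CMD_SIZE, b"\x00")


command = build_kv_command(KV_WRITE, key_addr=0x1000, key_len=16,
                           value_addr=0x2000, value_len=4096)
```

  • A get or delete request in this sketch would leave the value fields at zero, since only the key's address information is needed.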
  • the host writing the first instruction code and the address information carried by the key value storage request into the fields defined by the protocol to form the first storage request instruction sequence includes: the host allocates memory for the first storage request instruction sequence according to the fields defined by the protocol; and the host writes the first instruction code and the address information carried by the key value storage request into the memory. Further, the host interacting with the storage controller (for example, by sending a notification message) so that the storage controller obtains the first storage request instruction sequence includes: the host notifies the storage controller to read the first storage request instruction sequence from the memory.
  • the host may also directly send the first sequence of storage request instructions to the storage controller.
  • the manner in which the host notifies the storage controller to read the first storage request instruction sequence from the memory may save the memory space of the storage controller.
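  • The notify-and-read interaction can be sketched as a small simulation. This is only a toy model under stated assumptions: the classes `Host` and `StorageController`, the method names `submit` and `notify`, and the simple bump allocation are all hypothetical, and a `bytearray` stands in for host memory that the controller can read.

```python
class StorageController:
    """Toy controller that pulls commands out of host memory when notified."""

    def __init__(self, host_memory):
        self.host_memory = host_memory  # shared view of the host's memory
        self.fetched = []               # commands the controller has read

    def notify(self, offset, length):
        # Read the command from host memory instead of receiving a copy.
        self.fetched.append(bytes(self.host_memory[offset:offset + length]))


class Host:
    """Toy host that places commands in its own memory."""

    def __init__(self, size=4096):
        self.memory = bytearray(size)
        self.next_free = 0

    def submit(self, controller, command):
        # Allocate space for the command, write it, then notify the controller.
        offset = self.next_free
        self.memory[offset:offset + len(command)] = command
        self.next_free += len(command)
        controller.notify(offset, len(command))
        return offset


host = Host()
controller = StorageController(host.memory)
first = host.submit(controller, b"\x81" + bytes(63))
second = host.submit(controller, b"\x82" + bytes(63))
```

  • In this model the controller only needs transient space for the command it is currently handling, which is the memory saving the text describes.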
  • the key value storage request includes a write data request, a get data request, a delete data request, or an obsolete data request; the address information carried by the write data request includes the storage address information of the key and of the value.
  • when the key value storage request is a get data request, the method further includes: the host obtains the length of the value requested by the get data request; and the host allocates memory for the value according to that length, so that after the host interacts with the storage controller and the storage controller obtains the first storage request instruction sequence, the storage controller reads the data into the memory that the host allocated for the value.
  • the host first needs to allocate memory for the value. In this way, the data read by the storage controller from the storage device has corresponding storage space.
  • the host obtaining the length of the value requested by the get data request includes: the host writes an instruction code for obtaining the length of the value corresponding to the key, together with the storage address information of the key, into the fields defined by the protocol to form a second storage request instruction sequence, where the instruction code for obtaining the length of the value corresponding to the key is an instruction code defined according to the reserved extension field of the protocol; the host interacts with the storage controller so that the storage controller obtains the second storage request instruction sequence; and the host receives the length of the value sent by the storage controller.
  • In this way, the host can obtain the length of the value requested by the get data request.
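  • The two-step handling of a get data request (obtain the length, allocate memory, then read) can be sketched as follows. This is a simplified model, not the application's implementation: a Python dict stands in for the storage device, and `get_value_length`, `read_value`, and `host_get` are hypothetical names.

```python
class StorageController:
    """Toy controller backed by a dict that stands in for the storage device."""

    def __init__(self, device):
        self.device = device

    def get_value_length(self, key):
        # Corresponds to the "get length of value" instruction code.
        return len(self.device[key])

    def read_value(self, key, buffer):
        # Corresponds to the get data request: fill the host's buffer.
        value = self.device[key]
        buffer[:len(value)] = value
        return len(value)


def host_get(controller, key):
    length = controller.get_value_length(key)  # step 1: ask for the length
    buffer = bytearray(length)                 # step 2: allocate memory for the value
    controller.read_value(key, buffer)         # step 3: controller reads into it
    return bytes(buffer)


controller = StorageController({b"user:1": b"alice"})
result = host_get(controller, b"user:1")
```

  • Asking for the length first means the host never has to guess a buffer size or over-allocate for values of unknown length.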
  • the key value storage request is an aggregate key value storage request that includes a plurality of single key value storage requests; the address information carried by the aggregate key value storage request is address information that indexes an aggregation table, and the aggregation table includes the address information carried by each of the single key value storage requests.
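  • One way to picture the aggregation table is as a packed array of per-request address entries, where the aggregate command carries only the table's own address. The sketch below assumes a hypothetical 12-byte entry (key address and key length); the entry layout and the names `build_aggregation_table` and `read_aggregation_table` are illustrative assumptions.

```python
import struct

# Hypothetical entry layout: key address (8 bytes) and key length (4 bytes).
ENTRY = struct.Struct("<QI")


def build_aggregation_table(requests):
    """Host side: pack the address information of each single request."""
    return b"".join(ENTRY.pack(addr, length) for addr, length in requests)


def read_aggregation_table(table):
    """Controller side: recover the address information of each single request."""
    return [ENTRY.unpack_from(table, offset)
            for offset in range(0, len(table), ENTRY.size)]


single_requests = [(0x1000, 16), (0x1040, 8), (0x1080, 32)]
table = build_aggregation_table(single_requests)
```

  • With this arrangement, one command can stand for many single requests, for example an aggregate delete of several keys.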
  • the method further includes: after detecting a non-key-value storage request, the host writes the second instruction code and the data memory pointer carried by the non-key-value storage request into the fields defined by the protocol to form a third storage request instruction sequence, where the second instruction code is a standard instruction code of the protocol; and the host interacts with the storage controller so that the storage controller obtains the third storage request instruction sequence.
  • In this way, traditional block device storage is supported alongside key value storage.
  • an embodiment of the present application provides a key value storage method. The method includes: the storage controller obtains a first storage request instruction sequence, which the host forms by writing the first instruction code and the address information carried by a key value storage request into the fields defined by a protocol, where the first instruction code is an instruction code defined according to a reserved extension field of the protocol; the storage controller separates the first instruction code and the address information from the first storage request instruction sequence; and the storage controller performs, on the storage device, the operation corresponding to the first instruction code according to the first instruction code and the address information.
  • Because the host interacts with the storage controller through the storage protocol itself to implement key value storage, no conversion at the block layer or the file system is needed. This reduces the IO path latency of the storage system.
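  • The controller-side steps (separate the instruction code and the address information, then operate on the storage device) might look like the following sketch. It assumes a hypothetical command layout and hypothetical instruction codes, treats a `bytearray` as host memory addressed by offset, and uses a dict in place of the storage device.

```python
import struct

KV_WRITE, KV_GET, KV_DELETE = 0x81, 0x82, 0x83  # hypothetical instruction codes
LAYOUT = struct.Struct("<B3xQIQI")               # hypothetical field layout


def execute(command, host_memory, device):
    """Separate the instruction code and address information, then act on the device."""
    opcode, key_addr, key_len, value_addr, value_len = LAYOUT.unpack_from(command)
    key = bytes(host_memory[key_addr:key_addr + key_len])
    if opcode == KV_WRITE:
        device[key] = bytes(host_memory[value_addr:value_addr + value_len])
    elif opcode == KV_GET:
        value = device[key]
        # Read the data into the memory the host set aside for the value.
        host_memory[value_addr:value_addr + len(value)] = value
    elif opcode == KV_DELETE:
        device.pop(key, None)
    else:
        raise ValueError(f"unsupported instruction code: {opcode:#x}")


host_memory = bytearray(256)
host_memory[0:3] = b"k-1"        # key stored at offset 0
host_memory[16:21] = b"hello"    # value stored at offset 16
device = {}

write_cmd = LAYOUT.pack(KV_WRITE, 0, 3, 16, 5).ljust(64, b"\x00")
execute(write_cmd, host_memory, device)

get_cmd = LAYOUT.pack(KV_GET, 0, 3, 32, 5).ljust(64, b"\x00")
execute(get_cmd, host_memory, device)
```

  • After both commands run, the device holds the pair and the value has been copied back into host memory at offset 32.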
  • the storage controller obtaining the first storage request instruction sequence includes: the storage controller reads the first storage request instruction sequence from the memory that the host allocated for it according to the fields defined by the protocol.
  • the storage controller may directly receive the first storage request instruction sequence sent by the host.
  • Keeping the first storage request instruction sequence in the memory that the host allocated for it according to the fields defined by the protocol saves memory space in the storage controller.
  • the key value storage request includes a write data request, a get data request, a delete data request, or an obsolete data request. The address information carried by the write data request includes the storage address information of the key and of the value; the address information carried by the get data request includes the storage address information of the key; the address information carried by the delete data request includes the storage address information of the key; and the address information carried by the obsolete data request includes the storage address information of the key.
  • the method further includes: the storage controller obtains the length of the value requested by the get data request; and the storage controller sends the length of the value to the host, so that the host allocates memory for the value according to that length. Further, the storage controller performing, on the storage device, the operation corresponding to the first instruction code according to the first instruction code and the address information includes: the storage controller reads the data into the memory that the host allocated for the value, according to the first instruction code and the address information.
  • the host first needs to allocate memory for the value. In this way, the data read by the storage controller from the storage device has corresponding storage space.
  • the storage controller obtaining the length of the value requested by the get data request includes: the storage controller obtains a second storage request instruction sequence, which the host forms by writing an instruction code for obtaining the length of the value corresponding to the key, together with the storage address information of the key, into the fields defined by the protocol, where that instruction code is an instruction code defined according to the reserved extension field of the protocol; the storage controller separates that instruction code and the storage address information of the key from the second storage request instruction sequence; and, according to that instruction code and the storage address information of the key, the storage controller obtains the length of the value from the storage device.
  • the manner in which the storage controller acquires the second storage request instruction sequence may refer to the manner in which the storage controller obtains the first storage request instruction sequence, and details are not described herein again.
  • In this way, the host can obtain the length of the value requested by the get data request.
  • the key value storage request is an aggregate key value storage request that includes a plurality of single key value storage requests; the address information carried by the aggregate key value storage request is address information that indexes an aggregation table, and the aggregation table includes the address information carried by each of the single key value storage requests.
  • the method further includes: the storage controller obtains a third storage request instruction sequence, which the host forms by writing the second instruction code and the data memory pointer carried by a non-key-value storage request into the fields defined by the protocol, where the second instruction code is a standard instruction code of the protocol; the storage controller separates the second instruction code and the data memory pointer from the third storage request instruction sequence; and the storage controller performs, on the storage device, the operation corresponding to the second instruction code according to the second instruction code and the data memory pointer.
  • the manner in which the storage controller obtains the sequence of the third storage request instruction may refer to the manner in which the storage controller obtains the sequence of the first storage request instruction, and details are not described herein again.
  • the traditional block device storage can be supported while supporting the key value storage.
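  • Supporting both kinds of request on one controller amounts to routing on the instruction code. The sketch below assumes the instruction code sits in byte 0 of the command, as in NVMe. The standard opcodes shown (Flush 0x00, Write 0x01, Read 0x02) are those of the NVMe NVM command set; the key value instruction codes and the `route` function are hypothetical.

```python
# Standard NVM command set opcodes (Flush, Write, Read).
STANDARD_OPCODES = {0x00: "flush", 0x01: "write", 0x02: "read"}

# Hypothetical key value instruction codes in an assumed extension range.
KV_OPCODES = {0x81: "kv_write", 0x82: "kv_get",
              0x83: "kv_delete", 0x84: "kv_obsolete"}


def route(command):
    """Return which path handles the command and the operation name."""
    opcode = command[0]
    if opcode in STANDARD_OPCODES:
        return "block", STANDARD_OPCODES[opcode]
    if opcode in KV_OPCODES:
        return "key-value", KV_OPCODES[opcode]
    raise ValueError(f"unknown instruction code: {opcode:#x}")


block_path = route(bytes([0x02]) + bytes(63))
kv_path = route(bytes([0x81]) + bytes(63))
```

  • Because standard instruction codes keep their meaning, existing block device traffic in this model is unaffected by the added key value codes.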
  • an embodiment of the present application provides a host that includes a processing module and a communication module. The processing module is configured to, after detecting a key value storage request, write the first instruction code and the address information carried by the key value storage request into the fields defined by a protocol to form a first storage request instruction sequence, where the first instruction code is an instruction code defined according to a reserved extension field of the protocol. The communication module is configured to interact with the storage controller so that the storage controller obtains the first storage request instruction sequence.
  • the processing module is specifically configured to: allocate memory for the first storage request instruction sequence according to the fields defined by the protocol; and write the first instruction code and the address information carried by the key value storage request into the memory. The communication module is specifically configured to: notify the storage controller to read the first storage request instruction sequence from the memory; or send the first storage request instruction sequence to the storage controller.
  • the key value storage request includes a write data request, a get data request, a delete data request, or an obsolete data request; the address information carried by the write data request includes the storage address information of the key and of the value.
  • the processing module is further configured to: before writing the first instruction code and the address information carried by the key value storage request into the fields defined by the protocol to form the first storage request instruction sequence, obtain the length of the value requested by the get data request; and allocate memory for the value according to that length, so that after the communication module interacts with the storage controller and the storage controller obtains the first storage request instruction sequence, the storage controller reads the data into the memory that the processing module allocated for the value.
  • the processing module is specifically configured to: write an instruction code for obtaining the length of the value corresponding to the key, together with the storage address information of the key, into the fields defined by the protocol to form a second storage request instruction sequence, where the instruction code for obtaining the length of the value corresponding to the key is an instruction code defined according to the reserved extension field of the protocol; interact with the storage controller through the communication module so that the storage controller obtains the second storage request instruction sequence; and receive, through the communication module, the length of the value sent by the storage controller.
  • the key value storage request is an aggregate key value storage request that includes a plurality of single key value storage requests; the address information carried by the aggregate key value storage request is address information that indexes an aggregation table, and the aggregation table includes the address information carried by each of the single key value storage requests.
  • the processing module is further configured to: after detecting a non-key-value storage request, write the second instruction code and the data memory pointer carried by the non-key-value storage request into the fields defined by the protocol to form a third storage request instruction sequence, where the second instruction code is a standard instruction code of the protocol. The communication module is further configured to interact with the storage controller so that the storage controller obtains the third storage request instruction sequence.
  • the host provided by the embodiment of the present application can be used to perform the functions performed by the host in the foregoing method embodiment. Therefore, the related technical solutions can be referred to the related description in the foregoing method embodiments, and details are not described herein again.
  • an embodiment of the present application provides a storage controller that includes a front-end communication module, a back-end communication module, a processing module, and a control module. The front-end communication module is configured to obtain a first storage request instruction sequence from the host, which the host forms by writing the first instruction code and the address information carried by a key value storage request into the fields defined by a protocol, where the first instruction code is an instruction code defined according to a reserved extension field of the protocol. The processing module is configured to separate the first instruction code and the address information from the first storage request instruction sequence. The control module is configured to perform, on the storage device through the back-end communication module, the operation corresponding to the first instruction code according to the first instruction code and the address information.
  • the front-end communication module is specifically configured to: read the first storage request instruction sequence from the memory that the host allocated for it according to the fields defined by the protocol; or receive the first storage request instruction sequence sent by the host.
  • the key value storage request includes a write data request, a get data request, a delete data request, or an obsolete data request. The address information carried by the write data request includes the storage address information of the key and of the value; the address information carried by the get data request includes the storage address information of the key; the address information carried by the delete data request includes the storage address information of the key; and the address information carried by the obsolete data request includes the storage address information of the key.
  • the processing module is further configured to obtain the length of the value requested by the get data request before the front-end communication module obtains the first storage request instruction sequence. The front-end communication module is further configured to send the length of the value to the host, so that the host allocates memory for the value according to that length. The control module is specifically configured to read the data, according to the first instruction code and the address information, into the memory that the host allocated for the value.
  • the processing module is specifically configured to: obtain a second storage request instruction sequence, which the host forms by writing an instruction code for obtaining the length of the value corresponding to the key, together with the storage address information of the key, into the fields defined by the protocol, where the instruction code for obtaining the length of the value corresponding to the key is an instruction code defined according to the reserved extension field of the protocol; separate that instruction code and the storage address information of the key from the second storage request instruction sequence; and, according to that instruction code and the storage address information of the key, obtain the length of the value from the storage device through the control module and the back-end communication module.
  • the key value storage request is an aggregate key value storage request that includes a plurality of single key value storage requests; the address information carried by the aggregate key value storage request is address information that indexes an aggregation table, and the aggregation table includes the address information carried by each of the single key value storage requests.
  • the front-end communication module is further configured to obtain a third storage request instruction sequence from the host, which the host forms by writing the second instruction code and the data memory pointer carried by a non-key-value storage request into the fields defined by the protocol, where the second instruction code is a standard instruction code of the protocol. The processing module is further configured to separate the second instruction code and the data memory pointer from the third storage request instruction sequence. The control module is further configured to perform, on the storage device, the operation corresponding to the second instruction code according to the second instruction code and the data memory pointer.
  • the storage controller provided by the embodiment of the present application can be used to perform the functions performed by the storage controller in the foregoing method embodiments. For the technical effects that can be obtained, refer to the related description in the foregoing method embodiments; details are not described herein again.
  • the protocol described in any of the above aspects includes the non-volatile memory express (NVMe) protocol, where the NVMe protocol defines bytes 0-63 as the fields of a storage request instruction sequence.
  • By combining key value storage with the NVMe protocol, the storage system gains both the advantages of key value storage (simple storage semantics, good storage system scalability, fast data queries, and large data storage capacity) and the advantages of the NVMe protocol (a short IO path, low latency, and strong concurrent processing capability).
  • the protocol described in any of the above aspects includes a small computer system interface SCSI protocol.
  • Because the SCSI protocol is a general-purpose storage protocol, combining key value storage with the SCSI protocol is more versatile.
  • the above-mentioned protocol may also be other storage protocols, which are not specifically limited in this embodiment of the present application.
  • the form of the instruction sequence in any aspect of the foregoing aspects may be a queue, or may be a data packet or the like, and is not specifically limited in this embodiment of the present application.
  • the embodiment of the present application provides a host, where the host can implement the functions performed by the host in the foregoing method embodiment, and the function can be implemented by using hardware or by executing corresponding software by hardware.
  • the hardware or software includes one or more modules corresponding to the above functions.
  • the host includes a processor and a communication interface configured to support the host to perform the corresponding functions of the above methods.
  • the communication interface is used to support communication between the host and other network elements.
  • the host can also include a memory for coupling with the processor that holds the necessary program instructions and data for the host.
  • the embodiment of the present application provides a storage controller, which can implement the functions performed by the storage controller in the foregoing method embodiment, and the function can be implemented by hardware or by executing corresponding software through hardware.
  • the hardware or software includes one or more modules corresponding to the above functions.
  • the storage controller includes a processor and a communication interface configured to support the storage controller in performing the corresponding functions of the above methods.
  • the communication interface is used to support communication between the storage controller and other network elements.
  • the storage controller can also include a memory, coupled with the processor, that holds the program instructions and data necessary for the storage controller.
  • an embodiment of the present application provides a key value storage system including a storage device, and the host and storage controller described in the above aspects.
  • an embodiment of the present application provides a computer storage medium for storing computer software instructions used by the host, including a program designed to perform the above aspects.
  • the embodiment of the present application provides a computer storage medium for storing computer software instructions used by the storage controller, which includes a program designed to execute the above aspects.
  • Figure 1 shows two main implementations of existing key value storage
  • FIG. 2 is a schematic structural diagram of a key value storage system according to an embodiment of the present application.
  • FIG. 3 is a schematic diagram of an operation module of a key value storage method according to an embodiment of the present disclosure
  • FIG. 4 is a schematic flowchart diagram of a key value storage method according to an embodiment of the present application.
  • FIG. 5 is a schematic diagram of an operation flow of combining key value storage and NVMe protocol to implement key value storage according to an embodiment of the present application
  • FIG. 6 is a schematic diagram of a key value storage process for a write data request according to an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a key value storage process for a get data request according to an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a key value storage process for a delete data request according to an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a key value storage process for an obsolete data request according to an embodiment of the present application.
  • FIG. 10 is a schematic diagram of a key value storage process for an aggregate delete data request according to an embodiment of the present application.
  • FIG. 11 is a schematic flowchart of operation of a standard NVMe device related to non-key value storage according to an embodiment of the present disclosure
  • FIG. 12 is a schematic diagram of an operation flow of combining key value storage and SCSI protocol to implement key value storage according to an embodiment of the present disclosure
  • FIG. 13 is a schematic structural diagram of a host according to an embodiment of the present application.
  • FIG. 14 is a schematic structural diagram of a storage controller according to an embodiment of the present disclosure.
  • the hardware uses a SCSI device, and the SCSI device provides the block device service through the SCSI low-level driver, the SCSI middle layer, the SCSI upper layer, the IO scheduling layer, and the block device layer.
  • the software establishes middleware on top of the block device layer or the file system, and the middleware converts between key value operations and block device or file system operations, thereby providing a key value storage service to applications in user space.
  • the IO path delay is large.
  • the hardware uses a device that supports key value storage, the key value storage device communicates with the middleware through the key value storage driver layer, and the middleware converts user storage requests into key value storage operations, thereby providing a key value storage service to applications in user space.
  • the dedicated key value storage device cannot support the traditional block device storage, and the application range has limitations.
  • the embodiment of the present application provides a key value storage method that extends the key value storage operation into the storage protocol, thereby reducing the IO path delay of the storage system while supporting both traditional block device storage and key value storage.
  • a component can be, but is not limited to being, a process running on a processor, a processor, an object, an executable, a thread in execution, a program, and/or a computer.
  • an application running on a computing device and the computing device can be a component.
  • One or more components can reside within a process and/or thread of execution, and a component can be located in a computer and/or distributed between two or more computers. Moreover, these components can execute from various computer readable media having various data structures thereon.
  • These components can communicate by way of local and/or remote processes, for example in accordance with a signal having one or more data packets (e.g., data from one component interacting with another component in a local system, in a distributed system, and/or across a network such as the Internet with other systems by way of the signal).
  • the network architecture and the service scenario described in the embodiments of the present application are for the purpose of more clearly illustrating the technical solutions of the embodiments of the present application, and do not constitute a limitation of the technical solutions provided by the embodiments of the present application. It can be seen that, with the evolution of the network architecture and the emergence of new service scenarios, the technical solutions provided by the embodiments of the present application are equally applicable to similar technical problems.
  • FIG. 2 is a schematic structural diagram of a key value storage system provided by an embodiment of the present application.
  • the key value storage system is composed of a host 20, a storage controller 21, and a storage device 22.
  • the host 20 performs a key value storage operation on the storage medium in the storage device 22 through the storage controller 21; the storage controller 21 is configured to process the key value storage request and perform a data access operation on the storage device 22.
  • the host 20 may include: a host backplane 201, and a central processing unit (CPU) 202, a memory 203, a bridge 204, and a host communication interface 205 deployed on the host backplane 201.
  • the memory 203 stores program instructions and data necessary for the host 20 to run; the CPU 202 is configured to process program instructions of the host 20; and the bridge 204 is used to connect various external devices (not shown) on the host backplane 201.
  • the host communication interface 205 is the bus interface that the host 20 uses to connect with the storage controller 21, and is used to complete communication between the host 20 and the storage controller 21.
  • the storage controller 21 may include a front end communication interface 211, a CPU 212, a memory 213, and a back end communication interface 214.
  • the memory 213 is used to store program instructions and data necessary for the operation of the storage controller 21; the CPU 212 is used to process the instructions and operations of the storage controller 21; the front end communication interface 211 is the bus interface connecting the storage controller 21 and the host 20, and is used to complete communication between the storage controller 21 and the host 20; the back end communication interface 214 is the bus interface connecting the storage controller 21 and the storage device 22, and is used to complete communication between the storage controller 21 and the storage device 22.
  • the storage controller 21 and the storage device 22 in the embodiment of the present application may be deployed independently or may be integrated on the same device, which is not specifically limited in this application.
  • FIG. 3 and FIG. 4 are respectively a schematic diagram of an operation module of a key value storage method according to an embodiment of the present application, and a corresponding flow diagram.
  • the key value storage method may specifically include:
  • after the host detects the key value storage request, the host writes the first instruction code and the address information carried by the key value storage request into the fields defined by the protocol to form a first storage request instruction sequence (as shown in Table 1).
  • the first instruction code is an instruction code defined according to a reserved extension field of the protocol.
  • the reserved extension field of the protocol specifically refers to a field reserved by the protocol for protocol extension or for user customization.
  • the host interacts with the storage controller, so that the storage controller acquires the first storage request instruction sequence.
  • the command processing unit of the storage controller acquires the first storage request instruction sequence and, after separating the first instruction code from the first storage request instruction sequence and determining that the current operation is a key value operation, forwards the content of the first storage request instruction sequence to the key value storage processing unit.
  • the key value storage processing unit processes all the key value storage requests, and submits the processed key value storage request to the storage control unit.
  • the storage control unit converts the previously submitted request into an operation on the storage device.
  • the storage control unit feeds back the operation result to the host through the foregoing unit.
  • after the host detects the non-key value storage request, the host writes the second instruction code and the data memory pointer carried by the non-key value storage request into the fields defined by the protocol to form a third storage request instruction sequence (as shown in Table 2).
  • the second instruction code is a standard instruction code of the protocol.
  • the host interacts with the storage controller, so that the storage controller acquires the third storage request instruction sequence.
  • the command processing unit of the storage controller acquires the third storage request instruction sequence and, after separating the second instruction code from the third storage request instruction sequence and determining that the current operation is a non-key value operation, forwards the content of the third storage request instruction sequence to the standard protocol processing unit.
  • the standard protocol processing unit processes all non-key value storage requests, and submits the processed non-key value storage request to the storage control unit.
  • the storage control unit converts the previously submitted request into an operation on the storage device.
  • the storage control unit feeds back the operation result to the host through the foregoing unit.
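  • The two paths above differ only in where the command processing unit forwards the instruction sequence. The following is a minimal Python sketch of that dispatch, not the implementation of the application. The function names and the assumption that the instruction code occupies the first byte of the sequence are hypothetical; the values 80h-84h and 88h are the key value instruction codes used later in this description, and 02h is the standard NVMe Read opcode.

```python
# Key value instruction codes used later in this description.
KV_OPCODES = {0x80, 0x81, 0x82, 0x83, 0x84, 0x88}


def key_value_unit(sequence: bytes) -> str:
    """Stand-in for the key value storage processing unit."""
    return "key-value"


def standard_protocol_unit(sequence: bytes) -> str:
    """Stand-in for the standard protocol processing unit."""
    return "standard"


def dispatch(sequence: bytes) -> str:
    """Separate the instruction code and forward the sequence accordingly."""
    opcode = sequence[0]  # assumption: the instruction code is the first byte
    if opcode in KV_OPCODES:
        return key_value_unit(sequence)
    return standard_protocol_unit(sequence)


kv_result = dispatch(bytes([0x80]) + bytes(63))   # extended key value code
std_result = dispatch(bytes([0x02]) + bytes(63))  # standard NVMe Read opcode
```

  • Because the decision is made on the instruction code alone, key value requests and traditional block requests can share the same queue and the same controller.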
  • the foregoing protocol may be an NVMe protocol, or may be another storage protocol such as the SCSI protocol, which is not specifically limited in this embodiment of the present application.
  • the foregoing instruction sequence may be in the form of a queue or a data packet, and is not specifically limited in this embodiment.
  • the above-mentioned key value storage operation may be an operation of writing data, acquiring data, deleting data, or discarding data, which is not specifically limited in the embodiment of the present application.
  • the key value storage method provided by the embodiment of the present application extends the key value storage operation into the storage protocol. On the one hand, since the host can interact with the storage controller through the storage protocol to implement key value storage, no conversion on top of the block layer or the file system is needed for key value storage, which reduces the IO path delay of the storage system; on the other hand, since the host can also interact with the storage controller through the storage protocol to implement non-key value storage, traditional block device storage is supported as well.
  • step S401 may specifically include:
  • after the host detects the key value storage request, the host allocates memory for the first storage request instruction sequence according to the fields defined by the protocol;
  • the host writes the first instruction code carried by the key value storage request and the address information into the memory.
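  • The two sub-steps of S401 can be sketched as follows. This is an illustrative Python sketch only: the function name is hypothetical, the 64-byte size follows the byte 0-63 layout described later for the NVMe send queue, and placing the instruction code in byte 0 and two 64-bit addresses in bytes 24-39 is an assumption made for the example.

```python
import struct

SEQUENCE_SIZE = 64  # bytes 0-63, as described later for the NVMe send queue


def build_sequence(opcode: int, key_addr: int, value_addr: int = 0) -> bytearray:
    """Allocate memory for the instruction sequence, then fill it in."""
    buf = bytearray(SEQUENCE_SIZE)          # sub-step 1: allocate the memory
    struct.pack_into("<B", buf, 0, opcode)  # sub-step 2: instruction code (assumed byte 0)
    # sub-step 2: address information (assumed little-endian, bytes 24-39)
    struct.pack_into("<QQ", buf, 24, key_addr, value_addr)
    return buf


seq = build_sequence(0x80, key_addr=0x1000, value_addr=0x2000)
```

  • Only addresses are written into the sequence; the Key and Value themselves stay in host memory until the storage controller fetches them.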
  • step S402 may specifically include:
  • the host notifies the storage controller to read the first storage request instruction sequence from the memory; or
  • the host sends the first storage request instruction sequence to the storage controller.
  • the manner in which the host notifies the storage controller to read the first storage request instruction sequence from the memory may save the memory space of the storage controller.
  • the embodiment of the present application merely provides two exemplary ways for the host to interact with the storage controller so that the storage controller acquires the first storage request instruction sequence. The host and the storage controller may also interact in other ways so that the storage controller acquires the first storage request instruction sequence, which is not specifically limited in this embodiment of the present application.
  • the key value storage method provided by the embodiment of the present application will be further described below in conjunction with a specific protocol and a specific key value storage operation or a non-key value storage operation.
  • FIG. 5 is a schematic diagram of an operation flow for combining key value storage with the NVMe protocol to implement key value storage.
  • the host and the storage controller are connected through a host interface, and the host interacts with the storage controller through commands, addresses, and data through the host interface.
  • the host interface is a peripheral component interconnect express (PCIe) interface;
  • Application layer: host application or storage client software.
  • Key value storage middleware: provides a key value storage interface to the host application and passes the storage request to the NVMe driver layer.
  • Block layer: the operating system's abstraction layer for block storage devices; file systems are generally built on this layer.
  • NVMe driver layer: the driver software through which the host operating system performs data transmission and command interaction with the NVMe storage controller.
  • NVMe command conversion unit: located in the NVMe driver layer, used to fill the NVMe send queue with information such as the extended key value storage command and the addresses of the Key and/or Value, and to submit the NVMe send queue to the NVMe driver layer, which sends it to the NVMe storage controller.
  • Second, NVMe storage controller:
  • NVMe command processing unit: analyzes the instruction code in the NVMe send queue received by the NVMe storage controller, and distributes the NVMe send queue to the key value storage processing unit or the NVMe operation processing unit according to the instruction.
  • Key value storage processing unit: processes all key value operation requests, and submits the processed requests to the storage control unit.
  • NVMe operation processing unit: handles standard NVMe protocol operation requests.
  • Storage control unit: converts the previously submitted request into an operation on the storage device, performs the data storage operation on the storage device, and feeds back the operation result.
  • the storage medium in the storage device includes dynamic random access memory (DRAM), non-volatile random access memory (NVRAM), NAND, or another storage medium.
  • the first instruction code carried by the key value storage request is defined according to the reserved extension field of the NVMe protocol, as shown in Table 4:
  • Table 4 merely provides one way of defining the first instruction code carried in a key value storage request according to the reserved extension field of the NVMe protocol; defining the first instruction code according to the reserved extension field of the NVMe protocol is not limited to this way.
  • the above-mentioned key value storage operation may be defined on the field of 90h-99h, which is not specifically limited in this embodiment of the present application.
  • the key value storage operations in this embodiment include but are not limited to: write data: Put(String Key, String Value); get data: Get(String Key); delete data: Delete(String Key); discard data: TRIM(String Key), which is not specifically limited in the embodiment of the present application.
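  • As an illustration, the interfaces above can be mapped to the instruction codes that the following steps use (80h for write data, 81h and 82h for the two phases of get data, 83h for delete data, 84h for discard data, and 88h for aggregate delete). The Python names below are hypothetical; only the numeric codes and the interface names come from this description.

```python
from enum import IntEnum


class KVOpcode(IntEnum):
    PUT = 0x80               # write data
    GET_VALUE_LENGTH = 0x81  # get data, phase 1: query the Value length
    GET_VALUE = 0x82         # get data, phase 2: fetch the Value
    DELETE = 0x83            # delete data
    TRIM = 0x84              # discard data
    DELETE_GROUP = 0x88      # aggregate delete


# Middleware interface name -> instruction codes issued, in order.
INTERFACE_TO_OPCODES = {
    "Put": [KVOpcode.PUT],
    "Get": [KVOpcode.GET_VALUE_LENGTH, KVOpcode.GET_VALUE],
    "Delete": [KVOpcode.DELETE],
    "TRIM": [KVOpcode.TRIM],
    "Delete_Group": [KVOpcode.DELETE_GROUP],
}


def opcodes_for(interface: str) -> list:
    """Return the instruction codes a middleware call translates into."""
    return INTERFACE_TO_OPCODES[interface]
```

  • Get is the only interface that needs two instructions, because the host must learn the Value length before it can allocate memory to receive the Value.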
  • the key value storage method includes steps S601-S611:
  • the middleware submits the write data request and the address information where the Key and the Value are stored to the NVMe driver layer.
  • the NVMe command conversion unit of the NVMe driver layer writes the write data command 80h and the address information where the Key and the Value are stored to the NVMe send queue.
  • Bytes 0-63 are defined as the fields corresponding to the NVMe send queue.
  • the command format of the NVMe send queue in this step is as shown in the send queue command format in Figure 6.
  • bytes 03:00, i.e., byte 0 to byte 3
  • bytes 23:04, i.e., byte 4 to byte 23
  • Command Dword 1-6, i.e., 4-byte words 1 to 6 of the command
  • bytes 39:24, i.e., byte 24 to byte 39
  • bytes 63:40, i.e., byte 40 to byte 63
  • the NVMe send queue in the embodiment of the present application is a specific form of the foregoing instruction sequence.
  • the instruction sequence in the embodiment of the present application may also be a data packet or the like, which is not specifically limited in this embodiment of the present application.
  • the NVMe driver layer notifies the NVMe storage controller to read the NVMe send queue.
  • S605 The NVMe command processing unit of the NVMe storage controller reads the NVMe send queue by direct memory access (DMA).
  • the NVMe command processing unit separates the write data command and the address information where the Key and the Value are stored from the NVMe send queue, and submits the write data request to the key value storage processing unit.
  • the key value storage processing unit processes the address information where the Key and the Value are stored, converts the write data request into a write data request for the corresponding storage device, and submits the write data request to the storage control unit.
  • the storage control unit writes the data to the storage device according to the address information where the Key and the Value are stored.
  • the storage control unit returns status information to the key value storage processing unit.
  • the status information herein refers to whether the data is successfully written.
  • the key value storage processing unit passes the status information back to the middleware through the foregoing units in turn.
  • the foregoing unit specifically includes: an NVMe command processing unit of the NVMe storage controller, and an NVMe command conversion unit of the NVMe driver layer of the host.
  • the middleware returns status information to the application layer.
  • the key value storage process ends.
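  • The write data flow above can be simulated end to end in a few lines. The sketch below is a toy model under stated assumptions, not the implementation of the application: the Host and Controller classes, the dictionary used as host memory, and the compact 17-byte queue entry are all hypothetical, and a plain list stands in for the NVMe send queue and its DMA read.

```python
import struct

PUT = 0x80  # write data instruction code from the text


class Host:
    def __init__(self):
        self.memory = {}      # hypothetical host memory: address -> bytes
        self.send_queue = []  # stand-in for the NVMe send queue
        self._next = 0x1000

    def place(self, data: bytes) -> int:
        """Store data in host memory and return its address."""
        addr = self._next
        self.memory[addr] = data
        self._next += 0x1000
        return addr


class Controller:
    def __init__(self, host: Host):
        self.host = host
        self.storage = {}  # stand-in for the storage device

    def process(self) -> bool:
        entry = self.host.send_queue.pop(0)  # stands in for the DMA read
        opcode, key_addr, value_addr = struct.unpack("<BQQ", entry)
        if opcode != PUT:
            return False
        key = self.host.memory[key_addr]      # fetch the Key by its address
        value = self.host.memory[value_addr]  # fetch the Value by its address
        self.storage[key] = value             # write to the storage device
        return True                           # status information


def put(host: Host, controller: Controller, key: bytes, value: bytes) -> bool:
    """Middleware Put: queue the command and addresses, return the status."""
    entry = struct.pack("<BQQ", PUT, host.place(key), host.place(value))
    host.send_queue.append(entry)  # driver layer fills the send queue
    return controller.process()    # notify the controller; status flows back


host = Host()
controller = Controller(host)
status = put(host, controller, b"user:1", b"alice")
```

  • Note that no block layer or file system call appears anywhere on this path, which is the source of the reduced IO path delay claimed above.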
  • the key value storage method includes steps S701-S721:
  • the application layer invokes a middleware read operation interface: Get(String Key).
  • the middleware submits a request for obtaining the length of the Value corresponding to the Key to the NVMe driver layer.
  • the NVMe command conversion unit of the NVMe driver layer writes the instruction 81h for obtaining the Value length and the address information where the Key is stored to the NVMe send queue.
  • Bytes 0-63 are defined as the fields corresponding to the NVMe send queue.
  • the command format of the NVMe send queue in this step is as shown in the send queue command format 1 in FIG.
  • bytes 03:00, i.e., byte 0 to byte 3
  • bytes 23:04, i.e., byte 4 to byte 23
  • Command Dword 1-6, i.e., 4-byte words 1 to 6 of the command
  • bytes 39:24, i.e., byte 24 to byte 39
  • bytes 63:40, i.e., byte 40 to byte 63, are Command Dword 10-15, i.e., 4-byte words 10 to 15 of the command.
  • the NVMe send queue in the embodiment of the present application is a specific form of the foregoing instruction sequence.
  • the instruction sequence in the embodiment of the present application may also be a data packet or the like, which is not specifically limited in this embodiment of the present application.
  • the NVMe driver layer notifies the NVMe storage controller to read the NVMe send queue.
  • the NVMe command processing unit of the NVMe storage controller reads the NVMe send queue by DMA.
  • the NVMe command processing unit separates the instruction for obtaining the Value length and the address information where the Key is stored from the NVMe send queue, and submits a read operation request to the key value storage processing unit.
  • the key value storage processing unit processes the address information where the Key is stored, converts the read operation request into a read operation request for the corresponding storage device, and submits the read operation request to the storage control unit.
  • the storage control unit acquires the length of the corresponding Value from the storage device according to the address information where the Key is stored.
  • the storage control unit submits the length information of the Value to the key value storage processing unit.
  • the foregoing unit specifically includes: an NVMe command processing unit of the NVMe storage controller, and an NVMe command conversion unit of the NVMe driver layer of the host.
  • S711 The middleware allocates memory space on the host side for storing the Value according to the length of the Value.
  • the middleware submits the request for obtaining the Value and the address information where the Value is to be stored to the NVMe driver layer.
  • S713 The NVMe command conversion unit of the NVMe driver layer writes the instruction 82h for obtaining the Value and the address information where the Key is stored to the NVMe send queue.
  • 0-63 bytes are defined as fields corresponding to the NVMe transmission queue.
  • the command format of the NVMe send queue in this step is as shown in the send queue command format 2 in FIG.
  • bytes 03:00, i.e., byte 0 to byte 3
  • bytes 23:04, i.e., byte 4 to byte 23
  • Command Dword 1-6, i.e., 4-byte words 1 to 6 of the command
  • bytes 39:24, i.e., byte 24 to byte 39
  • bytes 63:40, i.e., byte 40 to byte 63, are Command Dword 10-15, i.e., 4-byte words 10 to 15 of the command.
  • the NVMe send queue in the embodiment of the present application is a specific form of the foregoing instruction sequence.
  • the instruction sequence in the embodiment of the present application may also be a data packet or the like, which is not specifically limited in this embodiment of the present application.
  • the NVMe driver layer notifies the NVMe storage controller to read the NVMe send queue.
  • the NVMe command processing unit of the NVMe storage controller reads the NVMe send queue by DMA.
  • the NVMe command processing unit separates the instruction for obtaining the Value and the address information where the Key is stored from the NVMe send queue, and submits a request for obtaining the Value to the key value storage processing unit.
  • the key value storage processing unit processes the address information where the Value is to be stored, converts the request for obtaining the Value into a request for obtaining the Value from the corresponding storage device, and submits the request for obtaining the Value to the storage control unit.
  • the storage control unit transfers the data by DMA to the memory address that the host allocated for the Value.
  • the transmission mode of commands and data between the host and the storage controller may also be remote direct memory access (RDMA) or the like, which is not specifically limited in this embodiment of the present application.
  • the storage control unit returns status information to the key value storage processing unit.
  • the status information herein refers to whether the data is successfully obtained.
  • the foregoing unit specifically includes: an NVMe command processing unit of the NVMe storage controller, and an NVMe command conversion unit of the NVMe driver layer of the host.
  • the middleware returns status information to the application layer.
  • the key value storage process ends.
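  • The get data flow above is a two-phase exchange: instruction 81h returns the Value length, the middleware allocates host memory of that size, and instruction 82h then fills that memory. A minimal Python sketch follows; the class and function names are hypothetical, a dictionary stands in for the storage device, and copying into a bytearray stands in for the DMA transfer into host memory.

```python
GET_VALUE_LENGTH = 0x81  # instruction code for obtaining the Value length
GET_VALUE = 0x82         # instruction code for obtaining the Value


class Controller:
    def __init__(self, storage: dict):
        self.storage = storage  # stand-in for the storage device

    def execute(self, opcode: int, key: bytes, buffer: bytearray = None):
        """Return (status, length); on GET_VALUE also fill the host buffer."""
        if key not in self.storage:
            return False, 0
        value = self.storage[key]
        if opcode == GET_VALUE_LENGTH:
            return True, len(value)
        if opcode == GET_VALUE:
            buffer[:len(value)] = value  # stands in for DMA into host memory
            return True, len(value)
        return False, 0


def get(controller: Controller, key: bytes):
    """Middleware Get: query the length, allocate memory, then fetch."""
    ok, length = controller.execute(GET_VALUE_LENGTH, key)
    if not ok:
        return None
    buffer = bytearray(length)  # host memory sized to the Value length
    ok, _ = controller.execute(GET_VALUE, key, buffer)
    return bytes(buffer) if ok else None


controller = Controller({b"user:1": b"alice"})
value = get(controller, b"user:1")
```

  • Splitting the read in two lets the host allocate exactly as much memory as the Value needs, at the cost of one extra round trip to the storage controller.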
  • the key value storage method includes steps S801-S811:
  • the application layer invokes the middleware deletion interface: Delete (String Key).
  • S802 The middleware submits the delete data request and the address information where the Key is stored to the NVMe driver layer.
  • S803 The NVMe command conversion unit of the NVMe driver layer writes the delete data command 83h and the address information where the Key is stored to the NVMe send queue.
  • 0-63 bytes are defined as fields corresponding to the NVMe transmission queue.
  • the command format of the NVMe send queue in this step is as shown in the send queue command format in FIG.
  • bytes 03:00, i.e., byte 0 to byte 3
  • bytes 23:04, i.e., byte 4 to byte 23
  • Command Dword 1-6, i.e., 4-byte words 1 to 6 of the command
  • bytes 39:24, i.e., byte 24 to byte 39
  • bytes 63:40, i.e., byte 40 to byte 63, are Command Dword 10-15, i.e., 4-byte words 10 to 15 of the command.
  • the NVMe send queue in the embodiment of the present application is a specific form of the foregoing instruction sequence.
  • the instruction sequence in the embodiment of the present application may also be a data packet or the like, which is not specifically limited in this embodiment of the present application.
  • the NVMe driver layer notifies the NVMe storage controller to read the NVMe send queue.
  • the NVMe command processing unit of the NVMe storage controller reads the NVMe send queue by DMA.
  • the NVMe command processing unit separates the delete data command and the address information where the Key is stored from the NVMe send queue, and submits the delete data request to the key value storage processing unit.
  • the key value storage processing unit processes the address information where the Key is stored, converts the delete data request into a delete data request for the corresponding storage device, and submits the delete data request to the storage control unit.
  • the storage control unit performs a deletion operation on the data in the storage device according to the address information where the Key is stored.
  • the storage control unit returns status information to the key value storage processing unit.
  • the status information herein refers to whether or not the data is successfully deleted.
  • the key value storage processing unit passes the status information back to the middleware through the foregoing units in turn.
  • the foregoing unit specifically includes: an NVMe command processing unit of the NVMe storage controller, and an NVMe command conversion unit of the NVMe driver layer of the host.
  • the middleware returns status information to the application layer.
  • the key value storage process ends.
  • the key value storage method includes steps S901-S911:
  • S902 The middleware submits the discard data request and the address information where the Key is stored to the NVMe driver layer.
  • S903 The NVMe command conversion unit of the NVMe driver layer writes the discard data command 84h and the address information where the Key is stored to the NVMe send queue.
  • 0-63 bytes are defined as fields corresponding to the NVMe transmission queue.
  • the command format of the NVMe send queue in this step is as shown in the send queue command format in FIG.
  • bytes 03:00, i.e., byte 0 to byte 3
  • bytes 23:04, i.e., byte 4 to byte 23
  • Command Dword 1-6, i.e., 4-byte words 1 to 6 of the command
  • bytes 39:24, i.e., byte 24 to byte 39
  • bytes 63:40, i.e., byte 40 to byte 63, are Command Dword 10-15, i.e., 4-byte words 10 to 15 of the command.
  • the NVMe send queue in the embodiment of the present application is a specific form of the foregoing instruction sequence.
  • the instruction sequence in the embodiment of the present application may also be a data packet or the like, which is not specifically limited in this embodiment of the present application.
  • the NVMe driver layer notifies the NVMe storage controller to read the NVMe send queue.
  • the NVMe command processing unit of the NVMe storage controller reads the NVMe send queue by DMA.
  • the NVMe command processing unit separates the discard data instruction and the address information where the Key is stored from the NVMe send queue, and submits the discard data request to the key value storage processing unit.
  • the key value storage processing unit processes the address information where the Key is stored, converts the discard data request into a discard data request for the corresponding storage device, and submits the discard data request to the storage control unit.
  • the storage control unit performs a discard operation on the data in the storage device according to the address information where the Key is stored.
  • the storage control unit returns status information to the key value storage processing unit.
  • the status information herein refers to whether the data is discarded successfully.
  • the key value storage processing unit passes the status information back to the middleware through the foregoing units in turn.
  • the foregoing unit specifically includes: an NVMe command processing unit of the NVMe storage controller, and an NVMe command conversion unit of the NVMe driver layer of the host.
  • the middleware returns status information to the application layer.
  • the key value storage process ends.
  • the embodiments shown in FIGS. 6-9 above all concern a single key value storage request. It can be seen from Table 2 that aggregation operations can also be defined when the instructions are defined.
  • an aggregation operation is a request in which a single aggregate storage operation completes multiple single storage operations at the same time.
  • the flow is basically the same as that of a single request operation; the difference is that the address information of the multiple Keys and/or Values needs to be passed to the storage controller through the NVMe send queue by means of an SGL (aggregation table), that is, the address information of the multiple Keys and/or Values can be passed via the address information of the aggregation table.
  • the aggregation table contains the address information carried by each single key value storage request among the multiple single key value storage requests.
  • the key value storage method provided by the embodiment of the present application is described below by taking the key value storage request as the aggregate deletion data request as an example. As shown in FIG. 10, the key value storage method provided by the embodiment of the present application includes steps S1001-S1011:
  • S1001 The application layer invokes the middleware aggregation deletion interface Delete_Group (String Key group).
  • S1002 The middleware submits the aggregate delete data request and the address information of the aggregation table (SGL), which indicates the addresses where the Key data group is stored, to the NVMe driver layer.
  • the NVMe command conversion unit of the NVMe driver layer writes the aggregate delete data command 88h and the address information of the SGL to the NVMe send queue.
  • 0-63 bytes are defined as fields corresponding to the NVMe transmission queue.
  • the command format of the NVMe send queue in this step is as shown in the send queue command format in FIG.
  • bytes 03:00, i.e., byte 0 to byte 3
  • bytes 23:04, i.e., byte 4 to byte 23
  • Command Dword 1-6, i.e., 4-byte words 1 to 6 of the command
  • bytes 39:24, i.e., byte 24 to byte 39
  • bytes 63:40, i.e., byte 40 to byte 63, are Command Dword 10-15, i.e., 4-byte words 10 to 15 of the command.
  • the aggregation operation may include any number of single operations.
  • FIG. 10 only takes as an example an aggregate delete data request that includes five single delete data requests, with the SGL containing the address information of the five Keys; this does not constitute a limitation on the technical solution of the present application.
  • the NVMe send queue in the embodiment of the present application is a specific form of the foregoing instruction sequence.
  • the instruction sequence in the embodiment of the present application may also be a data packet or the like, which is not specifically limited in this embodiment of the present application.
  • the NVMe driver layer notifies the NVMe storage controller to read the NVMe send queue.
  • the NVMe command processing unit of the NVMe storage controller reads the NVMe send queue by DMA.
  • the NVMe command processing unit separates the aggregate delete data instruction and the address information of the SGL in the NVMe queue, and submits the aggregate delete data request to the key value storage processing unit.
  • the key value storage processing unit processes, one by one, the address information in the SGL where the Keys of the Key data group are stored, converts the aggregate delete data request into delete data requests for the data in the corresponding storage device, and submits the requests to the storage control unit.
  • the storage control unit performs data deletion operations one by one.
  • the storage control unit After all operations are completed, the storage control unit returns status information to the key value storage processing unit.
  • the status information herein refers to whether or not all data is successfully deleted.
  • the foregoing unit specifically includes: an NVMe command processing unit of the NVMe storage controller, and an NVMe command conversion unit of the NVMe driver layer of the host.
  • the middleware returns status information to the application layer.
  • the key value storage process ends.
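The controller-side handling of an aggregate delete can be sketched as follows. This is only an illustration: the storage device is modeled as a Python dict, the SGL as a list of keys, and treating a missing key as a failed deletion is an assumption of this sketch rather than something the application states.

```python
def aggregate_delete(device: dict, sgl_keys: list) -> bool:
    """Delete every key listed in the SGL and report whether all deletions succeeded."""
    all_deleted = True
    for key in sgl_keys:            # the keys are processed one by one
        if key in device:
            del device[key]
        else:
            all_deleted = False     # assumption: a missing key counts as a failure
    return all_deleted              # a single status for the whole aggregate request


# Six stored pairs; the SGL names five of them, as in the five-request example.
device = {f"key{i}".encode(): b"value" for i in range(1, 7)}
sgl = [f"key{i}".encode() for i in range(1, 6)]
status = aggregate_delete(device, sgl)
```

One request and one status cover all five deletions, which is the efficiency gain the aggregation is meant to provide.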
  • the embodiments shown in FIG. 6-10 above are all for the operation when the host detects the key value storage request.
  • The host may also interact with the storage controller through the storage protocol to implement non-key-value storage, for example, to support traditional block device storage.
  • the standard NVMe device operation related to the non-key value storage includes steps S1101-S1111:
  • S1101 The traditional application layer invokes a file system interface.
  • the file system layer converts the non-key value storage request into an operation request of the block layer.
  • S1104 The NVMe command conversion unit of the NVMe driver layer converts the operation request into a standard NVMe instruction, and writes the instruction code, the data memory pointer, and other information to the NVMe send queue.
  • command format of the NVMe sending queue in this embodiment is similar to the format of the NVMe sending queue command in the foregoing embodiment, and details are not described herein again.
  • the NVMe driver layer notifies the NVMe storage controller to read the NVMe send queue.
  • the NVMe command processing unit of the NVMe storage controller reads the NVMe send queue by using DMA;
  • S1107 The NVMe command processing unit separates the instruction code, the data memory pointer, and other information from the NVMe send queue, and submits the operation request to the NVMe operation processing unit.
  • The NVMe operation processing unit processes the data memory pointer and other information, converts the operation request into an operation request for the corresponding storage device, and submits it to the storage control unit.
  • the storage control unit performs an operation request for the storage device according to the data memory pointer.
  • the storage control unit returns status information to the NVMe operation processing unit.
  • the status information herein refers to whether the operation request is successful.
  • S1111 The NVMe operation processing unit and the foregoing units pass the status information in turn to the traditional application layer.
  • the foregoing unit specifically includes: an NVMe command processing unit of the NVMe storage controller, an NVMe command conversion unit of the NVMe driver layer of the host, a block layer, and a file system.
  • The embodiments shown in FIG. 6-11 are all described with reference to FIG. 5, in which key value storage is combined with the NVMe protocol.
  • Combining key-value storage with an efficient storage protocol such as the NVMe protocol allows the storage system to have both advantages.
  • the protocol in the embodiment of the present application may be an NVMe protocol, or may be another storage protocol such as the SCSI protocol, which is not specifically limited in this embodiment of the present application.
  • the embodiment of the present application can also combine key value storage with the SCSI protocol, as shown in FIG. 12, which is a schematic diagram of an operation flow for combining key value storage and SCSI protocol to implement key value storage.
  • the host interface shown in FIG. 12 has the same function as the host interface shown in FIG. 5 .
  • SCSI layer: the software layer that processes SCSI transactions, including the SCSI upper layer, the SCSI middle layer, and the SCSI lower layer.
  • SCSI driver layer: located under the SCSI layer; it submits SCSI requests to the Serial Attached SCSI (SAS) storage controller and handles control and data interaction with the SAS storage controller.
  • SCSI command conversion unit: located in the SCSI driver layer; it fills the extended key-value (Key-Value) storage instruction, the addresses where the Key and/or Value are stored, and other information into the SCSI send queue, and submits the send queue to the SCSI driver layer, which sends it to the SAS storage controller.
  • Second, the SAS storage controller:
  • SCSI command processing unit: analyzes the instruction code in the SCSI send queue received by the SCSI controller, and distributes the SCSI send queue to the key value storage processing unit or the SCSI operation processing unit according to the instruction.
  • SCSI operation processing unit: handles standard SCSI protocol operation requests.
  • the storage device shown in FIG. 12 has the same function as the storage device shown in FIG. 5 .
  • the first instruction code carried by the key value storage request is defined according to the reserved extension field of the SCSI protocol, as shown in Table 6:
  • Table 6 merely provides one example of defining the first instruction code carried in a key value storage request according to a reserved extension field of the SCSI protocol; the definition is not limited to this manner. For example, the above key value storage operations may be defined on the other reserved extension fields mentioned in the third part of the comment section of Table 5 above, which is not specifically limited in the embodiments of this application.
  • The process of performing the key value storage operation with reference to FIG. 12 is consistent with the process of performing the key value storage operation with reference to FIG. 5. For details, refer to the embodiments shown in FIG. 6-11; details are not described herein again.
  • The key value storage method provided by the embodiments of this application extends key value storage operations onto the storage protocol. On the one hand, because the host can interact with the storage controller through the storage protocol to implement key value storage, no conversion on top of the block layer or the file system is needed, which reduces the IO path latency of the storage system. On the other hand, because the host can also interact with the storage controller through the storage protocol to implement non-key-value storage, traditional block device storage is supported at the same time.
  • To implement the above functions, each device, such as the host and the storage controller, includes hardware structures and/or software modules corresponding to the respective functions.
  • In combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of this application.
  • the embodiments of the present application may divide the function modules of the host, the storage controller, and the like according to the foregoing method examples.
  • each function module may be divided according to each function, or two or more functions may be integrated into one processing module.
  • the above integrated modules can be implemented in the form of hardware or in the form of software functional modules. It should be noted that the division of the module in the embodiment of the present application is schematic, and is only a logical function division, and the actual implementation may have another division manner.
  • FIG. 13 shows a possible structural diagram of the host 1300 involved in the foregoing embodiment.
  • the host 1300 includes a processing module 1301 and a communication module 1302.
  • the communication module 1302 is for communicating with the storage controller.
  • The processing module 1301 is configured to support the host in performing processes S401 and S407 in FIG. 4. Alternatively, the processing module 1301 may include an application layer 1301a, a middleware 1301b, and a driver layer 1301c, which support the host in performing the operations of the application layer, the middleware, and the NVMe driver layer in FIG. 6-10. Alternatively, the processing module 1301 may further include a traditional application layer 1301d, a file system 1301e, a block layer 1301f, and the driver layer 1301c, which support the host in performing the operations of the traditional application layer, the file system, the block layer, and the NVMe driver layer in the corresponding figure. For all related content of the steps in the foregoing method embodiments, refer to the functional descriptions of the corresponding functional modules; details are not described herein again.
  • the host 1300 may further include a storage module for storing program codes and data of the host 1300.
  • The processing module 1301 may be a processor or a controller, for example, the CPU 202 in FIG. 2, or a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with this disclosure.
  • The processor may also be a combination that implements computing functions, for example, a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
  • the communication module 1302 may be a communication interface, for example, the host communication interface 205 in FIG. 2, a receiver and a transmitter, or a transceiver circuit or the like.
  • The storage module 1201 may be a memory.
  • The host involved in the embodiments of this application may be the host shown in FIG. 2. For details, refer to the related description of FIG. 2; details are not described herein again.
  • FIG. 14 shows a possible structural diagram of the storage controller 1400 involved in the foregoing embodiment.
  • The storage controller 1400 includes a front-end communication module 1401, a processing module 1402, a control module 1403, and a back-end communication module 1404.
  • the front end communication module 1401 is used to support communication between the storage controller 1400 and the front end device, such as the communication with the host in FIG. 4, FIG. 6-11.
  • the backend communication module 1404 is used to support communication between the storage controller 1400 and the backend device, such as communication with the storage devices of Figures 4, 6-11.
  • The processing module 1402 may include a command processing unit 1402a, a key value storage processing unit 1402b, and a standard protocol processing unit 1402c, which support the storage controller 1400 in performing the operations of the command processing unit, the key value storage processing unit, and the standard protocol processing unit in the corresponding figure, or in performing the operations of the NVMe command processing unit, the key value storage processing unit, and the NVMe operation processing unit in FIG. 6-11.
  • The control module 1403 is configured to support the storage controller in performing the operations of the storage control unit in FIG. 4 and FIG. 6-11. For all related content of the steps in the foregoing method embodiments, refer to the functional descriptions of the corresponding functional modules; details are not described herein again.
  • The storage controller 1400 may further include a storage module for storing program code and data of the storage controller 1400.
  • The processing module 1402 and the control module 1403 may be a processor or a controller, for example, the CPU 212 in FIG. 2, or a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. They may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with this disclosure.
  • The processor may also be a combination that implements computing functions, for example, a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
  • the front-end communication module 1401 and the back-end communication module 1404 may be communication interfaces, for example, the front-end communication interface 211, the back-end communication interface 214 in FIG. 2, the receiver and the transmitter, or the transceiver circuit or the like.
  • The storage module may be a memory.
  • The storage controller involved in the embodiments of this application may be the storage controller shown in FIG. 2. For details, refer to the related description of FIG. 2; details are not described herein again.
  • The steps of a method or algorithm described in connection with this disclosure may be implemented in hardware, or by a processor executing software instructions.
  • The software instructions may consist of corresponding software modules, which may be stored in random access memory (RAM), flash memory, read-only memory (ROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), a register, a hard disk, a removable hard disk, a compact disc read-only memory (CD-ROM), or any other form of storage medium known in the art.
  • An exemplary storage medium is coupled to the processor to enable the processor to read information from, and write information to, the storage medium.
  • the storage medium can also be an integral part of the processor.
  • the processor and the storage medium can be located in an ASIC. Additionally, the ASIC can be located in a core network interface device.
  • the processor and the storage medium may also exist as discrete components in the core network interface device.
  • the functions described herein can be implemented in hardware, software, firmware, or any combination thereof.
  • the functions may be stored in a computer readable medium or transmitted as one or more instructions or code on a computer readable medium.
  • Computer readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one location to another.
  • a storage medium may be any available media that can be accessed by a general purpose or special purpose computer.


Abstract

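Embodiments of this application provide a key-value storage method, apparatus, and system, to at least resolve the problem that no existing solution combines key-value storage with an efficient storage protocol such as the NVMe protocol. The method includes: after a host detects a key-value storage request, the host writes a first instruction code carried in the key-value storage request, together with address information, into fields defined by a protocol to form a first storage request instruction sequence, where the first instruction code is an instruction code defined according to a reserved extension field of the protocol; and the host interacts with the storage controller so that the storage controller obtains the first storage request instruction sequence. This application is applicable to the field of storage technologies.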

Description

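Key-value storage method, apparatus, and system
This application claims priority to Chinese Patent Application No. 201610794448.4, filed with the Chinese Patent Office on August 31, 2016 and entitled "Key-value storage method, apparatus, and system", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of storage technologies, and in particular, to a key-value storage method, apparatus, and system.
Background
Key-value (KV) storage is a form of non-relational database (NoSQL) storage in which data is organized, indexed, and stored as key-value pairs. It features simple storage semantics, good scalability of the storage system, fast data queries, and large data capacity.
The non-volatile memory express (NVMe) protocol is an efficient storage protocol among the storage protocols used by current storage systems. Compared with a storage system that uses the conventional small computer system interface (SCSI) protocol, a storage system that uses the NVMe protocol omits the general input/output (IO) scheduling layer, the SCSI upper layer, and the SCSI middle layer, and therefore has a short IO path, low latency, and strong concurrent processing capability.
Combining key-value storage with an efficient storage protocol such as the NVMe protocol would give a storage system the advantages of both. However, no existing solution combines key-value storage with such a protocol.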
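Summary
Embodiments of this application provide a key-value storage method, apparatus, and system, to resolve the prior-art problem that key-value storage cannot be combined with efficient storage protocols such as the NVMe protocol.
To resolve this problem, the embodiments of this application provide the following technical solutions:
According to one aspect, an embodiment of this application provides a key-value storage method. The method includes: after a host detects a key-value storage request, the host writes a first instruction code carried in the key-value storage request, together with address information, into fields defined by a protocol to form a first storage request instruction sequence, where the first instruction code is an instruction code defined according to a reserved extension field of the protocol; and the host then interacts with a storage controller so that the storage controller obtains the first storage request instruction sequence.
Because the key-value storage method provided in the embodiments of this application extends key-value storage operations onto the storage protocol, the host can interact with the storage controller through the storage protocol to implement key-value storage, without any conversion on top of the block layer or the file system. This reduces the IO path latency of the storage system.
In a possible design, that the host writes the first instruction code and the address information into the fields defined by the protocol to form the first storage request instruction sequence includes: the host allocates memory for the first storage request instruction sequence according to the fields defined by the protocol, and writes the first instruction code carried in the key-value storage request and the address information into that memory. That the host interacts with the storage controller (for example, by sending a notification message) so that the storage controller obtains the first storage request instruction sequence then includes: the host notifies the storage controller to read the first storage request instruction sequence from the memory. Alternatively, the host may send the first storage request instruction sequence directly to the storage controller.
Having the host notify the storage controller to read the first storage request instruction sequence from the memory saves memory space on the storage controller.
In a possible design, the key-value storage request is a write-data request, a get-data request, a delete-data request, or a discard-data request. The address information carried in a write-data request includes the address information where the key and the value are stored; the address information carried in a get-data request, a delete-data request, or a discard-data request includes the address information where the key is stored.
These are merely examples of key-value storage request operations; other key-value storage request operations may exist, and the embodiments of this application impose no specific limitation on this.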
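The rule for which address information each request type carries can be written down as a small Python sketch. The names `KVRequest` and `make_request` and the string labels for the four operations are hypothetical, chosen only for illustration; addresses are modeled as plain integers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KVRequest:
    op: str                           # "put", "get", "delete", or "trim" (labels are assumptions)
    key_addr: int                     # address where the key is stored
    value_addr: Optional[int] = None  # address where the value is stored (write-data only)


def make_request(op: str, key_addr: int, value_addr: Optional[int] = None) -> KVRequest:
    """Build a request that carries only the address information its type needs."""
    if op == "put":
        if value_addr is None:
            raise ValueError("a write-data request carries key and value addresses")
        return KVRequest(op, key_addr, value_addr)
    if op in ("get", "delete", "trim"):
        return KVRequest(op, key_addr)  # these carry the key address only
    raise ValueError(f"unknown operation: {op}")


put_request = make_request("put", 0x1000, 0x2000)
get_request = make_request("get", 0x1000)
```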
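In some possible designs, if the key-value storage request is a get-data request, then after the host detects the key-value storage request and before the host writes the first instruction code and the address information into the fields defined by the protocol to form the first storage request instruction sequence, the method further includes: the host obtains the length of the value requested by the get-data request, and allocates memory for the value according to that length, so that after the host interacts with the storage controller and the storage controller obtains the first storage request instruction sequence, the storage controller reads the data into the memory that the host allocated for the value.
In other words, for a get-data request the host must first allocate memory for the value, so that the data the storage controller reads from the storage device has corresponding storage space.
In a possible design, that the host obtains the length of the value requested by the get-data request includes: the host writes an instruction code for obtaining the length of the value corresponding to a key, together with the address information where the key is stored, into the fields defined by the protocol to form a second storage request instruction sequence, where that instruction code is defined according to a reserved extension field of the protocol; the host interacts with the storage controller so that the storage controller obtains the second storage request instruction sequence; and the host receives the length of the value sent by the storage controller.
In this way, the host obtains the length of the value requested by the get-data request.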
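The two-phase get described above (first ask for the value length, then allocate a buffer and fetch the value) can be sketched in Python. The `Controller` class is a hypothetical in-memory stand-in, not a real driver interface; the codes 81h and 82h are the ones the description later gives for "get value length" and "get value".

```python
OP_GET_LENGTH = 0x81  # instruction code for obtaining the value length
OP_GET_VALUE = 0x82   # instruction code for obtaining the value


class Controller:
    """Hypothetical in-memory stand-in for the storage controller."""

    def __init__(self, data: dict):
        self.data = data

    def submit(self, opcode: int, key: bytes, buffer: bytearray = None):
        value = self.data[key]
        if opcode == OP_GET_LENGTH:
            return len(value)
        if opcode == OP_GET_VALUE:
            if buffer is None or len(buffer) < len(value):
                raise ValueError("host buffer is missing or too small")
            buffer[:len(value)] = value  # models reading data into host memory
            return True
        raise ValueError("unsupported instruction code")


def host_get(controller: Controller, key: bytes) -> bytes:
    length = controller.submit(OP_GET_LENGTH, key)  # phase 1: ask for the length
    buffer = bytearray(length)                      # allocate host memory for the value
    controller.submit(OP_GET_VALUE, key, buffer)    # phase 2: controller fills the buffer
    return bytes(buffer)


value = host_get(Controller({b"user:1": b"alice"}), b"user:1")
```

The extra round trip is the price of letting the host own the buffer: the controller never has to guess how much host memory a value needs.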
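In a possible design, the key-value storage request is an aggregated key-value storage request that contains multiple single key-value storage requests. The address information carried in the key-value storage request is indexed by the address information of a scatter gather list (SGL), and the SGL contains the address information carried in each of the single key-value storage requests.
By merging multiple single key-value storage requests into one aggregated key-value storage request, and indexing the address information carried in each single request through the SGL, multiple single key-value storage operations can be completed in one key-value storage operation flow, which improves the efficiency of key-value storage.
In a possible design, the method further includes: after the host detects a non-key-value storage request, the host writes a second instruction code carried in the non-key-value storage request, together with a data memory pointer, into the fields defined by the protocol to form a third storage request instruction sequence, where the second instruction code is a standard instruction code of the protocol; and the host interacts with the storage controller so that the storage controller obtains the third storage request instruction sequence.
In the key-value storage method provided in the embodiments of this application, because the host can also interact with the storage controller through the storage protocol to implement non-key-value storage, traditional block device storage is supported alongside key-value storage.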
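According to another aspect, an embodiment of this application provides a key-value storage method. The method includes: a storage controller obtains a first storage request instruction sequence, which is formed by the host writing a first instruction code carried in a key-value storage request, together with address information, into fields defined by a protocol, where the first instruction code is an instruction code defined according to a reserved extension field of the protocol; the storage controller separates the first instruction code and the address information from the first storage request instruction sequence; and the storage controller performs, on a storage device, the operation corresponding to the first instruction code according to the first instruction code and the address information.
Because the key-value storage method provided in the embodiments of this application extends key-value storage operations onto the storage protocol, the host can interact with the storage controller through the storage protocol to implement key-value storage, without any conversion on top of the block layer or the file system. This reduces the IO path latency of the storage system.
In a possible design, that the storage controller obtains the first storage request instruction sequence includes: the storage controller reads the first storage request instruction sequence from the memory that the host allocated for it according to the fields defined by the protocol. Alternatively, the storage controller may directly receive the first storage request instruction sequence sent by the host.
Reading the first storage request instruction sequence from the memory allocated by the host saves memory space on the storage controller.
In a possible design, the key-value storage request is a write-data request, a get-data request, a delete-data request, or a discard-data request. The address information carried in a write-data request includes the addresses where the key and the value are stored; the address information carried in a get-data request, a delete-data request, or a discard-data request includes the address information where the key is stored.
These are merely examples of key-value storage request operations; other key-value storage request operations may exist, and the embodiments of this application impose no specific limitation on this.
In a possible design, if the key-value storage request is a get-data request, then before the storage controller obtains the first storage request instruction sequence, the method further includes: the storage controller obtains the length of the value requested by the get-data request, and sends the length of the value to the host, so that the host allocates memory for the value according to that length. That the storage controller performs the operation corresponding to the first instruction code on the storage device then includes: the storage controller reads the data, according to the first instruction code and the address information, into the memory that the host allocated for the value.
In other words, for a get-data request the host must first allocate memory for the value, so that the data the storage controller reads from the storage device has corresponding storage space.
In a possible design, that the storage controller obtains the length of the value requested by the get-data request includes: the storage controller obtains a second storage request instruction sequence, which is formed by the host writing an instruction code for obtaining the length of the value corresponding to a key, together with the address information where the key is stored, into the fields defined by the protocol, where that instruction code is defined according to a reserved extension field of the protocol; the storage controller separates that instruction code and the address information where the key is stored from the second storage request instruction sequence; and the storage controller obtains the length of the value from the storage device according to that instruction code and the address information where the key is stored.
For the manner in which the storage controller obtains the second storage request instruction sequence, refer to the manner in which it obtains the first storage request instruction sequence; details are not described herein again.
In this way, the host obtains the length of the value requested by the get-data request.
In a possible design, the key-value storage request is an aggregated key-value storage request that contains multiple single key-value storage requests. The address information carried in the key-value storage request is indexed by the address information of a scatter gather list (SGL), and the SGL contains the address information carried in each of the single key-value storage requests.
By merging multiple single key-value storage requests into one aggregated key-value storage request, and indexing the address information carried in each single request through the SGL, multiple single key-value storage operations can be completed in one key-value storage operation flow, which improves the efficiency of key-value storage.
In a possible design, the method further includes: the storage controller obtains a third storage request instruction sequence, which is formed by the host writing a second instruction code carried in a non-key-value storage request, together with a data memory pointer, into the fields defined by the protocol, where the second instruction code is a standard instruction code of the protocol; the storage controller separates the second instruction code and the data memory pointer from the third storage request instruction sequence; and the storage controller performs, on the storage device, the operation corresponding to the second instruction code according to the second instruction code and the data memory pointer.
For the manner in which the storage controller obtains the third storage request instruction sequence, refer to the manner in which it obtains the first storage request instruction sequence; details are not described herein again.
In the key-value storage method provided in the embodiments of this application, because the host can also interact with the storage controller through the storage protocol to implement non-key-value storage, traditional block device storage is supported alongside key-value storage.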
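According to still another aspect, an embodiment of this application provides a host. The host includes a processing module and a communication module. The processing module is configured to: after a key-value storage request is detected, write a first instruction code carried in the key-value storage request, together with address information, into fields defined by a protocol to form a first storage request instruction sequence, where the first instruction code is an instruction code defined according to a reserved extension field of the protocol. The communication module is configured to interact with the storage controller so that the storage controller obtains the first storage request instruction sequence.
In a possible design, the processing module is specifically configured to: allocate memory for the first storage request instruction sequence according to the fields defined by the protocol, and write the first instruction code carried in the key-value storage request and the address information into that memory. The communication module is specifically configured to: notify the storage controller to read the first storage request instruction sequence from the memory, or send the first storage request instruction sequence to the storage controller.
In a possible design, the key-value storage request is a write-data request, a get-data request, a delete-data request, or a discard-data request. The address information carried in a write-data request includes the address information where the key and the value are stored; the address information carried in a get-data request, a delete-data request, or a discard-data request includes the address information where the key is stored.
In a possible design, if the key-value storage request is a get-data request, the processing module is further configured to: after the key-value storage request is detected and before the first instruction code and the address information are written into the fields defined by the protocol to form the first storage request instruction sequence, obtain the length of the value requested by the get-data request, and allocate memory for the value according to that length, so that after the communication module interacts with the storage controller and the storage controller obtains the first storage request instruction sequence, the storage controller reads the data into the memory that the processing module allocated for the value.
In a possible design, the processing module is specifically configured to: write an instruction code for obtaining the length of the value corresponding to a key, together with the address information where the key is stored, into the fields defined by the protocol to form a second storage request instruction sequence, where that instruction code is defined according to a reserved extension field of the protocol; interact with the storage controller through the communication module so that the storage controller obtains the second storage request instruction sequence; and receive, through the communication module, the length of the value sent by the storage controller.
In a possible design, the key-value storage request is an aggregated key-value storage request that contains multiple single key-value storage requests. The address information carried in the key-value storage request is indexed by the address information of a scatter gather list (SGL), and the SGL contains the address information carried in each of the single key-value storage requests.
In a possible design, the processing module is further configured to: after a non-key-value storage request is detected, write a second instruction code carried in the non-key-value storage request, together with a data memory pointer, into the fields defined by the protocol to form a third storage request instruction sequence, where the second instruction code is a standard instruction code of the protocol. The communication module is further configured to interact with the storage controller so that the storage controller obtains the third storage request instruction sequence.
Because the host provided in the embodiments of this application can perform the functions performed by the host in the foregoing method embodiments, for the technical effects it can achieve, refer to the related descriptions in the foregoing method embodiments; details are not described herein again.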
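According to still another aspect, an embodiment of this application provides a storage controller. The storage controller includes a front-end communication module, a back-end communication module, a processing module, and a control module. The front-end communication module is configured to obtain a first storage request instruction sequence from a host, where the first storage request instruction sequence is formed by the host writing a first instruction code carried in a key-value storage request, together with address information, into fields defined by a protocol, and the first instruction code is an instruction code defined according to a reserved extension field of the protocol. The processing module is configured to separate the first instruction code and the address information from the first storage request instruction sequence. The control module is configured to perform, on a storage device through the back-end communication module, the operation corresponding to the first instruction code according to the first instruction code and the address information.
In a possible design, the front-end communication module is specifically configured to: read the first storage request instruction sequence from the memory that the host allocated for it according to the fields defined by the protocol, or receive the first storage request instruction sequence sent by the host.
In a possible design, the key-value storage request is a write-data request, a get-data request, a delete-data request, or a discard-data request. The address information carried in a write-data request includes the addresses where the key and the value are stored; the address information carried in a get-data request, a delete-data request, or a discard-data request includes the address information where the key is stored.
In a possible design, if the key-value storage request is a get-data request, the processing module is further configured to obtain, before the front-end communication module obtains the first storage request instruction sequence, the length of the value requested by the get-data request. The communication module is further configured to send the length of the value to the host, so that the host allocates memory for the value according to that length. The control module is specifically configured to read the data, according to the first instruction code and the address information, into the memory that the host allocated for the value.
In a possible design, the processing module is specifically configured to:
obtain a second storage request instruction sequence through the front-end communication module, where the second storage request instruction sequence is formed by the host writing an instruction code for obtaining the length of the value corresponding to a key, together with the address information where the key is stored, into the fields defined by the protocol, and that instruction code is defined according to a reserved extension field of the protocol; separate that instruction code and the address information where the key is stored from the second storage request instruction sequence; and obtain the length of the value from the storage device through the control module and the back-end communication module, according to that instruction code and the address information where the key is stored.
In a possible design, the key-value storage request is an aggregated key-value storage request that contains multiple single key-value storage requests. The address information carried in the key-value storage request is indexed by the address information of a scatter gather list (SGL), and the SGL contains the address information carried in each of the single key-value storage requests.
In a possible design, the front-end communication module is further configured to obtain a third storage request instruction sequence from the host, where the third storage request instruction sequence is formed by the host writing a second instruction code carried in a non-key-value storage request, together with a data memory pointer, into the fields defined by the protocol, and the second instruction code is a standard instruction code of the protocol. The processing module is further configured to separate the second instruction code and the data memory pointer from the third storage request instruction sequence. The control module is further configured to perform, on the storage device through the back-end communication module, the operation corresponding to the second instruction code according to the second instruction code and the data memory pointer.
Because the storage controller provided in the embodiments of this application can perform the functions performed by the storage controller in the foregoing method embodiments, for the technical effects it can achieve, refer to the related descriptions in the foregoing method embodiments; details are not described herein again.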
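Based on the foregoing aspects, in a possible design, the protocol in any of the foregoing aspects includes the non-volatile memory express (NVMe) protocol, where the NVMe protocol defines bytes 0-63 as the fields of a storage request instruction sequence.
Because the embodiments of this application can combine key-value storage with an efficient storage protocol such as the NVMe protocol, the storage system gains both the advantages of key-value storage (simple storage semantics, good scalability, fast data queries, and large data capacity) and the advantages of the NVMe protocol (a short IO path, low latency, and strong concurrent processing capability).
Based on the foregoing aspects, in a possible design, the protocol in any of the foregoing aspects includes the small computer system interface (SCSI) protocol.
Because the SCSI protocol is a general-purpose storage protocol, combining key-value storage with the SCSI protocol is more broadly applicable.
Certainly, the protocol may also be another storage protocol, which is not specifically limited in the embodiments of this application.
Based on the foregoing aspects, in a possible design, the instruction sequence in any of the foregoing aspects may take the form of a queue, or another form such as a data packet, which is not specifically limited in the embodiments of this application.
According to still another aspect, an embodiment of this application provides a host that can implement the functions performed by the host in the foregoing method embodiments. The functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the foregoing functions.
In a possible design, the structure of the host includes a processor and a communication interface. The processor is configured to support the host in performing the corresponding functions in the foregoing method. The communication interface is configured to support communication between the host and other network elements. The host may further include a memory, which is coupled to the processor and stores the program instructions and data necessary for the host.
According to still another aspect, an embodiment of this application provides a storage controller that can implement the functions performed by the storage controller in the foregoing method embodiments. The functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the foregoing functions.
In a possible design, the structure of the storage controller includes a processor and a communication interface. The processor is configured to support the storage controller in performing the corresponding functions in the foregoing method. The communication interface is configured to support communication between the storage controller and other network elements. The storage controller may further include a memory, which is coupled to the processor and stores the program instructions and data necessary for the storage controller.
According to still another aspect, an embodiment of this application provides a key-value storage system, which includes a storage device and the host and storage controller described in the foregoing aspects.
According to yet another aspect, an embodiment of this application provides a computer storage medium for storing computer software instructions used by the foregoing host, including a program designed to perform the foregoing aspects.
According to yet another aspect, an embodiment of this application provides a computer storage medium for storing computer software instructions used by the foregoing storage controller, including a program designed to perform the foregoing aspects.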
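Brief Description of the Drawings
FIG. 1 shows two main existing implementations of key-value storage;
FIG. 2 is a schematic architectural diagram of a key-value storage system according to an embodiment of this application;
FIG. 3 is a schematic diagram of the operation modules of a key-value storage method according to an embodiment of this application;
FIG. 4 is a schematic flowchart of a key-value storage method according to an embodiment of this application;
FIG. 5 is a schematic diagram of an operation flow in which key-value storage is combined with the NVMe protocol to implement key-value storage according to an embodiment of this application;
FIG. 6 is a schematic flowchart of key-value storage for a write-data request according to an embodiment of this application;
FIG. 7 is a schematic flowchart of key-value storage for a get-data request according to an embodiment of this application;
FIG. 8 is a schematic flowchart of key-value storage for a delete-data request according to an embodiment of this application;
FIG. 9 is a schematic flowchart of key-value storage for a discard-data request according to an embodiment of this application;
FIG. 10 is a schematic flowchart of key-value storage for an aggregate delete-data request according to an embodiment of this application;
FIG. 11 is a schematic flowchart of standard NVMe device operations related to non-key-value storage according to an embodiment of this application;
FIG. 12 is a schematic diagram of an operation flow in which key-value storage is combined with the SCSI protocol to implement key-value storage according to an embodiment of this application;
FIG. 13 is a schematic structural diagram of a host according to an embodiment of this application;
FIG. 14 is a schematic structural diagram of a storage controller according to an embodiment of this application.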
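Detailed Description
FIG. 1 shows two main existing implementations of key-value storage.
In manner 1, the hardware is a SCSI device, which provides block device services through the SCSI low-level driver, the SCSI middle layer, the SCSI upper layer, the IO scheduling layer, and the block device layer. The software builds a middleware on top of the block device layer or the file system; the middleware converts between key-value operations and block device or file system operations, and thereby provides key-value storage services to applications in user space. However, this manner involves conversion across a multi-layer storage protocol stack, so the IO path latency is high.
In manner 2, the hardware is a device that supports key-value storage. The key-value storage device communicates with the middleware through a key-value storage driver layer, and the middleware converts user storage into key-value storage operations, thereby providing key-value storage services to applications in user space. However, in this manner the dedicated key-value storage device cannot support traditional block device storage, which limits its range of application.
Embodiments of this application provide a key-value storage method that extends key-value storage operations onto the storage protocol. This not only reduces the IO path latency of the storage system, but also supports traditional block device storage and key-value storage at the same time.
The following describes the technical solutions in the embodiments of this application with reference to the accompanying drawings.
It should be noted that, to describe the technical solutions clearly, the embodiments of this application use terms such as "first" and "second" to distinguish between identical or similar items whose functions and effects are basically the same. A person skilled in the art will understand that terms such as "first" and "second" do not limit quantity or execution order.
It should be noted that "/" in this document means "or"; for example, A/B may represent A or B. "And/or" merely describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may represent three cases: A alone, both A and B, and B alone. "Multiple" means two or more.
As used in this application, the terms "component", "module", "system", and the like are intended to refer to a computer-related entity, which may be hardware, firmware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to, a process running on a processor, a processor, an object, an executable file, a thread of execution, a program, and/or a computer. As an example, both an application running on a computing device and the computing device itself may be components. One or more components may reside within a process and/or thread of execution, and a component may be located on one computer and/or distributed between two or more computers. In addition, these components can execute from various computer-readable media having various data structures stored thereon. The components may communicate by way of local and/or remote processes, for example according to a signal having one or more data packets (for example, data from one component interacting with another component in a local system or a distributed system, and/or interacting with other systems by way of the signal across a network such as the Internet).
It should be noted that, in the embodiments of this application, words such as "exemplary" or "for example" are used to present an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" should not be construed as preferred or advantageous over other embodiments or designs. Rather, such words are intended to present a related concept in a concrete manner.
It should be noted that, in the embodiments of this application, unless otherwise stated, "multiple" means two or more. For example, multiple data packets means two or more data packets.
It should be noted that, in the embodiments of this application, "of", "corresponding, relevant", and "corresponding" may sometimes be used interchangeably. When the distinction is not emphasized, they express the same meaning.
It should be noted that the network architecture and service scenarios described in the embodiments of this application are intended to explain the technical solutions more clearly and do not limit them. A person of ordinary skill in the art will know that, as network architectures evolve and new service scenarios emerge, the technical solutions provided in the embodiments of this application are equally applicable to similar technical problems.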
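FIG. 2 is a schematic architectural diagram of the key-value storage system provided in an embodiment of this application. The key-value storage system consists of a host 20, a storage controller 21, and a storage device 22. The host 20 performs key-value storage operations on the storage medium in the storage device 22 through the storage controller 21; the storage controller 21 processes key-value storage requests and performs data access operations on the storage device 22.
Specifically, as shown in FIG. 2, the host 20 may include a host baseboard 201 and, deployed on the host baseboard 201, a central processing unit (CPU) 202, a memory 203, a bridge chip 204, and a host communication interface 205.
The memory 203 stores the program instructions and data necessary for the host 20 to run; the CPU 202 processes the program instructions of the host 20; the bridge chip 204 connects the various external devices (not shown) on the host baseboard 201; and the host communication interface 205 is the bus interface that connects the host 20 to the storage controller 21 and carries the communication between them.
The storage controller 21 may include a front-end communication interface 211, a CPU 212, a memory 213, and a back-end communication interface 214.
The memory 213 stores the program instructions and data necessary for the storage controller 21 to run; the CPU 212 processes the instructions and computations of the storage controller 21; the front-end communication interface 211 is the bus interface that connects the storage controller 21 to the host 20 and carries the communication between them; and the back-end communication interface 214 is the bus interface that connects the storage controller 21 to the storage device 22 and carries the communication between them.
It should be noted that the storage controller 21 and the storage device 22 in the embodiments of this application may be deployed independently or integrated in the same device, which is not specifically limited in this application.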
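The following describes in detail, based on this key-value storage system, the key-value storage method provided in the embodiments of this application. First, FIG. 3 and FIG. 4 are respectively a schematic diagram of the operation modules and the corresponding schematic flowchart of the key-value storage method. The method may specifically include:
S401. After the host detects a key-value storage request, the host writes the first instruction code carried in the key-value storage request, together with the address information, into the fields defined by the protocol to form a first storage request instruction sequence (see Table 1).
The first instruction code is an instruction code defined according to a reserved extension field of the protocol. A reserved extension field of the protocol is a field that the protocol reserves for protocol extension or for user-defined use.
S402. The host interacts with the storage controller so that the storage controller obtains the first storage request instruction sequence.
S403. The command processing unit of the storage controller obtains the first storage request instruction sequence, separates the first instruction code from it to determine that the current operation is a key-value operation, and then forwards the content of the first storage request instruction sequence to the key-value storage processing unit.
S404. The key-value storage processing unit processes all key-value storage requests and submits the processed requests to the storage control unit.
S405. The storage control unit converts the submitted requests into operations on the storage device.
S406. The storage control unit feeds the operation results back to the host through the foregoing units.
Table 1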
Figure PCTCN2017085983-appb-000001
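S407. After the host detects a non-key-value storage request, the host writes the second instruction code carried in the non-key-value storage request, together with the data memory pointer, into the fields defined by the protocol to form a third storage request instruction sequence (see Table 2).
The second instruction code is a standard instruction code of the protocol.
S408. The host interacts with the storage controller so that the storage controller obtains the third storage request instruction sequence.
S409. The command processing unit of the storage controller obtains the third storage request instruction sequence, separates the second instruction code from it to determine that the current operation is a non-key-value operation, and then forwards the content of the third storage request instruction sequence to the standard protocol processing unit.
S410. The standard protocol processing unit processes all non-key-value storage requests and submits the processed requests to the storage control unit.
S411. The storage control unit converts the submitted requests into operations on the storage device.
S412. The storage control unit feeds the operation results back to the host through the foregoing units.
Table 2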
Figure PCTCN2017085983-appb-000002
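Specifically, the protocol may be the NVMe protocol, or another storage protocol such as the SCSI protocol, which is not specifically limited in the embodiments of this application.
Specifically, the instruction sequence may take the form of a queue, or another form such as a data packet, which is not specifically limited in the embodiments of this application.
Specifically, the key-value storage operation may be an operation such as writing data, getting data, deleting data, or discarding data, which is not specifically limited in the embodiments of this application.
The key-value storage method provided in the embodiments of this application extends key-value storage operations onto the storage protocol. On the one hand, because the host can interact with the storage controller through the storage protocol to implement key-value storage, no conversion on top of the block layer or the file system is needed, which reduces the IO path latency of the storage system. On the other hand, because the host can also interact with the storage controller through the storage protocol to implement non-key-value storage, traditional block device storage is supported at the same time.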
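The routing decision made by the command processing unit in S403 and S409 can be sketched in Python as a lookup on the instruction code. The set below contains only the extended codes that this description names explicitly (80h, 81h, 82h, and 88h); the codes for the remaining key-value operations are defined in a table that is not reproduced here, so they are left out rather than guessed. Reading the code from byte 0 of the entry and the returned unit names are assumptions of this sketch.

```python
# Extended instruction codes named in the description; not an exhaustive list.
KV_OPCODES = {0x80, 0x81, 0x82, 0x88}


def dispatch(entry: bytes) -> str:
    """Return the name of the unit that should process this instruction sequence."""
    opcode = entry[0]  # assumption: the instruction code sits in the first byte
    if opcode in KV_OPCODES:
        return "key_value_storage_processing_unit"
    return "standard_protocol_processing_unit"


kv_entry = bytes([0x80]) + bytes(63)        # write-data (key-value) request
standard_entry = bytes([0x02]) + bytes(63)  # 02h is the standard NVMe Read opcode
```

Because both kinds of request share one queue format and differ only in the instruction code, a single controller can serve key-value and block requests side by side.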
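Optionally, step S401 may specifically include:
after the host detects the key-value storage request, the host allocates memory for the first storage request instruction sequence according to the fields defined by the protocol; and
the host writes the first instruction code carried in the key-value storage request and the address information into the memory.
Step S402 may then specifically include:
the host notifies the storage controller to read the first storage request instruction sequence from the memory;
or, the host sends the first storage request instruction sequence to the storage controller.
Having the host notify the storage controller to read the first storage request instruction sequence from the memory saves memory space on the storage controller.
It should be noted that the embodiments of this application merely give two examples of how the host and the storage controller may interact so that the storage controller obtains the first storage request instruction sequence. The host and the storage controller may also interact in other ways to achieve this, which is not specifically limited in the embodiments of this application.
The following further describes the key-value storage method provided in the embodiments of this application with reference to specific protocols and specific key-value or non-key-value storage operations.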
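FIG. 5 is a schematic diagram of an operation flow in which key-value storage is combined with the NVMe protocol to implement key-value storage.
First, the related units in FIG. 5 are briefly introduced:
1. Host interface:
The host and the storage controller are connected through the host interface, through which the host exchanges instructions, addresses, and data with the storage controller. In the embodiments of this application, the host interface is a Peripheral Component Interconnect Express (PCIe) interface.
2. Host software modules:
1) Application layer: a host application or storage client software.
2) Traditional application layer: a host application based on a traditional file system or block device interface.
3) Middleware: key-value storage middleware, which provides a key-value storage interface to host applications and passes storage requests to the NVMe driver layer.
4) File system: for example, EXT3, EXT4, or FAT32.
5) Block layer: the operating system's abstraction layer for block storage devices; file systems are generally built on top of this layer.
6) NVMe driver layer: the driver software through which the host operating system transfers data and exchanges commands with the NVMe storage controller.
7) NVMe command conversion unit: located in the NVMe driver layer; it fills the extended key-value (Key-Value) storage instruction, the addresses where the Key and/or Value are stored, and other information into the NVMe send queue, and submits the NVMe send queue to the NVMe driver layer, which delivers it to the NVMe storage controller.
3. NVMe storage controller:
1) NVMe command processing unit: analyzes the instruction code in the NVMe send queue received by the NVMe controller, and distributes the NVMe send queue to the key-value storage processing unit or the NVMe operation processing unit according to the instruction.
2) Key-value storage processing unit: processes all key-value operation requests and submits the processed requests to the storage control unit.
3) NVMe operation processing unit: processes standard NVMe protocol operation requests.
4) Storage control unit: converts the submitted requests into operations on the storage device, performs the data storage operations on the storage device, and feeds back the operation results.
4. Storage device:
The storage medium in the storage device includes dynamic random access memory (DRAM), non-volatile random access memory (NVRAM), NAND, or another storage component.
Next, the instruction codes defined for NVMe IO queue operations in the NVMe protocol are given in Table 3:
Table 3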
Figure PCTCN2017085983-appb-000003
Figure PCTCN2017085983-appb-000004
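In the embodiments of this application, the first instruction code carried in a key-value storage request is defined according to the reserved extension field of the NVMe protocol, as shown in Table 4:
Table 4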
Figure PCTCN2017085983-appb-000005
Figure PCTCN2017085983-appb-000006
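It should be noted that Table 4 merely gives one example of defining the first instruction code carried in a key-value storage request according to a reserved extension field of the NVMe protocol; the definition is not limited to this manner. For example, the above key-value storage operations may be defined on the 90h-99h fields, which is not specifically limited in the embodiments of this application.
As described above, the key-value (Key-Value) storage operations in this embodiment include, but are not limited to: write data, Put(String Key, String Value); get data, Get(String Key); delete data, Delete(String Key); and discard data, TRIM(String Key). This is not specifically limited in the embodiments of this application.
The following describes the above key-value storage operations in detail with reference to FIG. 5.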
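Example 1:
When the key-value storage request is a single write-data request, as shown in FIG. 6, the key-value storage method provided in the embodiments of this application includes steps S601-S611:
S601. The application layer calls the middleware write interface: Put(String Key, String Value).
S602. The middleware submits the write-data request and the address information where the Key and the Value are stored to the NVMe driver layer.
S603. The NVMe command conversion unit of the NVMe driver layer writes the write-data instruction 80h and the address information where the Key and the Value are stored to the NVMe send queue.
Specifically, in the NVMe protocol, bytes 0-63 are defined as the fields of the NVMe send queue. The command format of the NVMe send queue in this step is the send queue command format in FIG. 6. Bytes 03:00 (byte 0 to byte 3) hold the instruction code; bytes 23:04 (byte 4 to byte 23) are Command Dword 1-6 (the 1st to 6th 4-byte words of the command); bytes 39:24 (byte 24 to byte 39) hold the address information where the Key and the Value are stored; and bytes 63:40 (byte 40 to byte 63) are Command Dword 10-15 (the 10th to 15th 4-byte words of the command).
It should be noted that the NVMe send queue in the embodiments of this application is one specific form of the foregoing instruction sequence. As described above, the instruction sequence may also take another form, such as a data packet, which is not specifically limited in the embodiments of this application.
S604. The NVMe driver layer notifies the NVMe storage controller to read the NVMe send queue.
S605. The NVMe command processing unit of the NVMe storage controller reads the NVMe send queue by direct memory access (DMA).
S606. The NVMe command processing unit separates the write-data instruction and the address information where the Key and the Value are stored from the NVMe queue, and submits the write-data request to the key-value storage processing unit.
S607. The key-value storage processing unit processes the address information where the Key and the Value are stored, converts the write-data request into a write-data request for the corresponding storage device, and submits it to the storage control unit.
S608. The storage control unit writes the data to the storage device according to the address information where the Key and the Value are stored.
S609. The storage control unit returns status information to the key-value storage processing unit.
Specifically, the status information here indicates whether the data was written successfully.
S610. The key-value storage processing unit and the foregoing units pass the status information in turn to the middleware.
Specifically, with reference to FIG. 5, the foregoing units here include the NVMe command processing unit of the NVMe storage controller and the NVMe command conversion unit of the NVMe driver layer of the host.
S611. The middleware returns the status information to the application layer.
At this point, the key-value storage process for a single write-data request ends.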
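Example 2:
When the key-value storage request is a single get-data request, as shown in FIG. 7, the key-value storage method provided in the embodiments of this application includes steps S701-S721:
S701. The application layer calls the middleware read interface: Get(String Key).
S702. The middleware submits a request for the length of the Value corresponding to the Key to the NVMe driver layer.
S703. The NVMe command conversion unit in the NVMe driver layer writes the get-Value-length instruction 81h and the address information of where the Key is stored into the NVMe submission queue.
Specifically, the NVMe protocol defines bytes 0-63 as the fields corresponding to the NVMe submission queue. The command format of the NVMe submission queue in this step is submission queue command format 1 shown in FIG. 7. Bytes 03:00 (byte 0 through byte 3) hold the instruction code; bytes 23:04 (byte 4 through byte 23) are Command Dword 1-6 (the 1st through 6th 4-byte command words); bytes 39:24 (byte 24 through byte 39) hold the address information of where the Key is stored; bytes 63:40 (byte 40 through byte 63) are Command Dword 10-15 (the 10th through 15th 4-byte command words).
It should be noted that the NVMe submission queue in the embodiments of this application is one specific form of the foregoing instruction sequence. As described above, the instruction sequence may also take other forms, such as a data packet; the embodiments of this application do not specifically limit this.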
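S704. The NVMe driver layer notifies the NVMe storage controller to read the NVMe submission queue.
S705. The NVMe command processing unit of the NVMe storage controller reads the NVMe submission queue by DMA.
S706. The NVMe command processing unit separates the get-Value-length instruction and the address information of where the Key is stored from the NVMe queue, and submits the read request to the key-value storage processing unit.
S707. The key-value storage processing unit processes the address information of where the Key is stored, converts the read request into a read request for the corresponding storage device, and submits that request to the storage control unit.
S708. The storage control unit obtains the length of the corresponding Value from the storage device according to the address information of where the Key is stored.
S709. The storage control unit submits the Value length information to the key-value storage processing unit.
S710. The key-value storage processing unit and the preceding units pass the Value length information, in turn, to the middleware.
Specifically, with reference to FIG. 5, the preceding units here are the NVMe command processing unit of the NVMe storage controller and the NVMe command conversion unit in the NVMe driver layer of the host.
S711. The middleware allocates, according to the length of the Value, the host-side memory space in which the Value will be stored.
S712. The middleware submits the get-Value request and the address information of where the Value is to be stored to the NVMe driver layer.
S713. The NVMe command conversion unit in the NVMe driver layer writes the get-Value instruction 82h and the address information of where the key is stored into the NVMe submission queue.
Specifically, the NVMe protocol defines bytes 0-63 as the fields corresponding to the NVMe submission queue. The command format of the NVMe submission queue in this step is submission queue command format 2 shown in FIG. 7. Bytes 03:00 (byte 0 through byte 3) hold the instruction code; bytes 23:04 (byte 4 through byte 23) are Command Dword 1-6 (the 1st through 6th 4-byte command words); bytes 39:24 (byte 24 through byte 39) hold the address information of where the key is stored; bytes 63:40 (byte 40 through byte 63) are Command Dword 10-15 (the 10th through 15th 4-byte command words).
It should be noted that the NVMe submission queue in the embodiments of this application is one specific form of the foregoing instruction sequence. As described above, the instruction sequence may also take other forms, such as a data packet; the embodiments of this application do not specifically limit this.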
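S714. The NVMe driver layer notifies the NVMe storage controller to read the NVMe submission queue.
S715. The NVMe command processing unit of the NVMe storage controller reads the NVMe submission queue by DMA.
S716. The NVMe command processing unit separates the get-Value instruction and the address information of where the key is stored from the NVMe queue, and submits the get-Value request to the key-value storage processing unit.
S717. The key-value storage processing unit processes the address information of where the Value is to be stored, converts the get-Value request into a get-Value request for the corresponding storage device, and submits that request to the storage control unit.
S718. The storage control unit reads the data by DMA into the memory address that the host allocated for the Value.
Optionally, instructions and data may also be transferred between the host and the storage controller by other means, such as remote direct memory access (RDMA); the embodiments of this application do not specifically limit this.
S719. The storage control unit returns status information to the key-value storage processing unit.
Specifically, the status information here indicates whether the data was obtained successfully.
S720. The key-value storage processing unit and the preceding units pass the status information, in turn, to the middleware.
Specifically, with reference to FIG. 5, the preceding units here are the NVMe command processing unit of the NVMe storage controller and the NVMe command conversion unit in the NVMe driver layer of the host.
S721. The middleware returns the status information to the application layer.
At this point, the key-value storage process for a single get-data request ends.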
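The Get flow above is a two-phase exchange: the host first asks for the Value length (81h), allocates a buffer of that size, and only then asks for the Value itself (82h). The sketch below models that control flow in Python. It is an assumption-laden illustration: `FakeController` and `kv_get` are hypothetical names, the storage device is an in-memory dict, and the "DMA" transfer is a plain copy into a host-side `bytearray`.

```python
class FakeController:
    """Hypothetical stand-in for the storage controller plus storage device."""

    def __init__(self, store):
        self._store = dict(store)

    def get_length(self, key):
        # Models the 81h request: return the length of the Value for this Key.
        return len(self._store[key])

    def get_value(self, key, host_buffer):
        # Models the 82h request: copy the Value into the host-allocated
        # buffer (standing in for the DMA transfer of step S718).
        value = self._store[key]
        if len(host_buffer) < len(value):
            return False  # status: failed, buffer too small
        host_buffer[:len(value)] = value
        return True       # status: success


def kv_get(controller, key):
    """Two-phase Get as seen from the middleware."""
    length = controller.get_length(key)          # steps S702-S710
    host_buffer = bytearray(length)              # step S711
    ok = controller.get_value(key, host_buffer)  # steps S712-S720
    return bytes(host_buffer) if ok else None    # step S721


controller = FakeController({b"user:42": b"hello"})
result = kv_get(controller, b"user:42")
```

The length query costs an extra round trip, but it lets the host size the destination buffer exactly before the controller writes into it, which matches the order of steps S711 and S718.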
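Example 3:
When the key-value storage request is a single delete-data request, as shown in FIG. 8, the key-value storage method provided in the embodiments of this application includes steps S801-S811:
S801. The application layer calls the middleware delete interface: Delete(String Key).
S802. The middleware submits the delete-data request and the address information of where the Key is stored to the NVMe driver layer.
S803. The NVMe command conversion unit in the NVMe driver layer writes the delete-data instruction 83h and the address information of where the Key is stored into the NVMe submission queue.
Specifically, the NVMe protocol defines bytes 0-63 as the fields corresponding to the NVMe submission queue. The command format of the NVMe submission queue in this step is the submission queue command format shown in FIG. 8. Bytes 03:00 (byte 0 through byte 3) hold the instruction code; bytes 23:04 (byte 4 through byte 23) are Command Dword 1-6 (the 1st through 6th 4-byte command words); bytes 39:24 (byte 24 through byte 39) hold the address information of where the key is stored; bytes 63:40 (byte 40 through byte 63) are Command Dword 10-15 (the 10th through 15th 4-byte command words).
It should be noted that the NVMe submission queue in the embodiments of this application is one specific form of the foregoing instruction sequence. As described above, the instruction sequence may also take other forms, such as a data packet; the embodiments of this application do not specifically limit this.
S804. The NVMe driver layer notifies the NVMe storage controller to read the NVMe submission queue.
S805. The NVMe command processing unit of the NVMe storage controller reads the NVMe submission queue by DMA.
S806. The NVMe command processing unit separates the delete-data instruction and the address information of where the key is stored from the NVMe queue, and submits the delete-data request to the key-value storage processing unit.
S807. The key-value storage processing unit processes the address information of where the Key is stored, converts the delete-data request into a delete-data request for the corresponding storage device, and submits that request to the storage control unit.
S808. The storage control unit deletes the data in the storage device according to the address information of where the Key is stored.
S809. The storage control unit returns status information to the key-value storage processing unit.
Specifically, the status information here indicates whether the data was deleted successfully.
S810. The key-value storage processing unit and the preceding units pass the status information, in turn, to the middleware.
Specifically, with reference to FIG. 5, the preceding units here are the NVMe command processing unit of the NVMe storage controller and the NVMe command conversion unit in the NVMe driver layer of the host.
S811. The middleware returns the status information to the application layer.
At this point, the key-value storage process for a single delete-data request ends.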
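Example 4:
When the key-value storage request is a single discard-data request, as shown in FIG. 9, the key-value storage method provided in the embodiments of this application includes steps S901-S911:
S901. The application layer calls the middleware discard interface: TRIM(String Key).
S902. The middleware submits the discard-data request and the address information of where the Key is stored to the NVMe driver layer.
S903. The NVMe command conversion unit in the NVMe driver layer writes the discard-data instruction 84h and the address information of where the Key is stored into the NVMe submission queue.
Specifically, the NVMe protocol defines bytes 0-63 as the fields corresponding to the NVMe submission queue. The command format of the NVMe submission queue in this step is the submission queue command format shown in FIG. 9. Bytes 03:00 (byte 0 through byte 3) hold the instruction code; bytes 23:04 (byte 4 through byte 23) are Command Dword 1-6 (the 1st through 6th 4-byte command words); bytes 39:24 (byte 24 through byte 39) hold the address information of where the key is stored; bytes 63:40 (byte 40 through byte 63) are Command Dword 10-15 (the 10th through 15th 4-byte command words).
It should be noted that the NVMe submission queue in the embodiments of this application is one specific form of the foregoing instruction sequence. As described above, the instruction sequence may also take other forms, such as a data packet; the embodiments of this application do not specifically limit this.
S904. The NVMe driver layer notifies the NVMe storage controller to read the NVMe submission queue.
S905. The NVMe command processing unit of the NVMe storage controller reads the NVMe submission queue by DMA.
S906. The NVMe command processing unit separates the discard-data instruction and the address information of where the key is stored from the NVMe queue, and submits the discard-data request to the key-value storage processing unit.
S907. The key-value storage processing unit processes the address information of where the Key is stored, converts the discard-data request into a discard-data request for the corresponding storage device, and submits that request to the storage control unit.
S908. The storage control unit discards the data in the storage device according to the address information of where the Key is stored.
S909. The storage control unit returns status information to the key-value storage processing unit.
Specifically, the status information here indicates whether the data was discarded successfully.
S910. The key-value storage processing unit and the preceding units pass the status information, in turn, to the middleware.
Specifically, with reference to FIG. 5, the preceding units here are the NVMe command processing unit of the NVMe storage controller and the NVMe command conversion unit in the NVMe driver layer of the host.
S911. The middleware returns the status information to the application layer.
At this point, the key-value storage process for a single discard-data request ends.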
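The embodiments shown in FIGS. 6-9 all concern key-value storage for a single key-value storage request. As Table 2 shows, aggregated operations can also be defined when the instructions are defined. An aggregated operation means that one aggregated storage request completes multiple single storage requests at once; its flow is essentially the same as that of a single request. The difference is that the address information of the multiple keys (Key) and/or values (Value) must be passed to the storage controller through the NVMe submission queue by means of an SGL (scatter gather list). In other words, the address information of the multiple keys and/or values is indexed through the address information of the scatter gather list, and the scatter gather list contains the address information carried by each of the single key-value storage requests.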
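Example 5:
The key-value storage method provided in the embodiments of this application is described below using an aggregated delete-data request as an example. As shown in FIG. 10, the method includes steps S1001-S1011:
S1001. The application layer calls the middleware aggregated delete interface Delete_Group(String Key group).
S1002. The middleware submits the aggregated delete-data request and the address information of the scatter gather list (SGL), which indicates the addresses at which the Key data group is stored, to the NVMe driver layer.
S1003. The NVMe command conversion unit in the NVMe driver layer writes the aggregated delete-data instruction 88h and the address information of the SGL into the NVMe submission queue.
Specifically, the NVMe protocol defines bytes 0-63 as the fields corresponding to the NVMe submission queue. The command format of the NVMe submission queue in this step is the submission queue command format shown in FIG. 10. Bytes 03:00 (byte 0 through byte 3) hold the instruction code; bytes 23:04 (byte 4 through byte 23) are Command Dword 1-6 (the 1st through 6th 4-byte command words); bytes 39:24 (byte 24 through byte 39) hold the address information of the SGL; bytes 63:40 (byte 40 through byte 63) are Command Dword 10-15 (the 10th through 15th 4-byte command words).
It should be noted that an aggregated operation may contain any number of single operations. FIG. 10 merely takes an aggregated delete-data request containing 5 single delete-data requests as an example and illustrates an SGL containing the address information of key1-key5; it does not limit the technical solution of this application.
It should be noted that the NVMe submission queue in the embodiments of this application is one specific form of the foregoing instruction sequence. As described above, the instruction sequence may also take other forms, such as a data packet; the embodiments of this application do not specifically limit this.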
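S1004. The NVMe driver layer notifies the NVMe storage controller to read the NVMe submission queue.
S1005. The NVMe command processing unit of the NVMe storage controller reads the NVMe submission queue by DMA.
S1006. The NVMe command processing unit separates the aggregated delete-data instruction and the address information of the SGL from the NVMe queue, and submits the aggregated delete-data request to the key-value storage processing unit.
S1007. The key-value storage processing unit processes, one by one, the address information in the SGL of where the Key data group is stored, converts the aggregated delete-data request into a delete-data request for the data in the storage device corresponding to each Key, and submits the requests to the storage control unit.
S1008. The storage control unit performs the data deletions one by one.
S1009. After all operations are complete, the storage control unit returns status information to the key-value storage processing unit.
Specifically, the status information here indicates whether all the data was deleted successfully.
S1010. The key-value storage processing unit and the preceding units pass the status information, in turn, to the middleware.
Specifically, with reference to FIG. 5, the preceding units here are the NVMe command processing unit of the NVMe storage controller and the NVMe command conversion unit in the NVMe driver layer of the host.
S1011. The middleware returns the status information to the application layer.
At this point, the key-value storage process for an aggregated delete-data request ends.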
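The aggregated delete flow can be sketched as follows. This is a simplified model rather than the actual SGL format of any protocol: the scatter gather list is assumed to be a Python list of (offset, length) descriptors into a flat host-memory buffer, and `build_sgl` and `delete_group` are hypothetical helper names. What it does reflect from the text is that the controller walks the list entry by entry and returns a single status after all deletions have been attempted.

```python
def build_sgl(keys):
    """Host side: lay the keys out in one buffer and describe each with
    an (offset, length) descriptor. This stands in for the real SGL."""
    host_memory = bytearray()
    sgl = []
    for key in keys:
        sgl.append((len(host_memory), len(key)))
        host_memory += key
    return host_memory, sgl


def delete_group(store, host_memory, sgl):
    """Controller side: resolve each descriptor to a Key and delete it.
    Returns one overall status, True only if every Key was deleted."""
    all_deleted = True
    for offset, length in sgl:
        key = bytes(host_memory[offset:offset + length])
        if key in store:
            del store[key]
        else:
            all_deleted = False
    return all_deleted


# The dict below is a hypothetical in-memory stand-in for the storage device.
store = {b"key1": b"a", b"key2": b"b", b"key3": b"c"}
host_memory, sgl = build_sgl([b"key1", b"key2"])
status = delete_group(store, host_memory, sgl)
```

Because only the SGL address travels in bytes 24-39 of the command, the number of keys in one aggregated request is not bounded by the 16 address bytes of a single entry.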
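The embodiments shown in FIGS. 6-10 all concern the operations performed when the host detects a key-value storage request. As described above, in the embodiments of this application the host can also interact with the storage controller through the storage protocol to implement non-key-value storage, for example to support legacy block device storage at the same time; see Example 6.
Example 6:
When the host detects a non-key-value storage request, as shown in FIG. 11, the standard NVMe device operation for non-key-value storage includes steps S1101-S1111:
S1101. The legacy application layer calls the file system interface.
S1102. The file system layer converts the non-key-value storage request into a block-layer operation request.
S1103. The block layer submits the operation request to the NVMe driver layer.
S1104. The NVMe command conversion unit in the NVMe driver layer converts the operation request into a standard NVMe instruction and writes the instruction code, the data memory pointer, and related information into the NVMe submission queue.
Specifically, the command format of the NVMe submission queue in this embodiment is similar to that in the foregoing embodiments and is not described again here.
S1105. The NVMe driver layer notifies the NVMe storage controller to read the NVMe submission queue.
S1106. The NVMe command processing unit of the NVMe storage controller reads the NVMe submission queue by DMA.
S1107. The NVMe command processing unit separates the instruction code, the data memory pointer, and related information from the NVMe submission queue, and submits the operation request to the NVMe operation processing unit.
S1108. The NVMe operation processing unit processes the data memory pointer and related information, converts the operation request into an operation request for the corresponding storage device, and submits that request to the storage control unit.
S1109. The storage control unit performs the operation request on the storage device according to the data memory pointer.
S1110. The storage control unit returns status information to the NVMe operation processing unit.
Specifically, the status information here indicates whether the operation request was performed successfully.
S1111. The NVMe operation processing unit and the preceding units pass the status information, in turn, to the legacy application layer.
Specifically, with reference to FIG. 5, the preceding units here are the NVMe command processing unit of the NVMe storage controller, the NVMe command conversion unit in the NVMe driver layer of the host, the block layer, and the file system.
At this point, the standard NVMe device operation for non-key-value storage ends.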
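Examples 1-6 together imply a routing decision inside the storage controller: the NVMe command processing unit splits each 64-byte entry into an instruction code and address information, then hands key-value codes to the key-value storage processing unit and everything else to the NVMe operation processing unit. The Python sketch below illustrates that decision. The function names are hypothetical, the set of key-value codes contains only the values quoted in the examples above (80h-84h and 88h), and placing the code in the low byte of dword 0 is an assumption.

```python
import struct

# Key-value instruction codes quoted in the examples above.
KV_OPCODES = {
    0x80: "put",
    0x81: "get_value_length",
    0x82: "get_value",
    0x83: "delete",
    0x84: "trim",
    0x88: "delete_group",
}


def split_entry(entry):
    """Separate the instruction code (bytes 0-3) from the address
    information (bytes 24-39) of one 64-byte entry."""
    if len(entry) != 64:
        raise ValueError("a submission queue entry must be 64 bytes")
    (dword0,) = struct.unpack_from("<I", entry, 0)
    opcode = dword0 & 0xFF  # assumption: the code sits in the low byte
    return opcode, entry[24:40]


def route(entry):
    """Return which processing unit should receive the entry."""
    opcode, address_info = split_entry(entry)
    if opcode in KV_OPCODES:
        return ("key_value_unit", KV_OPCODES[opcode], address_info)
    return ("nvme_operation_unit", opcode, address_info)


# One key-value entry (delete, 83h) and one standard NVMe entry (read, 02h).
kv_entry = struct.pack("<I20s16s24s", 0x83, bytes(20), b"\x01" * 16, bytes(24))
std_entry = struct.pack("<I20s16s24s", 0x02, bytes(20), bytes(16), bytes(24))
```

Routing on the instruction code alone is what lets one controller serve both paths: standard block I/O keeps its existing codes, while the key-value codes live in a range the base protocol leaves open for extension.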
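The embodiments shown in FIGS. 6-11 are all described with reference to FIG. 5, the schematic operation flow in which key-value storage is combined with the NVMe protocol. Combining key-value storage with an efficient storage protocol such as NVMe gives the storage system the advantages of both. As described above, however, the protocol in the embodiments of this application may be the NVMe protocol or another storage protocol such as the SCSI protocol; the embodiments of this application do not specifically limit this.
For example, the embodiments of this application may also combine key-value storage with the SCSI protocol. FIG. 12 is a schematic operation flow in which key-value storage is combined with the SCSI protocol.
First, the relevant units in FIG. 12 are briefly introduced:
I. Host interface:
The host interface shown in FIG. 12 has the same function as the host interface shown in FIG. 5; see the description of the host interface in FIG. 5, which is not repeated here.
II. Host software modules:
1) The application, legacy application, middleware, file system, and block layer shown in FIG. 12 have the same functions as their counterparts shown in FIG. 5; see the descriptions of those units in FIG. 5, which are not repeated here.
2) SCSI layer: the software layer that handles SCSI transactions, including the SCSI upper layer, the SCSI middle layer, and the SCSI lower layer.
3) SCSI driver layer: located in the SCSI lower layer. It is responsible for submitting SCSI requests to the Serial Attached SCSI (SAS) storage controller and for completing the control and data interaction with the SAS storage controller.
4) SCSI command conversion unit: located in the SCSI driver layer. It fills the extended key-value (Key-Value) storage instruction, the address of the Key and/or Value, and related information into the SCSI submission queue, and submits the SCSI submission queue to the SCSI driver layer, which delivers it to the SAS storage controller.
III. SAS storage controller:
1) The key-value storage processing unit and the storage control unit shown in FIG. 12 have the same functions as their counterparts shown in FIG. 5; see the descriptions of those units in FIG. 5, which are not repeated here.
2) SCSI command processing unit: parses the instruction code in the SCSI submission queue received by the SCSI controller and, depending on the instruction, dispatches the SCSI submission queue to the key-value storage processing unit or to the SCSI operation processing unit.
3) SCSI operation processing unit: processes standard SCSI protocol operation requests.
IV. Storage device:
The storage device shown in FIG. 12 has the same function as the storage device shown in FIG. 5; see the description of the storage device in FIG. 5, which is not repeated here.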
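Next, the instruction codes defined for block device operations in the SCSI protocol are given in Table 5:
Table 5
Figure PCTCN2017085983-appb-000007
Figure PCTCN2017085983-appb-000008
Figure PCTCN2017085983-appb-000009
Figure PCTCN2017085983-appb-000010
In the embodiments of this application, the first instruction code carried in a key-value storage request is defined according to the reserved extension field of the SCSI protocol, as shown in Table 6:
Table 6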
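Command name                                                          Instruction code
Write data: Write                                                     02h
Get the length of the Value corresponding to a Key: GetLength         05h
Get data: Read                                                        06h
Delete data: Delete                                                   09h
Discard data: TRIM                                                    0Ch
Aggregated write data: Write                                          0Dh
Aggregated get length of the Value corresponding to a Key: GetLength  0Eh
Aggregated get data: Read                                             0Fh
Aggregated delete data: Delete                                        10h
Aggregated discard data: TRIM                                         13h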
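It should be noted that Table 6 is only one example of defining the first instruction code carried in a key-value storage request according to the reserved extension field of the SCSI protocol. The definition is not limited to this form; for example, the key-value storage operations may be defined in the other reserved extension fields mentioned in item 3 of the notes to Table 5. The embodiments of this application do not specifically limit this.
Specifically, the key-value storage flow based on FIG. 12 is consistent with the flow based on FIG. 5; see the embodiments shown in FIGS. 6-11, which are not repeated here.
As the key-value storage methods in the foregoing embodiments show, the method provided in the embodiments of this application extends key-value storage operations onto the storage protocol itself. On the one hand, because the host can interact with the storage controller through the storage protocol to implement key-value storage, no conversion above the block layer or the file system is needed, which reduces the IO path latency of the storage system. On the other hand, because the host can also interact with the storage controller through the storage protocol to implement non-key-value storage, legacy block device storage is supported at the same time.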
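The codes in Table 6 can be captured in a small lookup, shown here as a Python sketch. The enum and member names are hypothetical labels chosen for readability; only the numeric values come from Table 6. The `GROUP_` prefix marks the aggregated variants.

```python
from enum import IntEnum


class ScsiKvOpcode(IntEnum):
    """Key-value instruction codes from Table 6 (names are illustrative)."""
    WRITE = 0x02
    GET_LENGTH = 0x05
    READ = 0x06
    DELETE = 0x09
    TRIM = 0x0C
    GROUP_WRITE = 0x0D
    GROUP_GET_LENGTH = 0x0E
    GROUP_READ = 0x0F
    GROUP_DELETE = 0x10
    GROUP_TRIM = 0x13


def describe(opcode):
    """Return (name, is_aggregated) for a Table 6 code, or None if the
    code is not one of the key-value codes listed there."""
    try:
        op = ScsiKvOpcode(opcode)
    except ValueError:
        return None
    return op.name, op.name.startswith("GROUP_")
```

A command processing unit could use such a lookup to decide whether a received code belongs to the key-value path; codes that return `None` would go to the standard SCSI operation path.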
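The foregoing mainly describes the solutions provided in the embodiments of this application from the perspective of interaction between the devices. It can be understood that, to implement the foregoing functions, each device, such as the host and the storage controller, includes the hardware structures and/or software modules that perform the respective functions. A person skilled in the art will readily appreciate that, given the units and algorithm steps of the examples described in the embodiments disclosed herein, this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this application.
In the embodiments of this application, the host, the storage controller, and so on may be divided into functional modules according to the foregoing method examples. For example, each function may be assigned its own functional module, or two or more functions may be integrated into one processing module. The integrated module may be implemented in hardware or as a software functional module. It should be noted that the division of modules in the embodiments of this application is illustrative and is merely a division by logical function; other divisions are possible in an actual implementation.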
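Where integrated units are used, FIG. 13 shows a possible schematic structure of the host 1300 involved in the foregoing embodiments. The host 1300 includes a processing module 1301 and a communication module 1302. The communication module 1302 is configured to communicate with the storage controller. The processing module 1301 is configured to support the host in performing processes S401 and S407 in FIG. 4. Alternatively, the processing module 1301 may include an application layer 1301a, middleware 1301b, and a driver layer 1301c, to support the host in performing the operations performed by the application layer, the middleware, and the NVMe driver layer in FIGS. 6-10. Alternatively, the processing module 1301 may further include a legacy application layer 1301d, a file system 1301e, a block layer 1301f, and the driver layer 1301c, to support the host in performing the operations performed by the legacy application layer, the file system, the block layer, and the NVMe driver layer in FIG. 11. All related content of the steps in the foregoing method embodiments may be cited in the functional descriptions of the corresponding functional modules and is not repeated here. In addition, the host 1300 may further include a storage module, configured to store program code and data of the host 1300.
The processing module 1301 may be a processor or a controller, for example the CPU 202 in FIG. 2, or a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It can implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the disclosure of this application. The processor may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor. The communication module 1302 may be a communication interface, for example the host communication interface 205 in FIG. 2, or a receiver and a transmitter, or a transceiver circuit. The storage module may be a memory or a storage device.
When the processing module 1301 is a CPU and the communication module 1302 is a communication interface, the host involved in the embodiments of this application may be the host shown in FIG. 2; see the related description of FIG. 2, which is not repeated here.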
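Where integrated units are used, FIG. 14 shows a possible schematic structure of the storage controller 1400 involved in the foregoing embodiments. The storage controller 1400 includes a front-end communication module 1401, a processing module 1402, a control module 1403, and a back-end communication module 1404. The front-end communication module 1401 is configured to support communication between the storage controller 1400 and front-end devices, for example communication with the host in FIG. 4 and FIGS. 6-11. The back-end communication module 1404 is configured to support communication between the storage controller 1400 and back-end devices, for example communication with the storage device in FIG. 4 and FIGS. 6-11. The processing module 1402 may include a command processing unit 1402a, a key-value storage processing unit 1402b, and a standard protocol processing unit 1402c, to support the storage controller 1400 in performing the operations performed by the command processing unit, the key-value storage processing unit, and the standard protocol processing unit in FIG. 4, or the operations performed by the NVMe command processing unit, the key-value storage processing unit, and the NVMe operation processing unit in FIGS. 6-11. The control module 1403 is configured to support the storage controller in performing the operations performed by the storage control unit in FIG. 4 and FIGS. 6-11. All related content of the steps in the foregoing method embodiments may be cited in the functional descriptions of the corresponding functional modules and is not repeated here. In addition, the storage controller 1400 may further include a storage module, configured to store program code and data of the storage controller 1400.
The processing module 1402 and the control module 1403 may be a processor or a controller, for example the CPU 212 in FIG. 2, or a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. They can implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the disclosure of this application. The processor may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor. The front-end communication module 1401 and the back-end communication module 1404 may be communication interfaces, for example the front-end communication interface 211 and the back-end communication interface 214 in FIG. 2 respectively, or a receiver and a transmitter, or a transceiver circuit. The storage module may be a memory or a storage device.
When the processing module 1402 and the control module 1403 are a CPU and the front-end communication module 1401 and the back-end communication module 1404 are communication interfaces, the storage controller involved in the embodiments of this application may be the storage controller shown in FIG. 2; see the related description of FIG. 2, which is not repeated here.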
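The steps of the methods or algorithms described in connection with the disclosure of this application may be implemented in hardware, or by a processor executing software instructions. The software instructions may consist of corresponding software modules, which may be stored in random access memory (RAM), flash memory, read only memory (ROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), registers, a hard disk, a removable hard disk, a compact disc read-only memory (CD-ROM), or any other form of storage medium well known in the art. An exemplary storage medium is coupled to the processor so that the processor can read information from, and write information to, the storage medium. The storage medium may also be a component of the processor. The processor and the storage medium may be located in an ASIC. In addition, the ASIC may be located in a core network interface device. The processor and the storage medium may also exist as discrete components in the core network interface device.
A person skilled in the art should appreciate that, in one or more of the foregoing examples, the functions described in this application may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored in a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium. Computer-readable media include computer storage media and communication media, where communication media include any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
The specific implementations described above further explain the objectives, technical solutions, and beneficial effects of this application in detail. It should be understood that the foregoing are merely specific implementations of this application and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made on the basis of the technical solutions of this application shall fall within the scope of protection of this application.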

Claims (33)

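  1. A key-value storage method, characterized in that the method comprises:
    after a host detects a key-value storage request, writing, by the host, a first instruction code and address information carried in the key-value storage request into fields defined by a protocol to form a first storage request instruction sequence, wherein the first instruction code is an instruction code defined according to a reserved extension field of the protocol; and
    interacting, by the host, with the storage controller, so that the storage controller obtains the first storage request instruction sequence.
  2. The method according to claim 1, characterized in that the writing, by the host, the first instruction code and the address information carried in the key-value storage request into the fields defined by the protocol to form the first storage request instruction sequence comprises:
    allocating, by the host, memory for the first storage request instruction sequence according to the fields defined by the protocol; and
    writing, by the host, the first instruction code and the address information carried in the key-value storage request into the memory; and
    the interacting, by the host, with the storage controller, so that the storage controller obtains the first storage request instruction sequence comprises:
    notifying, by the host, the storage controller to read the first storage request instruction sequence from the memory.
  3. The method according to claim 1 or 2, characterized in that the key-value storage request comprises: a write-data request, or a get-data request, or a delete-data request, or a discard-data request;
    wherein the address information carried in the write-data request comprises address information of where the key and the value are stored;
    the address information carried in the get-data request comprises address information of where the key is stored;
    the address information carried in the delete-data request comprises address information of where the key is stored; and
    the address information carried in the discard-data request comprises address information of where the key is stored.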
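  4. The method according to claim 3, characterized in that, if the key-value storage request is a get-data request, after the host detects the key-value storage request and before the host writes the first instruction code and the address information carried in the key-value storage request into the fields defined by the protocol to form the first storage request instruction sequence, the method further comprises:
    obtaining, by the host, the length of the value requested by the get-data request; and
    allocating, by the host, memory for the value according to the length of the value, so that after the host interacts with the storage controller so that the storage controller obtains the first storage request instruction sequence, the storage controller reads data into the memory allocated by the host for the value.
  5. The method according to claim 4, characterized in that the obtaining, by the host, the length of the value requested by the get-data request comprises:
    writing, by the host, an instruction code for obtaining the length of the value corresponding to a key, and the address information of where the key is stored, into the fields defined by the protocol to form a second storage request instruction sequence, wherein the instruction code for obtaining the length of the value corresponding to the key is an instruction code defined according to the reserved extension field of the protocol;
    interacting, by the host, with the storage controller, so that the storage controller obtains the second storage request instruction sequence; and
    receiving, by the host, the length of the value sent by the storage controller.
  6. The method according to any one of claims 1-5, characterized in that the key-value storage request is an aggregated key-value storage request comprising multiple single key-value storage requests; wherein
    the address information carried in the key-value storage request is indexed through address information of a scatter gather list, and the scatter gather list contains the address information carried in each of the multiple single key-value storage requests.
  7. The method according to any one of claims 1-6, characterized in that the method further comprises:
    after the host detects a non-key-value storage request, writing, by the host, a second instruction code and a data memory pointer carried in the non-key-value storage request into the fields defined by the protocol to form a third storage request instruction sequence, wherein the second instruction code is a standard instruction code of the protocol; and
    interacting, by the host, with the storage controller, so that the storage controller obtains the third storage request instruction sequence.
  8. The method according to any one of claims 1-7, characterized in that the protocol is the non-volatile memory express (NVMe) protocol;
    wherein the NVMe protocol defines bytes 0-63 as the fields of a storage request instruction sequence.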
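  9. A key-value storage method, characterized in that the method comprises:
    obtaining, by a storage controller, a first storage request instruction sequence, wherein the first storage request instruction sequence is formed by the host writing a first instruction code and address information carried in a key-value storage request into fields defined by a protocol, and the first instruction code is an instruction code defined according to a reserved extension field of the protocol;
    separating, by the storage controller, the first instruction code and the address information from the first storage request instruction sequence; and
    performing, by the storage controller, the operation corresponding to the first instruction code on a storage device according to the first instruction code and the address information.
  10. The method according to claim 9, characterized in that the obtaining, by the storage controller, the first storage request instruction sequence comprises:
    reading, by the storage controller, the first storage request instruction sequence from memory that the host allocated for the first storage request instruction sequence according to the fields defined by the protocol.
  11. The method according to claim 9 or 10, characterized in that the key-value storage request comprises: a write-data request, or a get-data request, or a delete-data request, or a discard-data request;
    wherein the address information carried in the write-data request comprises the addresses of where the key and the value are stored;
    the address information carried in the get-data request comprises address information of where the key is stored;
    the address information carried in the delete-data request comprises address information of where the key is stored; and
    the address information carried in the discard-data request comprises address information of where the key is stored.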
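  12. The method according to claim 11, characterized in that, if the key-value storage request is a get-data request, before the storage controller obtains the first storage request instruction sequence, the method further comprises:
    obtaining, by the storage controller, the length of the value requested by the get-data request; and
    sending, by the storage controller, the length of the value to the host, so that the host allocates memory for the value according to the length of the value; and
    the performing, by the storage controller, the operation corresponding to the first instruction code on the storage device according to the first instruction code and the address information comprises:
    reading, by the storage controller, data into the memory allocated by the host for the value according to the first instruction code and the address information.
  13. The method according to claim 12, characterized in that the obtaining, by the storage controller, the length of the value requested by the get-data request comprises:
    obtaining, by the storage controller, a second storage request instruction sequence, wherein the second storage request instruction sequence is formed by the host writing an instruction code for obtaining the length of the value corresponding to a key, and the address information of where the key is stored, into the fields defined by the protocol, and the instruction code for obtaining the length of the value corresponding to the key is an instruction code defined according to the reserved extension field of the protocol;
    separating, by the storage controller, the instruction code for obtaining the length of the value corresponding to the key, and the address information of where the key is stored, from the second storage request instruction sequence; and
    obtaining, by the storage controller, the length of the value from the storage device according to the instruction code for obtaining the length of the value corresponding to the key and the address information of where the key is stored.
  14. The method according to any one of claims 9-13, characterized in that the key-value storage request is an aggregated key-value storage request comprising multiple single key-value storage requests; wherein
    the address information carried in the key-value storage request is indexed through address information of a scatter gather list, and the scatter gather list contains the address information carried in each of the multiple single key-value storage requests.
  15. The method according to any one of claims 9-14, characterized in that the method further comprises:
    obtaining, by the storage controller, a third storage request instruction sequence, wherein the third storage request instruction sequence is formed by the host writing a second instruction code and a data memory pointer carried in a non-key-value storage request into the fields defined by the protocol, and the second instruction code is a standard instruction code of the protocol;
    separating, by the storage controller, the second instruction code and the data memory pointer from the third storage request instruction sequence; and
    performing, by the storage controller, the operation corresponding to the second instruction code on the storage device according to the second instruction code and the data memory pointer.
  16. The method according to any one of claims 9-15, characterized in that the protocol comprises the non-volatile memory express (NVMe) protocol;
    wherein the NVMe protocol defines bytes 0-63 as the fields of a storage request instruction sequence.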
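  17. A host, characterized in that the host comprises a processing module and a communication module;
    the processing module is configured to, after detecting a key-value storage request, write a first instruction code and address information carried in the key-value storage request into fields defined by a protocol to form a first storage request instruction sequence, wherein the first instruction code is an instruction code defined according to a reserved extension field of the protocol; and
    the communication module is configured to interact with the storage controller, so that the storage controller obtains the first storage request instruction sequence.
  18. The host according to claim 17, characterized in that the processing module is specifically configured to:
    allocate memory for the first storage request instruction sequence according to the fields defined by the protocol; and
    write the first instruction code and the address information carried in the key-value storage request into the memory; and
    the communication module is specifically configured to:
    notify the storage controller to read the first storage request instruction sequence from the memory.
  19. The host according to claim 17 or 18, characterized in that the key-value storage request comprises: a write-data request, or a get-data request, or a delete-data request, or a discard-data request;
    wherein the address information carried in the write-data request comprises address information of where the key and the value are stored;
    the address information carried in the get-data request comprises address information of where the key is stored;
    the address information carried in the delete-data request comprises address information of where the key is stored; and
    the address information carried in the discard-data request comprises address information of where the key is stored.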
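  20. The host according to claim 19, characterized in that, if the key-value storage request is a get-data request,
    the processing module is further configured to: after detecting the key-value storage request and before writing the first instruction code and the address information carried in the key-value storage request into the fields defined by the protocol to form the first storage request instruction sequence, obtain the length of the value requested by the get-data request; and
    allocate memory for the value according to the length of the value, so that after the communication module interacts with the storage controller so that the storage controller obtains the first storage request instruction sequence, the storage controller reads data into the memory allocated by the processing module for the value.
  21. The host according to claim 20, characterized in that the processing module is specifically configured to:
    write an instruction code for obtaining the length of the value corresponding to a key, and the address information of where the key is stored, into the fields defined by the protocol to form a second storage request instruction sequence, wherein the instruction code for obtaining the length of the value corresponding to the key is an instruction code defined according to the reserved extension field of the protocol;
    interact with the storage controller through the communication module, so that the storage controller obtains the second storage request instruction sequence; and
    receive, through the communication module, the length of the value sent by the storage controller.
  22. The host according to any one of claims 17-21, characterized in that the key-value storage request is an aggregated key-value storage request comprising multiple single key-value storage requests; wherein
    the address information carried in the key-value storage request is indexed through address information of a scatter gather list, and the scatter gather list contains the address information carried in each of the multiple single key-value storage requests.
  23. The host according to any one of claims 17-22, characterized in that
    the processing module is further configured to, after detecting a non-key-value storage request, write a second instruction code and a data memory pointer carried in the non-key-value storage request into the fields defined by the protocol to form a third storage request instruction sequence, wherein the second instruction code is a standard instruction code of the protocol; and
    the communication module is further configured to interact with the storage controller, so that the storage controller obtains the third storage request instruction sequence.
  24. The host according to any one of claims 17-23, characterized in that the protocol comprises the non-volatile memory express (NVMe) protocol;
    wherein the NVMe protocol defines bytes 0-63 as the fields of a storage request instruction sequence.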
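  25. A storage controller, characterized in that the storage controller comprises a front-end communication module, a back-end communication module, a processing module, and a control module;
    the front-end communication module is configured to obtain a first storage request instruction sequence from a host, wherein the first storage request instruction sequence is formed by the host writing a first instruction code and address information carried in a key-value storage request into fields defined by a protocol, and the first instruction code is an instruction code defined according to a reserved extension field of the protocol;
    the processing module is configured to separate the first instruction code and the address information from the first storage request instruction sequence; and
    the control module is configured to perform, through the back-end communication module, the operation corresponding to the first instruction code on a storage device according to the first instruction code and the address information.
  26. The storage controller according to claim 25, characterized in that the front-end communication module is specifically configured to:
    read the first storage request instruction sequence from memory that the host allocated for the first storage request instruction sequence according to the fields defined by the protocol.
  27. The storage controller according to claim 25 or 26, characterized in that the key-value storage request comprises: a write-data request, or a get-data request, or a delete-data request, or a discard-data request;
    wherein the address information carried in the write-data request comprises the addresses of where the key and the value are stored;
    the address information carried in the get-data request comprises address information of where the key is stored;
    the address information carried in the delete-data request comprises address information of where the key is stored; and
    the address information carried in the discard-data request comprises address information of where the key is stored.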
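  28. The storage controller according to claim 27, characterized in that, if the key-value storage request is a get-data request,
    the processing module is further configured to obtain the length of the value requested by the get-data request before the front-end communication module obtains the first storage request instruction sequence;
    the communication module is further configured to send the length of the value to the host, so that the host allocates memory for the value according to the length of the value; and
    the control module is specifically configured to:
    read data into the memory allocated by the host for the value according to the first instruction code and the address information.
  29. The storage controller according to claim 28, characterized in that the processing module is specifically configured to:
    obtain, through the front-end communication module, a second storage request instruction sequence, wherein the second storage request instruction sequence is formed by the host writing an instruction code for obtaining the length of the value corresponding to a key, and the address information of where the key is stored, into the fields defined by the protocol, and the instruction code for obtaining the length of the value corresponding to the key is an instruction code defined according to the reserved extension field of the protocol;
    separate the instruction code for obtaining the length of the value corresponding to the key, and the address information of where the key is stored, from the second storage request instruction sequence; and
    obtain, through the control module and the back-end communication module, the length of the value from the storage device according to the instruction code for obtaining the length of the value corresponding to the key and the address information of where the key is stored.
  30. The storage controller according to any one of claims 25-29, characterized in that the key-value storage request is an aggregated key-value storage request comprising multiple single key-value storage requests; wherein
    the address information carried in the key-value storage request is indexed through address information of a scatter gather list, and the scatter gather list contains the address information carried in each of the multiple single key-value storage requests.
  31. The storage controller according to any one of claims 25-30, characterized in that
    the front-end communication module is further configured to obtain a third storage request instruction sequence from the host, wherein the third storage request instruction sequence is formed by the host writing a second instruction code and a data memory pointer carried in a non-key-value storage request into the fields defined by the protocol, and the second instruction code is a standard instruction code of the protocol;
    the processing module is further configured to separate the second instruction code and the data memory pointer from the third storage request instruction sequence; and
    the control module is further configured to perform, through the back-end communication module, the operation corresponding to the second instruction code on the storage device according to the second instruction code and the data memory pointer.
  32. The storage controller according to any one of claims 25-31, characterized in that
    the protocol comprises the non-volatile memory express (NVMe) protocol;
    wherein the NVMe protocol defines bytes 0-63 as the fields of a storage request instruction sequence.
  33. A key-value storage system, characterized in that the key-value storage system comprises a storage device, the host according to any one of claims 17-24, and the storage controller according to any one of claims 25-32.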
PCT/CN2017/085983 2016-08-31 2017-05-25 键值存储方法、装置及系统 Ceased WO2018040629A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP17844936.9A EP3495970B1 (en) 2016-08-31 2017-05-25 Key-value storage method, apparatus and system
US16/287,826 US11048642B2 (en) 2016-08-31 2019-02-27 Key-value storage method, apparatus, and system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610794448.4 2016-08-31
CN201610794448.4A CN106469198B (zh) 2016-08-31 2016-08-31 键值存储方法、装置及系统

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/287,826 Continuation US11048642B2 (en) 2016-08-31 2019-02-27 Key-value storage method, apparatus, and system

Publications (1)

Publication Number Publication Date
WO2018040629A1 true WO2018040629A1 (zh) 2018-03-08

Family

ID=58230478

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/085983 Ceased WO2018040629A1 (zh) 2016-08-31 2017-05-25 键值存储方法、装置及系统

Country Status (4)

Country Link
US (1) US11048642B2 (zh)
EP (1) EP3495970B1 (zh)
CN (1) CN106469198B (zh)
WO (1) WO2018040629A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119623364A (zh) * 2025-02-13 2025-03-14 沐曦集成电路(上海)股份有限公司 基于键值数据库的一对一延时调整方法、电子设备和介质

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106469198B (zh) * 2016-08-31 2019-10-15 华为技术有限公司 键值存储方法、装置及系统
CN107357523B (zh) * 2017-06-27 2021-06-15 联想(北京)有限公司 一种数据处理方法及电子设备
CN107479833B (zh) * 2017-08-21 2020-04-17 中国人民解放军国防科技大学 一种面向键值存储的远程非易失内存访问与管理方法
CN107678685B (zh) * 2017-09-11 2020-01-17 清华大学 基于闪存的存储路径优化的键值存储管理方法
EP3531666B1 (en) * 2017-12-26 2021-09-01 Huawei Technologies Co., Ltd. Method for managing storage devices in a storage system, and storage system
US10715499B2 (en) * 2017-12-27 2020-07-14 Toshiba Memory Corporation System and method for accessing and managing key-value data over networks
CN110275990B (zh) * 2018-03-14 2021-04-23 北京忆芯科技有限公司 Kv存储的键与值的生成方法及装置
CN110324381B (zh) * 2018-03-30 2021-08-03 北京忆芯科技有限公司 云计算与雾计算系统中的kv存储设备
CN109388596B (zh) * 2018-09-29 2019-12-31 上海依图网络科技有限公司 一种数据操作方法和装置
CN109711178B (zh) * 2018-12-18 2021-02-19 北京城市网邻信息技术有限公司 一种键值对的存储方法、装置、设备及存储介质
CN111190844A (zh) * 2019-12-31 2020-05-22 杭州华澜微电子股份有限公司 一种协议转化方法及电子设备
CN111371848A (zh) * 2020-02-21 2020-07-03 苏州浪潮智能科技有限公司 一种请求处理方法、装置、设备及存储介质
CN111399771B (zh) * 2020-02-28 2023-01-10 苏州浪潮智能科技有限公司 一种mcs存储系统的协议配置方法、装置及设备
CN112579003B (zh) * 2020-12-15 2022-06-14 浙江大华技术股份有限公司 键值对的调整方法、装置、存储介质以及电子装置
CN116991338B (zh) * 2023-09-28 2023-12-22 北京超弦存储器研究院 访问数据的方法及控制器、cxl内存模组和存储系统
CN120029963B (zh) * 2025-04-22 2025-11-04 山东云海国创云计算装备产业创新中心有限公司 控制数据交互方法、装置、计算机程序产品、设备及介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103955440A (zh) * 2013-12-18 2014-07-30 记忆科技(深圳)有限公司 一种非易失存储设备及其进行数据操作的方法
CN104111907A (zh) * 2014-06-27 2014-10-22 华为技术有限公司 一种访问NVMe存储设备的方法和NVMe存储设备
US20150254003A1 (en) * 2014-03-10 2015-09-10 Futurewei Technologies, Inc. Rdma-ssd dual-port unified memory and network controller
CN106469198A (zh) * 2016-08-31 2017-03-01 华为技术有限公司 键值存储方法、装置及系统

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7457897B1 (en) * 2004-03-17 2008-11-25 Suoer Talent Electronics, Inc. PCI express-compatible controller and interface for flash memory
JP5524144B2 (ja) * 2011-08-08 2014-06-18 株式会社東芝 key−valueストア方式を有するメモリシステム
US10592106B2 (en) 2013-03-20 2020-03-17 Amazon Technologies, Inc. Replication target service

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3495970A4 *

Also Published As

Publication number Publication date
CN106469198A (zh) 2017-03-01
EP3495970A4 (en) 2019-07-31
CN106469198B (zh) 2019-10-15
EP3495970A1 (en) 2019-06-12
US20190196976A1 (en) 2019-06-27
US11048642B2 (en) 2021-06-29
EP3495970B1 (en) 2023-04-05

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17844936

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2017844936

Country of ref document: EP

Effective date: 20190308