WO2020042850A1 - 数据的存储方法、装置和存储系统 - Google Patents

数据的存储方法、装置和存储系统 Download PDF

Info

Publication number
WO2020042850A1
WO2020042850A1 PCT/CN2019/098256 CN2019098256W WO2020042850A1 WO 2020042850 A1 WO2020042850 A1 WO 2020042850A1 CN 2019098256 W CN2019098256 W CN 2019098256W WO 2020042850 A1 WO2020042850 A1 WO 2020042850A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
target
storage
valid data
storage area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2019/098256
Other languages
English (en)
French (fr)
Inventor
王英
赵小宝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to EP19855114.5A priority Critical patent/EP3839716B1/en
Publication of WO2020042850A1 publication Critical patent/WO2020042850A1/zh
Priority to US17/183,657 priority patent/US12008263B2/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614Improving the reliability of storage systems
    • G06F3/0619Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1044Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices with specific ECC/EDC distribution
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1068Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices in sector programmable memories, e.g. flash disk
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operations
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1458Management of the backup or restore process
    • G06F11/1469Backup restoration techniques
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/20Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2053Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
    • G06F11/2094Redundant storage or storage space
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/0223User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023Free address space management
    • G06F12/0238Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory
    • G06F12/0246Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory in block erasable memory, e.g. flash memory
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/0223User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023Free address space management
    • G06F12/0253Garbage collection, i.e. reclamation of unreferenced memory
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/064Management of blocks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/0644Management of space entities, e.g. partitions, extents, pools
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647Migration mechanisms
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0652Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0683Plurality of storage devices
    • G06F3/0688Non-volatile semiconductor memory arrays
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0683Plurality of storage devices
    • G06F3/0689Disk arrays, e.g. RAID, JBOD
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10Digital recording or reproducing
    • G11B20/12Formatting, e.g. arrangement of data block or words on the record carriers
    • G11B20/1217Formatting, e.g. arrangement of data block or words on the record carriers on discs
    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03MCODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/03Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words
    • H03M13/05Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words using block codes, i.e. a predetermined number of check bits joined to a predetermined number of information bits
    • H03M13/13Linear codes
    • H03M13/15Cyclic codes, i.e. cyclic shifts of codewords produce other codewords, e.g. codes defined by a generator polynomial, Bose-Chaudhuri-Hocquenghem [BCH] codes
    • H03M13/151Cyclic codes, i.e. cyclic shifts of codewords produce other codewords, e.g. codes defined by a generator polynomial, Bose-Chaudhuri-Hocquenghem [BCH] codes using error location or error correction polynomials
    • H03M13/154Error and erasure correction, e.g. by using the error and erasure locator or Forney polynomial
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/82Solving problems relating to consistency
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0659Command handling arrangements, e.g. command buffers, queues, command scheduling
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0673Single storage device
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10Digital recording or reproducing
    • G11B20/12Formatting, e.g. arrangement of data block or words on the record carriers
    • G11B20/1217Formatting, e.g. arrangement of data block or words on the record carriers on discs
    • G11B2020/1218Formatting, e.g. arrangement of data block or words on the record carriers on discs wherein the formatting concerns a specific area of the disc
    • G11B2020/1238Formatting, e.g. arrangement of data block or words on the record carriers on discs wherein the formatting concerns a specific area of the disc track, i.e. the entire a spirally or concentrically arranged path on which the recording marks are located
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10Digital recording or reproducing
    • G11B20/12Formatting, e.g. arrangement of data block or words on the record carriers
    • G11B2020/1291Formatting, e.g. arrangement of data block or words on the record carriers wherein the formatting serves a specific purpose
    • G11B2020/1292Enhancement of the total storage capacity

Definitions

  • the present application relates to the field of information technology, and more particularly, to a method, an apparatus, and a storage system for storing data.
  • erasure coding In storage systems, data security is a major indicator of storage system performance. In order to enable data to be safely stored in a storage system, many data protection mechanisms are provided in the prior art.
  • the principle of erasure coding is to use erasure coding (EC) coding to encode the original data to generate multiple data fragments, and store the data fragments in multiple memories. After a memory failure causes data in the memory to be lost, the above-mentioned lost data can be recovered through data fragments stored in other memories.
  • mirror storage means data storage in a mirror manner. Store the same data in the master node and the standby node of the storage system respectively. In this way, when a memory failure in the master node causes data loss, the above-mentioned lost data can be copied from the standby node to other memories in the master node. .
  • the lost data is recovered by the EC method or the mirror method, the recovered data is randomly stored in the storage area of the memory, resulting in very different time to expiration of the data stored in the final storage area. Big.
  • the garbage collection (GC) mechanism the storage area is used as the basic unit of garbage collection.
  • the valid data in the storage area needs to be migrated to other storage areas first. Only then can garbage collection be performed on the storage area to be recycled, which causes a lot of time for garbage collection and the efficiency is very low.
  • the present application provides a data storage method and device, which is beneficial to reduce the time occupied by garbage collection and improve the efficiency of garbage collection.
  • a data storage method is provided.
  • the method is applied to a storage system, where the storage system includes at least one first storage, the storage system further includes a second storage, and the at least one first storage includes A plurality of storage areas, each of the plurality of storage areas being a unit for garbage collection, the method includes: after a data failure occurs in the second storage, obtaining a plurality of target valid data to be recovered is invalidated Time, and recovery of the plurality of target valid data, the expiration time is used to indicate the time when valid data becomes invalid data; selecting a target storage area from the plurality of storage areas, and At least a part of the target valid data is stored in the target storage area, wherein after the at least part of the target valid data is stored in the target storage area, The length of time between the earliest failure time and the latest failure time in the failure time is less than or equal to the The length of time.
  • the target storage area Before storing multiple target valid data in the target storage area, the target storage area may be a blank storage area without stored data, or a storage area in which a part of data has been stored.
  • the valid data stored in the target valid data may include at least part of the target valid data in the multiple target valid data, and may also include other data in addition to the at least part of the target valid data, and the other valid data may be the target storage.
  • the data stored in the area before storing the at least part of the target valid data may also be newly written data after the target storage area stores the at least part of the target valid data.
  • a time length between the earliest expiration time and the latest expiration time is less than or equal to a preset time length, and is specifically divided into the following three cases.
  • Case 1 If valid data is already stored in the target storage area before at least part of the target valid data is saved, after saving at least part of the target valid data this time, the valid data saved this time and the existing valid data
  • the length of time between the earliest expiration time of the valid data and the latest expiration time of the valid data is less than or equal to a preset time length.
  • Case two if there is no valid data in the target storage area before at least part of the target valid data is stored this time. In the valid data stored this time, the time length between the earliest expiration time of the valid data and the latest expiration time of the valid data is less than or equal to a preset time length.
  • Case 3 If at least part of the target valid data is stored in the target storage area and other valid data continues to be stored in the target storage area, the set of the at least part of the target valid data and the above other valid data is stored in the set: The length of time between the earliest expiration time of the valid data and the latest expiration time of the valid data is less than or equal to a preset time length.
  • case three can also be combined, for example, case three can be combined with case one and case two, respectively.
  • case three and case one are combined, three parts of data can be stored in the target storage area, that is, valid data stored in the target storage area before at least part of the target valid data, at least part of the target valid data itself, and at least part of the target Other valid data stored in the target storage area after the valid data, that is, at least part of the valid data stored in the target storage area before the valid data, at least part of the target valid data itself, and at least part of the target valid data after being stored in the target storage In the set composed of other valid data in the area, a time length between the earliest failure time and the latest failure time of the valid data is less than or equal to a preset time length.
  • the target storage area by limiting the time length between the earliest expiration time and the latest expiration time of valid data stored in the target storage area to be less than or equal to a preset time length, the target storage area The invalidation time of the stored valid data is relatively concentrated. In the process of garbage collection with the target storage area, it is beneficial to reduce the time occupied by the migration of valid data and improve the efficiency of garbage collection.
  • the storage area does not need to be recycled. If the time of garbage collection is later than the latest expiration time, all the data in the storage area becomes invalid data and can be directly recycled without performing the step of migrating valid data. If the garbage collection time is between the earliest expiration time and the latest expiration time, because the expiration time of the data stored in the storage area is more concentrated, from a macro perspective, the effective data that needs to be migrated in each storage area The amount of data is less than the amount of valid data that needs to be migrated in the traditional recovery data storage mechanism.
  • the invalidation times of the valid data stored in the target storage area may all be the same. Thus, if the target storage area is All the data is invalid.
  • garbage collection there is no need to perform the step of migrating valid data, which minimizes the time occupied by migrating valid data during the garbage collection process.
  • data stored in the multiple storage areas is written in a sequential writing manner.
  • the at least one first memory and the second memory are shingled magnetic recording SMR hard disks, and each of the plurality of storage areas is a sequential area Szone.
  • the at least one memory is a solid state drive SSD, and each of the plurality of storage areas is a block.
  • the method further includes: determining a storage order of the multiple target valid data according to the expiration time of each valid data of the multiple target valid data, and the storage order It is used to instruct all the valid data in the multiple target valid data to be stored in the target storage area after all the data is written, and then write all valid data in the next storage area to be stored in the next storage area.
  • the next storage area is a storage area other than the target storage area among the plurality of storage areas; selecting the target storage area from the plurality of storage areas, and And storing at least a part of the target valid data in the target storage area includes: selecting a target storage area from the plurality of storage areas, and storing the at least part of the target valid data in accordance with the storage order. To the target storage area.
  • the target valid data is stored in the target storage area in a storage order, and the storage order is used to indicate that data of the multiple target valid data that is to be stored in the same storage area is written continuously. It avoids that in the process of storing multiple target valid data, the target valid data written adjacently is data stored in different target storage areas, resulting in the need to Switching back and forth in the target storage area is conducive to improving the efficiency of storing target valid data.
  • the head of the SMR hard disk will swing, which will cause the time of data storage to increase, which will seriously affect the performance of SMR hard disk data .
  • the expiration time of any one of the plurality of target valid data is stored in metadata of a data fragment corresponding to the any one of the target valid data, and the one A data fragment corresponding to one target valid data and any one of the target valid data is generated by erasure code encoding on the original data.
  • the expiration time is represented by a data life cycle and data writing time.
  • the life cycle of the above data can be determined according to the service to which the data belongs. Data belonging to different services have their own corresponding life cycles.
  • a data storage device in a second aspect, includes various modules for performing the foregoing method.
  • a storage system includes various modules for performing the foregoing methods.
  • a storage system including at least one processor and at least one memory.
  • the at least one memory is configured to store a computer program
  • the at least one processor is configured to call and run the computer program from the memory, so that the storage system executes the foregoing method.
  • the at least one processor and the at least one memory may be located in multiple different storage nodes, and may also be located in the same storage node.
  • a storage system including at least one processor and at least one memory.
  • the at least one memory is configured to store a computer program
  • the at least one processor is configured to call and run the computer program from the memory, so that the storage system executes the foregoing method.
  • a computer program product includes: computer program code that, when the computer program code runs on a computer, causes the computer to execute the methods in the above aspects.
  • the above computer program code may be stored in whole or in part on a first storage medium, where the first storage medium may be packaged with the processor, or may be packaged separately with the processor. This embodiment of the present application does not deal with this. Specific limitations.
  • a computer-readable medium stores program code, and when the computer program code runs on a computer, the computer causes the computer to execute the methods in the foregoing aspects.
  • FIG. 1 is an architecture diagram of a storage system to which an embodiment of the present application is applied.
  • FIG. 2 is a schematic diagram showing a storage result for storing recovered data based on a random storage mechanism.
  • FIG. 3 is a schematic diagram of a time sequence between failure times.
  • FIG. 4 is a schematic flowchart of a data storage method according to an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a time attribute structure according to an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a storage result of recovered data according to an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a data storage device according to an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a controller according to an embodiment of the present application.
  • a mechanism for managing storage space When the remaining storage space of the storage space is insufficient, the invalid data stored in the storage space can be deleted through the garbage collection mechanism to achieve the purpose of recycling storage resources.
  • the recovered storage resources can be used to store other valid data to improve the use of storage space. rate.
  • Invalid data can be understood as data that needs to be deleted from the storage system. For example, it could be data with an expiration date, or data with an expiration time.
  • Valid data can be understood as data that the storage system needs to continue to store. For example, it can be data that has not yet expired, or data that has not expired.
  • the storage area may be a recycling unit for garbage collection, that is, a basic unit when the memory is garbage collected, for example, it may be an integer multiple of the smallest unit for garbage collection, or the smallest unit for garbage collection for the memory.
  • the basic unit of garbage collection may include at least one sequential zone (Szone), and the size of each sequential zone is usually 256M.
  • the basic unit of garbage collection may include at least one block.
  • FIG. 1 is an architecture diagram of a distributed storage system to which an embodiment of the present application is applied.
  • the distributed storage system 100 shown in FIG. 1 includes a plurality of storage nodes 110 and a controller 120.
  • Multiple storage nodes 110 that is, storage node 1 to storage node n, where each storage node may be provided with multiple memories.
  • m memories are provided, that is, memory 1 to memory m.
  • the storage node can provide storage space for data through respective memories, where n and m are positive integers greater than 1.
  • the foregoing storage node may specifically be a storage node in a distributed storage system, such as a storage server.
  • the storage node may also be a storage node in a cluster storage system.
  • any of the storage nodes may be used as a master node, and other storage nodes may be used as mirror nodes (also called backup nodes) of the master node.
  • the controller 120 is configured to recover the lost data from the data.
  • the above-mentioned controller may be a controller having a control function independently of the storage node, and the above-mentioned controller may also be a controller located in a certain storage node.
  • the foregoing storage system is a distributed system, and the controller may be composed of at least one processor, and the at least one processor may be distributed in at least one storage node (server).
  • the controller may be located on the master node, such as a processor in the master node.
  • the controller can use a data protection mechanism (for example, EC, or mirror storage) to lose the data according to the data stored in other memories in the storage system. Data is restored, and the restored data is randomly stored in the storage system.
  • a data protection mechanism for example, EC, or mirror storage
  • This mechanism of randomly storing the recovered data in the storage system causes the expiration time of the recovered data stored in each target storage area in the storage system to be random.
  • garbage collection usually only a small part of the data in the storage area to be recycled reaches the invalidation time and becomes invalid data. The remaining data is still valid data. At this time, you need to wait for a large number of After valid data is migrated from the storage area to be recycled to other storage areas, garbage collection can be performed on the storage area to be recycled.
  • FIG. 2 shows a schematic diagram of storing the storage result of the recovered data based on the random storage mechanism.
  • the data a to data p are stored in the memory 2 among the multiple memories, where the failure times of the data a, data b, data c, and data d are the first failure times, and the data e, data f, data g, and data
  • the failure time of h is the second failure time
  • the failure time of data i, data j, data k, and data l is the third failure time
  • the failure time of data m, data n, data o, and data p is the fourth failure time.
  • the time sequence between the first failure time, the second failure time, the third failure time, and the fourth failure time is shown in FIG. 3.
  • storage area 1 When the memory 2 fails, causing the data a to p stored in the memory 2 to be lost, it is necessary to recover the lost data in the memory 2 through the data stored in other memories, and randomly store the recovered data to the memory 1
  • storage area 2 storage area 3, and storage area 4
  • storage area 1 finally stores data a, data e, data i, and data n
  • storage area 2 stores Data c, data f, data k, and data p.
  • the storage area 3 stores data d, data h, data 1, and data m
  • the storage area 4 stores data b, data g, data j, and data o.
  • garbage collection is performed on the above 4 storage areas within the time period between the first expiration time and the second expiration time, only data a in storage area 1, data c in storage area 2, and storage area 3
  • the data d in the storage area 4 and the data b in the storage area 4 are invalid.
  • the remaining data are all valid data.
  • the valid data in each storage area needs to be migrated to other storage areas. In this way, a large amount of data is migrated, which causes a lot of time for garbage collection, which is very inefficient.
  • this application provides a new technology for storing and recovering data.
  • the recovery data ie, the target valid data below
  • the recovery data is stored in the same storage area (the following target storage area).
  • garbage collection is performed on a storage area basis. It is beneficial to reduce the time taken to migrate valid data and improve the efficiency of garbage collection.
  • the recovery data is randomly stored in the storage area, which causes the expiration time of the recovery data stored in each storage area to be random.
  • garbage collection using the storage area as a unit Often, only a part of the scattered data in each storage area in the storage system is invalidated, so that the amount of valid data that needs to be migrated is large.
  • the following describes the data storage method in the embodiment of the present application with reference to FIG. 4.
  • the method is applied to a storage system (for example, the storage system shown in FIG. 1).
  • a storage system for example, the storage system shown in FIG. 1.
  • a memory that works normally in a storage system is called a first memory
  • a memory that has a data failure in the storage system is called a second memory, where the normally working memory includes multiple storage areas.
  • FIG. 4 is a schematic flowchart of a data storage method according to an embodiment of the present application.
  • the method shown in FIG. 4 can be executed by a device having a control function in the storage system, for example, the controller shown in FIG. 1.
  • the method shown in FIG. 4 includes steps 410 and 420.
  • the expiration time is used to indicate the time when valid data becomes invalid data. It can be directly stored in the storage system.
  • the expiration time can also be stored in the storage system indirectly through the time attribute.
  • the time attribute can include The data life cycle and the data writing time, wherein the data life cycle is used to indicate the total length of time that the data is stored in the storage system, and the write time is used to indicate the time that the data is written to the storage system. Is the start time of the life cycle. After the total time indicated by the life cycle, the end time of the life cycle is the expiration time.
  • the life cycle of the above data can be determined according to the service to which the data belongs. Data belonging to different services have their own corresponding life cycles.
  • the original data that has not been stored in the storage system is usually EC coded to generate multiple data fragments. Because these multiple data fragments are generated based on the original data, they are being sent to the storage system. When the multiple data fragments are written, the multiple data fragments are continuously written to different memories in the storage system, that is, the write time of the multiple data is also the same. Therefore, the above multiple data The failure time of each shard is the same.
  • the target valid data fragment that is, the target valid data
  • the data in the multiple data fragments stored in the storage system may be lost. Obtain the invalidation time of the target valid data segment in other data segments.
  • the storage system is a distributed storage system
  • the client of the distributed storage system divides the data written by the user into multiple data fragments through EC coding. It is possible to add a life cycle to the metadata of each data slice according to the business to which the business data belongs.
  • each storage node (including a data shard node and a redundant shard node) in the distributed storage system receives at least a part of the data shards from the multiple data shards, it stores the data shards to the storage node.
  • write time can be added to the metadata of each data slice.
  • the storage nodes can directly add the data shard metadata to the data shards during the process of writing the data shards. Expiration time.
  • data needs to be stored not only on the master node, but also on the standby node.
  • the data stored on the standby node is a copy of the data stored on the master node
  • the master node The expiration time of the stored data is the same as the replica data of the data stored in the standby node.
  • the invalidation time of the target valid data can be obtained from the standby node.
  • the recovering the target valid data may include recovering multiple target valid data according to the data stored in the storage system.
  • multiple data fragments stored in the storage system can be EC-decoded to recover the multiple valid data.
  • the specific decoding method can refer to the traditional EC decoding method, which is not described in the embodiment of this application. Detailed description.
  • multiple target valid data can be copied from the standby node to recover the multiple target valid data.
  • the time relationship between the execution process of the recovered data and the selection of the target storage area in step 410 and step 420 is relatively flexible.
  • the execution process of the recovered data may be performed before step 410, and the execution process of the recovered data may also be selected.
  • the target storage area is performed before.
  • the process of recovering data may also be performed after the target storage area is selected.
  • the process of recovering data may also be performed after step 410, which is not specifically limited in this embodiment of the present application.
  • the execution process of the recovered data is performed before step 410, the expiration times of the above-mentioned multiple valid valid data may be directly obtained from the metadata of the recovered data.
  • the target storage area Select a target storage area from the multiple storage areas, and send at least a part of the target valid data from the restored multiple target valid data to a memory where the target storage area is located, where at least a part of the target valid data is stored to After the target storage area, the length of time between the earliest failure time and the latest failure time of the valid data stored in the target storage area is less than or equal to a preset time length.
  • the target storage area Before storing multiple target valid data in the target storage area, the target storage area may be a blank storage area without stored data, or a storage area in which a part of data has been stored.
  • the valid data stored in the target valid data may include at least part of the target valid data in the multiple target valid data, and may also include other data in addition to the at least part of the target valid data, and the other valid data may be the target storage.
  • the data stored in the area before storing the at least part of the target valid data may also be newly written data after the target storage area stores the at least part of the target valid data.
  • a time length between the earliest expiration time and the latest expiration time is less than or equal to a preset time length, and is specifically divided into the following three cases.
  • Case 1 If valid data is already stored in the target storage area before at least part of the target valid data is saved, after saving at least part of the target valid data this time, the valid data saved this time and the existing valid data
  • the length of time between the earliest expiration time of the valid data and the latest expiration time of the valid data is less than or equal to a preset time length.
  • Case two if there is no valid data in the target storage area before at least part of the target valid data is stored this time. In the valid data stored this time, the time length between the earliest expiration time of the valid data and the latest expiration time of the valid data is less than or equal to a preset time length.
  • Case 3 If at least part of the target valid data is stored in the target storage area and other valid data continues to be stored in the target storage area, the set of the at least part of the target valid data and the above other valid data is stored in the set: The length of time between the earliest expiration time of the valid data and the latest expiration time of the valid data is less than or equal to a preset time length.
  • case three can also be combined, for example, case three can be combined with case one and case two, respectively.
  • case three and case one are combined, three parts of data can be stored in the target storage area, that is, valid data stored in the target storage area before at least part of the target valid data, at least part of the target valid data itself, and at least part of the target Other valid data stored in the target storage area after the valid data, that is, at least part of the valid data stored in the target storage area before the valid data, at least part of the target valid data itself, and at least part of the target valid data after being stored in the target storage In the set composed of other valid data in the area, a time length between the earliest failure time and the latest failure time of the valid data is less than or equal to a preset time length.
  • the valid data stored in the target storage area may include some target valid data in the multiple target valid data, and the valid data stored in the target storage area may include all target valid data in the multiple target valid data.
  • the target storage area is originally a blank storage area, at least part of the target valid data can be directly stored in the target storage area. If the target storage area originally contained data, then you need to ensure that the length of time between the earliest expiration time and the latest expiration time of the data to be stored and the stored data expiration time in the target effective storage area is less than or equal to the preset Length of time.
  • the target storage area by limiting the time length between the earliest expiration time and the latest expiration time of valid data stored in the target storage area to be less than or equal to a preset time length, the target storage area The invalidation time of the stored valid data is relatively concentrated. In the process of garbage collection with the target storage area, it is beneficial to reduce the time occupied by the migration of valid data and improve the efficiency of garbage collection.
  • the storage area does not need to be recycled. If the time of garbage collection is later than the latest expiration time, all the data in the storage area becomes invalid data and can be directly recycled without performing the step of migrating valid data. If the garbage collection time is between the earliest expiration time and the latest expiration time, because the expiration time of the data stored in the storage area is more concentrated, from a macro perspective, the effective data that needs to be migrated in each storage area The amount of data is less than the amount of valid data that needs to be migrated in the traditional recovery data storage mechanism.
  • the expiration time of the valid data stored in the target storage area can be all the same, that is, the data stored in the target storage area either becomes invalid data at the same time, or it is all valid data. In this way, if all the data in the target storage area is valid Invalidation, when garbage collection is needed, there is no need to perform the step of migrating valid data, which minimizes the time occupied by migrating valid data during the garbage collection process.
  • the data with the same expiration time may be data with the same life cycle and write time, and the data with the same expiration time may be data with different write time and different life cycle, but the end of the life cycle Data with the same time, for example, data with the same remaining life cycle and the same time to recover the data, where the remaining life cycle indicates the remaining effective time of the data after recovery, that is, starting from the time to recover the data, after the remaining life cycle, the remaining The end of the life cycle is the expiration time of the data.
  • FIG. 6 is a schematic diagram of a storage result of recovered data according to an embodiment of the present application.
  • a life period (period) indicated by p and a write time (t) indicated by t indicate a data expiration time.
  • the failure time of the data a, data b, data c, and data d is the first failure time, which is represented by (p1, t1); data e
  • the data, f, data g, and data h failure times are the second failure time, represented by (p1, t2);
  • the data i, data j, data k, and data l failure times are the third failure time, using (p2, t1) indicates;
  • the failure time of data m, data n, data o, and data p is the fourth failure time, which is represented by (p2, t2).
  • the same storage strategy can be used to recover the lost data in memory 2 and store the recovered data to storage area 1 and memory 3 in memory 1, respectively.
  • the plurality of storage areas may be located in one or more first memories.
  • the at least one first storage and the second storage are located in one storage node, and the at least one first storage and the second storage may also be located in different storage nodes, which is not limited in the embodiment of the present application.
  • the memory for storing the invalidation time of multiple target valid data may be the same memory as the memory for storing the target valid data (that is, the memory where the target storage area is located);
  • the storage of valid data (that is, the storage where the target storage area is located) is a different storage, which is not limited in the embodiment of the present application.
  • the storage node where the memory storing the expiration time may be located in the same storage node as the at least one first storage, and the storage node where the memory storing the expiration time may be different from the storage node where the at least one first storage is located.
  • the storage node where the controller performing steps 410 and 420 is located may be the same storage node as the storage node where the target storage area is located. That is, after a data failure occurs in the second storage in the storage node, the storage node The controller in the server stores the target valid data to the target storage area by performing the above steps 410 and 420.
  • the storage node on which the controller performing steps 410 and 420 is located may also be a storage node that is different from the storage node on which the target storage area is located, that is, the controller may send at least part of the target valid data after recovery to the target The storage node where the storage area is located, and the storage node where the target storage area is located stores at least part of the target valid data to the target storage area.
  • the storage node where the controller performing step 410 and step 420 is located may be the same storage node as the storage node where the second storage is located, and the storage node where the controller performing step 410 and step 420 is located may be the same as the second storage Storage nodes are different storage nodes.
  • the storage area for storing the data for recovering the target valid data may be in the same storage node as the target storage area, or even in the same memory.
  • the storage area for storing the data for recovering the target valid data may be located at a different storage node from the target storage area.
  • the data of the same data stream will be naturally written into continuous storage space.
  • the expiration time of the data stored in the target storage area is usually the same. Even if the expiration time of the data stored in each target storage area is not exactly the same, it is usually concentrated. In the process of garbage collection based on the target storage area, the amount of effective data migration is usually not large. However, as described above, during the process of storing the recovered data, due to the random storage of the recovered data, the expiration time of the recovered data stored in the target storage area is also random. Therefore, the method in the embodiment of the present application also The same applies to the above-mentioned storage medium that supports sequential writing.
  • the memory mentioned in the embodiment of the present invention may be a “sequential write medium” such as an SSD or an SMR, and may also be an “append” memory.
  • software such as storage management software
  • the additional write memory the newly written data cannot directly overwrite the existing data, for example, the newly written data is added after the existing data in the memory is stored.
  • target valid data in the target storage area In the process of storing target valid data in the target storage area, data that needs to be stored in the same target storage area can usually be written continuously. It avoids that in the process of storing multiple target valid data, the target valid data written adjacently is data stored in different target storage areas, resulting in the need to Switching back and forth in the target storage area is conducive to improving the efficiency of storing target valid data.
  • the head of the SMR hard disk will swing, which will cause the time of data storage to increase, which will seriously affect the performance of SMR hard disk data writing. .
  • the storage order of multiple target valid data may be determined according to the expiration time of multiple target valid data. Based on the storage order, multiple target valid data are prepared to be stored in the same The target effective data of the storage area is continuously written into the target storage area.
  • a storage order of the plurality of target valid data is determined according to the expiration time of each valid data of the plurality of target valid data, and the storage order is used to indicate that the plurality of target valid data is ready to be stored. After all the data in the target storage area is written, all valid data to be stored in the next storage area is written to the next storage area, and the next storage area is the multiple storage areas.
  • the specific sorting method of the above storage order can be used in conjunction with the storage strategy introduced above.
  • the above storage order may specifically indicate that target valid data with the same expiration time is continuously written.
  • the storage policy is to store target valid data with an expiration time within a preset time length threshold in the same storage area, the above storage order may indicate that the target valid data that satisfies the storage policy is continuously written.
  • the restored valid data of the target is stored in the target storage area according to the storage sequence.
  • the above-mentioned process of establishing a storage sequence may also be performed before recovering multiple target valid data.
  • the data stored in a storage area can be aggregated into a data set and temporarily stored according to the expiration time. Then, the target valid data in the data set is sequentially stored in the storage area in units of the data set.
  • the above-mentioned process of establishing the storage sequence is performed before recovering multiple target valid data, at this time, the multiple target valid data fragments have not been recovered, and the purpose of adjusting the storage order may be achieved by adjusting the data recovery order.
  • the method of adjusting the storage order of data is described by taking the same expiration time of valid data stored in the target storage area as an example. Since the expiration time mentioned above, or the time attribute used to indicate the expiration time, can be added to the metadata of the data as a key, the traditional hash algorithm can be reused to valid data for multiple targets The hash of the key is performed. In this way, the data carrying the same key can be hashed into the same bucket, and then the valid data of multiple targets is restored in the unit of the bucket, and the restored target valid data is stored in the corresponding bucket. Destination storage area.
  • the data stored in the second memory needs to be stored for 30 days, and the time attributes (life cycle and write time) stored in the metadata of the data are 1020180602, 1020180601, and 3020180602, respectively.
  • the time attributes of the data that needs to be recovered from the storage system are 1020180602, 1020180601, and 3020180602, respectively, and the above time attributes are hashed respectively as keys.
  • writing on June 2, 2018 requires storage The 10-day data, the data that needs to be stored for 10 days when written on June 1, 2018, and the data that needs to be stored for 30 days when written on June 2, 2018, will be hashed to the hash buckets of 1020180602, 1020180601, and 3020180602 respectively.
  • the data on the buckets 1020180602, 1020180601, and 3020180602 are sequentially restored, so that the data stored in the same storage area can be written one after another.
  • FIG. 7 is a schematic diagram of a data storage device according to an embodiment of the present application.
  • the apparatus 700 shown in FIG. 7 may be applied to a storage system, where the storage system includes at least one first memory, the storage system further includes a second memory, the at least one first memory includes multiple storage areas, and the multiple Each of the storage areas is a unit for garbage collection.
  • the apparatus 700 may include a processing unit 710.
  • a processing unit 710 configured to obtain the invalidation time of a plurality of target valid data to be recovered after a data failure occurs in the second memory, where the invalidation time is used to indicate a time when valid data becomes invalid data;
  • the processing unit 710 is further configured to select a target storage area from the plurality of storage areas, and store at least a part of the target valid data of the restored plurality of target valid data to the target storage area, where After the at least part of the target valid data is stored in the target storage area, the length of time between the earliest invalidation time and the latest invalidation time of valid data stored in the target storage area is less than or equal to The preset length of time.
  • the target storage area by limiting the time length between the earliest expiration time and the latest expiration time of valid data stored in the target storage area to be less than or equal to a preset time length, the target storage area The invalidation time of the stored valid data is relatively concentrated. In the process of garbage collection with the target storage area, it is beneficial to reduce the time occupied by the migration of valid data and improve the efficiency of garbage collection.
  • the expiration time of the valid data stored in the target storage area is the same.
  • data stored in the multiple storage areas is written in a sequential writing manner.
  • the at least one first memory and the second memory are shingled magnetic recording SMR hard disks, and each of the plurality of storage areas is a sequential area Szone.
  • the at least one memory is a solid state drive SSD, and each of the plurality of storage areas is a block.
  • the processing unit is further configured to determine a storage order of the plurality of target valid data according to the expiration time of each valid data in the plurality of target valid data.
  • the storage order is used to indicate that data of the plurality of target valid data that is to be stored in the same storage area is written continuously; and selecting a target storage area from the plurality of storage areas, and in accordance with the storage order, The at least part of the target valid data is stored in the target storage area.
  • the expiration time of any one of the plurality of target valid data is stored in metadata of a data fragment corresponding to the any one of the target valid data, and the one A data fragment corresponding to one target valid data and any one of the target valid data is generated by erasure code encoding on the original data.
  • the expiration time is represented by a data life cycle and data writing time.
  • the foregoing apparatus 700 may also be a controller 800.
  • the processing unit 710 may be at least one processor 820.
  • the controller 800 may further include at least one memory 810 and at least one input / output interface 830, as shown in FIG. 8.
  • FIG. 8 is a schematic diagram of a controller according to an embodiment of the present application.
  • the controller 800 shown in FIG. 8 may include at least one memory 810, at least one processor 820, and at least one input / output interface 830. Among them, at least one memory 810, at least one processor 820, and at least one input / output interface 830 are connected through a communication connection.
  • the at least one memory 810 is used to store program instructions, and the at least one processor 820 is used to execute the memory 820 storage. Program instructions to control at least one input / output interface 830 to receive input data and information, and output data such as operation results.
  • the memory 810 may include a read-only memory and a random access memory, and provide instructions and data to the processor 820.
  • a portion of the processor 820 may also include non-volatile random access memory.
  • the processor 820 may also store information of a device type.
  • each step of the above method may be completed by using hardware integrated logic circuits or instructions in the form of software in the processor 820.
  • the method disclosed in combination with the embodiments of the present application may be directly implemented by a hardware processor, or may be performed by a combination of hardware and software modules in the processor.
  • the software module may be located in a mature storage medium such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, or an electrically erasable programmable memory, a register, and the like.
  • the storage medium is located in the memory 810, and the processor 820 reads information in the memory 810 and completes the steps of the foregoing method in combination with its hardware. To avoid repetition, it will not be described in detail here.
  • the processor may be a central processing unit (CPU), and the processor may also be other general-purpose processors, digital signal processors (DSPs), and special-purpose integrations.
  • Circuit application specific integrated circuit, ASIC
  • ready-made programmable gate array field programmable gate array, FPGA
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • An embodiment of the present application further provides a storage system.
  • the storage system includes at least one first storage, and the storage system further includes a second storage.
  • the at least one first storage includes multiple storage areas, and each of the multiple storage areas is garbage collected. unit.
  • the storage system may include a processing unit and a processing unit.
  • a processing unit configured to obtain the invalidation time of a plurality of target valid data to be recovered after a data failure occurs in the second memory, where the invalidation time is used to indicate a time when valid data becomes invalid data;
  • a processing unit configured to select a target storage area from the plurality of storage areas, and store at least a part of the target valid data of the restored plurality of target valid data to the target storage area, where After at least part of the target valid data is stored in the target storage area, a length of time between the earliest failure time and the latest failure time of the valid data stored in the target storage area is less than or equal to a preset time length.
  • the above storage system involves three types of storage nodes, that is, the storage node where the second memory where the data failure occurs, the storage node where the target storage area is located, and the storage node where the processing unit and processing unit are located.
  • the above three types of storage nodes may be the same storage node (for convenience of description, hereinafter referred to as the first storage node).
  • the controller in the first storage node After a data failure occurs in the second storage in the first storage node, the controller in the first storage node The restored at least part of the target valid data is then stored in a target storage area in the first storage node.
  • the above storage system includes at least one storage node, that is, the first storage node.
  • the above three types of storage nodes may also be different storage nodes.
  • the storage node where the second storage is located and the processing unit and the storage node where the processing unit is located may be the same storage node (also referred to as a second storage node). )
  • the second storage node and the storage node where the target storage area is located are different storage nodes, that is, after the second storage node in the second storage node has a data failure, the second storage node
  • the processing unit and the processing unit may perform data recovery on multiple target valid data, and send at least a part of the target valid data among the restored multiple target valid data to a third storage node, so that the third storage node sends at least a part of the above.
  • the target valid data is stored in the target storage area.
  • the above storage system includes at least two storage nodes, that is, a second storage node and a third storage node.
  • the target storage area by limiting the time length between the earliest expiration time and the latest expiration time of valid data stored in the target storage area to be less than or equal to a preset time length, the target storage area The invalidation time of the stored valid data is relatively concentrated. In the process of garbage collection with the target storage area, it is beneficial to reduce the time occupied by the migration of valid data and improve the efficiency of garbage collection.
  • the expiration time of the valid data stored in the target storage area is the same.
  • data stored in the multiple storage areas is written in a sequential writing manner.
  • the at least one first memory and the second memory are shingled magnetic recording SMR hard disks, and each of the plurality of storage areas is a sequential area Szone.
  • the at least one memory is a solid state drive SSD, and each of the plurality of storage areas is a block.
  • the processing unit is further configured to determine a storage order of the plurality of target valid data according to the expiration time of each valid data in the plurality of target valid data.
  • the storage order is used to indicate that data of the plurality of target valid data that is to be stored in the same storage area is written continuously; and selecting a target storage area from the plurality of storage areas, and in accordance with the storage order, The at least part of the target valid data is stored in the target storage area.
  • the expiration time of any one of the plurality of target valid data is stored in metadata of a data fragment corresponding to the any one of the target valid data, and the one A data fragment corresponding to one target valid data and any one of the target valid data is generated by erasure code encoding on the original data.
  • the expiration time is represented by a data life cycle and data writing time.
  • the size of the sequence numbers of the above processes does not mean the order of execution.
  • the execution order of each process should be determined by its function and internal logic, and should not deal with the embodiments of the present application.
  • the implementation process constitutes any limitation.
  • the disclosed systems, devices, and methods may be implemented in other ways.
  • the device embodiments described above are only schematic.
  • the division of the unit is only a logical function division.
  • multiple units or components may be combined or Can be integrated into another system, or some features can be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, which may be electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objective of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each of the units may exist separately physically, or two or more units may be integrated into one unit.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium, for example, the computer instructions may be from a website site, computer, server, or data center Transmission by wire (such as coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (such as infrared, wireless, microwave, etc.) to another website site, computer, server, or data center.
  • the computer-readable storage medium may be any available medium that can be read by a computer or a data storage device such as a server, a data center, and the like that includes one or more available medium integration.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, a magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, a solid state disk (SSD) )Wait.
  • a magnetic medium for example, a floppy disk, a hard disk, a magnetic tape
  • an optical medium for example, a digital video disc (DVD)
  • DVD digital video disc
  • SSD solid state disk

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Mathematical Physics (AREA)
  • Computer Security & Cryptography (AREA)
  • Signal Processing (AREA)
  • Algebra (AREA)
  • Pure & Applied Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

一种数据的存储方法、装置和存储系统,该方法应用于存储系统,存储系统包括至少一个第一存储器,存储系统还包括第二存储器,至少一个第一存储器包括多个存储区域,多个存储区域中每个存储区域是进行垃圾回收的单元,该方法通过限定目标存储区域中存储的有效数据的失效时间中最早的失效时间与最晚的失效时间之间的时间长度小于或等于预设的时间长度,使得目标存储区域中存储的有效数据的失效时间比较集中,在以目标存储区域进行垃圾回收的过程中,有利于减少迁移有效数据占用的时间,以提高垃圾回收的效率。

Description

数据的存储方法、装置和存储系统 技术领域
本申请涉及信息技术领域,并且更具体地,涉及数据的存储方法、装置和存储系统。
背景技术
在存储系统中,数据安全是衡量存储系统性能的一大指标。为了使得数据能够安全地被存储在存储系统中,现有技术中提供了很多数据保护机制。例如,纠删码原理,即采用纠删码(erasure coded,EC)编码的方式对原始数据进行编码生成多个数据分片,并将数据分片存储至多个存储器中,当多个存储器中有一个存储器故障,导致该存储器中的数据丢失后,可以通过其他存储器中存储的数据分片,对上述丢失的数据进行恢复。又例如,镜像存储,即以镜像的方式进行数据存储。在存储系统的主节点和备节点中分别存储一份相同的数据,如此,当主节点中的一个存储器故障导致数据丢失后,可以从备节点中将上述丢失的数据复制到主节点的其他存储器中。
然而,无论是通过EC的方式还是镜像的方式对丢失的数据进行恢复时,恢复后的数据都是随机的存储在存储器的存储区域中的,导致最终存储区域中存储的数据的失效时间差别非常大。如此,通过垃圾回收(garbage collection,GC)机制,以存储区域作为垃圾回收的基本单元,回收存储区域中无效数据占用的存储区域时,需要先将该存储区域中的有效数据迁移至其他存储区域后,才能对待回收的存储区域进行垃圾回收,导致垃圾回收占用了大量的时间,效率非常低。
发明内容
本申请提供一种数据的存储方法和装置,有利于降低垃圾回收占用的时间,提高垃圾回收的效率。
第一方面,提供了一种数据的存储方法,所述方法应用于存储系统,所述存储系统包括至少一个第一存储器,所述存储系统还包括第二存储器,所述至少一个第一存储器包括多个存储区域,所述多个存储区域中每个存储区域是进行垃圾回收的单元,所述方法包括:在所述第二存储器出现数据故障后,获取待恢复的多个目标有效数据的失效时间,以及对所述多个目标有效数据进行恢复,所述失效时间用于指示有效数据变为无效数据的时间;从所述多个存储区域中选择目标存储区域,并将恢复后的所述多个目标有效数据中的至少部分目标有效数据存储至所述目标存储区域,其中,在把所述至少部分目标有效数据存储至所述目标存储区域之后,所述目标存储区域存储的有效数据的失效时间中最早的失效时间与最晚的失效时间之间的时间长度小于或等于预设的时间长度。
在将多个目标有效数据存储目标存储区域之前,目标存储区域可以是没有存储数据的空白的存储区域,还可以是已经存储了一部分数据的存储区域。
上述目标有效数据存储的有效数据,可以包括上述多个有目标有效数据中的至少部分目标有效数据,还可以包括除上述至少部分目标有效数据之外的其他数据,该其他有效数据可以是目标存储区域在存储上述至少部分目标有效数据之前存储有的数据,还可以是在 目标存储区域存储了上述至少部分目标有效数据之后,通过追加写的方式新写入的数据。
上述目标存储区域中存储的有效数据的失效时间中:最早的失效时间与最晚的失效时间之间的时间长度小于或等于预设的时间长度,具体分为以下三种情况。
情况一:若在本次存入至少部分目标有效数据之前,目标存储区域已存有有效数据,则在本次存入至少部分目标有效数据之后,本次存入的有效数据和已有的有效数据组成的集合中:有效数据的最早的失效时间与有效数据的最晚的失效时间之间的时间长度,小于或等于预设的时间长度。
情况二:若在本次存入至少部分目标有效数据之前,所述目标存储区域不存在有效数据。本次存入的有效数据中,有效数据的最早的失效时间与有效数据的最晚的失效时间之间的时间长度,小于或等于预设的时间长度。
情况三:若在本次存入至少部分目标有效数据之后,所述目标存储区域中又继续存储了其他有效数据,则本次存储的至少部分目标有效数据和上述其他有效数据组成的集合中:有效数据的最早失效时间与有效数据的最晚的失效时间之间的时间长度,小于或等于预设的时间长度。
需要说明的是,上述三种情况还可以结合,例如,情况三可以分别和情况一、情况二结合。若情况三和情况一结合时,目标存储区域中可以存储了三部分数据,即在至少部分目标有效数据之前存储至目标存储区域中的有效数据,至少部分目标有效数据本身,以及在至少部分目标有效数据之后存储至目标存储区域中的其他有效数据,即至少部分目标有效数据之前存储至目标存储区域中的有效数据,至少部分目标有效数据本身,以及在至少部分目标有效数据之后存储至目标存储区域中的其他有效数据组成的集合中,有效数据的最早的失效时间与最晚的失效时间之间的时间长度小于或等于预设的时间长度。
在本申请实施例中,通过限定目标存储区域中存储的有效数据的失效时间中最早的失效时间与最晚的失效时间之间的时间长度小于或等于预设的时间长度,使得目标存储区域中存储的有效数据的失效时间比较集中,在以目标存储区域进行垃圾回收的过程中,有利于减少迁移有效数据占用的时间,以提高垃圾回收的效率。
具体地,在以存储区域为单位,进行垃圾回收的过程中,如果垃圾回收的时间早于上述最早的失效时间,那么上述存储区域不需要被回收。如果垃圾回收的时间晚于上述最晚的失效时间,那么上述存储区域中的数据全部变为无效数据,可以直接被回收,不需要执行迁移有效数据的步骤。如果垃圾回收的时间位于最早的失效时间和最晚的失效时间之间,则由于存储区域中存储的数据的失效时间较集中,从宏观的角度看,每个存储区域中需要迁移的有效数据的数据量,少于传统的恢复数据存储机制中需要迁移的有效数据的数据量。
在一种可能的实现方式中,在把所述至少部分目标有效数据存储至所述目标存储区域之后,目标存储区域中存储的有效数据的失效时间还可以全部相同,如此,若目标存储区域中的数据全部失效,需要进行垃圾回收时,无需再执行迁移有效数据的步骤,最大程度的减少了垃圾回收过程中,迁移有效数据占用的时间。
在一种可能的实现方式中,所述多个存储区域中存储的数据通过顺序写的方式写入。
在一种可能的实现方式中,所述至少一个第一存储器以及所述第二存储器是叠瓦式磁记录SMR硬盘,所述多个存储区域中的每个存储区域是顺序区Szone。
在一种可能的实现方式中,所述至少一个存储器是固态硬盘SSD,所述多个存储区域中的每个存储区域是块block。
在一种可能的实现方式中,所述方法还包括:根据所述多个目标有效数据中每个有效数据的所述失效时间,确定所述多个目标有效数据的存储顺序,所述存储顺序用于指示所述多个目标有效数据中准备存储至所述目标存储区域中的数据被全部写入后,再向下一个存储区域中写准备存储至所述下一个存储区域中的全部有效数据,所述下一个存储区域为所述多个存储区域中除所述目标存储区域之外的存储区域;所述从所述多个存储区域中选择目标存储区域,并将恢复后的所述多个目标有效数据中的至少部分目标有效数据存储至所述目标存储区域,包括:从所述多个存储区域中选择目标存储区域,并按照所述存储顺序,将所述至少部分目标有效数据存储至所述目标存储区域。
在本申请实施例中,按照存储顺序将至少部分目标有效数据存储至目标存储区域,存储顺序用于指示所述多个目标有效数据中准备存储在同一存储区域的数据被连续地写入。避免了在存储多个目标有效数据的过程中,相邻写入的目标有效数据是存储在不同目标存储区域中的数据,导致在存储相邻写入的目标有效数据的过程中需要在不同的目标存储区域中来回切换,有利于提高存储目标有效数据的效率。尤其,对于SMR硬盘而言,当存储相邻写入的目标有效数据分别在不同的目标存储空间中,会造成SMR硬盘的磁头摆动,导致存储数据的时间增长,严重影响SMR硬盘写数据的性能。
在一种可能的实现方式中,所述多个目标有效数据中任一个目标有效数据的所述失效时间,存储在所述任一个目标有效数据对应的数据分片的元数据中,所述任一个目标有效数据与所述任一个目标有效数据对应的数据分片是对原始数据通过纠删码编码生成的。
在一种可能的实现方式中,所述失效时间通过数据的生命周期和数据的写入时间表示。上述数据的生命周期可以根据数据所属的业务确定,属于不同的业务的数据都有各自对应的生命周期。
第二方面,提供了一种数据存储的装置,所述装置包括用于执行上述方法的各个模块。
第三方面,提供了一种存储系统,所述存储系统包括用于执行上述方法的各个模块。
第四方面,提供了一种存储系统,包括至少一个处理器和至少一个存储器。该至少一个存储器用于存储计算机程序,该至少一个处理器用于从存储器中调用并运行该计算机程序,使得该存储系统执行上述方法。
需要说明的是,上述至少一个处理器和至少一个存储器可以位于多个不同的存储节点中,还可以位于相同的存储节点中。
第五方面,提供了一种存储系统,包括至少一个处理器和至少一个存储器。该至少一个存储器用于存储计算机程序,该至少一个处理器用于从存储器中调用并运行该计算机程序,使得该存储系统执行上述方法。
第六方面,提供了一种计算机程序产品,所述计算机程序产品包括:计算机程序代码,当所述计算机程序代码在计算机上运行时,使得计算机执行上述各方面中的方法。
需要说明的是,上述计算机程序代码可以全部或者部分存储在第一存储介质上,其中第一存储介质可以与处理器封装在一起的,也可以与处理器单独封装,本申请实施例对此不作具体限定。
第七方面,提供了一种计算机可读介质,所述计算机可读介质存储有程序代码,当所述计算机程序代码在计算机上运行时,使得计算机执行上述各方面中的方法。
附图说明
图1是本申请实施例适用的存储系统的架构图。
图2示出了基于随机存储机制存储恢复数据的存储结果的示意图。
图3是各失效时间之间的时间顺序的示意图。
图4是本申请实施例的数据的存储方法示意性流程图。
图5是本申请实施例的时间属性的结构的示意图。
图6是本申请实施例的恢复数据的存储结果的示意图。
图7是本申请实施例的数据的存储装置的示意图。
图8是本申请实施例的控制器的示意图。
具体实施方式
为了便于理解本申请,先简单介绍下本申请涉及的名词。
一、垃圾回收
一种管理存储空间的机制。当存储空间的剩余存储空间不够时,可以通过垃圾回收机制删除存储空间中存储的无效数据,以达到回收存储资源的目的,回收后的存储资源可以用于存储其他有效数据,以提高存储空间利用率。
二、无效数据和有效数据
无效数据可以理解为是需要从存储系统中删除的数据。例如,可以是生命周期到期的数据,或者说是过了失效时间的数据。
有效数据可以理解为是存储系统需要继续存储的数据。例如,可以是生命周期未到期的数据,或者说是未到失效时间的数据。
三、存储区域
存储区域可以是垃圾回收的回收单位,即对存储器进行垃圾回收时的基本单元,例如,可以是对存储器进行垃圾回收的最小单元的整数倍,还可以是对存储器进行垃圾回收的最小单元。
当存储器为叠瓦式磁记录(shingled magnetic recording)硬盘时,垃圾回收的基本单元可以包括至少一个顺序区(squential zone,Szone),每个顺序区的大小通常为256M。当存储器为固态硬盘(solid state drives,SSD)时,垃圾回收的基本单元可以包括至少一个块(block)。
下面将结合附图,对本申请中的技术方案进行描述。
先结合图1介绍本申请实施例适用的存储系统的架构。图1是本申请实施例适用的分布式存储系统的架构图。图1所示的分布式存储系统100包括多个存储节点110和控制器120。
多个存储节点110,即存储节点1至存储节点n,其中每个存储节点中可以设置有多个存储器。在上述存储节点1至存储节点中设置有m个存储器,即存储器1至存储器m,存 储节点可以通过各自的存储器为数据提供存储空间,其中n和m为大于1的正整数。
上述存储节点具体可以是分布式存储系统中的存储节点,例如存储服务器。上述存储节点还可以是集群存储系统中的存储节点,此时,上述任一个存储节点可以作为主节点,而其他存储节点可以作为该主节点的镜像节点(又称备节点)。
控制器120,用于对数据对丢失的数据进行恢复。
需要说明的是,上述控制器可以是独立于存储节点具有控制功能的控制器,上述控制器还可以是位于某个存储节点中的控制器。例如,上述存储系统为分布式系统,控制器可以是由至少一个处理器组成的,该至少一个处理器可以分布在至少一个存储节点(服务器)中。上述存储系统为集群存储系统时,控制器可以位于主节点,例如是主节点中的处理器。
若上述存储系统中的某一存储器出现数据故障,导致存储器中存储的数据丢失,控制器可以通过数据保护机制(例如,EC,或者镜像存储),根据存储系统中其他存储器中存储的数据对丢失的数据进行恢复,并将恢复后的数据随机的存储至存储系统中。
这种随机地将恢复后的数据存储在存储系统的机制,导致存储系统中的每个目标存储区域中存储的恢复数据的失效时间也是随机的。在以存储区域为单元进行垃圾回收的过程中,待回收的存储区域中通常仅有小部分数据到了失效时间,变成无效数据,剩余的数据还是有效数据,此时,则需要等待将大量的有效数据从待回收的存储区域迁移至其他存储区域后,才能对该待回收的存储区域进行垃圾回收。
例如,图2示出了基于随机存储机制存储恢复数据的存储结果的示意图。假设上述多个存储器中的存储器2中存储有数据a至数据p,其中,数据a、数据b、数据c以及数据d的失效时间为第一失效时间,数据e、数据f、数据g以及数据h的失效时间为第二失效时间,数据i、数据j、数据k以及数据l的失效时间为第三失效时间,数据m、数据n、数据o以及数据p的失效时间为第四失效时间,且第一失效时间、第二失效时间、第三失效时间以及第四失效时间之间的时间顺序如图3所示。
当存储器2故障,导致存储器2中存储的数据a至数据p丢失,则需要通过其他存储器中存储的数据,对存储器2中丢失的数据进行恢复,并将恢复后的数据随机存储至存储器1的存储区域1、存储器3的存储区域2、存储器4存储区域3以及存储器5的存储区域4中,最终存储区域1中存储有数据a、数据e、数据i以及数据n,存储区域2中存储有数据c、数据f、数据k以及数据p,存储区域3中存储有数据d、数据h、数据l以及数据m,存储区域4中存储有数据b、数据g、数据j以及数据o。若在第一失效时间与第二失效时间之间的时间段内,对上述4个存储区域进行垃圾回收,即仅有存储区域1中的数据a、存储区域2中的数据c、存储区域3中的数据d以及存储区域4中的数据b失效,其余的数据均为有效数据,则在回收上述4个存储区域之前,需要将每个存储区域中的有效数据迁移至其他存储区域中。如此,大量的数据迁移,导致垃圾回收占用了大量的时间,效率非常低。
为了减少垃圾回收占用的时间,本申请提供了一种新的存储恢复数据的技术。在该技术中,将失效时间比较集中的恢复数据(即下文中的目标有效数据),存储在相同的存储区域(即下文的目标存储区域)中,在以存储区域为单位进行垃圾回收时,有利于减少迁移有效数据占用的时间,提高垃圾回收的效率。避免了传统的存储恢复数据的方法中,恢复 数据被随机存储在存储区域中,导致每个存储区域中存储的恢复数据的失效时间也是随机的,在以存储区域为单位进行垃圾回收的过程中,存储系统中的各个存储区域中通常仅有零散的部分数据失效,使得需要进行迁移的有效数据的数据量较大。
下文结合图4介绍本申请实施例的数据的存储方法,该方法应用于存储系统(例如,图1所示的存储系统)。为了便于描述,将存储系统中正常工作的存储器称为第一存储器,将存储系统中出现数据故障的存储器称为第二存储器,其中正常工作的存储器中包括多个存储区域。
图4是本申请实施例的数据的存储方法示意性流程图。图4所示的方法可以存储系统中具有控制功能的设备执行,例如,图1所示的控制器。图4所示的方法包括步骤410和步骤420。
410,在第二存储器出现数据故障后,获取待恢复的多个目标有效数据的失效时间,以及对多个目标有效数据进行恢复。
上述失效时间用于指示有效数据变为无效数据的时间,可以是直接存储在存储系统中的,上述失效时间还可以通过时间属性的方式间接地存储在存储系统中,例如,上述时间属性可以包括数据的生命周期以及数据的写入时间,其中数据的生命周期用于指示在存储系统中存储该数据的总时长,写入时间用于指示该数据写入存储系统的时间,则以写入时间为生命周期的起始时刻,经过生命周期指示的总时长后,生命周期的结束时刻即为失效时间。
上述数据的生命周期可以根据数据所属的业务确定,属于不同的业务的数据都有各自对应的生命周期。
在EC的场景中,通常会将还未存储至存储系统中的原始数据进行EC编码,生成多个数据分片,由于这多个数据分片都是基于原始数据生成的,并且在向存储系统中写这多个数据分片时,多个数据分片是被连续地写入存储系统中不同的存储器,也就是说,这多个数据的写入时间也是相同的,因此,上述多个数据分片中每个分片的失效时间是相同的。
当第二存储器出现数据故障,导致第二存储器中存储的上述多个数据分片中的目标有效数据分片(即目标有效数据)丢失,则可以从存储系统存储的多个数据分片中的其他数据分片中获得目标有效数据分片的失效时间。
需要说明的是,存储系统为分布式存储系统时,分布式存储系统的客户端在接收到用户写入的数据后,在通过EC编码将用户写入的数据分割为多个数据分片后,便可以根据业务数据所属的业务在每个数据分片的元数据中添加生命周期。
当分布式存储系统中的每个存储节点(包括数据分片节点和冗余分片节点),在接收到上述多个数据分片中的至少部分数据分片后,在向存储节点存储数据分片的过程中,可以在每个数据分片的元数据中添加写入时间。
若上述多个数据分片的元数据中直接存储的是每个数据分片的失效时间,则可以由上述各个存储节点在写数据分片的过程中,在数据分片的元数据中直接添加失效时间。
在镜像存储的场景中,数据不仅需要存储在主节点中,还需要在备节点中备份主节点中的数据,由于备节点中存储的数据是主节点中存储的数据的副本,那么主节点中存储的数据与备节点中存储的该数据的副本数据的失效时间是相同的。
当第二存储器(例如,主节点中的任意存储器)出现数据故障,导致第二存储器中存储的目标有效数据丢失,则可以从备节点中获取目标有效数据的失效时间。
当第二存储器出现数据故障时,为了能够获取待恢复的多个目标有效数据的失效时间,在向存储系统中存储数据的过程中,需要在数据对应的元数据中添加该数据的失效时间,或者添加用于指示该数据的失效时间的时间属性,例如,数据的生命周期以及数据的写入时间。下文结合图5介绍本申请实施例的时间属性的结构的示意图。需要说明的是,图5所示的时间属性的结构仅仅是一种可能的实现方式,该时间属性的结构还可以有其他的实现方式。
从图5所示的时间属性的结构的示意图中,可以看出,时间属性总共可以占用17比特(bit),其中生命周期可以占用5bit,写入时间可以占用12bit,其中写入时间的单位可以为天,则该时间属性结构可以支持2 5=32种不同生命周期的业务,可以表示2 12(天)=10(年)的数据写入量。
上述对目标有效数据进行恢复可包括根据存储系统中存储的数据,对多个目标有效数据进行恢复。
在EC场景中,可以通过对存储系统中存储多个数据分片进行EC解码,恢复上述多个目标有效数据,其具体的解码方式可以参照传统的EC解码方式,本申请实施例对此不再详细说明。
在镜像存储的场景中,可以从备节点中将多个目标有效数据进行复制,以恢复上述多个目标有效数据。
另外,上述恢复数据的执行过程与步骤410以及步骤420中选择目标存储区域之间的时间关系比较灵活,例如,恢复数据的执行过程可以在步骤410之前进行,恢复数据的执行过程还可以在选择目标存储区域之前进行,恢复数据的执行过程还可以在选择目标存储区域之后进行,恢复数据的执行过程还可以在步骤410之后进行,本申请实施例对此不作具体限定。当恢复数据的执行过程在步骤410之前进行时,上述多个目标有效数据的失效时间可以直接从恢复后的数据的元数据中获取。
420,从多个存储区域中选择目标存储区域,并将恢复后的多个目标有效数据中的至少部分目标有效数据发送至目标存储区域所在的存储器,其中,在把至少部分目标有效数据存储至目标存储区域之后,目标存储区域存储的有效数据的失效时间中最早的失效时间与最晚的失效时间之间的时间长度小于或等于预设的时间长度。
在将多个目标有效数据存储目标存储区域之前,目标存储区域可以是没有存储数据的空白的存储区域,还可以是已经存储了一部分数据的存储区域。
上述目标有效数据存储的有效数据,可以包括上述多个有目标有效数据中的至少部分目标有效数据,还可以包括除上述至少部分目标有效数据之外的其他数据,该其他有效数据可以是目标存储区域在存储上述至少部分目标有效数据之前存储有的数据,还可以是在目标存储区域存储了上述至少部分目标有效数据之后,通过追加写的方式新写入的数据。
上述目标存储区域中存储的有效数据的失效时间中:最早的失效时间与最晚的失效时间之间的时间长度小于或等于预设的时间长度,具体分为以下三种情况。
情况一:若在本次存入至少部分目标有效数据之前,目标存储区域已存有有效数据, 则在本次存入至少部分目标有效数据之后,本次存入的有效数据和已有的有效数据组成的集合中:有效数据的最早的失效时间与有效数据的最晚的失效时间之间的时间长度,小于或等于预设的时间长度。
情况二:若在本次存入至少部分目标有效数据之前,所述目标存储区域不存在有效数据。本次存入的有效数据中,有效数据的最早的失效时间与有效数据的最晚的失效时间之间的时间长度,小于或等于预设的时间长度。
情况三:若在本次存入至少部分目标有效数据之后,所述目标存储区域中又继续存储了其他有效数据,则本次存储的至少部分目标有效数据和上述其他有效数据组成的集合中:有效数据的最早失效时间与有效数据的最晚的失效时间之间的时间长度,小于或等于预设的时间长度。
需要说明的是,上述三种情况还可以结合,例如,情况三可以分别和情况一、情况二结合。若情况三和情况一结合时,目标存储区域中可以存储了三部分数据,即在至少部分目标有效数据之前存储至目标存储区域中的有效数据,至少部分目标有效数据本身,以及在至少部分目标有效数据之后存储至目标存储区域中的其他有效数据,即至少部分目标有效数据之前存储至目标存储区域中的有效数据,至少部分目标有效数据本身,以及在至少部分目标有效数据之后存储至目标存储区域中的其他有效数据组成的集合中,有效数据的最早的失效时间与最晚的失效时间之间的时间长度小于或等于预设的时间长度。
上述目标存储区域存储的有效数据可以包括多个有目标有效数据中的部分目标有效数据,目标存储区域存储的有效数据可以包括多个目标有效数据中的全部目标有效数据。
如上文所述,若目标存储区域原本为空白的存储区域,那么可以直接将上述至少部分目标有效数据存储至目标存储区域。若目标存储区域原本存储有数据,那么需要确保目标有效存储区域中即将存储的数据和已经存储的数据的失效时间中最早的失效时间与最晚的失效时间之间的时间长度小于或等于预设的时间长度。
在本申请实施例中,通过限定目标存储区域中存储的有效数据的失效时间中最早的失效时间与最晚的失效时间之间的时间长度小于或等于预设的时间长度,使得目标存储区域中存储的有效数据的失效时间比较集中,在以目标存储区域进行垃圾回收的过程中,有利于减少迁移有效数据占用的时间,以提高垃圾回收的效率。
具体地,在以存储区域为单位,进行垃圾回收的过程中,如果垃圾回收的时间早于上述最早的失效时间,那么上述存储区域不需要被回收。如果垃圾回收的时间晚于上述最晚的失效时间,那么上述存储区域中的数据全部变为无效数据,可以直接被回收,不需要执行迁移有效数据的步骤。如果垃圾回收的时间位于最早的失效时间和最晚的失效时间之间,则由于存储区域中存储的数据的失效时间较集中,从宏观的角度看,每个存储区域中需要迁移的有效数据的数据量,少于传统的恢复数据存储机制中需要迁移的有效数据的数据量。
当然,目标存储区域中存储的有效数据的失效时间还可以全部相同,即目标存储区域中存储的数据要么同时变为失效数据,要么就都是有效数据,如此,若目标存储区域中的数据全部失效,需要进行垃圾回收时,无需再执行迁移有效数据的步骤,最大程度的减少了垃圾回收过程中,迁移有效数据占用的时间。
在通过时间属性指示失效时间时,上述失效时间相同的数据可以是生命周期和写入时 间相同的数据,上述失效时间相同的数据还可以是写入时间不同,生命周期不同,但是生命周期的结束时间相同的数据,例如剩余生命周期相同,并且恢复数据的时间相同的数据,其中剩余生命周期指示数据在恢复之后还剩的有效时间,即以恢复数据的时间为起点,经过剩余生命周期,剩余生命周期的结束时刻即为该数据的失效时间。
图6是本申请实施例的恢复数据的存储结果的示意图。图6所示的存储结果中以用p表示的生命周期(period),和用t表示的写入时间(write time)指示数据失效时间。假设上述多个存储器中的存储器2中存储有数据a至数据p,其中,数据a、数据b、数据c以及数据d的失效时间为第一失效时间,用(p1,t1)表示;数据e、数据f、数据g以及数据h的失效时间为第二失效时间,用(p1,t2)表示;数据i、数据j、数据k以及数据l的失效时间为第三失效时间,用(p2,t1)表示;数据m、数据n、数据o以及数据p的失效时间为第四失效时间,用(p2,t2)表示。
基于上述目标存储区域中存储的有效数据的失效时间还可以全部相同的存储策略,将存储器2中丢失的数据进行恢复,并将恢复后的数据分别存储至存储器1中的存储区域1,存储器3中的存储区域2,存储器4中的存储区域3以及存储器5中的存储区域4。从图6所示的存储结果中可以看出存储区域1至存储区域4中,每个存储区域中存储的数据的生命周期相同,且每个存储区域中存储的数据的写入时间也相同。
上述多个存储区域可以位于一个或多个第一存储器中。另外,上述至少一个第一存储器和第二存储器位于一个存储节点中,上述至少一个第一存储器和第二存储器还可以位于不同的存储节点中,本申请实施例对此不作限定。
上述存储多个目标有效数据的失效时间的存储器可以与存储目标有效数据的存储器(即目标存储区域所在的存储器)是相同的存储器;上述存储多个目标有效数据的失效时间的存储器可以与存储目标有效数据的存储器(即目标存储区域所在的存储器)是不同的存储器,本申请实施例对此不作限定。
上述存储失效时间的存储器所在的存储节点可以与至少一个第一存储器位于相同的存储节点中,上述存储失效时间的存储器所在的存储节点还可以与至少一个第一存储器所在的存储节点不同。
上述执行步骤410以及步骤420的控制器所在的存储节点,可以与目标存储区域所在的存储节点是相同的存储节点,也就是说,该存储节点中的第二存储器出现数据故障后,可以由该服务器中的控制器通过执行上述步骤410和步骤420将目标有效数据存储至目标存储区域。
上述执行步骤410和步骤420的控制器所在的存储节点,还可以与目标存储区域所在的存储节点是不同的存储节点,也就是说,控制器可以将恢复后的至少部分目标有效数据发送至目标存储区域所在的存储节点,由目标存储区域所在的存储节点将至少部分目标有效数据存储至目标存储区域。
上述执行步骤410和步骤420的控制器所在的存储节点可以与第二存储器所在的存储节点是相同的存储节点,上述执行步骤410和步骤420的控制器所在的存储节点可以与第二存储器所在的存储节点是不同的存储节点。
需要说明的是,上述存储用于恢复目标有效数据的数据的存储区域,可以与目标存储 区域在相同的存储节点,甚至相同的存储器中。上述存储用于恢复目标有效数据的数据的存储区域,可以与目标存储区域位于不同的存储节点。
对于支持顺序写的存储介质(例如SMR的Szone,以及SSD)而言,在通过顺序写的方式写数据过程中,同一数据流的数据会被自然地写入连续的存储空间,如此,每个目标存储区域中存储的数据的失效时间通常是相同的,即使每个目标存储区域中存储的数据的失效时间不完全相同,通常也是比较集中的。在以目标存储区域为单位进行垃圾回收的过程中,有效数据的迁移量通常也不是很大。但是,如前文所述,存储恢复数据的过程中,由于恢复数据的随机存储,导致目标存储区域中存储的恢复后的数据的失效时间也同样是随机的,因此,本申请实施例的方法也同样适用上述支持顺序写的存储介质。
需要说明的是,本发明实施例中所提及的存储器可以是“顺序写介质”例如SSD或者SMR,此外还可以是“追加写”(append)的存储器。例如,对于非顺序写的存储器(例如普通硬盘驱动器(hard disk drive,HDD)),可以通过软件(例如存储管理软件)在逻辑上规定数据以追加写的方式写入存储器中,在这种情况下,也适用本发明实施例。在追加写的存储器中,新写入的数据不能直接覆盖已有数据,例如新写入的数据追加在存储器已有数据之后进行存储。
在向目标存储区域中存储目标有效数据的过程中,通常可以将需要存储在同一目标存储区域中的数据连续地写入。避免了在存储多个目标有效数据的过程中,相邻写入的目标有效数据是存储在不同目标存储区域中的数据,导致在存储相邻写入的目标有效数据的过程中需要在不同的目标存储区域中来回切换,有利于提高存储目标有效数据的效率。尤其,对于SMR硬盘而言,当存储相邻写入的目标有效数据分别在不同的目标存储空间中,会造成SMR硬盘的磁头摆动,导致存储数据的时间增长,严重影响SMR硬盘写数据的性能。
因此,在向目标存储区域中存储目标有效数据之前,可以先根据多个目标有效数据的失效时间确定多个目标有效数据的存储顺序,基于存储顺序,将多个目标有效数据中准备存储在同一存储区域的目标有效数据连续地写入目标存储区域。
即,根据所述多个目标有效数据中每个有效数据的所述失效时间,确定所述多个目标有效数据的存储顺序,所述存储顺序用于指示所述多个目标有效数据中准备存储至所述目标存储区域中的数据被全部写入后,再向下一个存储区域中写准备存储至所述下一个存储区域中的全部有效数据,所述下一个存储区域为所述多个存储区域中除所述目标存储区域之外的存储区域;步骤420,还包括:从所述多个存储区域中选择目标存储区域,并按照所述存储顺序,将所述至少部分目标有效数据存储至所述目标存储区域。
上述存储顺序的具体排序方式可以与前文介绍的存储策略相配合使用。例如,若存储策略为将失效时间相同的目标有效数据存储在相同的存储区域时,上述存储顺序具体可以指示将失效时间相同的目标有效数据被连续的写入。又例如,若存储策略为将失效时间位于预设时间长度阈值内的目标有效数据存储在相同的存储区域中,则上述存储顺序可以指示将满足该存储策略的目标有效数据被连续的写入。
需要说明的是,上述建立存储顺序的过程可以在恢复完成多个目标有效数据后,再按照存储顺序,将恢复后的目标有效数据存储中目标存储区域。上述建立存储顺序的过程还可以在恢复多个目标有效数据之前执行。
若上述建立存储顺序的过程在恢复多个目标有效数据之后执行,则可以根据失效时间,将会存储至一个存储区域中的数据汇聚至一个数据集合,暂时存储起来。然后,以数据集合为单位将数据集合中的目标有效数据依次向存储区域中存储。
若上述建立存储顺序的过程在恢复多个目标有效数据之前执行,此时,多个目标有效数据分片还没有被恢复,可以通过调整数据恢复的顺序以达到调整存储顺序的目的。
下面以目标存储区域中存储的有效数据的失效时间相同为例,介绍调整数据的存储顺序的方法。由于上述失效时间,或者用于指示失效时间的时间属性都可以被作为键(key)添加在数据的元数据中,因此,可以复用传统的哈希(hash)算法,对多个目标有效数据的键进行hash,如此,可以将携带相同键的数据hash到相同的桶中,然后以桶为单位,对多个目标有效数据进行恢复,并将恢复后的目标有效数据存储至该桶对应的目标存储区域中。
例如,分别于2018年6月2日向第二存储器中写入需要存储10天的数据,于2018年6月1日向第二存储器中写入需要存储10天的数据,于2018年6月2日向第二存储器中写入需要存储30天的数据,则上述数据的元数据中存储的时间属性(生命周期以及写入时间)分别为1020180602、1020180601以及3020180602。
第二存储器数据故障时,可以从存储系统中获取需要恢复的数据的时间属性分别为1020180602、1020180601以及3020180602,并将上述时间属性分别作为key进行hash,那么2018年6月2日写入需要存储10天的数据、2018年6月1日写入需要存储10天的数据以及2018年6月2日写入需要存储30天的数据,分别会hash到1020180602、1020180601以及3020180602为key的hash桶。然后,按照hash桶的顺序,分别依次恢复1020180602、1020180601、3020180602桶上的数据,如此,便能将上述存储至相同存储区域的数据相继写入。
上文结合图1至图6详细介绍了本申请实施例的数据的存储方法,下文结合图7和图8详细地描述本申请实施例的数据的存储装置。需要说明的是,图7和图8所示的装置可以实现上述方法中各个步骤,为了简洁,在此不再赘述。
图7是本申请实施例的数据的存储装置的示意图。图7所示的装置700可以应用于存储系统,所述存储系统包括至少一个第一存储器,所述存储系统还包括第二存储器,所述至少一个第一存储器包括多个存储区域,所述多个存储区域中每个存储区域是进行垃圾回收的单元。装置700可以包括:处理单元710。
处理单元710,用于在所述第二存储器出现数据故障后,获取待恢复的多个目标有效数据的失效时间,所述失效时间用于指示有效数据变为无效数据的时间;
所述处理单元710,还用于从所述多个存储区域中选择目标存储区域,并将恢复后的所述多个目标有效数据中的至少部分目标有效数据存储至所述目标存储区域,其中,在把所述至少部分目标有效数据存储至所述目标存储区域之后,所述目标存储区域存储的有效数据的失效时间中最早的失效时间与最晚的失效时间之间的时间长度小于或等于预设的时间长度。
在本申请实施例中,通过限定目标存储区域中存储的有效数据的失效时间中最早的失效时间与最晚的失效时间之间的时间长度小于或等于预设的时间长度,使得目标存储区域 中存储的有效数据的失效时间比较集中,在以目标存储区域进行垃圾回收的过程中,有利于减少迁移有效数据占用的时间,以提高垃圾回收的效率。
在一种可能的实现方式中,在把所述至少部分目标有效数据存储至所述目标存储区域之后,所述目标存储区域存储的有效数据的所述失效时间相同。
在一种可能的实现方式中,所述多个存储区域中存储的数据通过顺序写的方式写入。
在一种可能的实现方式中,所述至少一个第一存储器以及所述第二存储器是叠瓦式磁记录SMR硬盘,所述多个存储区域中的每个存储区域是顺序区Szone。
在一种可能的实现方式中,所述至少一个存储器是固态硬盘SSD,所述多个存储区域中的每个存储区域是块block。
在一种可能的实现方式中,所述处理单元,还用于:根据所述多个目标有效数据中每个有效数据的所述失效时间,确定所述多个目标有效数据的存储顺序,所述存储顺序用于指示所述多个目标有效数据中准备存储在同一存储区域的数据被连续地写入;以及从所述多个存储区域中选择目标存储区域,并按照所述存储顺序,将所述至少部分目标有效数据存储至所述目标存储区域。
在一种可能的实现方式中,所述多个目标有效数据中任一个目标有效数据的所述失效时间,存储在所述任一个目标有效数据对应的数据分片的元数据中,所述任一个目标有效数据与所述任一个目标有效数据对应的数据分片是对原始数据通过纠删码编码生成的。
在一种可能的实现方式中,所述失效时间通过数据的生命周期和数据的写入时间表示。
在可选的实施例中,上述装置700还可以是控制器800,具体地,所述处理单元710可以为至少一个处理器820。所述控制器800还可以包括至少一个存储器810和至少一个输入/输出接口830,具体如图8所示。
图8是本申请实施例的控制器的示意图。图8所示的控制器800可以包括:至少一个存储器810、至少一个处理器820和至少一个输入/输出接口830。其中,至少一个存储器810、至少一个处理器820、和至少一个输入/输出接口830通过通信连接相连,该至少一个存储器810用于存储程序指令,该至少一个处理器820用于执行该存储器820存储的程序指令,以控制至少一个输入/输出接口830接收输入的数据和信息,输出操作结果等数据。
该存储器810可以包括只读存储器和随机存取存储器,并向处理器820提供指令和数据。处理器820的一部分还可以包括非易失性随机存取存储器。例如,处理器820还可以存储设备类型的信息。
在实现过程中,上述方法的各步骤可以通过处理器820中的硬件的集成逻辑电路或者软件形式的指令完成。结合本申请实施例所公开的方法可以直接体现为硬件处理器执行完成,或者用处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器810,处理器820读取存储器810中的信息,结合其硬件完成上述方法的步骤。为避免重复,这里不再详细描述。
应理解,本申请实施例中,该处理器可以为中央处理单元(central processing unit,CPU),该处理器还可以是其它通用处理器、数字信号处理器(digital signal processor,DSP)、专用集成电路(application specific integrated circuit,ASIC)、现成可编程门阵列(field  programmable gate array,FPGA)或者其它可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。
本申请实施例还提供一种存储系统。该存储系统的架构图可以参见图1。所述存储系统包括至少一个第一存储器,所述存储系统还包括第二存储器,所述至少一个第一存储器包括多个存储区域,所述多个存储区域中每个存储区域是进行垃圾回收的单元。所述存储系统可以包括:处理单元和处理单元。
处理单元,用于在所述第二存储器出现数据故障后,获取待恢复的多个目标有效数据的失效时间,所述失效时间用于指示有效数据变为无效数据的时间;
处理单元,用于从所述多个存储区域中选择目标存储区域,并将恢复后的所述多个目标有效数据中的至少部分目标有效数据存储至所述目标存储区域,其中,在把所述至少部分目标有效数据存储至所述目标存储区域之后,所述目标存储区域存储的有效数据的失效时间中最早的失效时间与最晚的失效时间之间的时间长度小于或等于预设的时间长度。
上述存储系统中涉及3种存储节点,即出现数据故障的第二存储器所在的存储节点,目标存储区域所在的存储节点以及处理单元和处理单元所在的存储器节点。
上述3种存储节点可以是相同的存储节点(为了便于描述,下文暂时称为第一存储节点)此时,第一存储节点中的第二存储器出现数据故障后,第一存储节点中的控制器再将恢复后的上述至少部分目标有效数据存储至第一存储节点中的目标存储区域中。此时,上述存储系统至少包括一个存储节点,即第一存储节点。
上述3种存储节点还可以是不同的存储节点,例如,第二存储器所在的存储节点与处理单元和处理单元所在的存储节点为相同的存储节点可以是相同的存储节点(又称第二存储节点),但第二存储节点与目标存储区域所在的存储节点(又称第三存储节点)是不同的存储节点,也就是说,第二存储节点中的第二存储器出现数据故障之后,第二存储器可以通过处理单元以及处理单元对多个目标有效数据进行数据恢复,并将恢复后的多个目标有效数据中的至少部分目标有效数据发送至第三存储节点,以便第三存储节点将上述至少部分目标有效数据存储至目标存储区域。此时,上述存储系统至少包括两个存储节点,即第二存储节点和第三存储节点。
在本申请实施例中,通过限定目标存储区域中存储的有效数据的失效时间中最早的失效时间与最晚的失效时间之间的时间长度小于或等于预设的时间长度,使得目标存储区域中存储的有效数据的失效时间比较集中,在以目标存储区域进行垃圾回收的过程中,有利于减少迁移有效数据占用的时间,以提高垃圾回收的效率。
在一种可能的实现方式中,在把所述至少部分目标有效数据存储至所述目标存储区域之后,所述目标存储区域存储的有效数据的所述失效时间相同。
在一种可能的实现方式中,所述多个存储区域中存储的数据通过顺序写的方式写入。
在一种可能的实现方式中,所述至少一个第一存储器以及所述第二存储器是叠瓦式磁记录SMR硬盘,所述多个存储区域中的每个存储区域是顺序区Szone。
在一种可能的实现方式中,所述至少一个存储器是固态硬盘SSD,所述多个存储区域中的每个存储区域是块block。
在一种可能的实现方式中,所述处理单元,还用于:根据所述多个目标有效数据中每 个有效数据的所述失效时间,确定所述多个目标有效数据的存储顺序,所述存储顺序用于指示所述多个目标有效数据中准备存储在同一存储区域的数据被连续地写入;以及从所述多个存储区域中选择目标存储区域,并按照所述存储顺序,将所述至少部分目标有效数据存储至所述目标存储区域。
在一种可能的实现方式中,所述多个目标有效数据中任一个目标有效数据的所述失效时间,存储在所述任一个目标有效数据对应的数据分片的元数据中,所述任一个目标有效数据与所述任一个目标有效数据对应的数据分片是对原始数据通过纠删码编码生成的。
在一种可能的实现方式中,所述失效时间通过数据的生命周期和数据的写入时间表示。
应理解,在本申请的各种实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机程序指令时,全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线(digital subscriber line,DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够读取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质,(例如,软盘、硬盘、磁带)、光介质(例如,数字通用光盘(digital video disc,DVD))或者半导体介质(例如,固态硬盘(solid state disk,SSD))等。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (18)

  1. 一种数据的存储方法,其特征在于,所述方法应用于存储系统,所述存储系统包括至少一个第一存储器,所述存储系统还包括第二存储器,所述至少一个第一存储器包括多个存储区域,所述多个存储区域中每个存储区域是进行垃圾回收的单元,
    所述方法包括:
    在所述第二存储器出现数据故障后,获取待恢复的多个目标有效数据的失效时间,以及对所述多个目标有效数据进行恢复,所述失效时间用于指示有效数据变为无效数据的时间;
    从所述多个存储区域中选择目标存储区域,并将恢复后的所述多个目标有效数据中的至少部分目标有效数据发送至所述目标存储区域所在的存储器,
    其中,在把所述至少部分目标有效数据存储至所述目标存储区域之后,所述目标存储区域存储的有效数据的失效时间中:最早的失效时间与最晚的失效时间之间的时间长度小于或等于预设的时间长度。
  2. 如权利要求1所述的方法,其特征在于,在把所述至少部分目标有效数据存储至所述目标存储区域之后,所述目标存储区域存储的有效数据的所述失效时间相同。
  3. 如权利要求1或2所述的方法,其特征在于,所述多个存储区域中存储的数据通过顺序写的方式写入。
  4. 如权利要求1-3中任一项所述的方法,其特征在于,所述至少一个第一存储器以及所述第二存储器是叠瓦式磁记录SMR硬盘,所述多个存储区域中的每个存储区域是顺序区Szone。
  5. 如权利要求1-3中任一项所述的方法,其特征在于,所述至少一个存储器是固态硬盘SSD,所述多个存储区域中的每个存储区域是块block。
  6. 如权利要求1-5中任一项所述的方法,其特征在于,所述方法还包括:
    根据所述多个目标有效数据中每个有效数据的所述失效时间,确定所述多个目标有效数据的存储顺序,所述存储顺序用于指示所述多个目标有效数据中准备存储至所述目标存储区域中的数据被全部写入后,再向下一个存储区域中写准备存储至所述下一个存储区域中的全部有效数据,所述下一个存储区域为所述多个存储区域中除所述目标存储区域之外的存储区域;
    所述从所述多个存储区域中选择目标存储区域,并将恢复后的所述多个目标有效数据中的至少部分目标有效数据存储至所述目标存储区域,包括:
    从所述多个存储区域中选择目标存储区域,并按照所述存储顺序,将所述至少部分目标有效数据存储至所述目标存储区域。
  7. 如权利要求1-6中任一项所述的方法,其特征在于,所述目标存储区域存储的有效数据包括所述至少部分目标有效数据和/或其他数据,所述其他数据包括在向目标存储区域存储所述至少部分目标有效数据之前存储至所述目标存储区域中的数据,和/或在向目标存储区域存储所述至少部分目标有效数据之后存储至所述目标存储区域中的数据。
  8. 如权要求1-7中任一项所述的方法,其特征在于,所述多个目标有效数据中任一个目标有效数据的所述失效时间,存储在所述任一个目标有效数据对应的数据分片的元数据中,所述任一个目标有效数据与所述任一个目标有效数据对应的数据分片是对原始数据通 过纠删码编码生成的。
  9. 如权利要求1-8中任一项所述的方法,其特征在于,所述失效时间通过数据的生命周期和数据的写入时间表示。
  10. 一种存储系统,其特征在于,所述存储系统包括至少一个第一存储器,所述存储系统还包括第二存储器,所述至少一个第一存储器包括多个存储区域,所述多个存储区域中每个存储区域是进行垃圾回收的单元,
    所述存储系统包括:
    处理单元,用于在所述第二存储器出现数据故障后,获取待恢复的多个目标有效数据的失效时间,以及对所述多个目标有效数据进行恢复,所述失效时间用于指示有效数据变为无效数据的时间;
    所述处理单元,还用于从所述多个存储区域中选择目标存储区域,并将恢复后的所述多个目标有效数据中的至少部分目标有效数据发送至所述目标存储区域所在的存储器,
    其中,在把所述至少部分目标有效数据存储至所述目标存储区域之后,所述目标存储区域存储的有效数据的失效时间中最早的失效时间与最晚的失效时间之间的时间长度小于或等于预设的时间长度。
  11. 如权利要求10所述的装置,其特征在于,在把所述至少部分目标有效数据存储至所述目标存储区域之后,所述目标存储区域存储的有效数据的所述失效时间相同。
  12. 如权利要求10或11所述的装置,其特征在于,所述多个存储区域中存储的数据通过顺序写的方式写入。
  13. 如权利要求10-12中任一项所述的装置,其特征在于,所述至少一个第一存储器以及所述第二存储器是叠瓦式磁记录SMR硬盘,所述多个存储区域中的每个存储区域是顺序区Szone。
  14. 如权利要求10-12中任一项所述的装置,其特征在于,所述至少一个存储器是固态硬盘SSD,所述多个存储区域中的每个存储区域是块block。
  15. 如权利要求10-14中任一项所述的装置,其特征在于,所述处理单元,还用于:
    根据所述多个目标有效数据中每个有效数据的所述失效时间,确定所述多个目标有效数据的存储顺序,所述存储顺序用于指示所述多个目标有效数据中准备存储至所述目标存储区域中的数据被全部写入后,再向下一个存储区域中写准备存储至所述下一个存储区域中的全部有效数据,所述下一个存储区域为所述多个存储区域中除所述目标存储区域之外的存储区域;以及
    从所述多个存储区域中选择目标存储区域,并按照所述存储顺序,将所述至少部分目标有效数据存储至所述目标存储区域。
  16. 如权要求10-15中任一项所述的装置,其特征在于,所述多个目标有效数据中任一个目标有效数据的所述失效时间,存储在所述任一个目标有效数据对应的数据分片的元数据中,所述任一个目标有效数据与所述任一个目标有效数据对应的数据分片是对原始数据通过纠删码编码生成的。
  17. 一种计算机可读介质,其特征在于,所述计算机可读介质存储有程序代码,当所述计算机程序代码在计算机上运行时,使得计算机执行如权利要求1-9中任一项所述的方 法。
  18. 一种存储系统,其特征在于,所述存储系统包括至少一个处理器和至少一个存储器,所述至少一个存储器用于存储计算机程序,所述至少一个处理器用于从至少一个存储器中调用并运行该计算机程序,使得该存储系统执行如权利要求1-9中任一项所述的方法。
PCT/CN2019/098256 2018-08-27 2019-07-30 数据的存储方法、装置和存储系统 Ceased WO2020042850A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP19855114.5A EP3839716B1 (en) 2018-08-27 2019-07-30 Data storage method and apparatus and storage system
US17/183,657 US12008263B2 (en) 2018-08-27 2021-02-24 Garbage collection and data storage method and apparatus, and storage system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810983555.0A CN109445681B (zh) 2018-08-27 2018-08-27 数据的存储方法、装置和存储系统
CN201810983555.0 2018-08-27

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/183,657 Continuation US12008263B2 (en) 2018-08-27 2021-02-24 Garbage collection and data storage method and apparatus, and storage system

Publications (1)

Publication Number Publication Date
WO2020042850A1 true WO2020042850A1 (zh) 2020-03-05

Family

ID=65532785

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/098256 Ceased WO2020042850A1 (zh) 2018-08-27 2019-07-30 数据的存储方法、装置和存储系统

Country Status (4)

Country Link
US (1) US12008263B2 (zh)
EP (1) EP3839716B1 (zh)
CN (1) CN109445681B (zh)
WO (1) WO2020042850A1 (zh)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109445681B (zh) * 2018-08-27 2021-05-11 华为技术有限公司 数据的存储方法、装置和存储系统
CN114830077A (zh) * 2019-12-19 2022-07-29 华为技术有限公司 一种数据存储方法及存储装置
CN112346661B (zh) * 2020-11-16 2023-09-29 脸萌有限公司 数据处理方法、装置和电子设备
CN115989485B (zh) * 2020-11-30 2025-09-30 华为技术有限公司 一种数据处理方法、装置及系统
CN112714031B (zh) * 2021-03-29 2021-06-22 中南大学 一种基于带宽感知的故障节点快速修复方法
CN113687774B (zh) * 2021-07-19 2024-11-15 锐捷网络股份有限公司 空间回收方法、装置及设备
CN113900590B (zh) * 2021-09-28 2023-01-31 重庆紫光华山智安科技有限公司 叠瓦式磁盘存储方法、装置、设备及介质
CN115291796A (zh) 2022-07-27 2022-11-04 三星(中国)半导体有限公司 存储数据的方法和装置
CN115291806B (zh) * 2022-08-11 2025-12-23 北京青云科技集团股份有限公司 一种处理方法、装置、电子设备及存储介质
CN115543689B (zh) * 2022-10-11 2025-08-01 重庆紫光华山智安科技有限公司 一种数据恢复方法、系统、设备及存储介质
CN116182686B (zh) * 2022-12-30 2025-09-19 湖南中联重科应急装备有限公司 用于确定水带回收长度的方法、装置、存储介质及处理器
CN116182687B (zh) * 2022-12-30 2025-09-19 湖南中联重科应急装备有限公司 用于确定水带回收长度的方法、装置、存储介质及处理器
CN117591011B (zh) * 2023-10-31 2025-02-18 深圳大学 基于瓦块重叠的数据回储优化方法、装置、设备及介质

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105204783A (zh) * 2015-10-13 2015-12-30 华中科技大学 一种基于数据生存期的固态盘垃圾回收方法
CN107102819A (zh) * 2014-12-12 2017-08-29 西安三星电子研究有限公司 向固态硬盘写入数据的方法及设备
CN109445681A (zh) * 2018-08-27 2019-03-08 华为技术有限公司 数据的存储方法、装置和存储系统

Family Cites Families (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6526448B1 (en) * 1998-12-22 2003-02-25 At&T Corp. Pseudo proxy server providing instant overflow capacity to computer networks
US8195912B2 (en) * 2007-12-06 2012-06-05 Fusion-io, Inc Apparatus, system, and method for efficient mapping of virtual and physical addresses
EP4361815A3 (en) * 2009-10-09 2024-06-19 Violin Systems LLC Memory system with multiple striping of raid groups and method for performing the same
JP5066199B2 (ja) * 2010-02-12 2012-11-07 株式会社東芝 半導体記憶装置
US8316176B1 (en) * 2010-02-17 2012-11-20 Western Digital Technologies, Inc. Non-volatile semiconductor memory segregating sequential data during garbage collection to reduce write amplification
FI124455B (fi) * 2010-04-20 2014-09-15 Tellabs Oy Menetelmä ja laite verkko-osoitteiden konfiguroimiseksi
US20120030260A1 (en) * 2010-07-30 2012-02-02 Maohua Lu Scalable and parallel garbage collection method and system for incremental backups with data de-duplication
US8886990B2 (en) * 2011-01-27 2014-11-11 Apple Inc. Block management schemes in hybrid SLC/MLC memory
CN102439572B (zh) * 2011-10-27 2014-04-02 华为技术有限公司 控制缓存映射的方法及缓存系统
CN103577338B (zh) * 2013-11-14 2016-06-29 华为技术有限公司 一种回收垃圾数据的方法及存储设备
US9501393B2 (en) * 2014-01-27 2016-11-22 Western Digital Technologies, Inc. Data storage system garbage collection based on at least one attribute
KR102289919B1 (ko) * 2014-04-15 2021-08-12 삼성전자주식회사 스토리지 컨트롤러, 스토리지 장치, 스토리지 시스템 및 상기 스토리지 컨트롤러의 동작 방법
US10114557B2 (en) * 2014-05-30 2018-10-30 Sandisk Technologies Llc Identification of hot regions to enhance performance and endurance of a non-volatile storage device
US9705990B2 (en) * 2014-06-05 2017-07-11 Toyota Jidosha Kabushiki Kaisha Transfer of digital data to mobile software systems
US9600409B2 (en) * 2014-08-29 2017-03-21 EMC IP Holding Company LLC Method and system for garbage collection in a storage system based on longevity of stored data
US9734051B2 (en) * 2015-02-16 2017-08-15 Quantum Corporation Garbage collection and defragmentation for solid state drives (SSD) and shingled magnetic recording (SMR) drives
US10261725B2 (en) * 2015-04-10 2019-04-16 Toshiba Memory Corporation Storage system capable of invalidating data stored in a storage device thereof
WO2016175028A1 (ja) * 2015-04-28 2016-11-03 日本電気株式会社 情報処理システム、記憶制御装置、記憶制御方法および記憶制御プログラム
US10650024B2 (en) * 2015-07-30 2020-05-12 Google Llc System and method of replicating data in a distributed system
CN106548789B (zh) * 2015-09-17 2019-05-17 伊姆西公司 用于操作叠瓦式磁记录设备的方法和装置
US20170153842A1 (en) * 2015-12-01 2017-06-01 HGST Netherlands B.V. Data allocation in hard drives
CN105677243B (zh) * 2015-12-31 2018-12-28 华为技术有限公司 数据写入装置及方法
CN106951375B (zh) * 2016-01-06 2021-11-30 北京忆恒创源科技股份有限公司 在存储系统中删除快照卷的方法及装置
KR102533389B1 (ko) * 2016-02-24 2023-05-17 삼성전자주식회사 장치 수명을 향상시키는 데이터 저장 장치 및 이를 포함하는 raid 시스템
US10339044B2 (en) * 2016-03-30 2019-07-02 Sandisk Technologies Llc Method and system for blending data reclamation and data integrity garbage collection
CN107515728B (zh) * 2016-06-17 2019-12-24 清华大学 发挥闪存设备内部并发特性的数据管理方法和装置
US9880745B2 (en) * 2016-06-21 2018-01-30 International Business Machines Corporation Reducing concurrency of garbage collection operations
US10467134B2 (en) * 2016-08-25 2019-11-05 Sandisk Technologies Llc Dynamic anneal characteristics for annealing non-volatile memory
US10740294B2 (en) * 2017-01-12 2020-08-11 Pure Storage, Inc. Garbage collection of data blocks in a storage system with direct-mapped storage devices
US10496293B2 (en) * 2017-03-14 2019-12-03 International Business Machines Corporation Techniques for selecting storage blocks for garbage collection based on longevity information
CN107391774B (zh) * 2017-09-15 2019-11-19 厦门大学 基于重复数据删除的日志文件系统的垃圾回收方法
CN107766180B (zh) * 2017-09-22 2020-08-14 成都华为技术有限公司 存储介质的管理方法、装置及可读存储介质
CN111562880A (zh) * 2019-02-14 2020-08-21 英韧科技(上海)有限公司 一种数据存储装置、系统及数据写入方法
KR102721567B1 (ko) * 2019-10-18 2024-10-25 에스케이하이닉스 주식회사 마이그레이션 동작을 위한 메모리 시스템 및 메모리 시스템의 동작방법

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107102819A (zh) * 2014-12-12 2017-08-29 西安三星电子研究有限公司 向固态硬盘写入数据的方法及设备
CN105204783A (zh) * 2015-10-13 2015-12-30 华中科技大学 一种基于数据生存期的固态盘垃圾回收方法
CN109445681A (zh) * 2018-08-27 2019-03-08 华为技术有限公司 数据的存储方法、装置和存储系统

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3839716A4 *

Also Published As

Publication number Publication date
US12008263B2 (en) 2024-06-11
CN109445681B (zh) 2021-05-11
EP3839716A4 (en) 2021-10-20
EP3839716B1 (en) 2024-09-25
US20210181992A1 (en) 2021-06-17
CN109445681A (zh) 2019-03-08
EP3839716A1 (en) 2021-06-23

Similar Documents

Publication Publication Date Title
WO2020042850A1 (zh) 数据的存储方法、装置和存储系统
US11023448B2 (en) Data scrubbing method and apparatus, and computer readable storage medium
US11003533B2 (en) Data processing method, system, and apparatus
CN102598020B (zh) 用于改进的数据去重的装置、系统及方法
US11397538B2 (en) Data migration method and apparatus
US10769035B2 (en) Key-value index recovery by log feed caching
CN110651246B (zh) 一种数据读写方法、装置和存储服务器
CN103765373B (zh) 数据存储方法、数据存储装置和存储设备
CN106547859A (zh) 一种多租户数据存储系统下的数据文件的存储方法及装置
WO2019001521A1 (zh) 数据存储方法、存储设备、客户端及系统
WO2017020576A1 (zh) 一种键值存储系统中文件压实的方法和装置
CN102073739A (zh) 带有快照功能的分布式文件系统中的数据读与数据写方法
WO2022033269A1 (zh) 数据处理的方法、设备及系统
WO2015085529A1 (zh) 数据复制方法、数据复制装置和存储设备
WO2024131379A1 (zh) 一种数据存储方法、装置及系统
WO2020103512A1 (zh) 一种存储系统中的数据重构方法和装置
WO2023197937A1 (zh) 数据处理方法及其装置、存储介质、计算机程序产品
CN107798063A (zh) 快照处理方法和快照处理装置
WO2021088586A1 (zh) 一种存储系统中的元数据的管理方法及装置
CN112256657B (zh) 日志镜像方法及系统
US20240411646A1 (en) Reverse operation for snapshot chains with inline consolidation and garbage collection
CN109213621B (zh) 一种数据处理方法及数据处理设备
CN115981559A (zh) 分布式数据存储方法、装置、电子设备和可读介质
CN116868173A (zh) 降低在恢复操作期间网络延时的影响
US12393606B1 (en) Data migration using a consistency group

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19855114

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2019855114

Country of ref document: EP

Effective date: 20210316