WO2017162086A1 - Task scheduling method and apparatus - Google Patents

Task scheduling method and apparatus

Info

Publication number
WO2017162086A1
Authority
WO
WIPO (PCT)
Prior art keywords
task
cluster
data
scheduling
scheduled
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2017/076877
Other languages
English (en)
French (fr)
Inventor
何乐
黄俨
史英杰
张�杰
张辰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to SG11201808118PA
Priority to EP17769363.7A (EP3413197B1)
Priority to US16/072,701 (US10922133B2)
Publication of WO2017162086A1
Anticipated expiration
Ceased

Classifications

    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/48 - Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 - Task transfer initiation or dispatching
    • G06F9/4843 - Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 - Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 - Techniques for rebalancing the load in a distributed system
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 - Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 - Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5033 - Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering data affinity
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 - Partitioning or combining of resources
    • G06F9/5072 - Grid computing
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L67/01 - Protocols
    • H04L67/10 - Protocols in which an application is distributed across nodes in the network
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 - Indexing scheme relating to G06F9/00
    • G06F2209/48 - Indexing scheme relating to G06F9/48
    • G06F2209/486 - Scheduler internals
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 - Indexing scheme relating to G06F9/00
    • G06F2209/50 - Indexing scheme relating to G06F9/50
    • G06F2209/504 - Resource capping

Definitions

  • The present invention relates to computer technology, and in particular to a task scheduling method and apparatus.
  • To improve system stability and the data processing and service capabilities of a network center, cluster technology is usually adopted.
  • Cluster technology connects servers to one another to form a cluster, and multiple clusters are interconnected to form a distributed system. The clusters in the distributed system run a series of common applications.
  • In a distributed system, a running application can be divided into multiple tasks. Each task is assigned to a business unit according to the type of service it runs. Tasks that belong to the same business unit run on the same cluster, and their task data is also stored on that cluster.
  • The invention provides a task scheduling method and apparatus that address the excessive inter-cluster bandwidth occupation of the prior art.
  • In a first aspect, a task scheduling method is provided, including:
  • analyzing the network resources required by a task that reads and writes across clusters, to obtain the network resources the task requires for cross-cluster reads and writes;
  • scheduling the task according to the network resources required for the reads and writes.
  • In a second aspect, a task scheduling apparatus is provided, including:
  • an analysis module configured to analyze the network resources required by a task that reads and writes across clusters, to obtain the network resources the task requires for cross-cluster reads and writes;
  • a scheduling module configured to schedule the task according to the network resources required for the reads and writes.
  • The task scheduling method and apparatus analyze the network resources required by tasks that read and write across clusters, obtain the network resources each task occupies for cross-cluster reads and writes, and schedule the task accordingly.
  • The network resources occupied by reads and by writes indicate, respectively, how much network traffic is saved by scheduling the task to the cluster holding the data it reads or to the cluster holding the data it writes. Choosing the cluster on this basis lets the task occupy fewer network resources, which addresses the excessive inter-cluster bandwidth occupation of the prior art.
  • FIG. 1 is a schematic flowchart of a task scheduling method according to Embodiment 1 of the present invention.
  • FIG. 2 is a schematic flowchart of a task scheduling method according to Embodiment 2 of the present invention.
  • FIG. 3 is a schematic structural diagram of a task scheduling apparatus according to Embodiment 3 of the present invention.
  • FIG. 4 is a schematic structural diagram of a task scheduling apparatus according to Embodiment 4 of the present invention.
  • FIG. 1 is a schematic flowchart of a task scheduling method according to Embodiment 1 of the present invention.
  • The method of this embodiment can be performed by a task manager in a distributed system. As shown in FIG. 1, the method includes:
  • Step 101: Analyze the network resources required by tasks that read and write across clusters, to obtain the network resources each task requires for cross-cluster reads and writes.
  • A distributed system generally carries many tasks, a considerable share of which run periodically, for example hourly, daily, or weekly. Before a task runs, it must access the data it needs; "access" here means reading. Because a periodic task reads the latest data available at that time, the result data it produces is accurate. The data read is then fed into the running task, so the data read is the task's input data. If reading the data involves a cross-cluster read, it occupies network resources such as bandwidth. A cross-cluster read necessarily copies data from another cluster to the cluster on which the task runs, so the task's input data reflects the network resources, for example bandwidth, occupied by cross-cluster reads.
  • After the task finishes, its result data must be returned to the default cluster originally allocated by the task manager.
  • The task first outputs the result data, which is called the task's output data, and the result data is then written to the default cluster. If returning the result data involves a cross-cluster write, it likewise occupies network resources such as bandwidth. A cross-cluster write necessarily writes result data from the cluster on which the task runs to another cluster, so the task's output data reflects the network resources, for example bandwidth, occupied by cross-cluster writes.
  • From the network resources a task requires for cross-cluster reads and writes, one can predict, on the one hand, the network resources the task occupies when its current (original) cluster differs from the cluster holding the dependent data it reads and, on the other hand, the network resources it occupies when its original cluster differs from the cluster to which its result data is written.
  • In one possible implementation, the history records the amount of data each task inputs and outputs while running on its currently allocated original cluster. From the history, the input data volume and the output data volume are obtained for each task, and an input-output ratio is calculated for each task.
  • The input-output ratio equals the input data volume divided by the output data volume.
  • Step 102: Schedule the task according to the network resources required for the reads and writes.
  • Specifically, if the read operations require more network resources than the write operations, the task is scheduled to the target cluster where the dependent data it reads is located.
  • In one possible case, if the input data of a cross-cluster task is stored on the target cluster and is large, scheduling the task to the target cluster avoids the network traffic of reading the dependent data from the target cluster to the original cluster.
  • If the task's output data is also small, the extra network traffic incurred by writing the result data from the target cluster to the default cluster is small, so the scheduling significantly reduces cross-cluster network resource occupation. In other words, the larger the input-output ratio, the more significant the reduction in cross-cluster network resource occupation.
  • In one possible implementation, whether scheduling the task to the target cluster holding its dependent data would effectively reduce occupied network resources such as bandwidth is predicted by checking whether the task's input-output ratio exceeds a preset first threshold. If it does, the task is scheduled to the target cluster where its dependent data is located.
  • The first threshold is greater than 1.
  • Conversely, if the read operations require no more network resources than the write operations, the task can be scheduled to the cluster where the written result data is located.
  • In this embodiment, the network resources required by tasks that read and write across clusters are analyzed to obtain the network resources occupied by reads and writes, and the tasks are scheduled accordingly. The network resources occupied by reads and by writes indicate, respectively, how much network traffic is saved by scheduling the task to the cluster holding the data it reads or to the cluster holding the data it writes. Choosing the cluster on this basis lets the task occupy fewer network resources, which addresses the excessive inter-cluster bandwidth occupation of the prior art.
  • FIG. 2 is a schematic flowchart of a task scheduling method according to Embodiment 2 of the present invention. As shown in FIG. 2, the method includes:
  • Step 201: Analyze the history and select target tasks from the tasks that read and write across clusters.
  • The history records task information and data information for each task. The task information includes the input data volume and the output data volume. The data information includes the cluster where the input data is located, the cluster where the output data is located, the business unit to which the task belongs, the cluster where the task currently runs, and the task's running frequency and running cost.
  • A screening condition composed of the input-output ratio, the output data volume, the running cost and running frequency, and the load of the cluster on which the task currently runs is used to select target tasks whose read operations require more network resources than their write operations.
  • For example, the screening condition may be that the input-output ratio is greater than the first threshold, with the first threshold set to 100; that the output data volume is less than 100 GB; and that the running cost, running frequency, cluster load, and so on are each below the preset cluster quota.
  • The preset cluster quota mentioned here is determined according to the cluster resources, such as CPU and storage space, that the target cluster can provide.
  • Indicators of the cluster resources a task occupies, such as running cost, running frequency, and cluster load, are added to the screening condition for the following reason. Even if the input-output ratio is large, so that scheduling the task saves considerable network resources, the task may occupy so many cluster resources at run time that the target cluster cannot meet its needs, and the load of the target cluster increases sharply.
  • In that case the overall performance of the distributed system does not improve noticeably after the task is scheduled to the target cluster, and the scheduling costs more than it gains. Therefore, a task that occupies a large amount of cluster resources at run time is not scheduled.
  • Step 202: Extract a task identifier for each target task and generate scheduling information that records the task identifier.
  • There are two ways to obtain the task identifier. For a Structured Query Language (SQL) task, the periodic attributes in the query statement can be masked out and the task hashed, for example with the Message Digest Algorithm 5 (MD5); the resulting hash digest is used as the task identifier.
  • Alternatively, for a non-SQL task, the task's fixed number can be used directly as its task identifier. These fixed numbers may come from an external system such as Skynet.
  • In the scheduling information, the target task is marked with its task identifier.
  • The scheduling information can also record the target cluster to which the target task needs to be scheduled.
  • Because analyzing the history to select target tasks is computationally expensive, it can be performed in advance and the scheduling information generated from the result. Then, when a task to be scheduled is received, no analysis is needed: the task is scheduled directly according to the precomputed scheduling information, which saves time and improves the timeliness of scheduling.
  • In practice, the process of analyzing the history and generating the scheduling information may be called the training process, and the process of subsequently scheduling according to the scheduling information may be called the decision process.
  • Step 203: When a task to be scheduled is received, schedule it according to the scheduling information.
  • When the task scheduling system receives a task to be scheduled, it determines whether the task is an SQL task. If so, the hash digest is extracted as the task identifier; otherwise, the fixed number is extracted as the task identifier. For the process of obtaining the task identifier, see the description in step 202, which is not repeated here. The obtained task identifier is matched against the scheduling information. If a match is found, the task is scheduled to its target cluster; otherwise, it is scheduled to the original cluster where the result data it writes is located. Further, after scheduling, computing resources can be allocated to the scheduled task.
  • When the input data of a cross-cluster task is stored on the target cluster and is large, scheduling the task to the target cluster avoids the network traffic of reading the dependent data from the target cluster to the original cluster. If the task's output data is also small, the extra network traffic incurred by writing the result data from the target cluster to the original cluster is small, so the scheduling significantly reduces cross-cluster network resource occupation.
  • In other words, the larger the input-output ratio, the more significant the reduction in cross-cluster network resource occupation, and the first threshold can be chosen accordingly. If tasks are to be scheduled to the target cluster only when the reduction is highly significant, the first threshold can be set larger; otherwise it can be set smaller, but it should be greater than 1.
  • FIG. 3 is a schematic structural diagram of a task scheduling apparatus according to Embodiment 3 of the present invention. As shown in FIG. 3, the apparatus includes an analysis module 31 and a scheduling module 32.
  • The analysis module 31 is configured to analyze the network resources required by tasks that read and write across clusters, to obtain the network resources each task requires for reads and writes.
  • The scheduling module 32 is configured to schedule the task according to the network resources required for the reads and writes.
  • Specifically, the scheduling module 32 is configured to schedule the task to the target cluster where the dependent data it reads is located if the read operations require more network resources than the write operations.
  • In this embodiment, the network resources required by tasks that read and write across clusters are analyzed to obtain the network resources occupied by reads and writes, and the tasks are scheduled accordingly. The network resources occupied by reads and by writes indicate, respectively, how much network traffic is saved by scheduling the task to the cluster holding the data it reads or to the cluster holding the data it writes. Choosing the cluster on this basis lets the task occupy fewer network resources, which addresses the excessive inter-cluster bandwidth occupation of the prior art.
  • FIG. 4 is a schematic structural diagram of a task scheduling apparatus according to Embodiment 4 of the present invention.
  • Building on the apparatus of FIG. 3, the analysis module 31 includes an obtaining unit 311 and a calculating unit 312.
  • The obtaining unit 311 is configured to obtain, from the history, the input data volume and the output data volume of each task.
  • The calculating unit 312 is configured to calculate, for each task, an input-output ratio that indicates the proportion between the network resources required for reads and for writes.
  • The input-output ratio equals the input data volume divided by the output data volume.
  • The scheduling module 32 includes a determining unit 321, an identifying unit 322, a generating unit 323, and a scheduling unit 324.
  • The determining unit 321 is configured to determine whether the task meets a preset screening condition.
  • The screening condition includes: the input-output ratio is greater than a preset first threshold, where the first threshold is greater than 1.
  • The screening condition further includes: the output data volume is less than a second threshold; and/or the occupied cluster resources are below a preset quota, where the occupied cluster resources include at least one of running cost, running frequency, and cluster load.
  • The identifying unit 322 is configured to obtain a task identifier for each task that meets the screening condition.
  • The generating unit 323 is configured to generate scheduling information that records the task identifier.
  • The scheduling unit 324 is configured to schedule the task to the target cluster where the dependent data it reads is located if the screening condition is met.
  • Specifically, the scheduling unit 324 is configured to: when a task to be scheduled is received, acquire the task identifier that the identifying unit obtained for that task; and, if that task identifier matches a task identifier in the scheduling information, schedule the task to the target cluster where its dependent data is located.
  • The identifying unit 322 includes a determining subunit 3221, a hash subunit 3222, and a numbering subunit 3223.
  • The determining subunit 3221 is configured to determine whether the type of the task is SQL.
  • The hash subunit 3222 is configured to, if the type of the task is SQL, hash the task to obtain a hash digest and use the hash digest as the task identifier.
  • The numbering subunit 3223 is configured to use the number of the task as the task identifier if the type of the task is not SQL.
  • The network resources mentioned in the foregoing embodiments may be network bandwidth and/or the network bandwidth-delay product.
  • Those skilled in the art will appreciate that other indicators for measuring network resources may also be used without affecting the implementation of the embodiments.
  • The tasks are scheduled according to the network resources required for reads and writes. The network resources occupied by reads and by writes indicate, respectively, how much network traffic is saved by scheduling the task to the cluster holding the data it reads or to the cluster holding the data it writes.
  • Choosing the cluster on this basis lets the task occupy fewer network resources, which addresses the excessive inter-cluster bandwidth occupation of the prior art.
  • The aforementioned program can be stored in a computer-readable storage medium.
  • When executed, the program performs the steps of the foregoing method embodiments. The storage medium includes various media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

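The present invention provides a task scheduling method and apparatus. The network resources required by tasks that read and write across clusters are analyzed to obtain the network resources occupied by reads and writes, and the tasks are scheduled according to those requirements. The network resources occupied by reads and by writes indicate, respectively, how much network traffic is saved by scheduling a task to the cluster holding the data it reads or to the cluster holding the data it writes. Choosing the cluster on this basis lets the task occupy fewer network resources, which addresses the excessive inter-cluster bandwidth occupation of the prior art.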

Description

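Task scheduling method and apparatus
This application claims priority to Chinese patent application No. 201610179807.5, filed on March 25, 2016 and entitled "Task Scheduling Method and Apparatus", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to computer technology, and in particular to a task scheduling method and apparatus.
Background
To improve system stability and the data processing and service capabilities of a network center, cluster technology is usually adopted. Cluster technology connects servers to one another to form a cluster, and multiple clusters are interconnected to form a distributed system. The clusters in the distributed system run a series of common applications.
In a distributed system, a running application can be divided into multiple tasks. Each task is assigned to a business unit according to the type of service it runs. Tasks that belong to the same business unit run on the same cluster, and their task data is also stored on that cluster.
A task in one business unit may need to read the task data of another task in another business unit; that is, a task running on the original cluster may depend on the task data of another task. When a task and the task data it depends on (the dependent data) are on different clusters, the task reads and writes across clusters, which occupies a large amount of bandwidth. To address this, in the prior art, once a task is found to read and write across clusters, it is scheduled to the target cluster where the dependent data it reads is located. In practice, however, inter-cluster bandwidth occupation can still be too high.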
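Summary of the Invention
The present invention provides a task scheduling method and apparatus that address the excessive inter-cluster bandwidth occupation of the prior art.
To achieve the above object, the embodiments of the present invention adopt the following technical solutions:
In a first aspect, a task scheduling method is provided, including:
analyzing the network resources required by a task that reads and writes across clusters, to obtain the network resources the task requires for cross-cluster reads and writes;
scheduling the task according to the network resources required for the reads and writes.
In a second aspect, a task scheduling apparatus is provided, including:
an analysis module configured to analyze the network resources required by a task that reads and writes across clusters, to obtain the network resources the task requires for cross-cluster reads and writes;
a scheduling module configured to schedule the task according to the network resources required for the reads and writes.
The task scheduling method and apparatus provided by the embodiments of the present invention analyze the network resources required by tasks that read and write across clusters, obtain the network resources each task occupies for cross-cluster reads and writes, and schedule the task accordingly. The network resources occupied by reads and by writes indicate, respectively, how much network traffic is saved by scheduling the task to the cluster holding the data it reads or to the cluster holding the data it writes. Choosing the cluster on this basis lets the task occupy fewer network resources, which addresses the excessive inter-cluster bandwidth occupation of the prior art.
The above is only an overview of the technical solutions of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the specification, and so that the above and other objects, features, and advantages of the invention are more readily apparent, specific embodiments of the invention are set out below.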
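Brief Description of the Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered limiting of the invention. Throughout the drawings, the same reference symbols denote the same components. In the drawings:
FIG. 1 is a schematic flowchart of a task scheduling method according to Embodiment 1 of the present invention;
FIG. 2 is a schematic flowchart of a task scheduling method according to Embodiment 2 of the present invention;
FIG. 3 is a schematic structural diagram of a task scheduling apparatus according to Embodiment 3 of the present invention;
FIG. 4 is a schematic structural diagram of a task scheduling apparatus according to Embodiment 4 of the present invention.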
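Detailed Description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
The task scheduling method and apparatus provided by the embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Embodiment 1
FIG. 1 is a schematic flowchart of a task scheduling method according to Embodiment 1 of the present invention. The method of this embodiment can be performed by a task manager in a distributed system. As shown in FIG. 1, the method includes:
Step 101: Analyze the network resources required by tasks that read and write across clusters, to obtain the network resources each task requires for cross-cluster reads and writes.
Specifically, a distributed system generally carries many tasks, a considerable share of which run periodically, for example hourly, daily, or weekly. Before a task runs, it must access the data it needs; "access" here means reading. Because a periodic task reads the latest data available at that time, the result data it produces is accurate. The data read is then fed into the running task, so the data read is the task's input data. If reading the data involves a cross-cluster read, it occupies network resources such as bandwidth. A cross-cluster read necessarily copies data from another cluster to the cluster on which the task runs, so the task's input data reflects the network resources, for example bandwidth, occupied by cross-cluster reads.
In addition, after the task finishes, its result data must be returned to the default cluster originally allocated by the task manager. The task first outputs the result data, which is called the task's output data, and the result data is then written to the default cluster. If returning the result data involves a cross-cluster write, it likewise occupies network resources such as bandwidth. A cross-cluster write necessarily writes result data from the cluster on which the task runs to another cluster, so the task's output data reflects the network resources, for example bandwidth, occupied by cross-cluster writes.
From the network resources a task requires for cross-cluster reads and writes, one can predict, on the one hand, the network resources the task occupies when its current (original) cluster differs from the cluster holding the dependent data it reads and, on the other hand, the network resources it occupies when its original cluster differs from the cluster to which its result data is written.
In one possible implementation, the history records the amount of data each task inputs and outputs while running on its currently allocated original cluster. From the history, the input data volume and the output data volume are obtained for each task, and an input-output ratio is calculated for each task. The input-output ratio equals the input data volume divided by the output data volume.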
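The ratio calculation described above can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the `HistoryRecord` layout, the summing of volumes over several runs, and the treatment of a zero output volume as an unbounded ratio are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class HistoryRecord:
    """One historical run of a task (hypothetical record layout)."""
    task_id: str
    input_bytes: int   # data volume read as input
    output_bytes: int  # data volume written as output


def input_output_ratios(history):
    """Return {task_id: total input volume / total output volume}."""
    totals = {}
    for rec in history:
        read, written = totals.get(rec.task_id, (0, 0))
        totals[rec.task_id] = (read + rec.input_bytes,
                               written + rec.output_bytes)

    ratios = {}
    for task_id, (read, written) in totals.items():
        # Assumption: a task with no recorded output has an unbounded ratio.
        ratios[task_id] = read / written if written > 0 else float("inf")
    return ratios


history = [
    HistoryRecord("daily_report", 500, 5),
    HistoryRecord("daily_report", 700, 7),
    HistoryRecord("log_copy", 100, 100),
]
ratios = input_output_ratios(history)
```

A task such as `daily_report`, which reads far more than it writes, ends up with a large ratio, while a copy-style task stays near 1.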
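Step 102: Schedule the task according to the network resources required for the reads and writes.
Specifically, if the read operations require more network resources than the write operations, the task is scheduled to the target cluster where the dependent data it reads is located.
This is because, for a cross-cluster task, data storage and task execution are located on different clusters, and the input data and the output data can each be in one of three situations: A. stored on the original cluster currently allocated to the task; B. stored on the target cluster to which the task is to be scheduled; C. stored on a cluster other than the original cluster and the target cluster. Under the prior-art approach of scheduling a task to the target cluster holding the dependent data it reads as soon as the task is found to read and write across clusters, any input or output data that is not on the target cluster must be accessed through cross-cluster replication or direct cross-cluster reads and writes, and an excessive volume of cross-cluster data puts pressure on network resources.
In one possible case, if the input data of a cross-cluster task is stored on the target cluster and is large, scheduling the task to the target cluster avoids the network traffic of reading the dependent data from the target cluster to the original cluster. If the task's output data is also small, the extra network traffic incurred by writing the result data from the target cluster to the default cluster is small, so the scheduling significantly reduces cross-cluster network resource occupation. In other words, the larger the ratio of the input data volume to the output data volume (the input-output ratio), the more significant the reduction in cross-cluster network resource occupation.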
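To make the trade-off concrete, the following Python sketch counts the bytes that cross cluster boundaries for one run of a task, before and after moving it. The cluster names, the data volumes, and the function name are hypothetical values chosen for illustration only.

```python
def cross_cluster_bytes(run_cluster, input_cluster, output_cluster,
                        input_bytes, output_bytes):
    """Bytes that must cross cluster boundaries for one run of a task."""
    traffic = 0
    if input_cluster != run_cluster:
        traffic += input_bytes    # cross-cluster read
    if output_cluster != run_cluster:
        traffic += output_bytes   # cross-cluster write
    return traffic


GB = 10**9
# Input lives on "target"; results must end up on the default cluster "orig".
stay = cross_cluster_bytes("orig", "target", "orig", 1000 * GB, 5 * GB)
move = cross_cluster_bytes("target", "target", "orig", 1000 * GB, 5 * GB)
saved = stay - move
```

With an input-output ratio of 200, staying on the original cluster moves 1000 GB across clusters, while running on the target cluster moves only the 5 GB of results, a saving of 995 GB per run.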
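In one possible implementation, whether scheduling the task to the target cluster holding its dependent data would effectively reduce occupied network resources such as bandwidth is predicted by checking whether the task's input-output ratio exceeds a preset first threshold. If it does, the task is scheduled to the target cluster where its dependent data is located. The first threshold is greater than 1.
Conversely, if the read operations require no more network resources than the write operations, the task can be scheduled to the cluster where the written result data is located.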
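The decision rule of step 102 can be sketched as a single function. This is only an illustration under stated assumptions: the function and parameter names are invented for the example, and the default threshold of 2.0 is an arbitrary placeholder, since the text only requires the first threshold to be greater than 1.

```python
def choose_cluster(input_bytes, output_bytes,
                   dependent_data_cluster, result_data_cluster,
                   first_threshold=2.0):
    """Pick the cluster for a cross-cluster task from its input-output ratio."""
    if first_threshold <= 1:
        raise ValueError("the first threshold must be greater than 1")

    # Assumption: no output at all counts as an unbounded ratio.
    ratio = input_bytes / output_bytes if output_bytes > 0 else float("inf")

    if ratio > first_threshold:
        # Reads dominate: run where the dependent (input) data is stored.
        return dependent_data_cluster
    # Reads do not dominate: run where the result data is written.
    return result_data_cluster
```

For example, `choose_cluster(1000, 10, "B", "A")` selects cluster `"B"`, while a task whose input and output volumes are equal stays with the cluster that receives its results.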
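In this embodiment, the network resources required by tasks that read and write across clusters are analyzed to obtain the network resources occupied by reads and writes, and the tasks are scheduled accordingly. The network resources occupied by reads and by writes indicate, respectively, how much network traffic is saved by scheduling the task to the cluster holding the data it reads or to the cluster holding the data it writes. Choosing the cluster on this basis lets the task occupy fewer network resources, which addresses the excessive inter-cluster bandwidth occupation of the prior art.
Embodiment 2
FIG. 2 is a schematic flowchart of a task scheduling method according to Embodiment 2 of the present invention. As shown in FIG. 2, the method includes:
Step 201: Analyze the history and select target tasks from the tasks that read and write across clusters.
Specifically, the history records task information and data information for each task. The task information includes the input data volume and the output data volume. The data information includes the cluster where the input data is located, the cluster where the output data is located, the business unit to which the task belongs, the cluster where the task currently runs, and the task's running frequency and running cost.
Based on the history, cross-cluster tasks are identified: tasks for which the cluster holding the input data or the output data differs from the cluster where the task currently runs. The cluster where the task currently runs is taken as the original cluster.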
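Identifying cross-cluster tasks from the history could look like the sketch below. The `TaskRecord` fields mirror the data information listed above, but the record type and its field names are assumptions made for this example.

```python
from typing import NamedTuple


class TaskRecord(NamedTuple):
    """Location data for one task (hypothetical field names)."""
    task_id: str
    current_cluster: str   # the original cluster
    input_cluster: str
    output_cluster: str


def cross_cluster_tasks(records):
    """Keep tasks whose input or output data is on a different cluster."""
    return [
        r for r in records
        if r.input_cluster != r.current_cluster
        or r.output_cluster != r.current_cluster
    ]


records = [
    TaskRecord("t1", "A", "B", "A"),  # reads across clusters
    TaskRecord("t2", "A", "A", "A"),  # entirely local
    TaskRecord("t3", "A", "A", "C"),  # writes across clusters
]
selected = [r.task_id for r in cross_cluster_tasks(records)]
```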
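For each cross-cluster task, the input-output ratio is calculated according to the formula input-output ratio = input data volume / output data volume. A screening condition composed of the input-output ratio, the output data volume, the running cost and running frequency, and the load of the cluster on which the task currently runs is used to select target tasks whose read operations require more network resources than their write operations.
For example, the screening condition may be that the input-output ratio is greater than the first threshold, with the first threshold set to 100; that the output data volume is less than 100 GB; and that the running cost, running frequency, cluster load, and so on are each below the preset cluster quota.
The preset cluster quota mentioned here is determined according to the cluster resources, such as CPU and storage space, that the target cluster can provide. Indicators of the cluster resources a task occupies, such as running cost, running frequency, and cluster load, are added to the screening condition for the following reason. Even if the input-output ratio is large, so that scheduling the task saves considerable network resources, the task may occupy so many cluster resources at run time that the target cluster cannot meet its needs, and the load of the target cluster increases sharply. In that case the overall performance of the distributed system does not improve noticeably after the task is scheduled to the target cluster, and the scheduling costs more than it gains. Therefore, a task that occupies a large amount of cluster resources at run time is not scheduled.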
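The example screening condition can be expressed as a predicate. The threshold of 100 and the 100 GB output limit come from the example in the text; everything else (the `TaskStats` and `ClusterQuota` types, their field names, and the units of cost, frequency, and load) is hypothetical, since the text does not specify how those indicators are measured.

```python
from dataclasses import dataclass

GB = 10**9


@dataclass
class TaskStats:
    input_bytes: int
    output_bytes: int
    run_cost: float       # hypothetical unit, e.g. CPU-hours per run
    run_frequency: float  # hypothetical unit, e.g. runs per day
    cluster_load: float   # hypothetical unit, e.g. fraction of capacity


@dataclass
class ClusterQuota:
    """Limits derived from what the target cluster can provide."""
    max_run_cost: float
    max_run_frequency: float
    max_cluster_load: float


def is_target_task(stats, quota, first_threshold=100,
                   max_output_bytes=100 * GB):
    """True if the task passes every part of the screening condition."""
    ratio = (stats.input_bytes / stats.output_bytes
             if stats.output_bytes > 0 else float("inf"))
    return (ratio > first_threshold
            and stats.output_bytes < max_output_bytes
            and stats.run_cost < quota.max_run_cost
            and stats.run_frequency < quota.max_run_frequency
            and stats.cluster_load < quota.max_cluster_load)
```

A task with a very high ratio is still rejected if, for instance, its running cost exceeds what the target cluster can absorb, which is the point of adding the resource indicators.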
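Step 202: Extract a task identifier for each target task and generate scheduling information that records the task identifier.
Specifically, there are two ways to obtain the task identifier of a target task. For a Structured Query Language (SQL) task, the periodic attributes in the query statement can be masked out and the task hashed, for example with the Message Digest Algorithm 5 (MD5); the resulting hash digest is used as the task identifier. For a non-SQL task, the task's fixed number can be used directly as its task identifier; these fixed numbers may come from an external system such as the Skynet system.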
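A possible shape for the identifier extraction is sketched below using Python's standard `hashlib.md5`. The text does not say what the periodic attributes look like, so this example assumes they are date literals such as `2016-03-25` or `20160325` and masks them with a regular expression; the function name and the whitespace normalization are likewise assumptions.

```python
import hashlib
import re

# Assumption: the periodic attribute is a date literal, e.g. 2016-03-25.
_DATE = re.compile(r"\d{4}-?\d{2}-?\d{2}")


def task_identifier(task_type, body, fixed_number=None):
    """Return a stable identifier for a periodic task."""
    if task_type.upper() == "SQL":
        masked = _DATE.sub("?", body)           # hide the changing date
        normalized = " ".join(masked.split())   # collapse whitespace
        return hashlib.md5(normalized.encode("utf-8")).hexdigest()
    if fixed_number is None:
        raise ValueError("a non-SQL task needs a fixed number")
    return str(fixed_number)


monday = task_identifier("SQL", "SELECT * FROM t WHERE dt = '2016-03-25'")
tuesday = task_identifier("SQL", "SELECT * FROM t WHERE dt = '2016-03-26'")
```

Because the date is masked before hashing, the Monday and Tuesday runs of the same periodic query share one identifier, which is what lets a later run be matched against scheduling information computed from earlier runs.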
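In the scheduling information, the target task is marked with its task identifier. The scheduling information can also record the target cluster to which the target task needs to be scheduled.
Because analyzing the history to select target tasks is computationally expensive, it can be performed in advance and the scheduling information generated from the result. Then, when a task to be scheduled is received, no analysis is needed: the task is scheduled directly according to the precomputed scheduling information, which saves time and improves the timeliness of scheduling. In practice, the process of analyzing the history and generating the scheduling information may be called the training process, and the process of subsequently scheduling according to the scheduling information may be called the decision process.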
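The training process can be pictured as building a lookup table ahead of time. In this simplified sketch the scheduling information is a plain dictionary from task identifier to target cluster, and only the input-output ratio is checked; both choices, along with the tuple layout of `candidates`, are assumptions for illustration.

```python
def build_scheduling_info(candidates, first_threshold=100):
    """Precompute {task_id: target_cluster} for tasks worth relocating.

    candidates: iterable of
        (task_id, input_bytes, output_bytes, dependent_data_cluster)
    """
    info = {}
    for task_id, input_bytes, output_bytes, data_cluster in candidates:
        ratio = (input_bytes / output_bytes
                 if output_bytes > 0 else float("inf"))
        if ratio > first_threshold:
            # Record where the task should run on its next occurrence.
            info[task_id] = data_cluster
    return info


scheduling_info = build_scheduling_info([
    ("report", 500_000, 1_000, "cluster-B"),   # ratio 500: relocate
    ("copy", 1_000, 1_000, "cluster-B"),       # ratio 1: leave alone
])
```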
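Step 203: When a task to be scheduled is received, schedule it according to the scheduling information.
Specifically, when a task to be scheduled is received, its task identifier is obtained in order to identify the task. After the task scheduling system receives the task, it determines whether the task is an SQL task. If so, the hash digest is extracted as the task identifier; otherwise, the fixed number is extracted as the task identifier. For the process of obtaining the task identifier, see the description in step 202, which is not repeated here. The obtained task identifier is matched against the scheduling information. If a match is found, the task is scheduled to its target cluster; otherwise, it is scheduled to the original cluster where the result data it writes is located. Further, after scheduling, computing resources can be allocated to the scheduled task.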
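The decision process then reduces to a lookup. The sketch below assumes the scheduling information is a dictionary keyed by task identifier, as a stand-in for whatever structure an actual system would use; the identifiers and cluster names are made up.

```python
def schedule(task_id, scheduling_info, original_cluster):
    """Return the cluster on which the task should run."""
    # A match means the training process selected this task for relocation.
    if task_id in scheduling_info:
        return scheduling_info[task_id]
    # No match: keep the task on the original cluster that holds its results.
    return original_cluster


scheduling_info = {"9f2c": "cluster-B"}   # task identifier -> target cluster
placements = {
    tid: schedule(tid, scheduling_info, original_cluster="cluster-A")
    for tid in ["9f2c", "1234"]
}
```

No per-task analysis happens at this point, which is why the text describes the decision process as fast compared with the training process.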
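When the input data of a cross-cluster task is stored on the target cluster and is large, scheduling the task to the target cluster avoids the network traffic of reading the dependent data from the target cluster to the original cluster. If the task's output data is also small, the extra network traffic incurred by writing the result data from the target cluster to the original cluster is small, so the scheduling significantly reduces cross-cluster network resource occupation.
In other words, the larger the ratio of the input data volume to the output data volume (the input-output ratio), the more significant the reduction in cross-cluster network resource occupation, and the first threshold can be chosen accordingly. If tasks are to be scheduled to the target cluster only when the reduction is highly significant, the first threshold can be set larger; otherwise it can be set smaller, but it should be greater than 1.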
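Embodiment 3
FIG. 3 is a schematic structural diagram of a task scheduling apparatus according to Embodiment 3 of the present invention. As shown in FIG. 3, the apparatus includes an analysis module 31 and a scheduling module 32.
The analysis module 31 is configured to analyze the network resources required by tasks that read and write across clusters, to obtain the network resources each task requires for reads and writes.
The scheduling module 32 is configured to schedule the task according to the network resources required for the reads and writes.
Specifically, the scheduling module 32 is configured to schedule the task to the target cluster where the dependent data it reads is located if the read operations require more network resources than the write operations.
In this embodiment, the network resources required by tasks that read and write across clusters are analyzed to obtain the network resources occupied by reads and writes, and the tasks are scheduled accordingly. The network resources occupied by reads and by writes indicate, respectively, how much network traffic is saved by scheduling the task to the cluster holding the data it reads or to the cluster holding the data it writes. Choosing the cluster on this basis lets the task occupy fewer network resources, which addresses the excessive inter-cluster bandwidth occupation of the prior art.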
实施例四
图4为本发明实施例四提供的一种任务调度装置的结构示意图,在图3所提供的任务调度装置的基础上,分析模块31,包括:获得单元311和计算单元312。
获得单元311,用于根据历史记录,针对每一个所述任务,获得输入数据的数据量、输出数据的数据量。
计算单元312,用于针对每一个所述任务计算用于指示读和写所需的网络资源的比例的输入输出比。
其中,输入输出比等于输入数据的数据量与输出数据的数据量的比值。
进一步,调度模块32,包括:判断单元321、标识单元322、生成单元323和调度单元324。
判断单元321,用于判断所述任务是否满足预设的筛选条件。
其中,筛选条件包括:所述输入输出比大于预设第一阈值;其中,第一阈值大于1。筛选条件还包括:输出数据的数据量小于第二阈值;和/或,所占用的集群资源小于预设配额,其中所占用的集群资源包括运行开销、运行频率和集群负载中的至少一个。
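The identification unit 322 is configured to obtain a task identifier for each task that satisfies the filter condition.
The generating unit 323 is configured to generate scheduling information that records the task identifier.
The scheduling unit 324 is configured to schedule the task to the target cluster where the dependent data read by the task resides, if the filter condition is satisfied.
Specifically, the scheduling unit 324 is configured to, when a task to be scheduled is received, acquire the task identifier that the identification unit obtained for that task; and, if that task identifier matches a task identifier in the scheduling information, schedule the task to be scheduled to the target cluster where its dependent data resides.
Further, the identification unit 322 includes a judging subunit 3221, a hashing subunit 3222, and a numbering subunit 3223.
The judging subunit 3221 is configured to determine whether the type of the task is SQL.
The hashing subunit 3222 is configured to, if the type of the task is SQL, hash the task to obtain a hash digest and use the hash digest as the task identifier.
The numbering subunit 3223 is configured to, if the type of the task is not SQL, use the number of the task as the task identifier.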
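It should be noted that the network resources mentioned in the foregoing embodiments may be network bandwidth and/or the network bandwidth-delay product. Those skilled in the art will appreciate that other metrics for measuring network resources may also be used without affecting the effects of the embodiments.
The network resources required by a task that performs cross-cluster reads and writes are analyzed to obtain the network resources occupied by its reads and writes, and the task is scheduled accordingly. Because the network resources occupied by reads and by writes reflect, respectively, the network resources that could be saved by scheduling the task to the cluster holding the data it reads or to the cluster holding the data it writes, determining the cluster on this basis lets the task occupy fewer network resources, which addresses the excessive inter-cluster bandwidth usage of the prior art.
Those of ordinary skill in the art will understand that all or some of the steps of the foregoing method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the foregoing method embodiments. The storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the foregoing embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some or all of their technical features may be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.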

Claims (18)

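  1. A task scheduling method, characterized by comprising:
    analyzing the network resources required by a task that performs cross-cluster reads and writes, so as to obtain the network resources required by the task to perform the reads and the writes across clusters;
    scheduling the task according to the network resources required by the reads and writes.
  2. The task scheduling method according to claim 1, characterized in that the scheduling the task according to the network resources required by the reads and writes comprises:
    if the read operation requires more network resources than the write operation, scheduling the task to the target cluster where the dependent data that is read resides.
  3. The task scheduling method according to claim 1, characterized in that the analyzing the network resources required by a task that performs cross-cluster reads and writes comprises:
    obtaining, from history records and for each said task, the data volume of input data and the data volume of output data;
    calculating, for each said task, an input-output ratio that indicates the proportion between the network resources required for reading and for writing, wherein the input-output ratio is equal to the ratio of the data volume of the input data to the data volume of the output data.
  4. The task scheduling method according to claim 3, characterized in that the scheduling the task according to the network resources required by the reads and writes comprises:
    determining whether the task satisfies a preset filter condition, wherein the filter condition comprises: the input-output ratio is greater than a preset first threshold, wherein the first threshold is greater than 1;
    if the filter condition is satisfied, scheduling the task to the target cluster where the dependent data read by the task resides.
  5. The task scheduling method according to claim 4, characterized in that, after the determining whether the task satisfies a preset filter condition, the method further comprises:
    obtaining a task identifier for the task that satisfies the filter condition;
    generating scheduling information that records the task identifier.
  6. The task scheduling method according to claim 5, characterized in that the scheduling the task to the target cluster where the dependent data read by the task resides if the filter condition is satisfied comprises:
    when a task to be scheduled is received, obtaining a task identifier for the task to be scheduled;
    if the task identifier of the task to be scheduled matches the task identifier in the scheduling information, scheduling the task to be scheduled to the target cluster where the dependent data of the task to be scheduled resides.
  7. The task scheduling method according to claim 5 or 6, characterized in that the obtaining a task identifier comprises:
    determining whether the type of the task is SQL;
    if the type of the task is SQL, hashing the task to obtain a hash digest, and using the hash digest as the task identifier;
    otherwise, using the number of the task as the task identifier.
  8. The task scheduling method according to claim 4, characterized in that the filter condition further comprises: the data volume of the output data is less than a second threshold;
    and/or the cluster resources occupied are less than a preset quota, wherein the cluster resources occupied comprise at least one of running overhead, running frequency, and cluster load.
  9. The task scheduling method according to any one of claims 1 to 6, characterized in that the network resources comprise at least one of network bandwidth and a network bandwidth-delay product.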
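  10. A task scheduling apparatus, characterized by comprising:
    an analysis module, configured to analyze the network resources required by a task that performs cross-cluster reads and writes, so as to obtain the network resources required by the task to perform the reads and the writes across clusters;
    a scheduling module, configured to schedule the task according to the network resources required by the reads and writes.
  11. The task scheduling apparatus according to claim 10, characterized in that
    the scheduling module is specifically configured to schedule the task to the target cluster where the dependent data that is read resides, if the read operation requires more network resources than the write operation.
  12. The task scheduling apparatus according to claim 10, characterized in that the analysis module comprises:
    an obtaining unit, configured to obtain, from history records and for each said task, the data volume of input data and the data volume of output data;
    a calculating unit, configured to calculate, for each said task, an input-output ratio that indicates the proportion between the network resources required for reading and for writing, wherein the input-output ratio is equal to the ratio of the data volume of the input data to the data volume of the output data.
  13. The task scheduling apparatus according to claim 12, characterized in that the scheduling module comprises:
    a judging unit, configured to determine whether the task satisfies a preset filter condition, wherein the filter condition comprises: the input-output ratio is greater than a preset first threshold, wherein the first threshold is greater than 1;
    a scheduling unit, configured to schedule the task to the target cluster where the dependent data read by the task resides, if the filter condition is satisfied.
  14. The task scheduling apparatus according to claim 13, characterized in that the scheduling module further comprises:
    an identification unit, configured to obtain a task identifier for the task that satisfies the filter condition;
    a generating unit, configured to generate scheduling information that records the task identifier.
  15. The task scheduling apparatus according to claim 14, characterized in that
    the identification unit is further configured to, when a task to be scheduled is received, obtain a task identifier for the task to be scheduled;
    the scheduling unit is specifically configured to, when a task to be scheduled is received, acquire the task identifier that the identification unit obtained for the task to be scheduled; and, if the task identifier of the task to be scheduled matches the task identifier in the scheduling information, schedule the task to be scheduled to the target cluster where the dependent data of the task to be scheduled resides.
  16. The task scheduling apparatus according to claim 14 or 15, characterized in that the identification unit comprises:
    a judging subunit, configured to determine whether the type of the task is SQL;
    a hashing subunit, configured to, if the type of the task is SQL, hash the task to obtain a hash digest and use the hash digest as the task identifier;
    a numbering subunit, configured to, if the type of the task is not SQL, use the number of the task as the task identifier.
  17. The task scheduling apparatus according to claim 13, characterized in that the filter condition further comprises: the data volume of the output data is less than a second threshold;
    and/or the cluster resources occupied are less than a preset quota, wherein the cluster resources occupied comprise at least one of running overhead, running frequency, and cluster load.
  18. The task scheduling apparatus according to any one of claims 10 to 15, characterized in that the network resources comprise at least one of network bandwidth and a network bandwidth-delay product.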
PCT/CN2017/076877 2016-03-25 2017-03-16 Task scheduling method and apparatus Ceased WO2017162086A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
SG11201808118PA SG11201808118PA (en) 2016-03-25 2017-03-16 Method and apparatus for task scheduling
EP17769363.7A EP3413197B1 (en) 2016-03-25 2017-03-16 Task scheduling method and device
US16/072,701 US10922133B2 (en) 2016-03-25 2017-03-16 Method and apparatus for task scheduling

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610179807.5A CN107229517A (zh) 2016-03-25 2016-03-25 Task scheduling method and apparatus
CN201610179807.5 2016-03-25

Publications (1)

Publication Number Publication Date
WO2017162086A1 true WO2017162086A1 (zh) 2017-09-28

Family

ID=59899220

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/076877 Ceased WO2017162086A1 (zh) 2016-03-25 2017-03-16 Task scheduling method and apparatus

Country Status (6)

Country Link
US (1) US10922133B2 (zh)
EP (1) EP3413197B1 (zh)
CN (1) CN107229517A (zh)
SG (2) SG11201808118PA (zh)
TW (1) TWI738721B (zh)
WO (1) WO2017162086A1 (zh)

Also Published As

Publication number Publication date
TWI738721B (zh) 2021-09-11
EP3413197A4 (en) 2019-10-02
TW201737113A (zh) 2017-10-16
US20190034228A1 (en) 2019-01-31
CN107229517A (zh) 2017-10-03
EP3413197B1 (en) 2022-11-30
EP3413197A1 (en) 2018-12-12
SG10202009481XA (en) 2020-11-27
US10922133B2 (en) 2021-02-16
SG11201808118PA (en) 2018-10-30

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 2017769363

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2017769363

Country of ref document: EP

Effective date: 20180903

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 11201808118P

Country of ref document: SG

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17769363

Country of ref document: EP

Kind code of ref document: A1