Detailed Description
The principles and spirit of the present application will be described with reference to a number of exemplary embodiments. It should be understood that these embodiments are given solely for the purpose of enabling those skilled in the art to better understand and to practice the present application, and are not intended to limit the scope of the present application in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As will be appreciated by one skilled in the art, embodiments of the present description may be embodied as a system, an apparatus, a method, or a computer program product. Accordingly, the present disclosure may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.
An embodiment of the present specification provides a data processing aging test method. FIG. 1 is a flow diagram illustrating a data processing aging test method in one embodiment of the present disclosure. Although the present specification provides method operational steps or apparatus configurations as illustrated in the following examples or figures, more or fewer operational steps or modular units may be included in the methods or apparatus based on conventional or non-inventive efforts. For steps or structures that have no logically necessary causal relationship, the execution sequence of the steps or the module structure of the apparatus is not limited to that described in the embodiments and shown in the drawings. When the described method or module structure is applied in an actual device or end product, it can be executed sequentially or in parallel according to the embodiments or the drawings (for example, in a parallel processor or multi-thread processing environment, or even in a distributed processing environment).
Specifically, as shown in fig. 1, the data processing aging test method provided by an embodiment of the present specification may include the following steps:
Step S101: acquiring job run time data, job dependency data, and a change job set of a plurality of jobs.
The method in the embodiment of the specification can be applied to a job test server. The job test server may obtain job run time data, job dependency data, and a change job set for a plurality of jobs. The plurality of jobs comprise data processing performed on specified data according to preset logic. For example, the plurality of jobs may be a plurality of big data jobs that need to be executed to complete one or more target tasks. The preset logic may be data processing logic determined according to preset business logic. The specified data may be business data of a bank, etc. Job run time data may be used to characterize the time taken to execute a job. Job dependency data may be used to characterize dependencies between the plurality of jobs. The change job set may be a set of jobs that are newly added or modified when a version is updated.
Step S102: determining, according to the job run time data and the job dependency data, key jobs among the plurality of jobs that influence the overall job timeliness.
The job test server can determine, according to the acquired job run time data and job dependency data, the key jobs among the plurality of jobs that influence the overall job timeliness. In particular, a key job that affects the overall job timeliness may be a job on a critical path. The critical path may refer to the logic path with the longest delay from input to output. In a big data job, the critical path may be the longest job execution path from the beginning of execution to the end of execution. According to the job run time data and the job dependency data, the key jobs that influence the overall job timeliness can be determined.
Step S103: matching the key jobs with the change job set to obtain change key jobs.
After the key jobs are determined, the key jobs may be matched with the change job set to obtain the change key jobs. That is, a key job that exists in the change job set is taken as a change key job. For example, if the key jobs include job A, job C, job D, and job E, and the change job set includes job C, job E, job F, and job G, the change key jobs are job C and job E.
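The matching step amounts to intersecting the key jobs with the change job set. A minimal Python sketch is given below, using the job names from the example above; the function name `match_change_key_jobs` and the use of plain strings as job identifiers are assumptions made for illustration.

```python
def match_change_key_jobs(key_jobs, change_job_set):
    """Return the key jobs that also appear in the change job set.

    The order of the key jobs is preserved so that the result follows
    the order in which the key jobs were determined.
    """
    changed = set(change_job_set)
    return [job for job in key_jobs if job in changed]


key_jobs = ["A", "C", "D", "E"]
change_job_set = ["C", "E", "F", "G"]
print(match_change_key_jobs(key_jobs, change_job_set))  # ['C', 'E']
```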
Step S104: performing an aging test on the change key jobs, and identifying, according to the aging test result, whether the change key jobs have an aging risk.
After the change key job is obtained, an aging test is performed on the change key job to obtain an aging test result. The aging test result may include run-time aging data of the change key job. After the aging test result is obtained, whether the change key job has an aging risk can be identified according to the aging test result.
According to the method in the above embodiment, the key jobs influencing the overall job timeliness can be determined, and the key jobs can be matched with the change job set to obtain the change key jobs. Because the aging test is performed only on the change key jobs, the aging test range is greatly reduced, and the change key jobs with an aging risk can be identified by verifying the aging test results. The method has a high degree of automation, can accurately locate the change key jobs with an aging risk, judges the influence on the overall timeliness, shortens the verification period, and greatly improves the aging test efficiency of big data jobs.
In some embodiments of the present description, obtaining job run time data, job dependency data, and a change job set for a plurality of jobs may include: collecting job run time data of the plurality of jobs from an operation and maintenance platform; performing data lineage analysis on the data processing logic in the job scripts of the plurality of jobs to generate job dependency data of the plurality of jobs; and querying the job code submission records in a job version library to obtain the change job set of the plurality of jobs.
Specifically, the job run time data of the plurality of jobs can be collected from the operation and maintenance platform in quasi-real time through a continuous integration task; the data is then formatted and the units of time are unified. The job scripts of the plurality of jobs can be obtained from the version library, and the direction of data flow is derived by performing data lineage analysis on the data processing logic in the job scripts, so that the job dependency data is generated automatically. The change job set may be obtained by querying the code submission records in the version library. Further, the obtained job run time data, job dependency data, and change job set may be saved to a database. The database may be a relational database such as MySQL or Oracle. By processing the source data in this way, the required job run time data, job dependency data, and change job set can be obtained.
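As an illustration of the formatting step, the following minimal Python sketch unifies run times reported in mixed units into seconds. The record layout `(job_id, value, unit)`, the unit labels, and the function names are assumptions made for this example; the text does not specify how the operation and maintenance platform reports run times.

```python
# Hypothetical unit labels; a real platform may report other units.
UNIT_SECONDS = {"s": 1.0, "min": 60.0, "h": 3600.0}


def to_seconds(value, unit):
    """Convert a duration expressed in the given unit to seconds."""
    return float(value) * UNIT_SECONDS[unit]


def normalize_run_times(records):
    """Map each job id to its run time in seconds.

    records is an iterable of (job_id, value, unit) tuples.
    """
    return {job_id: to_seconds(value, unit) for job_id, value, unit in records}


raw = [("A", 2, "min"), ("B", 90, "s"), ("C", 0.5, "h")]
print(normalize_run_times(raw))  # {'A': 120.0, 'B': 90.0, 'C': 1800.0}
```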
In some embodiments of the present description, determining, from the job run time data and the job dependency data, the key jobs among the plurality of jobs that influence the overall job timeliness may include: constructing the plurality of jobs into an AOE network according to the job run time data and the job dependency data, wherein each job in the plurality of jobs is used as a vertex of the AOE network, the job dependencies among the jobs are used as directed edges of the AOE network, and the job run time of each job is used as a weight of the directed edges; and determining, based on the AOE network, the key jobs among the plurality of jobs that influence the overall job timeliness.
An AOE network (Activity On Edge network) is a commonly used weighted directed graph, which can be used to estimate the shortest duration of a project and to determine which activities are key to the progress of the project. Referring to fig. 2, a schematic diagram of an AOE network in an embodiment of the present application is shown. In fig. 2, circles represent vertices in the AOE network, arrows represent directed edges in the AOE network, and the values labeled on the directed edges represent the weights of the directed edges. Vertices represent events, directed edges represent activities, and the weights on directed edges generally represent the durations of the activities. The event represented by a vertex can occur only after the activities represented by all directed edges entering that vertex have ended. Only after the event represented by a vertex occurs can the activities represented by the directed edges leaving that vertex begin. The time required to complete the entire project depends on the longest path length from source to sink, i.e. the sum of the durations of all activities on this path. The path with the longest path length is called the critical path, and the activities on the critical path are called critical activities. In FIG. 2, the critical path is A-B-C-H-J-K and the critical activities include A, B, C, H, J and K.
Multiple jobs may be built into an AOE network based on job run time data and job dependency data. Each of the plurality of jobs may be a vertex of the AOE network, the job dependency between the plurality of jobs may be a directed edge of the AOE network, and the job run time of each of the plurality of jobs may be a weight of the directed edge. After the AOE network is obtained, key jobs which influence the overall aging of the jobs in the plurality of jobs can be determined based on the AOE network. By building multiple jobs into an AOE network, critical jobs can be determined based on the AOE network, and the process is simple and efficient.
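One possible in-memory representation of such a network is an adjacency map from each job to its weighted outgoing edges. The Python sketch below is illustrative only: the function name `build_aoe_network` and the input layouts are assumptions, and because the text does not state which endpoint's run time is placed on an edge, the sketch assumes that the edge from an upstream job to a downstream job carries the run time of the downstream job.

```python
def build_aoe_network(run_times, dependencies):
    """Build an adjacency map: job -> list of (successor, weight).

    run_times    maps each job id to its run time.
    dependencies is a list of (upstream, downstream) pairs.
    Assumption: the edge weight is the run time of the downstream job.
    """
    graph = {job: [] for job in run_times}
    for upstream, downstream in dependencies:
        if upstream not in run_times or downstream not in run_times:
            raise KeyError(f"unknown job in dependency {upstream} -> {downstream}")
        graph[upstream].append((downstream, run_times[downstream]))
    return graph


run_times = {"A": 3, "B": 2, "C": 4}
dependencies = [("A", "B"), ("A", "C"), ("B", "C")]
print(build_aoe_network(run_times, dependencies))
# {'A': [('B', 2), ('C', 4)], 'B': [('C', 4)], 'C': []}
```

Keeping every job as a key, even when it has no outgoing edges, lets later passes over the graph treat sinks and isolated jobs uniformly.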
In some embodiments of the present description, determining, based on the AOE network, the key jobs among the plurality of jobs that influence the overall job timeliness may include: determining whether a cyclic graph exists in the AOE network; and, in the case where no cyclic graph exists in the AOE network, determining the key jobs among the plurality of jobs that influence the overall job timeliness based on the AOE network.
Specifically, after the AOE network is constructed, whether a cyclic graph exists in the AOE network can be determined according to a topological sorting algorithm. A cyclic graph refers to a path in the AOE network whose source point and sink point coincide, i.e. a path that starts and ends at the same vertex. In the case where no cyclic graph exists in the AOE network, the key jobs among the plurality of jobs that influence the overall job timeliness are determined based on the AOE network. In this way, the key jobs are determined and the subsequent steps are executed only when no cyclic graph exists in the AOE network, which avoids the waste of resources caused by executing the subsequent steps when the AOE network does not conform to the data processing logic.
In some embodiments of the present specification, after determining whether a cyclic graph exists in the AOE network, the method may further include: in the event that a cyclic graph is determined to be present in the AOE network, a plurality of jobs are determined to be non-compliant with the data processing logic and a notification message is generated for transmission to the user.
Specifically, in the case where it is determined that a cyclic graph exists in the AOE network, it is determined that a plurality of jobs do not conform to the data processing logic, and a notification message is generated and sent to the user. After the user receives the notification message, the user may make troubleshooting adjustments to the job based on the notification message. By the mode, the data processing logic of the operation can be checked in time, and the accuracy and the efficiency are improved.
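A cycle check based on topological sorting can be sketched in Python as follows. The sketch uses Kahn's algorithm, which is one common topological sorting approach; the text does not name a specific algorithm. The adjacency-map layout, the function names, and the wording of the notification message are assumptions made for illustration.

```python
from collections import deque


def has_cycle(graph):
    """Return True if the directed graph contains a cycle (Kahn's algorithm).

    graph maps each job to a list of (successor, weight) pairs; every
    successor is assumed to appear as a key of graph as well.
    """
    indegree = {job: 0 for job in graph}
    for job in graph:
        for successor, _weight in graph[job]:
            indegree[successor] += 1

    queue = deque(job for job, degree in indegree.items() if degree == 0)
    visited = 0
    while queue:
        job = queue.popleft()
        visited += 1
        for successor, _weight in graph[job]:
            indegree[successor] -= 1
            if indegree[successor] == 0:
                queue.append(successor)

    # Jobs on a cycle never reach indegree 0, so they are never visited.
    return visited != len(graph)


def check_job_model(graph):
    """Return a (hypothetical) notification message if a cycle exists, else None."""
    if has_cycle(graph):
        return "Job dependencies contain a cycle and do not conform to the data processing logic."
    return None


acyclic = {"A": [("B", 1)], "B": [("C", 2)], "C": []}
cyclic = {"A": [("B", 1)], "B": [("C", 2)], "C": [("A", 3)]}
print(has_cycle(acyclic), has_cycle(cyclic))  # False True
```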
In some embodiments of the present description, determining, based on the AOE network, the key jobs among the plurality of jobs that influence the overall job timeliness may include: performing topological sorting starting from a source point of the AOE network, and calculating the earliest start time of each job in the plurality of jobs; performing topological sorting starting from a sink of the AOE network, and calculating the latest start time of each job in the plurality of jobs; determining whether the earliest start time and the latest start time of each of the plurality of jobs are equal; and determining a job whose earliest start time is equal to its latest start time as a key job influencing the overall job timeliness.
Specifically, according to the AOE network, topological sorting may be performed from the source point, and the earliest start time of each job is calculated and added to a set of earliest job start times. Referring to fig. 3, a flowchart for calculating the earliest start time of each job in the embodiment of the present application is shown. As shown in fig. 3, calculating the earliest start time of a job may include the following steps: (1) initializing the set of earliest job start times, and uniformly assigning 0 as the earliest start time of all jobs; (2) starting from the source point; (3) performing topological sorting according to the AOE network; (4) obtaining the run time $T_i$ of the current job, i.e. the job run time attribute of the current vertex; (5) obtaining the earliest start time $ET_{i-1}$ of the preceding job from the set; (6) calculating the earliest start time of the current job: $ET_i = \max(ET_{i-1} + T_i)$, where, if there are several preceding jobs, the maximum of the calculation results is taken as the earliest start time of the job; (7) adding the earliest start time $ET_i$ of the current job to the set of earliest start times; (8) judging whether the vertex is the sink, and if so, executing (9), and if not, returning to (3) to continue the topological sorting with the next vertex; (9) obtaining the set of earliest start times of all jobs.
According to the AOE network, topological sorting can be performed from the sink, and the latest start time of each job is calculated and added to a set of latest job start times. The calculation of the latest start time of a job is the reverse of the calculation of the earliest start time of a job, and its steps may refer to the steps for calculating the earliest start time described above. Derived inversely from the earliest start time, the calculation formula of the latest start time is $LT_i = \min(LT_{i+1} - T_i)$.
The earliest start time and the latest start time of each job are acquired from the set of earliest job start times and the set of latest job start times; if the two are equal, the job is a key job. Because each job can be reached through various combinations of preceding paths, the time at which a job can start varies: the earliest start time is the earliest moment at which the job can start, and the latest start time is the latest moment at which it can start without delaying the overall completion. If the earliest start time and the latest start time of a vertex are equal, the vertex has no slack time, and the corresponding job is a key job. Referring to fig. 4, a schematic diagram of an AOE network in an embodiment of the present application is shown. In fig. 4, a circle represents a job in the AOE network, the characters on the left side of the circle represent the number of the job, the numerical value on the upper right side of the circle represents the earliest start time of the job, and the numerical value on the lower right side of the circle represents the latest start time of the job. That is, FIG. 4 shows an AOE network updated with the earliest and latest start times of the jobs. From FIG. 4, it can be readily determined that the key jobs include: job V1, job V4, job V6, job V5, job V7, job V9, and job V10. In this way, topological sorting can be performed based on the AOE network to obtain the earliest start time and the latest start time of each job, so that the key jobs can be determined quickly.
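The forward pass, the backward pass, and the equality test described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the embodiment's implementation: the adjacency-map layout, the function names, and the convention that the weight on the edge from one job to the next supplies the run time term in the formulas are all assumptions, and the small example graph is invented rather than taken from fig. 4.

```python
from collections import deque


def topological_order(graph):
    """Return the jobs in topological order (Kahn's algorithm)."""
    indegree = {job: 0 for job in graph}
    for job in graph:
        for successor, _weight in graph[job]:
            indegree[successor] += 1
    queue = deque(job for job in graph if indegree[job] == 0)
    order = []
    while queue:
        job = queue.popleft()
        order.append(job)
        for successor, _weight in graph[job]:
            indegree[successor] -= 1
            if indegree[successor] == 0:
                queue.append(successor)
    if len(order) != len(graph):
        raise ValueError("the job graph contains a cycle")
    return order


def critical_jobs(graph):
    """Return (earliest, latest, key_jobs) for a weighted acyclic job graph."""
    order = topological_order(graph)

    # Forward pass: earliest start is the maximum over all preceding jobs.
    earliest = {job: 0 for job in graph}
    for job in order:
        for successor, weight in graph[job]:
            earliest[successor] = max(earliest[successor], earliest[job] + weight)

    # Backward pass: latest start is the minimum over all following jobs.
    total = max(earliest.values())
    latest = {job: total for job in graph}
    for job in reversed(order):
        for successor, weight in graph[job]:
            latest[job] = min(latest[job], latest[successor] - weight)

    key_jobs = [job for job in order if earliest[job] == latest[job]]
    return earliest, latest, key_jobs


graph = {
    "V1": [("V2", 3), ("V3", 2)],
    "V2": [("V4", 4)],
    "V3": [("V4", 1)],
    "V4": [],
}
earliest, latest, key_jobs = critical_jobs(graph)
print(earliest)  # {'V1': 0, 'V2': 3, 'V3': 2, 'V4': 7}
print(latest)    # {'V1': 0, 'V2': 3, 'V3': 6, 'V4': 7}
print(key_jobs)  # ['V1', 'V2', 'V4']
```

The equality test uses exact comparison, which is safe for integer run times; with floating-point run times a small tolerance would likely be needed.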
In some embodiments of the present description, performing an aging test on the change key job may include: acquiring a historical version and a current version of the change key job from a job version library; and uploading the job script of the historical version and the job script of the current version to a to-be-executed script directory of a job scheduling server. The job scheduling server uses a job monitoring process to scan the to-be-executed script directory at regular intervals; if a to-be-executed script is found, the job scheduling server runs a job scheduling framework to execute the job script, determines the time consumed in executing the job script as the aging result of the job corresponding to the job script after the execution is completed, stores the aging result in a database, and deletes the corresponding job script from the to-be-executed script directory.
Specifically, the historical version and the current version of the change key job may be acquired from the job version library, and after each is labeled with a version label, the job scripts of the historical version and the current version of the change key job may be uploaded to the to-be-executed script directory of the job scheduling server. The job scheduling server may scan the to-be-executed script directory at regular intervals using the job monitoring process. If a to-be-executed script is found, the job scheduling framework is automatically run to execute the job script. After the execution is completed, the time consumed by the job is taken as the aging result and recorded in an aging memory, and the corresponding script is deleted from the to-be-executed script directory. The aging memory records the aging result of the job and stores it in a database, which may be a relational database such as MySQL or Oracle. In this way, the script to be verified is automatically deployed to the job scheduling server and then automatically executed by the job scheduling framework, and the results are stored in the database. The whole process is automatic, which improves efficiency and saves labor cost.
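One scan of the to-be-executed script directory might look like the Python sketch below. It is a simplified illustration: the function name `scan_once` is hypothetical, and the two callbacks `run_script` and `save_result` stand in for the job scheduling framework and the aging memory, whose interfaces the text does not describe.

```python
import time
from pathlib import Path


def scan_once(pending_dir, run_script, save_result):
    """Execute every script found in pending_dir once and time it.

    run_script(path)          stands in for the job scheduling framework.
    save_result(name, secs)   stands in for the aging memory / database.
    Each script is deleted after it has been executed and recorded.
    """
    results = {}
    for script in sorted(Path(pending_dir).iterdir()):
        if not script.is_file():
            continue
        start = time.perf_counter()
        run_script(script)
        elapsed = time.perf_counter() - start
        save_result(script.name, elapsed)   # record the aging result
        script.unlink()                     # remove the executed script
        results[script.name] = elapsed
    return results
```

A monitoring process could call `scan_once` in a loop with a `time.sleep` between iterations to obtain the timed scanning described above; the scan interval is left to the deployment.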
In some embodiments of the present description, identifying whether the change key job has an aging risk according to the aging test result may include: acquiring the aging result of the historical version of the change key job and the aging result of the current version of the change key job from a database; acquiring a job run time threshold from a job version library; determining the path formed by the key jobs as the critical path, and calculating the length of the critical path; determining the difference between the job run time threshold and the length of the critical path as a job run window time margin; and identifying whether the change key job has an aging risk according to the job run window time margin, the aging result of the historical version of the change key job, and the aging result of the current version of the change key job.
Specifically, the job run time threshold may be obtained from the version library. The aging result of the historical version of the change key job and the aging result of the current version of the change key job may be acquired from the database. The path formed by the key jobs may be determined as the critical path, and the length of the critical path may be calculated. The difference between the job run time threshold and the length of the critical path is determined as the job run window time margin. The aging result of the historical version and the aging result of the current version of the change key job are then compared against the job run window time margin, so as to judge the influence on the overall timeliness and identify the change key jobs with an aging risk. In this way, the influence of each change key job on the overall timeliness can be judged, and the change key jobs with an aging risk can be identified.
In some embodiments of the present description, identifying whether the change key job has an aging risk based on the job run window time margin, the aging result of the historical version of the change key job, and the aging result of the current version of the change key job may include: determining whether the change key job is a newly added job; in the case where the change key job is determined to be a newly added job, judging whether the aging result of the current version of the change key job is greater than the job run window time margin; and in the case where the aging result of the current version of the change key job is judged to be greater than the job run window time margin, determining the change key job as an aging risk job.
Specifically, the change type of the change key job can be determined. If the change key job is a newly added job, the aging result of the current version is compared with the job run window time margin. If the aging result of the current version exceeds the job run window time margin, all the jobs on the critical path cannot be completed within the set job run time window and the overall job timeliness is prolonged, so the job is identified as an aging risk job; otherwise, there is no related risk. In this way, it can be determined whether a newly added change key job prolongs the overall timeliness, and thus whether the job is an aging risk job.
In some embodiments of the present specification, after determining whether the change critical job is an add-on job, the method may further include: determining whether the aging result of the current version of the change key job is larger than the aging result of the historical version of the change key job under the condition that the change key job is determined not to be a new addition job; determining whether a difference between the aging result of the current version of the change key job and the aging result of the historical version of the change key job is greater than a job operation window time margin under the condition that the aging result of the current version of the change key job is determined to be greater than the aging result of the historical version of the change key job; and under the condition that the difference value between the aging result of the current version of the change key job and the aging result of the historical version of the change key job is larger than the time allowance of the job operation window, determining the change key job as the aging risk job.
Specifically, when the change-critical job is not a new job, the change-critical job is a modification job. The aging result of the current version can be compared with the aging result of the historical version, and if the aging result of the current version exceeds the aging result of the historical version, the modification can prolong the overall aging of the operation. In this case, the difference between the aging result of the current version and the aging result of the historical version is compared with the time margin, if the former is greater than the latter, it indicates that all the jobs on the critical path cannot be executed in the set time window, and the job is identified as an aging risk job, otherwise, there is no related risk. By the above method, it is possible to determine whether the modified change key job extends the entire aging, and determine whether the job is an aging risk job.
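The decision rules for newly added and modified change key jobs can be condensed into a short Python sketch. The function names are hypothetical, and representing a newly added job by passing no historical aging result is an assumption made for illustration; the numbers in the example are invented.

```python
def run_window_margin(run_time_threshold, critical_path_length):
    """Job run window time margin = run time threshold - critical path length."""
    return run_time_threshold - critical_path_length


def is_aging_risk(current_result, margin, historical_result=None):
    """Decide whether a change key job is an aging risk job.

    historical_result is None for a newly added job (assumption of this
    sketch); otherwise it is the aging result of the historical version.
    """
    if historical_result is None:
        # Newly added job: its whole run time must fit into the margin.
        return current_result > margin
    if current_result <= historical_result:
        # Modified job that did not get slower: no related risk.
        return False
    # Modified job that got slower: only the increase must fit into the margin.
    return (current_result - historical_result) > margin


margin = run_window_margin(run_time_threshold=120, critical_path_length=100)
print(margin)                        # 20
print(is_aging_risk(25, margin))     # True  (new job, 25 > 20)
print(is_aging_risk(50, margin, 40)) # False (increase of 10 <= 20)
print(is_aging_risk(70, margin, 40)) # True  (increase of 30 > 20)
```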
In some embodiments of the present description, after determining the change critical job as the aging risk job, the method may further include: archiving data related to the aging risk operation to a database; and/or pushing data related to the aging risk operation to a user.
Specifically, after the change key job is determined as an aging risk job, data related to the aging risk job can be archived to a database to facilitate subsequent query and retrieval. After the change key job is determined as an aging risk job, data related to the aging risk job may also be pushed to the user. The data related to the aging risk job may include at least one of: the job identification, the job script, the job run time, and the job dependencies related to the job. Pushing the data related to the aging risk job to the user makes it convenient for the user to check and handle the job as soon as possible.
The above method is described below with reference to a specific example, however, it should be noted that the specific example is only for better describing the present application and is not to be construed as limiting the present application.
Referring to fig. 5, a flow chart of a data processing aging test method in this embodiment is shown. As shown in fig. 5, in this embodiment, the method includes the following steps:
Step 1, source data processing: processing the source data to obtain the job run time, the job dependencies, and the change job set.
Step 2, building a big data job model: mapping the nodes and edges of the AOE network to the big data job schedule, so as to construct a big data job model based on the AOE network.
Step 3, judging a cyclic graph: checking whether a cyclic graph exists in the big data job model; if so, the model does not conform to the data processing logic, and related personnel are automatically notified to check it.
Step 4, establishing a critical path: performing path analysis on the big data job model by using a critical path algorithm, and identifying the critical path and the key jobs that influence the overall timeliness.
Step 5, matching the change job set: matching the key jobs with the change job set to obtain the change key jobs.
Step 6, collecting and deploying change key job scripts: acquiring the current version and the historical version of the change key job from the version library, labeling each with a version label, and uploading the job scripts of the two versions to the to-be-executed script directory of the job scheduling server.
Step 7, automatic precise testing of change key jobs: the job monitoring process scans the to-be-executed script directory of the job scheduling server at regular intervals, and if a to-be-executed script is found, the job scheduling framework is automatically run to execute the job script. After the execution is finished, the time consumed by the job is recorded, and the corresponding script is deleted from the to-be-executed script directory.
Step 8, identifying aging risks: evaluating the run-time aging of the change key job and comparing it with the job run window time margin, thereby judging the influence on the overall timeliness and identifying the change key jobs with an aging risk.
Step 9, automatically pushing the aging risk job to related personnel, and recommending that they optimize the timeliness of the aging risk job or adjust the job run time threshold.
The method in this embodiment is highly automated and can be put into practical use. No testers are required: precise testing of change key jobs is achieved unattended, aging risks are rapidly identified, and testing efficiency is effectively improved. Please refer to the following table, which compares the present solution with traditional manual testing and shows the beneficial effects of the present solution.
TABLE 1
Based on the same inventive concept, the embodiment of the present specification further provides a data processing aging test apparatus, as described in the following embodiments. Because the principle by which the data processing aging test apparatus solves the problem is similar to that of the data processing aging test method, the implementation of the apparatus can refer to the implementation of the method, and repeated parts are not described again. As used hereinafter, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the embodiments below is preferably implemented in software, an implementation in hardware, or a combination of software and hardware, is also possible and contemplated. Fig. 6 is a block diagram of a configuration of a data processing aging test apparatus according to an embodiment of the present disclosure. As shown in fig. 6, the data processing aging test apparatus includes: an obtaining module 601, a determining module 602, a matching module 603, and an identifying module 604. The structure is described below.
The obtaining module 601 is configured to obtain job runtime data, job dependency data, and a change job set of a plurality of jobs, where the plurality of jobs include data processing on specified data according to a preset logic.
The determining module 602 is configured to determine, according to the job runtime data and the job dependency data, a key job that affects overall timeliness of the job among the plurality of jobs.
The matching module 603 is configured to match the key job with the change job set to obtain a change key job.
The identifying module 604 is configured to perform an aging test on the change critical job, and identify whether there is an aging risk in the change critical job according to an aging test result.
The above-mentioned device is described with reference to fig. 7 to 12 in conjunction with a specific embodiment, however, it should be noted that this specific embodiment is only for better describing the present application and should not be construed as an undue limitation to the present application.
Referring to fig. 7, a schematic structural diagram of a data processing aging test apparatus based on an AOE network according to this embodiment is shown. As shown in fig. 7, the apparatus may include: a source data processing module 701, a big data job model building module 702, a critical path construction module 703, a precise test module 704, and an aging risk identification module 705. The modules are explained below.
The source data processing module 701 is used for preprocessing source data and storing it in a database, and can be divided into three parts: 1) acquiring job run time data, namely collecting the job run time data from the operation and maintenance platform in quasi-real time through a continuous integration task; 2) acquiring job dependency data, namely automatically generating the job dependency data by performing data lineage analysis on the data processing logic in the job scripts; and 3) acquiring a change job set, namely obtaining the change job set by querying the code submission records in the version library.
The big data job model building module 702 is used to build a big data job model. A big data job model based on the AOE network is constructed from the job running time data and the job dependency relationship data, wherein each job is used as a vertex, each dependency relationship is used as a directed edge, and the running time is used as the weight on the edge.
The critical path construction module 703 is used to construct a critical path. Path analysis is performed on the big data job model using a critical path algorithm to identify the critical path and the key jobs that affect the overall timeliness. According to the AOE network graph, topological sorting is performed starting from the source point and from the sink point respectively, and the earliest start time and the latest start time of each job are calculated. If the earliest start time of a job is equal to its latest start time, the job is a key job, and the path formed by the key jobs is the critical path.
The precision test module 704 is used for performing precision testing on the change key jobs. A change job set is acquired from the version library and matched with the key job set, and the resulting change key jobs are tested automatically. The historical version and the current version of each change key job are acquired from the version library, and aging verification is automatically performed on both versions through the job scheduling framework.
The aging risk identification module 705 is used for identifying the change jobs with an aging risk. The precision test results are evaluated and compared with the time margin of the job running window (the job running time threshold minus the critical path length), the change jobs with an aging risk are identified, their influence on the overall aging is determined, and the result is pushed to the relevant personnel.
Referring to fig. 8, fig. 8 is a schematic diagram of a source data processing module 800 in an embodiment of the present application. As shown in fig. 8, the source data processing module 800 may include: an information collector 801, a script parser 802, a version controller 803 and an information storage 804.
The information collector 801 can collect job running time data from the operation and maintenance platform in quasi-real time through a continuous integration task, format the data, and unify the time measurement units.
The script parser 802 may obtain the job scripts from the version library, and perform data lineage analysis on the data processing logic in the job scripts to trace the direction of the data flow, thereby automatically generating the job dependency relationship data.
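The lineage analysis performed by the script parser 802 can be illustrated with a minimal sketch. The specification does not state the script language, so the sketch assumes SQL-like job scripts in which the written table follows INSERT INTO and the read tables follow FROM or JOIN. The regular expressions, the function name derive_dependencies and the sample job names are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical patterns: the specification does not fix the script syntax.
TARGET_RE = re.compile(r"\binsert\s+(?:into|overwrite\s+table)\s+([\w.]+)", re.IGNORECASE)
SOURCE_RE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def derive_dependencies(scripts):
    """Map each job name to the set of jobs whose output tables it reads."""
    writers = defaultdict(set)  # table name -> jobs that write it
    reads = {}                  # job name -> tables it reads
    for job, text in scripts.items():
        targets = {t.lower() for t in TARGET_RE.findall(text)}
        sources = {s.lower() for s in SOURCE_RE.findall(text)} - targets
        for table in targets:
            writers[table].add(job)
        reads[job] = sources

    dependencies = {}
    for job, sources in reads.items():
        upstream = set()
        for table in sources:
            upstream |= writers.get(table, set())
        upstream.discard(job)  # a job never depends on itself
        dependencies[job] = upstream
    return dependencies


# Illustrative scripts only; not taken from the specification.
scripts = {
    "load_orders": "INSERT INTO stg.orders SELECT * FROM raw.orders",
    "load_users": "INSERT INTO stg.users SELECT * FROM raw.users",
    "build_report": (
        "INSERT INTO dm.report SELECT * FROM stg.orders o "
        "JOIN stg.users u ON o.uid = u.id"
    ),
}
dependencies = derive_dependencies(scripts)
```

A production parser would more likely rely on a real SQL grammar than on regular expressions, which miss subqueries, comments and dynamically built statements.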
The version controller 803 may obtain the change job set by querying the code submission records in the version repository.
The information storage 804 may store the job running time data obtained by the information collector 801, the job dependency relationship data obtained by the script parser 802, and the change job set obtained by the version controller 803 into a database, where the database may be a relational database such as MySQL or Oracle.
Referring to fig. 9, fig. 9 is a schematic diagram of a big data job model building module 900 in the embodiment of the present application. As shown in FIG. 9, big data job model build module 900 may include: a point set builder 901, an edge set builder 902, an edge weight value annotator 903, and an AOE network-based big data job model builder 904.
The point set builder 901 can obtain information on all jobs, including job names, the applications to which the jobs belong, and the like, from the information storage 804, perform deduplication, and define each job as a vertex (Node), where each vertex includes 3 attributes: a job name, a subsequent job name, and a job running time. After the job name attribute of each vertex is updated according to the job information, all the vertices are added to the point set.
The edge set builder 902 may obtain the dependency relationships of all jobs from the information storage 804, and define each dependency relationship as a link in a directed linked list (Link), each link containing a current vertex and a subsequent vertex. After the subsequent job name attributes of all the vertices are updated according to the job dependency relationships, the links of the directed linked list are added to the edge set.
The edge weight annotator 903 may retrieve the run-time of all jobs from the information store 804 and update the job run-time attributes for all vertices accordingly.
The AOE network-based big data job model builder 904 can build an AOE network graph (Graph) from the outputs of the point set builder 901, the edge set builder 902, and the edge weight value annotator 903. Whether the graph contains a loop is determined by a topological sorting algorithm; if a loop exists, the graph does not conform to the data processing logic, and the relevant personnel are automatically notified to check it.
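The model construction and loop check described for modules 901 to 904 can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the Node class mirrors the three vertex attributes named above, while the function names, the use of Kahn's algorithm for the topological sort, and the sample jobs are assumptions.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    """A vertex with the three attributes named in the description."""
    name: str
    successors: list = field(default_factory=list)
    runtime: float = 0.0


def build_graph(jobs, dependencies, runtimes):
    # Point set: one vertex per distinct job name (deduplicated).
    graph = {name: Node(name) for name in set(jobs)}
    # Edge set: each (upstream, downstream) pair updates the successor list.
    for upstream, downstream in dependencies:
        graph[upstream].successors.append(downstream)
    # Weights: the run time is stored on the vertex, as in the description.
    for name, seconds in runtimes.items():
        graph[name].runtime = seconds
    return graph


def topological_order(graph):
    """Return a topological order, or raise ValueError if a loop exists."""
    indegree = {name: 0 for name in graph}
    for node in graph.values():
        for succ in node.successors:
            indegree[succ] += 1
    queue = deque(sorted(name for name, deg in indegree.items() if deg == 0))
    order = []
    while queue:
        name = queue.popleft()
        order.append(name)
        for succ in graph[name].successors:
            indegree[succ] -= 1
            if indegree[succ] == 0:
                queue.append(succ)
    if len(order) != len(graph):
        # Some vertices never reached in-degree 0, so a loop exists.
        raise ValueError("job dependency graph contains a loop")
    return order


# Illustrative data only.
jobs = ["A", "B", "C", "D", "A"]
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
runtimes = {"A": 10, "B": 20, "C": 5, "D": 15}
graph = build_graph(jobs, edges, runtimes)
order = topological_order(graph)
```

In this sketch the ValueError stands in for the automatic notification to the relevant personnel.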
Referring to fig. 10, fig. 10 is a schematic diagram of a critical path constructing module 1000 in an embodiment of the present application. As shown in fig. 10, the critical path construction module 1000 may include a job earliest start time set builder 1001, a job latest start time set builder 1002, a critical job builder 1003, and a critical path builder 1004.
The job earliest start time set builder 1001 may perform topological sorting from the source point according to the AOE network graph, calculate the earliest start time of each job, and add it to the job earliest start time set. Specifically, the job earliest start time set builder 1001 may perform the following steps: (1) initializing the job earliest start time set, and uniformly assigning the earliest start time of all jobs as 0; (2) starting from the source point; (3) performing topological sorting according to the AOE network; (4) obtaining the running time $T_i$ of the current job, i.e., the job running time attribute of the current vertex; (5) obtaining the earliest start time $ET_{i-1}$ of the preceding job from the set; (6) calculating the earliest start time of the current job: $ET_i = \max(ET_{i-1} + T_i)$, where, if there are several preceding jobs, the maximum of the calculation results is taken as the earliest start time of the job; (7) adding the earliest start time $ET_i$ of the current job to the job earliest start time set; (8) judging whether the vertex is the sink point, and if so, executing (9), otherwise returning to (3) to perform topological sorting on the next vertex; (9) obtaining the set of earliest start times of all jobs.
The job latest start time set builder 1002 may perform topological sorting from the sink point according to the AOE network graph, calculate the latest start time of each job, and add it to the job latest start time set. The process is the reverse of that of the job earliest start time set builder 1001.
The critical job builder 1003 may acquire the earliest start time and the latest start time of each job from the job earliest start time set builder 1001 and the job latest start time set builder 1002, respectively; if the two are equal, the job is a key job.
The critical path builder 1004 may acquire the key jobs from the critical job builder 1003; the path formed by the key jobs is the critical path.
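The forward pass, backward pass and key job test can be sketched together as follows. The sketch assumes the common critical path convention in which a job may start only after all of its preceding jobs have finished, so the earliest start adds the run time of the preceding job; this is an interpretation and may differ from the exact formula used in the embodiment. The function name and sample data are hypothetical, and graphlib requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_jobs(runtime, successors):
    """Return (earliest, latest, key jobs, critical path length)."""
    predecessors = {job: [] for job in runtime}
    for job, nexts in successors.items():
        for nxt in nexts:
            predecessors[nxt].append(job)

    # static_order() raises graphlib.CycleError if the graph has a loop.
    order = list(TopologicalSorter(predecessors).static_order())

    # Forward pass from the source: earliest start of every job.
    earliest = {job: 0 for job in runtime}
    for job in order:
        for prev in predecessors[job]:
            earliest[job] = max(earliest[job], earliest[prev] + runtime[prev])
    length = max(earliest[job] + runtime[job] for job in runtime)

    # Backward pass from the sink: latest start that does not delay the end.
    latest = {}
    for job in reversed(order):
        finish_by = min(
            (latest[nxt] for nxt in successors.get(job, [])), default=length
        )
        latest[job] = finish_by - runtime[job]

    # A job with no slack (earliest == latest) is a key job.
    key = [job for job in order if earliest[job] == latest[job]]
    return earliest, latest, key, length


# Illustrative data only: run times in minutes.
runtime = {"A": 10, "B": 20, "C": 5, "D": 15}
successors = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
earliest, latest, key, length = critical_jobs(runtime, successors)
```

Here job C can start anywhere between minute 10 and minute 25 without delaying D, so it is not a key job, whereas A, B and D have no slack and form the critical path. With fractional run times, the equality test would normally be replaced by a comparison within a small tolerance.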
Referring to fig. 11, a schematic diagram of a precision test module 1100 in an embodiment of the present application is shown. As shown in fig. 11, the precision test module 1100 may include a change matcher 1101, a script collector 1102, a job scheduler 1103, and an aging memory 1104.
The change matcher 1101 may acquire the change job set from the information storage 804, acquire the key jobs from the critical job builder 1003, and match the two to obtain the change key jobs.
The script collector 1102 may obtain the historical version and the current version of each change key job from the version library, label the job scripts of the 2 versions with their respective version labels, and upload them to the to-be-executed script directory of the job scheduling server.
The job scheduler 1103 may periodically scan the to-be-executed script directory of the job scheduling server through a job monitoring process, and if a to-be-executed script is found, automatically run the job scheduling framework to execute the job script. After the execution is finished, the elapsed running time is recorded as aging data in the aging memory 1104, and the corresponding script in the to-be-executed script directory is deleted.
The aging memory 1104 can record the running time of each job and store it into a database, where the database may be a relational database such as MySQL or Oracle.
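One scan of the to-be-executed script directory can be sketched as follows. The sketch is only illustrative: the file naming convention job__version.py, the function run_pending_scripts and the injectable run callable are assumptions, and the returned list stands in for the records written to the aging memory 1104. A monitoring process would call the function on a timer.

```python
import subprocess
import sys
import tempfile
import time
from pathlib import Path


def _run_with_interpreter(path):
    # Default runner (assumption): execute the script with the current Python.
    subprocess.run([sys.executable, str(path)], check=True)


def run_pending_scripts(pending_dir, run=_run_with_interpreter):
    """Execute every pending script once, time it, then delete it."""
    results = []
    for script in sorted(Path(pending_dir).glob("*.py")):
        # Hypothetical naming convention: <job name>__<version label>.py
        job, _, version = script.stem.partition("__")
        started = time.perf_counter()
        run(script)
        elapsed = time.perf_counter() - started
        results.append({"job": job, "version": version, "seconds": elapsed})
        script.unlink()  # remove the script from the pending directory
    return results


# Demonstration with a no-op runner so that nothing is actually executed.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "daily_report__history.py").write_text("print('old')\n")
    Path(tmp, "daily_report__current.py").write_text("print('new')\n")
    timings = run_pending_scripts(tmp, run=lambda path: None)
    leftover = list(Path(tmp).iterdir())
```

Passing the runner as a parameter keeps the scan logic independent of the job scheduling framework, which the specification does not name.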
Referring to fig. 12, a schematic diagram of an aging risk identification module 1200 in an embodiment of the present application is shown. As shown in fig. 12, the aging risk identification module 1200 may include a time margin calculator 1201, an aging risk identifier 1202, and a risk pusher 1203.
The time margin calculator 1201 may obtain the configured job running time threshold from the version library, obtain the critical path length from the critical path builder 1004, and calculate the time margin of the job running window by subtracting the critical path length from the job running time threshold.
The aging risk identifier 1202 may obtain and evaluate the aging results from the aging memory 1104, compare them with the job running window time margin obtained by the time margin calculator 1201, determine the influence on the overall aging, and identify the change jobs with an aging risk. In particular, the aging risk identifier 1202 may be configured to perform the following steps. First, the job change type is determined. If the job is a newly added job, the aging result of the current version is compared with the time margin: if the aging result of the current version exceeds the time margin, all the jobs on the critical path cannot be executed within the set time window and the overall job aging is prolonged, so the job is identified as an aging risk job; otherwise, there is no related risk. If the job is a modified job, the aging result of the current version is compared with the aging result of the historical version: if the aging result of the current version exceeds that of the historical version, the modification prolongs the overall job aging, and the difference between the two aging results is then compared with the time margin; if the difference is greater than the time margin, all the jobs on the critical path cannot be executed within the set time window, and the job is identified as an aging risk job; otherwise, there is no related risk. Finally, the aging risk jobs are stored in a database for filing.
The risk pusher 1203 may automatically push the aging risk jobs obtained by the aging risk identifier 1202 to the relevant personnel by mail, and recommend that the relevant personnel perform aging optimization on the aging risk jobs or adjust the job running time threshold.
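The decision rule of the time margin calculator 1201 and the aging risk identifier 1202 can be sketched as follows. The first branch is taken here to be the newly added job case, which is an interpretation of the description; the function names, the change type labels "new" and "modified", and the sample figures are hypothetical.

```python
def time_margin(runtime_threshold, critical_path_length):
    """Time margin of the job running window."""
    return runtime_threshold - critical_path_length


def has_aging_risk(change_type, current, margin, history=None):
    """Return True if a change key job is an aging risk job.

    change_type is "new" or "modified" (labels assumed for illustration);
    all durations use the same time unit.
    """
    if change_type == "new":
        # A new job adds its whole run time to the critical path.
        return current > margin
    if change_type == "modified":
        if history is None:
            raise ValueError("a modified job needs a historical aging result")
        # Only the increase over the historical version consumes the margin.
        increase = current - history
        return increase > 0 and increase > margin
    raise ValueError(f"unknown change type: {change_type!r}")


# Illustrative figures in seconds: 1-hour window, 55-minute critical path.
margin = time_margin(3600, 3300)
new_job_risk = has_aging_risk("new", current=400, margin=margin)
small_regression = has_aging_risk("modified", current=350, margin=margin, history=100)
large_regression = has_aging_risk("modified", current=450, margin=margin, history=100)
```

With a margin of 300 seconds, the new job taking 400 seconds is flagged, the modified job that slows down by 250 seconds is not, and the one that slows down by 350 seconds is.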
From the above description, it can be seen that the embodiments of the present specification achieve the following technical effects: the key jobs that affect the overall job timeliness can be determined and matched with the change job set to obtain the change key jobs, and only the change key jobs are tested, which greatly narrows the test range; by evaluating the aging test results, the change key jobs with an aging risk can be identified. The method is highly automated, can accurately locate the change key jobs with an aging risk and determine their influence on the overall aging, shortens the verification cycle, and greatly improves the test efficiency of big data jobs.
The embodiment of the present specification further provides a computer device, which may specifically refer to a schematic structural diagram of a computer device based on the data processing aging test method provided by the embodiment of the present specification, shown in fig. 13, where the computer device may specifically include an input device 131, a processor 132, and a memory 133. Wherein the memory 133 is used to store processor executable instructions. The processor 132, when executing the instructions, implements the steps of the data processing aging test method described in any of the embodiments above.
In this embodiment, the input device may be one of the main apparatuses for information exchange between a user and a computer system. The input device may include a keyboard, a mouse, a camera, a scanner, a light pen, a handwriting input board, a voice input device, etc.; the input device is used to input raw data and a program for processing the data into the computer. The input device can also acquire and receive data transmitted by other modules, units and devices. The processor may be implemented in any suitable way. For example, the processor may take the form of a microprocessor or processor and a computer-readable medium that stores computer-readable program code (e.g., software or firmware) executable by the (micro) processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, an embedded microcontroller, and so forth. The memory may in particular be a memory device used in modern information technology for storing information. The memory may include multiple levels: in a digital system, any device that can store binary data may serve as a memory; in an integrated circuit, a circuit that has a storage function but no physical form is also called a memory, such as a RAM or a FIFO; in a system, a storage device in physical form is also called a memory, such as a memory bank or a TF card.
In this embodiment, the functions and effects of the specific implementation of the computer device can be explained in comparison with other embodiments, and are not described herein again.
The present specification further provides a computer storage medium based on the data processing aging test method, and the computer storage medium stores computer program instructions, and when the computer program instructions are executed, the steps of the data processing aging test method in any of the above embodiments are implemented.
In this embodiment, the storage medium includes, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Cache (Cache), a Hard Disk Drive (HDD), or a Memory Card (Memory Card). The memory may be used to store computer program instructions. The network communication unit may be an interface for performing network connection communication, which is set in accordance with a standard prescribed by a communication protocol.
In this embodiment, the functions and effects specifically realized by the program instructions stored in the computer storage medium can be explained by comparing with other embodiments, and are not described herein again.
It will be apparent to those skilled in the art that the modules or steps of the embodiments of the present specification described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed over a network of multiple computing devices, and alternatively, they may be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, and in some cases, the steps shown or described may be performed in an order different from that described herein, or they may be separately fabricated into individual integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, embodiments of the present description are not limited to any specific combination of hardware and software.
It is to be understood that the above description is intended to be illustrative, and not restrictive. Many embodiments and many applications other than the examples provided will be apparent to those of skill in the art upon reading the above description. The scope of the application should, therefore, be determined not with reference to the above description, but instead should be determined with reference to the pending claims along with the full scope of equivalents to which such claims are entitled.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and it will be apparent to those skilled in the art that various modifications and variations can be made in the embodiment described herein. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.