EP1782201A2 - Method for controlling a software process, method and system for redistribution or continued operation in a multi-computer architecture

Info

Publication number: EP1782201A2
Authority: EP; European Patent Office
Prior art keywords: application; processes; controller; state; execution
Prior art date: 2004-06-30
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.): Withdrawn

Application number

EP05778898A

Inventor

Marc Vertes

Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)

Meiosys SAS

Original Assignee

Meiosys SAS

Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)

2004-06-30

Filing date

2005-06-22

Publication date

2007-05-09

2005-06-22 Application filed by Meiosys SAS

2007-05-09 Publication of EP1782201A2

Status: Withdrawn

Classifications

- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operations
- G06F11/1479—Generic software techniques for error detection or fault masking
- G06F11/1482—Generic software techniques for error detection or fault masking using middleware or operating system [OS] functionalities
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operations
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1415—Saving, restoring, recovering or retrying at system level
- G06F11/1438—Restarting or rejuvenating
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/202—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
- G06F11/2035—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant without idle spare hardware

Definitions

The present invention relates to a method for managing a software application operating in a multi-computer (cluster) architecture, for example for analyzing or modifying its execution environment, as transparently as possible with respect to this application. It also relates to a method for modifying or adjusting the operation of such an application by using this operation management method to redistribute its processes within a cluster. This redistribution method can in particular be used to modify the distribution of the workload between different machines of a network, or to make the application more reliable by improving its continuity of operation. The invention also relates to a multi-computer system implementing this redistribution method.
The field of the invention is that of networks or clusters of computers collaborating with one another. These clusters are used to run software applications that provide one or more services to users. Such an application can be single-process or multi-process, and can be executed on a single computer or distributed over several computers, for example in the form of a distributed application of the MPI ("Message Passing Interface") type.
An application is run on a computer or group of computers in the cluster, called the primary or operational node, while the other computers in the cluster are called secondary or "stand-by" nodes.
The operation of such clusters shows that reliability problems arise, which may be due to hardware or operating system failures, human errors, or failures of the applications themselves.
Additional libraries can also be integrated during compilation, to add these features permanently to the executable code.
Such libraries can even be interposed between the calls made by the application and the original libraries, as described in patent FR 02/00398, in order to divert these calls to a new library that can be modified during execution.
These methods require intervention at the compilation stage of the application, which is expensive and complex, may require the involvement of the application's designer, and can still be a source of errors or incompatibilities.
An object of the invention is therefore to allow more complete management of an application process, in a manner more transparent to the operation of this application.
This objective is achieved with a method of managing a software application comprising at least a first software process, called the target process, running on at least one computer and in an execution environment comprising at least one execution memory space.
This method comprises an operation of injecting at least one executable instruction into the memory space of the target process, by at least a second software process, called the controller process, external to the application and able to act on the execution of the target process, this executable instruction performing an analysis or modification of the execution environment of this target process.
The injection operation comprises steps of: interrupting the execution of the target process by the controller process;
This operation management method further comprises a combination of the following characteristics:
The step of interrupting the target process can be followed by at least one step of reading and saving the instructions stored in the reassigned zone and/or the state of the execution context of the target process at the moment it is interrupted.
The step of writing the injected instructions may be preceded by a step of writing, in the reassigned zone, data establishing an address mapping between this reassigned zone and another determined memory space, called the mapping zone.
The step of executing the injected instructions may be preceded by a step of writing, in the reassigned zone, data constituting arguments of the injected instructions.
The step of executing the injected instructions may also be preceded by a step of modifying the execution context according to parameters corresponding to the injected instructions.
The step of executing the injected instructions may be followed by a step of reading data stored in the reassigned zone and/or reading the state of the execution context of the target process.
The step of writing the injected instructions may comprise writing at least one execution interrupt instruction in the reassigned zone, after the injected instructions.
Another object of the invention is to make it easier to add to the operation of an application, as transparently as possible for this application, functionalities allowing the analysis, capture or modification of the environment of this application or of the resources it uses.
The invention proposes a method for managing the operation of a software application as above, carrying out an introspection operation on at least two introspected processes, each of these introspected processes using a first resource which itself comprises a pointer designating a second resource, the second resource itself having an attribute that is accessible to said process through said pointer, the method comprising the following steps:
The invention also proposes a method for managing the operation of a software application as above, performing an operation of capturing the state of the target process, called the captured process, and comprising steps of:
The operation management method may in particular carry out a state capture operation on at least two processes of this application, the interruption of these two processes occurring either simultaneously or at points in their respective execution, one of which is calculated from the other.
The capture operation may further comprise steps of: injection, by the controller process into the captured process, of at least one system call instruction that reads, from the inter-process communication agent, at least one item of communication data sent by another process of the application and not yet received by the captured process;
The capture operation may further include steps of:
The invention also proposes a method for managing the operation of a software application as above, carrying out a restore operation in which a controller process restores, from data known as recovery data, the state of at least one process of the software application, called the recovery process.
The restore operation then includes steps of:
The restore operation described above can also be combined with the following characteristics.
The restore operation may further include a step of: injection, by the controller process into the captured process, of at least one system call instruction which, from the recovery data, writes into the inter-process communication agent at least one datum representing communication data intended for the recovery process.
The restore operation may further include a step of: injection, by the controller process into the recovery process, of at least one system call instruction creating or modifying, from the recovery data, at least one inheritance relationship between the recovery process and at least one other process of the application.
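As a rough illustration of the capture and restore of in-flight inter-process data described above, the following C sketch drains the bytes still waiting in a pipe and later replays them into another pipe. In the method described here, these reads and writes would be system calls injected into the captured or recovery process; in this sketch the calling process issues them itself, and a pipe merely stands in for the inter-process agent. The names `PendingData`, `capture_pending`, `restore_pending` and `pipe_roundtrip_demo`, as well as the 4096-byte limit, are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

enum { PENDING_CAP = 4096 };   /* arbitrary limit chosen for this sketch */

/* Hypothetical checkpoint record for bytes sent but not yet received. */
typedef struct {
    unsigned char bytes[PENDING_CAP];
    size_t len;
} PendingData;

/* Drain whatever is currently readable on read_fd, without blocking. */
int capture_pending(int read_fd, PendingData *out) {
    assert(out != NULL);
    int flags = fcntl(read_fd, F_GETFL);
    if (flags == -1 || fcntl(read_fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    out->len = 0;
    while (out->len < PENDING_CAP) {
        ssize_t n = read(read_fd, out->bytes + out->len,
                         PENDING_CAP - out->len);
        if (n > 0) { out->len += (size_t)n; continue; }
        if (n == -1 && errno == EINTR) continue;
        break;   /* 0: writer closed; EAGAIN: nothing left to read */
    }
    return fcntl(read_fd, F_SETFL, flags);   /* put the original flags back */
}

/* Replay the saved bytes into write_fd so that a reader sees them again. */
int restore_pending(int write_fd, const PendingData *in) {
    assert(in != NULL);
    size_t done = 0;
    while (done < in->len) {
        ssize_t n = write(write_fd, in->bytes + done, in->len - done);
        if (n == -1) {
            if (errno == EINTR) continue;
            return -1;
        }
        done += (size_t)n;
    }
    return 0;
}

/* Returns 1 if a message left unread in one pipe reappears in another. */
int pipe_roundtrip_demo(void) {
    static const char msg[] = "in-flight";
    int original[2], clone[2];
    PendingData saved;
    char got[sizeof msg] = {0};
    if (pipe(original) == -1) return -1;
    if (pipe(clone) == -1) { close(original[0]); close(original[1]); return -1; }
    int ok = write(original[1], msg, sizeof msg) == (ssize_t)sizeof msg
          && capture_pending(original[0], &saved) == 0
          && restore_pending(clone[1], &saved) == 0
          && read(clone[0], got, sizeof got) == (ssize_t)sizeof msg
          && memcmp(got, msg, sizeof msg) == 0;
    close(original[0]); close(original[1]);
    close(clone[0]); close(clone[1]);
    return ok;
}
```

Calling `pipe_roundtrip_demo()` exercises both halves: the message is written to one pipe, captured before any reader consumes it, and delivered through a second pipe that plays the role of the restored channel.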
Another object of the invention is thus to be able to move the execution of all or part of this application from one hardware resource to another, for example from one computer to another or from one node to another.
The invention proposes using the above method to carry out a method of replicating at least one process of the application, called the original process, into a clone process, comprising the following steps:
checkpoint data to restore at least one clone process to a state reproducing the state of the original process.
The invention also proposes using the above method to carry out a method of redistributing all or part of a software application, called the redistributed application, executed in a multi-computer architecture (cluster) and comprising at least one process, called the initial process, which performs data processing while being executed at a given moment on at least one computer of the cluster, called the primary or operational node, the other computers of said cluster being called secondary nodes, this redistribution method comprising the following steps: replication of at least one initial process into at least one secondary process running on a secondary node;
Such a redistribution makes it possible in particular to transfer this or that computing task from one node to another within the cluster. It is thus possible to redistribute the workload of the different machines, to obtain a better balancing of this workload within the cluster.
The redistribution method further comprises the following steps: replication of all the processes executed by the operational node into one or more secondary processes executed on at least one secondary node;
The invention also proposes using the above method to carry out a method of suspending a software application comprising at least one process executed on at least one computer, this suspension method comprising the following steps:
checkpoint data to restore one or more clone processes in a state reproducing the state of the set of captured processes.
The restoration step can be carried out on the same machine or on another, at a chosen moment. It is thus possible to facilitate the maintenance or replacement of a machine, in particular when it is not possible to transfer the application to another part of a cluster. It is also possible to facilitate the transfer of an application to one or more other machines, for example machines with which there is no direct digital communication.
Another object is to propose a method for achieving an improvement in the continuity of operation of a software application running in a multi-computer architecture.
This goal is achieved by a method for making reliable the operation of a software application, called a trusted application, executed in a multi-computer architecture (cluster) and providing a specific service, at least one process of this application being executed at a given moment on at least one computer of the cluster, called the primary or operational node, the other computers of said cluster being called secondary nodes.
This reliability method implements a management method described above to perform at least one capture operation and at least one restore operation, and comprises the following steps:
The operation management method according to the invention can associate, selectively or not, capture operations with restore operations to achieve a holistic replication of the state of an application, called the original application, in a clone application.
The replication method described above is then implemented to replicate all the processes and resources of the original application as processes and resources of the clone application.
This continuity-of-operation method can of course update or restore one or more clone processes after the detection of a failure rather than before, or combine both.
The invention also proposes a method for making reliable a software application, called a trusted application, executed in a multi-computer architecture (cluster) and providing a determined service, at least one process of this application, called the process made reliable, being executed at a given moment on at least one computer in the cluster, called the primary or operational node, the other computers of said cluster being called secondary nodes, this reliability method comprising the following steps:
The invention also proposes a multi-computer system implementing the method according to the invention.
An advantage of using a controller process distinct from the process to be managed, that is to say the target process, is in particular that the operations needed for continuity or redistribution of operation can be implemented as operations external to the application, that is to say outside the memory space of the target process.
These external operations are, for example, checkpoint definitions, triggers for capturing or restoring states, analyses or modifications of resource structures, or reads or writes of data in these resources.
This access from the inside allows the management of a process not to depend on the functionality limits specific to these debuggers.
The present invention makes it possible not to be limited, through the list of "debug symbols" of the target application, to the functions already present in this target application.
System calls made by injection can use parameters stored in the registers and at the top of the stack, as is the case with many debuggers.
This injection method also makes it possible to do without access permissions to certain resources, such as stack execution permission, which may be restricted in some operating systems such as SELinux, SUN-Solaris or OpenBSD.
The use of a controller process and of instruction injection makes it possible to implement a checkpoint capture method or a restoration method that is simple and straightforward.
A basic demonstration program providing these replication capabilities for a single process without files or connections can amount to approximately 500 lines of C code.
The restricted and temporary nature of the system call injection method makes it possible to insert only a few instructions in the memory space of the process to be managed, of which nothing remains at the end of the operation. This therefore avoids "polluting" the target process, which is an advantage from the point of view of reliability and maintenance of the application.
The method according to the invention has the advantage of being usable both with a target application using static executable files, that is to say including all the necessary routines, and with one using dynamic executable files, that is to say calling subroutine libraries outside the application.
The method according to the invention allows a redistribution or continuity of operation with little or no intervention outside the user's working space.
The implementation of the checkpointing and recovery operations in itself requires little or no modification of the system (kernel) or addition of system resources (kernel modules).
The fact that the controller process can restore the state of a recovery process without having itself started that process makes it possible to work on an existing recovery process. This allows the management of redistribution or continuity of operation not to interfere with the way a target application or its processes are started, which facilitates, for example, the application of the invention to distributed applications (MPI).
FIG. 1a shows the organization of a cluster executing a software application whose operation is made reliable by a redistribution application implementing a method according to the invention to achieve a complete redistribution;
FIG. 1b shows the organization of a cluster executing a software application whose operation is adjusted by a redistribution application implementing a method according to the invention to achieve a partial redistribution;
FIG. 2 is a symbolic diagram of the progress of a program instruction injection operation carried out by a controller process within a target process;
FIG. 3 is a symbolic diagram of an operation for capturing the state of a process;
FIG. 4 is a symbolic diagram of an operation for restoring a recovery process;
FIG. 5 is a diagram illustrating the structure of two processes using shared or separate file descriptors;
FIG. 6 is a diagram illustrating the progress of a multi-process introspection method using an injection of system calls.
FIGS. 1a and 1b illustrate uses of a replication method according to the invention in an operation redistribution application.
This operation redistribution application is used to redistribute the operation of a software application, called the redistributed application, executed on an operational node OP of a multi-computer architecture, or cluster.
A node can be a single computer within the cluster or include multiple computers working together within the cluster.
The redistributed application includes at least one process, referred to as the original process PCA, working in an execution environment in which it accesses a number of resources of different types. Commonly, these resources comprise: an execution memory space allocated in the working memory of the node OP, where the executed instructions constituting the process are stored; an execution context, including memory registers and different types of state resources such as flags, mutexes, etc.;
Some of the resources available to a process can be distributed across multiple computers or nodes, especially in the case of distributed applications, for example variables stored in shared memory areas, shared files or external databases.
The operation redistribution application is executed on one or more computers of the cluster communicating with the operational node of the application and with at least one secondary node SB.
This redistribution of operation is done by storing, regularly or on an event, at a checkpoint, an instantaneous state of one or more original processes PCA of the redistributed application.
The redistribution application performs a checkpoint capture operation, according to a method described hereinafter.
This checkpoint capture operation uses a method for managing the operation of the redistributed application, described hereinafter, implemented by a temporary controller process PC1 acting on the original process PCA of the redistributed application.
The redistribution application stores a software object, called the checkpoint state, in memory means within the cluster.
Certain resources of the redistributed application, such as databases or files, can also be saved or replicated continuously or in stages, by known means.
The redistribution application performs a complete redistribution of the redistributed application, i.e. of all of its processes and the links between them.
Such a complete redistribution can in particular be used to make the redistributed application reliable, by constituting a backup application that will maintain a certain degree of continuity in the service provided in case of failure of the operational node OP.
The operation redistribution application uses a checkpoint state to perform one or more restores of the redistributed application in the form of at least one backup application, called the recovery application.
A recovery application includes a clone process running on a secondary node SB of the cluster, together with resources providing a state corresponding to the state of the original process PCA when this checkpoint was captured.
This restoration can be done on a regular basis or on an event, and can consist of a complete start with creation of the clone process, also called the recovery process, or of a restore by updating an already existing clone process.
The redistribution application performs an operation of updating the clone process from a checkpoint, according to a method described hereinafter.
This updating operation uses an operation management method, described hereinafter, implemented by a temporary controller process PC2 acting on the clone process of the recovery application by system call injection, as described below.
In the event of a failure affecting the operation of the trusted application on the operational node, the operation redistribution application is notified by a monitoring or failure detection function, according to known means. The operation redistribution application then performs a service failover to the standby application, and the clone process then takes over the role that the original process PCA played before the failure.
The operation redistribution application may also perform an update of the recovery application after the failure, or a complete start of this recovery application followed by an update according to the method of the invention.
Such a complete redistribution can also be used to move an application completely from one node to another, for example to free this node for a hardware intervention.
The redistribution application performs a partial redistribution of the redistributed application, that is to say a replication of only some of its processes and of the links between them, while updating the links that connect them to other processes.
When the operation redistribution application receives a partial redistribution command, it creates a checkpoint state for the process or processes to replicate, or identifies a previously saved checkpoint state for those same processes. For each process to replicate, referred to as the original process PCA, the operation redistribution application creates a clone process PCA' within the secondary node SB to which the original process PCA is to be redistributed.
The operation redistribution application restores the clone process PCA' to the state of the original process PCA at the time the checkpoint was established.
This restoration also includes a restoration, between the various clone processes, of the state of the links that exist between their respective original processes. If the original process PCA has links to another process PCB that has not been replicated, a link in the same state will be created and restored between that other process PCB and the clone process PCA'.
The operation redistribution application will also create, for the clone process PCA', a virtualized version of all or part of the resources used by the original process PCA, or a copy of these resources.
Such virtualization can be applied for example to process identifiers (PIDs), or file descriptor identities.
The operation redistribution application may then delete the original process PCA without disrupting the continuity of the redistributed application or of the services provided.
Such a partial redistribution can in particular be used to adjust the operation of the redistributed application, by moving certain processes to other nodes so as to modify the distribution of the workload within the cluster, for example in order to improve performance.
This workload can be, for example, computation, file accesses, or network communications internal or external to the cluster.
A partial redistribution can also be used to free a node or a communication line within the cluster, for example to carry out interventions on the hardware that constitutes it.
FIG. 2 illustrates more specifically the above-mentioned operation management method.
This method is implemented by a controller process and applied to a process to be managed, or target process, on which it carries out a mechanism for injecting program instructions.
The vertical rectangle represents the execution memory ME containing the instructions executed by the target process.
The group of rectangles on its right represents the working registers R used by this process.
The triangle on its left represents the execution pointer PE of the process within the execution memory.
The controller process takes control of the target process, for example by an "attach" command based on the "ptrace" routine.
The controller process interrupts the execution of the target process, and defines 203 a reassigned zone SA, or "scratch area", within the execution memory of this target process.
The controller process then reads the contents of the reassigned zone SA, the position of the execution pointer PE, and the state of the working registers R, and makes a backup 204 of the initial state of these elements.
The controller process checks 205 that the reassigned zone SA is large enough to perform the following operations. If it is not, it can carry out a mapping 206 of this zone, by known means, to make it correspond to another, larger memory space, called the mapping zone, determined outside the execution memory ME of the target process. This mapping zone can then be used by the target process in place of the reassigned zone. Then 207, the controller process writes the code IJ corresponding to the program instructions to be injected within the reassigned zone SA, and writes a breakpoint instruction at the end of the reassigned zone SA.
The controller process can write in the reassigned zone SA data ARJ corresponding to any arguments that the instructions IJ must use.
The controller process modifies the state of the working registers R to give them the values RIJ corresponding to the execution of the instructions IJ to be injected.
The controller process then sets the execution pointer PE on the first instruction IJ of the injected mechanism and starts the execution of the target process.
The target process then executes the instructions IJ of the injected mechanism, for example system calls performing an analysis or modification of the resource structure of the target process.
The execution of the injected mechanism may produce return data, which will be stored in the reassigned zone SA or in the working registers R, for example the responses returned by the operating system to the system calls included in the injected mechanism.
When the execution pointer PE arrives at the interrupt instruction previously written 207, the target process is interrupted again and hands control back to the controller process.
The controller process then collects the results of the execution of the injected mechanism, in the form of result data read in the reassigned zone SA and in the working registers R, and saves these result data independently of the execution environment of the target process.
The controller process uses the initial state data saved previously 204 to write in the reassigned zone SA and the working registers R and return them to the state they were in at the initial interrupt 202.
The execution memory space is thus restored to the state it was in before the injection of the instructions IJ.
The injection operation can thus be considered temporary or transient, which avoids polluting the target process or the application that uses it.
The controller process can then reposition the execution pointer PE on the instruction that was initially the next to execute, and restart the target process. Once the target process is running again, the controller process releases it from its control, for example by a "detach" instruction or command, based on the "ptrace" routine in a similar way to the "attach" command.
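The sequence described for FIG. 2 (backup, size check, write of the injected code followed by a breakpoint, register setup, execution, collection of results, restoration) can be traced on a toy model. The following C sketch does not use ptrace and does not inject real machine code: the `Target` structure, the three opcodes and the tiny interpreter are assumptions made purely for illustration, with field names borrowed from the reference signs ME, R, PE and SA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { MEM_WORDS = 16, NUM_REGS = 4, SA_START = 8, SA_LEN = 4 };
enum { OP_NOP = 0, OP_ADD = 1, OP_BREAK = 2 };   /* toy opcodes, not a real ISA */

typedef struct {
    uint32_t me[MEM_WORDS];   /* execution memory ME */
    uint32_t r[NUM_REGS];     /* working registers R */
    size_t pe;                /* execution pointer PE */
} Target;

/* Toy interpreter: run from PE until a breakpoint or the end of memory. */
static void run_until_break(Target *t) {
    while (t->pe < MEM_WORDS && t->me[t->pe] != OP_BREAK) {
        if (t->me[t->pe] == OP_ADD) t->r[0] += t->r[1];
        t->pe++;
    }
}

/* Field-by-field comparison of two toy targets. */
int targets_equal(const Target *x, const Target *y) {
    return memcmp(x->me, y->me, sizeof x->me) == 0
        && memcmp(x->r, y->r, sizeof x->r) == 0
        && x->pe == y->pe;
}

/* Inject n opcodes, run them with r0 = a and r1 = b, report r0 in *result,
 * then put the target back exactly as it was.
 * Returns -1 if the scratch area is too small (this sketch has no mapping). */
int inject(Target *t, const uint32_t *code, size_t n,
           uint32_t a, uint32_t b, uint32_t *result) {
    assert(t != NULL && code != NULL && result != NULL);
    Target saved = *t;                     /* backup (whole toy state, for brevity) */
    if (n + 1 > SA_LEN) return -1;         /* size check: code plus breakpoint */
    memcpy(&t->me[SA_START], code, n * sizeof *code);   /* write injected code */
    t->me[SA_START + n] = OP_BREAK;        /* breakpoint right after the code */
    t->r[0] = a;                           /* register values for the injected code */
    t->r[1] = b;
    t->pe = SA_START;                      /* point PE at the first injected opcode */
    run_until_break(t);                    /* target runs until the breakpoint */
    *result = t->r[0];                     /* collect the result data */
    *t = saved;                            /* restore SA, R and PE */
    return 0;
}
```

After `inject` returns, comparing the target with a copy taken beforehand (for instance with `targets_equal`) shows that nothing of the injected code remains, which is the property the text describes as avoiding pollution of the target process.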
FIG. 3 illustrates the use of the operation management method according to the invention to perform an operation of capturing the state of a process, referred to as the captured process, and of its execution environment, by a controller process.
The controller process first takes control of the captured process, for example by an "attach" instruction based on the "ptrace" routine.
The controller process can then interrupt the execution of the captured process in this step and suspend some or all of the resources it uses.
A next step 302 is to perform an introspection of the operating environment of the captured process to establish a list 303 of the resources of this execution environment.
The controller process analyzes the structure of the resources to which it has access.
The controller process establishes a list of instructions to be injected into the captured process to access these resources, for example in the form of a list 305 of system calls and their parameters.
The controller process injects each instruction or group of instructions from this list and collects the result data, according to the operation management method described above. By this injection of system calls, the controller process obtains data 307 representing the structure of resources that were not directly accessible to it.
This step 306 implements a multi-process introspection method with injection of system instructions.
This method performs several injection operations coordinated with each other, applied to several target processes. The injection operations introduce changes in the introspected resource through at least one of these target processes. The results of these operations are then compared with one another to obtain information on the mode of operation of the introspected resource.
The controller process can then capture 308 the contents of these same resources and save them 310 to constitute a checkpoint state 311, i.e., an image of the state of the captured process. The system call injection step 306 can thus also be used to obtain the content or status of certain resources by injecting the corresponding read instructions.
/* The controller process takes control of a process by ptrace. */
/* The process is identified by its process id. */
ptrace(PTRACE_ATTACH, pid, 0, 0);
Program instructions performing an instruction injection to capture the position of the write pointer of a file descriptor opened by the captured process:
int file_pos = PT_LSEEK(pid, fd, 0, SEEK_CUR);
return file_pos;
Figures 5 and 6 illustrate an example of a multi-process introspection method, applied to the analysis of a file descriptor.
The multi-process introspection method is used here to determine whether two file descriptors FDA and FDB, used by two different processes PA and PB and pointing to files FA and FB, are separate or shared descriptors.
A controller process PC1 injects a system call instruction into the first target process PA.
This system call performs a reading ptA0 of the position of the read/write pointer of the file descriptor FDA of this first target process PA.
The controller process PC1 then injects system call instructions into the second target process PB.
One of these system calls first performs a reading ptB0 of the position of the read/write pointer of the file descriptor FDB of this second target process PB.
In step 503, another of these system calls, for example an "lseek" instruction, then modifies the position of this same pointer.
The controller process PC1 then injects a system call instruction into the first target process PA.
This system call performs a new reading ptA1 of the position of the read/write pointer of the file descriptor FDA of this first target process PA.
The controller process PC1 compares the values ptA0 and ptA1 obtained by the two readings of the pointer position of the first descriptor FDA. If the two values differ, the modification made through PB has moved the pointer seen by PA, which indicates that the descriptors are shared; if they are identical, the descriptors are separate.
The controller process PC1 then stores data representing this information.
The controller process PC1 then injects a system call instruction into the second target process PB to return its pointer to its initial position ptB0.
The modified pointer is thus returned to its initial position, and the operation is therefore transparent for both target processes.
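The read, modify, re-read, compare and restore sequence can be sketched in C as below. As a simplifying assumption, both descriptors live in the calling process and `lseek` is called directly instead of being injected into two target processes; the names `descriptors_are_shared` and `demo_shared_check` are hypothetical.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if fda and fdb share one read/write pointer, 0 if they are
 * separate, -1 on error. The pointer of fdb is put back before returning. */
int descriptors_are_shared(int fda, int fdb)
{
    off_t pt_a0 = lseek(fda, 0, SEEK_CUR);   /* first reading on A   */
    off_t pt_b0 = lseek(fdb, 0, SEEK_CUR);   /* initial reading on B */
    if (pt_a0 == (off_t)-1 || pt_b0 == (off_t)-1)
        return -1;

    if (lseek(fdb, pt_b0 + 1, SEEK_SET) == (off_t)-1)  /* modify via B */
        return -1;
    off_t pt_a1 = lseek(fda, 0, SEEK_CUR);   /* second reading on A  */
    if (lseek(fdb, pt_b0, SEEK_SET) == (off_t)-1)      /* restore B  */
        return -1;
    if (pt_a1 == (off_t)-1)
        return -1;
    return pt_a1 != pt_a0;
}

/* Hypothetical demo: a dup()ed descriptor shares its pointer with the
 * original, while a second open() of the same file does not.
 * Returns 0 when both cases are classified as expected. */
int demo_shared_check(void)
{
    char path[] = "/tmp/fdcheckXXXXXX";
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    int dup_fd = dup(fd);
    int other_fd = open(path, O_RDONLY);
    int shared = descriptors_are_shared(fd, dup_fd);
    int separate = descriptors_are_shared(fd, other_fd);
    unlink(path);
    close(fd);
    close(dup_fd);
    close(other_fd);
    return (shared == 1 && separate == 0) ? 0 : -1;
}
```

Moving the pointer by one position is an arbitrary choice; any change that is later undone would serve. The sketch assumes seekable descriptors such as regular files.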
FIG. 4 illustrates the use of the operation management method according to the invention to carry out an operation for updating or restoring a process, called the recovery process, and its execution environment, by a controller process.
This figure represents a restore operation, comprising a part 401, 402, 403 for creating the recovery process.
The controller process triggers this creation by initiating 401 a new process, called the recovery process, under its control (forking technique), then using a "ptrace (TRACEMEM, ...)" instruction before launching its execution.
The recovery process then starts normally, loading 402 its various resources as in a conventional cold start. At this stage begins the actual method for updating the state of a recovery process, that is to say the method that can be used on an already existing recovery process.
This recovery process stops 404 immediately after loading, because of its launch mode, and hands control back to the controller process.
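A launch of this kind can be sketched in C as follows, assuming Linux, where the corresponding ptrace request is named `PTRACE_TRACEME` and a traced process receives SIGTRAP once `execv` has loaded the new program. The function names and the use of `/bin/sh` as the program to load are assumptions made for the example.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start `path` as a traced child. Returns its pid once it has stopped
 * right after loading, or -1 on failure. */
pid_t start_recovery_process(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: ask to be traced by the parent, then load the program. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(126);
        execv(path, argv);
        _exit(127);                 /* only reached if execv failed */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP)
        return pid;                 /* stopped after loading, as expected */
    if (WIFSTOPPED(status)) {       /* unexpected stop: clean up */
        kill(pid, SIGKILL);
        waitpid(pid, NULL, 0);
    }
    return -1;
}

/* Release the stopped process and wait for it to finish.
 * Returns its exit status, or -1 on failure. */
int release_and_wait(pid_t pid)
{
    int status;
    if (ptrace(PTRACE_DETACH, pid, NULL, NULL) == -1)
        return -1;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* Hypothetical demo: load a shell that exits with status 3. */
int demo_recovery_start(void)
{
    char *const argv[] = { "sh", "-c", "exit 3", NULL };
    pid_t pid = start_recovery_process("/bin/sh", argv);
    if (pid == -1)
        return -1;
    return release_and_wait(pid);
}
```

The interval between `start_recovery_process` returning and `release_and_wait` being called corresponds to the window in which a controller could modify the stopped process.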
The controller process begins by taking control 405 of the recovery process, for example by an "attach" instruction based on the "ptrace" routine. The controller process then selects and reads data previously saved and constituting a checkpoint state. From the contents of this checkpoint, the controller process evaluates the changes in structure and content to be made to the execution environment of the recovery process, in its current state, to bring it to the state of the selected checkpoint.
Where it can make these changes directly, the controller process performs them itself.
The controller process prepares a list of system calls, which it injects 408 into the recovery process according to the operation management method of the invention.
This injection is used, for example, to modify the addressing and the mapping of the memory segments used, by injecting one or more mmap system calls.
The same principle is used for all or some of the system resources that must be recreated to reach a state identical to the selected checkpoint state.
These system resources are, for example, resources of the type "file", "socket", "pipe", "timer", "terminal control", etc.
The controller process writes 409 these system resources according to the data of the checkpoint state, to bring the recovery process to the state the captured process was in when the selected checkpoint was established.
The controller process then resumes 410 the execution of the recovery process and releases it 411 from its control, for example by a "detach" command based on the "ptrace" routine, in a similar way to the "attach" command.
The system call injection phase 408 can also be used to write the content or status of certain resources, by injecting the corresponding write instructions.
Since it operates from a process external to the recovery process, this restore operation is much simpler and more efficient than if it had to be carried out by operations provided within the recovery process itself.
C program instructions for a POSIX environment can perform an instruction injection to restore the position of the write pointer of a file descriptor opened by or for the recovery process.
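The effect of such a restoration can be sketched as follows. This is not the patent's injected code: the calls below run in the calling process, and the record type `fd_checkpoint` and the function names are hypothetical. The sketch only shows which `lseek` calls correspond to capturing and to restoring a pointer position.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical checkpoint record for one file descriptor. */
struct fd_checkpoint {
    int   fd;
    off_t pos;
};

/* Capture: read the current pointer position without moving it. */
int capture_fd_position(int fd, struct fd_checkpoint *cp)
{
    off_t pos = lseek(fd, 0, SEEK_CUR);
    if (pos == (off_t)-1)
        return -1;
    cp->fd = fd;
    cp->pos = pos;
    return 0;
}

/* Restore: put the pointer back where the checkpoint recorded it. */
int restore_fd_position(const struct fd_checkpoint *cp)
{
    return lseek(cp->fd, cp->pos, SEEK_SET) == (off_t)-1 ? -1 : 0;
}

/* Hypothetical demo: capture at offset 4, move away, restore.
 * Returns the pointer position after the restore, or -1 on failure. */
long demo_fd_restore(void)
{
    char path[] = "/tmp/fdposXXXXXX";
    struct fd_checkpoint cp;
    long result = -1;
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);
    if (write(fd, "0123456789", 10) == 10 &&
        lseek(fd, 4, SEEK_SET) == 4 &&
        capture_fd_position(fd, &cp) == 0 &&
        lseek(fd, 0, SEEK_SET) == 0 &&      /* pointer moves away */
        restore_fd_position(&cp) == 0)
        result = (long)lseek(fd, 0, SEEK_CUR);
    close(fd);
    return result;
}
```

In a real restore the descriptor number recorded at capture time would have to designate the corresponding descriptor in the recovery process, which this single-process sketch takes for granted.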
Establishing a checkpoint may require capturing the state of many of these processes.
The use of one or more controller processes external to the processes to be captured is an advantage provided by the method according to the invention.
The operation redistribution application performs a capture operation according to the invention on several captured processes, so as to synchronize or coordinate the initial interruption 301 of each of the capture operations and the suspension of the resources concerned.
Some data in transit between several processes may be "frozen" within the inter-process software agent IPC that manages these transmissions, for example the "Inter Process Communication" software object in a Unix-type environment.
The operation redistribution application uses the operation management method according to the invention to inject into each of the interrupted processes the system calls needed to manage these data in transit. This may involve, for example, purging the IPC queues (pipe) of unprocessed data as part of a process state capture operation at a checkpoint, or restoring the same data in the case of a process update.
The capture operation then further comprises an analysis and storage of all the communication data, or packets, that were intended for the captured process but have not yet been received.
Since this inter-process agent is managed by the system, for example in a kernel module in the Unix case, it is advantageous not to have to intervene in the system itself.
The controller process PC1 uses the operation management method according to the invention to inject into the process being captured system calls that request a reading of these communication data in transit. The controller process then retrieves these data and saves them within the checkpoint state.
In a recovery situation, when all the recovery processes are suspended, the controller process PC2 likewise uses the management method according to the invention to inject into each recovery process system calls that write into the inter-process agent IPC the packets in transit that had been stored in the checkpoint state.
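For a pipe, the reading and rewriting of data in transit can be sketched with ordinary POSIX calls as below. The calls run in the calling process rather than being injected, and the function names `drain_pipe`, `refill_pipe` and `demo_pipe_checkpoint` are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Capture side: read up to `cap` unread bytes out of the pipe without
 * blocking. Returns the number of bytes saved, or -1 on error. */
ssize_t drain_pipe(int read_fd, char *buf, size_t cap)
{
    int flags = fcntl(read_fd, F_GETFL);
    if (flags == -1 || fcntl(read_fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    size_t total = 0;
    ssize_t n = 0;
    while (total < cap && (n = read(read_fd, buf + total, cap - total)) > 0)
        total += (size_t)n;
    int saved_errno = errno;
    fcntl(read_fd, F_SETFL, flags);        /* put the original mode back */

    /* EAGAIN only means the pipe is now empty. */
    if (n == -1 && saved_errno != EAGAIN && saved_errno != EWOULDBLOCK)
        return -1;
    return (ssize_t)total;
}

/* Restore side: write the saved bytes back into the pipe. */
int refill_pipe(int write_fd, const char *buf, size_t len)
{
    size_t done = 0;
    while (done < len) {
        ssize_t n = write(write_fd, buf + done, len - done);
        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        done += (size_t)n;
    }
    return 0;
}

/* Hypothetical demo: save 10 bytes in transit, check that the pipe is
 * empty, put them back and read them as the receiver would. */
int demo_pipe_checkpoint(void)
{
    int fds[2];
    char saved[64], check[64];
    int ok = 0;
    if (pipe(fds) == -1)
        return -1;
    if (write(fds[1], "in-transit", 10) == 10) {
        ssize_t n = drain_pipe(fds[0], saved, sizeof saved);
        ok = n == 10
          && drain_pipe(fds[0], check, sizeof check) == 0
          && refill_pipe(fds[1], saved, (size_t)n) == 0
          && read(fds[0], check, sizeof check) == 10
          && memcmp(check, "in-transit", 10) == 0;
    }
    close(fds[0]);
    close(fds[1]);
    return ok ? 0 : -1;
}
```

Draining is destructive for a pipe, so a capture that must leave the application running would also have to write the saved bytes back after storing them.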
Some of these processes may have inheritance relationships with one another. That is, a "child" process may have been created from a "parent" process, and inherit from that relationship certain features or resources of its execution environment, in particular of the "file descriptor" type.
When capturing the processes of an application, the controller process PC1 will use the management method according to the invention to inject into each captured process system calls that analyze its possible inheritance relationships with one or more other processes. The results of these analyses will then be saved within the current checkpoint state.
The controller process PC1 will use the management method according to the invention to inject into each recovery process system calls that recreate the same inheritance relationships that had been stored in the checkpoint state.

Landscapes

Engineering & Computer Science (AREA)
Theoretical Computer Science (AREA)
Quality & Reliability (AREA)
Physics & Mathematics (AREA)
General Engineering & Computer Science (AREA)
General Physics & Mathematics (AREA)
Retry When Errors Occur (AREA)
Stored Programmes (AREA)
Hardware Redundancy (AREA)


Applications Claiming Priority (2)

Application Number	Priority Date	Filing Date	Title
FR0407180A FR2872605B1 (fr)	2004-06-30	2004-06-30	Procede de gestion d'un processus logiciel, procede et systeme de redistribution ou de continuite de fonctionnement dans une architecture multi-ordinateurs
PCT/FR2005/001564 WO2006010812A2 (fr)	2004-06-30	2005-06-22	Procede de gestion d'un processus logiciel, procede et systeme de redistribution ou de continuite de fonctionnement dans une architecture multi-ordinateurs

Publications (1)

Publication Number	Publication Date
EP1782201A2 (de)	2007-05-09

Family

ID=34948448

Family Applications (1)

Application Number	Title	Priority Date	Filing Date
EP05778898A Withdrawn EP1782201A2 (de)	2004-06-30	2005-06-22	Verfahren zur steuerung eines softwareprozesses, verfahren und system für neuverteilung oder fortgesetzten betrieb in einer mehrcomputer-architektur

Country Status (5)

Country	Link
US (1)	US20080307265A1 (de)
EP (1)	EP1782201A2 (de)
CN (1)	CN100530120C (de)
FR (1)	FR2872605B1 (de)
WO (1)	WO2006010812A2 (de)

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number	Priority date	Publication date	Assignee	Title
US7797576B2 (en)	2007-04-27	2010-09-14	International Business Machines Corporation	Checkpoint of applications using UNIX® domain sockets
US7685172B2 (en)	2007-04-27	2010-03-23	International Business Machines Corporation	In-flight file descriptors checkpoint
US8527650B2 (en)	2007-05-21	2013-09-03	International Business Machines Corporation	Creating a checkpoint for modules on a communications stream
US7930327B2 (en)	2007-05-21	2011-04-19	International Business Machines Corporation	Method and apparatus for obtaining the absolute path name of an open file system object from its file descriptor
US7950019B2 (en)	2007-05-21	2011-05-24	International Business Machines Corporation	Method and apparatus for checkpoint and restarting a stream in a software partition
US9384159B2 (en)	2007-05-24	2016-07-05	International Business Machines Corporation	Creating a checkpoint for a software partition in an asynchronous input/output environment
US8127289B2 (en)	2007-06-27	2012-02-28	International Business Machines Corporation	Enabling a third party application to participate in migration of a virtualized application instance
US7792983B2 (en)	2007-07-31	2010-09-07	International Business Machines Corporation	Method and apparatus for checkpoint and restart of pseudo terminals
US8006254B2 (en)	2007-10-04	2011-08-23	International Business Machines Corporation	Bequeathing privilege to a dynamically loaded module
US8495573B2 (en)	2007-10-04	2013-07-23	International Business Machines Corporation	Checkpoint and restartable applications and system services
US8156510B2 (en)	2007-10-04	2012-04-10	International Business Machines Corporation	Process retext for dynamically loaded modules
US7933991B2 (en)	2007-10-25	2011-04-26	International Business Machines Corporation	Preservation of file locks during checkpoint and restart of a mobile software partition
US7933976B2 (en)	2007-10-25	2011-04-26	International Business Machines Corporation	Checkpoint and restart of NFS version 2/version 3 clients with network state preservation inside a workload partition (WPAR)
US9473598B2 (en)	2007-12-18	2016-10-18	International Business Machines Corporation	Network connection failover during application service interruption
US9928349B2 (en) *	2008-02-14	2018-03-27	International Business Machines Corporation	System and method for controlling the disposition of computer-based objects
US8572237B2 (en) *	2008-12-16	2013-10-29	Sap Ag	Failover mechanism for distributed process execution
US7945808B2 (en) *	2009-01-30	2011-05-17	International Business Machines Corporation	Fanout connectivity structure for use in facilitating processing within a parallel computing environment
US20110191627A1 (en) *	2010-01-29	2011-08-04	Maarten Koning	System And Method for Handling a Failover Event
CN102117224B (zh) *	2011-03-15	2013-01-30	北京航空航天大学	一种面向多核处理器的操作系统噪声控制方法
WO2012124841A1 (ko) *	2011-03-15	2012-09-20	현대자동차 주식회사	통신 테스트 장치 및 방법
CN102984184B (zh) *	2011-09-05	2017-09-19	上海可鲁系统软件有限公司	一种分布式系统的服务负载均衡方法及装置
US8782651B2 (en)	2011-09-26	2014-07-15	International Business Machines Corporation	Dynamically redirecting a file descriptor of an executing process by another process by optionally suspending the executing process
CN102495802B (zh) *	2011-12-26	2015-03-18	华为技术有限公司	测试软件系统的方法和装置以及计算机系统
CN104077184B (zh) *	2013-03-25	2018-12-11	腾讯科技（深圳）有限公司	一种应用程序的进程控制方法及计算机系统
US9304896B2 (en)	2013-08-05	2016-04-05	Iii Holdings 2, Llc	Remote memory ring buffers in a cluster of data processing nodes
CN103885364B (zh) *	2014-03-24	2016-09-28	三和智控(北京)系统集成有限公司	一种通过计划队列实现控制逻辑的动态延时调用的方法
CN111435316A (zh) *	2019-01-14	2020-07-21	阿里巴巴集团控股有限公司	一种资源扩容方法及其装置
US12498969B2 (en) *	2022-04-21	2025-12-16	Microsoft Technology Licensing, Llc.	Distributed, decentralized traffic control for worker processes in limited-coordination environments
CN119025342B (zh) *	2024-10-23	2025-05-23	西安西电电力系统有限公司	一种数据库实时备份和恢复方法及相关设备

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number	Priority date	Publication date	Assignee	Title
US5297274A (en) *	1991-04-15	1994-03-22	International Business Machines Corporation	Performance analysis of program in multithread OS by creating concurrently running thread generating breakpoint interrupts to active tracing monitor
CA2106280C (en) *	1992-09-30	2000-01-18	Yennun Huang	Apparatus and methods for fault-tolerant computing employing a daemon monitoring process and fault-tolerant library to provide varying degrees of fault tolerance
US7047521B2 (en) *	2001-06-07	2006-05-16	Lynoxworks, Inc.	Dynamic instrumentation event trace system and methods
US6898785B2 (en) *	2001-08-16	2005-05-24	Hewlett-Packard Development Company, L.P.	Handling calls from relocated instrumented functions to functions that expect a return pointer value in an original address space
FR2843209B1 (fr) *	2002-08-02	2006-01-06	Cimai Technology	Procede de replication d'une application logicielle dans une architecture multi-ordinateurs, procede pour realiser une continuite de fonctionnement mettant en oeuvre ce procede de replication, et systeme multi-ordinateurs ainsi equipe.

2004
- 2004-06-30 FR FR0407180A patent/FR2872605B1/fr not_active Expired - Fee Related
2005
- 2005-06-22 EP EP05778898A patent/EP1782201A2/de not_active Withdrawn
- 2005-06-22 WO PCT/FR2005/001564 patent/WO2006010812A2/fr not_active Ceased
- 2005-06-22 US US11/813,908 patent/US20080307265A1/en not_active Abandoned
- 2005-06-22 CN CNB200580016201XA patent/CN100530120C/zh not_active Expired - Fee Related

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO2006010812A2 *

Also Published As

Publication number	Publication date
US20080307265A1 (en)	2008-12-11
CN101002177A (zh)	2007-07-18
WO2006010812A3 (fr)	2007-03-22
FR2872605A1 (fr)	2006-01-06
FR2872605B1 (fr)	2006-10-06
CN100530120C (zh)	2009-08-19
WO2006010812A2 (fr)	2006-02-02

Legal Events

Date	Code	Title	Description
2007-04-06	PUAI	Public reference made under article 153(3) epc to a published international application that has entered the european phase	Free format text: ORIGINAL CODE: 0009012
2007-05-09	17P	Request for examination filed	Effective date: 20070124
2007-05-09	AK	Designated contracting states	Kind code of ref document: A2 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU MC NL PL PT RO SE SI SK TR
2007-05-09	AX	Request for extension of the european patent	Extension state: AL BA HR LV MK YU
2007-10-03	DAX	Request for extension of the european patent (deleted)
2007-12-26	17Q	First examination report despatched	Effective date: 20071126
2013-05-24	STAA	Information on the status of an ep patent application or granted ep patent	Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN
2013-06-26	18D	Application deemed to be withdrawn	Effective date: 20130103
