WO1999019794A1 - Computer architecture for the deferral of exceptions on speculative instructions - Google Patents

Computer architecture for the deferral of exceptions on speculative instructions

Info

Publication number
WO1999019794A1
Authority
WO
WIPO (PCT)
Prior art keywords
exception
speculative
box
hardware
instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US1998/021454
Other languages
French (fr)
Inventor
Jonathan K. Ross
Jack D. Mills
James O. Hays
Stephen G. Burger
Dale C. Morris
Carol L. Thompson
Rajiv Gupta
Stefan M. Freudenberger
Gary Hammond
Ralph M. Kling
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute for Development of Emerging Architectures LLC
Original Assignee
Institute for Development of Emerging Architectures LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute for Development of Emerging Architectures LLC
Priority to AU97990/98A (AU758574B2)
Priority to EP98952242A (EP0951672B1)
Priority to DE69811474T (DE69811474T2)
Priority to AT98952242T (ATE232998T1)
Publication of WO1999019794A1
Legal status: Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3861 Recovery, e.g. branch miss-prediction, exception handling
    • G06F9/3865 Recovery, e.g. branch miss-prediction, exception handling using deferred exception handling, e.g. exception flags
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3842 Speculative instruction execution

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Advance Control (AREA)

Abstract

The inventive system (300) and method allow for software control (320) of hardware (330) deferral of exceptions in speculative operations (104), and comprise three components. The first component is processor stored information (107) which reflects the code generation strategy of applications and is used by hardware and the operating system to control exception deferral. The second component is processor stored information (105) set by the operating system to specify to hardware which types of faults should be automatically deferred. The third component is further processor stored information (102) which indicates to the hardware to defer certain exception-causing aspects of the speculative operation, while performing other non-excepting aspects of the speculative operation. The stored information is set after the operating system exception handler (200) has unsuccessfully attempted fault resolution (213).

Description

COMPUTER ARCHITECTURE FOR THE DEFERRAL OF EXCEPTIONS ON SPECULATIVE INSTRUCTIONS
REFERENCE TO RELATED APPLICATIONS
The present application is being concurrently filed with commonly
assigned U.S. Patent Application, Serial Number [HP Attorney docket No. 10871852-1, by Jack D. Mills, et al.] entitled "RECOVERY FROM
EXCEPTION DEFERRED BY SPECULATIVE INSTRUCTIONS", the
disclosure of which is incorporated herein by reference.
TECHNICAL FIELD OF THE INVENTION
This application relates in general to instruction set architecture and computer program optimizations, and in particular to software control of the
mechanism to defer exceptions on speculative instructions.
BACKGROUND OF THE INVENTION
A "basic block" is a contiguous set of instructions bounded by
branches and/or branch targets, containing no branches or branch targets.
This implies that if any instruction in a basic block is executed, then all instructions in the basic block will be executed, i.e. the instructions contained
within any basic block are executed on an all-or-nothing basis. The
instructions within a basic block are enabled for execution when control is passed to the basic block by an earlier branch targeting the basic block ("targeting" as used here includes both explicit targeting via a taken branch
as well as implicit targeting via a not taken branch). The foregoing implies
that if control is passed to a basic block, then all instructions in the basic
block must be executed; if control is not passed to the basic block, then all
instructions in the basic block must not be executed. The act of executing, or
specifying the execution of, an instruction before control has been passed to
the instruction is called "speculation." Speculation performed by the
processor at program runtime is called "dynamic speculation" while
speculation specified by the compiler is called "static speculation." Dynamic
speculation is known in the prior art.
Two instructions are deemed "independent" when one does not
require the result of the other; when one instruction does require the result of the other they are termed "dependent" instructions. Independent instructions
may be executed in parallel while dependent instructions must be executed in
serial fashion. Program performance is improved by identifying independent
instructions and executing as many of them in parallel as possible.
Experience indicates that more independent instructions can be found by
searching across multiple basic blocks than can be found by searching only
within individual basic blocks; however, simultaneously executing instructions
from multiple basic blocks requires speculation. Identifying and scheduling independent instructions, and thereby increasing performance, is one of the
primary tasks of compilers and processors. The trend in compiler and processor design has been to increase the scope of the search for
independent instructions in each successive generation. In prior art
instruction sets, an instruction that may generate an exception cannot be
speculated by the compiler since, if the instruction causes an exception, the
program may erroneously generate an exception when the program should
not have. This restricts the useful scope of the compiler's search for
independent instructions and makes it necessary for speculation to be performed at program runtime by the processor via dynamic speculation.
However, dynamic speculation entails a significant amount of hardware complexity; furthermore, the complexity increases exponentially with the number of basic blocks over which dynamic speculation is applied - this places a practical limit on the scope of dynamic speculation. By contrast, the
scope over which the compiler can search for independent instructions is
much larger - potentially the entire program. Furthermore, once the compiler
has been designed to perform static speculation across a single basic block
boundary, very little additional complexity is incurred by statically speculating
across several basic block boundaries.
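
The notion of independent and dependent instructions used above can be illustrated with a minimal sketch. The `Instr` type and the register names are hypothetical and not part of the disclosure, and only the "requires the result of the other" relation from the text is checked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical three-address instruction: one destination, some sources."""
    dest: str
    srcs: tuple


def independent(a: Instr, b: Instr) -> bool:
    """Return True when neither instruction requires the result of the other.

    Only the result dependence described in the text is modeled; a real
    scheduler would also have to consider other hazards.
    """
    return a.dest not in b.srcs and b.dest not in a.srcs


i1 = Instr("r1", ("r2", "r3"))  # r1 = r2 + r3
i2 = Instr("r4", ("r5", "r6"))  # r4 = r5 + r6, shares nothing with i1
i3 = Instr("r7", ("r1", "r4"))  # r7 = r1 + r4, needs the results of i1 and i2

parallel_ok = independent(i1, i2)    # i1 and i2 may execute in parallel
must_serialize = not independent(i1, i3)  # i3 must wait for i1
```

In this sketch i1 and i2 could be issued together, while i3 has to follow both.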
If static speculation is to be undertaken, then several problems must
be solved, one of the most important of which is the handling of exceptional
conditions encountered by statically speculated instructions. Hereafter,
unless explicitly stated otherwise, references to speculation, speculative instructions, etc. shall be taken to refer to static rather than dynamic
speculation.
Since, as noted above, exceptions on speculative instructions cannot
be delivered at the time of execution of the instructions, a compiler-visible
mechanism is needed to defer the delivery of the exceptions until control is
passed to the basic block from which the instructions were speculated (known
as the "originating basic block"). Mechanisms that perform a similar function
exist in the prior art for deferring and later delivering exceptions on
dynamically speculated instructions; however, by definition the mechanisms are not visible to the compiler and therefore cannot be manipulated by the
compiler into playing a role in compiler-directed speculation. No known
method or apparatus for deferring and later delivering exceptions on statically speculated instructions has been enabled in the prior art. Limited forms of
static speculation do exist in the prior art, however: (1) the forms do not involve deferral and later recovery of exceptional conditions, and (2) the
forms do not enable static speculation over the breadth and scope of the present invention.
An example of prior art limited static speculation is special case
handling of loads from the memory page starting at address zero - called
"page zero." In most systems access to page zero is illegal and typically
causes a protection violation exception. In certain prior art systems, the
compiler and the operating system (OS) mutually agree that any exceptions
on loads from page zero are to be suppressed (not deferred) and that, in the
event of the suppression, the destination of the load is to be written with zero.
This allows the compiler to speculate loads that possess the characteristic
that, if they do access illegal memory, they do so only via page zero. The characteristic occurs because the number zero is sometimes used to mark
the boundary of data structures and any load going beyond the boundary will
therefore attempt to access address zero. It should be noted that the limited form of speculation just described does not involve or allow deferral and later
delivery of exceptions and only applies to the narrow class of loads that
possess the characteristic of only accessing page zero when illegal. In the
event that the load is defined to perform auxiliary operations in addition to
reading memory, e.g. adding a value to an address register, then the OS is
responsible for emulating the auxiliary operations in software; the emulation
will reduce program performance.
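
The page-zero convention described above can be sketched as follows. This is only an illustrative model: the page size, the `memory` dictionary and the `ProtectionViolation` name are assumptions, not details taken from any particular prior art system.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class ProtectionViolation(Exception):
    """Raised for an illegal access outside page zero (hypothetical name)."""


def load(memory: dict, address: int) -> int:
    """Model a load under the prior-art page-zero agreement.

    `memory` maps legal addresses to values. An access to page zero is
    suppressed: no exception is raised and the destination receives zero.
    Any other illegal access raises immediately, so nothing is deferred.
    """
    if 0 <= address < PAGE_SIZE:
        return 0  # exception suppressed, destination written with zero
    if address not in memory:
        raise ProtectionViolation(hex(address))
    return memory[address]
```

The sketch shows why the technique is narrow: it helps only loads whose sole illegal target is page zero, and every other fault is still delivered at once.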
Another example of prior art limited static speculation is the
speculation of instructions that do not cause exceptions. For example, typically the compare instruction is defined such that it does not generate any
exceptions. A properly designed compiler may then speculate the compare
since the only side effect is the writing of a destination. In the event that
control is not passed to the compare's originating basic block, the destination
is simply discarded. Another example is a load instruction from an address
that is known to be valid at compile time and known to remain constant during runtime, e.g. a global variable. These conditions guarantee that if any
exceptions do occur, they will not be fatal and can be handled speculatively
without side effects - although the handling of the speculative exceptions may
reduce overall performance. Again it should be noted that the limited forms of speculation just described do not involve or allow deferral and only apply to a restricted class of instructions.
Therefore, when undertaking static speculation, there is a need in the
art to enable a mechanism to defer exceptions on speculative instructions
that applies to as many forms of speculation as possible. The mechanism must possess very low latency; otherwise the performance of a program
compiled with speculation may actually be lower than the same program
compiled without speculation. The mechanism must also place minimal
restrictions on the form and the construction of software in order to allow the
execution of legacy software, to minimize the impact on software developers, and to maximize the range of software implementation choices. A desired
characteristic of the mechanism is to allow the computer system to
dynamically adapt to program behavior in order to maximize performance over the broadest possible range of software.
SUMMARY OF THE INVENTION
While the present invention applies to any type of speculative
instruction, the following discussion will use the speculative load instruction by way of example. The data indicate that loads are one of the most
important classes of instructions to speculate. Loads also encounter the broadest range of exceptional conditions, including translation
cache misses, first access to a page, protection violations, and page not
present. It is to be expected that, relative to non-speculative loads,
speculative loads will tend to encounter more exceptional conditions. This is
due to the fact that the memory address accessed by speculative loads has a
greater probability of being nonsensical since, by definition, the speculative
load is being executed earlier than the programmer intended.
A compiler-visible mechanism to handle exceptional conditions
encountered by speculated instructions is the subject of IDEA application "RECOVERY FROM EXCEPTIONS DEFERRED BY SPECULATIVE
INSTRUCTIONS" [HP Attorney docket no. 10971852-1], by Jack D. Mills, et
al., which is concurrently filed. Instructions are divided into two classes:
speculative and non-speculative. Initially all instructions are marked
non-speculative. When the compiler schedules an instruction outside of the
instruction's basic block, the compiler marks the instruction as speculative. Non-speculative instructions that encounter an exceptional condition
generate an exception. Speculative instructions that encounter an
exceptional condition do not generate an exception but rather write a
"deferred exception token" (DET) into their destination; note that the
destination does not contain the correct result at this point. A
non-speculative instruction that reads a deferred exception token generates
an exception. A speculative instruction that reads a DET writes a DET into the instruction's destination (again the destination does not contain the
correct result); this behavior is called "propagation." By placing a
non-speculative instruction into the originating basic block of a given
speculative instruction, and by configuring the non-speculative instruction to
read a destination of the speculative instruction (or any location into which a
DET may propagate), then a DET generated by the speculative instruction
can be converted into an exception at the point at which control is passed to
the originating basic block. After a DET is converted into an exception and the exceptional condition is corrected, it is necessary to replace all
previously generated DETs with correct results. This is achieved by a
process called "recovery." Recovery requires the program to be augmented
with additional code generated by the compiler. A compiler may choose not to include recovery code, e.g., to minimize program size, in which case the
opportunity to defer exceptions is dramatically restricted.
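
The token semantics just described (generation, propagation, and conversion into an exception) can be sketched in a few lines. This is a simplified model under stated assumptions: the `DET` sentinel, the function names and the dictionary used as memory are all hypothetical, and a missing address stands in for any exceptional condition.

```python
class DeferredException(Exception):
    """Delivered when a non-speculative instruction reads a DET."""


DET = object()  # stand-in for the deferred exception token


def spec_load(memory: dict, addr):
    """Speculative load: write a DET instead of generating an exception."""
    if addr is DET or addr not in memory:
        return DET  # exceptional condition, or propagation of an input DET
    return memory[addr]


def spec_add(a, b):
    """Speculative add: a DET in either source propagates to the result."""
    if a is DET or b is DET:
        return DET
    return a + b


def check(value):
    """Non-speculative reader placed in the originating basic block."""
    if value is DET:
        raise DeferredException("deferred exception delivered")
    return value


memory = {0x100: 5}
good = spec_add(spec_load(memory, 0x100), 1)  # no fault, ordinary result
bad = spec_add(spec_load(memory, 0x999), 1)   # fault deferred, DET propagates
```

Calling `check(bad)` models control reaching the originating basic block: only then is the exception delivered, after which recovery code would recompute the values that held a DET.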
It is conceivable to have every exceptional condition encountered by
every speculative load generate an exception into the OS and to have the OS
either correct the exceptional condition (if the correction has no program
visible side effects) or manually write a DET into the load's destination thus
deferring the exception. The drawback of this approach is that generating exceptions into the OS is a high latency operation typically causing processor
pipeline flushes and cache misses. In addition, the OS would be required to emulate any auxiliary operations of the load in software, such as address
post-increment, further exacerbating overall latency. If this high latency
operation were to occur on every exceptional condition on every speculative
load, the performance of a program with speculation may fall well below the
performance of the same program without speculation. What is desired is a
mechanism to allow the rapid creation of deferred exception tokens without
OS intervention.
To achieve this, the inventive method and apparatus uses a mechanism in the processor hardware to write a DET into an instruction's
destination without generating an exception in a process called "eager
deferral." In addition, the present invention introduces new processor state to control the operation of eager deferral in the form of multiple exception
deferral bits contained in the Default Control Register (DCR). As noted earlier, loads may experience a broad range of exceptional conditions. In
addition, the actions associated with certain exceptional conditions are not
specified by the computer architecture but rather are determined by the
implementation of the OS. Thus it is desired that said eager deferral
mechanism allow deferral on an exception-by-exception basis, thus allowing
maximum OS implementation freedom. To achieve this benefit the preferred
embodiment defines one bit in the DCR per load exception or class of related exceptions. It is to be noted that other mappings of bits to exceptions are
possible without affecting the spirit and scope of the present invention, e.g. a
single bit controlling multiple exception classes. In the preferred
embodiment, each DCR bit determines whether one particular exception, or
class of related exceptions, may be eagerly deferred or whether an exception
is to be generated into the OS.
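
A minimal sketch of the one-bit-per-exception mapping follows. The exception classes are the ones listed earlier in this summary, but the bit positions and the names `LoadFault` and `dcr_allows_deferral` are assumptions made for illustration; the disclosure does not fix a particular encoding here.

```python
from enum import IntFlag


class LoadFault(IntFlag):
    """Hypothetical load exception classes, one DCR bit each."""
    TLB_MISS = 1
    FIRST_ACCESS = 2
    PROTECTION = 4
    PAGE_NOT_PRESENT = 8


def dcr_allows_deferral(dcr: LoadFault, fault: LoadFault) -> bool:
    """Return True when the DCR bit for this exception class is set.

    A set bit means the exception may be eagerly deferred in hardware;
    a clear bit means an exception is generated into the OS.
    """
    return bool(dcr & fault)


# Example policy chosen by a hypothetical OS: defer only translation misses
# and page-not-present conditions, and see every protection fault itself.
dcr = LoadFault.TLB_MISS | LoadFault.PAGE_NOT_PRESENT
```

Keeping one bit per class lets the OS change policy for a single exception type without affecting the others.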
A single program is typically composed of multiple "compilation units"
or "modules". In many cases not all modules are compiled at the same time
or by the same compiler. Further, through a process known as "dynamic
linking" it is possible that certain modules are identified only during runtime
and are therefore not known at compile time. The sharing of modules is a common practice in software development, e.g. libraries, and it is possible for
different modules to be compiled with different degrees of recovery code, e.g. recovery for all speculative loads vs. no recovery at all. The DCR bits of the
present invention apply equally to all modules in a program. In the case of
varying degrees of recovery code the DCR bits would need to be set for the lowest common denominator among all modules - and potentially the lowest
performance. Modules are placed on memory page boundaries in the virtual
address space. Virtual memory pages are mapped to physical memory pages
via an OS controlled data structure called the "page table" containing a
plurality of entries, each of which maps a single page. The page table maps pages containing both instructions and data, and typically instructions and
data do not share the same page. Furthermore, to improve performance, the processor caches the page table in a structure called the Translation
Lookaside Buffer (TLB). Modern processors typically cache page table
entries mapping instructions separately from page table entries mapping data
- the former in the ITLB and the latter in the DTLB. The present invention introduces additional processor state contained in page table entries
mapping instructions (and therefore cached in the ITLB) called the ITLB.ed
bit, i.e. each page table entry (and therefore each page) has its own ITLB.ed
bit. The value of ITLB.ed for a particular page controls eager deferral for speculative loads contained on said page. The ITLB.ed bit specifies whether
to never eagerly defer or to eagerly defer based on the value of the DCR bits.
This affords the benefit of controlling eager deferral differently for different
modules. For example, if module A includes recovery code while module B
does not, then the ITLB.ed bits on the pages containing module A can be set
to eagerly defer based on the value of the DCR bits while the ITLB.ed bits on the pages containing module B can be set to never eagerly defer. Thus, this
inventive mechanism allows individual tailoring of eager deferral on a
module-by-module basis and therefore places minimal restriction on the form
and construction of software programs. The values of the DCR and ITLB.ed
bits are determined by two pieces of information: (1) compiler knowledge of
the state of recovery code which is transmitted to the OS via state in the load
module which is interpreted by the OS program loader; and (2) OS
self-knowledge of the usage of exceptions and the implementation of
exception handlers. Note that alternative embodiments of DCR and ITLB.ed
bits are possible without affecting the spirit and scope of the present invention, for example, multiple ITLB.ed bits could be defined that select from
multiple copies of DCR bits.
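
The way the per-page ITLB.ed bit gates the DCR bits can be sketched as a single decision function. This is an illustrative model only: the function name, the string results and the bit assignments are assumptions, and the real decision is made by processor hardware rather than software.

```python
TLB_MISS_BIT = 0b0001    # assumed DCR bit position for translation misses
PROTECTION_BIT = 0b0100  # assumed DCR bit position for protection faults


def on_faulting_load(speculative: bool, itlb_ed: bool,
                     dcr: int, fault_bit: int) -> str:
    """Return "defer" for eager deferral or "raise" for an OS exception."""
    if not speculative:
        return "raise"  # only speculative loads can be deferred
    if not itlb_ed:
        return "raise"  # this page is marked never to eagerly defer
    if dcr & fault_bit:
        return "defer"  # hardware writes a DET without entering the OS
    return "raise"      # the OS wants to see this exception class


# Module A ships recovery code, module B does not, so the loader sets
# ITLB.ed differently on their pages while the DCR stays program-wide.
itlb_ed_for = {"module_A": True, "module_B": False}
dcr = TLB_MISS_BIT

result_a = on_faulting_load(True, itlb_ed_for["module_A"], dcr, TLB_MISS_BIT)
result_b = on_faulting_load(True, itlb_ed_for["module_B"], dcr, TLB_MISS_BIT)
```

The same fault on the same program-wide DCR setting is deferred for module A and raised for module B, which is the module-by-module tailoring described above.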
To improve performance, the OS typically caches information relevant
to instruction execution. This cached information is not visible to the hardware and therefore cannot factor into the decision on whether to eagerly defer since eager deferral is performed without reference to the OS - this may
cause the DCR bits to be set conservatively. In addition, the OS can have
greater visibility over program behavior than hardware and can use said
visibility to tune program performance. For these reasons, in certain
situations, it is desired to involve the OS in the exception deferral decision.
In the present invention this is implemented by setting the DCR bits to cause
an exception to be generated into the OS for those exceptions where the OS is caching information. The OS may then correct the exceptional condition
based on said cached information or may still decide that deferral is the proper course of action. As noted earlier, this would require the OS to
manually write a DET into the instruction's destination and to emulate any
auxiliary operations of said instruction. To reduce the latency of this situation
the present invention introduces additional processor state to allow the OS to
inform the hardware that a DET should be written into the destination of a
speculative load and that all other auxiliary operations should be performed.
This bit is called the ISR.ed bit.
Accordingly, it is one technical advantage of the invention to allow the
operating system to implement fault-specific optimizations, which is enabled
by the DCR register. It is another technical advantage of the invention that different recovery models can be supported among the various modules of a program.
It is a still further technical advantage of the invention that certain
failed operations can be rapidly deferred without software interrupts,
specifically without expensive pipeline breaks and software faults.
It is a still further technical advantage of the invention to allow a more
aggressive use of software static speculation, because the deferral of a failed
speculation is less expensive.
The foregoing has outlined rather broadly the features and technical
advantages of the present invention in order that the detailed description of
the invention that follows may be better understood. Additional features and advantages of the invention will be described hereinafter which form the subject of the claims of the invention. It should be appreciated by those
skilled in the art that the conception and the specific embodiment disclosed
may be readily utilized as a basis for modifying or designing other structures
for carrying out the same purposes of the present invention. It should also be
realized by those skilled in the art that such equivalent constructions do not
depart from the spirit and scope of the invention as set forth in the appended
claims.
BRIEF DESCRIPTION OF THE DRAWINGS
For a more complete understanding of the present invention, and the
advantages thereof, reference is now made to the following descriptions
taken in conjunction with the accompanying drawings, in which:
FIGURE 1 depicts a flow diagram for hardware exception deferral;
FIGURE 2 depicts a flow diagram for software resolution of exceptions, including software directed deferral of speculative load exceptions; and
FIGURE 3 depicts a schematic diagram of the system implementing
the flow diagrams of FIGURES 1 and 2.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
FIGURE 3 depicts the inventive mechanism 300 that implements the
flow diagrams of FIGURES 1 and 2. The system decides whether to have the
hardware write a deferred exception token into the designated register of the
load command or whether an exception should be generated. The inventive
mechanism uses several bits of processor state or stored information to
perform various functions. One of the bits is the instruction translation
look-aside buffer 331 (TLB) entry called ITLB.ed (the .ed is for exception deferral). This bit controls whether any exceptions on any speculative loads
contained in the present page can be deferred in hardware. By defining a bit
321 in the TLB entry to control hardware exception deferral, different software
modules (which map to different pages in memory) can set this bit
independently thus allowing each module to independently include recovery
code. The inventive mechanism also uses multiple bits in a Control Register
(DCR) 332. The preferred embodiment is to have one bit per exception type.
Note that other mappings of bits to exception types are possible, e.g. a single
bit controlling multiple exception types. However, in the preferred
embodiment, each bit determines whether one particular exception type can
be deferred in hardware. Given the one-to-one correspondence between
DCR bits and exception types, the operating system (OS) 320 has the flexibility to select 322 which exception types can be deferred in hardware and which cannot.
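The one-bit-per-exception-type arrangement of the DCR can be illustrated with a bit mask. The sketch below is an assumption-laden illustration: the exception type names and bit positions are hypothetical and are not taken from the specification.

```python
from enum import IntFlag


class Dcr(IntFlag):
    # Hypothetical exception types; one deferral bit per type.
    PAGE_NOT_PRESENT = 1 << 0
    KEY_MISS = 1 << 1
    ACCESS_RIGHTS = 1 << 2
    DEBUG = 1 << 3


def hardware_may_defer(dcr, exception_type):
    """True if the DCR bit for this exception type is 1 (deferrable in hardware)."""
    return bool(dcr & exception_type)


# The OS selects which exception types hardware may defer on its own.
dcr = Dcr.KEY_MISS | Dcr.ACCESS_RIGHTS
```

Here the OS lets hardware defer key misses and access rights faults, while page-not-present faults still reach the OS so that it can attempt its own resolution.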
Exception classes represent the different exceptions which can occur during instruction execution. One class of faults is the class of data
translation related faults. Other exception classes would comprise faults in
translating the load, floating point related faults, and instruction fetch related
faults. Within the data TLB class of faults, all the data reference type of
faults would be defined in the DCR because these are faults raised by
speculative loads. Each one of the different types of faults would have an
associated bit in the DCR for speculative loads.
Note that the inventive mechanism will work for any faulting
speculative operation, including any non-load operations. Thus, the load operation is used by way of example only. Moreover, the inventive
mechanism will operate for non-speculative operations, particularly in
FIGURE 2, where fault resolution may be attempted on both speculative and
non-speculative operations (including non-load operations).
The inventive mechanism 100 defines exceptions to load conditions
when virtual translations are enabled. As shown in FIGURE 1, box 101, or
begin load execution, refers to an instruction which is fetched from memory
as part of the program instruction stream. This instruction specifies a normal load operation which is a reference to memory. Box 102 is discussed
later with regard to recursive aspects of the inventive mechanism 100. Box 103 performs a test to determine whether an exception occurred during the
load operation. Box 103 includes all of the normal TLB checks, which are checking for virtual address translation presence, and whether the specified
reference is allowed. Other status checks are also performed which are
necessary for the operating system to maintain the correct image in memory
and still perform other operations such as paging, dirty bit references, or
whether the page is actually referenced, or if there is a debug fault
associated with this exception. Thus, all of these checks occur in box 103.
Note that if an exception did not occur, meaning that the load
operation was successful, then it really does not matter whether this was a speculative load or a non-speculative load, as the no path is taken from box
103 down to box 108. Box 108 represents a successful load, and the returned
results from the load operation are written into the destination register of the
load. After completion of the write and the other side effects, the load
execution is ended 111, and the system is ready to continue on and fetch
the next instruction. Note that part of writing the destination register 108 will
clear the deferred exception indicator for the target register.
If an exception did occur in box 103, then the yes path is followed to
box 104, where it is determined whether the load is a speculative load or non-speculative load. If the load is non-speculative, then none of the
hardware deferral mechanisms of FIGURE 1 will operate, because they only have an effect for speculative loads. If the load is non-speculative, then the
exception occurred in the home basic block, and thus the exception cannot
be deferred, but rather must be addressed by whatever fault recovery
mechanisms are present in the operating system or basic block. Thus, the no
path is taken down to box 110, which generates the exception into the
operating system. The exception that is generated is dependent upon the
type of the exception. Note that at this point, the software mechanisms of
FIGURE 2 will operate, beginning with box 201.
If the load is a speculative load, then the yes path from box 104 is
followed into box 105, where it is determined whether a hardware deferral
can be performed. The exception deferral bit ITLB.ed, defined in the
instruction TLB format specifies whether exceptions raised by speculative
loads using this translation entry may be automatically deferred by the
processor or may be deferred by an OS for non-fatal exceptions. In other
words the ITLB.ed bit checks to see if the application 310, 311 has the ability
for eager deferral or whether the system has to try resolving the exception before setting the deferred exception indicator. Note that the ITLB bit does not refer to the TLB for the data of the load, where the load is being
performed from, but rather refers to the TLB for the instruction that is being
executed 334. This is important because recovery code will be associated
with the instruction that is executed, not associated with the data that is being
loaded. Therefore, whether eager deferral is allowed must be determined
through the attribute in the ITLB. An eager deferral is where the system has determined that it may require a great deal of work to determine whether the
exception can be resolved, and thus to save time, the system will perform an automatic deferral.
Moreover, statistically, there is a good chance that this exception will
never have to be dealt with, thus, it is more economical to automatically defer
these types of exceptions. Thus, they are eagerly deferred, the deferred exception indicator is set, and if the results really are needed later, then the
exception condition will be handled through recovery code.
However, if the application lacks recovery code for this type of
exception, then the system is not going to try to resolve this exception later,
thus failure had better really be a hard failure and the operating system
should try to resolve this exception as best it can. This is indicated from the application both to the hardware and to the operating system through the
ITLB.ed bit.
Thus, the ITLB bit is used to communicate the status of recovery code
in the running application, from the application to both the OS and the
hardware. If the application does not have the ability to handle an eagerly
deferred exception, then the ITLB.ed bit is going to be zero. This means that
hardware cannot defer this exception, because it might be something that the
operating system could resolve. Thus, the no path is taken from box 105 to
box 110 where it will fault into the OS by generating an exception into the operating system. However, if the application can eagerly defer exceptions, then the yes path is taken out of box 105 into box 106, which selects a bit from the control register (DCR).
The control register (DCR) is the mechanism that the OS uses to
communicate to the hardware about whether or not it has performed any fault
specific optimizations. Deferral bits are defined in the DCR which classify
exceptions that may be raised by speculative loads. These bits are used as
one of the qualifiers for hardware to perform automatic speculative load
exception deferral.
Once the DCR bit has been selected, box 107 determines whether the
OS has specified that there is an optimization (or other recovery mechanism) associated with this specific fault or exception, and this is indicated by the
DCR bit being equal to 0. Thus, this indicates which faults the operating
system wants to handle for speculative loads.
In this case, with the DCR bit equal to 0, then the no path is followed
from box 107 to box 110, and an exception into the operating system will be generated. This allows the operating system to perform its optimization for
this fault without doing exception deferral, see FIGURE 2 box 201. For
example, if a page fault occurred, then the operating system would have a
chance to walk the page tables and look for a translation and install it into the
TLB before generating a deferred exception. Another example is for a key
miss. Here the operating system has a chance to walk through a cache of keys in the table before generating a deferral.
If the operating system does not have any optimizations associated
with these particular faults, and the only thing it will do is defer the exception
by emulating the load and setting the deferred exception indicator, then it
indicates that to the hardware by setting the DCR bit in the control register to
1. Thus, if the outcome of the test in box 107 is true, and thus both the
application is ready for a deferral and the operating system does not have
any optimizations to run, then the yes path is taken from box 107 to box 109.
Box 109 writes the deferred exception indicator into the register, and then proceeds to end the load execution 111, without generating a fault into the
operating system. Thus, when a speculative load raises an exception and
the DCR deferral bit for that exception is set to 1, and the ITLB.ed bit for the
speculative load's instruction page is set to 1, the hardware will perform
automatic deferral of the exception. A compiler/linker may mark text
segments with an attribute which the OS will use to set the ITLB.ed bit.
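The decision path through boxes 103 to 110 of FIGURE 1 can be summarized in a short sketch. It is illustrative only: the function name and the returned strings are assumptions, and the box 102 check on PSR.ed is omitted here.

```python
def hardware_load_outcome(exception, speculative, itlb_ed, dcr_bit):
    """Return the terminal box of FIGURE 1 reached by a load.

    exception:   True if box 103 detects a fault during the load
    speculative: True if the load is a speculative load (box 104)
    itlb_ed:     ITLB.ed bit of the page holding the load instruction (box 105)
    dcr_bit:     DCR bit selected for this exception type (boxes 106 and 107)
    """
    if not exception:      # box 103: no fault, the load succeeded
        return "box 108: write result"
    if not speculative:    # box 104: non-speculative loads are never deferred
        return "box 110: exception into OS"
    if itlb_ed == 0:       # box 105: application cannot handle eager deferral
        return "box 110: exception into OS"
    if dcr_bit == 0:       # box 107: OS wants to see this exception type
        return "box 110: exception into OS"
    return "box 109: write deferred exception indicator"
```

Only the combination of a faulting speculative load, ITLB.ed equal to 1 and a DCR bit equal to 1 reaches box 109. Every other faulting case generates an exception into the operating system.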
Note that the boxes in FIGURE 1 are all performed during run time.
During compile time, the attribute bits which are reflected in the ITLB.ed bit
are set. The bit in box 107 is set by the operating system either statically or on
a process-by-process basis. Thus, at different times, the operating system
may have different deferred exception policies for different applications.
Therefore, this is set, not at compile time, but is set based on the current
running process by the operating system. The test in box 104, as to whether
the load is speculative or not, is determined at compile time with static code
scheduling. There is a static determination by the optimization phase of
program compilation on whether to issue this load speculatively and this is tested for in box 104.
In FIGURE 1, the no paths from boxes 104, 105, and 107 all lead into
box 110, the generation of an exception into the OS, which leads to box 201 on FIGURE 2, the starting of the OS exception handler. The software
mechanism 200 depicted in FIGURE 2 allows the operating system to
perform various virtual memory optimizations and various enhancements to
hardware structures in box 202. The software attempts first level fault
resolution techniques. For example, walking page tables, filling in protection
key caches, and other things that the hardware does not have structures to
do. This is not an exhaustive list of the kinds of things the operating system
could do, but just two examples of types of optimizations that could be
performed on speculative loads.
After the first level optimizations in box 202 have been attempted, box
203 determines whether they successfully resolved the fault or exception. If
successfully resolved, the yes path from box 203 is followed to box 204, and
the application is returned to the interrupted instruction for a retry of the
instruction. The instruction in box 204 (as well as box 213) is also known as
RFI or return from interrupt. This time the instruction should move toward
completion because the fault condition has been resolved. Other faults may
arise during the retry. Note that box 202 is performed at run-time, but is
compiled by the operating system, statically into the code. It is not something
that the hardware does dynamically based on the fault.
If the fault is not resolved by box 202, then the no path is followed from
box 203 into box 205, where it is determined whether the load is speculative
or non-speculative. Note that box 205 is the software parallel to box 104.
Thus, software has the ability to make the same kinds of tests that the
hardware is making. The ISR.sp bit of interruption status 333 specifies that the interruption is related to a speculative load operation. The ISR.sp bit
allows the OS to quickly determine if a fault was generated by a speculative
load. The compiler will use a different instruction for speculative loads than
non-speculative loads. If the faulting instruction is a speculative load that fact
will be automatically reported by hardware in the interruption status register,
which sets the bit in box 205.
Box 205 is a precursor check to determine whether the exception can be eagerly deferred. Only speculative loads can be deferred.
Thus, if the load is not speculative then ISR.sp equals 0, and the no
path to box 208 is followed, where the second level fault resolution is
attempted. If the load is speculative, then the ISR.sp bit equals 1, and the yes
path to box 206 is followed, where it is determined whether ISR.ed is set to
1. Box 206 is also a precursor check to determine whether the exception can
be eagerly deferred. The ISR.ed indicates whether the application that is
running can handle an eagerly deferred exception. The ISR.ed is a copy of the ITLB.ed bit for the instruction raising the exception. The bit is copied on an interruption. Note that box 206 is the software parallel to box 105. Thus
the check in box 105 is mirrored in software with box 206. Thus, boxes 205 and 206 allow software to quickly determine that this was a speculative load
and also to very quickly determine what kind of a speculative deferral
behavior is expected by the application program.
If the application cannot handle eagerly deferred exceptions, with
ISR.ed equal to 0, then the no path is followed from box 206 to box 208, to
attempt second level fault resolution. If the application can handle the eager
deferral of faults, that is, ISR.ed is equal to 1, then the yes path
to box 207 is followed, where the OS may impose some of its own policy. If
the OS decides to eagerly defer this fault, then the yes branch from box 207
is followed to box 212, and software deferral of the exception is begun. However, the OS may decide that even though eager deferral would be
allowed by the application and it is a speculative load, that it still wants to
attempt second level fault resolution for a speculative load. Then the no path
from box 207 is followed into box 208.
The second level fault resolution techniques are the heavier weight
techniques such as a page fault handler or an access rights handler, or other
virtual memory fault resolution routines. Again, these are provided as examples, and are not intended to constitute an exhaustive list. After
attempting that second level fault resolution in box 208, the success is
determined in box 209. If the resolution is successful, then the yes path is
followed from box 209 to box 204, which is the return to the interrupted
instruction. If the fault is not resolved by the second level fault resolution techniques, then the no path is followed from box 209 to box 210.
In box 210 it is determined whether the original faulting instruction was
a speculative load by checking ISR.sp bit. If it was not a speculative load, and thus ISR.sp equals 0, then the no path from box 210 to box 211 is
followed, where a fault is delivered to interrupted context (the code that
issued the speculative load). This may terminate process execution.
Therefore, a non-speculative load will travel through mechanism 200, through
the second level fault resolution 208, and if it is un-resolved, box 210 will
determine that it is a non-speculative execution, and will begin proceedings
to terminate the process in box 211. However, if it is a speculative load, and
both the first and second levels of fault resolution are not successful, a fault
to software will not be raised, and the yes path from box 210 to box 212 will
be followed.
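The software path of FIGURE 2 described above can be sketched in the same manner. The sketch assumes that the results of the first and second level resolution attempts and the OS policy decision of box 207 are supplied as booleans. The function name and parameter names are hypothetical.

```python
def os_exception_handler(isr_sp, isr_ed, first_level_ok, second_level_ok,
                         os_prefers_deferral):
    """Return the box at which the OS handler of FIGURE 2 finishes.

    isr_sp: 1 if the faulting instruction is a speculative load
    isr_ed: copy of ITLB.ed for the faulting instruction
    first_level_ok / second_level_ok: whether that resolution level succeeds
    os_prefers_deferral: OS policy decision taken in box 207
    """
    if first_level_ok:                          # boxes 202 and 203
        return "box 204: retry instruction"
    if isr_sp == 1 and isr_ed == 1 and os_prefers_deferral:
        return "box 212: software deferral"     # boxes 205, 206 and 207
    if second_level_ok:                         # boxes 208 and 209
        return "box 204: retry instruction"
    if isr_sp == 1:                             # box 210
        return "box 212: software deferral"
    return "box 211: fault to interrupted context"
```

An unresolved fault on a speculative load always ends in deferral, whereas an unresolved fault on a non-speculative load is delivered to the interrupted context.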
In box 212, the IPSR.ed bit is set to 1. The IPSR.ed control bit directs
the processor to set the deferred exception indicator for the next instruction (if it is a speculative load). This bit can only be set by the "return from
interruption" instruction (RFI) in box 212 and is cleared by hardware after the
execution of the current instruction. The PSR represents processor status
register. This indicates to the hardware that the load has failed, and that the
fault or exception has not been resolved, either because the OS could not via
boxes 203 and 209, or the OS did not want to resolve it via box 207. From
box 212, the mechanism 200 proceeds to box 213, which is the execution of the return from interruption (RFI) instruction and re-executes the load instruction. The hardware will not try to reissue the memory reference when
PSR.ed is set (IPSR is copied to PSR by RFI), which is the main aspect of the compound load instruction. Instead, the hardware will set the deferred
exception indicator in the target register and will perform the other load side
effects of the speculative load, which include base address modification and
ALAT updates (advance load address table hardware structure). Thus, the
OS will cause hardware to set the deferred exception indicator for all faults
generated by speculative loads which cannot be resolved. Therefore, when an OS defers a speculative load exception, it only has to set the IPSR.ed bit
and issue the RFI instruction. The deferred exception indicator will be set by
hardware and all other non-memory components of the compound
speculative load operation will be performed.
Note that boxes 204 and 213 are both RFI boxes. The difference between these two is that in box 204 the IPSR.ed bit remains set to 0, the
initial state set in box 202 by the hardware. However, in box 213 the IPSR.ed
bit equals 1. This bit instructs the hardware to set the deferred exception
indicator in the target register and only perform the load side effects, and not
to try to perform the memory access. Thus, box 204 indicates that a fault has
been resolved, but box 213 indicates that the fault is unresolved. The fault will be deferred
until a later time, when (and if) the program reaches its home basic block.
Thus, it is the responsibility of the code in the basic block to perform a check
instruction on the target register of this speculative chain. The check
instruction will cause recovery code to be invoked if there is a deferred
exception indicator. Note that the recovery code is static, that is, determined at compile time.
So the deferred exception indicator is one way that you can determine
that the speculative load has failed when in the basic block. The other way is
by what is called non-speculative consumption of the target register.
Examples of non-speculative consumption of a register include: trying to
move the value to a control register, or trying to move it to a branch register,
or to store it to memory. These are termed non-speculative operations, and if
attempted and failed, then the processor will raise a different type of software
interruption, which will then be an indication that the program was not written correctly and, again, a fault to the interrupted context is raised but this time
from the home basic block, not from the point to which the speculative load was hoisted.
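The two ways in which a failed speculative load is detected in the home basic block can be illustrated as follows. This is a sketch under stated assumptions: the register is modeled as a value plus a deferred exception indicator, memory is modeled as a dictionary, and the names `Register`, `speculative_load`, `check`, `store` and `NonSpeculativeConsumptionFault` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Register:
    value: int = 0
    det: bool = False  # deferred exception indicator for this register


class NonSpeculativeConsumptionFault(Exception):
    """Raised when a register holding a deferred exception is consumed."""


def speculative_load(memory, addr):
    """Load speculatively; on failure, defer instead of faulting."""
    if addr in memory:
        return Register(memory[addr], det=False)
    return Register(0, det=True)


def check(reg, recovery):
    """Check instruction: invoke recovery code if the indicator is set."""
    return recovery() if reg.det else reg.value


def store(memory, addr, reg):
    """A store is a non-speculative consumer of the register."""
    if reg.det:
        raise NonSpeculativeConsumptionFault(addr)
    memory[addr] = reg.value
```

A correctly written program issues the check before any non-speculative use, so the fault raised by `store` in this sketch corresponds to the case of an incorrectly written program.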
In box 212, IPSR.ed is set to 1, which means that software is indicating
to hardware that it wants the deferred exception indicator set. Note that the
actual deferred exception indicator is different for a speculative load targeted at the floating point register file than it is for a speculative load targeted at the
general register file. In a particular implementation, there is a 65th bit on the
general register file that is used for deferred exception indicator, while there
is reserved encoding in the floating point register file to determine that there
is a deferred exception on the speculative load to that register. One of the
advantages of this invention is that the software sets the IPSR.ed bit and
leaves it up to hardware to do the deferral, and thus it is no more expensive
to determine whether it is a floating point or a general register target. Note
that nowhere in the inventive mechanism 100 and 200 is there a check on the type of speculative operation, i.e. general or floating point, and since
the hardware handles the deferral, then there is no need to further
differentiate between general or floating point loads.
Note that both boxes 204 and 213 return to box 101 of FIGURE 1. As
discussed earlier, the IPSR.ed bit is set to 1 in box 212, and thus passed to box 101 by box 213 as 1, while the IPSR.ed bit is set to 0 in box 202 and remains
unchanged throughout 200, and thus is passed to box 101 by box 204 as 0.
The IPSR bit is copied into PSR. Thus, in box 102, if PSR.ed equals 1 and
the operation is a speculative load, then the yes path is followed from box
102 to box 109, which is the hardware deferral of the exception. If either the
PSR.ed bit equals 0, or the operation is a non-speculative load or a non-load operation, then the no path is followed into box 103, for further operations as discussed above.
The only case in which the end load execution box 111 is not reached
is if the application is terminated in box 211 because a fault could not be
resolved for a non-speculative instruction, in that case the fault is raised to
the interrupted context, which may result in the program or application
terminating.
Note that entry into box 101 from either box 204 or box 213 indicates
that the execution is repeated. So on faulting into the OS at box 201, the restart
position that is indicated to the operating system is the faulting instruction.
So upon return, the application returns to that instruction and replays it again. The instruction is replayed with either the PSR.ed bit set to 1 from box 213 or
set to 0 from box 204. There are multiple reasons why a load can fail,
and each is reported one at a time to the operating system. Thus, the recursive nature of this mechanism checks again to determine if there is
another fault in the next sequential aspect of the load operation. Note that aspects are in a sequence from the highest priority to the lowest priority.
Thus, if a higher priority aspect has a fault that is not resolved, then
the mechanism will not bother to check for and/or resolve any lower priority
aspect. In many cases if a higher priority fault cannot be resolved, then lower
priority ones cannot be resolved. For example, if a higher priority page fault
cannot be resolved and is going to be deferred, then an access rights
violation cannot be resolved because the application does not have a page.
Therefore, the mechanism defers all lower priority aspects in addition to the faulting higher priority aspect. However, the mechanism can progress
through higher priority aspects and defer the lower priority aspects. For
example, suppose the above page fault is resolved, the mechanism then
checks for access rights and determines that the application is trying to read
an execute only page. At this point, the mechanism may decide to defer
resolving this fault violation.
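The replay behaviour just described can be sketched as a loop over the outstanding fault aspects of a single speculative load. The sketch assumes, for simplicity, that every fault is delivered to the OS (the relevant DCR bits are 0) and that the OS either resolves a fault or defers it. The function name and the fault names used in the example are hypothetical.

```python
def run_speculative_load(aspects, os_can_resolve):
    """Replay a speculative load until it completes or is deferred.

    aspects:        faults the load would raise, highest priority first
    os_can_resolve: set of fault names the OS is able to resolve
    Returns (outcome, number of entries into the OS handler).
    """
    pending = list(aspects)
    os_entries = 0
    while pending:
        fault = pending[0]  # only the highest priority fault is reported
        os_entries += 1
        if fault in os_can_resolve:
            pending.pop(0)  # resolved; RFI with PSR.ed = 0 replays the load
        else:
            # RFI with PSR.ed = 1; lower priority aspects are never checked
            return ("deferred", os_entries)
    return ("loaded", os_entries)
```

With aspects `["page_fault", "access_rights"]` and an OS that resolves only the page fault, the handler is entered twice and the load is deferred on the second, lower priority aspect.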
An example of the inventive mechanism 100, 200 involves protection
keys. Suppose that an operating system is not using protection keys, which is one of the mechanisms that is used in the virtual memory management.
Consequently, if an operating system is not using the key mechanism, then any transaction that has a key fault should be invalid. So in this case, that
operating system would indicate in the DCR to eagerly defer all key miss faults resulting from speculative loads, because if a key miss would have
been encountered and that fault been raised into the operating system, the operating system would not have resolved the key miss for the load
instruction. Thus, by setting the bit corresponding to this fault to 1 in the
DCR, the OS can indicate to the hardware to defer this type of fault, and not to go through the more expensive operation of generating a pipeline break,
reporting a software exception, which is going to require the emulation of the
instruction and then return.
On the other hand, suppose an operating system uses keys as part of
its virtual address management. A TLB is a translation look aside buffer, which is a hardware mechanism that caches virtual translation information
and protection information. It can perform this operation very rapidly on an
instruction-by-instruction cycle without causing software intervention. The
operating system may define the key registers to be a cache containing a
subset of all of the capabilities of the current application. Thus, if there is a
key miss fault, then operating system may want to attempt fault resolution by
looking through a larger memory based cache, locating the key for this
reference, moving it into the protection key register, and then re-issuing the speculative load which may succeed. So in this case the operating system is performing some caching of the resources and some optimizations that are
more than the resources that are built into the processor. Thus, that operating system probably would not set automatic hardware deferral in the
DCR for key faults because it wants to attempt to resolve the faults before hardware does the deferral.
Furthermore, in the translation look aside buffers (TLB) each page has associated with it a field which states the protection key for this page.
Access to that page will only be granted if that protection key also exists in a
protection key register, which is another privileged state register in the
processor. So it's a way of allowing protection other than through address
space isolation. Thus, two different users can generate an address to a
location, but only one of them has access to it because that user has the key in the key registers, while the other one does not have the key and thus
does not have access. Now if a page is referenced, and the key is pulled out
of the TLB, however the key is not found in the protection key register file,
then a key fault is generated. Now if this was being performed with a
speculative load, at the point where the key fault would have been raised, the
DCR is queried. If software has indicated that it does not have recovery code
or if the operating system has said that it wants to see key faults, then that fault will be raised to the OS. Then the operating system can actually
perform some optimizations. However, if the application has recovery
mechanisms and this is indicated in the instruction TLB, and the operating system does not want to handle key miss faults, then this fault will be
automatically deferred in hardware.
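The key check and the OS resolution of a key miss can be sketched as follows. This is illustrative only: the key registers are modeled as a small list acting as a cache, the larger memory based structure is modeled as a set, and the register count of four and the oldest-first eviction policy are assumptions not found in the specification.

```python
def key_check(page_key, key_registers):
    """Hardware check: access is granted only if the page's key is cached."""
    return page_key in key_registers


def resolve_key_miss(page_key, key_registers, process_keys, num_registers=4):
    """OS fault resolution for a key miss.

    Looks the key up in the larger memory based set of keys held by the
    process and, if present, moves it into the key registers.
    Returns True if the fault is resolved and the load can be re-issued.
    """
    if page_key not in process_keys:
        return False              # no such capability; fault stays unresolved
    if len(key_registers) >= num_registers:
        key_registers.pop(0)      # assumed policy: evict the oldest entry
    key_registers.append(page_key)
    return True
```

An OS that performs this kind of caching would leave the DCR bit for key faults at 0 so that the fault reaches `resolve_key_miss` before any deferral takes place.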
Therefore, the ITLB.ed bit allows communication between the
application and the OS, and communicates information about the speculative recovery capability of an application. This allows an OS to defer expensive
exceptions at speculative load time, knowing that if the non-speculative use
of the data is on the execution trace that speculative recovery code will exist.
On a speculative exception the ISR.sp bit is set to indicate a speculative load
and ISR.ed is set to the value of ITLB.ed field of the faulting instruction. The
DCR bits allow communication between the OS and the hardware, and are
an indication of which speculative load exceptions should be automatically
deferred. The PSR.ed bit allows communication between the OS and the hardware, and indicates that a speculative load instruction should generate a
deferred exception indicator and perform all non-memory components of the
compound operations specified in the current speculative load operation.
The inventive mechanism allows higher performance of programs
utilizing speculative execution. Operating system policy decisions and caching of translation information can operate in the presence of speculative
execution without the expense of unnecessarily deferred exceptions or the
expense of emulating instructions in order to defer expensive exceptions.
Thus, allowing automatic hardware deferral of certain exceptions and efficient
hardware deferral under explicit software control can lead to higher
performance through a reduced number of speculative check faults which are more easily resolved at speculative load time than at the non-speculative use
of the load data. Higher performance also arises from more efficient
mechanisms to defer exceptions which are too expensive to resolve at the time of the speculative load, or which must be deferred until a non-speculative use.
Although the present invention and its advantages have been
described in detail, it should be understood that various changes,
substitutions and alterations can be made herein without departing from the
spirit and scope of the invention as defined by the appended claims.

Claims

WHAT IS CLAIMED IS:
1. A system 300 that provides operating system 320 control of a hardware 330 deferral of an exception that has occurred in the
execution of an instruction in an application 310, the system comprising:
means 104 for determining whether the instruction is speculative;
means 105 for communicating first information, between the
operating system and the hardware, about whether the exception is of a
type which is to be automatically deferred by the hardware;
means 107 for communicating second information, between the
operating system and the application, about whether the operating system will attempt to recover from the exception prior to deferral; and
means 102 for communicating third information, between the
operating system and the hardware, that indicates whether the hardware should defer the exception based upon at least one of the first information
and the second information, and whether the instruction is speculative.
2. The system of claim 1, wherein:
if the first information 105 indicates that the exception should be
automatically deferred and if the second information 107 indicates that the
operating system will not attempt to recover from the exception prior to
deferral and the instruction is speculative 104, then the hardware will
defer the exception 109.
3. The system of claim 1, wherein:
if the first information 105 indicates that the exception should not be
automatically deferred, then a means for handling exceptions in the
operating system is invoked 110.
4. The system of claim 1, wherein:
if the second information 107 indicates that the operating system
will attempt to recover from the exception prior to deferral, then a means
for handling exceptions in the operating system is invoked 110.
5. The system as in claim 3 or 4, wherein the means for handling exceptions comprises:
a first means 202 for attempting fault resolution;
wherein if the first means is successful 204, then the third
information 102 will indicate that exception is resolved and that the
hardware should not defer the exception 108.
6. The system of claim 5, wherein the first means was not
successful, and the means for handling further comprises:
means 207 for determining if the operating system chooses to defer the exception;
wherein if the operating system chooses to defer the exception 207 and the instruction is speculative 205, then the third information will
indicate that hardware should defer the exception 102.
7. The system of claim 6, wherein the operating system
chooses not to defer the exception 207, and the means for handling
further comprises:
a second means 208 for attempting fault resolution;
wherein if the second means is successful, then the third
information will indicate that exception is resolved and that the hardware
should not defer the exception 204.
8. The system of claim 7, wherein:
the second means 208 was not successful;
the instruction is speculative 210; and
the third information 213 indicates that hardware should defer the
exception.
9. The system of claim 8, wherein:
the second means 208 was not successful;
the instruction 210 is not speculative; and
the application is interrupted 211.
10. The system as in any of claims 1-10, wherein:
the instruction is a speculative load 104, 205, 210.
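
The operating system handling path recited in claims 5 through 9 can be read as a short decision sequence. The Python sketch below is one possible reading, offered as an illustration rather than as the claimed means: `Action`, `handle_exception`, `quick_fix`, `full_fix` and `os_prefers_deferral` are hypothetical names, and the behaviour when the operating system chooses deferral for a non-speculative instruction (falling through to the second resolution attempt) is an assumption, since the claims do not state it. Reference numerals from the claims appear in the comments.

```python
from enum import Enum
from typing import Callable


class Action(Enum):
    RESUME = "resolved; hardware does not defer"
    DEFER = "hardware defers the exception"
    INTERRUPT = "application is interrupted"


def handle_exception(speculative: bool,
                     quick_fix: Callable[[], bool],
                     os_prefers_deferral: bool,
                     full_fix: Callable[[], bool]) -> Action:
    """Model of the OS exception handler path (claims 5 to 9)."""
    if quick_fix():                          # first attempt, 202 / 204
        return Action.RESUME                 # not deferred, 108
    if speculative and os_prefers_deferral:  # 205 / 207
        return Action.DEFER                  # third information says defer, 102
    # Assumption: a non-speculative instruction always reaches this step.
    if full_fix():                           # second attempt, 208
        return Action.RESUME
    # Second attempt failed: defer if speculative (210, 213), else interrupt (211).
    return Action.DEFER if speculative else Action.INTERRUPT


# Example: both resolution attempts fail on a non-speculative load.
print(handle_exception(False, lambda: False, False, lambda: False).name)
```

Passing the two resolution attempts as callables keeps the sketch independent of any particular fault type; a real handler would perform the resolution itself.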
PCT/US1998/021454 1997-10-13 1998-10-09 Computer architecture for the deferral of exceptions on speculative instructions Ceased WO1999019794A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
AU97990/98A AU758574B2 (en) 1997-10-13 1998-10-09 Computer architecture for the deferral of exceptions on speculative instructions
EP98952242A EP0951672B1 (en) 1997-10-13 1998-10-09 Computer architecture for the deferral of exceptions of statical speculative instructions
DE69811474T DE69811474T2 (en) 1997-10-13 1998-10-09 COMPUTER ARCHITECTURE FOR DEPENDING EXCEPTIONS OF STATIC SPECULATIVE COMMANDS
AT98952242T ATE232998T1 (en) 1997-10-13 1998-10-09 COMPUTER ARCHITECTURE FOR DEFERRING EXCEPTIONS OF STATIC SPECULATIVE COMMANDS

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US08/949,295 US5915117A (en) 1997-10-13 1997-10-13 Computer architecture for the deferral of exceptions on speculative instructions
US08/949,295 1997-10-13

Publications (1)

Publication Number Publication Date
WO1999019794A1 true WO1999019794A1 (en) 1999-04-22

Family

ID=25488867

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1998/021454 Ceased WO1999019794A1 (en) 1997-10-13 1998-10-09 Computer architecture for the deferral of exceptions on speculative instructions

Country Status (6)

Country Link
US (1) US5915117A (en)
EP (1) EP0951672B1 (en)
AT (1) ATE232998T1 (en)
AU (1) AU758574B2 (en)
DE (1) DE69811474T2 (en)
WO (1) WO1999019794A1 (en)

Families Citing this family (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6505296B2 (en) 1997-10-13 2003-01-07 Hewlett-Packard Company Emulated branch effected by trampoline mechanism
US6173248B1 (en) * 1998-02-09 2001-01-09 Hewlett-Packard Company Method and apparatus for handling masked exceptions in an instruction interpreter
US6260190B1 (en) * 1998-08-11 2001-07-10 Hewlett-Packard Company Unified compiler framework for control and data speculation with recovery code
US6301705B1 (en) * 1998-10-01 2001-10-09 Institute For The Development Of Emerging Architectures, L.L.C. System and method for deferring exceptions generated during speculative execution
US6519694B2 (en) * 1999-02-04 2003-02-11 Sun Microsystems, Inc. System for handling load errors having symbolic entity generator to generate symbolic entity and ALU to propagate the symbolic entity
US7761857B1 (en) 1999-10-13 2010-07-20 Robert Bedichek Method for switching between interpretation and dynamic translation in a processor system based upon code sequence execution counts
US6622235B1 (en) 2000-01-03 2003-09-16 Advanced Micro Devices, Inc. Scheduler which retries load/store hit situations
US6564315B1 (en) * 2000-01-03 2003-05-13 Advanced Micro Devices, Inc. Scheduler which discovers non-speculative nature of an instruction after issuing and reissues the instruction
US6542984B1 (en) * 2000-01-03 2003-04-01 Advanced Micro Devices, Inc. Scheduler capable of issuing and reissuing dependency chains
US6594821B1 (en) 2000-03-30 2003-07-15 Transmeta Corporation Translation consistency checking for modified target instructions by comparing to original copy
US6631460B1 (en) 2000-04-27 2003-10-07 Institute For The Development Of Emerging Architectures, L.L.C. Advanced load address table entry invalidation based on register address wraparound
US7188232B1 (en) * 2000-05-03 2007-03-06 Choquette Jack H Pipelined processing with commit speculation staging buffer and load/store centric exception handling
US6615343B1 (en) * 2000-06-22 2003-09-02 Sun Microsystems, Inc. Mechanism for delivering precise exceptions in an out-of-order processor with speculative execution
US6895527B1 (en) * 2000-09-30 2005-05-17 Intel Corporation Error recovery for speculative memory accesses
US6829700B2 (en) * 2000-12-29 2004-12-07 Stmicroelectronics, Inc. Circuit and method for supporting misaligned accesses in the presence of speculative load instructions
US20020199179A1 (en) * 2001-06-21 2002-12-26 Lavery Daniel M. Method and apparatus for compiler-generated triggering of auxiliary codes
US7240186B2 (en) * 2001-07-16 2007-07-03 Hewlett-Packard Development Company, L.P. System and method to avoid resource contention in the presence of exceptions
JP2005506630A (en) * 2001-10-25 2005-03-03 コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ Low overhead exception checking
US6941449B2 (en) * 2002-03-04 2005-09-06 Hewlett-Packard Development Company, L.P. Method and apparatus for performing critical tasks using speculative operations
US7051238B2 (en) * 2002-07-30 2006-05-23 Hewlett-Packard Development Company, L.P. Method and system for using machine-architecture support to distinguish function and routine return values
US20040123081A1 (en) * 2002-12-20 2004-06-24 Allan Knies Mechanism to increase performance of control speculation
US7310723B1 (en) * 2003-04-02 2007-12-18 Transmeta Corporation Methods and systems employing a flag for deferring exception handling to a commit or rollback point
US7321964B2 (en) * 2003-07-08 2008-01-22 Advanced Micro Devices, Inc. Store-to-load forwarding buffer using indexed lookup
US20050283770A1 (en) * 2004-06-18 2005-12-22 Karp Alan H Detecting memory address bounds violations
US7194604B2 (en) * 2004-08-26 2007-03-20 International Business Machines Corporation Address generation interlock resolution under runahead execution
US20060181949A1 (en) * 2004-12-31 2006-08-17 Kini M V Operating system-independent memory power management
US8413162B1 (en) 2005-06-28 2013-04-02 Guillermo J. Rozas Multi-threading based on rollback
US9772853B1 (en) * 2007-09-17 2017-09-26 Rocket Software, Inc Dispatching a unit of work to a specialty engine or a general processor and exception handling including continuing execution until reaching a defined exit point or restarting execution at a predefined retry point using a different engine or processor
US8458684B2 (en) * 2009-08-19 2013-06-04 International Business Machines Corporation Insertion of operation-and-indicate instructions for optimized SIMD code
US20110047358A1 (en) * 2009-08-19 2011-02-24 International Business Machines Corporation In-Data Path Tracking of Floating Point Exceptions and Store-Based Exception Indication
US8966230B2 (en) * 2009-09-30 2015-02-24 Intel Corporation Dynamic selection of execution stage
WO2012107800A1 (en) * 2011-02-11 2012-08-16 Freescale Semiconductor, Inc. Integrated circuit devices and methods for scheduling and executing a restricted load operation
US11176055B1 (en) 2019-08-06 2021-11-16 Marvell Asia Pte, Ltd. Managing potential faults for speculative page table access

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0484033A2 (en) * 1990-10-31 1992-05-06 Hewlett-Packard Company Method for implementing dismissible instructions on a computer
GB2294341A (en) * 1994-10-18 1996-04-24 Hewlett Packard Co Providing support for speculative execution
US5666508A (en) * 1995-06-07 1997-09-09 Texas Instruments Incorporated Four state two bit recoded alignment fault state circuit for microprocessor address misalignment fault generation

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5201043A (en) * 1989-04-05 1993-04-06 Intel Corporation System using both a supervisor level control bit and a user level control bit to enable/disable memory reference alignment checking
US5778219A (en) * 1990-12-14 1998-07-07 Hewlett-Packard Company Method and system for propagating exception status in data registers and for detecting exceptions from speculative operations with non-speculative operations
US5438677A (en) * 1992-08-17 1995-08-01 Intel Corporation Mutual exclusion for computer system
US5634023A (en) * 1994-07-01 1997-05-27 Digital Equipment Corporation Software mechanism for accurately handling exceptions generated by speculatively scheduled instructions
WO1998006038A1 (en) * 1996-08-07 1998-02-12 Sun Microsystems, Inc. Architectural support for software pipelining of loops

Also Published As

Publication number Publication date
DE69811474D1 (en) 2003-03-27
EP0951672B1 (en) 2003-02-19
AU758574B2 (en) 2003-03-27
US5915117A (en) 1999-06-22
ATE232998T1 (en) 2003-03-15
DE69811474T2 (en) 2004-01-08
AU9799098A (en) 1999-05-03
EP0951672A1 (en) 1999-10-27


Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM HR HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 1998952242

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 97990/98

Country of ref document: AU

WWP Wipo information: published in national office

Ref document number: 1998952242

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

NENP Non-entry into the national phase

Ref country code: CA

WWG Wipo information: grant in national office

Ref document number: 1998952242

Country of ref document: EP

WWG Wipo information: grant in national office

Ref document number: 97990/98

Country of ref document: AU