CN106919367B - Processor and method for detecting self-modifying code
- Publication number
- CN106919367B (application CN201710138752.8A)
- Authority
- CN
- China
- Prior art keywords
- instruction
- storage element
- ownership
- cache line
- storage
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/3004—Arrangements for executing specific machine instructions to perform operations on memory
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
- Advance Control (AREA)
Abstract
A processor and method for detecting self-modifying code determine memory ownership on a cache line basis. An ownership queue stores cache line addresses and corresponding ownership indexes. Cache line data is translated into instructions, and each instruction is issued together with the ownership index of its associated entry. Each new cache line address is compared with the destination address of each store instruction, and each destination address, once determined, is compared with each cache line address in the ownership queue. Any matching entry is marked as stale, and an exception is raised when an instruction issued from a stale entry is ready to retire. In this way, a match between a cache line and a corresponding store instruction causes an exception. The exception flushes the processor to resolve the potential self-modifying code condition. The present invention can improve the efficiency of the processor.
Description
Technical field
The present invention relates to memory ownership, and more particularly to determining memory ownership on a cache line basis in order to detect self-modifying code.
Background technique
Self-modifying code (SMC) includes at least one instruction, executed by the local processor, that modifies another instruction or instruction sequence that is subsequently to be executed by the processor. Self-modifying code may include a code sequence that modifies code that has just been executed, so that the modified code, now with a new function, is executed again. In another example, self-modifying code modifies code that immediately follows it in sequence and is about to be executed. Although self-modifying code is less common today than in the past, many legacy programs still contain it and should be executed properly. The processor must be able to detect self-modifying code and correct its operation to avoid improper results. The term "processor" as used herein refers to any type of processing unit, including a microprocessor, a central processing unit (CPU), a processing core, a microcontroller, and so on. The term "processor" as used herein also covers any type of processor configuration, such as a chip integrating multiple processing units, or an integrated circuit (IC) that incorporates a system on a chip (SOC).
Modern processors often perform prefetch operations, in which one or more lines of memory are read into an instruction cache (icache). The cache lines of the instruction cache are then parsed into instructions and executed. To maximize efficiency, the fetch unit or a similar element attempts to fill the instruction cache and keep it full, so that instructions are continuously available for execution. Also to maximize efficiency, it is desirable to keep the execution pipeline as full as possible. Modern processors often use out-of-order (OOO) execution, in which an instruction that was received later but is ready to execute may be executed ahead of an instruction that was received earlier but is not yet ready. At least one problem with prefetching and out-of-order operation is that an instruction that has been prefetched and issued for execution may subsequently be modified by self-modifying code. An instruction that has already been issued may therefore have missed the modification, which can lead to improper or unintended operation.
Modern processors need to detect stale instructions or prevent them from completing. A stale instruction is an instruction that should no longer be executed because it has been modified by code. Processors generally divide memory ownership into an instruction region and a data region: the instruction cache owns the instruction region, and the data cache (dcache) owns the data region. The instruction region is intended to hold only instructions for execution, while the data region is available for the data and information stored by a software program. If the instruction cache attempts to read memory owned by the data cache, ownership must be transferred from the data cache, which is a slow and tedious process that tends to serialize operation.
In conventional architectures, ownership was based on page boundaries. A common page size is 4 KB (kilobytes). Although 4 KB is not a significant amount of memory, self-modifying code can cause ownership to thrash between the instruction cache and the data cache, which reduces performance. One solution was to reduce the ownership size to a quarter page, that is, a 1 KB block of a 4 KB page. Even a 1 KB ownership block, however, is still large enough to be problematic for self-modifying code in many situations. Moreover, larger page sizes are also in common use, such as 2 MB (megabytes) or even 1 GB (gigabyte), so ownership granularity has remained a significant factor in reduced overall performance.
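The effect of ownership granularity can be illustrated with a little address arithmetic. The following is only a sketch: the addresses are made up, and the 64-byte figure is the example cache line size given later in the description. It checks whether a store and a piece of code fall into the same ownership block at each granularity.

```python
def same_block(addr_a, addr_b, block_bytes):
    """Return True when both addresses fall inside the same ownership block."""
    return addr_a // block_bytes == addr_b // block_bytes


# Hypothetical addresses: code at 0x1000, a data store 256 bytes further on.
CODE_ADDR = 0x1000
STORE_ADDR = 0x1100

conflicts = {
    "4 KB page": same_block(CODE_ADDR, STORE_ADDR, 4096),
    "1 KB quarter page": same_block(CODE_ADDR, STORE_ADDR, 1024),
    "64-byte cache line": same_block(CODE_ADDR, STORE_ADDR, 64),
}
```

With these values, the store collides with the code at page and quarter-page granularity but not at cache line granularity, which is the kind of false conflict that finer-grained ownership avoids.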
Summary of the invention
According to one embodiment, a processor that determines memory ownership on a cache line basis to detect self-modifying code includes an ownership queue, a fetch system, a processing system, and first and second comparators. The processing system includes a processing front end and an execution system. The ownership queue holds multiple entries, each corresponding to one cache line. The fetch system provides the cache line data of each cache line to the processing front end, determines an ownership index for each cache line, and enters the ownership index and the corresponding cache line address into an entry of the ownership queue. The processing front end translates the cache line data into instructions and issues each instruction for execution. Each issued instruction carries the ownership index of its corresponding entry in the ownership queue. The execution system determines the destination address of each store instruction, and performs a first exception when an instruction that is ready to retire has an ownership index matching an entry of the ownership queue whose stale bit is set. The first comparator compares each cache line address entered into the ownership queue with each destination address, and sets the corresponding stale bit when a match is found. The second comparator compares each destination address determined by the execution system with each cache line address held in the ownership queue, and sets the stale bit of every matching entry.
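The interplay of the ownership queue and the two comparators can be sketched in software. This is a minimal behavioral model under stated assumptions, not the patented hardware: the class and method names are hypothetical, the cache line is assumed to be 64 bytes, and the list of known store targets stands in for the destination addresses held by the store queue.

```python
from dataclasses import dataclass

LINE_BYTES = 64  # assumed cache line size


def line_address(addr):
    """Drop the byte offset so addresses compare at cache line granularity."""
    return addr & ~(LINE_BYTES - 1)


@dataclass
class OwnqEntry:
    cache_line_addr: int
    stale: bool = False
    valid: bool = True


class OwnershipQueue:
    def __init__(self):
        self.entries = []
        self.store_targets = []  # destination addresses resolved so far

    def push_line(self, addr):
        """First comparator: check a newly entered line against known store targets."""
        ca = line_address(addr)
        stale = any(line_address(da) == ca for da in self.store_targets)
        self.entries.append(OwnqEntry(ca, stale))
        return len(self.entries) - 1  # plays the role of the ownership index

    def resolve_store(self, dest_addr):
        """Second comparator: check a resolved store target against every valid entry."""
        self.store_targets.append(dest_addr)
        da = line_address(dest_addr)
        for entry in self.entries:
            if entry.valid and entry.cache_line_addr == da:
                entry.stale = True

    def raises_on_retire(self, ownership_index):
        """An instruction issued from a stale entry triggers the first exception."""
        return self.entries[ownership_index].stale
```

The two comparison directions cover both orderings: a store whose target resolves after the line was fetched, and a line fetched after the store target is already known.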
Performing the first exception causes the execution system to flush the processor, prevents the instruction that triggered the first exception from retiring, and causes the fetch system to fetch that instruction again. The fetch system may mark an entry of the ownership queue as valid when it enters the corresponding cache line address. The processing front end may mark the last instruction issued from each entry of the ownership queue as a last instruction. In this embodiment, the execution system invalidates an entry when the instruction of the corresponding cache line that is marked as a last instruction retires.
In one embodiment, the processor includes a stale detector. The stale detector uses the ownership index of an issued instruction to read the stale bit of the corresponding entry in the ownership queue and, when that stale bit is set, marks the issued instruction to invoke the first exception. In this embodiment, the execution system performs the first exception when an instruction that is ready to retire is marked to invoke it. The processing front end may also set a straddle bit for each instruction derived from cache line data that straddles two cache lines. In this embodiment, when the straddle bit of an issued instruction is set, the stale detector also reads the next sequential entry after the corresponding entry in the ownership queue, and marks the issued instruction to invoke the first exception when the stale bit of that next entry is set.
The fetch system may determine the ownership index as a binary count value that is incremented each time an entry is entered into the ownership queue, and whose total count is at least the total number of entries in the ownership queue. The most significant bit of the ownership index may serve as a wrap bit. The processor may also include an overwrite detector. The overwrite detector uses the ownership index of an issued instruction to read the wrap bit of the corresponding entry in the ownership queue, and marks the issued instruction to invoke the first exception when the wrap bit of the entry does not match the wrap bit of the instruction. The execution system performs the first exception when an instruction that is ready to retire is marked to invoke it.
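One way to picture the wrap bit is as one extra bit on top of the bits that select an entry. The sketch below assumes a queue of 8 entries and hypothetical names; it shows only how a mismatch between the wrap bit carried by an instruction and the wrap bit stored in the entry reveals that the entry has been overwritten by a newer cache line.

```python
QUEUE_ENTRIES = 8                            # assumed size, a power of two
SLOT_BITS = QUEUE_ENTRIES.bit_length() - 1   # 3 bits select an entry
COUNT_MASK = (QUEUE_ENTRIES << 1) - 1        # one extra bit holds the wrap bit


def split(ownership_index):
    """Split an ownership index into (entry slot, wrap bit)."""
    slot = ownership_index & (QUEUE_ENTRIES - 1)
    wrap = (ownership_index >> SLOT_BITS) & 1
    return slot, wrap


class WrapTrackingQueue:
    def __init__(self):
        self.count = 0
        self.wrap_bits = [0] * QUEUE_ENTRIES

    def allocate(self):
        """Enter a new cache line and return the ownership index given to its instructions."""
        ownership_index = self.count
        slot, wrap = split(ownership_index)
        self.wrap_bits[slot] = wrap
        self.count = (self.count + 1) & COUNT_MASK
        return ownership_index

    def overwritten(self, ownership_index):
        """Overwrite check: the entry no longer holds the line the instruction came from."""
        slot, wrap = split(ownership_index)
        return self.wrap_bits[slot] != wrap
```

A single wrap bit distinguishes only adjacent generations of an entry; this sketch does not model whatever additional safeguards the hardware may apply if an entry were reused twice while an old instruction was still in flight.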
The processor may include a store queue with multiple entries. Each entry stores a store instruction issued by the processing front end, together with the destination address determined by the execution system. The execution system may also include a store pipeline, which determines the destination address of each store instruction dispatched for execution and provides each determined destination address to the corresponding entry of the store queue and to the second comparator.
The processing system may use the ownership index of an issued instruction to access the corresponding entry in the ownership queue and set an executing bit in that entry. The processor may also include a stale detector that evaluates the executing bit of every matching entry identified by the second comparator. When the executing bit of any matching entry is set, the stale detector marks the store instruction corresponding to the determined destination address to invoke a second exception. In this embodiment, the execution system performs the second exception when a store instruction that is ready to retire is marked to invoke it. In one embodiment, the second exception causes the execution system to allow the marked store instruction to retire, to flush the processor, and to cause the fetch system to obtain an instruction pointer so that the instruction following the store instruction is fetched from the instruction cache.
The processing front end may set a straddle bit for each instruction derived from cache line data that straddles two cache lines. The processing system uses the ownership index of an issued instruction to access the corresponding entry in the ownership queue and set the executing bit in that entry. In addition, when the straddle bit of the issued instruction is set, the processing front end sets the executing bit of the next sequential entry after the corresponding entry. The processor also includes a stale detector that evaluates the executing bit of every matching entry identified by the second comparator. When the executing bit of any matching entry is set, the stale detector marks the store instruction corresponding to the determined destination address to invoke a pending second exception. In this embodiment, the execution system performs the second exception when a store instruction that is ready to retire is marked to invoke it. The second exception causes the execution system to allow the marked store instruction to retire, to flush the processor, and to cause the fetch system to obtain an instruction pointer so that the instruction following the store instruction is fetched from the instruction cache.
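The executing-bit check can be sketched as follows. This is an illustrative model with hypothetical names: issuing an instruction sets the executing bit of its entry (and of the next entry when the instruction straddles two cache lines), and a store whose destination resolves to a line with a set executing bit is flagged for the second exception. Treating the queue as circular when taking the next entry is an assumption of the sketch.

```python
class Entry:
    def __init__(self, cache_line_addr):
        self.cache_line_addr = cache_line_addr
        self.executing = False  # set once an instruction from this line is issued


def issue(entries, ownership_index, straddle=False):
    """Mark the entry of an issued instruction as executing.

    When the instruction straddles two cache lines, the next sequential
    entry is marked as well (wrap-around is assumed here).
    """
    entries[ownership_index].executing = True
    if straddle:
        entries[(ownership_index + 1) % len(entries)].executing = True


def store_needs_second_exception(entries, dest_line_addr):
    """True when the store targets a line that already has instructions in flight."""
    return any(
        entry.executing
        for entry in entries
        if entry.cache_line_addr == dest_line_addr
    )
```

In this model the store itself is allowed to complete; the flag only signals that everything issued after it must be flushed and fetched again.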
According to one embodiment, a method of determining memory ownership on a cache line basis to detect self-modifying code includes fetching cache lines, each having a cache line address and cache line data. An ownership index is determined for each fetched cache line, and the cache line address and the ownership index are entered into an entry of an ownership queue. When a cache line address is entered into an entry, the method compares that cache line address with the destination address of each store instruction that has been issued, and marks any matching entry as stale. The method also translates the cache line data of each cache line into instructions, each carrying the ownership index that identifies the entry of the ownership queue holding the cache line from which the instruction was translated. The method further issues the instructions for execution and determines the destination address of each store instruction issued for execution. When each destination address is determined, the method compares it with the cache line address of each valid entry in the ownership queue and marks any matching entry as stale. The method also performs a first exception when an instruction that is ready to retire has an ownership index matching an entry of the ownership queue that is marked as stale.
The method may include flushing the processor, preventing the instruction that triggered the first exception from retiring, and fetching again the instruction that caused the first exception.
The method may include marking an entry of the ownership queue as valid when a new cache line address is received, marking the last instruction issued from each valid entry of the ownership queue as a last instruction, and invalidating the corresponding entry of the ownership queue when an instruction marked as a last instruction retires.
The method may include using the ownership index of each issued instruction to access the corresponding entry in the ownership queue, marking the instruction to invoke the first exception when the corresponding entry is marked as stale, and performing the first exception for each instruction that is ready to retire and is marked to invoke the first exception.
The method may include setting a straddle bit for each instruction translated from cache line data that straddles two cache lines. When each instruction is issued, the method uses the ownership index of the instruction to access the corresponding entry in the ownership queue and, when the straddle bit is set, also accesses the next sequential entry in the queue. The instruction is marked to invoke the first exception when the corresponding entry is marked as stale, and also when the straddle bit is set and the next sequential entry in the ownership queue is marked as stale. The method performs the first exception for each instruction that is ready to retire and is marked to invoke the first exception.
The method may include determining the ownership index by incrementing a binary count value whose total count is at least the total number of entries in the ownership queue, and determining a wrap bit as the most significant bit of the binary count value. Each translated instruction is given the same wrap bit as the entry of the ownership queue that holds the cache line from which the instruction was translated. The method uses the ownership index of an instruction to read the corresponding entry of the ownership queue, and compares the wrap bit of the instruction with the wrap bit of the entry that was read. When the wrap bits do not match, the instruction is marked to invoke the first exception, and the first exception is performed for each instruction so marked.
The method may include, when each instruction is issued, using its ownership index to access the corresponding entry of the ownership queue and setting the executing bit of that entry. During translation, the method may set a straddle bit for each instruction derived from cache line data that straddles two cache lines. When an instruction whose straddle bit is set is issued, the method accesses the next sequential entry of the ownership queue and sets the executing bit of that entry as well. When a determined destination address is compared with the cache line address of each valid entry in the ownership queue and a matching entry is found, the method determines whether the executing bit of the matching entry is set. When it is set, the store instruction associated with the determined destination address is marked to invoke a second exception. The method may further allow a store instruction that is ready to retire and is marked to invoke the second exception to retire and complete, flush the processor, and resume operation by fetching the instruction that follows the store instruction in program order.
The present invention can improve the efficiency of a processor.
Brief description of the drawings
The benefits, features, and advantages of the present invention will be better understood from the following description and the accompanying drawings.
Fig. 1 is a simplified block diagram of a processor incorporating an ownership queue, implemented according to one embodiment, for establishing ownership between data and instructions.
Fig. 2 is a simplified block diagram of the ownership queue of Fig. 1, implemented according to one embodiment, together with its interfaces to the other ownership processing modules.
Fig. 3 is a flowchart of the operation of the processing front end of Fig. 1 according to one embodiment.
Fig. 4 is a flowchart of ownership and exception handling according to one embodiment.
Fig. 5 is a flowchart of execution, retirement, and exception handling according to one embodiment.
The reference symbols in the drawings are briefly described as follows:
100: processor; 101: ownership queue; 102: system memory; 103: prefetch engine; 104: processing front end; 105: instruction cache; 106: execution system; 107: fetch unit; 109: decoder; 111: loop queue; 113: loop detector; 115: instruction translator; 117: register alias table; 118: micro-operation; 119: branch predictor; 121: reorder buffer; 123: scheduler; 125: execution units; 127: store queue; 129: store pipeline; 130: data cache; 131: other execution units; 135: retire module; 137, 139: stale detect comparators; 141: overwrite detector; 143, 145: stale detectors; CA: cache line address; DA: destination address; EXB: executing bit; L, T1, T2: fields; IP: instruction pointer; OWNI: ownership index; SDB: straddle bit; STB: stale bit; UOP, UOPX: micro-operations; WB: wrap bit.
Specific embodiment
The inventors have recognized the memory ownership problems caused by self-modifying code. They have therefore developed an ownership queue that establishes memory ownership on a cache line basis in order to detect self-modifying code.
Fig. 1 is a simplified block diagram of a processor 100 incorporating an ownership queue (OWNQ) 101, implemented according to one embodiment, for establishing ownership between data and instructions. The standard instruction set architecture (ISA) of the processor 100 may be the x86 macro-architecture, under which it can correctly execute the majority of application programs designed to run on an x86 processor. An application program is correctly executed if its expected results are obtained. In particular, the processor 100 executes instructions of the x86 instruction set and includes the x86 user-visible register set. The present invention is not limited to the x86 architecture, however; the processor 100 may follow any alternative instruction set architecture known to those of ordinary skill in the art. As shown, the processor 100 is coupled to an external system memory 102, which stores software programs, applications, data, and other information as understood by those of ordinary skill in the art. The processor 100 may include a bus interface unit (BIU) or a similar element (not shown) for interfacing with the system memory 102. In a system-on-chip configuration, the processor 100 and the system memory 102 may be incorporated onto a common integrated circuit along with other processing function modules (not shown).
The processor 100 includes a processing system, which includes a processing front end 104, an execution system 106, and other processing modules described below. The processing front end 104 includes a prefetch (PREFETCH) engine 103, an instruction cache (ICACHE) 105, a fetch unit 107, a decoder 109, a loop queue (LQ) 111, an instruction translator (XLATE) 115, a register alias table (RAT) 117, and a branch predictor 119. The execution system 106 generally includes a reorder buffer (ROB) 121, a scheduler 123 (also known as reservation stations), execution units 125, and a store queue 127. The execution units 125 include at least one store pipeline 129 and other execution units 131, such as one or more integer (INT) units, one or more floating-point (or media) units, and at least one load pipeline. In one embodiment, the load pipeline and the store pipeline may be incorporated into a memory order buffer (MOB) (not shown) or a similar element. The store pipeline 129 may also be coupled to a data cache (DCACHE) 130, which includes one or more levels of data cache, such as a level-1 (L1) cache, a level-2 (L2) cache, and so on. The data cache 130 may also be coupled to the system memory 102. As shown, the reorder buffer 121 also includes a retire module 135, which is described further below.
Additional ownership logic and circuitry is provided along with the ownership queue 101 to make ownership determinations and detect self-modifying code, as described further below. This additional logic includes a first stale detect comparator (STALE DETECT COMPARATOR1) 137, a second stale detect comparator (STALE DETECT COMPARATOR2) 139, an overwrite detector 141, a first stale detector (STALE DETECTOR1) 143, and a second stale detector (STALE DETECTOR2) 145.
In general operation, the prefetch engine 103 fetches program information from the system memory 102 and stores it in cache lines of the instruction cache 105. Each cache line has a predetermined length, such as 64 bytes, although the cache line size is arbitrary and may differ in other configurations. The fetch unit 107 retrieves each cache line from the instruction cache 105 and provides the cache line data to the decoder 109, which parses the data into instruction information. The decoder 109 divides and formats the cache line data into instructions and the information associated with each instruction, such as operands and the like. For example, when the processor 100 supports the x86 instruction set architecture, the instructions are x86 instructions. Each such instruction is referred to herein as a macroinstruction, or macro-op, of the instruction set supported by the processor 100. The macro-ops from the decoder 109 are pushed into the loop queue 111 and then provided to the instruction translator 115, which translates each macro-op into one or more corresponding microinstructions, or micro-ops (uops), configured according to the native instruction set of the processor 100. An instruction pointer (IP) is also determined and provided with each micro-op as it is passed to the reorder buffer 121. The micro-ops are provided to the register alias table 117, which generates dependency information for each micro-op based on its program order, its operand sources, and renaming information.
Each micro-op (with its associated information) from the register alias table 117 is pushed in program order into the reorder buffer 121 and into the scheduler 123. The scheduler 123 includes at least one queue that holds each micro-op and the dependency information received from the register alias table 117. When a received micro-op is ready to execute, the scheduler 123 dispatches it to the appropriate one of the execution units 125. Store micro-ops are provided to the store pipeline 129 for processing, and all other instruction types are provided to the appropriate unit among the other execution units 131 (for example, integer instructions to an integer execution unit and media instructions to a media execution unit). A micro-op is considered ready to execute when all of its dependencies have been resolved. In conjunction with dispatching a micro-op, the register alias table 117 allocates an entry in the reorder buffer 121 for that micro-op, so that micro-ops are allocated in the reorder buffer 121 in program order. The reorder buffer 121 may be configured as a circular queue to ensure that micro-ops retire in program order. The register alias table 117 also provides the corresponding instruction pointer, together with the dependency information, to the reorder buffer 121 for storage in the entry that holds the operands and results of the micro-op. In one embodiment, a separate physical register file (PRF) (not shown) may be included, and the register alias table 117 may allocate or map one or more physical registers of the PRF to each micro-op for storing operands and results.
The results from the execution units 125 are fed back to the reorder buffer 121, which updates the corresponding fields and/or updates architectural registers or the like. In an embodiment with a physical register file, the reorder buffer 121 holds indexes that are used to update the corresponding registers in the PRF. In one embodiment, the register alias table 117 maps architectural registers to physical registers in the PRF and updates the indexes or similar information (not shown) in the reorder buffer 121 that correspond to each micro-op. The indexes in the reorder buffer 121 may be updated during or after execution, and are used to update the contents of the registers in the PRF. The retire module 135 of the reorder buffer 121 ultimately retires micro-ops in program order, to ensure proper operation consistent with the instructions of the original software program or application. When a micro-op is marked or otherwise indicated as having an exception, the retire module 135 takes the appropriate action for that type of exception, as described further below.
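In-order retirement with an exception check can be sketched as a loop over the head of the reorder buffer. The model below is a generic illustration rather than the behavior of the retire module 135 itself: the names are hypothetical, and the exception is represented by a simple tag on the micro-op.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class MicroOp:
    name: str
    done: bool = False
    exception: Optional[str] = None  # hypothetical tag, e.g. "stale"


def retire(rob):
    """Retire completed micro-ops from the head of the ROB, oldest first.

    Returns (names retired, exception tag or None). Retirement stops at the
    first micro-op that is not done, or at one marked with an exception,
    which is left at the head for the exception handling to deal with.
    """
    retired = []
    while rob and rob[0].done:
        if rob[0].exception is not None:
            return retired, rob[0].exception
        retired.append(rob.popleft().name)
    return retired, None
```

Because only the head of the queue is examined, a younger micro-op that finished early cannot retire past an older one, and an exception is acted on only once the marked micro-op has become the oldest in flight.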
Store micro-ops provided to the store pipeline 129 for execution are also pushed into corresponding entries of the store queue 127. When a store micro-op is first pushed from the register alias table 117, the addresses of its operands, including the destination address (DA), may not yet be known. When the store pipeline 129 determines the destination address of a store micro-op being executed, it provides the destination address to the corresponding entry of the store queue 127.
The branch predictor 119 detects branch macro-ops output by the decoder 109 and/or held in the loop queue 111, and predicts whether each branch will be taken. The branch predictor 119 communicates with the fetch unit 107, which may branch to a different location in the instruction cache 105 according to the branch prediction. The fetch unit 107 also communicates with the prefetch engine 103, so that when the branch location is not in the instruction cache 105, the prefetch engine 103 retrieves the corresponding location from the system memory 102 and loads it into the instruction cache 105.
In normal operation, macro operations from decoder 109 are buffered in loop queue 111 and provided to
instruction translator 115. Loop detector 113 identifies a loop when it determines that the instructions
of the loop are being executed repeatedly and that the loop resides entirely, or at least partly, in loop
queue 111. The instructions of the loop are then repeatedly pulled from loop queue 111 rather than from
instruction cache memory 105. In one embodiment, loop detector 113 detects a loop when a predetermined
number of loop iterations has occurred. In a specific embodiment the number of iterations is 24, although
any other suitable number may be used. In one embodiment, loop detector 113 assumes that the loop may
continue indefinitely, so it keeps repeating the loop until a prediction proves incorrect (the loop
branch is not taken). At that point the system is flushed, and fetch unit 107 begins retrieving
information from the next location after the loop in instruction cache memory 105 (or possibly from
another branch location).
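By way of illustration only, the iteration threshold used by loop detector 113 may be modeled in software
as follows. This is a minimal sketch rather than the described hardware: the class name, the method name
and the idea of tracking a loop by the instruction pointer of its taken branch are assumptions made for
the example. Only the threshold of 24 comes from the text.

```python
LOOP_THRESHOLD = 24  # iteration count of the specific embodiment; other values may be used


class LoopDetector:
    """Hypothetical software model of loop detector 113 (names are illustrative)."""

    def __init__(self, threshold=LOOP_THRESHOLD):
        self.threshold = threshold
        self.branch_ip = None  # assumed: a loop is tracked by the IP of its taken branch
        self.count = 0
        self.in_loop = False

    def observe_branch(self, ip, taken):
        """Record one execution of a branch; return True while a loop is identified."""
        if not taken or ip != self.branch_ip:
            # A different branch, or the loop branch was not taken: restart tracking.
            self.branch_ip = ip if taken else None
            self.count = 1 if taken else 0
            self.in_loop = False
            return False
        self.count += 1
        if self.count >= self.threshold:
            self.in_loop = True
        return self.in_loop
```

In this model the loop is reported on the 24th consecutive taken execution of the same branch, and a
not-taken outcome ends it, which corresponds to the point at which the system is flushed.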
When loop detector 113 has detected a loop, fetch unit 107 may continue to fetch cache lines and push
them into the buffers of decoder 109 until the buffers are full, at which point fetching may be
temporarily stalled. In one embodiment, when loop detector 113 detects a loop, fetch unit 107 repeatedly
fetches the cache lines of the loop. In another embodiment, fetch unit 107 is notified that loop detector
113 has detected a loop and begins to fetch beyond the loop, for example from the next sequential
location after the loop. In either case, decoder 109 may become full while a loop is in progress.
When fetch unit 107 pushes cache line data into decoder 109, it also enters the corresponding cache line
address (CA) into a storage element of ownership queue 101 and marks that storage element as valid.
Ownership queue 101 may be organized as a circular buffer or a similar structure, with a push pointer and
a pop pointer that distinguish allocated storage elements from deallocated ones. In another embodiment,
each storage element of ownership queue 101 has a valid bit or valid value that distinguishes valid
storage elements from invalid ones, and the valid bit is set for each new storage element entered into
ownership queue 101. In one embodiment, fetch unit 107 determines an ownership index (OWNI) and a wrap
bit (WB) for the cache line address of each cache line, and the ownership index and wrap bit values are
entered together with the cache line address into the corresponding storage element of ownership queue
101. The ownership index uniquely identifies each storage element in ownership queue 101. The wrap bit is
used to detect overwrites within ownership queue 101.
Register alias table 117 identifies the last microoperation of each cache line according to the
corresponding ownership index and marks that microoperation as the last one of the cache line, so that
this information is passed to reorder buffer 121. When retire module 135 retires a microoperation, it
determines whether the retired microoperation is marked as the last microoperation of a given cache line
in ownership queue 101. If so, retire module 135 instructs ownership queue 101 to release the
corresponding storage element or to invalidate it.
When fetch unit 107 enters each new cache line address into a storage element of ownership queue 101, the
cache line address is also provided to an input of first stale detect comparator 137. Stale detect
comparator 137 also reads each valid destination address (DA) from store queue 127 and compares each
destination address with the new cache line address to determine whether any of them match. Stale detect
comparator 137 may therefore be regarded as a new-entry comparator. When the cache line address matches
any destination address, the stale bit (STB) of the corresponding storage element in ownership queue 101
is set. The stale bit STB indicates that a store microoperation and the cache line hit each other, that
is, a store instruction has modified or will modify the cache line. When a store instruction hits or
collides with a cache line held in a valid storage element of ownership queue 101, any instruction
derived from that cache line may be invalid. When the stale bit STB is set, any microoperation from that
cache line may be invalid (that is, stale).
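The comparison performed by comparator 137 when a new cache line address is entered may be sketched as
follows. The sketch is illustrative only: the function names, the representation of store queue 127 as a
list of (valid, DA) pairs, and the 64-byte line size are assumptions, since the text does not specify a
line size or an address format.

```python
LINE_BYTES = 64  # assumed cache line size; not specified in the text


def line_address(addr):
    """Align a byte address down to the start of its cache line."""
    return addr & ~(LINE_BYTES - 1)


def stale_on_new_line(new_ca, store_queue):
    """Return the STB value for a newly entered cache line address.

    store_queue is a list of (valid, da) pairs, where da is None while the
    destination address of the store has not yet been determined.
    """
    return any(
        valid and da is not None and line_address(da) == line_address(new_ca)
        for valid, da in store_queue
    )
```

Stores whose destination address is not yet known cannot match here. They are caught later, when store
pipeline 129 determines the address and the second comparison (comparator 139) is made.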
The ownership index and wrap bit values are further attached to, or associated with, the corresponding
cache line data provided to decoder 109. Decoder 109 carries a corresponding wrap bit value and ownership
index value with each macro operation to identify the cache line from which the macro operation was
obtained. When multiple macro operations are taken from the same cache line, the same wrap bit and
ownership index are assigned to each of them. In one embodiment, because macro operations are not aligned
with the cache lines in instruction cache memory 105, each macro operation also has a straddle bit SDB.
The straddle bit SDB identifies the case in which a macro instruction straddles two different cache
lines, that is, it begins in one cache line and ends in the next sequential cache line. In that case,
decoder 109 attaches the ownership index of the first line and sets the straddle bit of the macro
operation to true. When the macro operation is contained in a single cache line, the straddle bit is set
to false. When issued to instruction translator 115, each macro operation carries its corresponding wrap
bit, ownership index and straddle bit. When a bit or field (having at least one bit) is set to true it is
set to logic 1, and when it is set to false it is set to logic 0.
Instruction translator 115 translates each macro operation into one or more microoperations. During
translation, each microoperation generated from a macro operation inherits the same wrap bit value,
ownership index value and straddle bit value as that macro operation. Thus, when a macro operation is
translated into three separate microoperations, each of the three carries the same wrap bit, ownership
index and straddle bit values as the original macro operation. These values remain attached to each
microoperation as it passes through register alias table 117.
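The tagging of a macro operation with WB, OWNI and SDB, and the inheritance of those tags by its
microoperations, may be illustrated as follows. This is a sketch under stated assumptions: the type and
function names are hypothetical, a macro operation is described by its start byte and length, and the
line size is passed in as a parameter because the text does not give one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tags:
    wb: int     # wrap bit of the first cache line of the macro operation
    owni: int   # ownership index of the first cache line
    sdb: bool   # straddle bit: True when the macro operation spans two lines


def tag_macro_op(start, length, line_bytes, first_line_tags):
    """Build the tags for a macro operation occupying bytes [start, start + length)."""
    first_line = start // line_bytes
    last_line = (start + length - 1) // line_bytes
    wb, owni = first_line_tags
    return Tags(wb, owni, sdb=(last_line != first_line))


def translate(tags, uop_names):
    """Every microoperation inherits the tags of its macro operation unchanged."""
    return [(name, tags) for name in uop_names]
```

For example, a 6-byte macro operation starting at byte 60 of a 64-byte line ends in the next line, so its
straddle bit is true and all of its microoperations carry the ownership index of the first line.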
An exemplary microoperation uopx, shown at 118 in Fig. 1, represents any microoperation defined for
processor 100 that is issued by register alias table 117 to reorder buffer 121 and scheduler 123. Each
microoperation has multiple fields that facilitate its operation or its execution by execution system
106 of processor 100. One or more fields (not shown) identify the particular instruction and instruction
type together with its associated operands, such as constant operands, addresses, memory locations and
register indexes. Further fields hold the instruction pointer IP, the wrap bit WB, the ownership index
OWNI and the straddle bit SDB. As explained below, each microoperation also has a field T1 indicating an
exception of the first type, a field T2 indicating an exception of the second type, and a field L
indicating whether register alias table 117 has marked the instruction as the last instruction of its
cache line.
When each microoperation is issued from register alias table 117 to reorder buffer 121 and scheduler 123,
register alias table 117 uses the ownership index OWNI of the microoperation to access the corresponding
storage element in ownership queue 101 and sets the executing bit EXB of that storage element. When the
straddle bit of the microoperation is true, indicating a straddling instruction, register alias table
117 also sets the executing bit of the next sequential storage element in ownership queue 101. The
executing bit of a storage element is used to detect a later hit by a store microoperation that was not
caught by the stale detection.
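The setting of EXB at issue may be sketched as below. The function name and the list representation of
the executing bits are hypothetical, and treating the storage element after the last one as the first
one (wrap-around) is an assumption based on ownership queue 101 being organized as a circular buffer.

```python
def mark_executing(exb, owni, sdb):
    """Set EXB for the storage element of a microoperation being issued.

    exb is a list of executing bits indexed by ownership index.
    """
    exb[owni] = True
    if sdb:
        # Straddling instruction: the next sequential storage element is also executing.
        # Wrapping from the last element to the first is an assumption.
        exb[(owni + 1) % len(exb)] = True
```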
When each microoperation is issued from register alias table 117, overwrite detector 141 uses the
ownership index value of the microoperation to access the corresponding storage element in ownership
queue 101 and reads the wrap bit value of that storage element. When the wrap bit of the microoperation
does not match the wrap bit of the corresponding storage element in ownership queue 101, an overwrite has
occurred, and overwrite detector 141 sets bit T1 of the microoperation (marks field T1 as true) to
indicate that a first exception, or an exception of the first type, is to be taken when the
microoperation is ready to retire. In one embodiment, bit T1 is set in the microoperation by overwrite
detector 141 as the microoperation is issued, before it is entered into reorder buffer 121. In another
embodiment, bit T1 of the storage element in reorder buffer 121 is set when or after the microoperation
is entered into reorder buffer 121, either by overwrite detector 141 or by reorder buffer 121 as
instructed by overwrite detector 141. A wrap bit mismatch generally indicates that a loop has caused an
overwrite in ownership queue 101, so that self-modifying code can no longer be detected for the
corresponding cache line. Retire module 135 detects that T1 is set, which indicates that the
microoperation in the corresponding storage element of reorder buffer 121 is marked with an exception of
the first type. An overwrite means that a storage element of ownership queue 101 has been overwritten, so
that modified code associated with the corresponding cache line might go undetected and lead to incorrect
results. The exception of the first type therefore flushes the machine to prevent the incorrect
condition.
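The check made by detector 141 reduces to a single comparison, sketched here for illustration. The
function name and the list of per-element wrap bits are hypothetical; the returned value stands for the
T1 field of the microoperation.

```python
def overwrite_check(uop_wb, uop_owni, queue_wb):
    """Return the value for field T1: True when the wrap bits differ.

    queue_wb holds the wrap bit currently stored in each storage element of
    the ownership queue, indexed by ownership index.
    """
    return queue_wb[uop_owni] != uop_wb
```

A microoperation tagged WB = 0 and OWNI = 0 during the first pass would, for instance, be marked T1 if
the storage element at index 0 has since been rewritten with WB = 1 during the second pass.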
When each microoperation is issued from register alias table 117, first stale detector 143 uses the
ownership index of the microoperation to access the corresponding storage element in ownership queue 101
and reads the stale bit STB of that storage element. When the straddle bit of the microoperation is true,
first stale detector 143 also reads the stale bit STB of the next sequential storage element in ownership
queue 101. When the stale bit STB in ownership queue 101 is true, or when the microoperation is a
straddling instruction as indicated by straddle bit SDB and the stale bit of the next sequential storage
element in ownership queue 101 is true, first stale detector 143 marks the microoperation with an
exception of the first type by setting field T1 to true (that is, by setting bit T1). Stale detector 143
may be regarded as an issue stale detector that detects potentially invalid instructions as they are
issued. As with overwrite detector 141, field T1 may be set to true before, when, or after the
microoperation is entered into its storage element in reorder buffer 121, either by first stale detector
143 or by reorder buffer 121. As noted earlier, the stale bit STB indicates that the cache line has been
modified by a store microoperation, so the instruction may be invalid.
Whenever store pipeline 129 determines a destination address (DA) for a store microoperation, the
destination address is provided not only to update the corresponding storage element in store queue 127
but also to an input of second stale detect comparator 139. Stale detect comparator 139 also accesses all
valid cache line addresses in ownership queue 101 and compares the new destination address with each of
them. Stale detect comparator 139 may therefore be regarded as a new-destination-address comparator. When
there is a match, stale detect comparator 139 sets the stale bit of the corresponding storage element in
ownership queue 101 to true. In addition, when stale detect comparator 139 detects a match, the
corresponding ownership index is provided to an input of second stale detector 145. Stale detector 145
accesses the corresponding storage element in ownership queue 101 and reads its executing bit EXB. When
the executing bit EXB of the storage element is true, stale detector 145 causes the storage element of
the store microoperation in reorder buffer 121 to be marked with a second exception, or an exception of
the second type, by setting field T2 of that storage element to true. Stale detector 145 may be regarded
as an executing stale detector that detects potentially invalid instructions that are already executing.
Stale detector 145 may access the storage element of the store microoperation in reorder buffer 121
directly to set T2, or it may instruct reorder buffer 121 to set T2.
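The combined behavior of comparator 139 and detector 145 when a store destination address becomes known
may be sketched as follows. The `Entry` type, the function name and the 64-byte default line size are
assumptions for illustration; the return value stands for field T2 of the store microoperation.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical model of one ownership queue storage element."""
    ca: int              # cache line address (line-aligned)
    valid: bool = True
    exb: bool = False    # executing bit
    stb: bool = False    # stale bit


def on_store_address(da, queue, line_bytes=64):
    """Process a newly determined store destination address.

    Sets STB on every valid matching storage element and returns the value
    for field T2: True when a matching element is already executing.
    """
    line = da - da % line_bytes
    t2 = False
    for entry in queue:
        if entry.valid and entry.ca == line:
            entry.stb = True
            if entry.exb:
                t2 = True
    return t2
```

When the matching storage element is not yet executing, only STB is set, and the microoperations of that
cache line are caught at issue through T1 instead.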
A simplified view of the storage element of reorder buffer 121 associated with the particular exemplary
microoperation uopx 118 is shown in Fig. 1. Each microoperation storage element has a field T1 indicating
an exception of the first type, a field T2 indicating an exception of the second type, and a field L
indicating whether the instruction is the last microoperation of a cache line as marked by register alias
table 117. Field L is set to true when the microoperation is the last microoperation of a cache line and
to false otherwise. Retire module 135 examines fields T1 and T2 of the microoperation storage elements in
reorder buffer 121 and executes or initiates the corresponding exception routine or procedure. Note that
any microoperation, including a store microoperation, may be marked with an exception of the first type,
but only a store microoperation may be marked with an exception of the second type.
Retire module 135 examines each microoperation when it is ready to retire, for example when it is the
oldest instruction in reorder buffer 121. When a microoperation is ready to retire, retire module 135
examines fields T1, T2 and L in the storage element of that microoperation. When field T1 of a
microoperation is true, retire module 135 raises an exception of the first type for the microoperation,
and when field T2 is true, it raises an exception of the second type for the microoperation. When fields
T1 and T2 are both false and field L is true, retire module 135 instructs ownership queue 101 to release
or invalidate the corresponding storage element, which effectively removes the completed cache line from
ownership queue 101.
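The decision made by module 135 for the oldest microoperation may be summarized as follows. The function
name and the returned action strings are hypothetical. Checking T1 before T2 when both are set is an
assumption, because the text does not state a priority between the two exception types.

```python
def retire_action(t1, t2, last):
    """Return the action taken for a microoperation that is ready to retire."""
    if t1:
        return "exception_type_1"   # flush; the microoperation does not retire
    if t2:
        return "exception_type_2"   # store retires, then the machine is flushed
    if last:
        return "release_entry"      # last microoperation of its line: free the element
    return "retire"
```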
When the oldest microoperation in reorder buffer 121 (that is, the one about to retire) is marked with an
exception of the first type, reorder buffer 121 broadcasts a corresponding exception signal within
processor 100 and the processing system is flushed. In this case, all macro operations and
microoperations in the execution pipeline are effectively invalidated, including the microoperation that
caused the exception. When an exception of the first type occurs, all unretired microoperations are
flushed, including any unretired store microoperations in store queue 127. Store microoperations that
have already retired remain in store queue 127 until their data has been committed to the memory
hierarchy (for example, data cache 130 and/or system memory 102). The microoperation that caused the
exception of the first type is not allowed to retire, and the instruction pointer recorded for it in
reorder buffer 121 can be used to locate the microoperation in instruction cache memory 105. Prefetch
engine 103 and fetch unit 107 are temporarily stalled. Processor 100 traps to an exception routine in a
microcode read-only memory (not shown) of processor 100, and a corresponding exception code indicates the
type of exception. Once the processing system has been flushed, the exception routine retrieves the
instruction pointer and passes it to fetch unit 107 to re-fetch the macro operation associated with the
microoperation that caused the exception.
An exception of the second type, which applies to store microoperations, is similar to the exception of
the first type that applies to other microoperations. In this case, however, the store microoperation is
allowed to retire, so that it completes its operation and updates the memory location indicated by its
destination address. Because the memory location is initially owned by instruction cache memory 105, and
the store microoperation is a data operation that requires ownership by data cache 130 of processor 100,
a snoop is first initiated to invalidate the corresponding cache line in instruction cache memory 105.
The exception ensures that the memory modification and the invalidation have both taken place. As with
the exception of the first type, the exception routine for the exception of the second type flushes the
machine, then retrieves the instruction pointer and passes it to the fetch unit so that fetching restarts
at that location. Because the store microoperation that caused the exception of the second type is
allowed to complete, the instruction pointer is advanced to the instruction following the store in
instruction cache memory 105, and operation resumes from that location.
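The difference between the two exception types in where fetching resumes may be illustrated with a short
sketch. The function name and its parameters are hypothetical, and the length of the faulting instruction
is passed in as an assumption of the example.

```python
def resume_ip(exception_type, ip, instr_len):
    """Return the instruction pointer handed to the fetch unit after the flush.

    Type 1: the faulting microoperation did not retire, so its macro operation
    is fetched again. Type 2: the store retired, so fetching resumes at the
    instruction that follows it.
    """
    if exception_type == 1:
        return ip
    if exception_type == 2:
        return ip + instr_len
    raise ValueError("unknown exception type")
```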
Fig. 2 is a simplified functional block diagram of ownership queue 101 of Fig. 1, implemented according
to an embodiment, together with its interfaces to the other ownership processing modules. Ownership queue
101 has multiple storage elements. Each storage element has a field WRAP that stores the wrap bit, a
field OWNI that stores the corresponding ownership index value, an executing field that stores the
corresponding executing bit, a valid field that stores the valid bit, a cache line address field that
stores the corresponding cache line address, and a stale field that stores the corresponding stale bit.
In one embodiment, the ownership index is a count value that is incremented each time a storage element
is entered into ownership queue 101. To ensure that each storage element in ownership queue 101 has a
unique ownership index value, the number of bits B of the ownership index corresponds to the number N of
storage elements in ownership queue 101 such that 2^B ≥ N. In the example shown in Fig. 2, the number of
storage elements in ownership queue 101 is N = 32 and the ownership index has 5 bits. In one embodiment,
fetch unit 107 determines the wrap bit in a similar way, treating it as an additional most significant
bit of the ownership index. In this case, the wrap bit is 0b (where b denotes a binary digit) while the
ownership index counts from 0 up to a maximum value, the maximum value reflecting the total number of
storage elements in ownership queue 101. When the ownership index is reset to 0 and counts up to the
maximum value again, the wrap bit is 1b. In other words, the wrap bit WB toggles between its two values
on each complete pass through ownership queue 101. For a given number of index bits B, the total number
of storage elements may be smaller than the maximum possible number. For example, with a total of 26
storage elements, on the first pass (OWNI counts from decimal 0 to decimal 25 with WB equal to 0)
WB|OWNI counts from 0|00000b to 0|11001b. On the second pass (OWNI counts from decimal 0 to decimal 25
with WB equal to 1), it counts from 1|00000b to 1|11001b. The sequence then repeats in the same manner.
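The counting scheme for WB and OWNI, including the case in which the number of storage elements is not a
power of two, may be reproduced with a short sketch. The function names are hypothetical and the code
models only the counter, not the queue itself.

```python
def index_bits(n_entries):
    """Smallest B such that 2**B >= n_entries (at least one bit)."""
    return max(1, (n_entries - 1).bit_length())


def tag_sequence(n_entries, count):
    """Return the (wb, owni) pairs assigned to `count` successive cache lines."""
    wb, owni = 0, 0
    tags = []
    for _ in range(count):
        tags.append((wb, owni))
        owni += 1
        if owni == n_entries:
            owni = 0   # index resets at the end of a pass
            wb ^= 1    # wrap bit toggles on every complete pass
    return tags
```

With 26 storage elements, the 26th cache line receives WB|OWNI = 0|11001b, the 27th receives 1|00000b,
and the 53rd returns to 0|00000b, which matches the two passes described above.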
As described above, fetch unit 107 inserts each new cache line address CA into the cache line address
field, sets the corresponding valid bit in the valid field, determines the corresponding ownership index
for field OWNI and determines the corresponding wrap bit WB for field WRAP. The newly entered cache line
address is provided to an input of stale detect comparator 137, which also receives the destination
addresses DA from store queue 127. When the new cache line address matches any destination address from
store queue 127, the corresponding stale bit in the stale field is set to true. When each microoperation
is issued from register alias table 117, register alias table 117 uses the ownership index of the
microoperation to access the corresponding storage element in ownership queue 101 and sets its executing
bit EXB. In addition, when the straddle bit of the microoperation is set, indicating a straddling
microoperation, register alias table 117 accesses the next storage element in ownership queue 101 and
sets its executing bit as well. When the last microoperation of a cache line retires, reorder buffer 121
accesses the corresponding storage element in ownership queue 101 and resets or clears its valid bit.
Store pipeline 129 determines the destination address of each store microoperation and stores it in the
corresponding storage element of store queue 127. The destination address is also provided to second
stale detect comparator 139, which accesses the cache line addresses CA in ownership queue 101. When the
newly determined destination address matches any of the cache line addresses in ownership queue 101,
stale detect comparator 139 sets the stale bit of the corresponding storage element in ownership queue
101. In addition, the ownership index value of the matching storage element is provided to second stale
detector 145, which uses it to access the corresponding storage element in ownership queue 101 and read
its executing bit EXB. When the executing bit of that storage element is true, stale detector 145 marks
(or causes to be marked) the storage element of the colliding store microoperation in reorder buffer 121
with an exception of the second type.
As described above, overwrite detector 141 receives the wrap bit value and the ownership index value of
each microoperation issued from register alias table 117 and uses the ownership index to read the wrap
bit value of the corresponding storage element in ownership queue 101. When the wrap bit WB of that
storage element does not match the wrap bit of the microoperation, overwrite detector 141 marks the
microoperation (or causes it to be marked) with an exception of the first type. Likewise, first stale
detector 143 receives the ownership index value and the straddle bit value of each microoperation issued
from register alias table 117 and uses the ownership index value to read the stale bit of the
corresponding storage element in ownership queue 101. When the straddle bit of the issued microoperation
is true, indicating a straddling microoperation, stale detector 143 also reads the stale bit of the next
sequential storage element in ownership queue 101. When the stale bit of any accessed storage element is
true, stale detector 143 marks the microoperation (or causes it to be marked) with an exception of the
first type.
A first storage element, shown at the top of ownership queue 101, has a wrap bit WB of 1b, an ownership
index of 00000b, an executing bit EXB of 0b, a valid bit of 1b, a cache line address CA_33 and a stale
bit STB of 0b. A second storage element, immediately below the first, has a wrap bit WB of 1b, an
ownership index of 00001b, an executing bit EXB of 0b, a valid bit of 1b, a cache line address CA_34 and
a stale bit STB of 0b. A third storage element, immediately below the second, has a wrap bit WB of 0b, an
ownership index of 00010b, an executing bit EXB of 0b, a valid bit of 0b, a cache line address CA_03 and
a stale bit STB of 0b. Toward the end of ownership queue 101, the last five storage elements hold cache
line addresses CA_28 to CA_32 with ownership index values 11011b to 11111b, respectively. The storage
element holding CA_28 has an executing bit, a valid bit and a stale bit that are all 0b. The next three
storage elements, holding CA_29 to CA_31, each have an executing bit of 1b and a valid bit of 1b. The
storage elements holding CA_29 and CA_31 have a stale bit of 0b, whereas the storage element holding
CA_30 has a stale bit of 1b. The last storage element, holding CA_32, is valid and not yet executing, but
is marked stale.
During the first pass, cache line addresses CA_1 to CA_32 fill ownership queue 101 with a wrap bit value
of 0b. At the start of the second pass, the first two storage elements from the first pass are
overwritten by storage elements holding cache line address CA_33 with ownership index value 00000b and
cache line address CA_34 with ownership index value 00001b, respectively, each with a wrap bit WB of 1b.
These new storage elements (33 and 34) are valid, but none of their microoperations is executing yet. The
third through twenty-eighth storage elements have been invalidated (possibly because they have
completed). The twenty-ninth and thirty-first storage elements are valid, and each has at least one
microoperation executing. The thirtieth storage element is valid and has at least one microoperation
still executing, but has been marked stale. The thirty-second storage element has no microoperation
issued from register alias table 117, so it is not yet marked as executing, but its stale bit has been
set to indicate a collision or hit with a store instruction.
When fetch unit 107 has counted the ownership index up to 11111b while the wrap bit WB is 0b, as shown by the last storage element of ownership queue 101 holding cache line address CA_32 (the first pass), it sets the wrap bit to 1b, resets the ownership index to 00000b and starts counting again, as shown by the storage element holding cache line address CA_33 (the start of the second pass). The wrap bit WB of the following 31 storage elements read by fetch unit 107 remains at 1b until the ownership index is again reset to 00000b, and operation repeats in this manner. When a loop is detected, macro-operations are no longer pushed from decoder 109 into loop queue 111, but fetch unit 107 continues to read cache lines from instruction cache 105 into ownership queue 101 and decoder 109, so the storage elements of ownership queue 101 that correspond to the loop instructions may be overwritten by fetch unit 107. In that case, processor 100 may no longer be able to detect self-modifying code for those cache lines. The wrap bit WB of a micro-operation that is issued from register alias table 117 and lies within the loop then no longer equals the wrap bit of the overwritten storage element in ownership queue 101. When the wrap bit of an issued micro-operation does not match the wrap bit of the corresponding storage element in ownership queue 101, overwrite detector 141 detects that the cache line has been overwritten and marks the micro-operation (or causes it to be marked) for an exception of the first type. This holds even if the storage element in ownership queue 101 has been marked invalid or popped from the queue. An invalid or popped storage element remains in ownership queue 101 until it is overwritten.
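The counting scheme above can be illustrated with a minimal sketch. It assumes a 32-element queue whose six-bit count supplies the WB bit as its most significant bit; the class and method names (`OwnershipQueue`, `push`, `was_overwritten`) are hypothetical and are not taken from the patent.

```python
QUEUE_SIZE = 32                  # storage elements, indexed 00000b..11111b
INDEX_MASK = QUEUE_SIZE - 1      # low five bits select the storage element
COUNT_MASK = 2 * QUEUE_SIZE - 1  # six-bit count: WB plus five index bits


class OwnershipQueue:
    """Hypothetical model of the index/WB counting only."""

    def __init__(self):
        self.elements = [None] * QUEUE_SIZE
        self.count = 0

    def push(self, cache_line_address):
        """Write the next storage element; return its (WB, index) tag."""
        index = self.count & INDEX_MASK
        wrap_bit = self.count // QUEUE_SIZE   # most significant bit of the count
        self.elements[index] = {"ca": cache_line_address, "wb": wrap_bit}
        self.count = (self.count + 1) & COUNT_MASK
        return wrap_bit, index

    def was_overwritten(self, uop_wrap_bit, uop_index):
        """True if the element's WB no longer matches the micro-operation's WB."""
        return self.elements[uop_index]["wb"] != uop_wrap_bit
```

In this sketch, pushing CA_1 through CA_32 yields tags (0, 0) through (0, 31), and pushing CA_33 yields (1, 0); a micro-operation still carrying tag (0, 0) then fails the WB comparison, which is the condition that leads to the first-type exception.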
Fig. 3 is a flowchart of the operation of processing front end 104 according to an embodiment. In first block 301, a cache line is fetched (for example, from system memory 102) and stored in instruction cache 105, for example by prefetch engine 103. In next block 303, the wrap bit and ownership index value of the next cache line are determined, for example by fetch unit 107, and this information is pushed together with the cache line address into the next available storage element of ownership queue 101. Fetch unit 107 also sets the valid bit of that storage element. As described earlier, ownership queue 101 may be implemented as a circular buffer, in which case the valid bits identify which storage elements of ownership queue 101 are currently valid at any point in time. In an alternative embodiment, push and pop pointers may be used instead.
As shown in next block 305, when a new cache line address is pushed into ownership queue 101, it is compared with each valid target address in store queue 127. As shown in next query block 307, if a hit is determined, then in block 309 the stale bit STB of the storage element that received the new cache line address is set. Once the stale bit has been set, or if there is no hit, operation for ownership queue 101 is complete.
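Blocks 303 to 309 can be sketched as follows. This is only an illustration: the 64-byte line size, the dictionary layout and the names `line_of` and `add_cache_line` are assumptions, since the text does not specify how addresses are reduced to cache line granularity.

```python
LINE_BYTES = 64  # assumed cache line size; the source does not state one


def line_of(address):
    """Drop the byte offset so addresses within one cache line compare equal."""
    return address & ~(LINE_BYTES - 1)


def add_cache_line(cache_line_address, store_queue):
    """Build a new ownership-queue element and compare it with pending stores.

    store_queue is a list of dicts with 'valid' and 'target' keys; 'target'
    stays None until the target address of the store has been determined.
    """
    element = {"ca": line_of(cache_line_address), "valid": True,
               "stb": False, "exb": False}
    for store in store_queue:
        if store["valid"] and store["target"] is not None:
            if line_of(store["target"]) == element["ca"]:
                element["stb"] = True   # block 309: hit, so set STB
                break
    return element
```

A store whose target address is still unknown cannot hit here; that case is covered later, when the target address is determined and compared against the ownership queue.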
Meanwhile, as shown in block 311, when a new cache line address is pushed into ownership queue 101, the corresponding cache line data is pushed into decoder 109 together with the wrap bit and ownership index. In next block 313, decoder 109 parses the macro-operations in the cache line and attaches to each macro-operation the wrap bit and ownership index of the cache line that contains it. In addition, decoder 109 determines whether a macro-operation straddles two cache lines, that is, whether it begins in one cache line and ends in the next consecutive cache line. If so, the straddle bit of the macro-operation is set. At this point, each macro-operation carries a wrap bit value, an ownership index value and a straddle bit value.
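The tagging performed in block 313 might look like the sketch below. The `MacroOp` record, the `tag_macro_op` helper and the 64-byte line size are illustrative assumptions; the patent only states that each macro-operation receives the WB value, the ownership index and a bit indicating whether it spans two cache lines.

```python
from dataclasses import dataclass

LINE_BYTES = 64  # assumed cache line size


@dataclass
class MacroOp:
    start: int       # address of the first instruction byte
    length: int      # instruction length in bytes
    wrap_bit: int    # WB of the cache line in which the instruction starts
    index: int       # ownership index of that cache line
    straddle: bool   # True if the instruction ends in the next cache line


def tag_macro_op(start, length, wrap_bit, index):
    """Attach ownership tags and compute whether two cache lines are spanned."""
    last_byte = start + length - 1
    straddle = (start // LINE_BYTES) != (last_byte // LINE_BYTES)
    return MacroOp(start, length, wrap_bit, index, straddle)
```

For example, a 4-byte instruction starting at offset 62 of a line occupies offsets 62 and 63 of that line and offsets 0 and 1 of the next, so it is tagged as spanning two lines.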
As shown in block 315, the macro-operations are then pushed into loop queue 111 and, as shown in block 317, then into instruction translator 115, which translates them into corresponding micro-operations. As described earlier, each macro-operation is translated into one or more micro-operations. Each micro-operation carries the wrap bit value, ownership index value and straddle bit value of the macro-operation from which it was translated. The instruction pointer of each micro-operation may also be determined and attached to the micro-operation at this point. Alternatively, the instruction pointer is attached to each micro-operation in block 319 or block 321. In any of these configurations, the instruction pointer is ultimately pushed into reorder buffer 121 together with each micro-operation. In next block 319, the micro-operations are pushed into register alias table 117, which generates dependency information for each micro-operation based on program order, operands and renaming information. In block 321, register alias table 117 identifies and marks each micro-operation that is the last one of a cache line, for example by setting field L to true as in the embodiment described earlier. This information is passed to reorder buffer 121 and stored in the corresponding storage element of reorder buffer 121, so that retire module 135 can recognize when the instructions of each cache line have all been processed. The micro-operations are then issued from register alias table 117 for execution and for the ownership and exception handling described below.
Fig. 4 is a flowchart of ownership and exception handling according to an embodiment. In first block 401, register alias table 117 issues each micro-operation to reorder buffer 121 and scheduler 123. In addition, each store micro-operation is also pushed into store queue 127. Operation continues to block 402, in which the ownership index of a micro-operation issued from register alias table 117 is used to access the corresponding storage element in ownership queue 101. This operation was described above as being performed by several functional blocks of processor 100, but it may be consolidated into common logic. As micro-operations are issued from register alias table 117, operation then proceeds to three different blocks: block 403, block 405 and block 411.
In block 403, the executing bit EXB of the storage element is set. In addition, if the straddle bit of the micro-operation is also true, the next consecutive storage element in ownership queue 101 is accessed as well and its executing bit is also set. At this point, at least one cache line in ownership queue 101 accessed by the micro-operation has been marked as executing, meaning that at least one micro-operation of that cache line has been issued for execution. After one or two executing bits have been set, this branch of the flowchart is complete.
In block 405, the wrap bit WB of the storage element is retrieved and compared with the wrap bit WB of the micro-operation. If the wrap bit WB of the micro-operation does not match the wrap bit WB of the corresponding storage element in ownership queue 101, as determined in next query block 407, operation proceeds to block 409, in which the micro-operation is marked for an exception of the first type (for example, by setting T1 to true). After the micro-operation has been marked because of a mismatch, or if the wrap bits are determined to match, this branch of the flowchart is complete.
In block 411, the stale bit STB of the storage element accessed in ownership queue 101 is retrieved. In addition, if the straddle bit of the micro-operation is true, the stale bit of the next consecutive storage element of ownership queue 101 is also retrieved. In block 413, it is determined whether a stale bit is set. If either of the two stale bits is set, operation proceeds to block 409, in which the micro-operation is marked for an exception of the first type (for example, by setting T1 to true). After the micro-operation has been marked in block 409, or if neither stale bit is set, this branch of the flowchart is complete.
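The three branches taken at issue time (blocks 403, 405 to 409, and 411 to 413) can be summarized in one sketch. The dictionary fields and the function name `on_issue` are hypothetical, and the wrap-around to element 0 after the last element is an assumption based on the queue being described as a circular buffer.

```python
QUEUE_SIZE = 32  # assumed number of storage elements


def on_issue(uop, queue):
    """Apply the EXB, WB and STB handling for one issued micro-operation.

    uop:   dict with 'index', 'wb', 'straddle' and 't1' fields.
    queue: list of dicts with 'wb', 'stb' and 'exb' fields.
    """
    touched = [uop["index"]]
    if uop["straddle"]:
        # Assumed circular wrap to element 0 after the last element.
        touched.append((uop["index"] + 1) % QUEUE_SIZE)

    for i in touched:                        # block 403: mark as executing
        queue[i]["exb"] = True

    if queue[uop["index"]]["wb"] != uop["wb"]:   # blocks 405 to 409
        uop["t1"] = True                         # element was overwritten

    if any(queue[i]["stb"] for i in touched):    # blocks 411 to 413
        uop["t1"] = True                         # a store hit this line

    return uop
```

Both checks lead to the same T1 mark, so the retirement logic does not need to know which of the two conditions triggered it.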
As described earlier, each micro-operation issued to scheduler 123 is eventually dispatched to a corresponding one of the execution units 125 when it is ready to execute. This includes dispatching store micro-operations to store pipeline 129, as shown in block 415. In next block 417, store pipeline 129 determines the target address of the store micro-operation and updates the corresponding storage element in store queue 127. In next block 419, as each new target address is determined, it is compared with the valid cache line addresses in ownership queue 101. In block 421, it is determined whether the new target address matches a valid cache line address. If the new target address does not match any of the valid cache line addresses in ownership queue 101, operation is complete.
If a new target address matches a valid cache line address, operation proceeds to block 423, in which the stale bit of each matching storage element is set. In addition, the ownership index of the matching storage element is forwarded to stale detector 145. In next block 425, stale detector 145 uses the provided ownership index to access the corresponding storage element and retrieve its executing bit EXB. In next query block 427, if the executing bit EXB is determined to be true, operation proceeds to block 429, in which the conflicting store micro-operation is marked for an exception of the second type (for example, by setting T2 to true). Operation is complete when the executing bit EXB is determined to be false in block 427, or after the store micro-operation has been marked in block 429.
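The store-side comparison of blocks 419 to 429 can be sketched in the same style. The function name `on_store_address`, the field names and the 64-byte line size are assumptions made for illustration.

```python
LINE_BYTES = 64  # assumed cache line size


def on_store_address(store, queue):
    """Compare a newly determined store target with the ownership queue.

    store: dict with 'target' and 't2' fields.
    queue: list of dicts with 'valid', 'ca', 'stb' and 'exb' fields.
    """
    target_line = store["target"] & ~(LINE_BYTES - 1)
    for element in queue:
        if element["valid"] and element["ca"] == target_line:
            element["stb"] = True      # block 423: mark the line stale
            if element["exb"]:         # blocks 425 to 427: already executing?
                store["t2"] = True     # block 429: mark the store itself
    return store
```

The distinction matters because a line whose EXB is still clear can be caught later through its STB when its micro-operations are issued, whereas a line that is already executing has passed that check, so the store is marked for T2 instead.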
Fig. 5 is a flowchart of execution, retirement and exception handling according to an embodiment. In first block 501, micro-operations are dispatched from scheduler 123 to execution units 125 as described earlier. A dispatched micro-operation is normally executed, although this may not be the case under certain operating conditions. In next block 503, retire module 135 of reorder buffer 121 identifies the next micro-operation to be retired. In next query block 505, it is determined (for example, by retire module 135) whether field T1 of the micro-operation to be retired is set to true. If so, operation proceeds to block 507, in which the exception of the first type is performed, which includes flushing processor 100. In addition, the instruction that caused the exception of the first type is fetched again from instruction cache 105, as described earlier. Exception handling is then complete.
As shown in next query block 509, if T1 is not true but T2 is determined to be true (for example, by retire module 135), operation proceeds to block 511, in which the exception of the second type is performed: the store micro-operation is allowed to complete and retire, and processor 100 is flushed. After the exception initiated by the store micro-operation, operation resumes at the next instruction in instruction cache 105. Exception handling is then complete. In block 513, if neither T1 nor T2 is true, the micro-operation is allowed to retire. In block 514, if field L of the micro-operation is set to true, indicating that it is the last micro-operation of a cache line, then in block 515 retire module 135 instructs ownership queue 101 to invalidate the corresponding storage element, and operation is complete. The storage element may be invalidated, for example, by marking it invalid or by popping it from ownership queue 101. If field L is false, operation is complete once the instruction has retired.
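The retirement decisions of Fig. 5 reduce to a short priority chain, sketched below. The function name `retire` and the action strings are purely illustrative; they only label the steps named in the text and do not model an actual pipeline flush.

```python
def retire(uop, queue):
    """Return the actions taken for the next micro-operation to retire.

    uop:   dict with 't1', 't2', 'last' (field L) and 'index' fields.
    queue: list of dicts with a 'valid' field (the ownership queue).
    """
    if uop["t1"]:
        # Block 507: the instruction does not retire; flush and fetch it again.
        return ["flush", "refetch_instruction"]
    if uop["t2"]:
        # Block 511: the store retires first, then the processor is flushed.
        return ["retire", "flush", "fetch_next_instruction"]
    actions = ["retire"]                      # block 513
    if uop["last"]:                           # blocks 514 and 515
        queue[uop["index"]]["valid"] = False
        actions.append("invalidate_element")
    return actions
```

T1 is tested before T2, mirroring query blocks 505 and 509, so a micro-operation that carries both marks is handled as a first-type exception.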
The foregoing description is presented to enable a person of ordinary skill in the art to make and use the present invention as provided in the context of a particular application and its requirements. Although the present invention has been described in considerable detail with reference to certain preferred versions, other versions and variations are possible and contemplated. Various modifications of the embodiments described above will be apparent to those of ordinary skill in the art, and the general principles defined above may readily be applied to other embodiments. For example, the circuits described herein may be implemented in any suitable manner, including logic devices or similar circuitry.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit its scope. Any person skilled in the art may make further improvements and changes on this basis without departing from the spirit and scope of the present invention. The scope of protection of the present invention is therefore defined by the following claims.
Claims (21)
1. A processor for determining memory ownership on a cache line basis to detect self-modifying code, characterized in that the processor comprises:
an ownership queue comprising a plurality of storage elements;
a fetch system that provides cache line data of a plurality of cache lines to a processing front end of a processing system, wherein for each cache line the fetch system determines an ownership index and enters the ownership index, together with the corresponding cache line address, into one of the plurality of storage elements of the ownership queue;
wherein the processing front end converts the cache line data of the plurality of cache lines into a plurality of instructions, each instruction including the ownership index of the storage element of the ownership queue that stores the cache line address corresponding to the cache line data from which that instruction was produced, and the processing front end issues each instruction for execution;
wherein the processor further comprises an execution system that determines a target address of each store instruction that has been issued, and that performs a first exception when the stale bit is set in the storage element of the ownership queue whose ownership index matches that of an instruction about to retire;
a first comparator that compares each cache line address entered into the ownership queue with each target address that has been determined, and that sets the stale bit when one of the cache line addresses is determined to match one of the target addresses; and
a second comparator that, as each target address is determined by the execution system, compares the target address with each cache line address stored in a valid storage element of the ownership queue, and that also sets the stale bit of each matching storage element,
wherein, when a cache line address is entered, the corresponding storage element of the ownership queue is made a valid storage element.
2. The processor according to claim 1, characterized in that the first exception causes the execution system to flush the processor so as to prevent the instruction that caused the first exception from retiring, and causes the fetch system to fetch again, from an instruction cache, the instruction that caused the first exception.
3. The processor according to claim 1, characterized in that the processing front end marks as a last instruction the last instruction of a corresponding one of the plurality of storage elements of the ownership queue;
and, when an instruction that retires is marked as the last instruction, the execution system invalidates the corresponding one of the plurality of storage elements of the ownership queue.
4. The processor according to claim 1, characterized by further comprising:
a stale detector that uses the ownership index of an instruction being issued to read the stale bit of the corresponding storage element of the ownership queue and that, when the stale bit of the corresponding storage element is set, causes the instruction being issued to be marked to invoke the first exception;
wherein the execution system performs the first exception when an instruction about to retire is marked to invoke the first exception.
5. The processor according to claim 4, characterized in that
the processing front end further sets a straddle bit in each instruction derived from cache line data that straddles two cache lines; and
wherein, when the straddle bit of the instruction being issued is set, the stale detector reads the stale bit of the next consecutive storage element after the corresponding storage element in the ownership queue and, when the stale bit of that next consecutive storage element is set, the stale detector causes the instruction being issued to be marked to invoke the first exception.
6. The processor according to claim 1, characterized in that the fetch system determines the ownership index as a binary count value that is incremented as each storage element is entered into the ownership queue, the total number of binary count values being at least the total number of storage elements in the ownership queue;
wherein the most significant bit of the ownership index comprises a wrap bit;
the processor further comprises an overwrite detector that uses the ownership index of an instruction being issued to read the wrap bit of the corresponding storage element of the ownership queue and that, when the wrap bit of the corresponding storage element does not match the corresponding wrap bit of the instruction being issued, causes the instruction being issued to be marked to invoke the first exception;
wherein the execution system performs the first exception when an instruction about to retire is marked for the first exception.
7. The processor according to claim 1, characterized by further comprising:
a store queue comprising a plurality of storage elements, each of which stores a store instruction issued by the processing front end and a target address determined by the execution system;
wherein the execution system further comprises a store pipeline that determines the target address of each store instruction dispatched for execution and that provides each determined target address to the corresponding storage element of the store queue and to the second comparator.
8. The processor according to claim 1, characterized in that
the processing system uses the ownership index of an instruction being issued to access the corresponding storage element of the ownership queue and to set the executing bit in that storage element;
the processor further comprises a stale detector that evaluates the executing bit of each matching storage element determined by the second comparator and that, when the executing bit of any matching storage element is set, causes the store instruction corresponding to the determined target address to be marked to invoke a second exception;
wherein the execution system performs the second exception when a store instruction about to retire is marked to invoke the second exception.
9. The processor according to claim 8, characterized in that the second exception causes the execution system to allow the store instruction marked to invoke the second exception to retire, flushes the processor, and causes the fetch system to obtain an instruction pointer for fetching, from an instruction cache, the instruction that follows the store instruction.
10. The processor according to claim 1, characterized in that the processing front end further sets a straddle bit of each instruction derived from cache line data that straddles two cache lines;
wherein the processing system uses the ownership index of an instruction being issued to access the corresponding storage element of the ownership queue and to set the executing bit of that storage element and, when the straddle bit of the instruction being issued is set, to set the executing bit of the next consecutive storage element after the corresponding storage element.
11. The processor according to claim 10, characterized by further comprising:
a stale detector that evaluates the executing bit of each matching storage element determined by the second comparator and that, when the executing bit of any matching storage element is set, causes the store instruction corresponding to the determined target address to be marked to invoke a pending second exception;
wherein the execution system performs the second exception when a store instruction about to retire is marked to invoke the second exception, and wherein the second exception causes the execution system to allow the store instruction marked to invoke the second exception to retire, flushes the processor, and causes the fetch system to obtain an instruction pointer for fetching, from an instruction cache, the instruction that follows the store instruction.
12. A method of determining memory ownership on a cache line basis to detect self-modifying code, characterized by comprising:
fetching a plurality of cache lines, each having a cache line address and cache line data, determining an ownership index for each fetched cache line, and entering the cache line address and the ownership index into one of a plurality of storage elements of an ownership queue;
when each cache line address is entered into a storage element of the ownership queue, comparing the cache line address with each target address that has been determined for each store instruction that has been issued and, when one of the cache line addresses matches one of the target addresses, marking the matching storage element as stale;
converting the cache line data of the plurality of cache lines into a plurality of instructions, each instruction including the ownership index determined for the storage element of the ownership queue that stores the cache line from which the instruction was converted;
issuing the plurality of instructions for execution;
determining the target address of each store instruction issued for execution;
as each target address is determined, comparing the determined target address with the cache line address of each valid storage element of the ownership queue and marking any matching storage element as stale, wherein the corresponding storage element of the ownership queue is made a valid storage element when a new cache line address is received; and
performing a first exception when the ownership index of an instruction about to retire matches a storage element of the ownership queue that is marked as stale.
13. The method according to claim 12, characterized in that the step of performing the first exception comprises flushing the processor so as to prevent the instruction that invoked the first exception from retiring, and fetching again the instruction that invoked the first exception.
14. The method according to claim 12, characterized by further comprising:
marking as a last instruction the last instruction of each valid storage element of the ownership queue; and, when an instruction marked as the last instruction has retired, invalidating the corresponding storage element of the ownership queue.
15. The method according to claim 12, characterized by further comprising:
as each instruction is issued, using the ownership index it includes to access the corresponding storage element of the ownership queue and, when the corresponding storage element is marked as stale, marking the instruction to invoke the first exception;
wherein the step of performing the first exception further comprises performing the first exception for each instruction that is about to retire and is marked to invoke the first exception.
16. The method according to claim 12, characterized by further comprising:
during the converting, setting a straddle bit for each instruction converted from cache line data that straddles two of the plurality of cache lines;
as each instruction is issued, using the ownership index it includes to access the corresponding storage element of the ownership queue and, when the straddle bit is set, accessing the next consecutive storage element of the ownership queue;
when the corresponding storage element is marked as stale, marking the instruction to invoke the first exception; and
when the straddle bit is set and the next consecutive storage element of the ownership queue is marked as stale, marking the instruction to invoke the first exception;
wherein the step of performing the first exception comprises performing the first exception for each instruction that is about to retire and is marked to invoke the first exception.
17. The method according to claim 12, characterized by further comprising:
repeatedly incrementing the ownership index as a binary count value, the total number of binary count values being at least the total number of storage elements in the ownership queue;
determining the most significant bit of the binary count value to be a wrap bit;
wherein the step of converting comprises including a corresponding wrap bit in each converted instruction, the wrap bit included in each converted instruction being identical to the wrap bit determined for the storage element of the ownership queue that stores the cache line from which that instruction was converted;
using the ownership index in an instruction to access a storage element of the ownership queue, and comparing the wrap bit in the instruction with the wrap bit of the accessed storage element;
when the wrap bits do not match, marking the instruction to invoke the first exception; and
wherein the step of performing the first exception comprises performing the first exception for each instruction marked to invoke the first exception.
18. The method according to claim 12, characterized by further comprising:
when an instruction is issued, using the ownership index in the instruction to access a storage element of the ownership queue, and setting the executing bit of the accessed storage element;
when a matching storage element is found while comparing a determined target address with the cache line address of each valid storage element of the ownership queue, determining whether the executing bit in the matching storage element is set;
and, when the executing bit of the matching storage element is set, marking the store instruction corresponding to the determined target address to invoke a second exception.
19. The method according to claim 18, characterized by further comprising: when a store instruction about to retire is marked to invoke the second exception, allowing the store instruction to retire and complete, flushing the processor, and resuming operation by fetching the next instruction after the store instruction in program order.
20. The method according to claim 12, characterized by further comprising:
in the step of converting, setting a straddle bit for each instruction derived from cache line data that straddles two of the plurality of cache lines;
when an instruction is issued, using the ownership index of the instruction to access a storage element of the ownership queue and setting the executing bit in the accessed storage element and, when the straddle bit of the instruction is set, also accessing the next consecutive storage element of the ownership queue and setting the executing bit of that next consecutive storage element;
when a matching storage element is found while comparing a determined target address with the cache line address of each valid storage element of the ownership queue, determining whether the executing bit of the matching storage element is set;
and, when the executing bit of the matching storage element is set, marking the store instruction corresponding to the determined target address to invoke a second exception.
21. The method according to claim 20, characterized by further comprising:
when a store instruction about to retire is marked to invoke the second exception, allowing the store instruction to retire and complete, flushing the processor, and resuming operation by fetching the next instruction after the store instruction in program order.
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201662324945P (US 62/324,945) | 2016-04-20 | 2016-04-20 | |
| US15/156,416 (US9798669B1) | 2016-04-20 | 2016-05-17 | System and method of determining memory ownership on cache line basis for detecting self-modifying code |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN106919367A | 2017-07-04 |
| CN106919367B | 2019-05-07 |
Family
ID=59460372
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201710138752.8A (CN106919367B, Active) | 2016-04-20 | 2017-03-09 | Processor and method for detecting self-correcting code |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN106919367B (en) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6009516A (en) * | 1996-10-21 | 1999-12-28 | Texas Instruments Incorporated | Pipelined microprocessor with efficient self-modifying code detection and handling |
| CN1521635A (en) * | 2003-01-14 | 2004-08-18 | 智权第一公司 | Device and method for solving deadlock extraction condition in branch target address cache |
| CN101894010A (en) * | 2009-08-24 | 2010-11-24 | 威盛电子股份有限公司 | Microprocessor and method of operation applicable to microprocessor |
| CN104615548A (en) * | 2010-03-29 | 2015-05-13 | 威盛电子股份有限公司 | Data prefetching method and microprocessor |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8656121B2 (en) * | 2011-05-17 | 2014-02-18 | International Business Machines Corporation | Facilitating data coherency using in-memory tag bits and tag test instructions |
| CN106796506B (en) * | 2014-05-12 | 2019-09-27 | 英特尔公司 | Method and apparatus for providing hardware support to self-modifying code |
Also Published As
| Publication number | Publication date |
|---|---|
| CN106919367A (en) | 2017-07-04 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP3542020B2 (en) | | Processor device and processor control method for executing instruction cache processing for instruction fetch alignment over multiple predictive branch instructions |
| TWI289786B (en) | | System and method of using speculative operand sources in order to speculatively bypass load-store operations |
| US11513801B2 (en) | | Controlling accesses to a branch prediction unit for sequences of fetch groups |
| US4763245A (en) | | Branch prediction mechanism in which a branch history table is updated using an operand sensitive branch table |
| TWI551986B (en) | | Computer program product, method, and system for controlling operation of a run-time instrumentation facility from a lesser-privileged state |
| TWI529616B (en) | | Method, system, and microprocessor for translating instructions |
| US7962730B2 (en) | | Replaying memory operation assigned a load/store buffer entry occupied by store operation processed beyond exception reporting stage and retired from scheduler |
| US6442707B1 (en) | | Alternate fault handler |
| TWI338218B (en) | | Method and apparatus for prefetching data from a data structure |
| US8190825B2 (en) | | Arithmetic processing apparatus and method of controlling the same |
| TWI483186B (en) | | Microprocessor and method for using an instruction loop cache thereof |
| US6883086B2 (en) | | Repair of mis-predicted load values |
| CN106406822B (en) | | Processor with improved alias queue and store conflict detection |
| US7958336B2 (en) | | System and method for reservation station load dependency matrix |
| US20120290780A1 (en) | | Multithreaded Operation of A Microprocessor Cache |
| US9304777B1 (en) | | Method and apparatus for determining relative ages of entries in a queue |
| CN110515659A (en) | | Atomic instruction execution method and device |
| CN114924797B (en) | | Method for prefetching instructions, information processing device, equipment and storage medium |
| CN106919367B (en) | | Processor and method for detecting self-correcting code |
| CN106933538B (en) | | Processor and method for detecting self-correcting code |
| CN106933537B (en) | | Detect the processor and method of modification program code |
| CN106933539B (en) | | Processor and method for detecting self-correcting code |
| TWI242744B (en) | | Apparatus, pipeline microprocessor and method for avoiding deadlock condition and storage media with a program for avoiding deadlock condition |
| TWI283827B (en) | | Apparatus and method for efficiently updating branch target address cache |
| CN109799897A (en) | | A kind of control method and device reducing GPU L2 cache energy consumption |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |
| | CP03 | Change of name, title or address | Address after: Room 301, 2537 Jinke Road, Zhangjiang High Tech Park, Pudong New Area, Shanghai 201203. Patentee after: Shanghai Zhaoxin Semiconductor Co.,Ltd. Address before: Room 301, 2537 Jinke Road, Zhangjiang High Tech Park, Pudong New Area, Shanghai 201203. Patentee before: VIA ALLIANCE SEMICONDUCTOR Co.,Ltd. |