JPH10301851A

JPH10301851A - Method and system for speculatively supplying cache memory data inside data processing system

Info

Publication number: JPH10301851A
Application number: JP10096007A
Authority: JP
Inventors: Ravi Kumar Arimilli (ラヴィ・カマー・アライミリ); John Steven Dodson (ジョン・スティーブン・ドッドソン); Jerry Don Lewis (ジェリー・ドン・リュイス)
Original assignee: International Business Machines Corp
Current assignee: International Business Machines Corp
Priority date: 1997-04-14
Filing date: 1998-04-08
Publication date: 1998-11-13
Also published as: KR19980079625A; CA2231361A1; CN1197956A; CN1110755C; TW386192B; KR100277446B1; SG68034A1

Abstract

PROBLEM TO BE SOLVED: To provide an improved method and system for sharing cache memory data, in which requested data is read from a cache memory in a processing unit before the composite response from all processing units in the data processing system returns to that processing unit. SOLUTION: The data processing system includes at least one CPU 11a-11n, each with a primary cache 12a-12n and a secondary cache 13a-13n, and at least one intelligent I/O device 16a-16n. In response to a request for data by an intelligent I/O device 16a-16n, the CPU 11a-11n that holds the requested data issues an intervention response. The requested data is then read from the secondary cache 13a-13n of that CPU before the composite response from all CPUs 11a-11n in the data processing system returns to it.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of the Invention] The present invention relates generally to a method and system for sharing cache memory data, in particular to a method and system for sharing cache memory data between a processing unit and an I/O device in a data processing system, and more particularly to a method and system for speculatively supplying cache memory data from a processing unit in a data processing system to an intelligent I/O device.

[0002]

[Description of the Related Art] A data processing system includes at least one processing unit, a system memory, and various I/O devices. A processing unit may include a plurality of registers and execution units for executing program instructions. A processing unit also generally has a primary cache (also called a level-one cache or L1 cache), such as an instruction cache and a data cache, implemented with high-speed memory. In addition, the processing unit may include a secondary cache (also called a level-two cache or L2 cache) to support the primary cache.

[0003] Transferring data from one processing unit to another processing unit or I/O device on the system bus without going through system memory is usually called intervention. An intervention protocol improves system performance by reducing the number of times system memory must be accessed to satisfy a read request or a read-with-intent-to-modify (RWITM) request from a processing unit or I/O device in the system.

[0004] In general, when a read/RWITM request from an I/O device is pending, another processing unit that is connected to the system bus and holds the requested data in its cache can supply the data to the requesting I/O device. Under a conventional intervention protocol, the processing unit that has the data in its cache waits for a "composite" response from all processing units in the system before it issues a data bus request to supply the data from its cache.

[0005] At the same time, conventional intervention protocols accommodate a "retry" mechanism: any read/RWITM request that could be satisfied by intervention can be preempted by a "retry" from any processing unit on the system bus. If one processing unit responds with an intervention and another responds with a "retry", the retry response automatically overrides the intervention response. As a result, while a retry from a processing unit on the system bus is pending, the processing unit that holds the data does not issue a data bus request.

[0006] It is desirable to provide an improved supply mechanism in which intervention data is supplied to the requesting I/O device in a manner that is less affected by a "retry" from a processing unit in the data processing system.

[0007]

[Problems to Be Solved by the Invention] An object of the present invention is to provide an improved method and system for sharing cache memory data.

[0008] Another object of the present invention is to provide an improved method and system for sharing cache memory data between a processing unit and an I/O device in a data processing system.

[0009] Another object of the present invention is to provide an improved method and system for speculatively supplying cache memory data from a processing unit in a data processing system to an intelligent I/O device.

[0010]

[Means for Solving the Problems] According to the method and system of the present invention, a data processing system includes at least one processing unit, each having at least one cache memory, and at least one intelligent I/O device. In response to a request for data by an intelligent I/O device in the data processing system, the processing unit in the data processing system that holds the requested data issues an intervention response. The requested data is then read from the cache memory in that processing unit before the composite response from all processing units in the data processing system returns to the processing unit.

[0011]

[Description of the Preferred Embodiments] The present invention can be implemented in a data processing system having at least one cache memory. It will also be understood that the invention is applicable to various multiprocessor data processing systems in which each processor has a primary cache and a secondary cache.

[0012] With reference to the figures, and in particular to FIG. 1, a block diagram of a data processing system 10 to which the present invention can be applied is shown. Data processing system 10 includes a plurality of central processing units (CPUs) 11a to 11n, each of which includes a primary (L1) cache. As shown, CPU 11a includes primary cache 12a and CPU 11n includes primary cache 12n. Each of primary caches 12a to 12n may be a sectored cache.

[0013] CPUs 11a to 11n are connected to secondary (L2) caches 13a to 13n, respectively. Each of secondary caches 13a to 13n may also be a sectored cache. CPUs 11a to 11n, primary caches 12a to 12n, and secondary caches 13a to 13n are connected to one another and to system memory 14 through interconnect 15. Interconnect 15 is a bus, a switch, or the like. Intelligent I/O devices 16a to 16n are also connected to interconnect 15. These intelligent I/O devices 16a to 16n are able to initiate data transfers to and from system memory 14. Intelligent I/O devices 16a to 16n may include various adapters used to communicate with other data processing systems over a network such as an intranet or the Internet.

[0014] In the preferred embodiment of the present invention, a CPU together with its primary cache and secondary cache, for example CPU 11a, primary cache 12a, and secondary cache 13a shown in FIG. 1, may be referred to collectively as a processing unit. Although FIG. 1 shows a preferred embodiment of the data processing system, it will be understood that the invention can be practiced in a variety of system configurations. For example, each of CPUs 11a to 11n may have three or more levels of cache memory.

[0015] Table 1 shows the coherency responses defined for a processing unit under a prior-art intervention protocol. After an I/O device in a multiprocessor data processing system places a read request or a read-with-intent-to-modify (RWITM) request on the system bus, each processing unit in the system snoops the request and may then issue one of the responses listed in Table 1.

[Table 1]

[0016] As shown in Table 1, a coherency response takes the form of a 3-bit snoop response signal, and each coherency response is defined as described here. These signals are encoded to indicate the snoop result after the address tenure. A priority value is also associated with each response, so that system logic can decide which coherency response takes precedence when it formulates the single snoop response signal returned to all processing units and all I/O devices on the system bus. For example, if one processing unit responds with a shared intervention response (priority 3) and another processing unit responds with a retry response (priority 1), the retry response takes precedence, and system logic returns the retry coherency response to the requesting processing unit and to all other processing units connected to the system bus. This system logic can reside in various components of the system, such as a system controller or a memory controller.
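The priority rule described in this paragraph can be illustrated with a minimal Python sketch. Only the priorities of the retry response (1) and the shared intervention response (3) are stated in the text; the priorities given here to the modified intervention and null responses, and all names in the code, are assumptions made for illustration and are not taken from Table 1.

```python
from enum import Enum


class Response(Enum):
    # (label, priority); a lower number takes precedence.
    # RETRY = 1 and SHARED_INTERVENTION = 3 come from the text.
    # The other two priorities are placeholders assumed for this sketch.
    RETRY = ("retry", 1)
    MODIFIED_INTERVENTION = ("modified intervention", 2)  # assumed priority
    SHARED_INTERVENTION = ("shared intervention", 3)
    NULL = ("null", 4)  # assumed priority

    @property
    def priority(self):
        return self.value[1]


def composite_response(responses):
    """Return the single response that system logic would broadcast.

    The response with the numerically lowest priority value wins.
    """
    return min(responses, key=lambda r: r.priority)


# One unit answers shared intervention, another answers retry:
# the retry overrides the intervention.
winner = composite_response(
    [Response.SHARED_INTERVENTION, Response.RETRY, Response.NULL]
)
print(winner.value[0])
```

With a different priority table only the enum values change; the selection logic stays the same.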

[0017] Several well-known mechanisms can be used to determine which cache (of which processing unit) is the "owner" of the requested data and is therefore entitled to supply it. Under the conventional MESI protocol, if a cache holds the requested data in the "modified" or "exclusive" state, that cache is the only cache in the system containing a valid copy of the data and is clearly the owner. If, however, a cache holds the requested data in the "shared" state, the data must also be held in at least one other cache in the system. Potentially, therefore, any of those two or more caches could supply the data. In that case, several options are available for deciding which cache becomes the source.

[0018] Referring to FIG. 2, a block diagram of a representative data processing system is shown to illustrate a prior-art supply mechanism. Suppose, for example, that intelligent I/O device 24 places a read request or an RWITM request on system bus 23 and that the L2 cache of processing unit 21 holds the data requested by I/O device 24. Suppose further that the L2 cache in processing unit 20 is in the "invalid" state, the L2 cache in processing unit 21 is in the "modified" state, and the L2 cache in processing unit 22 does not contain the requested data. The sequence of operations that follows is handled by the L2 cache controller of each processing unit in order to carry out the intervention as described in the prior art.

[0019] The read/RWITM request issued by I/O device 24 is "snooped" from system bus 23 by processing units 20, 21, and 22. Each of processing units 20 to 22 performs an L2 cache directory lookup to determine whether the requested data is present in its L2 cache. Because processing unit 21 has the requested data, it issues an intervention response, and a finite state machine within processing unit 21 is dispatched to control the subsequent processing. If the data in the L2 cache of processing unit 21 is in the "modified" state, processing unit 21 issues a modified intervention coherency response. If the data in the L2 cache of processing unit 21 is in the "shared" or "exclusive" state, processing unit 21 issues a shared intervention coherency response. Because the L2 cache in processing unit 20 is in the "invalid" state and the L2 cache in processing unit 22 does not contain the requested data, processing units 20 and 22 each send a null coherency response.
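As a rough illustration of how each processing unit in the FIG. 2 example picks its coherency response from the state of its own L2 cache line, consider the following Python sketch. The function name, the state strings, and the use of `None` for "line not present" are assumptions made for this example only.

```python
def snoop_response(l2_state):
    """Choose a coherency response from the snooping unit's L2 line state.

    l2_state is one of "modified", "exclusive", "shared", "invalid",
    or None when the L2 directory lookup finds no entry for the line.
    """
    if l2_state == "modified":
        return "modified intervention"
    if l2_state in ("shared", "exclusive"):
        return "shared intervention"
    # Invalid or absent: the unit has nothing to supply.
    return "null"


# The FIG. 2 scenario: unit 20 invalid, unit 21 modified, unit 22 absent.
fig2_states = {20: "invalid", 21: "modified", 22: None}
responses = {unit: snoop_response(state) for unit, state in fig2_states.items()}
print(responses)
```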

[0020] After issuing the intervention response, processing unit 21 waits for the composite response. In this example, the composite response is essentially formed from the coherency response of processing unit 21 itself together with the coherency responses from processing units 20 and 22 and I/O device 24. If the composite response that returns is a modified intervention coherency response, processing unit 21 can begin supplying the requested data from its L2 cache. If processing unit 20 or processing unit 22 requests a retry for any reason, then under the established intervention protocol the retry takes precedence (that is, the supply sequence does not proceed). For example, processing unit 22 might issue a retry because its snoop queue is busy.

[0021] If the data in the L2 cache of processing unit 21 has not been modified since the snoop operation began, or is not present in the L1 cache (that is, the caches are not L1-inclusive), processing unit 21 can begin a system bus request to the system bus arbiter (normally, the requested data must be read into a buffer by the L2 cache controller before the system bus request can be started). Otherwise, the L1 cache of processing unit 21 must be flushed and invalidated before the system bus request is issued (that is, the modified data in the L1 cache is written back to the L2 cache and the copy in the L1 cache is invalidated). If, however, the L1 cache of processing unit 21 is in the "shared" state, only invalidation of the L1 cache is required before the data bus request is issued.

[0022] Processing unit 21 then waits for the system bus to be granted. The actual supply of data to I/O device 24 begins after use of the data bus has been granted. When the supply is complete, the L2 cache of processing unit 21 changes from the "modified" state to the "shared" state for a read request, or to the "invalid" state for an RWITM request. The states of the L2 caches of processing units 20 and 22 do not change.
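The state change at the end of the supply sequence can be written as a small lookup. The following Python sketch covers only the case described in the text, where the supplying L2 line starts in the "modified" state; the function name and request strings are hypothetical.

```python
def l2_state_after_supply(request_type):
    """Next state of the supplying L2 line, which started as "modified".

    request_type is "read" or "rwitm" (read-with-intent-to-modify).
    """
    if request_type == "read":
        # The requester only reads, so the supplier keeps a shared copy.
        return "shared"
    if request_type == "rwitm":
        # The requester intends to modify, so the supplier gives up its copy.
        return "invalid"
    raise ValueError(f"unknown request type: {request_type!r}")


print(l2_state_after_supply("read"), l2_state_after_supply("rwitm"))
```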

[0023] Referring to FIG. 3, a high-level logic flowchart is shown for speculatively supplying cache memory data from a processing unit in a data processing system to an I/O device in accordance with a preferred embodiment of the present invention. Starting at block 30, a read/RWITM request is snooped from the system bus by all processing units in the system (block 31). Each processing unit performs an L2 cache directory lookup to determine whether the requested data is present in its L2 cache (block 32). Every processing unit that does not hold the requested data (such as processing units 20 and 22 of FIG. 2) issues a null coherency response (block 33), and for those units the process ends at block 99. The processing unit that holds the requested data (such as processing unit 21 of FIG. 2) issues an intervention coherency response (block 34).

[0024] After issuing the intervention coherency response, the intervening processing unit must perform certain cache management tasks (block 35). These tasks include flushing and invalidating the copy of the data in the L1 cache of the intervening processing unit if that copy has been modified, or simply invalidating the copy in the L1 cache if it has not been modified.
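The L1 housekeeping of block 35 can be sketched as a function that returns the ordered list of actions for a given L1 line. This is a hedged illustration: the action names are invented for the example, and the case where the L1 cache holds no copy at all (no action needed) is an assumption, since the text does not address it.

```python
def l1_management_actions(l1_has_copy, l1_modified):
    """List the L1 actions the intervening unit performs in block 35.

    l1_has_copy: whether the L1 cache holds a copy of the requested line
    l1_modified: whether that L1 copy has been modified
    """
    if not l1_has_copy:
        # Assumed: with no L1 copy there is nothing to clean up.
        return []
    if l1_modified:
        # Write the modified data back to L2, then drop the L1 copy.
        return ["flush_to_l2", "invalidate_l1"]
    # An unmodified copy only needs to be invalidated.
    return ["invalidate_l1"]


print(l1_management_actions(True, True))
print(l1_management_actions(True, False))
```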

[0025] The requested data is then read from the L2 cache of the intervening processing unit, preferably into a buffer, and a request for the system data bus is made to the system bus arbiter (block 36). A determination is made as to whether use of the system data bus has been granted (block 37). If it has not been granted, a determination is made as to whether the composite coherency response has returned (block 38). If the composite coherency response has not returned, the process goes back to block 37.

[0026] If, however, use of the system data bus has been granted, the intervening processing unit can begin supplying the requested data by driving it onto the system data bus (block 39). A determination is also made as to whether the composite coherency response has already returned by this point (block 40). If the composite coherency response has not yet returned, the process continues to wait for it while the requested data continues to be driven onto the system bus.

[0027] After the composite coherency response returns, a determination is made as to whether it is a "retry" (block 41). If the composite coherency response is a retry, the system data bus request is cancelled if use of the system data bus has not yet been granted, or the supply of the requested data is aborted immediately if it is in progress (block 42). Even if the supply has already completed by this point, its result is discarded because of the retry coherency response. If the composite coherency response is not a retry, the supply of the requested data continues until it is complete, if it has not already completed. Finally, the status of the L2 cache of the intervening processing unit is updated (block 43), and the process ends at block 99.
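The outcomes of blocks 36 to 43 of FIG. 3 can be summarized with a small timing model. The Python sketch below is only an illustration under stated assumptions: one beat of data is driven per bus cycle once the data bus is granted, `beats` (the number of cycles needed to drive the whole line) is a made-up parameter, a grant that arrives in the same cycle as the composite response is treated as not yet used, and all names are hypothetical.

```python
def speculative_supply(grant_cycle, response_cycle, is_retry, beats):
    """Model the intervening unit's behavior around the composite response.

    grant_cycle:    cycle in which the system data bus is granted
    response_cycle: cycle in which the composite response returns
    is_retry:       whether the composite response is a retry
    beats:          bus cycles needed to drive the whole line (assumed)
    Returns (outcome, beats_driven).
    """
    # Beats driven speculatively before the composite response arrives
    # (blocks 37 to 40): one per cycle from the grant onward.
    driven = max(0, min(beats, response_cycle - grant_cycle))

    if is_retry:  # blocks 41 and 42
        if driven == 0:
            return ("bus request cancelled", 0)
        if driven < beats:
            return ("supply aborted", driven)
        return ("supply completed but discarded", driven)

    # Not a retry: the supply runs to completion, then the L2 status
    # is updated (block 43).
    return ("supplied", beats)


print(speculative_supply(grant_cycle=2, response_cycle=4, is_retry=True, beats=4))
```

The non-retry case is where the speculation pays off: data that was driven before the response returned does not have to be sent again.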

[0028] As described above, the present invention provides a method for speculatively supplying cache memory data from a processing unit in a data processing system to an intelligent I/O device. Specifically, the present disclosure describes a novel form of intervention in which the requested data is read from the L2 cache of the intervening processing unit before the composite coherency response returns.

[0029] The present invention has a clear performance advantage over the prior art, because the delay between a read/RWITM request on the system bus and the sampling of the composite response amounts to several system bus clock cycles. By allowing the requested data to be read from the L2 cache of the intervening processing unit before the composite coherency response is received, the intervention latency is greatly reduced and overall system performance is significantly improved.

[0030] In summary, the following matters are disclosed with respect to the configuration of the present invention.

[0031]

(1) A method for speculatively supplying cache memory data to an I/O device from a processing unit that includes at least one cache memory in a data processing system, the method comprising the steps of: issuing an intervention response from the processing unit that holds the requested data, in response to a request for data by the I/O device; and reading the requested data from the cache memory in the processing unit before the composite response from all processing units in the data processing system returns to the processing unit.

(2) The method according to (1), wherein the reading step includes the step of reading the requested data from the cache memory in the processing unit into a buffer.

(3) The method according to (1), wherein the request includes a read request or a read-with-intent-to-modify request.

(4) The method according to (1), including the step of stopping the reading step if the returned composite response is a retry.

(5) The method according to (1), including the step of requesting the system bus, before the composite response returns, so that the requested data can be supplied by the processing unit.

(6) The method according to (5), including the step of supplying the requested data from the processing unit before the composite response returns.

(7) A processing unit having a cache memory and capable of speculatively supplying data to an I/O device in a data processing system, the processing unit comprising: means for issuing an intervention response from the processing unit that holds the requested data in the data processing system, in response to a request for data by the I/O device; and means for reading the requested data from the cache memory in the processing unit before the composite response from all processing units returns to the processing unit.

(8) The processing unit according to (7), wherein the reading means includes means for reading the requested data from the cache memory in the processing unit into a buffer.

(9) The processing unit according to (7), wherein the request includes a read request or a read-with-intent-to-modify request.

(10) The processing unit according to (7), wherein the processing unit includes means for stopping the reading by the reading means if the returned composite response is a retry.

(11) The processing unit according to (7), wherein the processing unit includes means for requesting the system bus, before the composite response returns, so that the requested data can be supplied by the processing unit.

(12) The processing unit according to (11), wherein the processing unit includes means for supplying the requested data from the processing unit before the composite response returns.

[Brief description of the drawings]

FIG. 1 is a block diagram of a data processing system to which the present invention can be applied.

FIG. 2 is a block diagram of a representative data processing system illustrating a supply mechanism according to the prior art.

FIG. 3 is a high-level logic flowchart illustrating a method for speculatively supplying cache memory data from a processing unit in a data processing system to an I/O device in accordance with a preferred embodiment of the present invention.

[Explanation of symbols]

10 Data processing system
11a, 11b, 11c, 11d, 11e, 11f, 11g, 11h, 11i, 11j, 11k, 11l, 11m, 11n Central processing unit (CPU)
12a, 12b, 12c, 12d, 12e, 12f, 12g, 12h, 12i, 12j, 12k, 12l, 12m, 12n Primary cache
13a, 13b, 13c, 13d, 13e, 13f, 13g, 13h, 13i, 13j, 13k, 13l, 13m, 13n Secondary cache
14 System memory
15 Interconnect
16a, 16b, 16c, 16d, 16e, 16f, 16g, 16h, 16i, 16j, 16k, 16l, 16m, 16n, 24 Intelligent I/O device
20, 21, 22 Processing unit
23 System bus

Continuation of front page: (72) Inventor John Steven Dodson, 1205 Bell Rock Circle, Pflugerville, Texas 78660, United States; (72) Inventor Jerry Don Lewis, 3409 Arrowhead Circle, Round Rock, Texas 78681, United States

Claims

[Claims]

1. A method for speculatively supplying cache memory data to an I/O device from a processing unit that includes at least one cache memory in a data processing system, the method comprising the steps of: issuing an intervention response from the processing unit that holds the requested data, in response to a request for data by the I/O device; and reading the requested data from the cache memory in the processing unit before the composite response from all processing units in the data processing system returns to the processing unit.

2. The method of claim 1, wherein the reading step includes the step of reading the requested data from the cache memory in the processing unit into a buffer.

3. The method of claim 1, wherein the request includes a read request or a read-with-intent-to-modify request.

4. The method of claim 1, including the step of stopping the reading step if the returned composite response is a retry.

5. The method of claim 1, including the step of requesting the system bus, before the composite response returns, so that the requested data can be supplied by the processing unit.

6. The method of claim 5, including the step of supplying the requested data from the processing unit before the composite response returns.

7. A processing unit having a cache memory and capable of speculatively supplying data to an I/O device in a data processing system, the processing unit comprising: means for issuing an intervention response from the processing unit that holds the requested data in the data processing system, in response to a request for data by the I/O device; and means for reading the requested data from the cache memory in the processing unit before the composite response from all processing units returns to the processing unit.

8. The processing unit of claim 7, wherein the reading means includes means for reading the requested data from the cache memory in the processing unit into a buffer.

9. The processing unit of claim 7, wherein the request includes a read request or a read-with-intent-to-modify request.

10. The processing unit of claim 7, wherein the processing unit includes means for stopping the reading by the reading means if the returned composite response is a retry.

11. The processing unit of claim 7, wherein the processing unit includes means for requesting the system bus, before the composite response returns, so that the requested data can be supplied by the processing unit.

12. The processing unit of claim 11, wherein the processing unit includes means for supplying the requested data from the processing unit before the composite response returns.