JPH0454575A

JPH0454575A - Parallel computer

Info

Publication number: JPH0454575A
Application number: JP2163294A
Authority: JP
Inventors: Motohiko Matsuda; 松田　元彦; Taichi Yuasa; 太一湯浅
Original assignee: Sumitomo Metal Industries Ltd
Current assignee: Nippon Steel Corp
Priority date: 1990-06-21
Filing date: 1990-06-21
Publication date: 1992-02-21

Abstract

PURPOSE: To increase the number of memories that can be referenced directly, reduce the number of processing steps, and shorten the processing time by separating the memories from the arithmetic parts and placing them between the arithmetic parts so that they can be referenced directly. CONSTITUTION: When the arithmetic means P execute a single instruction, each of the plural arithmetic means P references one of the plural memory means M, takes data out of it, processes the data in parallel, and stores the result in one of the memory means M. For example, to obtain the 4-neighborhood sum on a grid in image processing, the memory means M are arranged in a grid, the arithmetic means P are placed at the intersections of the diagonals of the grid, and the image data are stored in the memory means M in correspondence with the pixels. The neighboring arithmetic means P reference the data stored in the four neighboring memory means M and add them to obtain the sum. The processing thus requires only direct references to plural memory means M, which increases the processing speed.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of Industrial Application]

The present invention relates to a single-instruction parallel computer that processes a single instruction in parallel, and in particular to a parallel computer in which the arithmetic units directly reference a plurality of memories.

[Conventional technology]

Single-instruction parallel computers are used in equipment, such as image processing devices, that processes large amounts of data by repeating simple operations; parallel processing makes high-speed operations on large amounts of data possible.

FIG. 4 is a schematic block diagram showing the general configuration of a single-instruction parallel computer, in which a plurality of arithmetic units P, P... and memories M, M... are connected in parallel to a control device 20. The control device 20 generates instructions and addresses; it gives the same instruction to every arithmetic unit P and gives the memories M the same address to be used in executing that instruction.

An example of the connection structure between the arithmetic units P and the memories M of a conventional single-instruction parallel computer is shown in FIG. 5: the arithmetic units P, P... are arranged in a grid, and each arithmetic unit P has its own dedicated memory M. As shown in FIG. 6, each arithmetic unit P consists of an arithmetic element 10 that executes instructions and a communicator 11 that controls communication with the other arithmetic units P.

[Problem to be solved by the invention]

In a parallel computer, speeding up processing is an important issue, and doing so requires reducing the number of data processing steps. In the conventional single-instruction parallel computer configured as described above, each arithmetic unit is limited to referencing only its own dedicated memory, so it cannot directly reference the memory of another arithmetic unit; when it needs to, it references that memory indirectly through communication.

That is, it communicates with the arithmetic unit that owns the memory and with any arithmetic units lying in between, and references the memory through them. For example, to compute the sum over the 4-neighborhood on a grid, an operation often used in image processing, each arithmetic unit communicates with the four arithmetic units corresponding to the four neighboring pixels and obtains the four neighboring values stored in their memories. Referencing therefore requires many processing steps, which lengthens the processing time.

The present invention has been made in view of these circumstances. Its object is to provide a single-instruction parallel computer in which the memories are separated from the arithmetic units and placed between a plurality of arithmetic units so that they can be referenced directly, thereby increasing the number of directly referenceable memories, reducing the number of processing steps, and shortening the processing time.

[Means to solve the problem]

A parallel computer according to the present invention is a parallel computer that processes a single instruction in parallel, comprising a plurality of arithmetic means for executing the single instruction and a plurality of memory means directly referenced by each arithmetic means, wherein each memory means is arranged to be directly referenced by a plurality of arithmetic means.

[Operation]

In the present invention, when the arithmetic means execute a single instruction, each of the plurality of arithmetic means references one of the plurality of memory means, takes data out of it, processes the data in parallel, and stores the result in one of the memory means. For example, to obtain the 4-neighborhood sum on a grid in image processing, the memory means are arranged in a grid, the arithmetic means are placed at the intersections of the diagonals of the grid, and the image data are stored in the memory means in correspondence with the pixels; the neighboring arithmetic means then reference in turn the data stored in the four neighboring memory means and add them to obtain the sum. Because the processing needs only direct references to a plurality of memory means, it can be made faster.

[Embodiment]

The present invention is described in detail below with reference to the drawings showing an embodiment thereof.

FIG. 1 is a block diagram showing the configuration of a parallel computer according to the present invention, and FIG. 2 is a block diagram showing the configuration of an arithmetic unit. In the figures, P is an arithmetic unit serving as the arithmetic means. As shown in FIG. 2, the arithmetic unit P comprises an arithmetic element 10 that executes a single instruction given by a single control device (not shown), a communicator 11 for communicating with other arithmetic units P, a selector 13 that selects the memory to be referenced, and a 2-bit direction flag register 12 that outputs the selection signal instructing the selector 13. Four bidirectional communication paths are connected to each arithmetic unit P, and the selector 13 switches among them. The arithmetic units P, P... are arranged in a grid, and memories M, M... are placed at the intersections of the diagonals of the grid. Four bidirectional communication paths are likewise connected to each memory M, and each arithmetic unit P and each memory M is connected to its four neighbors along the diagonal directions.

Normally, the selector 13 selects a communication path according to a signal from the single control device, so that all arithmetic units P, P... reference the neighboring memory M in the same direction. When the arithmetic units P are instead to reference memories M in different directions, the arithmetic element 10 sets in the direction flag register 12 a value for the diagonal direction of the memory M to be selected (for example, "00" for upper right, "01" for lower right, "10" for upper left, and "11" for lower left), and the selector selects the communication path based on that value.
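The selection logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 2-bit codes are the example values given in the text, while the coordinate convention (a unit at cell (r, c) sits between memories (r, c), (r, c+1), (r+1, c) and (r+1, c+1)) and the names `DIRECTION_OFFSETS` and `select_memory` are hypothetical and do not come from the patent.

```python
# Hypothetical table: 2-bit direction flag -> (row, col) offset of the selected
# memory, measured from the unit's upper-left memory. The bit patterns follow
# the example encoding in the text; the offsets are an assumed coordinate scheme.
DIRECTION_OFFSETS = {
    0b00: (0, 1),  # upper right
    0b01: (1, 1),  # lower right
    0b10: (0, 0),  # upper left
    0b11: (1, 0),  # lower left
}


def select_memory(unit_row, unit_col, controller_direction, local_flag=None):
    """Return the (row, col) of the memory the selector connects to.

    With local_flag left as None the unit follows the direction broadcast by
    the single control device; otherwise the value held in the unit's own
    direction flag register takes over.
    """
    direction = controller_direction if local_flag is None else local_flag
    d_row, d_col = DIRECTION_OFFSETS[direction]
    return unit_row + d_row, unit_col + d_col
```

Calling `select_memory(2, 3, 0b00)` models the normal case in which every unit looks to its upper right, while `select_memory(2, 3, 0b00, local_flag=0b11)` models a unit that overrides the broadcast direction and looks to its lower left.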

Next, the operation of the parallel computer of the present invention is explained, taking as an example the computation of the 4-neighborhood sum on a grid. FIG. 3 is a diagram explaining the procedure for computing the sum; the 4-neighborhood sum is obtained in three steps, S1, S2, and S3.

In step S1, each arithmetic unit P adds the value V1 (or V0) stored in its upper-right memory M to the value V2 (or V3) stored in its lower-left memory M, and stores the result V1+V2 (or V0+V3) in its lower-left memory M. In the next step S2, the arithmetic unit P lying between them adds V1+V2 and V0+V3, which are stored in the upper-left memory M and the lower-right memory M, and stores the result (V1+V2+V0+V3) in the lower-right memory M.

This is a 4-neighborhood sum, but it is the sum for the memory exactly one position above. In step S3, therefore, the arithmetic unit P reads the value of its lower-left memory M and stores it in its upper-left memory M, transferring the sum to the memory M one position above.

Performing the above steps in all arithmetic units P, P... yields the 4-neighborhood sum on the grid.
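The three steps can be checked with a small software simulation. The following Python sketch is not the patented hardware. It assumes that all units read the old memory contents before any write lands (lockstep execution), that memories outside the image hold zero, and that the unit at cell (r, c) sits between memories (r, c), (r, c+1), (r+1, c) and (r+1, c+1). The function name `four_neighbor_sum` and the helper `lockstep` are hypothetical.

```python
def four_neighbor_sum(image):
    """Simulate steps S1-S3 and return the 4-neighborhood sum of every pixel."""
    rows, cols = len(image), len(image[0])
    # Memories M: the image plus one ring of zeros (boundary handling is an assumption).
    m = [[0] * (cols + 2) for _ in range(rows + 2)]
    for r in range(rows):
        for c in range(cols):
            m[r + 1][c + 1] = image[r][c]

    def lockstep(update):
        # Every unit reads the old memory contents; all writes then land together.
        old = [row[:] for row in m]
        for r in range(rows + 1):
            for c in range(cols + 1):
                (w_row, w_col), value = update(old, r, c)
                m[w_row][w_col] = value

    # Unit at cell (r, c): UL = m[r][c], UR = m[r][c+1], LL = m[r+1][c], LR = m[r+1][c+1].
    # S1: upper right + lower left, stored in lower left.
    lockstep(lambda old, r, c: ((r + 1, c), old[r][c + 1] + old[r + 1][c]))
    # S2: upper left + lower right, stored in lower right.
    lockstep(lambda old, r, c: ((r + 1, c + 1), old[r][c] + old[r + 1][c + 1]))
    # S3: copy lower left into upper left, moving each sum one memory up.
    lockstep(lambda old, r, c: ((r, c), old[r + 1][c]))

    # Strip the zero ring so the result lines up with the input pixels.
    return [row[1:cols + 1] for row in m[1:rows + 1]]
```

For the 3x3 image `[[1, 2, 3], [4, 5, 6], [7, 8, 9]]` the center entry of the result is 2 + 4 + 6 + 8 = 20, reached after exactly three lockstep passes and without any unit-to-unit communication.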

When the single-instruction parallel computer of the present invention operates on neighboring memories directly connected to an arithmetic unit, the arithmetic unit references the data in those memories directly, without using the communication function. Conventionally, referencing a memory other than an arithmetic unit's own local memory required the neighboring arithmetic unit to reference that memory and exchange the data through the communication function; the present invention does not need that step. As described above, the parallel computer of the present invention obtains the 4-neighborhood sum on a grid in three steps, whereas the conventional computer needs four steps if the communication of an operand and the operation on it are counted together as one step.
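For comparison, the conventional scheme can be sketched in the same style. This Python sketch assumes one dedicated memory per unit, zero contribution from pixels outside the image, and the counting rule stated in the text (receiving one operand over the communication link and adding it counts as one step). The function name `conventional_four_neighbor_sum` is hypothetical.

```python
def conventional_four_neighbor_sum(image):
    """Model the conventional scheme: one communicate-and-add step per neighbor."""
    rows, cols = len(image), len(image[0])
    total = [[0] * cols for _ in range(rows)]
    steps = 0
    # Up, down, left, right: the four neighbors each unit must hear from.
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        # One step: every unit receives one neighbor's value and adds it.
        for r in range(rows):
            for c in range(cols):
                n_row, n_col = r + d_row, c + d_col
                if 0 <= n_row < rows and 0 <= n_col < cols:
                    total[r][c] += image[n_row][n_col]
        steps += 1
    return total, steps
```

Under this counting the function always reports four steps, one per neighbor, against the three steps S1 to S3 of the shared-memory arrangement.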

Operations on neighboring data are usually frequent, so being able to reference neighboring memories directly, as in the parallel computer of the present invention, speeds up processing. A neighborhood operation works on several operands; since those operands reside in memories, placing the arithmetic unit between those memories is a rational configuration for the operation.

Furthermore, the complexity of the data lines used in this configuration is about the same as that of the data lines used in a neighbor-to-neighbor communication network, and neighboring arithmetic units can communicate with each other through the memory lying between them, so the neighbor-to-neighbor communication network can be omitted.

In addition, the selector can select a communication path not only by the selection signal from the single control device but also by a value set in the direction flag register inside the arithmetic unit. Although the arithmetic units cannot independently reference different addresses, each can independently reference a relatively different memory, which has much the same effect as referencing different addresses.

In this embodiment the memories and arithmetic units are arranged in a grid, but the arrangement of the present invention is not limited to this; needless to say, any planar or three-dimensional arrangement, such as an n-ary tree, a butterfly network, or a hypercube, can be used.

[Effects of the Invention]

As described above, in the present invention a plurality of memory means that can be referenced directly are provided for a plurality of arithmetic means, and each memory means can be referenced directly by a plurality of arithmetic means. The invention therefore has excellent effects: for example, single-instruction operations on multiple operands can be processed at high speed, omitting the steps required for communication, without complicating the hardware configuration.

[Brief explanation of the drawing]

FIG. 1 is a schematic block diagram showing the configuration of a parallel computer according to the present invention; FIG. 2 is a block diagram showing the configuration of an arithmetic unit; FIG. 3 is a diagram explaining the procedure for computing the 4-neighborhood sum on a grid; FIG. 4 is a schematic block diagram showing the general configuration of a single-instruction parallel computer; FIG. 5 is a block diagram showing the configuration of a conventional single-instruction parallel computer; and FIG. 6 is a block diagram showing the configuration of a conventional arithmetic unit.

P: arithmetic unit; M: memory

Patent applicant: Sumitomo Metal Industries, Ltd. Agent: Patent Attorney 河野 登夫

Claims

1. A parallel computer that processes a single instruction in parallel, comprising: a plurality of arithmetic means for executing the single instruction; and a plurality of memory means directly referenced by each arithmetic means, wherein each memory means is arranged to be directly referenced by a plurality of arithmetic means.