JPS60263268A

JPS60263268A - Vector processor

Info

Publication number: JPS60263268A
Application number: JP12014284A
Authority: JP
Inventors: Makoto Suwada; 諏訪田　誠
Original assignee: NEC Corp; Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1984-06-12
Filing date: 1984-06-12
Publication date: 1985-12-26
Also published as: JPH0325822B2

Abstract

PURPOSE: To achieve high-speed compression conversion with simple control by providing a parallel vector register section having a plurality of operand vector registers, result vector registers, and mask data registers.

CONSTITUTION: A parallel vector register section 1 has four each of operand vector registers, result vector registers, and mask data registers. Four vector elements read out in parallel from the operand vector registers are supplied to an align circuit 2 through a read data bus 1000. The vector elements output by the align circuit are transferred in parallel through a write data bus 2000 and written in parallel to the result vector registers. In addition, four mask elements read out in parallel from the mask data registers are transferred in parallel to a compression conversion control circuit 3 through a mask data read bus 1300. Because the compression conversion of a vector is thus executed in parallel, high-speed processing and simplified control can be realized.

Description

DETAILED DESCRIPTION OF THE INVENTION

(Field of Industrial Application) The present invention relates to data transfer control in a vector processing device, and in particular to control of vector compression conversion.

(Prior Art) In a conventional vector processing device, when vector elements are read from an operand vector register for processing, the elements are read out sequentially, one at a time, and when they are written to a result vector register, they are likewise written sequentially, one at a time. In such a vector processing device, vector compression conversion is relatively easy to perform.

Next, compression conversion will be explained. FIG. 1 is an explanatory diagram of compression conversion. There are a mask data register MSK that can store k mask elements (k: a positive integer), an operand vector register OPR that can store up to k elements of one vector, one for each mask element, and a result vector register RSL that can likewise store up to k elements of one vector, one for each mask element. Assume that the mask data register MSK and the operand vector register OPR hold the elements shown in FIG. 1. Starting from this state, compression conversion consists of taking the elements of the operand vector register OPR whose positions correspond to mask elements holding '1' and storing them one after another in the result vector register RSL without disturbing their order.
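As an illustration of the definition above, the following is a minimal sequential sketch in Python. The function name `compress` and the string labels for the elements are assumptions made for this example; the mask pattern 10110100 and the elements A0, A1, ... follow the values the description gives for FIG. 1.

```python
def compress(operand, mask):
    """Keep operand[i] wherever mask[i] is 1, preserving the original order."""
    return [elem for elem, bit in zip(operand, mask) if bit == 1]


# Hypothetical labels standing in for the vector elements A0..A7.
operand = [f"A{i}" for i in range(8)]
mask = [1, 0, 1, 1, 0, 1, 0, 0]

result = compress(operand, mask)
print(result)  # ['A0', 'A2', 'A3', 'A5']
```

This is the one-element-at-a-time behaviour attributed to the prior art; the embodiment described later produces the same result while handling several elements per cycle.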

In the compression conversion described above, it is desirable to process a plurality of vector elements simultaneously in parallel in order to achieve high-speed processing. In that case, however, there has been the drawback that the control becomes complicated.

(Object of the Invention) An object of the present invention is to provide a vector processing device that eliminates the above drawback by performing vector compression conversion in parallel, and hence at high speed, using relatively simple control means, so that control can be carried out easily.

(Structure of the Invention) A vector processing device according to the present invention comprises operand vector register means, result vector register means, mask data register means, read data bus means, write data bus means, align circuit means, accumulation circuit means, encoder means, and decoder means, and is configured to perform vector compression conversion.

The operand vector register means is for reading out, in one cycle, a plurality of vector elements belonging to the same vector.

The result vector register means is for writing, in one cycle, the plurality of vector elements belonging to the same vector.

The mask data register means corresponds to the elements of the operand vector register means and the result vector register means, and is for reading out the plurality of mask elements in one cycle.

The read data bus means is for reading out the plurality of vector elements from the operand vector register means.

The write data bus means is for writing the plurality of vector elements into the result vector register means.

The align circuit means is for selectively connecting the read data bus means and the write data bus means.

The accumulation circuit means is for accumulating the number of '1's among the mask elements read out through the read data bus means.

The encoder means is for generating a write address increment signal for the result vector register means from the accumulated value obtained from the accumulation circuit means and the mask elements read out as described above.

The decoder means is for generating a connection control signal for the align circuit means from the mask elements read out as described above and the accumulated value.

(Embodiment) Next, the present invention will be described in detail with reference to the drawings.

FIG. 2 is a block diagram showing an embodiment of a vector processing device according to the present invention. In FIG. 2, the vector processing device consists of a parallel vector register section 1, an align circuit 2, a compression conversion control circuit 3, a read data bus 1000, a write data bus 2000, a mask data read bus 1300, a write address increment control signal line 3000, and an align circuit connection control signal line 3200.

FIG. 3 is a block diagram showing details of the parallel vector register section 1. In this embodiment, four identically configured vector register sections VE-0 to VE-3 are provided (in general n sections, where n = 1, 2, 3, ... is a positive integer; here n = 4). For example, the vector register section VE-0 consists of a mask data register MSK-0, an operand vector register OPR-0, and a result vector register RSL-0. In general, the vector register section VE-i (i = 0, 1, 2, 3) consists of a mask data register MSK-i, an operand vector register OPR-i, and a result vector register RSL-i. If registers that can be both read and written are used as the operand vector registers OPR-i and the result vector registers RSL-i, each result vector register RSL-i and the corresponding operand vector register OPR-i can be implemented as the same register.

The four vector elements read out in parallel from the operand vector registers OPR-0 to OPR-3 are supplied to the align circuit 2 via the read data bus 1000, which transfers four vector elements in parallel. The vector elements supplied from the align circuit 2 via the write data bus 2000, which likewise transfers four vector elements in parallel, can be written in parallel to the result vector registers RSL-0 to RSL-3. The four mask elements read out in parallel from the mask data registers MSK-0 to MSK-3 are transferred in parallel to the compression conversion control circuit 3 via the mask data read bus 1300.

FIG. 4 is a block diagram showing the align circuit 2 in detail. In FIG. 4, the align circuit 2 consists of four input ports 20-0 to 20-3 connected to the read data bus 1000, four output ports 21-0 to 21-3 connected to the write data bus 2000, and connection lines 22 for connecting the input and output ports. An align circuit connection control signal is supplied to the align circuit 2 from the compression conversion control circuit 3 via the signal line 3200.

This signal is supplied to the four output ports 21-0 to 21-3, one for each of the n = 4 elements, and it controls how the input and output ports are connected. For example, suppose that a control signal containing the information ω0 = ω1 = 0 ('0,0'), ω2 = 1 ('0,1'), and ω3 = 2 ('1,0'), explained later, is supplied via the signal line 3200. The ports are then connected as follows. The output port 21-0, which receives ω0, and the output port 21-1, which receives ω1, are both connected to the input port 20-0 because ω0 = ω1 = 0; the output port 21-2, which receives ω2, is connected to the input port 20-1 because ω2 = 1; and the output port 21-3, which receives ω3, is connected to the input port 20-2 because ω3 = 2.
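The port selection in this example can be sketched as a small software model. This is only an illustration under stated assumptions: the function name `align` and the labels `in0` to `in3` for the data on the input ports 20-0 to 20-3 are hypothetical, and the align circuit is modelled as a plain selector in which output port j takes its data from input port ω_j.

```python
def align(inputs, omega):
    """Model of the align circuit: output port j is driven by input port omega[j]."""
    return [inputs[src] for src in omega]


# Hypothetical data present on input ports 20-0..20-3.
inputs = ["in0", "in1", "in2", "in3"]

# Connection information from the example: w0 = w1 = 0, w2 = 1, w3 = 2.
omega = [0, 0, 1, 2]

outputs = align(inputs, omega)
codes = [format(w, "02b") for w in omega]  # two-bit form of each w value

print(outputs)  # ['in0', 'in0', 'in1', 'in2']
print(codes)    # ['00', '00', '01', '10']
```

Each ω value needs two bits because n = 4 input ports have to be distinguished.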

FIG. 5 is a block diagram showing details of the compression conversion control circuit 3. In FIG. 5, the compression conversion control circuit 3 consists of a 4-bit mask register 31 connected to the mask data read bus 1300, an accumulation circuit 32 made up of an adder 321 and a register 322, an encoder 33, a decoder 341, and a shifter 342. The encoder 33 receives the output of the mask register 31 and the accumulated value X of the accumulation circuit 32, generates a write address increment control signal on the signal line 3000, and supplies it to the parallel vector register section 1 to control the incrementing of the write address of each of the result vector registers RSL-0 to RSL-3. The mask data is input to the decoder 341, and the output of the decoder 341 is sent out on the align circuit connection control signal line 3200 as the information ω0 to ω3, two bits each, and supplied to the align circuit 2.

Next, the operation of this embodiment will be explained in detail.

First, assume that the registers of the parallel vector register section 1 are initialized as follows.

Specific mask data values are set in the mask data registers MSK-0 to MSK-3. The order in which they are set is determined, for example, as follows. When the mask data is arranged as 10110100... as shown in FIG. 1, the first mask element '1' is set at the first address of the mask data register MSK-0, the next mask element '0' is set at the first address of the mask data register MSK-1, and so on, until the fourth mask element '1' is set at the first address of the mask data register MSK-3. Thus, as shown in FIG. 3, the mask data '1011' is set at the first addresses of the mask data registers MSK-0 to MSK-3 of the parallel vector register section 1, the mask data '0100' is set at the next addresses, and the remaining mask data is set in the mask data registers MSK-0 to MSK-3 in the same way.

Next, suppose that the elements of the operand vector are A0, A1, A2, ... as shown in FIG. 1. In this case, as shown in FIG. 3, the vector element A0 is set at the first address of the operand vector register OPR-0, the next vector element A1 is set at the first address of the operand vector register OPR-1, and so on, until the vector element A3 is set at the first address of the operand vector register OPR-3. Thus the vector elements A0, A1, A2, A3 are set at the first addresses of the operand vector registers OPR-0 to OPR-3 of the parallel vector register section 1, respectively. Similarly, the vector elements A4, A5, A6, A7 are set at the next addresses of the operand vector registers OPR-0 to OPR-3, respectively, and in the same way all vector elements of the operand vector are set in sequence in the operand vector registers OPR-0 to OPR-3.
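The initial layout described here amounts to interleaving: element j of a vector goes to register number j mod n, at address j div n. The sketch below illustrates this in Python; the names `distribute` and `row` are hypothetical, and plain lists stand in for the registers.

```python
def distribute(vector, n=4):
    """Place element j of `vector` in register j % n at address j // n."""
    banks = [[] for _ in range(n)]
    for j, elem in enumerate(vector):
        banks[j % n].append(elem)
    return banks


def row(banks, address):
    """The n values read out in parallel from one address of every register."""
    return [bank[address] for bank in banks]


mask = [1, 0, 1, 1, 0, 1, 0, 0]
operand = [f"A{i}" for i in range(8)]

msk = distribute(mask)       # stands in for MSK-0..MSK-3
opr = distribute(operand)    # stands in for OPR-0..OPR-3

print(row(msk, 0))  # [1, 0, 1, 1]
print(row(msk, 1))  # [0, 1, 0, 0]
print(row(opr, 0))  # ['A0', 'A1', 'A2', 'A3']
print(row(opr, 1))  # ['A4', 'A5', 'A6', 'A7']
```

Reading one address across all n registers therefore yields n consecutive elements of the vector, which is what allows n elements to be handled per cycle.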

The result vector registers RSL-0 to RSL-3 need not be initialized, because the vector elements A0, A1, A2, ... of the operand vector registers OPR-0 to OPR-3 are written into them in compressed form by the compression conversion. Accordingly, the result vector registers RSL-0 to RSL-3 shown in FIG. 3 hold not initial values of the kind described above but the vector elements after compression conversion, as explained later.

Compression conversion starts from the initial state described above. In the 0th cycle of compression conversion, the mask data '1011' stored at the first addresses of the mask data registers MSK-0 to MSK-3 of the parallel vector register section 1 is read out in parallel and stored in the mask register 31 of the compression conversion control circuit 3 via the mask data read bus 1300.

At the same time, the vector elements A0, A1, A2, A3 of the operand vector stored at the first addresses of the operand vector registers OPR-0 to OPR-3 are read out and output to the input ports 20-0 to 20-3 of the align circuit 2 via the read data bus 1000. The decoder 341 generates an align circuit connection control signal from the mask data stored in the mask register 31 and supplies it to the align circuit 2 to control the connections between the input ports 20-0 to 20-3 and the output ports 21-0 to 21-3 of the align circuit 2. This control is performed as follows.

FIGS. 6 and 7 show the circuit configuration and the logic values, respectively, of the encoder 33 shown in FIG. 5. The output of the encoder 33 is supplied to the parallel vector register section 1 as the increment control signal for the result vector registers RSL-0 to RSL-3. The logic values shown in FIG. 7 are for the embodiment with n = 4, but a similar encoder can easily be constructed for n ≠ 4 by setting the logic values as follows. The input mask elements m0 to m(n-1) are added to obtain the number of '1's they contain; that many '1's are assigned left-justified, all remaining positions are set to '0', and the result is cyclically shifted to the right (right rotated) by the accumulated value X. When the mask elements m0 to m3 are '1', '0', '1', '1' and the accumulated value X is 0, the increment control signal obtained in this way for the result vector registers RSL-0 to RSL-3 is "1110", so the write addresses of the result vector registers RSL-0 to RSL-2 are incremented. Because the vector elements of the result vector have been transferred to these registers at this time, the vector elements A0, A2, A3 are written to the first addresses of the result vector registers RSL-0 to RSL-2 and remain there. The write address of the result vector register RSL-3, however, is not incremented, so the vector element A3 transferred to the result vector register RSL-3 is written to its first address but is overwritten in the next cycle. Consequently, as shown in FIG. 3, the vector elements A0, A2, A3 are stored at the first addresses of the result vector registers RSL-0 to RSL-2.
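The encoder rule stated above (count the '1's, left-justify that many '1's, right rotate by X) can be sketched in Python as follows. This is a behavioural illustration, not the circuit of FIG. 6; the function name `encoder` is hypothetical, and reducing X modulo n is an assumption made so that the rotation is well defined.

```python
def encoder(mask_bits, x):
    """Write-address increment signal for the n result vector registers.

    mask_bits: the n mask elements m0..m(n-1) of the current cycle.
    x: accumulated number of '1's from earlier cycles (taken modulo n here).
    """
    n = len(mask_bits)
    count = sum(mask_bits)                      # number of '1's in the mask
    packed = [1] * count + [0] * (n - count)    # left-justified '1's
    s = x % n
    return packed[n - s:] + packed[:n - s]      # right rotate by s places


print(encoder([1, 0, 1, 1], 0))  # [1, 1, 1, 0]
print(encoder([0, 1, 0, 0], 3))  # [0, 0, 0, 1]
```

The first call reproduces the "1110" of the 0th cycle; the second corresponds to a mask of '0100' with an accumulated value of 3.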

In the 1st cycle, the vector elements A4, A5, A6, A7 at the next addresses are read out from the operand vector registers OPR-0 to OPR-3 and input to the input ports 20-0 to 20-3 of the align circuit 2. Likewise, the next mask data '0100' is read out from the mask data registers MSK-0 to MSK-3 and stored in the mask register 31. Since the previous accumulated value X held in the register 322 of the accumulation circuit 32 is 3 and the mask data is '0100', the components ω0 to ω3 of the align circuit connection control signal on the signal line 3200 become '3331', and as a result A7, A7, A7, A5 are output to the output ports 21-0 to 21-3 of the align circuit 2, respectively. Meanwhile, the output of the encoder 33 becomes '0001', so the vector element A5 is written to the result vector register RSL-3 only, and its write address is then incremented. At this time the vector elements A7, A7, A7 are written to the other result vector registers RSL-0 to RSL-2, but because their addresses are not incremented they are overwritten in the next cycle.

FIG. 8 is a diagram showing the information generated by the decoder 341 shown in FIG. 5. The entries marked X in FIG. 8 are connections that are not used in the compression conversion operation, so they may be set in whatever way is easiest to implement in hardware. In the explanation above, the connection information '0' is assigned to m0, '1' to m1, '2' to m2, and '3' to m3; the connection logic is obtained by arranging, left-justified, the connection information of those mask bits m0, m1, m2, m3 that are '1', and cyclically shifting the result to the right in accordance with the accumulated value.
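The connection logic just described can be modelled in a few lines of Python. This is a sketch under assumptions: the function name `connection_info` is hypothetical, X is reduced modulo n, and the unused (X) entries are filled with n-1, a choice made here only because it reproduces the values '0233' and '3331' quoted in the description; any other fill value would serve equally well for the compression itself.

```python
def connection_info(mask_bits, x):
    """Connection information w0..w(n-1) for the align circuit.

    Mask bit m_i carries connection information i. The information of the
    bits that are '1' is packed to the left, unused positions are filled
    with n-1 (an arbitrary don't-care value), and the result is rotated
    right by the accumulated value x.
    """
    n = len(mask_bits)
    selected = [i for i, bit in enumerate(mask_bits) if bit == 1]
    packed = selected + [n - 1] * (n - len(selected))
    s = x % n
    return packed[n - s:] + packed[:n - s]


print(connection_info([1, 0, 1, 1], 0))  # [0, 2, 3, 3]
print(connection_info([0, 1, 0, 0], 3))  # [3, 3, 3, 1]
```

Because the function depends only on the length of the mask, the same rule applies unchanged for other values of n.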

FIG. 9 is a circuit diagram showing an example configuration of the decoder 341; the symbol '1' in FIG. 9 indicates that the constant '1' is output. The logic values shown in FIG. 8 are realized by the circuit configuration shown in FIG. 9. FIG. 9 shows the embodiment for n = 4, but the decoder can easily be constructed for n ≠ 4 as well.

For example, when n = 2, the connection information '0' is assigned to m0 and '1' to m1, and the connection information of those mask bits m0, m1 whose content is '1' is arranged from the left; this yields connection information for the align circuit 2 with which compression conversion can be performed.

Similarly, when n = 5, the connection information '0' is assigned to m0, '1' to m1, '2' to m2, '3' to m3, and '4' to m4, and the same operation yields the logic of a decoder 341 with which compression conversion can be performed.

In FIG. 8, the output signal of the shifter 342 for an accumulated value X is the output signal of the decoder 341 (ω0, ω1, ω2, ω3 for X = 0) cyclically shifted to the right (right rotated) by the accumulated value X. Accordingly, when the mask data '1011' described above is stored in the mask register 31, the accumulated value X is 0, so the output of the decoder 341 is ω0 = 0, ω1 = 2, ω2 = 3, ω3 = 3, and the decoder 341 supplies '0233' to the align circuit 2. As a result, the input ports 20-0 to 20-3 and the output ports 21-0 to 21-3 of the align circuit 2 are connected as described above, and the vector elements A0, A2, A3, A3 are output as data on the write data bus 2000 connected to the output ports 21-0 to 21-3.

By repeating the above cycle one after another, the compression conversion shown in FIG. 1 is carried out correctly.
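To see how the repeated cycles fit together, the following Python sketch simulates the whole procedure in software. It is an illustration only: all names are hypothetical, the input length is assumed to be a multiple of n, the accumulated value X is assumed to wrap modulo n, and unused connection entries are filled with n-1 as in the worked values of the description.

```python
def rotate_right(seq, s):
    """Cyclically shift a list s places to the right."""
    s %= len(seq)
    return seq[len(seq) - s:] + seq[:len(seq) - s]


def compress_parallel(operand, mask, n=4):
    """Compress `operand` under `mask`, handling n elements per cycle."""
    assert len(operand) == len(mask) and len(operand) % n == 0
    cycles = len(operand) // n
    rsl = [[None] * cycles for _ in range(n)]  # result registers RSL-0..RSL-(n-1)
    addr = [0] * n                             # write address of each result register
    x = 0                                      # accumulated value X
    for c in range(cycles):
        m = mask[c * n:(c + 1) * n]            # n mask elements read in parallel
        data = operand[c * n:(c + 1) * n]      # n operand elements read in parallel
        picked = [i for i, bit in enumerate(m) if bit == 1]
        pad = n - len(picked)
        omega = rotate_right(picked + [n - 1] * pad, x)       # decoder and shifter
        inc = rotate_right([1] * len(picked) + [0] * pad, x)  # encoder
        for port in range(n):
            # Every register is written; only those with inc = 1 keep the data,
            # because the others are overwritten in a later cycle.
            rsl[port][addr[port]] = data[omega[port]]
            addr[port] += inc[port]
        x = (x + len(picked)) % n              # accumulation circuit
    total = sum(mask)
    # Read the compressed vector back in interleaved order.
    return [rsl[j % n][j // n] for j in range(total)]


operand = [f"A{i}" for i in range(8)]
mask = [1, 0, 1, 1, 0, 1, 0, 0]
print(compress_parallel(operand, mask))  # ['A0', 'A2', 'A3', 'A5']
```

In the two cycles of this run the intermediate signals are '0233' with "1110" and then '3331' with "0001", matching the values given for the 0th and 1st cycles.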

In this embodiment, the number (n) of data items processed simultaneously in parallel has mainly been described as four, but this is merely one embodiment, and the present invention is not limited to it.

As described above, when the present invention is adopted, the control signals supplied to the align circuit and the parallel vector register section for efficiently performing compression conversion of vector data can be generated by a compression conversion control circuit with a relatively simple hardware configuration.

(Effects of the Invention) As explained above, the present invention provides a plurality of operand vector registers and a plurality of result vector registers and controls them, and thereby has the effect that efficient compression conversion can be performed.

[Brief explanation of drawings]

FIG. 1 is an explanatory diagram for explaining compression conversion of vector elements.
FIG. 2 is a block diagram showing an embodiment of a vector processing device according to the present invention.
FIG. 3 is a block diagram showing details of the parallel vector register section shown in FIG. 2.
FIG. 4 is a block diagram showing details of the align circuit shown in FIG. 2.
FIG. 5 is a block diagram showing details of the compression conversion control circuit shown in FIG. 2.
FIG. 6 is a circuit diagram showing details of the encoder shown in FIG. 5.
FIG. 7 is a diagram showing the data obtained by the encoder shown in FIGS. 5 and 6.
FIG. 8 is a diagram showing the information obtained by the decoder shown in FIG. 5.
FIG. 9 is a circuit diagram showing an example circuit configuration of a decoder for realizing the information shown in FIG. 8.

1 ... parallel vector register section; 2 ... align circuit; 3 ... compression conversion control circuit; 20-0 to 20-3 ... input ports; 21-0 to 21-3 ... output ports; 31 ... mask register; 32 ... accumulation circuit; 33 ... encoder; 341 ... decoder; 342 ... shifter; 321 ... adder; 322 ... register; VE-0 to VE-3 ... vector register sections; MSK-0 to MSK-3 ... mask data registers; OPR-0 to OPR-3 ... operand vector registers; RSL-0 to RSL-3 ... result vector registers; 22, 1000, 1300, 2000, 3000, 3200 ... signal lines and buses.

Patent applicant: NEC Corporation (日本電気株式会社). Agent: patent attorney 井ノ口 壽.

[FIG. 7: table of encoder outputs for each mask pattern and for each accumulated value X = 0, 1, 2, 3; table entries not recoverable from the source text.]
[FIG. 8: table of connection information ω0 to ω3 for each mask pattern m0 to m3 and for each accumulated value X = 0, 1, 2, 3, with unused entries marked X; table entries not recoverable from the source text.]

Claims

[Claims]

A vector processing device characterized by comprising: operand vector register means for reading out, in one cycle, a plurality of vector elements belonging to the same vector; result vector register means for writing, in one cycle, the plurality of vector elements belonging to the same vector; mask data register means, corresponding to the elements of the operand vector register means and the result vector register means, for reading out the plurality of mask elements in one cycle; read data bus means for reading out the plurality of vector elements from the operand vector register means; write data bus means for writing the plurality of vector elements into the result vector register means; align circuit means for selectively connecting the read data bus means and the write data bus means; accumulation circuit means for accumulating the number of '1's among the mask elements that have been read out; encoder means for generating a write address increment signal for the result vector register means from the accumulated value obtained from the accumulation circuit means and the mask elements that have been read out; and decoder means for generating a connection control signal for the align circuit means from the mask elements that have been read out and the accumulated value, the device being configured to perform vector compression conversion.