JPH03220669A - Multiple loop vectorization compilation method

Info

Publication number: JPH03220669A
Application number: JP2016700A
Authority: JP
Inventors: Takayuki Nakatomi; 中富　孝幸
Original assignee: NEC Solution Innovators Ltd
Current assignee: NEC Solution Innovators Ltd
Priority date: 1990-01-26
Filing date: 1990-01-26
Publication date: 1991-09-27

Abstract

(57) [Summary] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of Industrial Application]

The present invention relates to a multiple loop vectorization compilation method in a compiler for an electronic computer system.

[Conventional technology]

In a vector processor that has vector instructions, which operate at once on data arranged regularly in storage, the execution time of a program can generally be shortened by increasing the proportion of the object program that is executed by vector instructions. This is because a single vector instruction can carry out an operation that would otherwise require repeating an ordinary (scalar) instruction many times. A compiler for such a vector processor is therefore expected to convert a given source program into an object program in a form that can be executed in parallel by vector instructions wherever possible.

A conventional compiler for a vector processor with vector instructions generally consists of three main parts: a syntax analysis unit that reads a source program written in a high-level language and parses it to generate a first intermediate text; a vectorization processing unit that detects loop structures in the source program from the first intermediate text, recognizes the vectorizable portions, and generates a second intermediate text containing text for vector processing; and a code generation unit that generates and outputs an object program from the second intermediate text.

For example, when such a conventional compiler is given a source program containing a DO loop written in FORTRAN as shown in FIG. 10, the syntax analysis unit reads the source program and generates a first intermediate text structured as shown in FIG. 11 (steps S1 to S4). The vectorization processing unit then transforms the first intermediate text into a second intermediate text structured as shown in FIG. 12 (steps S11 and S12), and the code generation unit reads the second intermediate text and generates and outputs an object program consisting of actual code.

Vectorization was also possible for a multiple (nested) loop, not only for a single loop as above, by unifying it into a single loop, provided that certain conditions were satisfied. For example, given a source program containing a multiple DO loop as shown in FIG. 13, one-dimensional arrays AA, BB, CC, and DD with the same number of elements as arrays A, B, C, and D are newly declared as shown in FIG. 14, and EQUIVALENCE statements make arrays A, B, C, and D coincide with arrays AA, BB, CC, and DD respectively. The multiple DO loop is thereby replaced by a single DO loop, and this single DO loop is vectorized by the same technique as in FIGS. 10 to 12.
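The unification described above can be sketched in Python as follows. FIGS. 13 and 14 are not reproduced in this text, so the array extents, the loop body, and all function names are assumptions chosen only for illustration. A flat list indexed in Fortran column-major order stands in for the one-dimensional array that EQUIVALENCE would alias onto the original array.

```python
# Illustrative sketch only: extents, loop body, and names are assumptions.
NI, NJ = 4, 3  # hypothetical array extents


def idx(i, j):
    """Column-major (Fortran-style) offset of element (i, j), 0-based."""
    return i + NI * j


def nested(b, c, d):
    """Original form: a doubly nested loop over every array element."""
    a = [0] * (NI * NJ)
    for j in range(NJ):
        for i in range(NI):
            a[idx(i, j)] = b[idx(i, j)] + c[idx(i, j)] * d[idx(i, j)]
    return a


def unified(bb, cc, dd):
    """Unified form: one loop over the aliased one-dimensional arrays."""
    aa = [0] * (NI * NJ)
    for n in range(NI * NJ):
        aa[n] = bb[n] + cc[n] * dd[n]
    return aa


b = list(range(NI * NJ))
c = [2] * (NI * NJ)
d = [x + 1 for x in range(NI * NJ)]
```

Because the nested loop visits every element in storage order, the single loop touches exactly the same elements in the same order, which is what makes it a candidate for one long vector operation.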

[Problem to be solved by the invention]

As described above, conventional compilers could vectorize multiple loops as well as single loops under certain conditions. However, the conditions for vectorization by unification are strict, and when they are not satisfied the reduction in execution time that vectorization offers cannot be obtained.

The conditions under which a multiple loop can be vectorized by unification are as follows:

(1) The loop contains no data dependency that contradicts parallel execution.
(2) The definitions and references of array elements in the loop are equally spaced in storage.
(3) The definitions and references of array elements in the loop proceed in one direction in storage.

Condition (1) is a matter of course. Conditions (2) and (3) are required because the elements of the one-dimensional arrays (AA, BB, CC, and DD in the example of FIG. 14) are stepped through by successive increments or decrements.
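As a rough illustration of the equal-spacing and one-direction conditions, the following Python sketch enumerates the storage offsets a loop nest touches and tests both properties. The function names, the triple format for loop bounds, and the example extents are assumptions made for this sketch, not part of the patent.

```python
def addresses(loops, strides):
    """List the storage offsets touched by a loop nest, in execution order.

    loops   -- (first, last, step) per DO variable, outermost first (1-based)
    strides -- storage stride per DO variable, in the same order
    """
    offsets = [0]
    for (first, last, step), stride in zip(loops, strides):
        offsets = [base + (v - 1) * stride
                   for base in offsets
                   for v in range(first, last + 1, step)]
    return offsets


def is_equally_spaced(offsets):
    """True when every pair of consecutive accesses is the same distance apart."""
    gaps = {b - a for a, b in zip(offsets, offsets[1:])}
    return len(gaps) <= 1


def is_one_directional(offsets):
    """True when the accesses only ever move forward, or only backward."""
    gaps = [b - a for a, b in zip(offsets, offsets[1:])]
    return all(g > 0 for g in gaps) or all(g < 0 for g in gaps)


# Hypothetical 4 x 3 column-major array: J is the outer variable (stride 4),
# I the inner variable (stride 1).
full = addresses([(1, 3, 1), (1, 4, 1)], [4, 1])
sparse = addresses([(1, 3, 2), (1, 3, 1)], [4, 1])
```

Here `full` covers the whole array and satisfies both conditions, while `sparse` (outer step 2, inner loop stopping short of the array bound) still moves in one direction but is no longer equally spaced.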

Consider a source program containing a multiple DO loop as shown in FIG. 4. The increment of DO variable K is 2, and the final values of the DO variables of the inner DO loops are not the maximum values allowed by the array, so condition (2) above is not satisfied. This source program therefore cannot be vectorized by unification; only the innermost DO loop can be vectorized, by the same technique as in FIGS. 10 to 12. As a result, the vector processor that runs the generated object program is forced to execute the outer loop control and the array subscript calculations sequentially with scalar instructions.

The present invention has been proposed in view of the above. Its object is to provide a multiple loop vectorization compilation method that can perform vectorization by unification whenever conditions (1) and (3) above are satisfied, even if condition (2) is not, and can thereby achieve a further reduction in execution time.

[Means to solve the problem]

To achieve the above object, the present invention concerns a compilation method that generates and outputs an object program for a vector processor from a given source program, using a syntax analysis unit that reads a source program written in a high-level language and parses it to generate a first intermediate text, a vectorization processing unit that detects loop structures in the source program from the first intermediate text, recognizes the vectorizable portions, and generates a second intermediate text containing text for vector processing, and a code generation unit that generates and outputs the object program from the second intermediate text. In this method, the vectorization processing unit is provided with: structure analysis means that analyzes the flow of control in the loops of the source program from the first intermediate text; data dependency determination means that determines whether a loop contains a data dependency that contradicts parallel execution; multiple loop unification analysis means that analyzes and determines, for the portions found not to contradict parallel execution, whether the multiple loop can be unified either in the ordinary way or by using a mask; and vector text generation means that unifies and/or vectorizes the portions determined to be unifiable and the other portions found not to contradict parallel execution, and generates the second intermediate text.

[Operation]

In the multiple loop vectorization compilation method of the present invention, the structure analysis means of the vectorization processing unit analyzes the flow of control in the loops of the source program on the basis of the first intermediate text generated by the syntax analysis unit. The data dependency determination means determines whether a loop contains a data dependency that contradicts parallel execution. For the portions found not to contradict parallel execution, the multiple loop unification analysis means analyzes and determines whether the multiple loop can be unified in the ordinary way or by using a mask. The vector text generation means unifies and/or vectorizes the portions determined to be unifiable and the other portions found not to contradict parallel execution, and generates the second intermediate text. The code generation unit then generates and outputs the object program from this second intermediate text.

[Example]

An embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a block diagram showing an embodiment of a compiler to which the multiple loop vectorization compilation method of the present invention is applied.

In FIG. 1, compiler 2 has the following basic configuration: a syntax analysis unit 21 that reads source program 1 written in a high-level language and parses it to generate a first intermediate text 24; a vectorization processing unit 22 that detects loop structures in source program 1 from the first intermediate text 24, recognizes the vectorizable portions, and generates a second intermediate text 25 containing text for vector processing; and a code generation unit 23 that generates and outputs object program 3 from the second intermediate text 25.

As the characteristic part of the present invention, the vectorization processing unit 22 is provided with: structure analysis means 221, which analyzes the flow of control in the loops of source program 1 from the first intermediate text 24; data dependency determination means 222, which determines whether a loop contains a data dependency that contradicts parallel execution; multiple loop unification analysis means 223, which analyzes and determines, for the portions that the data dependency determination means 222 found not to contradict parallel execution, whether the multiple loop can be unified in the ordinary way or by using a mask; and vector text generation means 224, which unifies and/or vectorizes the portions that the multiple loop unification analysis means 223 determined to be unifiable and the other portions that the data dependency determination means 222 found not to contradict parallel execution, and generates the second intermediate text.

FIG. 2 shows the logical configuration of the multiple DO loop information table 4, which the structure analysis means 221 uses to express the analysis information resulting from structural analysis of a multiple loop. The table consists of a pointer 41 to the next multiple DO loop information table, a DO loop nest chain 42, a pointer 43 to the DO loop unification table, a pointer 44 to the array table, a pointer 45 to the vector text, a pointer 46 to the symbol table entry of the DO variable, a triad 47 for the DO loop initial value, a triad 48 for the DO loop final value, and a triad 49 for the DO loop increment value.

FIG. 3 shows the logical configuration of the DO loop unification table 5, which the multiple loop unification analysis means 223 uses to express the analysis information resulting from analysis of a unifiable multiple loop. The table consists of a pointer 51 to the innermost multiple DO loop information table, a pointer 52 to the mask array table, a triad 53 for the DO loop initial value after unification, a triad 54 for the DO loop final value after unification, a triad 55 for the DO loop increment value after unification, and the DO loop repetition count 56 after unification.
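The two tables can be pictured as linked records. The following Python sketch mirrors the fields listed for FIGS. 2 and 3. The class and field names, the use of plain values in place of triads, and the reading of the nest chain as a link to the next inner loop are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class DoLoopUnificationTable:
    """Hypothetical rendering of DO loop unification table 5 (FIG. 3)."""
    innermost_loop: Optional["MultipleDoLoopInfo"] = None  # pointer 51
    mask_array_table: Optional[Any] = None                 # pointer 52
    initial_value: Any = None                              # triad 53
    final_value: Any = None                                # triad 54
    increment: Any = None                                  # triad 55
    repetition_count: Any = None                           # field 56


@dataclass
class MultipleDoLoopInfo:
    """Hypothetical rendering of multiple DO loop information table 4 (FIG. 2)."""
    next_table: Optional["MultipleDoLoopInfo"] = None           # pointer 41
    nest_chain: Optional["MultipleDoLoopInfo"] = None           # chain 42
    unification_table: Optional[DoLoopUnificationTable] = None  # pointer 43
    array_table: Optional[Any] = None                           # pointer 44
    vector_text: Optional[Any] = None                           # pointer 45
    do_variable: Optional[str] = None                           # pointer 46
    initial_value: Any = None                                   # triad 47
    final_value: Any = None                                     # triad 48
    increment: Any = None                                       # triad 49


# A three-deep nest over K, J, I; the bounds shown are placeholders.
inner = MultipleDoLoopInfo(do_variable="I", initial_value=1,
                           final_value="II", increment=1)
middle = MultipleDoLoopInfo(nest_chain=inner, do_variable="J",
                            initial_value=1, final_value="JJ", increment=1)
outer = MultipleDoLoopInfo(nest_chain=middle, do_variable="K",
                           initial_value=1, final_value="KK", increment=2)
```

In this sketch the unification table starts out unattached (`None`) and would be linked in only once the loop nest has been judged unifiable.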

The operation is described below using as an example the source program shown in FIG. 4, which conventionally could not be vectorized by unifying the multiple loop.

Before the specific operation is described, the technique of the present invention for unifying a multiple loop is explained conceptually.

In the present invention, the source program of FIG. 4 is put into a form containing two multiple DO loops, as shown in FIG. 5. The first multiple DO loop uses the same DO variables as the multiple DO loop in the original source program, and its innermost body contains only a statement that assigns the value 1 to mask array W. In the second multiple DO loop, the increments of all DO variables are set to 1 and the final values of the inner DO variables are set to the maximum values allowed by the array; in its innermost body, the same definitions and references as in the original source program are performed only when the corresponding element of mask array W is, for example, 1.

In other words, the multiple DO loop in the source program of FIG. 4 could not be unified because DO variable K changes in increments of 2, which violates the unification condition that the definitions and references of array elements in the loop must be equally spaced in storage. The increment of DO variable K is therefore forcibly set to 1, and the final values of the DO variables of the inner DO loops are set to the maximum values allowed by the array. Left as is, however, definitions and references would also be performed on array elements the programmer did not intend, such as those for which DO variable K is 2, 4, and so on, and those for which DO variables I and J exceed II and JJ. To prevent this, a multiple DO loop with the same DO variables as the original source program is retained, and within it information indicating that the operation should actually be performed is written to mask array W. The following multiple DO loop then refers to the elements of mask array W and performs the definitions and references only when the value is 1.

Of the forms obtained as in FIG. 5, the second multiple DO loop is unified by the same technique as described for FIGS. 13 and 14. This state is shown in FIG. 6. The unified DO loop is then vectorized by the same technique as described for FIGS. 10 to 12. The first multiple DO loop can also be vectorized, for its innermost DO loop, by the same technique as described for FIGS. 10 to 12. In actual operation, the relevant multiple DO loop is unified and vectorized directly from the analysis results of the source program of FIG. 4, so the converted forms of FIGS. 5 and 6 never actually exist.
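The mask technique can be sketched in Python as follows. FIGS. 4 to 6 are not reproduced in this text, so the array extents, the final values II, JJ, and KK, the loop body, and the assumption that mask array W starts out as all zeros are illustrative choices, not details taken from the patent.

```python
NI, NJ, NK = 5, 4, 6      # hypothetical array extents, as in A(NI, NJ, NK)
II, JJ, KK = 3, 2, 5      # hypothetical final values of I, J, K


def idx(i, j, k):
    """Column-major offset of the 1-based element (i, j, k)."""
    return (i - 1) + NI * ((j - 1) + NJ * (k - 1))


def original(b, c):
    """Loop nest in the style of FIG. 4: K advances in steps of 2."""
    a = [0] * (NI * NJ * NK)
    for k in range(1, KK + 1, 2):
        for j in range(1, JJ + 1):
            for i in range(1, II + 1):
                a[idx(i, j, k)] = b[idx(i, j, k)] + c[idx(i, j, k)]
    return a


def masked(b, c):
    """Two-stage form in the style of FIGS. 5 and 6."""
    # Stage 1: same DO variables as the original; only mark active elements.
    w = [0] * (NI * NJ * NK)
    for k in range(1, KK + 1, 2):
        for j in range(1, JJ + 1):
            for i in range(1, II + 1):
                w[idx(i, j, k)] = 1
    # Stage 2: a single unit-stride loop of NI * NJ * KK iterations that
    # performs the assignment only where the mask is set.
    a = [0] * (NI * NJ * NK)
    for n in range(NI * NJ * KK):
        if w[n] == 1:
            a[n] = b[n] + c[n]
    return a


b = list(range(NI * NJ * NK))
c = [10 * x for x in range(NI * NJ * NK)]
```

The second stage does more iterations than the original nest, but it is a single equally spaced sweep, which is the shape a masked vector instruction can process in one pass.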

Next, the operation of each means in the embodiment of FIG. 1 is described for the above example.

First, when source program 1 is given and compiler 2 is started, the syntax analysis unit 21 reads source program 1, parses it, and generates the first intermediate text 24. For source program 1 of FIG. 4, a first intermediate text 24 structured as shown in FIG. 7 (steps 2401 to 2412) is generated.

Next, the structure analysis means 221 of the vectorization processing unit 22 reads the first intermediate text 24, recognizes the loops, and analyzes their flow of control. Specifically, it performs the following processing.

(1) Divide the first intermediate text 24 into blocks delimited by branches. Here a branch corresponds to the exit or entrance of a loop. The first intermediate text 24 of FIG. 7 is divided at steps 2402, 2404, and 2406.

(2) Divide the loop portions into statement-level blocks.

(3) Analyze the flow of control of the whole program and determine the relationships between the blocks.

(4) For the arrays and variables defined and referenced in each block, collect information on what enters and leaves the block.

The above analysis results are expressed using the multiple DO loop information table 4 shown in FIG. 2. For example, the information for each DO loop in the first intermediate text 24 of FIG. 7 is expressed, as shown in FIG. 8, as a chain of multiple DO loop information tables 4a, 4b, and 4c, starting from a pointer 6 to the multiple DO loop information table, together with their contents.

At this point, however, the DO loop unification table 5a in FIG. 8 has not yet been attached.

Next, in FIG. 1, the data dependency determination means 222 uses the analysis information to determine whether a loop contains a data dependency that contradicts parallel execution. From the analysis information of FIG. 8, created for source program 1 of FIG. 4, it determines that nothing contradicts parallel execution.

Next, the multiple loop unification analysis means 223 analyzes and determines whether the multiple loop can be unified for the portions that the data dependency determination means 222 found not to contradict parallel execution. Specifically, it performs the following processing.

(1) From the arrays and variables defined and referenced within the DO loop, extract as candidates those whose definition and reference relationships are all free of contradiction.

(2) Using the DO variable information 47, 48, and 49 in the multiple DO loop information table 4 corresponding to the multiple DO loop, check whether the definitions and references of the arrays and variables proceed in a single direction in storage throughout the multiple DO loop. From the analysis information of FIG. 8, created for source program 1 of FIG. 4, they are determined to have a single direction.

(3) If they have a single direction, determine that the multiple DO loop can be unified and create a DO loop unification table 5. In the present example, DO loop unification table 5a is created and chained as shown in FIG. 8.

(4) After DO loop unification table 5a has been created, check whether the size of the increase or decrease in the storage position of the array element definitions and references is constant when the value of the DO variable of an outer DO loop of the multiple DO loop changes. In other words, check whether the definitions and references of array elements in the loop are equally spaced in storage. From the analysis information of FIG. 8, the size of the increase or decrease is determined not to be constant, because the increment of the DO variable of the outermost DO loop is 2.

(5) If the size of the increase or decrease is determined not to be constant, reserve an area for the mask array, record that information as a mask array table (not shown), and chain it to the DO loop unification table 5. With the analysis information of FIG. 8, the mask array table is chained to DO loop unification table 5a.

(6) When the mask array table has been chained, set information such as the repetition count of the DO variable after unification in the DO loop unification table 5. With the analysis information of FIG. 8, the repetition count of the DO loop after unification is "100*50*KK". That is, the repetition count is the product of the maximum values of the inner DO variables and the final value of the outermost DO variable.
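The repetition count rule can be illustrated with a short Python sketch that compares the unified loop's repetition count with the number of iterations in the original nest. The inner maxima 100 and 50 come from the text; the value chosen for KK, the final values chosen for I and J, and the function names are assumptions for illustration.

```python
def unified_repetitions(inner_maxima, outermost_final):
    """Product of the inner DO variables' maxima and the outermost final value."""
    count = outermost_final
    for m in inner_maxima:
        count *= m
    return count


def active_iterations(loops):
    """Iterations of the original nest; loops are (first, last, step) triples."""
    total = 1
    for first, last, step in loops:
        total *= len(range(first, last + 1, step))
    return total


# Hypothetical values: KK = 9, and inner loops that stop at 20 and 30.
unified = unified_repetitions([100, 50], 9)
active = active_iterations([(1, 9, 2), (1, 20, 1), (1, 30, 1)])
```

With these placeholder bounds the unified loop runs 45000 times while only 3000 elements are actually active, so the mask suppresses most iterations; the benefit depends on the unified loop running as vector operations rather than scalar ones.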

Next, in FIG. 1, the vector text generation means 224 uses the analysis information to unify and/or vectorize the portions that the multiple loop unification analysis means 223 determined to be unifiable and the portions that the data dependency determination means 222 found not to contradict parallel execution, and generates the second intermediate text 25. Specifically, it performs the following processing.

(1) Generate the text for setting the vector length, which is needed to process the parallel-executable portions as vectors. For example, the analysis information of FIG. 8 shows that there is no ordinary parallel-executable portion and that there is one multiple DO loop that can be unified by using a mask array. The text for setting the vector length needed to unify that multiple DO loop is therefore generated as shown in steps 2505 and 2510 of FIG. 9. Step 2505 is needed to vectorize the innermost DO loop of the multiple DO loop that sets the mask; its vector length is "II", the final value of DO variable I. Step 2510 is needed to vectorize the unified DO loop; its vector length is "100*50*KK", the repetition count of the DO loop after unification.

(2) Generate text for vector processing for the ordinary parallel-executable portions. The analysis information of FIG. 8 shows that there is no ordinary parallel-executable portion, so this processing is not performed.

(3) Generate the processing text for the unifiable multiple DO loop. When there is a multiple DO loop that can be unified by using a mask array, this processing is performed as follows.

(a) Generate the text for the scalar-instruction portion outside the mask-setting multiple DO loop, and generate the text that assigns a predetermined value to the mask array. In the present example, the text shown in steps 2501 to 2504, 2507, and 2508 of FIG. 9 is generated as the outer scalar instructions, and the text shown in step 2509 is generated to guarantee the final value of the vectorized innermost DO variable. The text shown in step 2506 is generated as the text that assigns the predetermined value to the mask array.

(b) Generate the text for setting the mask information, which specifies the value of a mask array element for which the mask is turned on. In the present example, the text shown in step 2511 of FIG. 9 is generated.

(c) Generate the text that performs the specified definitions and references when the mask is on. In the present example, the text shown in steps 2512 to 2514 of FIG. 9 is generated.

(4) Finally, the vector text generation means 224 generates the post-processing text needed to process the ordinary parallel-executable portions as vectors. In the present example nothing applies, so this processing is not performed.

Next, the code generation unit 23 reads the second intermediate text 25 generated as described above and generates object program 3 in the corresponding machine code.

[Effect of the invention]

As explained above, in the multiple loop vectorization compilation method of the present invention, provided that the loop contains no data dependency that contradicts parallel execution and that the definitions and references of array elements in the loop proceed in one direction in storage, the array definition and reference portion can be taken out of the loop and vectorized even if the definitions and references of array elements in the loop are not equally spaced in storage. The result does not quite match the case in which the whole multiple loop is unified and vectorized, but a substantial reduction in execution time can be expected compared with vectorizing only the innermost loop.

[Brief explanation of drawings]

FIG. 1 is a block diagram showing an embodiment of a compiler to which the multiple loop vectorization compilation method of the present invention is applied.
FIG. 2 is a logical configuration diagram of the multiple DO loop information table used to express DO loop analysis information.
FIG. 3 is a logical configuration diagram of the DO loop unification table used to express DO loop analysis information.
FIG. 4 is a diagram showing an example of a source program containing a multiple DO loop.
FIG. 5 is a conceptual diagram showing, in source program form, the source program of FIG. 4 converted into a unifiable form.
FIG. 6 is a conceptual diagram showing, in source program form, the source program of FIG. 5 after unification.
FIG. 7 is a conceptual diagram (flow chart) of the structure of the first intermediate text generated by parsing the source program of FIG. 4.
FIG. 8 is a diagram showing an example of how the analysis information for the DO loops in the source program of FIG. 4 is expressed.
FIG. 9 is a conceptual diagram of the structure of the second intermediate text generated by vectorizing the first intermediate text of FIG. 7.
FIG. 10 is a diagram showing an example of a source program containing a DO loop.
FIG. 11 is a conceptual diagram of the structure of the first intermediate text generated by parsing the source program of FIG. 10.
FIG. 12 is a conceptual diagram of the structure of the second intermediate text generated by vectorizing the first intermediate text of FIG. 11.
FIG. 13 is a diagram showing an example of a source program containing a multiple DO loop.
FIG. 14 is a diagram showing, in source program form, the source program of FIG. 13 after unification.

In the figures:
1: source program
2: compiler
21: syntax analysis unit
22: vectorization processing unit
221: structure analysis means
222: data dependency determination means
223: multiple loop unification analysis means
224: vector text generation means
23: code generation unit
24: first intermediate text
25: second intermediate text
3: object program
4: multiple DO loop information table
5: DO loop unification table
6: pointer

Claims

[Claims]

(1) A multiple loop vectorization compilation method for generating and outputting an object program for a vector processor from a given source program, using a syntax analysis unit that reads a source program written in a high-level language and parses it to generate a first intermediate text, a vectorization processing unit that detects loop structures in the source program from the first intermediate text, recognizes the vectorizable portions, and generates a second intermediate text containing text for vector processing, and a code generation unit that generates and outputs the object program from the second intermediate text, characterized in that the vectorization processing unit is provided with: structure analysis means that analyzes the flow of control in the loops of the source program from the first intermediate text; data dependency determination means that determines whether a loop contains a data dependency that contradicts parallel execution; multiple loop unification analysis means that analyzes and determines, for the portions found not to contradict parallel execution, whether the multiple loop can be unified in the ordinary way or by using a mask; and vector text generation means that unifies and/or vectorizes the portions determined to be unifiable and the other portions found not to contradict parallel execution, and generates the second intermediate text.

(2) The multiple loop vectorization compilation method according to claim 1, wherein the DO loop analysis information is expressed using a multiple DO loop information table and a DO loop unification table.