JPH0784977A

JPH0784977A - Construction method of multilayered neural network

Info

Publication number: JPH0784977A
Application number: JP5230406A
Authority: JP
Inventors: Mina Maruyama (美奈丸山); Nobuo Tsuda (伸生津田)
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: NTT Inc
Priority date: 1993-09-16
Filing date: 1993-09-16
Publication date: 1995-03-31

Abstract

(57) [Abstract]
[Objective] To provide a method for constructing a multilayer neural network that preserves reliable classification rules, refines uncertain classification rules through case learning, and thereby improves classification accuracy and learning efficiency.
[Configuration] A partial configuration of the neural network is initialized based on rough classification rules (step 1); a certainty factor is assigned to each connection based on the rough classification rules (step 2); a suppression rate is determined according to the certainty factor values (step 3); the neural network is trained (step 4); and the network is checked to determine whether it has a classification function for the target data (step 5).

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a method for constructing a multilayer neural network, and more particularly to a method for constructing a multilayer neural network that classifies target data represented by multidimensional feature vectors.

[0002]

[Prior Art] Multilayer neural networks have conventionally been used to classify target data.

[0003] FIG. 4 shows the configuration of a multilayer neural network. In the figure, the input layer 11 receives the feature values (x₁, x₂, x₃, x₄) of the data to be classified, the output layer 12 outputs the classification results (o_n1, o_n2), and one or more intermediate layers 13 lie between the input layer 11 and the output layer 12. The inputs of the units in each layer are connected to the outputs of the units in the layers before and after it. The output of the units in each layer is determined by the following equations.

[Equation 1]

[0004] Here, o_j^(k) is the output value of the j-th unit in layer k (k ≥ 1, where k = 1 is the input layer), w_ij^(k) is the connection weight from the i-th unit in layer k−1 to the j-th unit in layer k, and N^(k−1) is the total number of units in layer k−1. In equation (2), w_0j^(k) is the connection weight that gives a bias to the j-th unit in layer k, and the corresponding output value o_0^(k−1) is always 1. Each unit in the input layer (k = 1) outputs its input feature value unchanged.
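The layer-by-layer computation described above can be sketched as follows. This is a minimal illustration, not the patent's own formula: Equation 1 is not reproduced in this text, so the sigmoid activation and the names `sigmoid`, `forward`, and `layers` are assumptions. Only the weighted sum with a bias weight w_0j applied to a constant input of 1 is taken from the description.

```python
import math


def sigmoid(x):
    # Assumed activation; the source does not state which function is used.
    return 1.0 / (1.0 + math.exp(-x))


def forward(features, layers):
    """Propagate feature values through the network.

    layers[n][j][i] is the weight from unit i of the previous layer to
    unit j of the next layer; index i = 0 is the bias weight w_0j, whose
    input is fixed at 1 as described in paragraph [0004].
    """
    outputs = [float(v) for v in features]  # the input layer passes values through
    for layer in layers:
        extended = [1.0] + outputs  # prepend the constant bias input o_0 = 1
        outputs = [
            sigmoid(sum(w * o for w, o in zip(unit_weights, extended)))
            for unit_weights in layer
        ]
    return outputs
```

With all weights set to zero every unit outputs 0.5 under the assumed sigmoid, which is a convenient sanity check for the indexing.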

[0005] For such a neural network to classify data, the connection weights w_ij^(k) between units and the bias values must be set so that, when the features of the target data are applied to the units of the input layer 11, only the unit of the output layer 12 corresponding to the category to which the data belongs outputs a high value, while the other units of the output layer 12 output low values.

[0006] One way to set these values is the error backpropagation learning method: the initial weights of the neural network are set to random values, and the weights w_ij^(k) are then repeatedly adjusted in small increments so as to reduce the error between the actual output values and the desired output values when the vector elements of sample target data are input. Because this method adjusts the weights in small increments starting from random initial values, however, it has problems with convergence time and convergence: learning takes a long time to converge and often does not converge at all. Moreover, because learning refers only to the sample data, the method has a generalization problem: when many samples cannot be obtained, or when the sample data is biased, general classification performance on the target data cannot be achieved.

[0007] To address this, when rough classification rules that are considered largely correct for the target data are already known, there is a method that initializes the structure and weights of the neural network so that it has an initial classification function equivalent to those rules (Japanese Patent Laid-Open No. 4-76660), and then performs case learning using a weight variation suppression learning method (Japanese Patent Laid-Open No. 4-242937), which learns from cases while keeping the initial settings as far as possible.

[0008] FIG. 5 is a flowchart for explaining the conventional method. Each process is described below with reference to FIG. 5. The task is to classify target data into one of the categories y₁ and y₂ from its vector elements x₁, x₂, x₃, x₄, x₅, x₆. Suppose the following rough classification rules, considered largely correct, are given.

[0009]
IF x₁ or x₂ THEN y'  (3)
IF (y' and x₃) or x₄ THEN y₁  (4)
IF x₃ or x₅ or x₆ THEN y₂  (5)
Rule (3) states that if feature x₁ or x₂ is present, the target data belongs to subcategory y'. Rule (4) states that if the target data belongs to subcategory y' and x₃ is present, or if x₄ is present, the target data belongs to category y₁. Rule (5) states that if x₃, x₅, or x₆ is present, the target data belongs to category y₂.

[0010] These rough classification rules are converted into the following multistage logical expressions by the classification rule / multistage logical expression conversion process (step 21).

[0011]
y' = x₁ + x₂  (6)
y₁ = (y' · x₃) + x₄  (7)
y₂ = x₃ + x₅ + x₆  (8)
Next, in the multistage logical expression / weight conversion process (step 22), the connection configuration of the neural network that performs classification is set up according to the multistage logical expressions (6), (7), and (8), as shown in FIG. 6. FIG. 6 shows an example of a neural network initialized by the multistage logical expression / weight conversion process.
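Read as Boolean functions, rules (3) to (5) can be checked directly before any network is built. The sketch below is only an illustration of the logic those rules encode, with a logical sum read as OR and a logical product as AND. The function name `classify` and the use of Python booleans for "feature present" are assumptions.

```python
def classify(x1, x2, x3, x4, x5, x6):
    """Evaluate the rough classification rules as Boolean expressions."""
    y_sub = bool(x1 or x2)           # rule (3): subcategory y'
    y1 = bool((y_sub and x3) or x4)  # rule (4): category y1
    y2 = bool(x3 or x5 or x6)        # rule (5): category y2
    return y1, y2
```

For example, `classify(False, False, False, True, False, False)` assigns the data to y₁ only, because x₄ alone satisfies rule (4) and none of x₃, x₅, x₆ is present.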

[0012] As shown in the figure, the connection configuration of the neural network 31 is set up by allocating one input-layer unit to each variable appearing on the right-hand side of the multistage logical expressions and one second-intermediate-layer unit to each product term, with the output layer 12 realizing the sum of all product terms. The weights w_ij are determined according to the method described in Japanese Patent Laid-Open No. 4-76660.

[0013] Next, the learning parameters are set by the parameter setting process (step 23). The learning parameters themselves are described in detail in the explanation of the learning process (step 24). In the parameter setting process, the learning parameters are determined manually by trial and error.

[0014] Next, case learning is performed by the case learning process (step 24). The weight variation suppression learning method used in this process is described below.

[0015] The error backpropagation learning method reduces only the error on the sample data. Consequently, when there are few samples or the sample data is biased, the initial structure changes greatly and the general classification ability for the target data deteriorates. The weight variation suppression learning method therefore changes the weight values so as to decrease the evaluation function E given by the following equation.

[0016]

[Equation 2]

[0017] Here, K is the layer number of the output layer 12, that is, the total number of layers in the neural network. y_j is the desired output value of the j-th output-layer unit for the learning sample in question, that is, the correct classification result. W_ij^(k) is the initial setting of the weight w_ij^(k). f(x, y) is a function that increases as the difference between x and y increases, for example |x − y| or (x − y)². η_ij^(k) is called the suppression rate; it is a learning parameter that determines how much the deviation of each weight contributes to the evaluation function E. The suppression rate η_ij^(k) is normally a single constant for all weights or a constant per layer. The first term on the right-hand side is the squared error between the network's output values and the desired output values when the feature values of the learning sample are input; it is equivalent to the evaluation function used in the error backpropagation learning method. The second term represents the deviation of the weights from their initial settings.

[0018] Thus, the weight variation suppression learning method adjusts the weights so as to reduce the sum of the squared error of the network's output values and the deviation of the weight values from their initial settings. Setting the suppression rate to a large value therefore prevents the weight values from changing, so the rough classification rules given initially are strongly preserved; setting it to a small value means the rough rules are only weakly preserved.
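The two-term structure of the evaluation function can be illustrated with a small sketch. Equation 2 itself is not reproduced in this text, so the exact form here is an assumption: a plain sum of squared output errors plus a per-weight penalty η · f(w, W) with f(x, y) = (x − y)², one of the example functions named in paragraph [0017]. Any constant factor the original equation may include is omitted, and the flat-list layout and the name `evaluation` are hypothetical.

```python
def evaluation(outputs, targets, weights, initial_weights, suppression):
    """Assumed form of E: squared output error plus suppressed weight drift."""
    # First term: squared error between network outputs and desired outputs.
    squared_error = sum((o - y) ** 2 for o, y in zip(outputs, targets))
    # Second term: deviation of each weight from its initial setting,
    # scaled by that weight's suppression rate (f(x, y) = (x - y)^2 assumed).
    deviation = sum(
        eta * (w - w0) ** 2
        for w, w0, eta in zip(weights, initial_weights, suppression)
    )
    return squared_error + deviation
```

With every suppression rate set to zero the second term vanishes and the function reduces to the ordinary backpropagation error, which matches the description of the first term.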

[0019] The change Δw_ij^(k) in the weight w_ij^(k) that reduces the above evaluation function E is given by the following equation.

[0020]

[Equation 3]

[0021] Here, the change Δw_ij^(k) is the adjustment applied to a weight or bias value. The first term on the right-hand side is equivalent to the weight adjustment in ordinary error backpropagation learning. ε is called the learning rate; it is a learning parameter that determines the size of the adjustment in one iteration. When layer k is the output layer (k = K), d_j^(k) is calculated by the following equation.

[0022]

[Equation 4]

[0023] When layer k is an intermediate layer, d_j^(k) is given by the following equation.

[0024]

[Equation 5]

[0025] Using the above equations, case learning is repeated with the same learning parameters, and processing ends when the error falls to or below a given value. Processing also ends when the error does not decrease after a given number of learning iterations, or when learning fails to converge, for example because the error oscillates between decreasing and increasing.
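The effect of the suppression term on a weight update can be seen in a one-weight sketch. Equations 3 to 5 are not reproduced in this text, so the update below is an assumption: plain gradient descent on an error term plus η(w − W)², with the error gradient supplied by the caller. The names `weight_update` and `train_single_weight`, the toy error (w − target)², and the default ε = 0.05 are all hypothetical.

```python
def weight_update(w, w_init, grad_error, eta, epsilon):
    """Return the change in w for one gradient step (assumed update form)."""
    # Gradient of the assumed penalty eta * (w - w_init)^2.
    grad_penalty = 2.0 * eta * (w - w_init)
    return -epsilon * (grad_error + grad_penalty)


def train_single_weight(w_init, target, eta, epsilon=0.05, steps=200):
    """Minimize (w - target)^2 + eta * (w - w_init)^2 by repeated updates."""
    w = w_init
    for _ in range(steps):
        grad_error = 2.0 * (w - target)  # gradient of the toy error term
        w += weight_update(w, w_init, grad_error, eta, epsilon)
    return w
```

In this toy setting the weight settles at (target + η · w_init) / (1 + η), so η = 0 lets it reach the target while η = 4 holds it four fifths of the way back toward its initial value, which is the preservation behavior described in paragraph [0018].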

[0026] Next, the classification function inspection process (step 25) actually inputs test sample data into the neural network and checks whether the correct classification results are output. If the network classifies the test sample data adequately, all processing ends. If an adequate classification function has not been obtained, or if learning has fallen into oscillation, processing returns to the parameter setting process (step 23).

[0027] Because the conventional method initializes the structure and weights of the neural network so that it has an initial classification function equivalent to the rough classification rules, it is superior in convergence speed and convergence. Moreover, if the suppression rate can be set to an appropriate value, case learning can proceed while the initial structure is appropriately preserved even when sample data is insufficient or biased, which prevents loss of generalization ability and allows the initially given rules to be refined.

[0028]

[Problems to Be Solved by the Invention] In the conventional method described above, the values to which the learning parameters are reset are determined by trial and error. The suppression rate could in principle be set to a different value for each connection between units, but because doing so requires a great deal of labor, it is normally reset to a single value for all connections or to a single value for the connections within each layer. Rough classification rules, however, vary in certainty: some are almost certain and need to be strongly preserved, while others are uncertain and need not be. If learning is performed on such rules with the same suppression rate, a large suppression rate hinders the refinement of uncertain rules, whereas a small suppression rate lets certain rules change unnecessarily under the influence of noise, reducing generalization ability.

[0029] Furthermore, if the rough classification rules as a whole are fairly certain, case learning needs to change the connections only slightly, so convergence is better when the learning rate, which determines the amount by which the weights are adjusted at one time, is set to a small value. Conversely, if the rough classification rules are uncertain, it is more efficient to set the learning rate to a large value in the early stage of learning. Because the conventional method does not take the certainty of the rules into account, however, the learning rate cannot be set to an appropriate value in this way, and learning cannot be performed efficiently.

[0030] The present invention has been made in view of the above points. Its object is to solve the above conventional problems and to provide a method for constructing a multilayer neural network for classification that preserves reliable classification rules, refines uncertain classification rules through case learning, improves classification accuracy, and increases learning efficiency.

[0031]

[Means for Solving the Problems] FIG. 1 is a diagram explaining the principle of the present invention. The present invention is a method for constructing a multilayer neural network that classifies target data represented by a feature vector composed of a plurality of vector elements, for use when rough classification rules with certainty factors indicating the certainty of each rule are known. The method comprises: initializing a partial configuration of the neural network based on the rough classification rules (step 1); assigning a certainty factor, based on the rough classification rules, to each connection of the initialized partial configuration (step 2); determining, according to the certainty factor assigned to each connection of the partial configuration, a suppression rate, which is a learning parameter that determines how much the deviation of each weight contributes to the evaluation function of the neural network (step 3); training the neural network using sample data (step 4); and checking whether the neural network has a classification function for the target data, repeating the processing from the determination of the learning parameters onward if it does not and ending the processing if it does (step 5).

[0032] Further, in the present invention, when the learning parameters are determined (step 3), the learning rate, which is the learning parameter that determines the size of the adjustment to a weight or bias in one iteration, is determined according to a representative value of the certainty factors of the rough rules.

[0033]

[Operation] The present invention attaches to each rough classification rule a measure of the rule's certainty, called a certainty factor, and sets the suppression rate, a learning parameter, according to this measure. When the sample data is scarce or biased, the suppression rate is increased according to the certainty factor assigned to each connection of the neural network so that the rough classification rules are preserved; when the error stops decreasing, the suppression rate is reduced. These adjustments preserve reliable classification rules and allow uncertain classification rules to be refined by case learning, which improves the accuracy of classification.

[0034] In addition, the learning rate can be set according to a representative value of the certainty factors of the individual rough classification rules.

[0035]

[Embodiments] Embodiments of the present invention are described below with reference to the drawings.

[0036] First, suppose the following rough classification rules with certainty factors have been written down.
IF x₁ or x₂ THEN y' with C1  (14)
IF (y' and x₃) or x₄ THEN y₁ with C2  (15)
IF x₃ or x₅ or x₆ THEN y₂ with C3  (16)
Here, C1, C2, and C3 are certainty factors. Rule (14) indicates that the rough rule "if feature x₁ or x₂ is present, the target data belongs to subcategory y'" can be trusted to degree C1.

[0037] Likewise, rules (15) and (16) can be trusted to degrees C2 and C3, respectively. The certainty factor CF is an analog value with 0 < CF < 1, with CF = 1 corresponding to the highest certainty and CF = 0 to the lowest.

[0038] FIG. 2 is a flowchart outlining an embodiment of the present invention. In the figure, steps 21, 22, 24, and 25 are the same as in the conventional process shown in FIG. 5.

[0039] First, as in the conventional method, the classification rule / multistage logical expression conversion process (step 21) and the multistage logical expression / weight conversion process (step 22) construct a neural network with a connection configuration such as that shown in FIG. 4.

[0040] FIG. 3 is a diagram for explaining the neural network for which the learning parameters are set in an embodiment of the present invention. In the figure, the partial networks 511, 521, and 531 enclosed by dotted lines correspond to rules (14), (15), and (16), respectively.

[0041] The certainty factor assignment process (step 41) assigns to each connection inside a dotted line in FIG. 3 the certainty factor of the classification rule to which that connection corresponds. Each bias value is assigned the certainty factor of the connections entering its unit. For a unit such as unit 522 in FIG. 3, whose input connections carry different certainty factors C1 and C2, the bias value is assigned the minimum of the certainty factors assigned to those connections. The overall certainty of the rough classification rules is taken to be the mean of the certainty factors of all the rules. Below, the certainty factor assigned to each connection is written C_ij^(k), and the mean certainty factor is written C_mid.
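The two aggregation choices in the assignment step, the minimum for a bias whose inputs carry different certainty factors and the mean for the overall value C_mid, can be sketched as below. The function names and the numeric certainty values in the usage note are hypothetical; the source gives no concrete values for C1, C2, or C3.

```python
from statistics import mean


def bias_certainty(incoming_certainties):
    """Certainty factor for a unit's bias: the minimum over its input connections."""
    return min(incoming_certainties)


def overall_certainty(rule_certainties):
    """Overall certainty C_mid: the mean of the certainty factors of all rules."""
    return mean(rule_certainties)
```

Assuming, purely for illustration, C1 = 0.9, C2 = 0.6, and C3 = 0.3, the bias of a unit like unit 522 would receive `bias_certainty([0.9, 0.6])`, that is 0.6, and C_mid would be about 0.6.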

[0042] Next, the certainty-based learning parameter calculation process (step 42) sets the learning parameters according to the following rules. Setting according to these rules and the case learning process (step 24) are repeated until the neural network acquires the classification function.

[0043] <<Rule 1>> The initial settings are as follows.
H(0)_ij^(k) = 0  (17)
A_0 = −0.2 × C_mid + 0.2  (18)
R = 1  (19)
Here, H(t)_ij^(k) is the value of the suppression rate at the t-th resetting, and A_t is the learning rate. R is a counter that stores the number of times the suppression rate has alternated between increasing and decreasing.
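Rule 1 translates into a few lines of code. The sketch below follows equations (17) to (19) directly. The dictionary keyed by connection identifiers and the name `initial_parameters` are assumptions made for illustration.

```python
def initial_parameters(c_mid, connections):
    """Apply Rule 1: initial suppression rates, learning rate, and counter."""
    suppression = {conn: 0.0 for conn in connections}  # equation (17)
    learning_rate = -0.2 * c_mid + 0.2                 # equation (18)
    counter = 1                                        # equation (19)
    return suppression, learning_rate, counter
```

Fully trusted rules (C_mid = 1) give a learning rate of 0, and fully untrusted rules (C_mid = 0) give 0.2, so the learning rate falls linearly as the overall certainty rises.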

[0044] <<Rule 2>> If the error on the learning sample data does not decrease in the case learning process (step 24), the parameters are reset as follows. First, if rule 3 was applied in the (t−1)-th resetting, the counter R is incremented.
R = R + 1  (20)
Next, the suppression rate is set according to the following equation.
H(t)_ij^(k) = H(t−1)_ij^(k) + C_ij^(k) / R  (21)
<<Rule 3>> If a sufficient classification function has not been obtained in the classification function inspection process (step 25), the parameters are reset as follows. If rule 2 was applied in the (t−1)-th resetting, the counter R is incremented.
R = R + 1  (22)
Next, the suppression rate is set according to the following equation.
H(t)_ij^(k) = H(t−1)_ij^(k) − C_ij^(k) / R  (23)
When the above rules are applied, learning proceeds as follows.
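The arithmetic of rules 2 and 3 can be sketched as a single resetting function. It implements only equations (20) to (23); deciding which rule fires at a given resetting is left to the caller. The dictionary layout and the names `reset_suppression`, `rule`, and `previous_rule` are assumptions for illustration.

```python
def reset_suppression(suppression, certainty, counter, rule, previous_rule):
    """Apply rule 2 (increase) or rule 3 (decrease) to every suppression rate.

    suppression and certainty map a connection identifier to H and C.
    previous_rule is the rule applied at the previous resetting, or None.
    """
    if rule not in (2, 3):
        raise ValueError("rule must be 2 or 3")
    other = 3 if rule == 2 else 2
    if previous_rule == other:
        counter += 1  # equations (20) and (22): direction has reversed
    sign = 1.0 if rule == 2 else -1.0
    updated = {
        conn: h + sign * certainty[conn] / counter  # equations (21) and (23)
        for conn, h in suppression.items()
    }
    return updated, counter
```

Because the step size is C / R and R grows each time the direction reverses, the adjustments shrink as the suppression rate alternates, and connections with higher certainty factors always move in larger steps than those with lower ones.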

[0045] First, rule 1 is applied and the suppression rate is initialized to H(0)_ij^(k) = 0, that is, so that learning is the same as in the conventional error backpropagation learning method. The learning rate is set according to equation (18), so that A_0 = 0 when C_mid = 1, that is, when the rough classification rules can be trusted 100%, and A_0 = 0.2 when C_mid = 0, that is, when the rules cannot be trusted at all. Here, 0.2 is a learning rate commonly used in the error backpropagation learning method.

[0046] Case learning is performed with these settings. When the learning sample data is scarce or biased, however, generalization ability is not obtained even if the error on the sample data decreases, and classification of the test data does not improve. In such a case, rule 2 is applied and the suppression rate is increased according to the certainty factor assigned to each connection so that the rough classification rules are preserved. If the suppression rate has been increased repeatedly, the error may stop decreasing even when case learning is performed. In such a case, rule 3 is applied and the suppression rate is reduced.

[0047] The above processing constructs a neural network that has a classification function.

[0048] The present invention is not limited to the method used in the above embodiment, and various modifications are possible without departing from the scope of the claims. For example, for a unit whose input connections carry different certainty factors, the certainty factor of the bias value may be the mean or another value instead of the minimum. Likewise, the representative value of the certainty of the rough classification rules as a whole may be obtained by a basic statistic other than the mean, such as the median.

[0049] It is also possible to apply a rule that sets the suppression rate according to the certainty factors from the time of initialization.

[0050]

[Effects of the Invention] As described above, when constructing a multilayer neural network for classification, the present invention assigns a certainty factor to each connection based on the rough classification rules initially set in the network and sets the suppression rate for each connection individually according to that certainty factor. Reliable classification rules are therefore preserved, uncertain classification rules can be refined by case learning, and classification accuracy is improved. In addition, the learning rate, which determines the amount of adjustment at one time, can be set to a small value when the rough classification rules as a whole are certain and to a large value when they are uncertain, so learning proceeds efficiently. The invention also reduces the parameter-setting work that was previously done manually by trial and error.

[Brief description of drawings]

[FIG. 1] A diagram showing the principle of the present invention.

[FIG. 2] A flowchart outlining an embodiment of the present invention.

[FIG. 3] A diagram for explaining the neural network for which the learning parameters are set in an embodiment of the present invention.

[FIG. 4] A diagram showing a configuration example of a multilayer neural network.

[FIG. 5] A flowchart for explaining the conventional method.

[FIG. 6] A diagram showing an example of a neural network initialized by the multistage logical expression / weight conversion process.

[Explanation of symbols]

11 Input layer that receives the feature values of the data to be classified
12 Output layer that outputs the classification results
13 Intermediate layer
51, 52, 53 Example of a neural network configured according to the multistage logical expressions
511 Neural network to which rule (14) is assigned
521 Neural network to which rule (15) is assigned
522 Unit whose input connections are assigned different certainty factors
531 Partial neural network to which rule (16) is assigned

Claims

[Claims]

1. A method for constructing a multilayer neural network that classifies target data represented by a feature vector composed of a plurality of vector elements, for use when rough classification rules with certainty factors indicating the certainty of each rule are known, the method comprising: initializing a partial configuration of the neural network based on the rough classification rules; assigning a certainty factor, based on the rough classification rules, to each connection of the initialized partial configuration of the neural network; determining, according to the certainty factor assigned to each connection of the partial configuration, a suppression rate, which is a learning parameter that determines how much the deviation of each weight contributes to the evaluation function of the neural network; training the neural network using sample data; and checking whether the neural network has a classification function for the target data, repeating the processing from the determination of the learning parameters onward if it does not, and ending the processing if it does.

2. The method for constructing a multilayer neural network according to claim 1, wherein, when the learning parameters are determined, the learning rate, which is the learning parameter that determines the size of the adjustment to a weight or bias in one iteration, is determined according to a representative value of the certainty factors of the rough rules.