JPH0313186A

JPH0313186A - Encoder

Info

Publication number: JPH0313186A
Application number: JP1148923A
Authority: JP
Inventors: Noriaki Minami; 憲明南
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 1989-06-12
Filing date: 1989-06-12
Publication date: 1991-01-22

Abstract

PURPOSE: To obtain a highly versatile encoder by providing, at the stage preceding a neural net input layer, a converter whose output characteristic is set to the distribution characteristic of a Laplace or Gaussian distribution. CONSTITUTION: The device comprises a back-propagation neural net 5 with three layers (an input layer 6, an intermediate layer 7, and an output layer 8); a converter 9, formed by an orthogonal transform encoder, whose output characteristic is set to the distribution characteristic of a Laplace or Gaussian distribution; and an inverse converter 10, formed by a decoder with the characteristic inverse to that of the converter 9. Image information is converted by the converter 9 into information having the Laplace or Gaussian distribution characteristic and is then supplied to the input layer. As a result, the information supplied to the input layer changes little even when the image information changes, so a highly versatile neural-net encoder is obtained that can compress and encode a variety of images based on a taught compression rule.

Description

[Detailed description of the invention]

[Industrial application field]

The present invention relates to an encoding device that uses a neural network to compress the image information of still images and moving images.

[Conventional technology]

Conventionally, in the fields of digital transmission and digital recording/reproduction of images, compressing and decoding image information by means of a neural network has been considered.

For example, page 511 of "Joho Shori" (Information Processing), Vol. 29, No. 5 (Information Processing Society of Japan, published May 1988) describes compressing the image information of a still image (a digitized photograph) with an encoding device that uses a back-propagation neural network and consists of the input layer of the net and an intermediate layer having fewer neurons (units) than the input layer.

The same document also describes decoding the compressed information of the intermediate layer with a decoding device formed by the intermediate layer and the output layer, thereby reproducing the image information.

The document further reports results for a three-layer net with a single intermediate layer, in which the input and output layers each have 8 × 8 units (two-dimensional), corresponding to the division elements of the image information, and the intermediate layer has 16 units. Based on a compression rule set by repeated back-propagation teaching, compression and decoding with good reproducibility can be performed when each intermediate-layer unit is quantized to between 8 and 5 bits.

In this case, if the image information as the original signal has 8 bits per division element, the input layer carries 8 × 8 × 8 = 512 bits, whereas the intermediate layer carries (5 to 8) × 16 = 80 to 128 bits, so the encoding device achieves compression to about 1/6.4 to 1/4 of the original size.
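The bit counts in this paragraph can be checked with a few lines of arithmetic. The following is only an illustrative sketch; the constant and function names are assumptions introduced here and do not come from the cited document.

```python
from fractions import Fraction

UNITS_IN = 8 * 8       # input-layer units, one per division element
BITS_PER_ELEMENT = 8   # bits per division element of the original signal
UNITS_HIDDEN = 16      # intermediate-layer units


def compression_ratio(hidden_bits: int) -> Fraction:
    """Ratio of intermediate-layer bits to input-layer bits."""
    input_bits = UNITS_IN * BITS_PER_ELEMENT   # 8 x 8 x 8 = 512
    coded_bits = UNITS_HIDDEN * hidden_bits    # 16 units x quantization bits
    return Fraction(coded_bits, input_bits)


# Quantizing each intermediate unit to 5 bits gives 80/512 (1/6.4);
# quantizing to 8 bits gives 128/512 (1/4).
ratios = {bits: compression_ratio(bits) for bits in (5, 8)}
```

With 5-bit quantization the ratio is 80/512, that is 1/6.4, and with 8-bit quantization it is 128/512, that is 1/4.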

[Problem to be solved by the invention]

In the conventional encoding device, the image information is supplied to the input layer as is, and the compression rule is set so as to fit the specific image that was taught. As the document itself notes, the device therefore cannot be used to compress images that differ from that specific image, so its versatility is low.

An object of the present invention is to provide a highly versatile encoding device using a neural network.

[Means to solve the problem]

To achieve this object, the encoding device of the present invention adopts the technical measure of providing, at the stage preceding the input layer of the neural net, a converter whose output characteristic is set to the distribution characteristic of a Laplace or Gaussian distribution, so that the image information is converted into information having that distribution characteristic and then supplied to the input layer.

[Operation]

In an encoding device configured as described above, even if the image information changes, the output information of the converter is normalized to a Laplace or Gaussian distribution and changes little, so a variety of images can be compressed based on the compression rule that has been set.

[Embodiment]

One embodiment will be described below with reference to FIG. 1.

In FIG. 1, (1) is an input terminal for image information as the original signal, (2) is an encoding device, (3) is a decoding device, and (4) is an output terminal for the reproduced image information.

(5) is a well-known back-propagation neural net with a three-layer structure consisting of an input layer (6), an intermediate layer (7), and an output layer (8), implemented using a computer or the like. (9) is a converter provided before the input layer (6), and (10) is an inverse converter provided after the output layer (8).

The circles in each of the layers (6) to (8) of the neural net (5) indicate neurons, that is, units.

The converter (9) consists of an orthogonal transform encoder that performs an orthogonal transform such as the Hadamard transform or the discrete cosine transform, and its output characteristic is set to the distribution characteristic of a Laplace or Gaussian distribution.

The inverse converter (10) consists of a decoder having the inverse characteristic of the converter (9).

The image information (luminance information) of a still or moving image, such as a digitized photograph, is divided into blocks of, for example, 8 samples (pixel points) × 8 lines and input from the input terminal (1) to the converter (9). The two-dimensional orthogonal transform of the converter (9) converts it into information having the distribution characteristic of a Laplace or Gaussian distribution.
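As a rough illustration of the two-dimensional orthogonal transform applied to one 8 × 8 block, the sketch below builds an orthonormal DCT-II matrix with NumPy and applies it along both axes. The patent names the discrete cosine transform only as one example and gives no formula, so the normalization, the function names, and the random test block are all assumptions made for this sketch.

```python
import numpy as np


def dct_matrix(n: int = 8) -> np.ndarray:
    """Orthonormal DCT-II matrix C, so that C @ C.T is the identity."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    i = np.arange(n).reshape(1, -1)   # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)        # the DC row uses a different scale factor
    return c


def forward_2d(block: np.ndarray, c: np.ndarray) -> np.ndarray:
    """Transform along one axis, then along the other."""
    return c @ block @ c.T


def inverse_2d(coeffs: np.ndarray, c: np.ndarray) -> np.ndarray:
    """Inverse transform; the transpose inverts an orthonormal matrix."""
    return c.T @ coeffs @ c


C = dct_matrix(8)
rng = np.random.default_rng(0)
block = rng.integers(0, 256, size=(8, 8)).astype(float)  # stand-in luminance block
coeffs = forward_2d(block, C)       # what a converter like (9) would output
restored = inverse_2d(coeffs, C)    # what an inverse converter like (10) would output
```

Because the matrix is orthonormal, the inverse transform is just the transpose, which mirrors the relationship between the converter and the inverse converter in the text.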

The output information of the converter (9) is then supplied to the input layer (6) of the neural net (5).

In this net (5), in accordance with the division elements of the input information, the input layer (6) and the output layer (8) each consist of 8 × 8 units, and the intermediate layer (7) consists of 16 units.

During learning before use, the compression rule of the network between the input layer (6) and the intermediate layer (7) and the picture reproduction rule of the network between the intermediate layer (7) and the output layer (8) are set by programming, using either the output information of the converter (9) obtained from the image information of a specific image or Laplace-distributed information.

At this time, the compression rule and the picture reproduction rule are set in accordance with the Laplace- or Gaussian-distributed information corresponding to the output characteristic of the converter (9).
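The learning step can be sketched as training a 64-16-64 network by back propagation to reproduce its own input. Everything beyond the layer sizes is an assumption made for illustration: the tanh intermediate units, the linear output units, the learning rate, the squared-error loss, and the synthetic Laplace-distributed samples standing in for converter output are not specified in the patent.

```python
import numpy as np

rng = np.random.default_rng(0)
N_IN, N_HID = 64, 16   # 8 x 8 input/output units, 16 intermediate units

# Assumed stand-in for converter output: Laplace-distributed training samples.
x = rng.laplace(loc=0.0, scale=0.5, size=(256, N_IN))

w1 = rng.normal(0.0, 0.1, size=(N_IN, N_HID))  # input -> intermediate (compression rule)
b1 = np.zeros(N_HID)
w2 = rng.normal(0.0, 0.1, size=(N_HID, N_IN))  # intermediate -> output (reproduction rule)
b2 = np.zeros(N_IN)


def forward(batch):
    """Return intermediate-layer activations and the reconstructed output."""
    h = np.tanh(batch @ w1 + b1)
    y = h @ w2 + b2
    return h, y


losses = []
lr = 0.05  # assumed learning rate
for epoch in range(200):
    h, y = forward(x)
    losses.append(float(np.mean((y - x) ** 2)))
    err = (y - x) / len(x)               # gradient of the squared error w.r.t. y
    grad_w2 = h.T @ err
    grad_b2 = err.sum(axis=0)
    dh = (err @ w2.T) * (1.0 - h ** 2)   # back-propagate through tanh
    grad_w1 = x.T @ dh
    grad_b1 = dh.sum(axis=0)
    w2 -= lr * grad_w2
    b2 -= lr * grad_b2
    w1 -= lr * grad_w1
    b1 -= lr * grad_b1
```

After training, `w1` and `b1` play the role of the compression rule and `w2` and `b2` the role of the picture reproduction rule. Independent random samples compress poorly, so the sketch shows only the mechanics of the teaching step, not the reproduction quality achievable on real transform coefficients.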

The information supplied to the input layer (6) is compressed according to the compression rule, and the compressed information is transmitted to the decoding device (3) as the output information of the encoding device (2).

In the decoding device (3), the compressed information of the intermediate layer (7) is decoded and reproduced according to the picture reproduction rule, and the reproduced information is supplied from the output layer (8) to the inverse converter (10).

At this time, the reproduced information, like the information supplied to the input layer (6), has the distribution characteristic of a Laplace or Gaussian distribution.

The inverse transform of the inverse converter (10) then restores the reproduced information to the same form as the image information at the input terminal (1), and the result is output from the output terminal (4).

Whatever image supplies the image information to the input terminal (1), the probability distribution of the information supplied to the neural net (5) does not change greatly, so largely appropriate compression and decoding can be performed based on the taught compression rule and picture reproduction rule.

Incidentally, the DC component of the image information follows neither a Laplace nor a Gaussian distribution, and passing this DC component through the neural net (5) leads to a reduced compression ratio and similar problems.

When the reduction in compression ratio caused by the DC component becomes a problem, the DC component information can be output separately from the converter (9) and supplied to the inverse converter (10) without passing through the neural net (5), as shown by the dash-dot line in the figure.
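The DC bypass can be sketched as follows. The sketch assumes that the DC term sits at index [0, 0] of the 8 × 8 coefficient block, as it does for a naturally ordered DCT or Hadamard transform, and `net_stub` is a hypothetical placeholder for the encode and decode path through the neural net (5). None of these names appear in the patent.

```python
import numpy as np


def split_dc(coeffs: np.ndarray):
    """Separate the DC term (assumed at [0, 0]) from the AC coefficients."""
    dc = float(coeffs[0, 0])
    ac = coeffs.copy()
    ac[0, 0] = 0.0     # the net is fed AC coefficients only
    return dc, ac


def merge_dc(dc: float, ac_reconstructed: np.ndarray) -> np.ndarray:
    """Reinsert the bypassed DC term ahead of the inverse converter."""
    out = ac_reconstructed.copy()
    out[0, 0] = dc
    return out


def net_stub(ac: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the neural net path; an identity here."""
    return ac


coeffs = np.arange(64, dtype=float).reshape(8, 8) + 100.0  # made-up coefficient block
dc, ac = split_dc(coeffs)
merged = merge_dc(dc, net_stub(ac))
```

In a real system the DC value would be transmitted alongside the intermediate-layer output, so its cost in bits still counts toward the overall compression ratio.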

In the embodiment above, the converter (9) and the inverse converter (10) are formed by an orthogonal transform encoder and its decoder, but they may instead be formed by a predictive encoder and its decoder, in which case the configurations of the converter (9) and the inverse converter (10) are simplified.
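A predictive encoder of the simplest kind predicts each sample from the previous one and outputs the prediction error. The sketch below uses previous-sample prediction along one line with an initial prediction of zero; this particular predictor, the function names, and the sample values are assumptions, since the patent does not say which prediction scheme is intended.

```python
import numpy as np


def dpcm_encode(line) -> np.ndarray:
    """Previous-sample prediction: output each sample minus its predecessor."""
    samples = np.asarray(line, dtype=np.int64)
    return np.diff(samples, prepend=0)   # the first sample is predicted as 0


def dpcm_decode(residual: np.ndarray) -> np.ndarray:
    """Rebuild the samples by accumulating the prediction errors."""
    return np.cumsum(residual)


line = np.array([120, 122, 121, 125, 130, 130, 128, 127])  # made-up luminance samples
residual = dpcm_encode(line)
restored = dpcm_decode(residual)
```

Apart from the first value, the residuals cluster around zero, which is the kind of peaked distribution the converter is meant to produce, and the encoder needs only a subtraction per sample.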

When compressing and decoding the image information of a moving image, the image information at the input terminal (1) may be intra-field, inter-field, or inter-frame difference information. The converter (9) may also be provided with an ordinary encoder and one or more encoders for the difference information, with the encoders switched adaptively.

The configuration of the neural net is not limited to that of the embodiment. For example, the invention can also be applied when the neural net has auxiliary intermediate layers between the input layer (6) and the intermediate layer (7) and between the intermediate layer (7) and the output layer (8).

[Effect of the invention]

Since the present invention is configured as described above, it provides the following effect.

A converter is provided at the stage preceding the input layer of the neural net, and this converter converts the image information into information having the distribution characteristic of a Laplace or Gaussian distribution before supplying it to the input layer. As a result, the information supplied to the input layer changes little even when the image information changes, a variety of images can be compressed and encoded based on the taught compression rule, and a highly versatile encoding device using a neural net can be provided.

[Brief description of the drawing]

The drawing is a block diagram of one embodiment of the encoding device of the present invention.

(5): neural net, (6): input layer, (7): intermediate layer, (9): converter.

Claims

[Claims]

(1) An encoding device that comprises an input layer of a neural net and an intermediate layer having fewer neurons than the input layer, and that compresses image information and outputs it from the intermediate layer, characterized in that a converter whose output characteristic is set to the distribution characteristic of a Laplace distribution or a Gaussian distribution is provided at a stage preceding the input layer, and the image information is converted into information having said distribution characteristic and supplied to the input layer.