JPH0478924B2

Info

Publication number: JPH0478924B2
Application number: JP62179088A
Authority: JP
Inventors: Mutsumi Watanabe
Original assignee: Agency of Industrial Science and Technology
Current assignee: National Institute of Advanced Industrial Science and Technology (AIST)
Priority date: 1987-07-20
Filing date: 1987-07-20
Publication date: 1992-12-14
Also published as: JPS6423109A

Description

[Detailed description of the invention]

[Purpose of the invention]

(Field of industrial application)

This invention relates to a stereo vision apparatus that images an object from two directions, views the object three-dimensionally (so-called stereo viewing) using the images from those two directions, and is used to detect, for example, the position of an object such as an obstacle.

(Prior art)

When a robot is made to perform extreme-environment work such as the maintenance and inspection of a nuclear power plant, out of consideration for safety against radiation and the like, the robot must be given an autonomous movement capability so that it does not collide with obstacles in the place where it moves. It must observe the objects in that place three-dimensionally (see, for example, H. P. Moravec, "The Stanford Cart and the CMU Rover", Proceedings of the IEEE, Vol. 71, No. 7, pp. 872-884, 1983), detect obstacles from that observation, and move to its destination while avoiding them.

Many methods have been proposed for obtaining three-dimensional position information of an object in this way, for example methods that use ultrasound and methods that project a laser light pattern. Stereo vision has nevertheless long been studied, because it can be used in general environments and because using images makes more advanced processing possible, namely recognizing and understanding the whole movement environment pictorially or three-dimensionally. In stereo vision, two cameras image the object from two directions, corresponding points are found in the resulting left and right images, and the three-dimensional position of the object and the distance to it are obtained from these corresponding points and the camera arrangement by the principle of triangulation (see, for example, Yasue and Shirai, "Binocular Stereo Vision for Object Recognition", Bulletin of the Electrotechnical Laboratory, Vol. 37, No. 12, pp. 1101-1119, 1974).

In this conventional stereo vision method, feature points (object points) of the objects are detected in both the left and right images, and the principle of triangulation is applied to the detected corresponding points to find the positions of the object points. Because the method matches individual points, however, those points are affected by shadows and noise. Extraction failures, overlapping objects, differences between the fields of view of the left and right cameras and the like cause wrong matches, so the object position cannot be computed accurately and reliability drops sharply.

To address this, another method has been proposed that uses DP matching to perform the matching while maintaining consistency over a wide range (see, for example, Ota, Masai and Ikeda, "Interval Correspondence Method for Stereo Images by Dynamic Programming", Journal of the Institute of Electronics and Communication Engineers, Vol. J68-D, No. 4, pp. 554-561, April 1985).

(Problems to be solved by the invention)

The conventional stereo vision method described above has two problems: necessary image information is easily lost through the effects of shadows and noise, and reliability drops sharply when the matching goes wrong because of overlapping objects, differences in camera field of view and the like. The alternative method has a problem in how its consistency index is defined.

This invention was made in view of the above, and its object is to provide a stereo vision apparatus that can detect the three-dimensional position of an object to be imaged accurately and reliably, without error.

[Structure of the invention]

(Means for solving the problems)

To solve the above problems, this invention is a stereo vision apparatus that images an object from two directions and measures its three-dimensional position by viewing it stereoscopically using first and second images from the two directions. It is characterized by comprising: feature detection means that detects a feature portion from the first image information; first mask window forming means that forms a first mask window of a predetermined area for the feature portion; second mask window forming means that forms a mask window on the second image at a position shifted from the first mask window by the parallax, relative to the first image, that corresponds to the measurement distance; and similarity index calculation means that calculates a similarity index within the corresponding mask windows of the first and second images.

(Operation)

In the stereo vision apparatus of this invention, a feature portion is detected from the first of the images of the object taken from two directions, a first mask window of a predetermined area is formed for the feature portion, a second mask window shifted by the parallax relative to the first mask window is formed, and a similarity index between the first and second images within the corresponding first and second mask windows is calculated.

(Embodiment)

An embodiment of this invention is described below with reference to the drawings.

FIG. 1 is a block diagram showing the configuration of the main parts of a stereo vision apparatus according to one embodiment of this invention. The apparatus is built into a robot that moves within, for example, a nuclear power plant to perform extreme-environment work such as maintenance and inspection. It is used to detect objects such as obstacles in the robot's direction of travel and to control the robot so that it does not collide with them.

The apparatus has two cameras, left and right, mounted on the robot side by side in the horizontal direction. The two cameras capture left and right images of the objects in the direction of travel, that is, of the movement environment, and this movement environment image information is input to the stereo image input unit 3 as indicated by arrow 1 in FIG. 1.
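As a worked illustration of the triangulation principle for two side-by-side cameras, the following sketch assumes an idealized pinhole model with parallel optical axes and equal mounting heights. The symbols follow those of equation (1) of this description (camera spacing a, distance l between lens center and imaging surface, object distance L, pixel/distance conversion ratio ε, parallax S). The lateral position X and the image coordinates u_L and u_R are introduced here only for the derivation, and the numbers in the final comment are hypothetical.

```latex
% Left lens center at the origin, right lens center at lateral offset a.
% A point at lateral position X and distance L projects onto the two
% imaging surfaces at u_L and u_R (measured from each optical axis).
\begin{align}
u_L &= \frac{l\,X}{L}, & u_R &= \frac{l\,(X - a)}{L}, \\
u_L - u_R &= \frac{a\,l}{L}, \\
S &= \varepsilon\,(u_L - u_R) = \frac{\varepsilon\,a\,l}{L}
\quad\Longleftrightarrow\quad
L = \frac{\varepsilon\,a\,l}{S}.
\end{align}
% Hypothetical numbers: a = 0.2 m, l = 0.016 m, eps = 10^5 pixels/m, L = 2 m
% give S = 10^5 * 0.2 * 0.016 / 2 = 160 pixels.
```

In this convention u_R is smaller than u_L, so a given point appears further to the left in the right image, and nearer objects (smaller L) produce a larger parallax.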
Of the pair of left and right original images of the movement environment captured by the two cameras and input to the stereo image input unit 3, one image, for example the left image, is supplied to the feature edge extraction unit 5, which extracts the distinctive edge portions of the captured image as described later. The extracted feature edges are supplied to the mask generation unit 7, which creates a left mask window for each feature edge and stores the part of the left original image corresponding to that window in a left window memory.

The left original image of the feature edge portion stored in the left window memory by the mask generation unit 7 is supplied to the similarity index calculation and determination unit 9. This unit creates a right mask window centered on a position shifted to the left by the parallax relative to the left window, and stores the part of the right original image corresponding to the right mask window in a right window memory. It then calculates a similarity index value for the images in the left and right window memories, determines from that value whether an object such as an obstacle is present, and outputs the result as obstacle presence information, as indicated by arrow 11.

Next, the detailed configuration of the feature edge extraction unit 5, the mask generation unit 7 and the similarity index calculation and determination unit 9 is described with reference to FIGS. 2 to 4, after which the overall operation is described with reference also to FIGS. 5 to 7.

As shown in FIG. 2, the feature edge extraction unit 5 comprises: first to fifth image memories 17, 19, 21, 23 and 25 connected to an image bus 13 and an address bus 15; a TV camera 27, a spatial filtering circuit 29, a binarization circuit 33, a thinning circuit 37 and a noise removal circuit 41, which are likewise connected to the image bus 13 and the address bus 15 and are connected to the image memories through the image bus 13; a spatial filter bank 31 connected to the spatial filtering circuit 29; a threshold memory 35 connected to the binarization circuit 33; a logic filter bank 39 connected to the thinning circuit 37; and a threshold memory 43 connected to the noise removal circuit 41. Although FIG. 2 shows only one first image memory 17, it consists of a first left image memory 17L and a first right image memory 17R corresponding to the two cameras. The first left image memory 17L stores the left original image captured by the left camera, and the first right image memory 17R stores the right original image captured by the right camera.

As shown in FIG. 3, the mask generation unit 7 is connected to the image bus 13 and the address bus 15 and comprises: an edge position detection circuit 45 that detects the positions of the feature edges stored in the fifth image memory 25 by the feature edge extraction unit 5; a feature edge position memory 47 that stores the position information of the detected feature edges; a left mask window creation circuit 49L that creates a mask window, specifically the left mask window, for the feature edge position information stored in the feature edge position memory 47; and a left window memory 51L that reads the part of the left original image corresponding to the left mask window from the first left image memory 17L and stores it.

As shown in FIG. 4, the similarity index calculation and determination unit 9 has a right mask window creation circuit 49R and a right window memory 51R, corresponding respectively to the left mask window creation circuit 49L and the left window memory 51L of the mask generation unit 7. It further has a parallax designation circuit 53 that supplies parallax information S corresponding to the distance L to be measured to the right mask window creation circuit 49R, a similarity index calculation circuit 55 that calculates the similarity index value of the left and right images output from the left window memory 51L and the right window memory 51R, and a similarity index memory 57 that stores the calculated similarity index value.

The operation of the stereo vision apparatus of this embodiment, configured as described above, is now explained with reference to FIGS. 5 to 7.

First, the left and right original images captured by the two cameras are stored, via the stereo image input unit 3, in the first left image memory 17L and the first right image memory 17R respectively. One of them, for example the left original image stored in the first left image memory 17L, is supplied to the spatial filtering circuit 29, which enhances the edges of the objects or movement environment shown in it. The resulting edge-enhanced left image is stored in the second image memory 19.
In the spatial filtering circuit 29, the edge-enhanced left image is obtained by a product-sum operation between the image values supplied from the first left image memory 17L and the numerical sequence held in the spatial filter bank 31 connected to the circuit. The circuit consists of multipliers and adders. As an example, FIG. 5a shows a left original image captured by the left camera, and FIG. 5b shows the edge-enhanced left image obtained by processing it with a 3×3 spatial filter having the weighting coefficients given in the following table.
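The product-sum operation described above can be sketched in a few lines of Python. This is only an illustration, not the behavior of the actual circuit: the weighting coefficients of the patent's table are not reproduced in this text, so `KERNEL` is a hypothetical Laplacian-style stand-in, and taking the absolute value of the response and leaving border pixels at zero are likewise assumptions. The function name `spatial_filter` is also hypothetical.

```python
from typing import List

Image = List[List[float]]

# Hypothetical 3x3 weights (Laplacian-style); the patent's own
# coefficient table is not available in this text.
KERNEL: Image = [
    [-1.0, -1.0, -1.0],
    [-1.0, 8.0, -1.0],
    [-1.0, -1.0, -1.0],
]


def spatial_filter(image: Image, kernel: Image = KERNEL) -> Image:
    """Apply a 3x3 product-sum (multiply-accumulate) filter.

    Border pixels stay at 0 because their 3x3 neighborhood is incomplete.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    # One multiplier/adder step of the product-sum operation.
                    acc += kernel[ky][kx] * image[y + ky - 1][x + kx - 1]
            out[y][x] = abs(acc)  # assumed: keep only the response magnitude
    return out
```

With this stand-in kernel a uniform region gives a zero response, while pixels on either side of a vertical intensity step give a nonzero response, which is the edge-enhancing behavior the paragraph describes.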

[Table]

The image in FIG. 5a shows structures such as walls, columns and ceiling beams inside a building, whereas FIG. 5b shows a line-like edge-enhanced left image in which only the edges of these structures are emphasized. The edge-enhanced left image information corresponding to FIG. 5b is what is stored in the second image memory 19.

The edge-enhanced left image stored in the second image memory 19 is supplied via the image bus 13 to the binarization circuit 33, converted to a binary [0, 1] image on the basis of the threshold stored in advance in the threshold memory 35, and stored in the third image memory 21. The binarization threshold may be a fixed value stored in the threshold memory 35, or it may be set dynamically by measuring the density distribution of the original image. This embodiment uses the fixed value stored in the threshold memory 35. FIG. 5c shows the binary image obtained with a threshold of 100.

The binary image of the edge-enhanced left image stored in the third image memory 21 is supplied via the image bus 13 to the thinning circuit 37, which thins the binary edges according to the masks stored in the logic filter bank 39 until their line width is one pixel. The thinned image is stored in the fourth image memory 23.

The thinned edge image stored in the fourth image memory 23 is then supplied via the image bus 13 to the noise removal circuit 41. The edges are labeled in order of length, short edges at or below the threshold from the threshold memory 43, which can be regarded as noise, are removed, and the remaining long edges are extracted as feature edge image information and stored in the fifth image memory 25. FIG. 5d shows the extracted feature edge image: most of the short edges present in the binary image of FIG. 5c have been removed, and only the long feature edges remain.

The feature edge image information for the left image, stored in the fifth image memory 25 by the feature edge extraction unit 5 as described above, is scanned via the image bus 13 by the edge position detection circuit 45 of the mask generation unit 7 shown in FIG. 3. The start point coordinates (Xs, Ys) and the edge length l of each feature edge are detected and stored in the feature edge position memory 47. These values are supplied to the left mask window creation circuit 49L, which creates a mask window for the feature edge. The part of the left original image stored in the first left image memory 17L that corresponds to the mask window is then stored in the left window memory 51L, which is built from high-speed RAM.

FIGS. 6a and 6b show the mask window and the corresponding original image. For a feature edge 61 with start point coordinates (Xs, Ys) and edge length l, shown as line 61 in FIG. 6a, the mask window 63, shown as rectangle 63, is formed as a rectangular region centered on the point on the feature edge 61 a fixed number of pixels below the start point, with the edge 61 as its central axis. As shown in FIG. 6b, the mask-window-corresponding original image 65, that is, the part of the left original image (stored in the first left image memory 17L) that corresponds to the mask window 63, is read from the first left image memory 17L and stored in the left window memory 51L. Reference numeral 67 in FIG. 6b denotes an object in the movement environment imaged by the left camera, such as the structures mentioned above.

When the left mask window has been formed by the left mask window creation circuit 49L and the left mask-window-corresponding original image 65 has been stored in the left window memory 51L through the processing of the mask generation unit 7 as described above, the right mask window corresponding to the left mask window is formed by the right mask window creation circuit 49R of the similarity index calculation and determination unit 9. The right mask window is centered on a position shifted to the left of the left mask window by the parallax S that corresponds to the distance L to be measured.
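The steps from the edge-enhanced image to the left and right mask windows can be sketched as follows. This is a simplified illustration under several assumptions that are not in the patent: pixels strictly above the threshold become 1, the thinning step is skipped, feature edges are taken to be vertical one-pixel-wide runs, and the window is 3 pixels wide and 10 pixels tall with its center 5 pixels below the edge start. All function names are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]
Edge = Tuple[int, int, int]         # (xs, ys, length): start point and edge length
Window = Tuple[int, int, int, int]  # (left, top, width, height)


def binarize(image: Image, threshold: int = 100) -> Image:
    """Convert to a [0, 1] image; pixels above the threshold become 1 (assumed rule)."""
    return [[1 if value > threshold else 0 for value in row] for row in image]


def feature_edges(binary: Image, min_length: int) -> List[Edge]:
    """Collect vertical runs of 1-pixels, drop short ones, sort longest first.

    Runs whose length is at or below min_length are treated as noise.
    """
    height, width = len(binary), len(binary[0])
    edges: List[Edge] = []
    for x in range(width):
        y = 0
        while y < height:
            if not binary[y][x]:
                y += 1
                continue
            start = y
            while y < height and binary[y][x]:
                y += 1
            if y - start > min_length:
                edges.append((x, start, y - start))
    return sorted(edges, key=lambda edge: edge[2], reverse=True)


def left_mask_window(edge: Edge, offset: int = 5,
                     width: int = 3, height: int = 10) -> Window:
    """Rectangle centered on the edge point `offset` pixels below the start point."""
    xs, ys, length = edge
    if offset >= length:
        raise ValueError("window center must lie on the edge")
    return (xs - width // 2, ys + offset - height // 2, width, height)


def right_mask_window(left: Window, parallax: int) -> Window:
    """The same rectangle shifted `parallax` pixels to the left."""
    x, y, w, h = left
    return (x - parallax, y, w, h)
```

A real implementation would also have to handle slanted edges and windows that fall partly outside the image, which this sketch leaves out.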
The right mask window creation circuit 49R forms the right mask window from the start point coordinates and length of the feature edge of the left mask window, supplied from the feature edge position memory 47 of the mask generation unit 7, and from the parallax information supplied from the parallax designation circuit 53. It then reads the part of the right original image corresponding to the right mask window from the first right image memory 17R and stores it in the right window memory 51R.

The right mask-window-corresponding original image stored in the right window memory 51R and the left mask-window-corresponding original image stored in the left window memory 51L are input to the similarity index calculation circuit 55, which calculates the similarity index value of the two images and stores it in the similarity index memory 57. Comparing the stored similarity index value with a predetermined threshold determines accurately whether an obstacle is present, so that the robot can move without colliding with obstacles.

The parallax S and the distance L are related as follows. With the optical axes of the left and right cameras parallel and their mounting heights equal, let a be the spacing between the left and right cameras, l the distance between the lens center and the imaging surface, and ε the pixel/distance conversion ratio. Then

$$S = \frac{\varepsilon\,a\,l}{L} \qquad (1)$$

and the parallax designation circuit 53 can supply the parallax information S to the right mask window creation circuit 49R on the basis of this relation.

The similarity index value ψab is obtained from the following function, in which the sums run over the pixels (i, j) of the mask window:

$$\psi_{ab} = \sum_{i}\sum_{j}\left\{\frac{A_{ij}-\mu_a}{\sigma_a}-\frac{B_{ij}-\mu_b}{\sigma_b}\right\}^{2} \qquad (2)$$

where

A_ij is the density value of each pixel of the image within the mask window of the left input image,
μ_a is the mean density of the image within the mask window of the left input image,
σ_a is the standard deviation of the image within the mask window of the left input image,
B_ij is the density value of each pixel of the image within the mask window of the right input image,
μ_b is the mean density of the image within the mask window of the right input image, and
σ_b is the standard deviation of the image within the mask window of the right input image.

FIG. 7 plots the similarity index value obtained from equation (2) against the predicted horizontal position of a 3×10-pixel mask window. As the figure shows, the similarity index value is smallest at the position where the left and right mask windows correctly correspond at the location of the obstacle. FIG. 7 shows the characteristics for each of the marked edges in FIG. 5d.

Although the above embodiment describes the detection of obstacles by, for example, a robot, the invention is not limited to this and is widely applicable to obtaining three-dimensional position information of objects, including position, distance and attitude.

[Effects of the invention]

As described above, according to this invention, a feature portion is detected from the first of the images of the object taken from two directions, a first mask window of a predetermined area is formed for the feature portion, a second mask window shifted by the parallax relative to the first mask window is formed, and a similarity index between the first and second images within the corresponding first and second mask windows is calculated. Accurate and appropriate correspondences between the first and second images can therefore be detected on the basis of this similarity index, and the three-dimensional position of the object can be detected accurately and reliably, without error, using the principle of triangulation.
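The similarity index of equation (2) and the horizontal scan plotted in FIG. 7 can be sketched as follows. This is a minimal illustration with hypothetical function names. It assumes that the sums in equation (2) run over corresponding pixels of two equally sized windows and that the standard deviation is the population standard deviation, neither of which the text states explicitly.

```python
import math
from typing import List, Sequence

Image = List[List[float]]


def normalized(window: Image) -> List[float]:
    """Flatten the window and scale it to zero mean and unit standard deviation."""
    values = [float(v) for row in window for v in row]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0.0:
        raise ValueError("window has no contrast, so the index is undefined")
    return [(v - mean) / std for v in values]


def similarity_index(left: Image, right: Image) -> float:
    """Equation (2): sum of squared differences between normalized pixel values."""
    a, b = normalized(left), normalized(right)
    if len(a) != len(b):
        raise ValueError("windows must have the same size")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def crop(image: Image, left: int, top: int, width: int, height: int) -> Image:
    """Copy the pixels under a mask window."""
    return [row[left:left + width] for row in image[top:top + height]]


def scan(left_window: Image, right_image: Image, top: int,
         x_positions: Sequence[int]) -> List[float]:
    """Similarity index for each candidate horizontal position of the right window."""
    height, width = len(left_window), len(left_window[0])
    return [similarity_index(left_window, crop(right_image, x, top, width, height))
            for x in x_positions]
```

Because each window is normalized by its own mean and standard deviation, the index is unaffected by a brightness offset or gain difference between the two cameras, and a smaller value means a better match. The obstacle decision would then compare the value at the position given by equation (1) with a threshold, whose value the text does not specify.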

[Brief explanation of the drawing]

FIG. 1 is a block diagram showing the configuration of a stereo vision apparatus according to one embodiment of this invention. FIG. 2 is a block diagram of the feature edge extraction unit used in the apparatus of FIG. 1. FIG. 3 is a block diagram of the mask generation unit used in the apparatus of FIG. 1. FIG. 4 is a block diagram of the similarity index calculation and determination unit used in the apparatus of FIG. 1. FIG. 5 shows an image captured by the stereo vision apparatus of FIG. 1 and the images that result from processing it. FIG. 6 shows the mask window for a feature edge processed by the mask generation unit of FIG. 3 and the original image corresponding to the mask window. FIG. 7 is a graph of the similarity index values calculated by the similarity index calculation and determination unit of FIG. 4.

3: stereo image input unit, 5: feature edge extraction unit, 7: mask generation unit, 9: similarity index calculation and determination unit, 17: first image memory, 19: second image memory, 21: third image memory, 23: fourth image memory, 25: fifth image memory, 47: feature edge position memory, 49L: left mask window creation circuit, 49R: right mask window creation circuit, 51L: left window memory, 51R: right window memory, 55: similarity index calculation circuit.

Claims

[Scope of Claims]

1. A stereo vision apparatus that images an object to be imaged from two directions and measures its three-dimensional position by viewing the object stereoscopically using first and second images from the two directions, the apparatus comprising: feature detection means for detecting a feature portion from the first image information; first mask window forming means for forming a first mask window of a predetermined area for the feature portion; second mask window forming means for forming a mask window on the second image at a position shifted from the first mask window by a parallax, relative to the first image, corresponding to a measurement distance; and similarity index calculation means for calculating a similarity index within the corresponding mask windows of the first and second images.

2. The stereo vision apparatus according to claim 1, wherein the feature portion is an edge portion that is characteristically present in the first image information.

3. The stereo vision apparatus according to claim 1, wherein the second mask window forming means has parallax calculation means that, with the optical axes of the imaging means that image the object from two directions set parallel to each other, and with L the distance to the object, a the spacing between the imaging means, l the distance between the lens center and the imaging surface of the imaging means, and ε the pixel/distance conversion ratio, calculates from the formula S = εal/L the parallax S by which the second mask window is shifted relative to the first mask window.