JP4670010B2

JP4670010B2 - Mobile object distribution estimation device, mobile object distribution estimation method, and mobile object distribution estimation program

Info

Publication number: JP4670010B2
Application number: JP2005301231A
Authority: JP
Inventors: 章内海; 大丈山添; 信二鉄谷; 憲一保坂; 誠二猪木
Original assignee: ATR Advanced Telecommunications Research Institute International; National Institute of Information and Communications Technology
Current assignee: ATR Advanced Telecommunications Research Institute International; National Institute of Information and Communications Technology
Priority date: 2005-10-17
Filing date: 2005-10-17
Publication date: 2011-04-13
Anticipated expiration: 2025-10-17
Also published as: JP2007109126A

Description

The present invention relates to a mobile object distribution estimation device, a mobile object distribution estimation method, and a mobile object distribution estimation program that estimate the number and distribution of moving objects using a plurality of imaging means.

Detecting human motion from images requires no worn detection device and works without contact, which reduces the burden on the user. It can also detect unspecified persons, which widens the range of applications. However, the detection process becomes unstable because of scene-dependent factors such as changes in clothing and lighting conditions and occlusion between pedestrians.

On the other hand, conventional multi-viewpoint systems, which use observations captured by multiple cameras, can observe a wide area while reliably reducing occlusion, and have therefore been studied extensively (see, for example, Non-Patent Document 1).
Non-Patent Document 1: Hirokazu Kato et al., "Real-time human tracking using an ellipsoid model", 1999, Transactions of the Information Processing Society of Japan, 40(11), 4087-4096.

However, in such multi-viewpoint systems the number of viewpoints required grows with the size of the detection area and with the distribution of people (the distance between them), and when people come extremely close to one another it is difficult to suppress occlusion completely by adding viewpoints. Moreover, when the position of each person is detected independently using background subtraction or motion vectors, occlusion between people makes it impossible to separate them, and the position detection process breaks down.

An object of the present invention is to provide a mobile object distribution estimation device, a mobile object distribution estimation method, and a mobile object distribution estimation program that can estimate the number and distribution of moving objects with high accuracy even in complex scenes containing many occlusion regions.

The mobile object distribution estimation device according to the present invention comprises: a plurality of imaging means that image moving objects moving in a movement area divided into M blocks (M is an integer of 2 or more), each block being sized so that only one moving object can be located in it; creation means that creates, from the plurality of images captured by the imaging means, a plurality of observed silhouette images representing the silhouettes of the moving objects; and estimation means that fits, to the group of observed silhouette images created by the creation means, the model images that would be projected onto the image coordinate system of each imaging means if a moving object located in each block were imaged from the camera viewpoint of that imaging means, and that estimates, based on a model selection criterion, the number and distribution of moving objects in the movement area that best fit the group of observed silhouette images. The estimation means increases the number of moving objects one at a time from zero and, while varying the value that denotes the block containing a moving object from 1 to M, fits the model images for each number of moving objects to the group of observed silhouette images and calculates the description length of the group. If the calculated description length has decreased, it updates, based on the model images, the block data representing all blocks in which moving objects are present. It takes the number of moving objects and the block data at the point where the description length no longer decreases as the best-fitting number and distribution of moving objects in the movement area.

In the mobile object distribution estimation device according to the present invention, moving objects moving in a movement area divided into M blocks (M is an integer of 2 or more), each sized so that only one moving object can be located in it, are imaged by the plurality of imaging means. A plurality of observed silhouette images representing the silhouettes of the moving objects are created from the captured images. The model images that would be projected onto the image coordinate system of each imaging means if a moving object located in each block were imaged from its camera viewpoint are fitted to the group of observed silhouette images, and the number and distribution of moving objects in the movement area that best fit the group are estimated based on a model selection criterion. The number and distribution of moving objects can therefore be estimated with high accuracy even in complex scenes containing many occlusion regions.

Because the movement area is divided into M blocks (M is an integer of 2 or more), each sized so that only one moving object can be located in it, at most one moving object occupies any block. The movement area is thus divided into blocks of a size suited to estimation, so the number and distribution of moving objects can be estimated with higher accuracy.

The estimation means also increases the number of moving objects one at a time from zero and, while varying the value that denotes the block containing a moving object from 1 to M, fits the model images for each number of moving objects to the group of observed silhouette images and calculates the description length of the group. If the calculated description length has decreased, it updates, based on the model images, the block data representing all blocks in which moving objects are present, and it takes the number of moving objects and the block data at the point where the description length no longer decreases as the best-fitting number and distribution of moving objects in the movement area.

Because the search proceeds incrementally in this way, an increase in the computational cost of estimating the number and distribution of moving objects is avoided, and the number and distribution can be estimated faster.
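The incremental search described above can be sketched as a greedy loop. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `greedy_block_search` is hypothetical, the description length is passed in as an arbitrary callable, and choosing the single best block per round is one plausible reading of "update the block data if the description length has decreased".

```python
from typing import Callable, FrozenSet, Tuple


def greedy_block_search(
    num_blocks: int,
    description_length: Callable[[FrozenSet[int]], float],
) -> Tuple[FrozenSet[int], float]:
    """Add one occupied block at a time while the description length decreases.

    Blocks are numbered 1..num_blocks (1..M in the text). The callable
    `description_length` is a stand-in for fitting the model images of the
    given occupied blocks to the observed silhouette images.
    """
    occupied: FrozenSet[int] = frozenset()      # start with zero moving objects
    best_len = description_length(occupied)

    while len(occupied) < num_blocks:
        candidate, candidate_len = None, best_len
        # Vary the block index from 1 to M for the newly added object.
        for block in range(1, num_blocks + 1):
            if block in occupied:
                continue
            trial_len = description_length(occupied | {block})
            if trial_len < candidate_len:       # description length decreased
                candidate, candidate_len = block, trial_len
        if candidate is None:                   # no block shortens the description
            break
        occupied = occupied | {candidate}       # update the block data
        best_len = candidate_len

    return occupied, best_len
```

Each round evaluates at most M candidate blocks, so the total number of description-length evaluations grows roughly as M times the number of objects found, rather than with the 2^M possible occupancy patterns.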

Preferably, the creation means binarizes the observed silhouette images, and the estimation means fits binarized model images to the group of binarized observed silhouette images and estimates the best-fitting number and distribution of moving objects based on the model selection criterion.

In this case, because binarized observed silhouette images and binarized model images are used to estimate the best-fitting number and distribution of moving objects based on the model selection criterion, the estimation is even faster and the number and distribution of moving objects can be obtained in real time.

Preferably, the estimation means calculates the description length of the group of observed silhouette images by treating as observation error the portions where the pixel values of the binarized observed silhouette images are inverted relative to the pixel values of the binarized model images, and estimates the number and distribution of moving objects that give the shortest calculated description length as the best-fitting number and distribution in the movement area.
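Counting the inverted pixels is straightforward to sketch. The description-length formula itself does not appear in this part of the text, so the two-part code below is purely an illustrative assumption: it charges log2(M) bits to name each occupied block and log2(N) bits to name each inverted pixel. Both function names are hypothetical.

```python
import math
from typing import Sequence


def count_inverted_pixels(observed: Sequence[int], model: Sequence[int]) -> int:
    """Number of pixels whose binary value differs between observation and model."""
    if len(observed) != len(model):
        raise ValueError("observed and model images must have the same size")
    return sum(1 for o, m in zip(observed, model) if o != m)


def two_part_description_length(
    observed: Sequence[int],
    model: Sequence[int],
    num_objects: int,
    num_blocks: int,
) -> float:
    """Illustrative two-part code length in bits (an assumed form, not the source's).

    model part: each of the num_objects occupied blocks costs log2(num_blocks) bits
    error part: each inverted pixel costs log2(number of pixels) bits
    """
    n_pixels = len(observed)
    errors = count_inverted_pixels(observed, model)
    model_bits = num_objects * math.log2(num_blocks)
    error_bits = errors * math.log2(n_pixels)
    return model_bits + error_bits
```

Under this assumed form, adding a moving object is worthwhile only when it removes enough inverted pixels to pay for the extra bits needed to name its block, which is the trade-off a model selection criterion is meant to capture.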

In this case, because the number and distribution of moving objects with the shortest description length are chosen with the observation error taken into account, they can be estimated with higher accuracy.

Preferably, the device further comprises storage means that stores the plurality of model images in advance, and the estimation means sequentially reads the model images stored in the storage means, sequentially fits the read model images to the group of observed silhouette images, and estimates the best-fitting number and distribution of moving objects based on the model selection criterion.

In this case, the model images need not be created during estimation and the model image creation process can be omitted, so the number and distribution of moving objects can be estimated faster.

Preferably, the plurality of model images are created by morphing from a plurality of images of a moving object captured from different camera viewpoints. In this case, an image as seen from an arbitrary camera viewpoint can be created from a small number of images, so the model images can be created easily.

The mobile object distribution estimation method according to the present invention uses a mobile object distribution estimation device comprising a plurality of imaging means, creation means, and estimation means, and includes: a step in which the plurality of imaging means image moving objects moving in a movement area divided into M blocks (M is an integer of 2 or more), each sized so that only one moving object can be located in it; a step in which the creation means creates, from the plurality of images captured by the imaging means, a plurality of observed silhouette images representing the silhouettes of the moving objects; and a step in which the estimation means fits, to the group of observed silhouette images created by the creation means, the model images that would be projected onto the image coordinate system of each imaging means if a moving object located in each block were imaged from the camera viewpoint of that imaging means, and estimates, based on a model selection criterion, the best-fitting number and distribution of moving objects in the movement area. In the estimating step, the estimation means increases the number of moving objects one at a time from zero and, while varying the value that denotes the block containing a moving object from 1 to M, fits the model images for each number of moving objects to the group of observed silhouette images and calculates the description length of the group. If the calculated description length has decreased, it updates, based on the model images, the block data representing all blocks in which moving objects are present, and it takes the number of moving objects and the block data at the point where the description length no longer decreases as the best-fitting number and distribution of moving objects in the movement area.

The mobile object distribution estimation program according to the present invention causes a computer to function as: creation means that creates a plurality of observed silhouette images representing the silhouettes of moving objects from a plurality of images obtained by imaging, with a plurality of imaging means, moving objects moving in a movement area divided into M blocks (M is an integer of 2 or more), each sized so that only one moving object can be located in it; and estimation means that fits, to the group of observed silhouette images created by the creation means, the model images that would be projected onto the image coordinate system of each imaging means if a moving object located in each block were imaged from the camera viewpoint of that imaging means, and estimates, based on a model selection criterion, the best-fitting number and distribution of moving objects in the movement area. The estimation means increases the number of moving objects one at a time from zero and, while varying the value that denotes the block containing a moving object from 1 to M, fits the model images for each number of moving objects to the group of observed silhouette images and calculates the description length of the group. If the calculated description length has decreased, it updates, based on the model images, the block data representing all blocks in which moving objects are present, and it takes the number of moving objects and the block data at the point where the description length no longer decreases as the best-fitting number and distribution of moving objects in the movement area.

According to the present invention, a plurality of observed silhouette images representing the silhouettes of moving objects are created from a plurality of captured images; the model images that would be projected onto the image coordinate system of each imaging means if a moving object located in each block were imaged from its camera viewpoint are fitted to the group of observed silhouette images; and the number and distribution of moving objects in the movement area that best fit the group are estimated based on a model selection criterion. The number and distribution of moving objects can therefore be estimated with high accuracy even in complex scenes containing many occlusion regions.

A mobile object distribution estimation device according to an embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of the mobile object distribution estimation device according to this embodiment. The following description takes as an example the estimation of the distribution and number of pedestrians (humans) as the moving objects, but the moving objects to which the present invention applies are not limited to this example; the invention can be applied in the same way to various moving objects, provided that their shape is known in advance.

The mobile object distribution estimation device shown in FIG. 1 includes K cameras 11 to 1K (K is an integer of 2 or more) and an image processing device 20. The image processing device 20 includes an image acquisition unit 21, a person region extraction unit 22, an observation vector creation unit 23, a description length calculation unit 24, a distribution estimation unit 25, and a shape projection model storage unit 26.

The cameras 11 to 1K are mounted at predetermined locations, such as the ceiling or walls, of a space that serves as the movement area divided into a plurality of blocks, for example a room or a corridor. They capture color images of one or more pedestrians moving in the movement area and output the captured image data to the image acquisition unit 21 via a predetermined network such as a LAN. The internal parameters, positions, and orientations of the cameras 11 to 1K are calibrated manually in advance. The cameras 11 to 1K are not limited to this example; various imaging devices that capture monochrome or infrared images can be used. For estimation accuracy, the number of cameras is preferably three or more, and more preferably four or more, although estimation using only one camera is also possible.

The image processing device 20 is an ordinary computer comprising a ROM (read-only memory), a CPU (central processing unit), a RAM (random access memory), an external storage device, an input device, an image interface device, a display device, and so on. The functions of the blocks listed above are realized by having the CPU and related components execute a mobile object distribution processing program that carries out the mobile object distribution estimation processing described later. The configuration of the image processing device 20 is not limited to this example: the function of each illustrated block may be implemented as a dedicated hardware circuit, and some or all of the functions may be executed on one or more computers.

The shape projection model storage unit 26 stores shape projection models in advance. A shape projection model is a model image obtained by projecting, onto the image coordinate system of each of the cameras 11 to 1K, the image that would result if a single pedestrian located in each block were photographed from the viewpoint of that camera.

The shape projection model is now described in detail, beginning with the camera projection model. FIG. 2 is a schematic diagram for explaining the camera projection model.

In the general pinhole camera model, the following relationship holds between a three-dimensional point (X, Y, Z) in space and its image (x, y) on the image plane. Here, S is the internal parameter matrix representing projection onto the image plane, and R and t represent the three-dimensional orientation and position of the camera, respectively.
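The relationship itself is not reproduced in this text. A standard homogeneous form of the pinhole model that is consistent with the symbols defined above might be written as follows. The scale factor \lambda, and the convention that R and t together map world coordinates into the camera frame, are assumptions rather than details taken from the source.

```latex
\lambda
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
= S \begin{bmatrix} R & t \end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}
```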

As shown in FIG. 2, the observed image can be regarded as the projection of one or more three-dimensional human shapes onto the image plane according to the above equation. The floor, which is the movement area, is divided into M blocks, and whether a person is present in block j is represented by x_j (1 ≤ j ≤ M), as in the following equation.

Each block is set to an area small enough that two or more people cannot occupy it at the same time. The distribution of pedestrians in the movement area can then be expressed as an M-dimensional vector X whose elements are x_j, as in the following equation.
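The two equations referred to above are not reproduced in this text. From the surrounding definitions, a form consistent with them would be the following, where the prime denotes transposition as elsewhere in this description. The exact layout is an assumption.

```latex
x_j =
\begin{cases}
1 & \text{if a person is present in block } j \\
0 & \text{otherwise}
\end{cases}
\qquad (1 \le j \le M),
\qquad
X = [\, x_1, x_2, \ldots, x_M \,]'
```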

Let A^k be the pedestrian silhouette image obtained by observing this scene with camera k, and let N be the number of pixels in the silhouette image. Then the following equation holds.

FIG. 3 is a schematic diagram showing the correspondence between a silhouette image and the distribution of pedestrians on the floor. As shown in FIG. 3, the following description considers the relationship between X and A^k. When a pedestrian is present only in a particular block l, let that distribution be X_l. Then the following equation holds.

For the distribution X_l, whether pixel a^k_{l,i} belongs to the person region depends not only on X_l but also on the pedestrian's physique and posture. The probability that the i-th pixel a^k_{l,i} equals 1 for distribution X_l is therefore denoted p^k_{l,i}.

In a silhouette image that is actually observed, several person images are superimposed. For example, when pedestrians are present at L locations (x_{l1}, ..., x_{lL}), the probability p^k_i that pixel a^k_i of the pedestrian silhouette image A^k observed by camera k equals 1 is the probability that the pixel belongs to the person region of at least one of the pedestrians, and is expressed by the following equation.
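The equation is not reproduced in this text. If the per-pedestrian events are treated as independent (an assumption made here for illustration), the probability that a pixel belongs to at least one pedestrian is one minus the product of the probabilities that it belongs to none of them. A minimal sketch, with the hypothetical function name `union_pixel_probability`:

```python
from typing import Sequence


def union_pixel_probability(per_pedestrian_probs: Sequence[float]) -> float:
    """Probability that a pixel is covered by at least one pedestrian.

    per_pedestrian_probs holds p^k_{l_j,i} for the L occupied blocks l_1..l_L,
    i.e. the probability that pixel i of camera k is 1 when only block l_j
    is occupied. Independence between pedestrians is assumed:
        p^k_i = 1 - prod_j (1 - p^k_{l_j,i})
    """
    prob_uncovered = 1.0
    for p in per_pedestrian_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
        prob_uncovered *= 1.0 - p
    return 1.0 - prob_uncovered
```

With no pedestrians the result is 0, and a pixel that is certainly covered by any one pedestrian is certainly covered overall.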

Assuming that the pixels vary independently of one another, the distribution of the silhouette image observed by camera k can be expressed as the following probability image.

To generate silhouette images for arbitrary camera placements and pedestrian distributions, a model containing three-dimensional geometric information, such as a polygon model, could be used. In this embodiment, as a simpler method, model generation images are created by averaging silhouette images taken from a finite number of observation directions, and these are combined by morphing to create silhouette images for arbitrary directions in advance.

FIG. 4 is a schematic diagram for explaining how the model generation images are acquired. As shown in FIG. 4, one person (male, height 170 cm) standing upright against a blue screen BS is photographed from three observation directions (horizontal, 45 degrees obliquely above, and directly above) while his position and orientation are varied, and 70 to 100 silhouette images are obtained for each direction by chroma keying. The person stays within an area of about 30 cm × 30 cm on the floor (one block) and is rotated so that views from many directions are included. The reference height for calculating the observation direction is 80 cm above the floor.

FIG. 5 shows an example of the average image for each observation direction. In FIG. 5, (a) shows the average image P_0 captured from the horizontal direction (0 degrees), (b) shows the average image P_45 captured from 45 degrees obliquely above, and (c) shows the average image P_90 captured from directly above. To generate a projected image at an arbitrary observation angle from these average images, the observation angle θ is used, calculated by the following equation from the person position XX = [X Y Z]' and the camera position XX_C = [X_C Y_C Z_C]'.

Here Z = 80, and the camera is assumed always to be located above the horizontal. Depending on the observation angle obtained, the required projected silhouette image is generated by morphing from the average images P_0 and P_45 if 0° < θ < 45°, or from the average images P_45 and P_90 if 45° < θ < 90°.
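The equation for θ is not reproduced in this text. One natural reading, taken here as an assumption, is that θ is the elevation of the camera as seen from the reference point on the person (height Z = 80 cm). The sketch below computes that angle and picks the pair of average images to morph. The function names, the handling of the boundary values 0°, 45°, and 90°, and the linear blend weight are all illustrative choices, not details from the source.

```python
import math
from typing import Sequence, Tuple


def observation_angle_deg(person: Sequence[float], camera: Sequence[float]) -> float:
    """Elevation angle (degrees) of the camera seen from the person reference point.

    person = (X, Y, Z) and camera = (X_C, Y_C, Z_C), in the same units (e.g. cm).
    """
    dx = camera[0] - person[0]
    dy = camera[1] - person[1]
    dz = camera[2] - person[2]
    horizontal = math.hypot(dx, dy)          # distance in the floor plane
    return math.degrees(math.atan2(dz, horizontal))


def select_average_images(theta_deg: float) -> Tuple[str, str, float]:
    """Pick the two average images to morph and an assumed linear blend weight."""
    if not 0.0 <= theta_deg <= 90.0:
        raise ValueError("the camera is assumed to be at or above the horizontal")
    if theta_deg <= 45.0:
        return ("P_0", "P_45", theta_deg / 45.0)
    return ("P_45", "P_90", (theta_deg - 45.0) / 45.0)
```

For example, a camera mounted 100 cm above the reference height and 100 cm away horizontally sees the person at about 45 degrees, so the average image P_45 alone would dominate the morph.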

FIG. 6 is a schematic diagram for explaining how a projected silhouette image is generated, and FIG. 7 shows examples of projected silhouette images. As shown in FIG. 6, the image MI generated by morphing is projected onto the camera image plane to obtain a projected silhouette image. A projected silhouette image P can thus be generated for any camera placement and pedestrian distribution. For example, as shown in FIG. 7, projected silhouette images can be generated for θ = 0, 15, 30, 45, 60, 75, and 90 degrees.

Next, the projected silhouette image for X_l at camera k is binarized as in the following equation to obtain the binarized silhouette image B^k_l.

The set B of binarized silhouette images for the K cameras is defined as follows.

In this embodiment, the set B_l of binarized silhouette images is calculated in advance for every block l, and the binarized silhouette images are stored in advance in the shape projection model storage unit 26 as a shape projection model that models the relationship between pedestrian position and silhouette image. The binarized silhouette image B for the case where persons are present at L locations (x_{l1}, ..., x_{lL}) is calculated by the following equation, which is a simplification of equation (6).
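The simplified equation is not reproduced in this text. For binary images, the natural counterpart of "belongs to the person region of at least one pedestrian" is a pixel-wise logical OR of the stored per-block silhouettes. The sketch below assumes that reading; the function name `combine_binary_silhouettes` is hypothetical.

```python
from typing import List, Sequence


def combine_binary_silhouettes(block_silhouettes: Sequence[Sequence[int]]) -> List[int]:
    """Pixel-wise OR of the stored binary silhouettes B_l for the occupied blocks.

    Each silhouette is a flat sequence of 0/1 values of equal length
    (one camera's image, or all K camera images concatenated).
    """
    if not block_silhouettes:
        raise ValueError("at least one silhouette is required")
    n_pixels = len(block_silhouettes[0])
    combined = [0] * n_pixels
    for silhouette in block_silhouettes:
        if len(silhouette) != n_pixels:
            raise ValueError("all silhouettes must have the same number of pixels")
        for i, value in enumerate(silhouette):
            if value:
                combined[i] = 1   # pixel is covered by at least one person
    return combined
```

Because the per-block silhouettes are precomputed, evaluating a candidate distribution needs only this inexpensive combination step and no rendering.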

In this embodiment the shape projection models are created beforehand and stored in the shape projection model storage unit 26, but the invention is not limited to this example and various modifications are possible. For instance, the average images described above may be stored in the shape projection model storage unit 26 and the shape projection models used in the mobile object distribution estimation processing created from them by morphing, or a three-dimensional model of the moving object may be stored in the shape projection model storage unit 26 and the shape projection models created from that three-dimensional model.

再び、図１を参照して、画像取得部２１は、カメラ１１〜１Ｋにより撮影された複数の画像データに撮影時刻情報を付加して人物領域抽出部２２へ出力する。人物領域抽出部２２は、フレーム間差分等の公知の人物領域抽出処理を実行して画像データから人間のシルエットを表す人物領域をシルエット領域として抽出し、シルエット領域を特定したシルエット画像を撮影時刻情報とともに観測ベクトル作成部２３へ出力する。観測ベクトル作成部２３は、シルエット領域を基にシルエット画像を人物領域と背景領域とに区分して２値化し、２値化シルエット画像を観測ベクトル（観測シルエット画像）として記述長算出部２４へ出力する。 Referring again to FIG. 1, the image acquisition unit 21 adds shooting time information to the plurality of image data captured by the cameras 11 to 1K and outputs the image data to the person region extraction unit 22. The person region extraction unit 22 executes a known person region extraction process, such as inter-frame differencing, to extract the person region representing a human silhouette from the image data as a silhouette region, and outputs a silhouette image identifying the silhouette region, together with the shooting time information, to the observation vector creation unit 23. The observation vector creation unit 23 binarizes the silhouette image into a person region and a background region based on the silhouette region, and outputs the binarized silhouette image to the description length calculation unit 24 as an observation vector (observed silhouette image).

ここで、観測ベクトルについて詳細に説明する。まず、カメラ１１〜１Ｋからのカラー画像の入力を仮定し、時刻ｔにカメラｋで得られるＮ画素からなる一枚の入力画像Ｃ_ｔ ^ｋを以下のように表す。 Here, the observation vector will be described in detail. First, assuming that color images are input from the cameras 11 to 1K, one input image C _t ^k composed of N pixels obtained by the camera k at time t is expressed as follows.

各画素ｃ_ｉ，ｔ ^ｋは、下式のように、Ｒ、Ｇ、Ｂの各画素値ｒ_ｉ，ｔ ^ｋ、ｇ_ｉ，ｔ ^ｋ、ｂ_ｉ，ｔ ^ｋから構成される。 Each pixel c_{i,t}^k consists of the R, G, and B pixel values r_{i,t}^k, g_{i,t}^k, and b_{i,t}^k, as in the following equation.

計算を単純化するため、本実施の形態では、人物領域抽出部２２は、ある時間間隔内に大きく変化する画素を人物領域として抽出する。カメラｋに対する入力ベクトルＺ^ｋは、以下のように表される。 In order to simplify the calculation, in the present embodiment, the person area extraction unit 22 extracts pixels that greatly change within a certain time interval as a person area. Input vector Z ^k for the camera k is expressed as follows.

したがって、カメラＫ台分の観測ベクトルの集合Ｚ_ｔは、以下のように定義される。 Therefore, a set Z _t of observation vectors for K cameras is defined as follows.
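As a rough Python illustration of how an observation vector might be formed from color frames, the sketch below marks a pixel as person region (1) when its color changes strongly between two frames. The original expressions did not survive extraction, so the change measure (sum of absolute R, G, B differences), the `threshold` value, and the function names are assumptions chosen for the example.

```python
def binarize_by_difference(previous, current, threshold=60):
    """Return a 0/1 vector: 1 where the pixel changed strongly between frames.

    previous/current: lists of (r, g, b) tuples with the same length N.
    The change measure and threshold are illustrative assumptions.
    """
    if len(previous) != len(current):
        raise ValueError("frames must have the same number of pixels")
    z = []
    for (r0, g0, b0), (r1, g1, b1) in zip(previous, current):
        change = abs(r1 - r0) + abs(g1 - g0) + abs(b1 - b0)
        z.append(1 if change > threshold else 0)
    return z


def observation_set(previous_frames, current_frames, threshold=60):
    """Build one binary observation vector per camera (the set Z_t)."""
    return [binarize_by_difference(p, c, threshold)
            for p, c in zip(previous_frames, current_frames)]


# Toy frames with 3 pixels for a single camera (illustrative only).
prev = [(10, 10, 10), (200, 200, 200), (0, 0, 0)]
cur = [(12, 11, 10), (40, 50, 60), (0, 0, 0)]
z_t = observation_set([prev], [cur])
print(z_t)  # [[0, 1, 0]]
```

The same function covers the real-image experiments described later if `previous` is replaced by an image captured in advance with no one present.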

記述長算出部２４は、歩行者数を順次増大させながら、形状投影モデル記憶部２６から形状投影モデルを読み出し、観測ベクトル作成部２３から出力される観測ベクトル（観測シルエット画像群）に形状投影モデルを当て嵌めて記述長を算出し、観測ベクトルの記述長が最も短くなる形状投影モデルを観測ベクトルに最も適した形状投影モデルとして決定する（記述長最小基準（ＭＤＬ：ＭｉｎｉｍｕｍＤｅｓｃｒｉｐｔｉｏｎＬｅｎｇｔｈ））。 The description length calculation unit 24 reads the shape projection model from the shape projection model storage unit 26 while sequentially increasing the number of pedestrians, and applies the shape projection model to the observation vector (observation silhouette image group) output from the observation vector creation unit 23. Is used to calculate the description length, and the shape projection model with the shortest description length of the observation vector is determined as the shape projection model most suitable for the observation vector (Minimum Description Length (MDL)).

ここで、記述長最小基準とは、観測データを記述する適切なモデルを決定するために提案されたモデル選択基準の一つであり、モデルの複雑さと入力データに関する当て嵌め誤差の双方を考慮してモデルが選択されるものである。この記述長最小基準は、画像処理及びコンピュータビジョンの分野でも、領域分割や曲線当て嵌め等において既に広く用いられており、本実施の形態では、移動領域となる平面（床面）を小ブロックに分割し、形状投影モデルを利用しながら、歩行者画像の解析に記述長最小基準を用い、歩行者の数及び分布を直接推定している。 Here, the minimum description length criterion is one of the model selection criteria proposed to determine an appropriate model for describing observed data, and takes into account both the complexity of the model and the fitting error related to the input data. Model is selected. This minimum description length standard has already been widely used in the field of image processing and computer vision in area division and curve fitting. In this embodiment, the plane (floor surface) that becomes the moving area is made into small blocks. Using the shape projection model, the number and distribution of pedestrians are directly estimated using the minimum description length criterion for pedestrian image analysis.

また、実際には観測誤差により理想的な観測データが得られないため、本実施の形態では、以下のようにして、観測誤差を観測値が反転する現象としてモデル化し、観測誤差を考慮した記述を行っている。 In practice, observation errors prevent ideal observation data from being obtained. In the present embodiment, therefore, the observation error is modeled as a phenomenon in which an observed value is inverted, and the description takes the observation error into account as follows.

まず、人物領域についてＺ_ｊ＝０となる確率をｑ、背景領域についてＺ_ｊ＝１となる確率をｒとすると、カメラｋにおいて観測ベクトルＺ_ｔ ^ｋが観測される確率Ｐ（Ｚ_ｔ ^ｋ｜Ｂ^ｋ）は以下のように計算できる。 First, let q be the probability that Z_j = 0 in the person region and r the probability that Z_j = 1 in the background region. The probability P(Z_t^k | B^k) that the observation vector Z_t^k is observed by camera k can then be calculated as follows.

このとき、モデル分布Ｂのもとで画像集合Ｚ_ｔが観測される確率は、下式で表される。 At this time, the probability that the image set Z _t is observed under the model distribution B is expressed by the following equation.

ここで、ｎ_ａ、ｎ_ｂ、ｎ_ｃ、ｎ_ｄは、それぞれ（ｂ＝１、ｚ＝１）、（ｂ＝１、ｚ＝０）、（ｂ＝０、ｚ＝１）、（ｂ＝０、ｚ＝０）となる画素の総数であり、以下のように表される。 Here, n_a, n_b, n_c, and n_d are the total numbers of pixels for which (b = 1, z = 1), (b = 1, z = 0), (b = 0, z = 1), and (b = 0, z = 0), respectively, and are expressed as follows.

結果として、入力ベクトルＺ_ｔの対数尤度は、以下のように表される。 As a result, the log likelihood of the input vector Z _t is expressed as follows:
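The pixel counts and the log likelihood can be sketched in Python under the assumption that each pixel is inverted independently. With q and r as defined above, a pixel of type (b = 1, z = 1) contributes log(1 − q), (b = 1, z = 0) contributes log q, (b = 0, z = 1) contributes log r, and (b = 0, z = 0) contributes log(1 − r). The function names and the values q = 0.1 and r = 0.05 are illustrative assumptions; the source gives no numerical values.

```python
import math


def confusion_counts(model, observed):
    """Count pixels of each (b, z) type over all cameras.

    model and observed hold one 0/1 list per camera.
    Returns (n_a, n_b, n_c, n_d).
    """
    n_a = n_b = n_c = n_d = 0
    for b_view, z_view in zip(model, observed):
        for b, z in zip(b_view, z_view):
            if b == 1 and z == 1:
                n_a += 1
            elif b == 1 and z == 0:
                n_b += 1
            elif b == 0 and z == 1:
                n_c += 1
            else:
                n_d += 1
    return n_a, n_b, n_c, n_d


def log_likelihood(model, observed, q=0.1, r=0.05):
    """log P(Z | B) when every pixel is inverted independently (assumption)."""
    n_a, n_b, n_c, n_d = confusion_counts(model, observed)
    return (n_a * math.log(1.0 - q) + n_b * math.log(q)
            + n_c * math.log(r) + n_d * math.log(1.0 - r))


model = [[1, 1, 0, 0]]     # one camera, four pixels (toy data)
observed = [[1, 0, 1, 0]]
counts = confusion_counts(model, observed)
ll = log_likelihood(model, observed)
print(counts)  # (1, 1, 1, 1)
```

A model that explains more of the observed silhouette pixels raises n_a and n_d at the expense of n_b and n_c, and therefore raises the log likelihood as long as q and r are below 0.5.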

一方、一般に、記述長最小基準は、Ｆをモデルの自由度、ｎをデータ数とすると、以下のように定式化される。 On the other hand, in general, the minimum description length criterion is formulated as follows, where F is the degree of freedom of the model and n is the number of data.

上式に対応する形状投影モデルの記述長Ｄ（Ｂ，Ｚｔ）は、ｈを歩行者数（ｈ＜＜Ｍ）とすると、式（２０）から、次式により計算できる。 The description length D (B, Zt) of the shape projection model corresponding to the above equation can be calculated from the equation (20) by the following equation, where h is the number of pedestrians (h << M).
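For orientation, the first line below gives the two-part minimum description length criterion in its usual textbook form. The second line is only an assumed reading of how it specializes here, inferred from the inequality tested in step S5 of the flowchart; it is not a transcription of the original numbered expressions, and whether the source's D(B, Z_t) already includes the penalty term is unclear.

```latex
% Two-part MDL in its usual form (F free parameters, n data points):
\mathrm{DL} = -\log P(Z \mid \hat{\theta}) + \frac{F}{2}\,\log n

% Assumed specialization for h pedestrians on M blocks (inferred from step S5):
\mathrm{DL}(B, Z_t) = D(B, Z_t) + \frac{h}{2}\,\log M,
\qquad D(B, Z_t) = -\log P(Z_t \mid B)
```

Under this reading, each additional pedestrian adds one free parameter (a block index), so a new pedestrian is accepted only if the improvement in fit exceeds (1/2) log M.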

なお、本発明に適用される統計的なモデル選択基準は、上記の例に特に限定されず、赤池の情報量基準（ＡＩＣ：ＡｋａｉｋｅＩｎｆｏｒｍａｔｉｏｎＣｒｉｔｅｒｉｏｎ）、ベイズ理論に基づく基準（ＢＩＣ：ＢａｙｅｓｉａｎＩｎｆｏｒｍａｔｉｏｎＣｒｉｔｅｒｉｏｎ）等の他のモデル選択基準を用いてもよい。 Note that the statistical model selection criterion applied to the present invention is not particularly limited to the above example; other model selection criteria, such as Akaike's information criterion (AIC: Akaike Information Criterion) or a criterion based on Bayesian theory (BIC: Bayesian Information Criterion), may be used.

分布推定部２５は、記述長算出部２４により決定された形状投影モデルに対応する各ブロックに人間が存在すると判定し、人間が存在するブロックの数を移動領域上の歩行者の数として推定し、人間が存在するブロックの位置を移動領域上の歩行者の分布として推定する。 The distribution estimation unit 25 determines that there is a human in each block corresponding to the shape projection model determined by the description length calculation unit 24, and estimates the number of blocks in which the human exists as the number of pedestrians on the moving region. The position of a block where a human is present is estimated as the distribution of pedestrians on the moving area.

本実施の形態では、カメラ１１〜１Ｋが撮影手段の一例に相当し、画像取得部２１、人物領域抽出部２２及び観測ベクトル作成部２３が作成手段の一例に相当し、記述長算出部２４及び分布推定部２５が推定手段の一例に相当し、形状投影モデル記憶部２６が記憶手段の一例に相当する。 In the present embodiment, the cameras 11 to 1K correspond to an example of a photographing unit, the image acquisition unit 21, the person region extraction unit 22, and the observation vector creation unit 23 correspond to an example of a creation unit, and a description length calculation unit 24 and The distribution estimation unit 25 corresponds to an example of an estimation unit, and the shape projection model storage unit 26 corresponds to an example of a storage unit.

次に、上記のように構成された移動体分布推定装置による移動体分布推定処理について説明する。図８は、図１に示す移動体分布推定装置による移動体分布推定処理を説明するためのフローチャートである。 Next, the mobile body distribution estimation process performed by the mobile body distribution estimation apparatus configured as described above will be described. FIG. 8 is a flowchart for explaining the mobile body distribution estimation process performed by the mobile body distribution estimation apparatus shown in FIG. 1.

まず、ステップＳ１において、カメラ１１〜１Ｋは、移動領域を移動する歩行者を撮影し、撮影した画像データを画像取得部２１へ出力し、画像取得部２１は、画像データを撮影時刻情報とともに人物領域抽出部２２へ出力する。 First, in step S1, the cameras 11 to 1K photograph pedestrians moving in the moving region and output the captured image data to the image acquisition unit 21, and the image acquisition unit 21 outputs the image data, together with shooting time information, to the person region extraction unit 22.

次に、ステップＳ２において、人物領域抽出部２２は、画像データから人間のシルエットを表す人物領域をシルエット領域として抽出し、シルエット領域を特定したシルエット画像を撮影時刻情報とともに観測ベクトル作成部２３へ出力する。 Next, in step S2, the person region extraction unit 22 extracts the person region representing a human silhouette from the image data as a silhouette region, and outputs a silhouette image identifying the silhouette region, together with the shooting time information, to the observation vector creation unit 23.

次に、ステップＳ３において、観測ベクトル作成部２３は、上記の式（１４）〜（１５）を用いて、シルエット画像を人物領域と背景領域とに２値化し、２値化シルエット画像を観測ベクトルとして記述長算出部２４へ出力する。 Next, in step S3, the observation vector creation unit 23 uses expressions (14) to (15) above to binarize the silhouette image into a person region and a background region, and outputs the binarized silhouette image to the description length calculation unit 24 as an observation vector.

次に、ステップＳ４において、記述長算出部２４は、初期値として、歩行者数を表すｈを０に、最小記述長を表すＤ_ｍｉｎをＤ（０，Ｚ）に設定する。 Next, in step S4, the description length calculation unit 24 sets h representing the number of pedestrians to 0 and D _min representing the minimum description length to D (0, Z) as initial values.

次に、ステップＳ５において、記述長算出部２４は、ｆｌａｇを０に設定し、歩行者が存在するブロックを表すｌ_ｈ＋１を１からＭまで変化させながら、以下の処理を実行する。すなわち、記述長算出部２４は、形状投影モデル記憶部２６から該当する形状投影モデルを読み出し、Ｋ台のカメラ１１〜１Ｋの形状投影モデルの集合Ｂ＝Ｂ_ｌ１∪Ｂ_ｌ２∪…∪Ｂ_ｌｈ＋１を算出し、算出した形状投影モデルの集合ＢとＫ台のカメラ１１〜１Ｋの観測ベクトルＺとを用いて、不等式Ｄ（Ｂ，Ｚ）＋（（ｈ＋１）／２）（ｌｏｇＭ）＜Ｄ_ｍｉｎを満たすか否かを判定し、この不等式を満たす場合、すなわち記述長が低減している場合、Ｄ_ｍｉｎをＤ_ｍｉｎ＝Ｄ（Ｂ，Ｚ）＋（（ｈ＋１）／２）（ｌｏｇＢ）に更新するとともに、ｆｌａｇをｆｌａｇ＝１に更新し、分布推定部２５は、人間が存在する全ブロックを表すａｎｓをａｎｓ＝［ｌ_１ｌ_２ … ｌ_ｈ＋１］’に更新する。 Next, in step S5, the description length calculation unit 24 sets flag to 0 and executes the following processing while varying l_{h+1}, which denotes the block in which a pedestrian is present, from 1 to M. That is, the description length calculation unit 24 reads the corresponding shape projection models from the shape projection model storage unit 26, calculates the set B = B_{l_1} ∪ B_{l_2} ∪ … ∪ B_{l_{h+1}} of shape projection models for the K cameras 11 to 1K, and uses the calculated set B and the observation vectors Z of the K cameras 11 to 1K to determine whether the inequality D(B, Z) + ((h + 1) / 2)(log M) < D_min is satisfied. If the inequality is satisfied, that is, if the description length has decreased, D_min is updated to D_min = D(B, Z) + ((h + 1) / 2)(log B) and flag is updated to flag = 1, and the distribution estimation unit 25 updates ans, which denotes all blocks in which a human is present, to ans = [l_1 l_2 … l_{h+1}]'.

次に、ステップＳ６において、記述長算出部２４は、ｆｌａｇ＝１であるか否か判断し、ｆｌａｇ＝１の場合、すなわち記述長が低減している場合は、ステップＳ７において、歩行者数を表すｈをｈ＝ｈ＋ｆｌａｇに更新することにより歩行者の人数を増加させてステップＳ５以降の処理を繰り返し、ｆｌａｇ＝１でない場合、すなわち記述長が低減していない場合は、処理をステップＳ８へ移行する。 Next, in step S6, the description length calculation unit 24 determines whether flag = 1. If flag = 1, that is, if the description length has decreased, then in step S7 the number of pedestrians is increased by updating h to h = h + flag, and the processing from step S5 onward is repeated. If flag is not 1, that is, if the description length has not decreased, the processing proceeds to step S8.

記述長が低減していない場合、ステップＳ８において、分布推定部２５は、現在のａｎｓ＝［ｌ_１ｌ_２ … ｌ_ｈ＋１］’を基に、ブロックｌ_１、ｌ_２、…、ｌ_ｈ＋１に歩行者が存在すると判断し、歩行者の数及び分布を推定する。その後、ステップＳ１に戻り、次フレームの画像データを取得し、フレーム単位で上記の処理を継続し、入力画像群に適合するシーン内の人物位置を探索する。 If the description length has not decreased, in step S8 the distribution estimation unit 25 determines, based on the current ans = [l_1 l_2 … l_{h+1}]', that pedestrians are present in blocks l_1, l_2, …, l_{h+1}, and estimates the number and distribution of pedestrians. The processing then returns to step S1, the image data of the next frame is acquired, and the above processing continues frame by frame to search for the person positions in the scene that fit the input image group.
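Steps S4 to S8 can be sketched in Python as a greedy search that adds one pedestrian per sweep. This is a minimal sketch under several assumptions, not the patent's implementation: the data term D(B, Z) is taken to be the negative log likelihood under an independent pixel-inversion model with illustrative values q = 0.1 and r = 0.05; the penalty uses log M in both the test and the update (the source text writes log B in the update); blocks are numbered from 0; blocks already selected are skipped; and all function names and the toy data are hypothetical.

```python
import math


def data_cost(model, observed, q=0.1, r=0.05):
    """Assumed data term: negative log likelihood with independent pixel flips."""
    cost = 0.0
    for b_view, z_view in zip(model, observed):
        for b, z in zip(b_view, z_view):
            if b:
                p = (1.0 - q) if z else q    # person pixel: seen / inverted
            else:
                p = r if z else (1.0 - r)    # background pixel: inverted / seen
            cost -= math.log(p)
    return cost


def union(block_models, blocks, pixels_per_camera):
    """Pixel-wise OR of the stored models of the given blocks, per camera."""
    combined = [[0] * n for n in pixels_per_camera]
    for l in blocks:
        for k, view in enumerate(block_models[l]):
            for i, pixel in enumerate(view):
                if pixel:
                    combined[k][i] = 1
    return combined


def estimate_blocks(block_models, observed):
    """Greedy search over occupied blocks; returns (sorted blocks, final D_min)."""
    m = len(block_models)
    sizes = [len(view) for view in observed]
    ans = []
    d_min = data_cost(union(block_models, [], sizes), observed)   # S4: h = 0
    while True:
        best = None                                               # S5: flag = 0
        for l in range(m):
            if l in ans:
                continue
            candidate = ans + [l]
            total = (data_cost(union(block_models, candidate, sizes), observed)
                     + len(candidate) / 2 * math.log(m))
            if total < d_min:                                     # length reduced
                d_min, best = total, l
        if best is None:                                          # S6 -> S8
            return sorted(ans), d_min
        ans.append(best)                                          # S7: h = h + 1


# Toy scene: 4 blocks, 2 cameras, 6 pixels per camera (illustrative only).
block_models = [
    [[1, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 1]],
    [[0, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 0]],
    [[0, 0, 0, 1, 1, 0], [0, 1, 1, 0, 0, 0]],
    [[0, 0, 0, 0, 1, 1], [1, 1, 0, 0, 0, 0]],
]
observed = union(block_models, [0, 2], [6, 6])   # pedestrians in blocks 0 and 2
blocks, d_min = estimate_blocks(block_models, observed)
print(blocks)  # [0, 2]
```

Each sweep costs M model evaluations, so a scene with h pedestrians needs roughly (h + 1) × M evaluations per frame, which is what makes the frame-by-frame search practical compared with testing every subset of blocks.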

上記の処理により、本実施の形態では、撮影された複数の画像データから歩行者のシルエットを表す複数の２値化シルエット画像が観測ベクトルとして作成され、歩行者数を順次増加させながら、２値化された複数の観測ベクトルと、２値化された形状投影モデルの集合とを用いて記述長最小基準が算出され、記述長最小基準が低減しなくなるまで探索が行われ、このときの形状投影モデルに対応するブロックに歩行者が存在すると判断して移動領域上の歩行者の数及び分布が推定されるので、歩行者の位置とシルエット画像との関係をモデル化した形状投影モデルに基づき、観測画像群の尤度が極大となる歩行者分布を推定することができ、多数のオクルージョン領域を持つ複雑なシーンにおいても、歩行者の数及び分布を高精度に且つリアルタイムに推定することができる。 Through the above processing, in the present embodiment, a plurality of binarized silhouette images representing pedestrian silhouettes are created as observation vectors from the plurality of captured image data; the description length under the minimum description length criterion is calculated from the binarized observation vectors and the binarized set of shape projection models while the number of pedestrians is increased one at a time; the search continues until the description length no longer decreases; and pedestrians are judged to be present in the blocks corresponding to the shape projection model at that point, from which the number and distribution of pedestrians on the moving region are estimated. The pedestrian distribution for which the likelihood of the observed image group reaches a local maximum can therefore be estimated based on the shape projection model, which models the relationship between pedestrian position and silhouette image, and the number and distribution of pedestrians can be estimated with high accuracy and in real time even in complex scenes with many occlusion regions.

また、上記の処理では、０人の場合から一人ずつ徐々に増加させながら、各歩行者の位置を確定させている。この場合、カメラ１１〜１Ｋから遠方に位置する複数の歩行者のシルエットにカメラ１１〜１Ｋに近い一人の歩行者のモデルが当てはまる等、処理が局所解に陥る可能性があるが、複数のカメラ１１〜１Ｋを用いた多視点で得られる画像群を同時に評価しているので、上記のように局所解に陥ることを抑制することができる。 In the above processing, the position of each pedestrian is fixed while gradually increasing one by one from the case of zero. In this case, there is a possibility that the processing falls into a local solution, for example, a model of one pedestrian close to the cameras 11 to 1K is applied to the silhouettes of a plurality of pedestrians located far from the cameras 11 to 1K. Since the image groups obtained from multiple viewpoints using 11 to 1K are simultaneously evaluated, it is possible to suppress falling into a local solution as described above.

次に、上記の移動体分布推定装置の推定精度について説明する。まず、形状投影モデルに基づいて得られるシミュレーション画像を用いて歩行者位置分布を推定した。歩行者の存在範囲は、床面上の３３０ｃｍ×３３０ｃｍの矩形領域とし、該当領域を１１×１１のブロックに分割し、１ブロックのサイズは、３０ｃｍ×３０ｃｍとした。歩行者は、これらブロックのいずれかに存在するものとした。 Next, the estimation accuracy of the above mobile body distribution estimation apparatus will be described. First, the pedestrian position distribution was estimated using a simulation image obtained based on the shape projection model. The existence range of the pedestrian was a rectangular area of 330 cm × 330 cm on the floor, the area was divided into 11 × 11 blocks, and the size of one block was 30 cm × 30 cm. Pedestrians were assumed to be in one of these blocks.
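The block layout used in the experiments can be sketched in Python as follows: 330 cm divided by 30 cm gives 11 blocks per side, or M = 121 blocks. The sketch assumes that block centres lie at multiples of 30 cm (0, 30, …, 300), as the example coordinates in this section suggest, and that blocks are indexed row by row from 0; both conventions, and the function names, are assumptions for illustration.

```python
import math

BLOCK_CM = 30                 # block edge length from the experiment
GRID = 330 // BLOCK_CM        # 11 blocks per side, so M = 121


def block_index(x_cm, y_cm):
    """Map a floor position to the index of the block whose centre is nearest."""
    col = math.floor(x_cm / BLOCK_CM + 0.5)
    row = math.floor(y_cm / BLOCK_CM + 0.5)
    if not (0 <= col < GRID and 0 <= row < GRID):
        raise ValueError("position lies outside the 11 x 11 grid")
    return row * GRID + col   # row-major numbering (assumption)


def block_center(index):
    """Return the (x, y) centre of a block in centimetres."""
    row, col = divmod(index, GRID)
    return (col * BLOCK_CM, row * BLOCK_CM)


print(GRID * GRID)            # 121
print(block_index(210, 210))  # 84
print(block_center(84))       # (210, 210)
```

Any position within 15 cm of a centre in both axes maps to the same block, which is the resolution limit of the estimate.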

図９は、カメラの配置例を示す図である。図９に示すように、歩行者ＨＭの存在範囲の周囲に３台の仮想カメラＣＡを設置し、観測像を生成した。このカメラ配置は、後述の実画像観測における環境の制約を考慮し、極力多様な方向から観測可能な配置を選定した。 FIG. 9 is a diagram illustrating an arrangement example of cameras. As shown in FIG. 9, three virtual cameras CA were installed around the existence range of the pedestrian HM to generate an observation image. In consideration of the environmental constraints in actual image observation, which will be described later, this camera arrangement was selected so that observation was possible from various directions as much as possible.

図１０は、３名の歩行者に対する生成画像の例を示す図である。図１０に示す例は、３名の歩行者が座標（ｘ，ｙ）＝（０，６０）、（ｘ，ｙ）＝（２１０，２１０）、（ｘ，ｙ）＝（２４０，１８０）の３箇所に存在する場合の生成画像である。この例では、全てのカメラの観測において、２人目と３人目の歩行者間でオクルージョンが生じている。この画像を入力として、図１に示す移動体分布推定装置により歩行者の位置分布を推定した。 FIG. 10 is a diagram illustrating an example of a generated image for three pedestrians. In the example shown in FIG. 10, three pedestrians have coordinates (x, y) = (0, 60), (x, y) = (210, 210), (x, y) = (240, 180). It is a generated image when it exists in three places. In this example, occlusion occurs between the second and third pedestrians in all camera observations. Using this image as an input, the position distribution of the pedestrian was estimated by the moving body distribution estimation apparatus shown in FIG.

図１１は、歩行者の位置分布を推定した結果を示す図である。図中、◇印は歩行者の配置を、＋印は位置推定結果をそれぞれ示している。図１１から、このときの推定歩行者数は３名であり、推定結果と一致し、また、歩行者の分布についても、全歩行者について正しい推定結果が得られていることがわかる。 FIG. 11 is a diagram illustrating a result of estimating the position distribution of pedestrians. In the figure, ◇ indicates the placement of pedestrians, and + indicates the position estimation result. FIG. 11 shows that the estimated number of pedestrians at this time is three, which is consistent with the estimation result, and that the correct estimation result is obtained for all pedestrians in the pedestrian distribution.

次に、ランダムな位置に仮想的に歩行者を配置し、繰り返し推定を行うことにより図１に示す移動体分布推定装置の安定性を調べた。この例では、歩行者数を１人、２人、３人、４人、５人と変化させながら、上記の例と同様に３３０ｃｍ×３３０ｃｍの範囲内に歩行者を配置し、３台の仮想カメラに対する投影像を各人数１０組ずつ生成した。歩行者の位置は、床面を３０ｃｍ×３０ｃｍに区切った格子点上とし、実画像における歩行者同士の物理的な干渉を考慮し、複数の歩行者が隣り合う格子点に位置しないという条件でランダムに選択し、歩行者の分布を推定した。 Next, the stability of the mobile body distribution estimation apparatus shown in FIG. 1 was examined by virtually placing pedestrians at random positions and performing repeated estimation. In this example, the number of pedestrians was varied over 1, 2, 3, 4, and 5; pedestrians were placed within the 330 cm × 330 cm range as in the above example; and ten sets of projected images for the three virtual cameras were generated for each number of pedestrians. Pedestrian positions were on grid points obtained by dividing the floor into 30 cm × 30 cm cells and were selected at random, subject to the condition that no two pedestrians occupy adjacent grid points (to account for physical interference between pedestrians in real images), and the pedestrian distribution was estimated.
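One way to generate such test placements is sketched below in Python. It assumes that "adjacent" includes diagonal neighbours and uses a simple shuffle-and-accept scheme, which does not sample uniformly over all valid configurations; the source does not describe its sampling procedure, so the function name and these choices are illustrative assumptions.

```python
import random

BLOCK_CM = 30
GRID = 11


def place_pedestrians(count, rng):
    """Pick `count` grid points at random so that no two are adjacent.

    Returns positions in centimetres. Adjacency is taken to include
    diagonal neighbours (assumption).
    """
    if count == 0:
        return []
    cells = [(col, row) for col in range(GRID) for row in range(GRID)]
    rng.shuffle(cells)
    chosen = []
    for col, row in cells:
        # Chebyshev distance > 1 means the cells do not touch, even diagonally.
        if all(max(abs(col - c), abs(row - r)) > 1 for c, r in chosen):
            chosen.append((col, row))
            if len(chosen) == count:
                return [(c * BLOCK_CM, r * BLOCK_CM) for c, r in chosen]
    raise ValueError("could not place that many non-adjacent pedestrians")


positions = place_pedestrians(5, random.Random(0))
print(len(positions))  # 5
```

Passing a seeded `random.Random` makes each of the ten trials per pedestrian count reproducible.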

図１２は、歩行者の数の推定結果を示す図であり、図１３は、図１２に示す推定結果の平均及び標準偏差を示す図であり、図１４は、各推定位置に最も近い真値の位置までの距離を計算したときの歩行者の位置分布推定誤差を示す図であり、図１５は、図１４に示す位置分布推定誤差のＸ位置及びＹ位置の平均及び標準偏差を示す図である。図１２乃至図１５から、一人の場合を除いて、推定人数に誤差は生じているものの、人数及び位置ともに、真値に近い推定値が得られていることがわかる。 FIG. 12 shows the estimation results for the number of pedestrians, FIG. 13 shows the mean and standard deviation of the estimation results shown in FIG. 12, FIG. 14 shows the pedestrian position distribution estimation error when the distance from each estimated position to the nearest true position is calculated, and FIG. 15 shows the mean and standard deviation of the X and Y positions of the position distribution estimation error shown in FIG. 14. FIGS. 12 to 15 show that, although the estimated number of people contains errors except in the one-person case, estimates close to the true values are obtained for both the number of people and their positions.
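The error measure described for FIGS. 14 and 15 can be sketched in Python: each estimated position is paired with the nearest true position, and the X and Y offsets are summarized by their mean and standard deviation. The function names are hypothetical, the population form of the standard deviation is an assumption (the source does not say which form was used), and the estimated positions below are made-up illustrative values, not results from the experiments.

```python
import math


def nearest_truth_errors(estimates, truths):
    """For each estimate, return its (dx, dy) offset from the nearest true position."""
    errors = []
    for ex, ey in estimates:
        tx, ty = min(truths, key=lambda t: math.hypot(t[0] - ex, t[1] - ey))
        errors.append((ex - tx, ey - ty))
    return errors


def mean_and_std(values):
    """Mean and population standard deviation (the choice of form is an assumption)."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


truths = [(0, 60), (210, 210), (240, 180)]        # true positions in cm
estimates = [(0, 60), (210, 240), (240, 180)]     # illustrative estimates only
errors = nearest_truth_errors(estimates, truths)
print(errors)  # [(0, 0), (0, 30), (0, 0)]

y_mean, y_std = mean_and_std([dy for _, dy in errors])
```

Because each estimate is matched independently, two estimates may share the same true position; the measure therefore reports position error only and has to be read alongside the separate count error.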

次に、歩行者数を２人とし、２者間の距離を１〜５ブロック（３０〜１５０ｃｍ）のいずれかに固定した上で、歩行者の位置をランダムに選択し、各１０回の推定を行った。図１６は、歩行者の位置分布推定誤差を示す図であり、図１７は、図１６に示す位置分布推定誤差のＸ位置及びＹ位置の平均及び標準偏差を示す図である。距離が２ブロック以下の場合、多くの観測についてオクルージョンが生じるが、図１６及び図１７から、図１に示す移動体分布推定装置では、その場合も問題なく、位置及び分布が推定できていることがわかる。 Next, with the number of pedestrians set to two and the distance between them fixed at one of 1 to 5 blocks (30 to 150 cm), the pedestrian positions were selected at random and ten estimations were performed for each distance. FIG. 16 shows the pedestrian position distribution estimation error, and FIG. 17 shows the mean and standard deviation of the X and Y positions of the position distribution estimation error shown in FIG. 16. When the distance is 2 blocks or less, occlusion occurs in many observations, but FIGS. 16 and 17 show that the mobile body distribution estimation apparatus shown in FIG. 1 estimates the positions and distribution without difficulty even in that case.

次に、カメラ数と推定精度との関係を示すため、歩行者数５人の場合について、観測に用いるカメラを１〜３台に変化させ、上記と同様に各１０回の推定を行った。図１８は、カメラ数に対する歩行者の数の推定結果を示す図であり、図１９は、図１８に示す推定結果の平均及び標準偏差を示す図であり、図２０は、カメラ数に対する歩行者の位置分布推定誤差を示す図であり、図２１は、図２０に示す位置分布推定誤差のＸ位置及びＹ位置の平均及び標準偏差を示す図である。 Next, to show the relationship between the number of cameras and the estimation accuracy, the number of cameras used for observation was varied from 1 to 3 for the case of five pedestrians, and ten estimations were performed for each setting in the same manner as above. FIG. 18 shows the estimation results for the number of pedestrians against the number of cameras, FIG. 19 shows the mean and standard deviation of the estimation results shown in FIG. 18, FIG. 20 shows the pedestrian position distribution estimation error against the number of cameras, and FIG. 21 shows the mean and standard deviation of the X and Y positions of the position distribution estimation error shown in FIG. 20.

図１８乃至図２１から、この例では、カメラ２台と３台とでは、推定結果に差がないものが、カメラ数を１とすると推定精度が悪化していることがわかる。この結果は、多人数時における多視点観測の効果を示しており、この傾向は、歩行者数の増加に伴い、より顕著になる。 From FIG. 18 to FIG. 21, it can be seen that in this example, there is no difference in the estimation results between the two cameras and the three cameras, but when the number of cameras is 1, the estimation accuracy is deteriorated. This result shows the effect of multi-viewpoint observation when there are many people, and this tendency becomes more remarkable as the number of pedestrians increases.

次に、実際に３台のカメラ１１〜１３を用いて歩行者を撮影した観測画像を用いて推定を行った。この例では、人数を１人、２人、３人、４人、５人と変化させながら、上記の各例と同様に３３０ｃｍ×３３０ｃｍの範囲内に歩行者を配置し、３台のカメラ１１〜１３によって各人数１０組の画像を撮影し、予め無人状態で撮影した画像との差分処理によりシルエット像を得た。 Next, estimation was performed using observed images of pedestrians actually captured with the three cameras 11 to 13. In this example, the number of people was varied over 1, 2, 3, 4, and 5; pedestrians were placed within the 330 cm × 330 cm range as in the above examples; ten sets of images were captured for each number of people with the three cameras 11 to 13; and silhouette images were obtained by differencing against images captured in advance with no one present.

図２２は、観測画像の一例を示す図である。このときのカメラ配置及び歩行者の配置は、図１２及び図１３の例と同一とした。ここで、歩行者の内訳は、男性４名（身長１８３ｃｍ、１７８ｃｍ、１７６ｃｍ、１７０ｃｍ）、女性１名（身長１６５ｃｍ)である。 FIG. 22 shows an example of an observed image. The camera arrangement and the pedestrian arrangement were the same as in the examples of FIGS. 12 and 13. The pedestrians were four men (heights 183 cm, 178 cm, 176 cm, and 170 cm) and one woman (height 165 cm).

図２３は、歩行者の数の推定結果を示す図であり、図２４は、図２３に示す推定結果の平均及び標準偏差を示す図であり、図２５は、各推定位置に最も近い真値の位置までの距離を計算したときの歩行者の位置分布推定誤差を示す図であり、図２６は、図２５に示す位置分布推定誤差のＸ位置及びＹ位置の平均及び標準偏差を示す図である。 FIG. 23 shows the estimation results for the number of pedestrians, FIG. 24 shows the mean and standard deviation of the estimation results shown in FIG. 23, FIG. 25 shows the pedestrian position distribution estimation error when the distance from each estimated position to the nearest true position is calculated, and FIG. 26 shows the mean and standard deviation of the X and Y positions of the position distribution estimation error shown in FIG. 25.

図２３乃至図２６から、実写画像に対して、位置で±８０ｃｍ未満、人数で±１名以下という精度で推定が行えていることがわかる。一方、人数の増加に伴って誤差が増加していることに加え、図１２及び図１３の結果に比べても、推定精度が悪化している。この要因としては、シルエット観測自体のノイズに加え、歩行者間のオクルージョン、姿勢及び体格の違い、着衣形状の違い等が考えられる。したがって、本実施の形態では、歩行者の直立姿勢についてのみモデル化したが、複数の異なる姿勢ごとに形状投影モデルを作成し、歩行者の姿勢を検出した後に、検出した姿勢の形状投影モデルを用いて歩行者の数及び分布を推定するようにしてもよい。 FIGS. 23 to 26 show that, for the real images, estimation is achieved with an accuracy of less than ±80 cm in position and within ±1 person in number. On the other hand, the error increases as the number of people increases, and the estimation accuracy is also worse than in the results of FIGS. 12 and 13. Possible causes include, in addition to noise in the silhouette observation itself, occlusion between pedestrians, differences in posture and physique, and differences in clothing shape. Accordingly, although only the upright posture of a pedestrian is modeled in the present embodiment, shape projection models may be created for each of a plurality of different postures, and, after the posture of a pedestrian is detected, the shape projection model for the detected posture may be used to estimate the number and distribution of pedestrians.

最後に、図１に示す移動体分布推定装置を用いてシーン内を移動する歩行者の位置を連続的に推定した。図２７は、シーン内を移動する歩行者の位置を連続的に推定した結果を示す図であり、図２８は、図２７に示す例に使用した観測画像の一例を示す図である。 Finally, the position of a pedestrian moving in the scene was estimated continuously using the mobile body distribution estimation apparatus shown in FIG. 1. FIG. 27 shows the result of continuously estimating the position of the pedestrian moving in the scene, and FIG. 28 shows an example of an observed image used in the example shown in FIG. 27.

本例では、歩行者に図２７に示す破線上の軌跡を辿るように教示し、観測画像に対して式（１４）に示した差分処理によりシルエット像を得た。図２７中の＋印は、各時刻における推定位置を示している。図２７に示すように、位置推定結果と教示したコースとの平均距離は、８．５９ｃｍ（距離分布の標準偏差１１．６ｃｍ）という精度が得られた。この結果、本実施の形態では、連続的な運動に対しても、その位置を推定できることがわかる。 In this example, the pedestrian was instructed to follow the locus on the broken line shown in FIG. 27, and a silhouette image was obtained by the difference process shown in Expression (14) for the observed image. The + mark in FIG. 27 indicates the estimated position at each time. As shown in FIG. 27, the average distance between the position estimation result and the taught course was 8.59 cm (standard deviation of the distance distribution was 11.6 cm). As a result, in the present embodiment, it is understood that the position can be estimated even for continuous motion.

なお、本例では、教示コースの上辺（Ｙ＝３００）上の８箇所（Ｘ＝３０、６０、…、２４０）以外では、教示コースはブロックの中心と一致していない。本実施の形態では、モデル生成時に仮定したブロック単位で入力画像に尤も適合する歩行者分布が選択されるため、ブロックの大きさよりも細かな位置の変動を推定することはできない。このため、ブロック境界付近に存在する歩行者は、その周辺のいずれかのブロックに存在するものとして推定される。図２７の追跡結果では、教示コースの周辺ブロックに推定位置が得られており、上記状況を示しているといえる。 In this example, the teaching course does not coincide with the center of the block except for eight places (X = 30, 60,..., 240) on the upper side (Y = 300) of the teaching course. In the present embodiment, since a pedestrian distribution that is most suitable for the input image is selected in units of blocks assumed at the time of model generation, it is impossible to estimate position fluctuations that are finer than the block size. For this reason, it is estimated that the pedestrian who exists in the block boundary vicinity exists in one of the surrounding blocks. In the tracking result of FIG. 27, it can be said that the estimated position is obtained in the peripheral blocks of the teaching course, indicating the above situation.

上記の各例により、歩行者の数及び分布の推定における本実施の形態の有効性が確認された。なお、上記の各例では、画像サイズを全て８０×６０画素とし、一般的なパーソナルコンピュータ４台で並列処理した場合の処理速度は、３〜１０フレーム／秒であった。 The above examples confirmed the effectiveness of the present embodiment in estimating the number and distribution of pedestrians. In each of the above examples, the image size is 80 × 60 pixels, and the processing speed when parallel processing is performed by four general personal computers is 3 to 10 frames / second.

本発明の一実施の形態による移動体分布推定装置の構成を示すブロック図である。 FIG. 1 is a block diagram showing the configuration of a mobile body distribution estimation apparatus according to an embodiment of the present invention.
カメラ投影モデルを説明するための模式図である。 FIG. 2 is a schematic diagram for explaining the camera projection model.
シルエット画像と床面上の歩行者の分布との対応関係を示す模式図である。 FIG. 3 is a schematic diagram showing the correspondence between silhouette images and the distribution of pedestrians on the floor.
モデル生成用画像の取得方法を説明するための模式図である。 FIG. 4 is a schematic diagram for explaining how images for model generation are acquired.
各観測方向における平均画像の一例を示す図である。 FIG. 5 shows an example of the average image for each observation direction.
投影シルエット画像の生成方法を説明するための模式図である。 FIG. 6 is a schematic diagram for explaining how projected silhouette images are generated.
投影シルエット画像の一例を示す図である。 FIG. 7 shows an example of a projected silhouette image.
図１に示す移動体分布推定装置による移動体分布推定処理を説明するためのフローチャートである。 FIG. 8 is a flowchart for explaining the mobile body distribution estimation process performed by the mobile body distribution estimation apparatus shown in FIG. 1.
カメラの配置例を示す図である。 FIG. 9 shows an example of the camera arrangement.
３名の歩行者に対する生成画像の例を示す図である。 FIG. 10 shows an example of generated images for three pedestrians.
歩行者の位置分布を推定した結果を示す図である。 FIG. 11 shows the result of estimating the position distribution of pedestrians.
歩行者の数の推定結果を示す図である。 FIG. 12 shows the estimation results for the number of pedestrians.
図１２に示す推定結果の平均及び標準偏差を示す図である。 FIG. 13 shows the mean and standard deviation of the estimation results shown in FIG. 12.
各推定位置に最も近い真値の位置までの距離を計算したときの歩行者の位置分布推定誤差を示す図である。 FIG. 14 shows the pedestrian position distribution estimation error when the distance from each estimated position to the nearest true position is calculated.
図１４に示す位置分布推定誤差のＸ位置及びＹ位置の平均及び標準偏差を示す図である。 FIG. 15 shows the mean and standard deviation of the X and Y positions of the position distribution estimation error shown in FIG. 14.
歩行者の位置分布推定誤差を示す図である。 FIG. 16 shows the pedestrian position distribution estimation error.
図１６に示す位置分布推定誤差のＸ位置及びＹ位置の平均及び標準偏差を示す図である。 FIG. 17 shows the mean and standard deviation of the X and Y positions of the position distribution estimation error shown in FIG. 16.
カメラ数に対する歩行者の数の推定結果を示す図である。 FIG. 18 shows the estimation results for the number of pedestrians against the number of cameras.
図１８に示す推定結果の平均及び標準偏差を示す図である。 FIG. 19 shows the mean and standard deviation of the estimation results shown in FIG. 18.
カメラ数に対する歩行者の位置分布推定誤差を示す図である。 FIG. 20 shows the pedestrian position distribution estimation error against the number of cameras.
図２０に示す位置分布推定誤差のＸ位置及びＹ位置の平均及び標準偏差を示す図である。 FIG. 21 shows the mean and standard deviation of the X and Y positions of the position distribution estimation error shown in FIG. 20.
観測画像の一例を示す図である。 FIG. 22 shows an example of an observed image.
歩行者の数の推定結果を示す図である。 FIG. 23 shows the estimation results for the number of pedestrians.
図２３に示す推定結果の平均及び標準偏差を示す図である。 FIG. 24 shows the mean and standard deviation of the estimation results shown in FIG. 23.
各推定位置に最も近い真値の位置までの距離を計算したときの歩行者の位置分布推定誤差を示す図である。 FIG. 25 shows the pedestrian position distribution estimation error when the distance from each estimated position to the nearest true position is calculated.
図２５に示す位置分布推定誤差のＸ位置及びＹ位置の平均及び標準偏差を示す図である。 FIG. 26 shows the mean and standard deviation of the X and Y positions of the position distribution estimation error shown in FIG. 25.
シーン内を移動する歩行者の位置を連続的に推定した結果を示す図である。 FIG. 27 shows the result of continuously estimating the position of a pedestrian moving in the scene.
図２７に示す例に使用した観測画像の一例を示す図である。 FIG. 28 shows an example of an observed image used in the example shown in FIG. 27.

Explanation of symbols

１１〜１Ｋ カメラ 11 to 1K: camera
２０ 画像処理装置 20: image processing device
２１ 画像取得部 21: image acquisition unit
２２ 人物領域抽出部 22: person region extraction unit
２３ 観測ベクトル作成部 23: observation vector creation unit
２４ 記述長算出部 24: description length calculation unit
２５ 分布推定部 25: distribution estimation unit
２６ 形状投影モデル記憶部 26: shape projection model storage unit

Claims

1. A moving body distribution estimation apparatus comprising: a plurality of photographing means for photographing moving bodies that move in a moving area divided into M blocks (M is an integer of 2 or more), each block being of a size that allows only one moving body to be located in it;
creating means for creating a plurality of observation silhouette images representing silhouettes of the moving bodies from a plurality of photographed images photographed by the plurality of photographing means; and
estimation means for fitting, to the observation silhouette image group created by the creating means, model images projected onto the image coordinate system of each photographing means on the assumption that a moving body located in each block is photographed from the camera viewpoint of each photographing means, and for estimating, based on a model selection criterion, the number and distribution of moving bodies on the moving area most suitable for the observation silhouette image group,
wherein the estimation means increases the number of moving bodies one at a time from zero and, while changing the value representing the block in which a moving body is present from 1 to M, fits the model image for each number of moving bodies to the observation silhouette image group to calculate the description length of the observation silhouette image group, updates, when the calculated description length decreases, block data representing all blocks in which moving bodies are present based on that model image, and estimates the number of moving bodies and the block data at the point where the description length no longer decreases as the most suitable number and distribution of moving bodies on the moving area.
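
The search recited above can be read as a greedy forward selection, and the following is a minimal sketch of that reading, not the claimed implementation. Everything in it is an illustrative assumption: silhouettes are represented as sets of flattened pixel indices pooled over all cameras, `block_silhouettes[m - 1]` stands for the model image of a body in block m, and `description_length` is a toy two-part code (bits to name the occupied blocks plus bits to name each mismatched pixel), since the formula itself is not given in this section.

```python
import math


def description_length(observed, model, n_bodies, n_blocks, n_pixels):
    """Toy two-part code length in bits (an assumption, not the patent's formula)."""
    model_bits = n_bodies * math.log2(n_blocks)   # naming the occupied blocks
    errors = len(observed ^ model)                # pixels where model and observation disagree
    error_bits = errors * math.log2(n_pixels)     # naming the position of each mismatch
    return model_bits + error_bits


def estimate_distribution(observed, block_silhouettes, n_pixels):
    """Add one body at a time while the description length keeps decreasing."""
    n_blocks = len(block_silhouettes)
    occupied = []   # block data: 1-based indices of blocks holding a body
    model = set()   # union of the model silhouettes of the occupied blocks
    best = description_length(observed, model, 0, n_blocks, n_pixels)
    while len(occupied) < n_blocks:
        candidate = None
        for m in range(1, n_blocks + 1):          # block value runs from 1 to M
            if m in occupied:
                continue                          # at most one body per block
            trial = model | block_silhouettes[m - 1]
            dl = description_length(
                observed, trial, len(occupied) + 1, n_blocks, n_pixels
            )
            if dl < best:                         # shorter description: remember this block
                best, candidate = dl, m
        if candidate is None:                     # description length no longer decreases
            break
        occupied.append(candidate)
        model |= block_silhouettes[candidate - 1]
    return sorted(occupied), best


# Hypothetical scene: 16 pixels, 4 blocks, bodies in blocks 1 and 3 plus one noise pixel.
silhouettes = [{0, 1, 2}, {2, 3, 4}, {8, 9, 10}, {12, 13}]
observation = {0, 1, 2, 8, 9, 10, 11}
blocks, bits = estimate_distribution(observation, silhouettes, n_pixels=16)
print(blocks, bits)
```

In this sketch the stopping rule and the estimate of the number of bodies are the same thing: the loop ends at the first body count for which no block lowers the description length, and the length of `occupied` at that point is the estimated count.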

2. The moving body distribution estimation apparatus according to claim 1, wherein the creating means binarizes the observation silhouette images, and
the estimation means fits binarized model images to the binarized observation silhouette image group and estimates the most suitable number and distribution of moving bodies based on the model selection criterion.

3. The moving body distribution estimation apparatus according to claim 2, wherein the estimation means calculates the description length of the observation silhouette image group by treating, as observation errors, the portions where the pixel value of the binarized observation silhouette image is inverted relative to the pixel value of the binarized model image, and estimates the number and distribution of moving bodies giving the shortest description length as the most suitable number and distribution of moving bodies on the moving area.
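
As a hedged illustration of the observation error described here, the sketch below counts inverted pixels between binarized observation and model images over all cameras and feeds the count into a description length. The image layout (one list of 0/1 rows per camera), the function names, and the two-part code length are assumptions made for the example; the section does not state the actual formula.

```python
import math


def inverted_pixel_count(observed_images, model_images):
    """Count pixels whose binary value differs between observation and model, over all cameras."""
    count = 0
    for obs, mod in zip(observed_images, model_images):
        for obs_row, mod_row in zip(obs, mod):
            count += sum(o ^ m for o, m in zip(obs_row, mod_row))  # XOR is 1 where inverted
    return count


def description_length(observed_images, model_images, n_bodies, n_blocks):
    """Assumed two-part code in bits: block indices plus positions of inverted pixels."""
    n_pixels = sum(len(row) for img in observed_images for row in img)
    errors = inverted_pixel_count(observed_images, model_images)
    return n_bodies * math.log2(n_blocks) + errors * math.log2(n_pixels)


# Hypothetical 2 x 4 binary images from two cameras, with one inverted pixel per camera.
observed = [
    [[0, 1, 1, 0], [0, 1, 1, 0]],
    [[1, 1, 0, 0], [1, 1, 0, 0]],
]
model = [
    [[0, 1, 1, 0], [0, 1, 0, 0]],
    [[1, 1, 0, 0], [1, 1, 0, 1]],
]
print(inverted_pixel_count(observed, model))
print(description_length(observed, model, n_bodies=1, n_blocks=4))
```

Under this assumed coding, a candidate with fewer inverted pixels gets a shorter description unless the extra bodies needed to explain those pixels cost more bits than the errors they remove.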

4. The moving body distribution estimation apparatus according to claim 1, further comprising storage means for storing the plurality of model images in advance,
wherein the estimation means sequentially reads out the model images stored in the storage means, sequentially fits the read model images to the observation silhouette image group, and estimates the most suitable number and distribution of moving bodies based on the model selection criterion.

5. The moving body distribution estimation apparatus according to claim 1, wherein the plurality of model images are created by morphing processing from a plurality of images obtained by photographing the moving body from different camera viewpoints.

6. A moving body distribution estimation method using a moving body distribution estimation apparatus comprising a plurality of photographing means, creating means, and estimation means, the method comprising:
a step in which the plurality of photographing means photograph moving bodies that move in a moving area divided into M blocks (M is an integer of 2 or more), each block being of a size that allows only one moving body to be located in it;
a step in which the creating means creates a plurality of observation silhouette images representing silhouettes of the moving bodies from a plurality of photographed images photographed by the plurality of photographing means; and
a step in which the estimation means fits, to the observation silhouette image group created by the creating means, model images projected onto the image coordinate system of each photographing means on the assumption that a moving body located in each block is photographed from the camera viewpoint of each photographing means, and estimates, based on a model selection criterion, the most suitable number and distribution of moving bodies on the moving area, wherein the estimation means increases the number of moving bodies one at a time from zero and, while changing the value representing the block in which a moving body is present from 1 to M, fits the model image for each number of moving bodies to the observation silhouette image group to calculate the description length of the observation silhouette image group, updates, when the calculated description length decreases, block data representing all blocks in which moving bodies are present based on that model image, and estimates the number of moving bodies and the block data at the point where the description length no longer decreases as the most suitable number and distribution of moving bodies on the moving area.

7. A moving body distribution estimation program for causing a computer to function as: creating means for creating a plurality of observation silhouette images representing silhouettes of moving bodies from a plurality of photographed images in which a plurality of photographing means have photographed moving bodies that move in a moving area divided into M blocks (M is an integer of 2 or more), each block being of a size that allows only one moving body to be located in it; and
estimation means for fitting, to the observation silhouette image group created by the creating means, model images projected onto the image coordinate system of each photographing means on the assumption that a moving body located in each block is photographed from the camera viewpoint of each photographing means, and for estimating, based on a model selection criterion, the most suitable number and distribution of moving bodies on the moving area, wherein the estimation means increases the number of moving bodies one at a time from zero and, while changing the value representing the block in which a moving body is present from 1 to M, fits the model image for each number of moving bodies to the observation silhouette image group to calculate the description length of the observation silhouette image group, updates, when the calculated description length decreases, block data representing all blocks in which moving bodies are present based on that model image, and estimates the number of moving bodies and the block data at the point where the description length no longer decreases as the most suitable number and distribution of moving bodies on the moving area.