1
Generic Object Recognition
The Current State and Future Directions on Generic Object Recognition Akio Chimura : Wada Lab. Akira Miyahara : Nagayama Lab. Takehito Higa : Funaki Lab.
2
Generic Object Recognition
Images used in conventional image recognition: e.g., very sharp images, mostly of straight-edged objects.
Generic images: e.g., garden trees, sky, people.
Apart from frontal images of human faces, almost nothing can yet be recognized with practical accuracy.
A computer cannot automatically determine what appears in a photograph.
3
Object recognition: Identification and Classification
Identification: the computer compares the input image with the object models in a database and outputs whether an object corresponding to a model is present.
Classification: the objects in an image are associated with human-defined classes, and the common name of each object is output.
Generic object recognition means classification.
4
Generic Object Recognition
Classifying and searching large amounts of image data requires manual labor. Increasing the number of recognition targets is difficult. The goal is to process large amounts of image data automatically and tag the images with keywords.
5
Previous Research
6
Early 1990s
Knowledge-based image understanding systems provided a separate recognition method for each object in an image. However, the rules had to be written by hand.
Although the original information is 3D, images had until then been treated only as 2D. In this period, a 3D model of the object was prepared and used for recognition, so the shape of the recognition target had to be completely known.
7
Late 1990s (1)
Features are extracted automatically from prepared training images and used for recognition.
Color histogram: retrieves similar images based on their color distribution. It is still in use today.
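A minimal sketch of color-histogram retrieval follows. It assumes a joint RGB histogram with 4 bins per channel and histogram intersection as the similarity measure; the slide names neither choice, and the pixel lists are hypothetical toy data rather than real images.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Return a normalized joint RGB histogram for (r, g, b) pixels in 0..255."""
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        hist[idx] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical color distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Hypothetical toy "images" given as flat pixel lists.
sky = [(60, 120, 220)] * 80 + [(250, 250, 250)] * 20
sea = [(50, 110, 200)] * 90 + [(240, 240, 240)] * 10
forest = [(30, 140, 40)] * 100

query = color_histogram(sky)
database = {"sea": color_histogram(sea), "forest": color_histogram(forest)}

# Rank database images by similarity to the query, most similar first.
ranked = sorted(
    database,
    key=lambda name: histogram_intersection(query, database[name]),
    reverse=True,
)
```

Because the histogram discards pixel positions, two images with similar colors but different content (sky and sea here) rank as similar, which is one reason the method suits retrieval better than classification.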
8
Late 1990s (2): Eigenface, Parametric eigenspace method
Features are extracted automatically from prepared training images and used for recognition.
These methods use images taken while varying the orientation of the object and the direction of the light source, and estimate the object from the eigenvectors of those images. Training images must be provided.
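The eigenspace idea can be sketched as follows. This is a simplified illustration, not the full parametric eigenspace method: it keeps only the first eigenvector (found by power iteration) and classifies by nearest neighbor along that axis. The 4-pixel "images" and labels are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean_vector(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def top_eigenvector(centered, iterations=200):
    """Power iteration on X^T X (covariance up to scale) without forming the matrix."""
    dim = len(centered[0])
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        proj = [dot(x, v) for x in centered]
        w = [sum(p * x[j] for p, x in zip(proj, centered)) for j in range(dim)]
        norm = math.sqrt(dot(w, w))
        v = [c / norm for c in w]
    return v


# Hypothetical training images: each object photographed under two conditions.
training = [[9, 1, 1, 1], [8, 2, 1, 1], [1, 1, 8, 9], [1, 2, 9, 8]]
labels = ["A", "A", "B", "B"]

mean = mean_vector(training)
centered = [[a - m for a, m in zip(img, mean)] for img in training]
axis = top_eigenvector(centered)
coords = [dot(x, axis) for x in centered]  # training images in the 1-D eigenspace


def classify(image):
    """Project the image into the eigenspace and return the nearest training label."""
    c = dot([a - m for a, m in zip(image, mean)], axis)
    best = min(range(len(coords)), key=lambda i: abs(coords[i] - c))
    return labels[best]
```

For example, `classify([8, 1, 2, 1])` falls next to the "A" images in this toy setup. A practical system would keep several eigenvectors and many more training views.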
9
Late 1990s (3): Applications from image databases (Photobook, Blobworld)
This approach was studied in the multimedia research community, which works on image databases and video data containing large numbers of images. The focus is the similarity of the class (category name) of the objects in an image or of the scene the image represents.
Photobook, Blobworld: the image is split into regions, and the features of each region are associated with nouns. The correspondence between regions and words has to be specified by hand.
10
The Current State and Future Directions on Generic Object Recognition
Chapter.3
11
New techniques for the 21st century
As computers became faster, a variety of techniques became practical. There are two typical approaches: methods based on regions and methods based on local features.
12
A method based on regions (1/2)
A method for automatically assigning keywords to images. It does not classify each image into a single class; instead, it adds multiple keywords to the image.
Word-image translation model: using the Corel Image Database, regions are annotated automatically.
13
A method based on regions (2/2)
1. Cluster the regions of the training images.
2. Determine in advance the probability of occurrence of each word for each cluster.
3. For each region of a test image, output the word of the nearest cluster.
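The three steps can be sketched as below. This is a simplified illustration under several assumptions: regions are hypothetical 2-D feature vectors, clustering is plain k-means with fixed initial centroids, and word probabilities are estimated by counting the keywords of every image whose regions fall into a cluster.

```python
from collections import Counter, defaultdict


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    return min(range(len(centroids)), key=lambda k: sq_dist(point, centroids[k]))


def kmeans(points, init, iterations=10):
    centroids = [tuple(c) for c in init]
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids


# Hypothetical training set: (region feature vectors, keywords of the whole image).
training = [
    ([(0.1, 0.9), (0.8, 0.2)], ["sky", "grass"]),
    ([(0.15, 0.85), (0.2, 0.9)], ["sky"]),
    ([(0.9, 0.1), (0.85, 0.25)], ["grass"]),
]

# Step 1: cluster all regions.
all_regions = [r for regions, _ in training for r in regions]
centroids = kmeans(all_regions, init=[all_regions[0], all_regions[1]])

# Step 2: estimate P(word | cluster) by counting co-occurrences.
word_counts = defaultdict(Counter)
for regions, words in training:
    for r in regions:
        word_counts[nearest(r, centroids)].update(words)
word_probs = {
    k: {w: c / sum(cnt.values()) for w, c in cnt.items()}
    for k, cnt in word_counts.items()
}


# Step 3: annotate a test region with the most probable word of its nearest cluster.
def annotate(region):
    probs = word_probs[nearest(region, centroids)]
    return max(probs, key=probs.get)
```

Because keywords are attached to whole images, a cluster also collects words that belong to other regions of the same image; the probabilities separate them only when enough training images are available.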
14
A method based on local features (1/3)
C. Schmid et al. proposed a method that compares images using a combination of local features. It solves problems of earlier object recognition: the training object does not need to be cut out of the image, and occlusion can be tolerated.
15
A method based on local features (2/3)
1. An interest point detector picks out about 100 characteristic points from the image.
2. A feature vector, such as the pixel values around each point, is computed for every point, and the image is represented by the set of these vectors.
Figure: detection result of the Kadir-Brady detector on an image.
16
A method based on local features (3/3)
3. For each feature vector of the test image, search for closely matching feature vectors in the models. Voting also takes the relative positions of the feature points into account.
4. The model that receives the most votes is regarded as the match for the image.
Figure: (b) model of relative positions, (c) local patterns, (d) result.
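A rough sketch of matching with voting is shown below. It assumes that relative position is handled by voting on a quantized translation offset between matched points, which is a simplification chosen for illustration and not necessarily the scheme of Schmid et al. The descriptors, coordinates, and thresholds are hypothetical.

```python
from collections import Counter


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def recognize(test_features, models, max_dist=0.05, cell=10):
    """Features are (x, y, descriptor) tuples. Returns (best model name, votes)."""
    votes = Counter()
    for tx, ty, tdesc in test_features:
        # Nearest model feature over all models, by descriptor distance.
        name, (mx, my, mdesc) = min(
            ((n, f) for n, feats in models.items() for f in feats),
            key=lambda item: sq_dist(tdesc, item[1][2]),
        )
        if sq_dist(tdesc, mdesc) > max_dist:
            continue  # no convincing match, e.g. background clutter
        # Matches that agree on the same offset support the same object placement.
        offset = (round((tx - mx) / cell), round((ty - my) / cell))
        votes[(name, offset)] += 1
    if not votes:
        return None, 0
    (name, _), count = votes.most_common(1)[0]
    return name, count


# Hypothetical models: interest points with 2-D descriptors.
models = {
    "cup": [(10, 10, (0.1, 0.9)), (30, 10, (0.8, 0.2)), (20, 30, (0.5, 0.5))],
    "car": [(10, 10, (0.9, 0.9)), (40, 20, (0.2, 0.1)), (25, 35, (0.6, 0.05))],
}

# Test image: the cup shifted by (50, 20), one point occluded, one clutter point.
test_image = [(60, 30, (0.12, 0.88)), (80, 30, (0.79, 0.22)), (5, 5, (0.45, 0.75))]
result = recognize(test_image, models)
```

The occluded point simply casts no vote, which illustrates why this family of methods tolerates partial occlusion.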
17
Recent research topics: BoK (1/3)
A method that ignores the relative positions of local regions. It is called BoF (Bag of Features) or BoK (Bag of Keypoints).
18
Recent research topics: BoK (2/3)
1. Extract SURF features from the images.
2. Cluster all the feature points obtained.
3. Build a histogram of visual words, using the centroids of the n clusters as the visual words.
The image is thereby converted into a feature vector.
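The pipeline can be sketched as follows. Real SURF descriptors are 64-dimensional and come from a feature extractor; here hypothetical 2-D descriptors stand in for them, and the vocabulary is built with a plain k-means using fixed initial centroids.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    return min(range(len(centroids)), key=lambda k: sq_dist(point, centroids[k]))


def kmeans(points, init, iterations=10):
    centroids = [tuple(c) for c in init]
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids


def bok_histogram(descriptors, vocabulary):
    """Count the nearest visual word of each descriptor and normalize the counts."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        hist[nearest(d, vocabulary)] += 1
    total = sum(hist)
    return [h / total for h in hist]


# Step 1 (assumed done): hypothetical local descriptors per image.
images = {
    "img_a": [(0.1, 0.1), (0.12, 0.08), (0.9, 0.9)],
    "img_b": [(0.88, 0.92), (0.5, 0.1), (0.52, 0.12), (0.48, 0.1)],
}

# Step 2: cluster all descriptors; the n = 3 centroids are the visual words.
all_descriptors = [d for descs in images.values() for d in descs]
vocabulary = kmeans(all_descriptors, init=[(0.1, 0.1), (0.9, 0.9), (0.5, 0.1)])

# Step 3: each image becomes a fixed-length histogram over the visual words.
vectors = {name: bok_histogram(descs, vocabulary) for name, descs in images.items()}
```

The resulting vectors have the same length regardless of how many feature points an image has, so they can be passed to any standard classifier.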
19
Recent research topics:BoK(3/3)
[Figure: panels (1), (2), (3)]
20
Recent research topics: context (1/3)
A real-world photo contains several objects, and these objects are related in some way. For example, a rod-shaped object standing in a green field is likely to be a tree. If there are buildings around it, it is more likely to be a pole.
21
Recent research topics: context (2/3)
Recent research represents context with probabilistic models and builds these models by learning. To handle context probabilistically, a Bayesian network or graphical model is generally introduced. These models are very complex.
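As a toy illustration of how context can change a decision, the sketch below applies Bayes' rule to the tree and pole example. It is a two-variable model, far simpler than the Bayesian networks used in actual research, and every probability in it is a made-up value chosen only for illustration.

```python
def posterior(appearance_likelihood, context_prior):
    """P(class | appearance, context) is proportional to
    P(appearance | class) * P(class | context)."""
    unnormalized = {
        c: appearance_likelihood[c] * context_prior[c] for c in appearance_likelihood
    }
    z = sum(unnormalized.values())
    return {c: v / z for c, v in unnormalized.items()}


# Hypothetical: a rod-shaped region looks equally like a tree and a pole.
appearance = {"tree": 0.5, "pole": 0.5}

# Hypothetical class priors given the surrounding scene.
context_priors = {
    "green field": {"tree": 0.8, "pole": 0.2},
    "buildings": {"tree": 0.3, "pole": 0.7},
}

in_field = posterior(appearance, context_priors["green field"])
near_buildings = posterior(appearance, context_priors["buildings"])
```

With ambiguous appearance, the surrounding scene alone decides the label: the same region is most likely a tree in the field and a pole near buildings.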
22
Recent research topics: context (3/3)
In addition, each object appearing in the scene has to be recognized. Many current studies restrict the target scenes, and this line of research has only just begun.
23
4.1 Evaluation data set
A unified evaluation is becoming important for comparing techniques, so a standard evaluation data set is required.
The current standard evaluation image set is "Caltech101" from the California Institute of Technology. As the name suggests, Caltech101 consists of images of 101 kinds of objects. It contains 9144 images, collected mainly by hand using Google Image Search.
24
4.1 Evaluation data set
A group at UC Berkeley has achieved the best result on Caltech101, a recognition rate of 66.23%. The recognition rate of the Constellation model is 17.7%.
25
4.2 Benchmark workshop
There are workshops in which participants compete on the recognition rate of generic object recognition: PASCAL Challenge, TRECVID, and ImageCLEF.
PASCAL Challenge is a contest sponsored by PASCAL in Europe. The task is to recognize specified objects in given test images. The current recognition rate is about 40 percent.
26
4.2 Benchmark workshop
TRECVID is a contest run by NIST (National Institute of Standards and Technology) in the United States. Its high-level feature extraction task is to select the scenes containing a specified object from real news video. The average recognition rate is about 40 percent.
27
4.2 Benchmark workshop
ImageCLEF is a contest on multilingual information retrieval. It includes a task of classifying 1000 images containing 21 kinds of objects. The recognition rate is not very high, at about 20 percent.
28
4.3 Creating training data by hand
This concerns the problem of creating the ground-truth data required for training and evaluation. To build large-scale ground-truth data, it is indispensable that researchers build it together. Since training data must be correct, it should be created by hand.
29
4.4 Creating training data automatically
There is research on automatic knowledge acquisition for generic image recognition. A method has been proposed that acquires image knowledge automatically from the Web and classifies images automatically. This kind of system is called "Web image mining".
30
4.4 Creating training data automatically
One way to acquire high-precision Web images: search for images in seven languages and use the top five images of each search result as training images (35 images in total).
Drawback: data on the Web contains noise.
Advantage: a data set without bias can be built.
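The collection step can be sketched as below. The `search` argument is a hypothetical callable standing in for an image search service, `fake_search` is a stub used only so the example runs, and the seven languages and query words are assumptions, since the slide does not list them.

```python
def collect_training_images(translations, search, top_k=5):
    """Query once per language and keep the top_k results of each query."""
    collected = []
    for language, query in translations.items():
        collected.extend(search(query, language)[:top_k])
    return collected


def fake_search(query, language):
    """Stub for an image search service: returns 20 ranked placeholder paths."""
    return [f"{language}/{query}/{rank}.jpg" for rank in range(1, 21)]


# Hypothetical choice of seven languages for the class "lion".
translations = {
    "en": "lion",
    "ja": "ライオン",
    "fr": "lion",
    "de": "Löwe",
    "es": "león",
    "it": "leone",
    "zh": "狮子",
}

training_images = collect_training_images(translations, fake_search)
```

Keeping only the top-ranked results trades quantity for precision, which is how this approach limits the noise in Web data.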
31
5.1 Future task: how to decide the recognition classes
Humans can recognize tens of thousands of kinds of objects. The recognition classes should be at the basic level that a small child learns first. This is the fundamental approach to determining recognition classes in generic object recognition.
32
5.2 Future task: handling variation within a class
The variation within a class can be large. For example, the class "chair" includes chairs with one leg, chairs with four legs, sofa-like chairs, and bench-like chairs.
33
6. Conclusion
This presentation summarized generic object recognition from its past to the latest trends and considered the issues that remain to be solved. Future tasks include deciding recognition classes, handling variation within a class, and handling large amounts of data with image features. It is also important to make use of the information on the Web.