Tuesday, November 25, 2014

OpenCV Haar / cascade training tutorial

A detailed manual on how to generate training sample data and how to train OpenCV's Haar classifier and cascade classifier.

Information about Haar training and cascade training is available in the OpenCV tutorial manual and on Naotoshi Seo's website, but their descriptions are ambiguous in places and are written in English, so I gathered a more explicit account of the usage here (originally in Korean).

In my own case, the unclear usage meant it took quite a lot of time to figure this out. I hope that referring to this article will shorten that time considerably.

Table of Contents

1. Related Resources
2. Overview of OpenCV Haar Training & Cascade Training
3. Basic Concepts of Training Data Generation
4. Where Is opencv_createsamples.exe?
5. opencv_createsamples.exe Usage
6. Creating Extended Training Data from Multiple Samples
7. OpenCV Haar Training Method
8. OpenCV Cascade Training Method
9. Detecting Objects Using the Trained Detector
10. Errors That May Occur During Training and Their Solutions



1. Related Resources

Good resources to consult alongside this article:
  • OpenCV web manual: Cascade Classifier Training
  • Representative Haartraining tutorial site: http://note.sonots.com/SciSoftware/haartraining.html
  • Haar & Cascade paper: [Viola2001] P. Viola and M. J. Jones, "Rapid Object Detection using a Boosted Cascade of Simple Features", CVPR 2001.
  • HOG paper: [Dalal2005] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," CVPR 2005.
  • LBP paper: [Liao2007] S. Liao, X. Zhu, Z. Lei, L. Zhang and S. Z. Li, "Learning Multi-scale Block Local Binary Patterns for Face Recognition," International Conference on Biometrics (ICB), 2007, pp. 828-837.


2. Overview of OpenCV Haar Training & Cascade Training

The Haar classifier, or cascade classifier, is one of the main methods for finding a particular type of object in an image. For a brief description of the method, please see the part describing the [Viola2001] paper in the article on pedestrian detection in video (Pedestrian Detection).

OpenCV provides opencv_haartraining.exe as a way to train the Haar classifier of the [Viola2001] paper.

Separately, OpenCV also provides Cascade Training, which preserves the cascade learning structure of the existing Haar classifier but makes the image feature part independent, so that cascade learning can be applied to a variety of image features. Cascade Training in OpenCV supports the existing Haar feature as well as the newer LBP feature and HOG feature.

Thus, with the Cascade Training method you can freely choose among Haar, LBP, and HOG by setting a parameter, so there seems to be no clear reason to insist on the conventional Haar Training method. However, the training data must be generated with opencv_createsamples.exe regardless of the training method, because it is used in common.

Also, the OpenCV web page notes that with the LBP feature both training and detection are much faster than with the Haar feature, while the detection performance of the two is roughly similar.

How the HOG feature compares with the LBP feature is thought to depend on the application. For one thing, HOG is slower than Haar and LBP. In terms of performance, HOG uses edge information whereas Haar and LBP use brightness differences between regions, so their performance is thought to differ depending on the characteristics of the target to be detected.


3. Basic Concepts of Training Data Generation

Training a Haar / cascade classifier requires both positive samples and negative samples. Positive samples are sample images of the object to be detected, and negative samples are general images that do not contain the object. Classifier training refers to the process of finding, among the many possible image features, a combination of features that best discriminates positive samples from negative samples, so as to obtain a robust detector.


4. Where Is opencv_createsamples.exe?

opencv_createsamples.exe is the common utility for generating the training data needed for OpenCV's Haar training or cascade training. After installing OpenCV, the relevant source code can be found under opencv/apps/haartraining/.

However, by default only the source code of the utilities is present and not the actual executables, so you must rebuild OpenCV using cmake. If you check BUILD_opencv_apps among the cmake options, the utilities are built. Alternatively, you can create a C++ project and compile the relevant sources yourself.

The same applies to opencv_haartraining.exe, opencv_traincascade.exe, and opencv_performance.exe.


5. opencv_createsamples.exe Usage

This is also listed on Naotoshi Seo's tutorial site, but the utility has a total of four uses. It performs completely different tasks depending on which parameters you give it. It is regrettable that the OpenCV documentation does not point this out. FYI, looking at the opencv/apps/haartraining/createsamples.cpp file gives a rough idea of how the usage depends on the parameters.

The parameter list of the utility and a brief description of each parameter are as follows:

Usage: opencv_createsamples
  [-info <Description_file_name>] -> file that stores the list of positive sample images and their object regions
  [-img <Image_file_name>] -> a single positive sample image (cropped to the object region only)
  [-vec <Vec_file_name>] -> name of the file in which to save the generated training data (extension: vec)
  [-bg <Background_file_name>] -> file listing the negative sample images
  [-num <Number_of_samples = 1000>] -> number of training samples to generate
  [-bgcolor <Background_color = 0>] -> reference pixel value to be treated as the transparent color in the positive sample
  [-bgthresh <Background_color_threshold = 80>] -> colors in the range [bgcolor-bgthresh, bgcolor+bgthresh] are treated as transparent
  [-inv] [-randinv] -> invert the positive samples, either always or at random
  [-maxidev <Max_intensity_deviation = 40>] -> range of brightness variation applied to the positive sample
  [-maxxangle <Max_x_rotation_angle = 1.100000>] -> range of the roll rotation transformation
  [-maxyangle <Max_y_rotation_angle = 1.100000>] -> range of the yaw rotation transformation
  [-maxzangle <Max_z_rotation_angle = 0.500000>] -> range of the pitch rotation transformation
  [-show [<Scale = 4.000000>]] -> whether to show the generated training data as images
  [-w <Sample_width = 24>] -> width of the training data images to generate (pixels)
  [-h <Sample_height = 24>] -> height of the training data images to generate (pixels)

-info: Save the path of each positive sample image, together with the bounding rectangles of the objects in that image, to a text file and give that file name as the parameter. The file format is 'image path, number of objects, then [x, y, width, height] for each object'.
Example) plist.txt
d:\positives\img1.jpg 1 140 100 45 45
d:\positives\img2.jpg 2 100 20 50 50 0 30 25 25
d:\positives\img3.jpg 1 0 0 20 20
...
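The line format described for the -info file can be checked with a few lines of Python. This is only a sketch: `parse_info_line` is a hypothetical helper, not part of OpenCV, and it assumes the image path contains no spaces.

```python
from typing import List, Tuple

Box = Tuple[int, ...]  # (x, y, width, height)


def parse_info_line(line: str) -> Tuple[str, List[Box]]:
    """Split one description-file line into its image path and object boxes.

    Hypothetical helper; assumes the path itself contains no whitespace.
    """
    fields = line.split()
    path, count = fields[0], int(fields[1])
    numbers = [int(value) for value in fields[2:]]
    if len(numbers) != 4 * count:
        raise ValueError(f"expected {4 * count} box values, got {len(numbers)}")
    # Group the remaining numbers four at a time: x, y, width, height.
    boxes = [tuple(numbers[i:i + 4]) for i in range(0, len(numbers), 4)]
    return path, boxes


path, boxes = parse_info_line(r"d:\positives\img2.jpg 2 100 20 50 50 0 30 25 25")
print(path, boxes)
```

Running a check like this over the whole list before calling opencv_createsamples catches lines whose object count does not match the number of rectangles.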

-bg: Enter the text file in which the list of negative images is saved.
Example) nlist.txt
d:\negatives\img1.jpg
d:\negatives\img2.jpg
...

Execution modes

Depending on how these parameters are combined, the utility operates in four different modes, as follows (the mode names are arbitrary, chosen for ease of explanation).
  • [Mode 1] Extended training data generation from a single sample: applies various deformations to one positive sample and saves the results in the internal training data format
  • [Mode 2] Fixed training data generation from multiple samples: converts the given positive samples as they are, without deformation, into the internal training data format
  • [Mode 3] Test image generation: combines a positive sample with negative images to produce test images and saves them as jpg files
  • [Mode 4] Training data display: shows the training data stored in the internal data format as images

Execution mode decision rule

The mode depends on which of the parameters info, img, vec, and bg are specified. Specifying both img and vec runs mode 1; info and vec, mode 2; info, img, and bg, mode 3; and vec alone, mode 4.
createsamples -img p.jpg -vec tr.vec => mode 1
createsamples -info plist.txt -vec tr.vec => mode 2
createsamples -info plist.txt -img p.jpg -bg nlist.txt => mode 3
createsamples -vec tr.vec => mode 4
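The decision rule can be written down as a small function. This is a sketch of the rule as stated in this article, not a copy of the logic in createsamples.cpp; the function name and the order in which the combinations are tested are my own assumptions.

```python
from typing import Optional


def createsamples_mode(info: Optional[str] = None, img: Optional[str] = None,
                       vec: Optional[str] = None, bg: Optional[str] = None) -> int:
    """Return the mode number (1-4) implied by the given parameters."""
    if img and vec:
        return 1  # single sample, extended training data (bg is optional here)
    if info and img and bg:
        return 3  # test image generation
    if info and vec:
        return 2  # multiple samples, fixed training data
    if vec:
        return 4  # show the contents of a vec file
    raise ValueError("this parameter combination does not select any mode")


print(createsamples_mode(img="p.jpg", vec="tr.vec"))                     # mode 1
print(createsamples_mode(info="plist.txt", vec="tr.vec"))                # mode 2
print(createsamples_mode(info="plist.txt", img="p.jpg", bg="nlist.txt")) # mode 3
print(createsamples_mode(vec="tr.vec"))                                  # mode 4
```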

The opencv_createsamples utility uses different parameters in each mode, and even the same parameter can mean something slightly different from mode to mode. The parameters used in each mode are summarized below:


[Mode 1] Extended training data generation from a single sample

Applies random variations to the single positive sample image given by -img and creates -num positive training samples. The generated training data has size -w by -h and is stored in the file named by -vec.

The -bg parameter is optional here. If -bg is given, a region is randomly extracted from one of the negative images in the list and used as the background, and the deformed positive sample is placed on top of it to generate the training data. Pixels of the positive sample in the range [bgcolor-bgthresh, bgcolor+bgthresh] are treated as transparent and filled with the background. If -bg is not given, the transparent parts of the positive sample and the empty space produced by the rotation are filled with -bgcolor.

Because there is only one input sample, it is varied in many ways: brightness varies within -maxidev, and rotation varies randomly within -maxxangle, -maxyangle, and -maxzangle. The rotation units are radians; the defaults are maxxangle = maxyangle = 1.1 (63 degrees) and maxzangle = 0.5 (28.6 degrees). Because the defaults are large, in applications with little rotational variation, such as vehicle detection, these values should be lowered.

-maxxangle: roll rotation transformation; the size of the object changes in the vertical direction

-maxyangle: yaw rotation transformation; the size of the object changes in the left-right direction

-maxzangle: pitch rotation transformation; the object rotates within the image plane
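Since the angle limits are given in radians, it helps to convert in both directions when choosing smaller values. A minimal sketch using only the standard library; the 10 degree limit is just an illustrative value, not a recommendation from the OpenCV documentation.

```python
import math

# Default rotation limits of opencv_createsamples, in radians.
defaults = {"maxxangle": 1.1, "maxyangle": 1.1, "maxzangle": 0.5}

# The same limits expressed in degrees, rounded to one decimal place.
degrees = {name: round(math.degrees(rad), 1) for name, rad in defaults.items()}
print(degrees)

# Going the other way: a limit of 10 degrees expressed in radians.
small_limit = math.radians(10)
print(f"-maxxangle {small_limit:.4f}")
```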

As a test, it might be fun to use this mode to build a detector that detects only your own face well.


[Mode 2] Fixed training data generation from multiple samples

Literally, it just resizes the positive sample regions given by -info to -w by -h and saves them as the -vec file. Because no variations are applied to increase the number of samples, the number of training samples generated is min{-num, the actual number of positive samples}. In other words, if -num is smaller than the number of positive samples, only -num training samples are generated, and if -num is larger, training data is generated only up to the actual number of samples.


Mode 2 should be used when you have enough positive samples to cover all the possible variations of the object, so that the given samples can be used as training data as they are.


[Mode 3] Test image generation

Applies variations to the positive sample image given by -img, pastes it onto a background (negative) image given by -bg to create a test image, and saves it as a jpg file (the background image is selected randomly from the list, and -num test jpg files are created).


Note that in this mode the behavior differs a little depending on how the -info parameter is given. If you give a file name to -info (e.g. -info Testlist.txt), a text file of that name is newly created (note: if a file with the same name already exists in the run folder, its data is lost), and the test image file names and the object positions in the test images are recorded in it. The jpg files are stored in the current run folder. If instead you give a path to -info (e.g. -info Testimages\), the folder is created and the jpg files are stored under it; in that case the list file with the location information is not created. Whether the name is treated as a file name or a path is decided by whether it ends with '\' or '/'.

Generating test images in this mode has the advantage that the object location in each image is known, but because the images look contrived, it is better to use real test images if possible.
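The file-name-versus-path rule can be written as a one-line check. This is a sketch of the rule as described here, not the utility's own code; `info_is_folder` is a hypothetical name.

```python
def info_is_folder(info_arg: str) -> bool:
    """True if the -info value would be treated as a folder path in mode 3."""
    # The deciding criterion is a trailing backslash or forward slash.
    return info_arg.endswith(("\\", "/"))


print(info_is_folder("Testlist.txt"))   # file name: the list file is written
print(info_is_folder("Testimages\\"))   # folder: only the jpg files are written
```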


[Mode 4] Training data display

Loads the training data stored in the -vec file and displays it as images. For -w and -h, enter the image size of the training data stored in the vec file; for the scale (the Scale value of -show in the usage above), enter how much to enlarge the displayed images.


6. Creating Extended Training Data from Multiple Samples

Unfortunately, OpenCV does not provide a function to generate extended training data from several positive samples. You either have to generate extended training data for each sample using mode 1 and then merge the resulting vec files into one, or implement this functionality yourself.

For reference, Naotoshi Seo's tutorial site provides a perl script (createtrainsamples.pl) that repeats mode 1 for each image in a list of positive images, along with code (mergevec.cpp) that merges the resulting vec files into one.
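The "repeat mode 1 for each positive image" step could be sketched in Python as follows. This is not the createtrainsamples.pl script itself: `mode1_commands` and the `sample_NNNN.vec` naming are hypothetical, only flags listed in Section 5 are used, and the commands are built but not executed. Merging the resulting vec files still needs a separate tool such as mergevec.cpp.

```python
from typing import List


def mode1_commands(positives: List[str], out_dir: str, num_per_image: int,
                   bg_list: str, width: int = 24, height: int = 24) -> List[List[str]]:
    """Build one opencv_createsamples (mode 1) command line per positive image."""
    commands = []
    for index, image in enumerate(positives):
        # Hypothetical naming scheme: one vec file per input image.
        vec_path = f"{out_dir}/sample_{index:04d}.vec"
        commands.append([
            "opencv_createsamples",
            "-img", image,
            "-vec", vec_path,
            "-bg", bg_list,
            "-num", str(num_per_image),
            "-w", str(width),
            "-h", str(height),
        ])
    return commands


for command in mode1_commands(["p1.jpg", "p2.jpg"], "vecs", 50, "nlist.txt"):
    print(" ".join(command))
```

Each command list can be passed to subprocess.run once the executable is on the path.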




7. OpenCV Haar Training Method

This section describes how to train a Haar classifier with the opencv_haartraining.exe utility. Note that, as described in Section 2 above, you can also train a Haar classifier with cascade training instead of this method.

The parameter list of the utility with default values, and a brief description of each parameter, follow:
Usage: opencv_haartraining
  -data <dir_name>
  -vec <vec_file_name>
  -bg <background_file_name>
  [-bg-vecfile]
  [-npos <Number_of_positive_samples = 2000>]
  [-nneg <Number_of_negative_samples = 2000>]
  [-nstages <Number_of_stages = 14>]
  [-nsplits <Number_of_splits = 1>]
  [-mem <Memory_in_MB = 200>]
  [-sym (Default)] [-nonsym]
  [-minhitrate <Min_hit_rate = 0.995>]
  [-maxfalsealarm <Max_false_alarm_rate = 0.5>]
  [-weighttrimming <Weight_trimming = 0.95>]
  [-eqw]
  [-mode <BASIC (default) | CORE | ALL>]
  [-w <Sample_width = 24>]
  [-h <Sample_height = 24>]
  [-bt <DAB | RAB | LB | GAB (default)>]
  [-err <misclass (default) | gini | entropy>]
  [-maxtreesplits <Max_number_of_splits_in_tree_cascade = 0>]
  [-minpos <Min_number_of_positive_samples_per_cluster = 500>]

Example of use.
opencv_haartraining -data result -vec tr.vec -bg nlist.txt -npos 400 -nneg 5000

-data: This parameter plays two roles. It is the name of the directory in which the classifier of each cascade stage is saved, and it is also the file name under which the final detector is saved. For example, if you enter '-data result', a folder called result is created under the current run folder and the cascade stage classifiers are stored under it. The final trained detector is saved in the current run folder under the name result.xml.

-vec: The positive training data file. Enter the positive data file (extension: vec) generated in advance with opencv_createsamples.exe.

-bg: Enter the name of the file that stores the list of negative images (see the description in Section 5 above). Alternatively, a .vec file can be given as input to the -bg parameter (in which case the vec file should be generated from negative images). If you give a .vec file to -bg, you also need to specify the -bg-vecfile parameter.

-nstages: The number of cascade stages to train. A cascade detector is composed of a series of basic detectors. The cascade detection method first filters out false candidates with the stage 1 classifier, then filters out false candidates among the remainder with the stage 2 classifier, and so on; a candidate that survives to the final stage is deemed a successful object detection. -nstages sets the number of these stages, i.e. the number of basic detectors. The trained basic detector of each stage is stored as a text file under result\0, 1, 2, ...
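The stage-by-stage filtering described for -nstages amounts to an early-exit loop. A minimal sketch, with toy stage functions standing in for the trained basic detectors (none of these names come from OpenCV):

```python
from typing import Callable, Sequence

Stage = Callable[[float], bool]


def cascade_accepts(candidate: float, stages: Sequence[Stage]) -> bool:
    """Return True only if the candidate survives every stage in order."""
    for stage in stages:
        if not stage(candidate):
            return False  # rejected early; later stages are never evaluated
    return True


# Toy stages: each one is a slightly stricter threshold than the previous.
toy_stages = [lambda v: v > 0.1, lambda v: v > 0.3, lambda v: v > 0.5]

print(cascade_accepts(0.7, toy_stages))  # passes all three stages
print(cascade_accepts(0.2, toy_stages))  # rejected by the second stage
```

Because most background windows are rejected by the first few stages, the average cost per window stays low even when the cascade has many stages.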

-npos: Sets the number of positive samples used to train each cascade stage. Note that this is not the actual number of samples in the .vec file. You should give a value such that npos <= (number of samples in the vec file - 100) / (1 + (nstages-1) * (1-minhitrate)). Refer to Section 10, 'Errors and solutions', below for more information.

-nneg: Sets the number of negative samples used to train each cascade stage. Negative samples are drawn from the negative images given in -bg at various positions and sizes, so you can give whatever value you want regardless of the actual number of negative images.
=> In fact, the samples are not drawn randomly; the negative images are scanned sequentially and samples are extracted. A window of fixed size is moved across the image at regular intervals; when the end of the image is reached, the window size is changed and the image is scanned again from the start position, and so on. Once the image has been scanned at all available window sizes, samples are extracted from the next negative image in the same manner. The important point here is that not every scanned patch becomes a negative sample: only regions in the negative images that are not the object but are incorrectly detected as the object are used as negative training samples. In other words, the negative images are scanned with the classifier trained up to the previous stage, each region it detects is added to the negative training samples, and scanning continues in this way until a total of nneg negative samples has been obtained. Thus, if the false detection rate of the classifier trained in the previous stages is very low, preparing the negative training samples for the current stage takes a long time.
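The scanning procedure described for -nneg can be sketched as below. It only illustrates the idea and is not the haartraining implementation: the step size, the scale factor, and the `is_false_positive` predicate (standing in for the classifier trained so far) are all assumptions.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Window = Tuple[int, int, int, int]  # (image index, x, y, window size)


def scan_windows(image_sizes: Sequence[Tuple[int, int]], base: int = 24,
                 step: int = 8, scale: float = 2.0) -> Iterator[Window]:
    """Yield windows image by image, restarting the scan at each larger size."""
    for index, (width, height) in enumerate(image_sizes):
        size = base
        while size <= min(width, height):
            for y in range(0, height - size + 1, step):
                for x in range(0, width - size + 1, step):
                    yield (index, x, y, size)
            size = int(size * scale)  # enlarge the window, scan from the start again


def collect_negatives(image_sizes: Sequence[Tuple[int, int]],
                      is_false_positive: Callable[[Window], bool],
                      nneg: int) -> List[Window]:
    """Keep only windows the current classifier wrongly accepts, up to nneg."""
    picked: List[Window] = []
    for window in scan_windows(image_sizes):
        if is_false_positive(window):
            picked.append(window)
            if len(picked) == nneg:
                break
    return picked


all_windows = list(scan_windows([(48, 48)]))
print(len(all_windows))
print(collect_negatives([(48, 48)], lambda w: w[3] == 48, nneg=5))
```

The second call returns fewer than nneg windows because the toy predicate accepts so few of them, which is the situation that makes later stages slow to prepare.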

-minhitrate: The minimum detection rate required of the basic classifier at each cascade stage. The detection rate of the final detector is estimated as minhitrate^nstages. For example, with the defaults you get a detector with a detection rate of about 0.995^14 = 0.932. However, this is the detection rate on the training data in the .vec file, so it is only a rough guide; the actual detection rate can be significantly lower.

-maxfalsealarm: The upper limit of the false detection rate required of the basic classifier at each cascade stage. The false detection rate of the final detector is estimated as maxfalsealarm^nstages. With the defaults, this is about 0.5^14 = 0.0000610. This seems very small at first glance, but considering that it is the fraction detected as the object out of every possible window region of the input image, it cannot be called a small value. In any case, a false detection rate of 0.5 at each stage means finding a classifier that can filter out 50% or more of the input candidates. Connecting such stage classifiers in cascade form ultimately yields a detector that filters out most of the background.

☞ For a more detailed explanation of the minhitrate and maxfalsealarm parameters, the answers to the following comments should be helpful (ttungkaen 2013.8.20, neverabandon 2013.12.11).
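The two estimates are simple powers and can be checked directly. A minimal sketch with the opencv_haartraining defaults; `cascade_rates` is a hypothetical helper name.

```python
def cascade_rates(min_hit_rate: float = 0.995, max_false_alarm: float = 0.5,
                  nstages: int = 14) -> tuple:
    """Estimated overall hit rate and false alarm rate of an nstages cascade."""
    return min_hit_rate ** nstages, max_false_alarm ** nstages


hit_rate, false_alarm = cascade_rates()
print(f"{hit_rate:.3f}")      # 0.932
print(f"{false_alarm:.7f}")   # 0.0000610
```

Raising nstages lowers both numbers at once, so the per-stage minhitrate has to stay very close to 1 for a deep cascade to keep a usable detection rate.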

-mode: Takes one of three values, BASIC = 0, CORE = 1, ALL = 2; the default is BASIC. BASIC uses the Haar features of the original [Viola2001] paper, corresponding to 1a, 1b, 2a, 2c in the accompanying Haar feature figure. CORE uses 1a, 1b, 2a, 2b, 2c, 2d, 3a, and ALL uses every feature in the figure for training. Choose -mode as appropriate for the application and the characteristics of the object to be found.

-w, -h: The size of the sample data in the .vec file. The sample data are image patches, and these are their width (w) and height (h).

-nsplits: Selects which binary decision tree is used as the weak classifier at each stage. With -nsplits 1, the most basic binary stump tree of depth 1, with only one split, is used. With a value of 2 or more, a CART (Classification And Regression Tree) classifier with that number of splits is used.

-mem: The temporary memory size used during training. Adjust it appropriately according to the free memory of your computer.

-sym, -nonsym: Sets whether the object is left-right symmetric, like a human face. Because -sym is the default, you only need to give -nonsym if the object is not symmetric.


8. OpenCV Cascade Training Method

This section describes how to train a cascade classifier with the opencv_traincascade.exe utility. As an upgrade of the existing Haar training method, it can use not only the Haar feature but also LBP (Local Binary Patterns) and HOG (Histogram of Oriented Gradients) features.

Also, opencv_traincascade can train much faster than the old method because it supports the parallel processing library TBB. Of course, to get TBB support you need to configure OpenCV for TBB and rebuild it.

The parameter list of the utility with default values, and a brief description of each parameter, follow:
Usage: opencv_traincascade
  -data <cascade_dir_name>
  -vec <vec_file_name>
  -bg <background_file_name>
  [-numPos <Number_of_positive_samples = 2000>]
  [-numNeg <Number_of_negative_samples = 1000>]
  [-numStages <Number_of_stages = 20>]
  [-precalcValBufSize <Precalculated_vals_buffer_size_in_Mb = 256>]
  [-precalcIdxBufSize <Precalculated_idxs_buffer_size_in_Mb = 256>]
  [-baseFormatSave]
  [-stageType <BOOST (default)>]
  [-featureType <HAAR (default), LBP, HOG>]
  [-w <SampleWidth = 24>]
  [-h <SampleHeight = 24>]
  [-bt <DAB | RAB | LB | GAB (default)>]
  [-minHitRate <Min_hit_rate = 0.995>]
  [-maxFalseAlarmRate <Max_false_alarm_rate = 0.5>]
  [-weightTrimRate <Weight_trim_rate = 0.95>]
  [-maxDepth <Max_depth_of_weak_tree = 1>]
  [-maxWeakCount <Max_weak_tree_count = 100>]
  [-mode <BASIC (default) | CORE | ALL>]

Example of use.
opencv_traincascade -data result -vec tr.vec -bg nlist.txt -numPos 400 -numNeg 5000 -featureType HAAR -mode CORE

Most of the parameters were already described in Section 7, so only the key parameters that are new or different are explained here.

-data: An error occurs if the folder corresponding to the input path does not exist, so create the folder in advance and then enter its path. The classifier of each cascade stage and the final trained classifier are stored in the folder you entered. The final classifier is stored under the name cascade.xml.

-featureType: Selects one of three feature types: HAAR, LBP, or HOG. Refer to Section 2 above for the differences between the features.

-baseFormatSave: This parameter is meaningful only when -featureType is HAAR. Specifying it saves the training result in the older data format of the Haar training method.
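Because the -data folder has to exist beforehand, a launcher script can create it before building the command line. A minimal sketch: `traincascade_command` is a hypothetical helper, it uses only flags from the usage listing above, and it builds the command without running it. The demo writes into a temporary folder so that it leaves nothing behind in the current directory.

```python
import os
import tempfile
from typing import List


def traincascade_command(data_dir: str, vec: str, bg: str, num_pos: int,
                         num_neg: int, feature_type: str = "HAAR") -> List[str]:
    """Create the -data folder if needed and return the command line as a list."""
    os.makedirs(data_dir, exist_ok=True)  # avoids the missing-folder error
    return [
        "opencv_traincascade",
        "-data", data_dir,
        "-vec", vec,
        "-bg", bg,
        "-numPos", str(num_pos),
        "-numNeg", str(num_neg),
        "-featureType", feature_type,
    ]


demo_dir = os.path.join(tempfile.mkdtemp(), "result")
command = traincascade_command(demo_dir, "tr.vec", "nlist.txt", 400, 5000)
print(" ".join(command))
```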


9. Detecting Objects Using the Trained Detector

Please refer to the example code provided with OpenCV: OpenCV\samples\c\facedetect.cpp.


10. Errors That May Occur During Training and Their Solutions

* OpenCV Error: Assertion failed (elements_read == 1) in unknown function

=> This error occurs because the number of positive samples was entered incorrectly. The -npos / -numPos parameter is the number of positive samples used at each cascade stage, not the total number of positive samples. Below is the developer's answer to the problem.

The problem is that your vec-file has exactly the same samples count that you passed in command line -numPos 979. Training application used all samples from the vec-file to train 0-stage and it can not get new positive samples for the next stage training because vec-file is over. The bug of traincascade is that it had assert() in such cases, but it has to throw an exception with error message for a user. It was fixed in r8913. -numPose is a samples count that is used to train each stage. Some already used samples can be filtered by each previous stage (i.e. recognized as background), but no more than (1 - minHitRate) * numPose on each stage. So vec-file has to contain >= (numPose + (numStages-1) * (1 - minHitRate) * numPose) + S, where S is a count of samples from vec-file that can be recognized as background right away. I hope it can help you to create vec-file of correct size and chose right numPos value

In summary, you should give -npos or -numPos a value such that npos <= (number of samples in the vec file - S) / (1 + (nstages-1) * (1-minhitrate)). S is the number of positive samples in the vec file that are so hard to detect that the classifier treats them as negatives (background) right away. S is somewhat ambiguous; when I tried a value of about S = 100 the error did not occur, though S may need to be a little higher.
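The bound can be evaluated before starting a long training run. A minimal sketch of the inequality above; `max_num_pos` is a hypothetical helper, and S = 100 is only the rough value mentioned in this article, not a constant defined by OpenCV.

```python
import math


def max_num_pos(vec_count: int, num_stages: int, min_hit_rate: float,
                s: int = 100) -> int:
    """Largest -npos / -numPos value that satisfies the bound above."""
    denominator = 1 + (num_stages - 1) * (1 - min_hit_rate)
    return math.floor((vec_count - s) / denominator)


# A vec file with 1000 samples, using the default stage counts of each tool.
print(max_num_pos(1000, num_stages=14, min_hit_rate=0.995))  # haartraining defaults
print(max_num_pos(1000, num_stages=20, min_hit_rate=0.995))  # traincascade defaults
```

For the case in the developer's answer, a vec file of 979 samples with -numPos 979 clearly violates the bound, which is why training ran out of positives after stage 0.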

* If a segmentation fault occurs during training

=> This can happen when opencv_haartraining.exe is run with OpenMP parallel processing enabled. The reason given is that opencv_haartraining is a very old (?) utility that does not fit well with today's OpenCV computing environment. In this case, either compile and use opencv_haartraining.exe with OpenMP disabled, even though it is a little slower, or use the newer opencv_traincascade.exe utility.

* If cascade training falls into an infinite loop

=> If you are using opencv_traincascade.exe, execution can fall into an infinite loop in the for (;;) loop inside the fillPassedSamples function, which is called as int negCount = fillPassedSamples (posCount, proNumNeg, false, negConsumed);. This seems to occur when not enough negative samples satisfying if (predict (i) == 1.0F) can be obtained from the negative background images to reach the required number. For one solution, refer to here. Another simple solution is to insert a routine that checks whether all the negative images have been examined, and to break out of the for (;;) statement when they have all been examined and no sample has been found.
