Friday, July 14, 2017

Vehicle detection and traffic information extraction in CCTV streams.


This paper presents a framework for vehicle detection and localization that was introduced as part of a project at Transport for London (TfL). In this research project, two main vehicle detection methods are highlighted.
  • Motion silhouettes
  • 3DHOG (3D extended Histograms of Oriented Gradients)
The researchers recognised that monitoring traffic flow by detecting vehicles in CCTV streams involves a large number of cameras spread over long routes, and the research showed that manual monitoring of these feeds becomes significantly less accurate over time. They therefore adopted technology that provides automatic, relevant real-time alerts.

The conventional approach in this type of project is background modelling. Here, the motion mask method is used only as a baseline: a motion mask alone cannot identify vehicles reliably under environmental conditions such as lighting changes, rain and fog, and camera shake and low camera angles can also partially degrade the process. To overcome these problems, a traffic-oriented classifier, 3DHOG, was introduced.

The framework defined for vehicle detection in this research contains three basic components (a rough pipeline sketch follows the list).
          i.  Detector
         ii.  Classifier
         iii. Tracker
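
The paper does not give code, but the structure can be illustrated with a minimal sketch in Python. The class and method names below (VehicleDetectionPipeline, detect, classify, update) are placeholders chosen here for illustration, not the authors' implementation.

    class VehicleDetectionPipeline:
        """Rough skeleton of the detector -> classifier -> tracker chain."""

        def __init__(self, detector, classifier, tracker):
            self.detector = detector      # background model + foreground extraction
            self.classifier = classifier  # silhouette match or 3DHOG classification
            self.tracker = tracker        # Kalman-filter based frame-to-frame tracking

        def process_frame(self, frame):
            # 1. Detect candidate foreground regions in the current frame.
            candidates = self.detector.detect(frame)
            # 2. Classify each candidate as a vehicle hypothesis (or reject it).
            detections = [self.classifier.classify(frame, c) for c in candidates]
            # 3. Associate detections with existing tracks and update them.
            return self.tracker.update(detections)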


The detector is designed to model the static part of the scene. It uses background estimation with a Gaussian Mixture Model (GMM), which allows moving objects to be excluded from the background model.
The difference between the background and a new frame gives an estimate of the foreground mask, which is refined by a shadow-removal algorithm. The contours of the resulting binary foreground mask are then extracted, and the vehicle hypothesis with the maximum match score gives the final estimate.
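
A rough OpenCV (4.x) sketch of such a detector stage is shown below. It uses the MOG2 background subtractor as a stand-in for the paper's GMM background estimation; the history length, thresholds and morphological clean-up are illustrative assumptions, and the authors' shadow-removal algorithm may differ from MOG2's built-in shadow detection.

    import cv2
    import numpy as np

    # GMM-based background model; detectShadows=True marks shadow pixels as 127.
    bg_model = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=True)

    def detect_foreground_contours(frame):
        fg_mask = bg_model.apply(frame)            # compare frame with background model
        # Keep only confident foreground (255); this drops the shadow pixels (127).
        _, fg_mask = cv2.threshold(fg_mask, 200, 255, cv2.THRESH_BINARY)
        # Clean up noise in the binary mask before extracting contours.
        kernel = np.ones((3, 3), np.uint8)
        fg_mask = cv2.morphologyEx(fg_mask, cv2.MORPH_OPEN, kernel)
        # Contours of the binary foreground mask are the vehicle candidates.
        contours, _ = cv2.findContours(fg_mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        return [c for c in contours if cv2.contourArea(c) > 200]  # area filter (assumed)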

The motion silhouette method cannot cope with imperfect shadow removal, noise and saturated areas in the camera image. 3DHOG is therefore used to replace the overlap-based match measure between frames: a HOG (Histograms of Oriented Gradients) descriptor is computed for image patches defined in 3D model space.
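
As an illustration, a HOG descriptor for one scale-normalized patch could be computed as below; the 64x64 patch size and the block/cell/bin parameters are assumptions made for this sketch, not values taken from the paper.

    import cv2

    PATCH_SIZE = (64, 64)  # assumed size of a scale-normalized patch
    # Arguments: winSize, blockSize, blockStride, cellSize, nbins
    hog = cv2.HOGDescriptor(PATCH_SIZE, (16, 16), (8, 8), (8, 8), 9)

    def patch_descriptor(patch_bgr):
        patch = cv2.resize(patch_bgr, PATCH_SIZE)
        gray = cv2.cvtColor(patch, cv2.COLOR_BGR2GRAY)
        return hog.compute(gray).flatten()  # 1-D HOG descriptor for the patch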

An affine transformation is used to generate the scale-normalized patches, and a descriptor is computed from each patch. Using a data-driven appearance model, a single Gaussian distribution is learned for every interest-point descriptor. During system operation, new descriptors are generated for every 2D projection block and compared against these learned distributions. The remainder of the framework remains identical to the earlier description.
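
The two steps can be sketched as follows: an affine warp that maps a projected 3D model patch to a scale-normalized image patch, and a per-interest-point single-Gaussian model over the training descriptors, scored here with a Mahalanobis distance. The function names, patch size and the use of a pseudo-inverse covariance are assumptions made for this sketch.

    import cv2
    import numpy as np

    def normalized_patch(frame, corners, size=(64, 64)):
        """Affine-warp the image region under a projected 3D model patch to a
        scale-normalized patch; `corners` holds three 2D corner points of the
        projected patch (top-left, top-right, bottom-left)."""
        dst = np.float32([[0, 0], [size[0] - 1, 0], [0, size[1] - 1]])
        warp = cv2.getAffineTransform(np.float32(corners), dst)
        return cv2.warpAffine(frame, warp, size)

    class GaussianAppearanceModel:
        """Single Gaussian learned from the training descriptors of one interest point."""

        def fit(self, descriptors):
            d = np.asarray(descriptors, dtype=np.float64)
            self.mean = d.mean(axis=0)
            self.cov_inv = np.linalg.pinv(np.cov(d, rowvar=False))
            return self

        def mahalanobis(self, descriptor):
            # Smaller distance = new descriptor fits the learned appearance better.
            diff = descriptor - self.mean
            return float(np.sqrt(diff @ self.cov_inv @ diff))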




As the last step, the frame-to-frame detections are used as inputs to Kalman filters, so that the detections from individual frames can be tracked over time.
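
A minimal constant-velocity Kalman filter for tracking a detection centroid across frames could look like the sketch below, using OpenCV's KalmanFilter; the state layout (x, y, vx, vy) and the noise covariances are illustrative assumptions rather than the paper's tuning.

    import cv2
    import numpy as np

    kf = cv2.KalmanFilter(4, 2)  # state: x, y, vx, vy; measurement: x, y
    kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                    [0, 1, 0, 1],
                                    [0, 0, 1, 0],
                                    [0, 0, 0, 1]], np.float32)
    kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                     [0, 1, 0, 0]], np.float32)
    kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2
    kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1

    def track(centroids):
        """Feed per-frame detection centroids (x, y); return the filtered track."""
        kf.statePost = np.float32([[centroids[0][0]], [centroids[0][1]], [0], [0]])
        kf.errorCovPost = np.eye(4, dtype=np.float32)
        filtered = []
        for (x, y) in centroids:
            kf.predict()
            state = kf.correct(np.float32([[x], [y]]))
            filtered.append((float(state[0, 0]), float(state[1, 0])))
        return filtered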

Vehicle detection result using 3DHOG:

