Event recognition requires long-term tracking of objects in surveillance videos. However, long-term tracking typically lacks robustness in realistic scenarios because of illumination changes, cluttered backgrounds, occlusions, appearance changes, etc. Most event recognition methods therefore omit the long-term tracking procedure, so they can describe and recognize only short-term events such as walking, running, sitting, falling, and kicking. To circumvent this drawback, this paper proposes a system that fuses information from the foreground mask and the pixel colors of the frames, whenever needed, to handle occlusion and to achieve long-term object detection, tracking, and labeling. With this system, the event recognizer is able to discriminate long-lasting events such as purse snatching, fighting, meeting, and an unwanted person around a car. Many videos covering various events and scenarios are investigated based on the spatio-temporal organization of the objects over time, and generic solutions applicable to most of the problematic cases across all types of videos and scenarios are proposed. Finally, results are presented for well-known data sets and for our own data set, all of which include long-term events. We observe that the proposed system improves the performance of long-term event recognition.
Surveillance long term event tracking