Max-Margin Early Event Detectors – CVPR 2012 Best Student Paper

According to a post on tombone's blog, this year's CVPR Best Student Paper is by Hoai, Minh, a student of Professor Andrew Zisserman at Oxford's VGG lab.

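Hoai, Minh did his PhD at CMU and then went to VGG as a postdoc; both his work and his luck are good. VGG is really impressive too: it wins awards at CVPR and is a strong presence at ECCV as well.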



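Faces and human bodies are essentially structured, template-like targets, so describing their fine-grained variation means capturing detailed structural information; naturally, the authors use Structural SVMs. Structural learning is a hot topic at this year's CVPR, and a very useful one; the classic work here is that of Professor Thorsten Joachims at Cornell.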




Max-Margin Early Event Detectors.
Hoai, Minh & De la Torre, Fernando
CVPR 2012

The need for early detection of temporal events from sequential data arises in a wide spectrum of applications ranging from human-robot interaction to video security. While temporal event detection has been extensively studied, early detection is a relatively unexplored problem. This paper proposes a maximum-margin framework for training temporal event detectors to recognize partial events, enabling early detection. Our method is based on Structured Output SVM, but extends it to accommodate sequential data. Experiments on datasets of varying complexity, for detecting facial expressions, hand gestures, and human activities, demonstrate the benefits of our approach. To the best of our knowledge, this is the first paper in the literature of computer vision that proposes a learning formulation for early event detection.

Early Event Detector Project Page (code available on website)


Simulating the sequential arrival of training data

We simulate the sequential arrival of training data and use partial events as positive training examples. We train a single event detector to recognize all partial events, but our method does more than augmenting the set of training examples.
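To make the idea of partial events concrete, here is a minimal sketch that enumerates the truncated prefixes of one annotated event, as they would appear if frames arrived one at a time. The function name `partial_events`, the inclusive `(start, end)` frame convention, and the `min_len` parameter are assumptions for illustration only; they are not taken from the mmed-release code, and this covers only the example-generation step, not the MMED training itself.

```python
from typing import List, Tuple


def partial_events(onset: int, offset: int, min_len: int = 1) -> List[Tuple[int, int]]:
    """Enumerate partial events of a full event spanning [onset, offset].

    Each partial event starts at the onset frame and ends at some frame
    no later than the offset, mimicking what has been observed so far
    when the data arrives sequentially. Frame indices are inclusive.
    `min_len` (hypothetical) is the shortest prefix kept, in frames.
    """
    if offset < onset:
        raise ValueError("offset must not precede onset")
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    first_end = onset + min_len - 1
    return [(onset, end) for end in range(first_end, offset + 1)]


if __name__ == "__main__":
    # An event annotated from frame 10 to frame 14 yields five prefixes,
    # the last of which is the complete event.
    for span in partial_events(10, 14):
        print(span)
```

As the text notes, simply adding these prefixes as extra positive examples is not the whole method; the sketch only shows which spans count as partial events.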

Monotonicity requirement

Monotonicity requirement: the detection score of a partial event cannot exceed the score of an encompassing partial event. MMED provides a principled mechanism to achieve this monotonicity, which cannot be assured by a naive solution that simply augments the set of training examples.
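The requirement can be stated as a simple check on detector outputs. The sketch below assumes a list `scores` in which entry `i` is the score of the partial event made of the first `i + 1` frames, so each later entry belongs to an encompassing partial event. The helper names `is_monotone` and `violations` are hypothetical, and this only tests the property after the fact; it does not show how MMED enforces it during training.

```python
from typing import List, Sequence


def is_monotone(scores: Sequence[float]) -> bool:
    """Return True if no partial event outscores a longer one containing it.

    scores[i] is assumed to be the detection score of the partial event
    covering the first i + 1 frames of the same event.
    """
    return all(later >= earlier for earlier, later in zip(scores, scores[1:]))


def violations(scores: Sequence[float]) -> List[int]:
    """Return the indices where the score drops below the previous prefix."""
    return [i for i in range(1, len(scores)) if scores[i] < scores[i - 1]]


if __name__ == "__main__":
    good = [0.1, 0.4, 0.4, 0.9]        # never decreases as the event unfolds
    bad = [0.1, 0.5, 0.3, 0.6, 0.2]    # drops at prefixes 2 and 4
    print(is_monotone(good), violations(good))
    print(is_monotone(bad), violations(bad))
```

A detector trained on a naively augmented example set could produce a sequence like `bad`, which is the failure the monotonicity requirement is meant to rule out.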


Detecting Disgust. See video
Detecting Fear. See video

From left to right: the onset frame, the frame at which MMED fires, the frame at which SOSVM fires, and the peak frame.


Download mmed-release-0.1 and follow the instructions in ./src/README.html.


