Bibliography:
Radu-Daniel Vatavu, Lisa Anthony, and Jacob O. Wobbrock. Gestures as Point Clouds: A $P Recognizer for User Interface Prototypes. Proceedings of the ACM International Conference on Multimodal Interaction. 2012.
Summary:
This paper presents $P, a gesture recognition algorithm that extends the $ family of recognizers. It aims to support multistroke gestures (unlike $1 and Protractor) while avoiding the exponential complexity seen in the $N and $N-Protractor algorithms.
The $P algorithm removes the temporal information from a gesture and stores it as a point cloud. The candidate's cloud is then matched against the point cloud of each template gesture in the training data, and doing this matching efficiently is what makes $P fast. The matching can be cast as a form of the well-studied 'assignment problem' from graph theory. The 'Hungarian algorithm' is the gold standard for solving it: it is very accurate, but very slow. The authors instead studied a set of Greedy-X heuristics for the template matching. The chosen greedy heuristic matches each point in the candidate with the closest unmatched point in the template point cloud and combines the resulting distances in a weighted sum. This was found to be the most efficient, and its accuracy was comparable to that of the Hungarian algorithm.
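To make the greedy matching concrete for myself, here is a minimal Python sketch of the idea. It is my own reading and not the authors' pseudocode: the function names (`cloud_distance`, `greedy_cloud_match`, `recognize`) are made up, the linearly decreasing weights and the sampling of starting points with `epsilon = 0.5` are assumptions that should be checked against the paper, and both clouds are assumed to be already resampled to the same number of points and normalized for scale and position.

```python
import math


def cloud_distance(points, template, start=0):
    """Greedily match each point in `points` to its closest unmatched
    point in `template`, beginning at index `start`.

    Assumption: earlier matches get larger weights (1 down towards 0),
    since later points have fewer unmatched partners to choose from.
    """
    n = len(points)
    if n == 0 or n != len(template):
        raise ValueError("clouds must be non-empty and the same size")
    matched = [False] * n
    total = 0.0
    for k in range(n):
        i = (start + k) % n
        best_j, best_d = -1, math.inf
        for j in range(n):
            if not matched[j]:
                d = math.dist(points[i], template[j])
                if d < best_d:
                    best_d, best_j = d, j
        matched[best_j] = True
        weight = 1.0 - k / n  # assumed linear weighting
        total += weight * best_d
    return total


def greedy_cloud_match(points, template, epsilon=0.5):
    """Try several starting points and both matching directions,
    and keep the smallest weighted distance."""
    n = len(points)
    step = max(1, int(n ** (1.0 - epsilon)))
    best = math.inf
    for start in range(0, n, step):
        best = min(
            best,
            cloud_distance(points, template, start),
            cloud_distance(template, points, start),
        )
    return best


def recognize(candidate, templates):
    """Return the name of the template whose cloud is closest."""
    return min(
        templates,
        key=lambda name: greedy_cloud_match(candidate, templates[name]),
    )
```

Each call to `cloud_distance` is quadratic in the number of points, and with `epsilon = 0.5` it is called for roughly the square root of `n` starting points, which is cheaper than solving the assignment problem exactly with the Hungarian algorithm. The price is that the matching found is not guaranteed to be optimal.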
$P was found to perform comparably to $1 on single-stroke gestures and better than the $N recognizer on multistroke gestures. The authors also provide pseudocode for the algorithm.
Discussion:
For me, this is a surprisingly simple, elegant, and efficient algorithm for identifying gestures. The pseudocode provided by the authors makes it very accessible and easy for developers to implement in their own applications. Since the core of the algorithm is a greedy heuristic, I am curious what its failure cases are. As mentioned in the table, the algorithm is not rotation invariant, so I wonder if it can be combined with the polar-coordinate rotation invariance discussed in the previous paper.