Bibliography:
Dean Rubine. Specifying Gestures by Example, Proceedings of the 18th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '91), 1991
Summary:
The paper describes GRANDMA, a toolkit for adding gestures to direct-manipulation interfaces, along with the trainable single-stroke gesture recognizer that GRANDMA uses.
At the time of writing, existing gesture-based applications relied on hand-coded gesture recognizers, which made those systems complex and hard to generalize. GRANDMA aims to provide a general toolkit that automatically creates a gesture recognizer from a small number of training examples.
The paper describes GDP, a gesture-based drawing program whose single-stroke recognizer was built using GRANDMA. The recognizer identifies simple gestures for operations like rectangle, line, rotate, delete, and copy. Because it is a single-stroke recognizer, it avoids problems like segmentation, and the tense-and-relax rhythm of drawing a single stroke makes the gesture intuitive for the user.
The system uses a graphical interface in which gesture classes can be created. Each class is given a set of training examples that should reflect the expected variation within that gesture. Empirically, about 15 examples per class is found to be sufficient.
The single-stroke recognizer works by extracting a set of features from the input gesture and using a linear classifier to assign the example to one of the gesture classes defined by the designer. The author describes a set of 13 features chosen according to the following criteria: each feature should be incrementally computable in constant time per input point, a small change in the input should produce a small change in the feature, and there should be enough features to differentiate between all the expected gestures. The features include the sine and cosine of the initial angle, the length of the stroke, the size of the bounding box, the sum of the angles at each mouse point, and the speed and duration of the stroke.
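To make this concrete, here is a minimal sketch of how a few of these features can be computed from a list of (x, y) points. This is an illustrative subset, not the paper's full set of 13 features, and the feature numbering and exact definitions here are only loosely based on the paper:

```python
import math

def rubine_features(points):
    """Compute a small subset of Rubine-style features for one stroke.

    points: list of (x, y) tuples sampled along the stroke.
    Illustrative sketch only; the paper defines 13 features in total.
    """
    (x0, y0), (x2, y2) = points[0], points[2]
    # Cosine and sine of the initial angle, measured from the first
    # point to the third point for robustness against sampling noise.
    d = math.hypot(x2 - x0, y2 - y0)
    f1 = (x2 - x0) / d
    f2 = (y2 - y0) / d
    # Length of the bounding-box diagonal.
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    f3 = math.hypot(max(xs) - min(xs), max(ys) - min(ys))
    # Total stroke length: sum of the lengths of successive segments.
    f4 = sum(math.hypot(points[i + 1][0] - points[i][0],
                        points[i + 1][1] - points[i][1])
             for i in range(len(points) - 1))
    return [f1, f2, f3, f4]
```

Each of these updates with a constant amount of work per new point, which is what makes the recognizer cheap enough to run as the stroke is being drawn.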
Classification is done by taking a linear combination of these features for each class and choosing the class with the highest score, with the possibility of rejecting the gesture altogether. The weight vectors are obtained by training the classifier on the given sample gestures. Rejection is done by thresholding an estimate of the probability that the winning class is correct.
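A sketch of this classify-then-maybe-reject step is below. The weight values themselves would come from training on the example gestures; the dictionary layout and the 0.95 threshold here are assumptions for illustration, though the probability estimate follows the softmax-style form the paper uses for rejection:

```python
import math

def classify(features, weights, rejection_threshold=0.95):
    """Linear classification with rejection, in the style of Rubine's
    recognizer.

    weights: dict mapping class name -> (w0, [w1..wF]), i.e. a bias term
    plus one weight per feature. These would be learned from training
    examples; here they are simply passed in.
    Returns the best-scoring class, or None if the estimated probability
    that the choice is correct falls below rejection_threshold.
    """
    scores = {c: w0 + sum(w * f for w, f in zip(ws, features))
              for c, (w0, ws) in weights.items()}
    best = max(scores, key=scores.get)
    # Estimated probability that `best` is correct:
    # 1 / sum_j exp(score_j - score_best).
    denom = sum(math.exp(s - scores[best]) for s in scores.values())
    p_best = 1.0 / denom
    return best if p_best >= rejection_threshold else None
```

When two classes score nearly the same, the probability estimate drops toward 0.5 and the gesture is rejected rather than guessed at, which is the behavior the paper wants for ambiguous input.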
The single-stroke recognizer was found to perform well in practice despite its simplicity. The author describes extensions to the system: eager recognition (classifying a gesture before it is finished) and support for multi-touch interfaces. Multi-touch support can be achieved by running single-stroke recognition on each stroke and then combining the results using a decision tree.
Discussion:
This paper shows the importance of a field like gesture recognition (and, more generally, sketch recognition) to Human-Computer Interaction. A toolkit for gesture recognition makes it much easier to advance the field, since hand-coding recognizers is one of the main blockers to developing effective sketch-based interfaces.
The features discussed in the paper give a nice overview of what to look for in a single stroke. As explained by Dr. Hammond, the importance of making sure "similar shapes have similar features" is further emphasized in this paper. Intuitively, that makes a lot of sense, as that is also how we humans think about differences between shapes.
I am really curious to see an extension of this algorithm to handle multi-touch gestures, and thereby multi-touch interfaces. To me, multi-touch seems like an entirely different problem, mainly because it is hard to break multi-touch gestures down into composable single strokes. But maybe an effective decision tree handles that.