Wednesday, 27 September 2017

Assignment 14: Reading 10 - Hammond Chapter 1

Bibliography:
Tracy Hammond. Sketch Recognition, Chapter 1: Stroke Basics, 2017.

Summary:
This reading covers the basics of a stroke: a mathematical representation of a gesture as a list of points. A stroke is a series of (x: x-coordinate, y: y-coordinate, t: epoch time) samples, obtained by sampling a gesture from pen-down to pen-up on a device. The sampling rate depends on the device. Spatially, a stroke can be represented by concatenating a series of vectors from point to point. The length of a stroke is defined as the sum of the lengths of all the individual vectors in the stroke.
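
As a rough illustration of the stroke-length definition, here is a minimal Python sketch. The `Point` alias, the `stroke_length` name and the sample coordinates are my own assumptions for the example, not taken from the chapter.

```python
import math
from typing import List, Tuple

# Assumed representation: one sample is (x, y, t), with t as epoch time.
Point = Tuple[float, float, float]


def stroke_length(stroke: List[Point]) -> float:
    """Sum the Euclidean lengths of the vectors between consecutive points."""
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1, _), (x2, y2, _) in zip(stroke, stroke[1:])
    )


# Hypothetical three-point stroke: a 3-4-5 segment followed by a vertical one.
stroke = [(0.0, 0.0, 0.0), (3.0, 4.0, 10.0), (3.0, 10.0, 20.0)]
length = stroke_length(stroke)  # 5 + 6
```

The timestamps are ignored here because length is a purely spatial quantity.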

The chapter explains various useful trigonometric identities. Sine, cosine and tangent are explained as ratios of the sides of a right triangle, using the mnemonic SOH CAH TOA, and related to the interior acute angle. Furthermore, in order to compute the angle between lines in a stroke, the chapter explains formulas using arctan and arccos. Though the computation using arctan is simple, it can become undefined in some cases, for example when a line is vertical and its slope involves a division by zero. The arccos and the law of cosines can also be used to compute the angle, but this is computationally less efficient as it involves taking square roots.
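
To make the two angle computations concrete, here is a small Python sketch. It is my own illustration under stated assumptions, not the chapter's code: the function names are hypothetical, points are plain (x, y) pairs, and I use the standard library's `math.atan2` in place of a plain arctan of dy/dx because it stays defined for vertical segments.

```python
import math


def segment_direction(p, q):
    """Direction of the segment p -> q in radians, via atan2.

    atan2(dy, dx) avoids the division dy/dx, so a vertical
    segment (dx == 0) does not cause an error.
    """
    return math.atan2(q[1] - p[1], q[0] - p[0])


def interior_angle(a, b, c):
    """Interior angle at b of the triangle a-b-c, via the law of cosines.

    cos(B) = (|ab|^2 + |bc|^2 - |ac|^2) / (2 * |ab| * |bc|)
    Assumes a != b and b != c, otherwise the denominator is zero.
    """
    ab = math.hypot(b[0] - a[0], b[1] - a[1])
    bc = math.hypot(c[0] - b[0], c[1] - b[1])
    ac = math.hypot(c[0] - a[0], c[1] - a[1])
    cos_b = (ab ** 2 + bc ** 2 - ac ** 2) / (2 * ab * bc)
    # Clamp to [-1, 1] to guard against floating-point round-off.
    return math.acos(max(-1.0, min(1.0, cos_b)))


# Hypothetical right-angle corner: along the x-axis, then straight up.
a, b, c = (0.0, 0.0), (1.0, 0.0), (1.0, 1.0)
vertical = segment_direction(b, c)  # direction of a vertical segment
corner = interior_angle(a, b, c)    # angle at the corner point b
```

The law-of-cosines version needs three distance computations, each with a square root, which matches the efficiency concern noted above.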


During preprocessing, the points are sometimes resampled to make them spatially equidistant or to even out point densities. The preferred way to do this is to base the spacing on the diagonal length of the stroke's bounding box. Other methods, which use a fixed point distance or a fixed number of points, do not scale well for varying stroke sizes or point densities.
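
A minimal Python sketch of resampling with a spacing derived from the bounding-box diagonal follows. The function names, the linear interpolation between samples and the divisor of 4 are my own assumptions for illustration; the chapter may use a different constant or procedure. Timestamps are dropped for brevity, so points are (x, y) pairs.

```python
import math


def bounding_box_diagonal(points):
    """Diagonal length of the axis-aligned bounding box of the points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return math.hypot(max(xs) - min(xs), max(ys) - min(ys))


def resample(points, spacing):
    """Return points placed every `spacing` units along the stroke path."""
    if spacing <= 0 or len(points) < 2:
        return list(points)
    resampled = [points[0]]
    carried = 0.0  # path distance covered since the last emitted point
    prev = points[0]
    for cur in points[1:]:
        seg = math.hypot(cur[0] - prev[0], cur[1] - prev[1])
        while seg > 0 and carried + seg >= spacing:
            # Interpolate a new point exactly `spacing` from the last one.
            frac = (spacing - carried) / seg
            new = (prev[0] + frac * (cur[0] - prev[0]),
                   prev[1] + frac * (cur[1] - prev[1]))
            resampled.append(new)
            prev = new
            seg = math.hypot(cur[0] - prev[0], cur[1] - prev[1])
            carried = 0.0
        carried += seg
        prev = cur
    return resampled


# Hypothetical stroke with uneven point density along the x-axis.
stroke = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0), (12.0, 0.0)]
spacing = bounding_box_diagonal(stroke) / 4  # divisor 4 is an arbitrary choice
even = resample(stroke, spacing)
```

Because the spacing is a fraction of the diagonal, a stroke drawn twice as large gets twice the spacing and roughly the same number of resampled points, which is the scaling property a fixed point distance lacks.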

Discussion:
This chapter provides the mathematical background required for representing gestures and generating features from them. It is very useful for someone starting out with gesture recognition. Representing gestures as samples (and as vectors) allows existing computational methods to be applied to this field.
