Sketch Recognition Course (TAMU CSCE 624): Reading 1 - Learning through the Lens of Sketch

Bibliography:

Hammond, Tracy. "Learning Through the Lens of Sketch", Frontiers in Pen and Touch, Chapter 21, Springer, 2017.

Summary:

This talk gives an overview of Dr. Hammond's research and the evolution of the field of sketch recognition. Her motivation for the field comes from trying to understand why certain tasks that are inherently simple for humans turn out to be difficult for computers to perform. By writing computer algorithms that mimic human activities, we may be able to better understand how the brain works. This, in turn, will facilitate better communication with and through technology.

Automated recognition of sketches is crucial to achieving this, as sketching is a very intuitive way for humans to perceive and understand things. Dr. Hammond categorises sketch recognition algorithms into three types: 1) appearance-based, 2) gesture-based and 3) geometry-based. As each method has its own advantages and drawbacks, the most effective system will use a combination of techniques.

Dr. Hammond's research on geometric algorithms led to defining a set of constraints for these algorithms that are similar to what humans perceive. For example, humans can easily perceive horizontal and vertical lines, but are not very good at perceiving specific angles. Furthermore, this perception of constraints changes greatly based on the size and form of the shape.
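
To make the idea of a perceptual constraint concrete for myself, here is a minimal sketch of how "horizontal" and "vertical" could be tested with an angular tolerance. The function names and the 10-degree tolerance are my own assumptions for illustration, not values or code from the talk.

```python
import math

def line_angle_deg(p0, p1):
    """Orientation of the segment p0 -> p1 in degrees, folded into [0, 180)."""
    return math.degrees(math.atan2(p1[1] - p0[1], p1[0] - p0[0])) % 180.0

def is_horizontal(p0, p1, tol_deg=10.0):
    # tol_deg is an assumed tolerance; a perceptually tuned value could differ.
    angle = line_angle_deg(p0, p1)
    return min(angle, 180.0 - angle) <= tol_deg

def is_vertical(p0, p1, tol_deg=10.0):
    # Same assumed tolerance, measured around 90 degrees.
    return abs(line_angle_deg(p0, p1) - 90.0) <= tol_deg
```

A roughly drawn line from (0, 0) to (10, 1) passes `is_horizontal`, while a 45-degree diagonal passes neither test. Since perception changes with the size and form of the shape, a fixed tolerance like this is only a starting point.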

While constraint-based systems (such as Mechanix) were being built to describe shapes, an interesting application emerged: using them to teach drawing and perception. Human subjects were able to improve their drawing and perceiving skills by getting immediate, personalised feedback. These systems are further improved by taking into account factors such as how corners are drawn and the sound of the pen. Visualizing the strokes as a function of speed led to interesting insights that differentiate novices from experts. Measuring oversteering at corners (using techniques such as NDDE and DCR), along with the above-mentioned factors, helped recognize perception primitives.

Another goal of her research was to differentiate between shapes and text. By looking at the Shannon entropy of both, it was found that text had significantly higher entropy than shapes, and this helped in distinguishing them with high accuracy.
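
To see why entropy could separate the two, here is a rough sketch that quantizes the turning angle at each point of a stroke into a small alphabet and computes the Shannon entropy of the resulting symbols. The alphabet (six equal angle bins) and the function names are my own assumptions; the actual method presented in the talk may define its symbols differently.

```python
import math
from collections import Counter

def turning_angles(points):
    """Absolute turning angle (radians, 0..pi) at each interior point."""
    angles = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        incoming = math.atan2(y1 - y0, x1 - x0)
        outgoing = math.atan2(y2 - y1, x2 - x1)
        diff = abs(outgoing - incoming)
        if diff > math.pi:
            diff = 2 * math.pi - diff
        angles.append(diff)
    return angles

def shannon_entropy(symbols):
    """Shannon entropy in bits of a non-empty sequence of discrete symbols."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def stroke_entropy(points, bins=6):
    # bins=6 is an assumed alphabet size, chosen only for illustration.
    symbols = [min(int(a / math.pi * bins), bins - 1)
               for a in turning_angles(points)]
    return shannon_entropy(symbols) if symbols else 0.0
```

A straight stroke maps every point to the same symbol and gets zero entropy, while a stroke with many varied turns, as in handwriting, spreads over several symbols and scores higher.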

The talk concludes by looking at temporal data for recognising sketches, namely activity recognition. The research found that it was possible to gain information about a person just by looking at features of their eye-tracking data, such as saccades, entropy and shapelets.

Discussion:

The talk gives an excellent overview of the field of sketch recognition, the various algorithms that are being developed, and the difficulty of creating computer algorithms that mimic humans. It also makes me think about how evolution and natural selection have made us so remarkably efficient!

The field provides significant insight into how humans perceive things. To me, pedagogy seems to be the primary application of the field. I wonder how this relates back to making computers complement humans in day-to-day activities.

While most of the talk focussed on geometric algorithms (with some gesture-based methods at the end), I wonder how these techniques compare to ones that use appearance-based algorithms. It seems that appearance-based techniques will be less insightful, as there is no temporal data.

Extended Discussion:

The video helped me see the emphasis placed on relating sketch recognition to how the brain works. Indeed, how we perceive things is significantly reflected in our behavior and sketches. As explained by Dr. Hammond in the Q&A, these insights will help make sketch recognition much more valuable to people, rather than just replacing pen and paper.

It was cool to see the demonstration of how the LADDER system grew to recognize 923 shapes based on just simple constraints. The speed of recognition was very impressive.

Regarding eye tracking, the graphs of saccades and Dr. Hammond's explanation show how saccades are unique to each individual, and how they relate to the use of the eye muscles.

Monday, 4 September 2017