Bibliography:
Akshay Bhat and Tracy Anne Hammond. Using Entropy to Identify Shape and Text in Hand Drawn Diagrams. Proceedings of the Twenty-First International Joint Conference on Artificial Intelligence. 2009.
Summary:
This paper focuses on differentiating between text and shapes in hand-drawn sketches. The motivation is that most existing sketch recognition systems perform well on either text or shapes, but not both. The paper suggests using the entropy rate (specifically, the zero-order entropy rate) as the distinguishing feature.
Previously, shape vs. text classification relied on a set of specific features found by trial and error. This was very time-consuming, which motivated the authors to look for a single, logically coherent feature. They found that text has higher entropy than shapes when both are represented using a general set of coordinate equations. This led the authors to define entropy based on digital ink. First, a model alphabet of 7 symbols is defined, with symbols corresponding to ranges of angles between 0 and pi. Each point in a stroke is assigned a symbol from this alphabet based on the angle it makes with its neighbouring points. 'Zero-order' means that consecutive symbols are treated as independent when estimating their probabilities.
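To make the entropy idea concrete for myself, here is a minimal sketch of the computation in Python. It is not the authors' implementation: the summary above only says that 7 symbols cover angles between 0 and pi, so the equal-width bins, the function names, and the handling of repeated points are all my own assumptions.

```python
import math
from collections import Counter

NUM_SYMBOLS = 7  # alphabet size from the paper; the equal-width bins below are an assumption


def angle_at(prev, cur, nxt):
    """Angle in [0, pi] formed at `cur` by its two neighbouring points."""
    ax, ay = prev[0] - cur[0], prev[1] - cur[1]
    bx, by = nxt[0] - cur[0], nxt[1] - cur[1]
    na, nb = math.hypot(ax, ay), math.hypot(bx, by)
    if na == 0 or nb == 0:
        return math.pi  # assumption: treat repeated points as a straight segment
    cos_t = max(-1.0, min(1.0, (ax * bx + ay * by) / (na * nb)))
    return math.acos(cos_t)


def to_symbols(points):
    """Map each interior point of a stroke to a symbol index in 0..NUM_SYMBOLS-1."""
    symbols = []
    for i in range(1, len(points) - 1):
        theta = angle_at(points[i - 1], points[i], points[i + 1])
        index = min(int(theta / math.pi * NUM_SYMBOLS), NUM_SYMBOLS - 1)
        symbols.append(index)
    return symbols


def zero_order_entropy(symbols):
    """Shannon entropy in bits, treating every symbol as an independent draw."""
    if not symbols:
        return 0.0
    total = len(symbols)
    counts = Counter(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

A perfectly straight stroke maps every interior point to the same symbol, so its entropy is zero, while a stroke whose angles vary a lot spreads its points over many symbols and scores higher. That matches the intuition that text is "busier" than shapes.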
User strokes were grouped based on thresholds on time and spatial proximity. After resampling and preprocessing, the alphabet model was applied and the entropy of each group was calculated. The system classified the groups as text or shape, and left the ones with an entropy value near the boundary as unclassified. The system had an accuracy of over 90 percent.
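The grouping and three-way decision described above can be sketched roughly as follows. The summary does not give the actual thresholds or the exact grouping rule, so every number and name here (`max_gap_s`, `max_dist`, `shape_max`, `text_min`, the `Stroke` class) is a hypothetical placeholder, and the endpoint-to-startpoint distance is just one simple way to measure spatial closeness.

```python
import math
from dataclasses import dataclass


@dataclass
class Stroke:
    points: list       # [(x, y), ...] in drawing order
    start_time: float  # seconds
    end_time: float    # seconds


def group_strokes(strokes, max_gap_s=0.5, max_dist=20.0):
    """Merge consecutive strokes that are close in both time and space.

    Both thresholds are illustrative values, not taken from the paper.
    """
    groups = []
    for stroke in sorted(strokes, key=lambda s: s.start_time):
        if groups:
            last = groups[-1][-1]
            gap = stroke.start_time - last.end_time
            (x0, y0), (x1, y1) = last.points[-1], stroke.points[0]
            dist = math.hypot(x1 - x0, y1 - y0)
            if gap <= max_gap_s and dist <= max_dist:
                groups[-1].append(stroke)
                continue
        groups.append([stroke])
    return groups


def classify(entropy, shape_max=1.0, text_min=1.5):
    """Label a group by its entropy; values between the two cut-offs stay unclassified.

    The cut-offs are placeholders chosen only to illustrate the three-way split.
    """
    if entropy <= shape_max:
        return "shape"
    if entropy >= text_min:
        return "text"
    return "unclassified"
```

Keeping a gap between the two cut-offs is what produces the "unclassified" band: the system only commits to a label when the entropy is clearly on one side.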
Discussion:
Entropy-based classification of text vs. shape seems very effective. Also, the confidence value emitted by the system makes it usable in other higher-level systems (similar to the Paulson features paper). The alphabet model, however, seems fairly simple with only 7 symbols. I wonder how this would scale with a more complex alphabet and higher-level non-primitive shapes.