captures's comments

The classification is surprisingly simple: k-nearest neighbors on a 27-dimensional feature vector extracted from each drawing.

The features:
- Stroke count
- Point density across 6 horizontal and 6 vertical bands (where is the ink?)
- Direction histogram across 8 compass directions (which way are strokes going?)
- Aspect ratio and total stroke length
- First stroke start position, last stroke end position
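
To show how those features add up to 27 (1 + 12 + 8 + 2 + 4), here is a rough Python sketch. It is not the project's actual code. The function name `extract_features`, the input format (strokes as lists of (x, y) points already scaled to the unit square), the length-weighted direction bins, and the feature ordering are all my assumptions for illustration.

```python
import math

def extract_features(strokes):
    """Hypothetical extractor. strokes: list of strokes, each a list of
    (x, y) points assumed to be pre-scaled into [0, 1] x [0, 1]."""
    points = [p for stroke in strokes for p in stroke]
    if not points:
        raise ValueError("drawing has no points")
    n = len(points)

    features = [float(len(strokes))]  # 1 value: stroke count

    # 12 values: fraction of points in 6 bands along x, then 6 bands along y
    for axis in (0, 1):
        bands = [0.0] * 6
        for p in points:
            idx = max(0, min(int(p[axis] * 6), 5))
            bands[idx] += 1.0 / n
        features.extend(bands)

    # 8 values: direction histogram (E, NE, N, ...), weighted by segment length
    hist = [0.0] * 8
    total_len = 0.0
    for stroke in strokes:
        for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
            seg = math.hypot(x1 - x0, y1 - y0)
            if seg == 0.0:
                continue
            angle = math.atan2(y1 - y0, x1 - x0)
            hist[int(round(angle / (math.pi / 4))) % 8] += seg
            total_len += seg
    if total_len > 0.0:
        hist = [h / total_len for h in hist]
    features.extend(hist)

    # 2 values: bounding-box aspect ratio and total stroke length
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    width, height = max(xs) - min(xs), max(ys) - min(ys)
    features.append(width / height if height > 0.0 else 0.0)
    features.append(total_len)

    # 4 values: start of the first stroke, end of the last stroke
    features.extend(strokes[0][0])
    features.extend(strokes[-1][-1])
    return features
```

A single horizontal stroke from (0, 0) to (1, 0) gives a 27-element vector with all of the direction weight in the first (east) bin.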

The training set is ~64k hand-drawn samples from the original Detexify project. Each sample gets preprocessed and converted to this 27D vector. Classification is then just finding the k nearest training samples by Euclidean distance and returning the most common symbols among them.
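
The lookup itself fits in a few lines. This is a minimal brute-force sketch, assuming the training set is a list of (feature_vector, symbol) pairs. The name `classify` and the defaults `k=5` and `top=3` are placeholders of mine, not the values the project uses.

```python
import math
from collections import Counter

def classify(query, training, k=5, top=3):
    """Hypothetical k-NN lookup.
    training: list of (feature_vector, symbol) pairs.
    Returns up to `top` symbols, most frequent among the k nearest first."""
    # Brute force: sort every training sample by Euclidean distance to the query
    nearest = sorted(training, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(symbol for _, symbol in nearest)
    return [symbol for symbol, _ in votes.most_common(top)]

# Toy 2-D data standing in for the 27-D vectors
training = [
    ([0.0, 0.0], "a"), ([0.0, 1.0], "a"),
    ([5.0, 5.0], "b"), ([5.0, 6.0], "b"), ([6.0, 5.0], "b"),
]
result = classify([0.0, 0.5], training, k=3)
```

With k=3 the two "a" samples and one "b" sample are nearest, so "a" ranks first. A full sort over ~64k samples per query is wasteful but still workable at this size. Whether the original scales the features before measuring distance isn't stated here.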

