Filter comments that contain incorrect punctuation or misspellings. In this case it’s YouTube comments (and apparently I’m not the only one who has a problem describing them without using the word “cesspool”), but frankly Reddit could use it just as well. The results are compelling: there’s a very strong correlation between comment stupidity and poor spelling/punctuation.
This is brilliant in its simplicity, and it would be interesting to see how this compares to an approach using StupidFilter, which I guess is some kind of binary SVM classifier.
Advantages:
- Far more debuggable and understandable than the output of an SVM or a Bayesian classifier.
- Much quicker to execute!
- Widespread usage would have a positive effect on society as a whole.
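
To make the idea concrete, here is a rough sketch of what a rule-based filter along these lines could look like in Python. This is not the logic of the actual filter; the four rules, the `max_unknown_ratio` threshold, the function names, and the tiny `KNOWN_WORDS` list are all assumptions of mine, picked only for illustration.

```python
import re

# Hypothetical mini-dictionary, for illustration only; a real filter would
# load a full word list.
KNOWN_WORDS = {
    "a", "and", "good", "great", "i", "is", "it", "liked",
    "really", "the", "this", "very", "video", "was",
}


def looks_sloppy(comment, known_words=KNOWN_WORDS, max_unknown_ratio=0.3):
    """Return True if the comment trips any of the (assumed) sloppiness rules."""
    text = comment.strip()
    if not text:
        return True
    # Rule 1: a comment should not start with a lowercase letter.
    if text[0].islower():
        return True
    # Rule 2: a comment should end with terminal punctuation.
    if text[-1] not in ".!?":
        return True
    # Rule 3: runs like "!!!" or "?!" count as incorrect punctuation.
    if re.search(r"[!?]{2,}", text):
        return True
    # Rule 4: too many words missing from the dictionary.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return True
    unknown = sum(1 for word in words if word not in known_words)
    return unknown / len(words) > max_unknown_ratio


def filter_comments(comments):
    """Keep only the comments that pass every rule."""
    return [c for c in comments if not looks_sloppy(c)]


comments = [
    "This video is great.",
    "omg this vid is teh best!!!",
    "Ths vidoe iz grate.",
]
print(filter_comments(comments))  # ['This video is great.']
```

Every rejection traces back to one plainly stated rule, which is where the debuggability comes from: if a comment gets dropped, you can see exactly which check it failed and tune that check on its own.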