Not that hard

Sure, machine intelligence is hard, but not this hard.

Apparently, McCain's website has comment filters to prevent people posting links to external websites (what if the external site contains pornography or Obama-hagiography?). The filters don't let in any words that contain "net" or "com". So you can't say "planet" or "commercial" in the blog's comments; you have to say "plan et" or "co mmercial".
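To see why this goes wrong, here is a minimal sketch of what such a filter might look like. This is a guess based only on the behavior described above, not the site's actual code; the names BLOCKED_SUBSTRINGS and naive_filter_rejects are made up for illustration.

```python
# Hypothetical reconstruction of the filter described above.
# The blocked substrings come from the reported behavior; everything
# else (names, word splitting, case folding) is an assumption.
BLOCKED_SUBSTRINGS = ("net", "com")


def naive_filter_rejects(comment: str) -> bool:
    """Return True if any whitespace-separated word contains a blocked substring."""
    return any(
        blocked in word.lower()
        for word in comment.split()
        for blocked in BLOCKED_SUBSTRINGS
    )


if __name__ == "__main__":
    for text in ("planet or commercial", "plan et or co mmercial"):
        print(repr(text), "rejected:", naive_filter_rejects(text))
```

A substring test like this cannot tell "planet" from "example.net", which is the whole problem.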

Several ridiculous things about the scenario:
  1. Why are they writing their own comment moderation software? What is wrong with using one of the half-dozen blog-running software products out there? A party of business should know what specialized knowledge is, right?
  2. There is no need to "discover" embedded URLs in text. If it doesn't have http://, don't make it a link. How hard is that? Few people are going to copy and paste some random words into the URL text field of their browser.
  3. Finding URLs in text is done all over the place. Every mail program in the world does it. Surely, there's a standard algorithm or routine out there? Oh, yes, there is (CPAN).
Finding URLs in text is a machine intelligence problem. And these can be counterintuitively hard because the human eye is better at recognizing patterns than a computer is. But this particular problem could have been avoided so easily that it makes me wonder if McCain has high-school kids (or worse: political hacks) running his website.
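
For what it's worth, point 2 above fits in a dozen lines. The following is a rough sketch, not a production linkifier: the function name linkify and the regular expression are my own assumptions, and I also accept https:// alongside http://. It only turns text into a link when it carries an explicit scheme, and escapes everything else so ordinary words pass through untouched.

```python
import html
import re

# Assumption: a "URL" is an explicit http:// or https:// scheme followed by
# any run of characters that is not whitespace, an angle bracket, or a quote.
URL_PATTERN = re.compile(r"https?://[^\s<>\"]+")


def linkify(comment: str) -> str:
    """Escape a comment for HTML and wrap explicit http(s) URLs in <a> tags."""
    parts = []
    last = 0
    for match in URL_PATTERN.finditer(comment):
        # Escape the plain text between the previous URL and this one.
        parts.append(html.escape(comment[last:match.start()]))
        url = html.escape(match.group(0), quote=True)
        parts.append('<a href="{0}">{0}</a>'.format(url))
        last = match.end()
    # Escape whatever follows the final URL (or the whole comment if none).
    parts.append(html.escape(comment[last:]))
    return "".join(parts)


if __name__ == "__main__":
    print(linkify("planet or commercial"))
    print(linkify("see http://example.com now"))
```

One known rough edge: a URL followed directly by a period or comma will swallow that punctuation. That is the kind of detail the existing libraries have already sorted out, which is rather the point.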

  1. Maybe McCain hired the former city manager of Tuttle, OK:

    It's L-i-n-u-x, that is an Operating System

    (Credit Terry S. for pointing this story out.)