Stopping blog spam bots with CAPTCHA

Don Burleson Blog

IT Humor Burleson Consulting

Now that blogs have given millions of people the opportunity to become published authors, protecting against blog spam has become a serious issue. The CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) has been somewhat successful.

Spammers are getting more sophisticated, and all bloggers must be able to protect themselves against spammers who might use their blog to publish illegal or pornographic information. In traditional publishing, editors carefully vet all magazine, newspaper or periodical content, since the publication is the sole responsibility of the publisher.

The next generation of anti-spam verification

As warped letter recognition becomes useless, we will need to come up with a CAPTCHA image that requires non-procedural intelligence. For example, we could require simple math problems as verification that the blog commenter is not a spam bot:
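
A minimal sketch of such a math check, in Python. The function names `make_math_challenge` and `check_answer` are hypothetical and not taken from any CAPTCHA library, and a real blog would render the question as an image rather than plain text, since plain-text arithmetic is trivial for a bot to parse:

```python
import random

def make_math_challenge(rng=None):
    """Return (question_text, expected_answer) for a simple arithmetic check."""
    rng = rng or random.Random()
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    op = rng.choice(["+", "-", "*"])
    answer = {"+": a + b, "-": a - b, "*": a * b}[op]
    return f"What is {a} {op} {b}?", str(answer)

def check_answer(submitted, expected):
    """Compare the commenter's reply to the stored answer, ignoring whitespace."""
    return submitted.strip() == expected

# The blog keeps expected_answer on the server (for example in the session)
# and shows only the question to the commenter.
question, expected_answer = make_math_challenge()
print(question)
```

The expected answer must never be sent to the browser; only the question goes out, and the reply is checked on the server.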

As anti-spam technology becomes more flexible, bloggers can put in whatever verifications they desire, and they can use their anti-spam mechanism to ask subtle questions, using CAPTCHA to test general knowledge of current events:

You can also create anti-spam verifications in CAPTCHA that ensure that your blog comments are in line with your own personal opinions and values:

For technical blogs, you might use the anti-spam CAPTCHA mechanism to ensure that everyone who posts comments has a general knowledge of the computer industry:

For teenagers, you could test their knowledge of popular cartoons and ask trivia questions:

For the kiddies, animal questions are always popular:

The more adventurous bloggers could incorporate sick humor, double entendres and off-color jokes into their blog comment verifications:

Note: Above is just an example of a sick joke, just a joke, no harm intended. Please don't send me e-mails!

The current trend towards more sophisticated anti-spam verification also presents some important commercial opportunities.

Paid advertisers anti-spam verification

In just a few years we may see corporate sponsorship of CAPTCHA anti-spammer verification questions, and this could become a huge industry as advertisers leap at the chance to make consumers type in their company name:

Advertisers currently spend billions of dollars every year on logo recognition and these corporations may spend big dollars to get their products into blog user verification questions:

USA blogger anti-spam responsibilities

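In the USA, Section 230 of the Communications Decency Act (CDA) and the Digital Millennium Copyright Act (DMCA) generally shield online publishers from liability for content posted by other people, although the DMCA protection depends on responding to takedown notices for infringing material.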

However, this is only in the USA. In some countries, hosting defamation or pornography is highly illegal, even if you did not publish it yourself, and you have a duty to police your blog for illegal and infringing content.

These protections are great for the USA, but if your blog is read overseas, beware. Remember, bloggers are publishers, and you have the same responsibilities as any other publisher to ensure that your blog does not contain anything illegal, stolen (copyright violation), libelous, or defamatory.

Let's take a closer look at how you can prevent automated spam robots (bots) from littering your forum or blog with illegal content.

Anti-spam bot CAPTCHA verification

Spam bots have become very sophisticated, and they are now capable of registering user IDs and posting spam on many message boards and forums. To thwart the spam bot programs, many blogs and forums have been forced to incorporate "image-based" verification tools:

But we must remember that the spammers have a huge incentive to pollute your blog with links to the latest 419 scam, penile enlargers and photoshopped porn pictures of famous people.

Even though spam is illegal in many states, you are still not fully absolved of responsibility if your blog or forum is read in a jurisdiction with strict laws against invasion of privacy, copyright infringement, defamation, and pornography. Even in the USA, hosting a blog that has kiddie porn links and photos will guarantee a Federal search warrant, even if you did not publish the offensive content yourself.

The flaws in letter recognition technology

It's only a matter of time before the spammers use the technology that recognizes cursive and distorted writing. For example, the genealogy web site www.ancestry.com has developed software to read and digitize scribbled handwriting.

Today, word verification procedures extract random jpg images from a database table, display them, and match the response to the table's string value:
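
The lookup-and-match procedure described above can be sketched roughly as follows, in Python with an in-memory SQLite database. The table name `captcha`, its columns, the image file names and the helper functions are all assumptions made for illustration; the original does not say how any particular blog engine stores its images:

```python
import sqlite3

def build_demo_table(conn):
    """Create a hypothetical table pairing each image file with its text."""
    conn.execute(
        "CREATE TABLE captcha (id INTEGER PRIMARY KEY, image_file TEXT, answer TEXT)"
    )
    conn.executemany(
        "INSERT INTO captcha (image_file, answer) VALUES (?, ?)",
        [("img001.jpg", "xkqvtz"), ("img002.jpg", "smwjfp"), ("img003.jpg", "hrbnoc")],
    )

def pick_challenge(conn):
    """Pick a random row and return (id, image_file) to display to the commenter."""
    return conn.execute(
        "SELECT id, image_file FROM captcha ORDER BY RANDOM() LIMIT 1"
    ).fetchone()

def verify(conn, challenge_id, response):
    """Match the typed response against the stored string for that image."""
    row = conn.execute(
        "SELECT answer FROM captcha WHERE id = ?", (challenge_id,)
    ).fetchone()
    return row is not None and response.strip().lower() == row[0]

conn = sqlite3.connect(":memory:")
build_demo_table(conn)
challenge_id, image_file = pick_challenge(conn)
print("Show this image to the commenter:", image_file)
```

Only the image and its row id go to the browser; the answer string stays in the table. A fixed pool of images is also the weak point of this scheme, because a spammer who solves each image once can replay the answers.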

The internal structure of anti-spam mechanisms

As we have noted, it's just a matter of time before these letter recognition schemes are cracked by the spam bots, and we will need to incorporate more sophisticated images that are beyond the reach of traditional procedural programming.

This is only the tip of the iceberg. At the end of the day, web bots can become as sophisticated as a human being, and CAPTCHA will have to evolve. As time passes, web publishers will always be challenged to verify that their forum and blog comments are not generated by automated spammers, and it will always be problematic to positively prove that the person publishing on your blog is a real, live human being.
