Berkeley spam harvest study excerpt
The following is an excerpt from a study done at Berkeley, which explores how spammers "harvest" email addresses. This excerpt was originally put in the wiki to inform a discussion about how to prevent the wiki from being an easy target for spam harvesters. That discussion is over; however, this information is generally useful, so I'm leaving it in the wiki as a resource for web designers. The full study is well worth reading. --Pete 00:04, 15 Dec 2005 (PST)
3. Obscure email addresses
"Obscuring" addresses, by rewriting them in various ways, doesn't offer nearly the same degree of protection against harvesting tools as the more effective methods discussed above. The alternatives of using JavaScript...and offering...contact forms...are both far more effective... On the other hand, obscuring addresses is often much simpler, as this approach typically does not require any programming.
The following are four techniques that you can use to "obscure" email addresses on your web pages:
3.2. "Munge" email addresses
Address "munging" typically consists of substituting words for symbols in the domain name parts of email addresses — "at" (or variations thereof) for the "at sign" and "dot" or "period" for the periods, and the like — as well as adding whitespace between each part of the address. Sometimes extraneous text is also added to the address, which a human reader would ostensibly know to remove.
Using this technique, webmaster@yourhost.berkeley.edu could be munged as:
- webmaster -at- yourhost dot berkeley dot edu
or perhaps:
- webmaster -at- NOSPAM yourhost dot berkeley dot edu
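For pages with many addresses, the rewriting described above could be automated. The following is only a minimal sketch in Python, not something taken from the study: the function name `munge` and its `filler` parameter are hypothetical, and the output format simply copies the two examples above.

```python
def munge(address: str, filler: str = "") -> str:
    """Rewrite user@host.domain as 'user -at- host dot domain'.

    `filler` is optional extraneous text (for example "NOSPAM") that a
    human reader is expected to remove. Both names are illustrative.
    """
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")

    # Replace each period in the domain with the word "dot".
    spoken_domain = " dot ".join(domain.split("."))
    if filler:
        spoken_domain = f"{filler} {spoken_domain}"

    # Replace the at sign with "-at-", separated by whitespace.
    return f"{local} -at- {spoken_domain}"


print(munge("webmaster@yourhost.berkeley.edu"))
print(munge("webmaster@yourhost.berkeley.edu", filler="NOSPAM"))
```

The two calls reproduce the munged forms shown above. Only the domain part is rewritten, matching the description of substituting words for symbols "in the domain name parts" of the address.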
Benefits. This technique is also likely to foil simple-minded harvesting tools.
Drawbacks. This technique places burdens on your site's visitors to understand how to "unmunge" your email addresses [...]
Finally, sophisticated harvesting programs with pattern matching capabilities may still be able to retrieve a fairly high percentage of the addresses trivially obscured in this manner.
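To see why pattern matching defeats this kind of obscuring, it can help to test your own munged addresses against a few regular expressions. The sketch below is an assumption about what such matching might look like, not a description of any real harvesting tool; the function name `unmunge`, the filler word list, and the specific patterns are all hypothetical.

```python
import re

# Hypothetical filler words that a human reader would be expected to remove.
FILLER = re.compile(r"\b(?:NOSPAM|REMOVETHIS)\b", re.IGNORECASE)
# "-at-", "(at)", "[at]" with optional spacing, or a bare " at ".
AT = re.compile(r"\s*[-(\[]\s*at\s*[-)\]]\s*|\s+at\s+", re.IGNORECASE)
# " dot " or " period " between domain labels.
DOT = re.compile(r"\s+(?:dot|period)\s+", re.IGNORECASE)
# A deliberately loose address pattern; the domain needs at least one period.
ADDRESS = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def unmunge(text: str) -> list[str]:
    """Return the addresses recovered from trivially munged text."""
    text = FILLER.sub("", text)
    text = AT.sub("@", text)
    text = DOT.sub(".", text)
    return ADDRESS.findall(text)


print(unmunge("webmaster -at- yourhost dot berkeley dot edu"))
print(unmunge("webmaster -at- NOSPAM yourhost dot berkeley dot edu"))
```

Both example strings from this section come back as webmaster@yourhost.berkeley.edu. If a few lines of substitution can recover an address, the munging scheme should be treated as a deterrent against only the simplest tools.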