<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>An Experiment in Bloggery &#187; spam</title>
	<atom:link href="http://kevin.sb.org/tag/spam/feed/" rel="self" type="application/rss+xml" />
	<link>http://kevin.sb.org</link>
	<description>The occasional view into my life</description>
	<lastBuildDate>Wed, 30 Jun 2010 18:48:33 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>The Resurrection of Typosphere</title>
		<link>http://kevin.sb.org/2007/01/22/the-resurrection-of-typosphere/</link>
		<comments>http://kevin.sb.org/2007/01/22/the-resurrection-of-typosphere/#comments</comments>
		<pubDate>Mon, 22 Jan 2007 19:34:00 +0000</pubDate>
		<dc:creator>Kevin Ballard</dc:creator>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Web]]></category>
		<category><![CDATA[DreamHost]]></category>
		<category><![CDATA[irc]]></category>
		<category><![CDATA[Planet Argon]]></category>
		<category><![CDATA[spam]]></category>
		<category><![CDATA[subversion]]></category>
		<category><![CDATA[trac]]></category>
		<category><![CDATA[typo]]></category>
		<category><![CDATA[Typosphere]]></category>

		<guid isPermaLink="false">http://54f021aa-61c3-4e07-82f3-51e46c740bca</guid>
		<description><![CDATA[After months of absence, Typosphere has returned from the dead! We migrated off of Planet Argon and onto DreamHost, where we should have more control. We also upgraded to Trac 0.10.3 and turned off anonymous editing (users now have to register to file a ticket). This should (hopefully) prevent the issue that led to Typosphere [...]]]></description>
			<content:encoded><![CDATA[<p>After months of absence, <a href="http://www.typosphere.org">Typosphere</a> has returned from the dead!</p>

<p>We migrated off of <a href="http://www.planetargon.com">Planet Argon</a> and onto <a href="http://www.dreamhost.com">DreamHost</a>, where we should have more control.
We also upgraded to <a href="http://trac.edgewall.org">Trac</a> 0.10.3 and turned off anonymous editing (users now have to register to file a ticket).
This should (hopefully) prevent the issue that led to Typosphere dying in the first place.</p>

<div class="highlight">
<p>One important thing to note is that as part of this process, we also moved the Subversion repository.
Unfortunately, the old repository was hosted as an svn:// URI using the typosphere.org domain, which meant
there was no way to preserve this URI (since we can&#8217;t run long-lived background daemons on DreamHost). The
new URI uses HTTP and a new subdomain, so if necessary we can move the repository without moving the website.</p>

<p style="text-align: left">The new repository URL is <a href="http://svn.typosphere.org/typo/trunk">http://svn.typosphere.org/typo/trunk</a>.</p>
</div>

<p><span id="more-136"></span></p>

<p>The issue, as near as I can tell, is that Typosphere started getting spammed massively. At the time none of the
developers (and that includes me) were really paying attention to Typo, as we were busy with other things. Over
about a month, the Typosphere Trac filled with so much spam that, well, it was more spam in one location than
I&#8217;ve ever seen in the rest of my life. This managed to trip a bug in Trac that caused it to start sucking
CPU and RAM, and so Planet Argon turned off Trac for our account.</p>

<p>Some time later, the other developers and I started trying to resurrect Typosphere. Unfortunately, at about
this time the systems administrator for Planet Argon was preparing to leave the company, so any attempts at
contacting him to resolve the issue went unanswered. I eventually called Planet Argon (which is how I learned
that the systems administrator had, in fact, left that very day) and spoke to the new systems administrator. He
agreed to try and fix Trac for us, but after hearing nothing for a few days, I decided it would be better to seek
hosting elsewhere.</p>

<p>Luckily, I had access to a <a href="http://www.dreamhost.com">DreamHost</a> account with plenty of spare bandwidth and disk space,
so we decided to move there. For the most part the migration went smoothly, until I started up Trac and discovered
exactly how much spam was in there.</p>

<p>This problem stumped me for about 2 weeks. I spent several hours trying to clean it by hand one day, and after those
several hours I couldn&#8217;t tell the difference. So, yesterday, I finally sat down to try and solve the problem.</p>

<p>With the help of the fine folks on the <a href="irc://chat.freenode.net/#trac">#trac</a> IRC channel, especially coderanger, I wrote a script which deleted
every single ticket change after a certain timestamp (corresponding to the first spam comment). Unfortunately, a
handful of legitimate changes were probably lost along the way, but there really was no alternative. In any case, this script
worked flawlessly, and Trac was de-spammed. To prevent this from happening in the future, I turned off anonymous
editing and installed a plugin which allows users to register for an account. Hopefully the requirement of registration
will block most spam.</p>
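
<p>For the curious, the cleanup idea can be sketched roughly as follows. This is not the script I actually ran, just a minimal illustration. It assumes a simplified <code>ticket_change</code> table with an integer <code>time</code> column holding seconds since the epoch, and it uses an in-memory SQLite database with made-up rows. The function names, the sample data, and the cutoff value are all hypothetical.</p>

```python
import sqlite3


def delete_changes_since(conn, cutoff):
    """Delete every ticket change recorded at or after `cutoff` (epoch seconds).

    Returns the number of rows removed.
    """
    cur = conn.execute("DELETE FROM ticket_change WHERE time >= ?", (cutoff,))
    conn.commit()
    return cur.rowcount


def demo():
    # Hypothetical, simplified stand-in for the real table (an assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ticket_change ("
        "ticket INTEGER, time INTEGER, author TEXT, "
        "field TEXT, oldvalue TEXT, newvalue TEXT)"
    )
    rows = [
        (1, 1150000000, "kevin", "comment", "", "Fixed in trunk"),
        (1, 1160000000, "anonymous", "comment", "", "first spam comment"),
        (2, 1160000500, "anonymous", "comment", "", "more spam"),
    ]
    conn.executemany("INSERT INTO ticket_change VALUES (?, ?, ?, ?, ?, ?)", rows)

    # The cutoff is the timestamp of the first spam comment (made up here).
    removed = delete_changes_since(conn, 1160000000)
    remaining = conn.execute("SELECT COUNT(*) FROM ticket_change").fetchone()[0]
    return removed, remaining


if __name__ == "__main__":
    print(demo())
```

<p>The obvious downside is the one mentioned above: a cutoff by timestamp cannot tell spam from legitimate edits, so anything real that landed after the cutoff goes too. Back up the database before trying anything like this.</p>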

<p>There was one interesting aspect to this that puzzled me until yesterday. The vast majority of the spam I saw contained
words that I had placed into the blacklist ages ago. I couldn&#8217;t figure out why the spam protection wasn&#8217;t working.
And then yesterday I discovered the reason. The blacklist is kept on a page called BadContent. The first code block
on that page consists of regular expressions, one per line, that each match a blacklisted expression. Unfortunately,
I forgot to mark this page read-only, and one of the random spam attempts happened to target it.
The spammer replaced the content with his own code block containing a vast number of <code>&lt;a href&gt;</code> tags linking
to spammy websites. This replaced the entire blacklist with a bunch of regular expressions that matched only those
<code>&lt;a href&gt;</code> tags, so everything that was previously blacklisted was no longer being blocked, opening
the floodgates for all sorts of spam.</p>
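
<p>To make the failure mode concrete, here is a small sketch of a blacklist that reads one regular expression per line from the first code block of a page. It is not Trac's actual spam filter code. The <code>{{{ }}}</code> block delimiters, the function names, the sample patterns, and the page contents are all assumptions for illustration, and the spammer's link lines are simplified to bare URLs.</p>

```python
import re


def load_patterns(page_text):
    """Compile one regex per non-empty line of the first {{{ ... }}} block."""
    start = page_text.find("{{{")
    end = page_text.find("}}}", start)
    if start == -1 or end == -1:
        return []
    block = page_text[start + 3:end]
    return [re.compile(line.strip()) for line in block.splitlines() if line.strip()]


def is_blocked(text, patterns):
    """Return True if any blacklist pattern matches somewhere in `text`."""
    return any(p.search(text) for p in patterns)


# Hypothetical page contents (assumptions, not the real BadContent page).
ORIGINAL_PAGE = "{{{\ncheap pills\nonline casino\n}}}"
VANDALIZED_PAGE = "{{{\nhttp://spam\\.example/one\nhttp://spam\\.example/two\n}}}"

if __name__ == "__main__":
    comment = "buy cheap pills now"
    print(is_blocked(comment, load_patterns(ORIGINAL_PAGE)))
    print(is_blocked(comment, load_patterns(VANDALIZED_PAGE)))
```

<p>With the original page, the sample comment is caught. Once the page is overwritten, the old patterns are simply gone, so the same comment sails through. That is why the blacklist page itself has to be read-only.</p>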
]]></content:encoded>
			<wfw:commentRss>http://kevin.sb.org/2007/01/22/the-resurrection-of-typosphere/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>