During my long vacation, one of the tasks I set for myself was to migrate my site off of DreamHost. I’ve been with DreamHost for a while and they’ve mostly provided me with decent service. However, all too frequently I’ve found little things that have bothered me:
- Occasional downtimes while I want to do something. I’m a night person and that’s generally when sysadmins think it’s a good time to take the system down since the users are sleeping. While that’s true in general, it’s not true when your customers are night owls.
- Shell access was often incredibly slow. System loads on the shell servers were often high and so were the latencies, making typing anything on the command line a brutal experience.
- You’re somewhat limited in what you can run. DreamHost picks out the Apache version and whatever modules are installed. They provide a base install of PHP, but (if you’re inclined) you can build your own version and run it using FastCGI. While it’s possible to do this, it’s kind of a pain in the butt.
So I’ve recently switched over to Slicehost. Slicehost does virtual private hosting. You get a virtual machine and are free to run whatever software you want on one of their OS images (they have several Linux flavors to choose from). Anyway, I’ve recently moved all of my unclehulka.com material over to Slicehost, so…uh…welcome! If you’re interested in trying out Slicehost, feel free to use my email address when signing up to give me the referral (rckenned AT yahoo.com).
I’m hoping to use my newfound hosting freedoms to do some more interesting things with the site. I’m currently working on a simple side project that I hope to launch to the site in the coming weeks. In the meantime, if you see anything broken on the site, feel free to mention it in the comments and I’ll take care of it.
I was hit with another spam flood tonight (Akismet did catch it) so I decided to give Bad Behavior another shot. I’m wondering how much traffic coming from MySpace is actually just spammers following links, so it’ll be interesting to see what this does to the referrers.
Anyway, if you have issues accessing my pages or commenting on posts, let me know (hopefully you know my email address by now).
The DreamHost admin panel has been taunting me for many weeks now with a shiny link that would allow me to instantly update my WordPress installation. I’ve been meaning to upgrade to 2.0.2 for some time, so tonight I made a backup of my database and then asked DreamHost to do their thing.
Disappointment. I got email a few minutes later telling me the upgrade failed because DreamHost couldn’t download the necessary files. First, what happened? I was able to download WordPress just fine. Second, why is the upgrade for 2.0.2 pulling from wordpress.org directly? They don’t even make 2.0.2 available directly; they just have a version titled “latest”. Third, why isn’t DreamHost caching the download? I suppose it makes some sense to give WordPress a better idea of how many times it’s been downloaded, but there are better ways of figuring that out.
Anyway, I ended up downloading WordPress on my own and installed it by hand. Let me know if you see any issues.
You probably didn’t notice, but the machine at DreamHost that holds all of my data (including this blog) went down around 8:00am Pacific time this morning.
The server you’re on (yoda) has suffered a hard drive failure on its
system drive at approximately 8am PST. Services and data are being moved
over to new hardware, and it should be back online in under an hour.
It took them quite a while to bring everything back up, but things seem to be back to normal now. The database actually took quite a while to come back. Between this, a recent mail outage and a recent DoS attack…DreamHost has been having quite a lot of problems lately.
I upgraded my WordPress installation to the 2.0 Release Candidate tonight. The admin UI certainly is more…busy now. I haven’t decided if I like that or not yet. What I do like is that I unzip’d the file from wordpress.org, copied in my configs, my plugins and my tweaked theme and swapped out the old WordPress for the new one and it just worked. There’s something to be said for how simple it is to upgrade WordPress.
As always, let me know if you notice any issues.
So now that I’ve been running Akismet for a little while, I’m pretty impressed. It’s just survived its second 30+ spam storm and in all I think it’s only let a single piece of spam through (even that might not have been classified spam, just someone typing gibberish with no hyperlinks).
The UI is working much more to my liking after my tweaks to it.
It’s a glorious day to blog, once again.
I spent about 15 minutes tweaking Akismet tonight so that it wouldn’t do those things that piss me off. In particular, I changed the management page to show all spam messages, not just one message per spammer IP address. I also changed the sort order of the spam messages. It was displaying the newest at the top. I wanted to scroll the list chronologically. Not for any particular reason, just because that’s the way I read dated things.
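For illustration only, the effect of those two tweaks can be sketched as below. This is Python against an in-memory SQLite table standing in for the WordPress comments table, not the plugin's actual PHP. The function name `list_spam_chronologically`, the 150-row limit, and the sample rows are all assumptions made up for the example.

```python
import sqlite3


def list_spam_chronologically(conn, limit=150):
    """Return every held spam comment, oldest first, with no per-IP grouping."""
    return conn.execute(
        "SELECT comment_ID, comment_author_IP, comment_date "
        "FROM comments "
        "WHERE comment_approved = 'spam' "
        "ORDER BY comment_date ASC, comment_ID ASC "
        "LIMIT ?",
        (limit,),
    ).fetchall()


# Hypothetical stand-in for the comments table, with made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments ("
    "comment_ID INTEGER PRIMARY KEY, comment_author_IP TEXT, "
    "comment_date TEXT, comment_approved TEXT)"
)
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?, ?)",
    [
        (1, "10.0.0.1", "2005-11-01 09:00:00", "spam"),
        (2, "10.0.0.1", "2005-11-01 09:05:00", "spam"),  # same IP as comment 1
        (3, "10.0.0.2", "2005-11-01 08:00:00", "spam"),
        (4, "10.0.0.3", "2005-11-01 08:30:00", "1"),     # approved, not spam
    ],
)

rows = list_spam_chronologically(conn)
for comment_id, ip, date in rows:
    print(comment_id, ip, date)
```

Both spam comments from 10.0.0.1 appear in the result, and the list reads top to bottom in the order the spam arrived.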
While looking through the PHP, I noticed a bunch of dead code. In particular, inside the loop where it displays the currently held spam messages, it goes through the trouble of formatting a date from the database for output:
$comment_date = mysql2date(get_settings("date_format") . " @ " . get_settings("time_format"), $comment->comment_date);
Then it never uses the variable. Instead it calls the comment_date() function, which only gives you the date (if it’s not clear from the code above, $comment_date has the date and time). I switched the template to use $comment_date instead so I could see what time the message was posted. Again, not because I care that much but mostly because I hated seeing the information go to waste.
Before I make it sound like Akismet is the worst piece of software ever, there are two things I want to point out:
- The button in the management interface that deletes all current spam is smart. By smart, I mean it will only delete the spam that was captured up to the time when the management page was displayed. This means that if, while you’re viewing the management page, more spam comes in, it won’t be removed from the system when you push the delete all button. That’s huge! Most systems I’ve seen with an “empty” button aren’t that smart and will delete everything in the system even if you haven’t seen it. Next time you’re using a web application and you empty the trash or spam folders, realize that while you were looking at the folder more email may have come in. You will (in most webapps) have deleted those messages without ever having a chance to look at them. Akismet goes out of its way to make sure it doesn’t do that to you.
- The spam filter has been REALLY good so far. It’s caught every piece of spam I’ve seen and I haven’t seen one false positive yet. This is the main reason I keep giving Akismet second (and third) chances. It gets the hard part of this comment spam business right. Even if it can’t handle the management UI to save its life, that’s easy to fix. Fixing a broken spam detection algorithm is much harder.
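The "smart" delete-all behavior from the first point above can be sketched roughly like this. It is a guess at the general technique (remember when the page was rendered, then delete only up to that cutoff), not Akismet's actual code. The table layout, the function name `delete_spam_shown`, and the timestamps are all hypothetical.

```python
import sqlite3


def delete_spam_shown(conn, page_rendered_at):
    """Delete only the spam captured at or before the page render time."""
    cur = conn.execute(
        "DELETE FROM comments "
        "WHERE comment_approved = 'spam' AND comment_date <= ?",
        (page_rendered_at,),
    )
    conn.commit()
    return cur.rowcount


# Hypothetical stand-in for the comments table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments ("
    "comment_ID INTEGER PRIMARY KEY, comment_date TEXT, comment_approved TEXT)"
)
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?)",
    [
        (1, "2005-11-01 09:00:00", "spam"),
        (2, "2005-11-01 09:05:00", "spam"),
    ],
)

# The management page is rendered; the cutoff travels with the delete button.
page_rendered_at = "2005-11-01 09:10:00"

# More spam arrives while the page is still open in the browser.
conn.execute("INSERT INTO comments VALUES (3, '2005-11-01 09:12:00', 'spam')")

deleted = delete_spam_shown(conn, page_rendered_at)
remaining = [row[0] for row in conn.execute("SELECT comment_ID FROM comments")]
print(deleted, remaining)
```

The comment that arrived after the page was rendered survives the delete, so it can still be reviewed on the next page load.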
So rejoice, Akismet is back on the job. Hopefully for good this time.
Well, it’s always too good to be true. Earlier I gave Akismet another chance. Shortly after, I deactivated it once again. This time for trying to deceive me. I can’t tell if it’s actively trying to make me mad or if it’s just bad at math.
In the screenshot above you can see where I’ve circled something in red. It says “There are currently 2 comments identified as spam.” If you look immediately below that, you’ll only see one piece of spam. That was actually an older screenshot. Some time after I saw that, Akismet told me there were currently 8 or 9 pieces of spam and yet it only displayed 3.
I don’t know what gives, but when I had already only placed tentative trust in the software, this doesn’t instill any additional confidence. So, once again, comments are being moderated by yours truly. If I get some spare time I might pop over to the Akismet web site or do some searches and see if I can find out whether other people have seen this issue.
Update: I’ve found why the counts are different. The count displayed comes from this query:
SELECT COUNT(comment_ID) FROM $wpdb->comments WHERE comment_approved = 'spam'
While the list of spam comments comes from this query:
SELECT *, COUNT(*) AS ccount FROM $wpdb->comments WHERE comment_approved = 'spam' GROUP BY comment_author_IP ORDER BY comment_date DESC LIMIT 150
That “GROUP BY” is the big differentiator. The spam count query does an absolute count of all of the individual pieces of spam in the system. The comment list query (if I’m reading it correctly) groups all of the comments by the IP of the comment author. So if a spammer sends me two pieces of comment spam from the same IP, it shows up as +2 in the count but only one of the comments shows up in the list.
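The mismatch is easy to reproduce outside of WordPress. The sketch below uses Python with an in-memory SQLite table as a stand-in for the comments table. The table layout and sample rows are made up, and the listing query is trimmed to the columns needed to show the effect.

```python
import sqlite3

# Hypothetical stand-in for the comments table, with made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments ("
    "comment_ID INTEGER PRIMARY KEY, comment_author_IP TEXT, "
    "comment_approved TEXT)"
)
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?)",
    [
        (1, "10.0.0.1", "spam"),
        (2, "10.0.0.1", "spam"),  # second spam from the same IP
        (3, "10.0.0.2", "spam"),
    ],
)

# Same shape as the count query: one per individual spam comment.
total = conn.execute(
    "SELECT COUNT(comment_ID) FROM comments WHERE comment_approved = 'spam'"
).fetchone()[0]

# Same shape as the listing query: one row per author IP.
listed = conn.execute(
    "SELECT comment_author_IP, COUNT(*) AS ccount FROM comments "
    "WHERE comment_approved = 'spam' GROUP BY comment_author_IP"
).fetchall()

print("count shown in the header:", total)
print("rows shown in the list:", len(listed))
```

With two of the three spam comments sharing an IP, the header count is 3 while the list has only 2 rows. The per-group `ccount` values still add up to the header count, which is the only place the missing comment shows up.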
It’s probably still doing what I want it to do, but it’s confusing as hell. I’m not sure why on earth you would ever design it this way. Looks like it’s time to use the “Contact Us” link on the Akismet home page.
A while ago, I had to give up on Akismet because of some issues I was having. Today I reenabled it and things have gone pretty well so far. It’s let through two legitimate comments and it’s stopped two pieces of spam.
I won’t really declare it a victory until it stops the next quick spam flood (my spam seems to come in waves for the most part). But this is definitely looking much better. It’s looking so good that I went ahead and deleted all of the old spam I still had in the comments table (I was saving it in case I could use it for some Bayesian filtering).
I already don’t like Akismet so I’ve deactivated it. I didn’t even get to the point where it misbehaved in handling a comment. The admin panel is what pisses me off. It tells me there are 440 pieces of spam it’s flagged when it hasn’t been active long enough. As it turns out, anything marked “spam” in the comments database is fair game. And Akismet tells me that after 15 days it’s going to trash anything marked “spam”. I actually prefer to keep that in the database for historical reasons.
*sigh* Another one bites the dust. I wonder how difficult it would be to hack out the bits that just do an “is it spam” check from the plugin. That’s all I’m really interested in.