Spam egg sausage and spam

Radcliffe Infirmary

As time goes by and Google indexes more and more of my content, I get more spam of every variety. I get spam emails, spam comments on the blog, and spam added to the wiki. Of the three, the email spam is the most common, but also the most easily dealt with. It has existed for so long that good systems exist for dealing with it, whether based on Bayesian reasoning or on group filtering processes. The former are largely centered around word usage. If an email contains the word ‘Viagra’ the chances of it being spam are high. If it includes the string of characters ‘V1agr4!!!’ it is virtually certain to be spam. The latter are based on user reporting. Most spam isn’t very original. As such, if Gmail has 1000 people report that a particular message is spam, it can pretty reliably block it for everybody else.
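To make the word-usage idea a little more concrete, here is a minimal sketch in Python of a naive Bayes style score. Everything in it is an assumption made for illustration: the function names, the four made-up training messages, and the add-one smoothing. It is not how Gmail or any real filter works, just the general shape of the technique.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word-like tokens."""
    return re.findall(r"[a-z0-9!]+", text.lower())

def train(messages):
    """Count token frequencies in spam and in legitimate mail.

    `messages` is a list of (text, is_spam) pairs (hypothetical data).
    """
    counts = {True: Counter(), False: Counter()}
    totals = {True: 0, False: 0}
    for text, is_spam in messages:
        counts[is_spam].update(tokenize(text))
        totals[is_spam] += 1
    return counts, totals

def spam_probability(text, counts, totals):
    """Estimate P(spam | text) using naive Bayes with add-one smoothing."""
    vocab = set(counts[True]) | set(counts[False])
    log_score = {}
    for label in (True, False):
        n_tokens = sum(counts[label].values())
        # Prior: the share of training messages carrying this label.
        score = math.log(totals[label] / (totals[True] + totals[False]))
        for token in tokenize(text):
            # Smoothing keeps an unseen token from zeroing out the product.
            score += math.log((counts[label][token] + 1) / (n_tokens + len(vocab) + 1))
        log_score[label] = score
    # Turn the two log scores into a probability, clamped to avoid overflow.
    diff = max(min(log_score[False] - log_score[True], 700), -700)
    return 1 / (1 + math.exp(diff))

# Invented training set, purely for illustration.
training = [
    ("Cheap Viagra available now", True),
    ("V1agr4!!! best price", True),
    ("Thesis draft attached for comments", False),
    ("Lunch at college tomorrow?", False),
]
counts, totals = train(training)
print(spam_probability("Viagra best price", counts, totals))
```

A message made up of words seen mostly in spam scores above one half, and one made up of words seen mostly in ordinary mail scores below it. The user-reporting approach needs no statistics at all: it amounts to counting reports per message and blocking past some threshold.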

I cannot get too far into how this blog’s anti-spam system works. This is because automated systems seem to have become capable of determining which system or combination of systems a site is using and then launching an appropriate attack. Suffice it to say that the blog uses a variant of both approaches above, plus one more special thing. Since the system was implemented, it has dealt with spam from 9188 different IP addresses. Security through obscurity may not be intelligent or robust in many circumstances, but it works well enough when you are somewhat better defended than most sites, not of much value to attack, and surrounded by sites with much worse systems.

The wiki is the most vulnerable, precisely because the intended purpose of a wiki requires easy editing. Given that so few users contribute to mine, the best solution might be to lock it down so that only those with approved accounts can access it.

One possible lesson to be drawn from this is that technology eventually evolves the ability to deal with abuse. The older the system being attacked is, the more likely a sensible and effective set of countermeasures will be developed. Alternatively, it is possible that the more open approaches used by blogs and wikis are fundamentally more vulnerable to abuse.

Only time will tell.

Obviousness and patents

This week, the US Supreme Court issued a ruling related to the ‘obviousness’ test in patent filing. The case – KSR Int’l Co. v. Teleflex Inc. (PDF) – hinged on whether an automatic adjustment device for an accelerator pedal created by KSR infringed upon the patents of Teleflex. KSR argued that the combination of technologies was obvious, and that Teleflex could not claim royalties.

In order to maintain a fair and beneficial system, the condition that patents cover non-obvious innovations is highly important. The whole reason for granting patents is to foster innovation by granting temporary monopolies to innovators. Patents are meant to include enough information to allow a skilled practitioner to actually make the thing being patented. Under this system, inventors are meant to be willing to disclose the nature of what they have accomplished so that it might serve to aid the investigations of others. In exchange, they get legal rights over their invention for a defined period of time. This trade-off hardly makes sense when companies are permitted to patent trivial innovations, such as the much ridiculed patent awarded to Amazon.com for ‘one click shopping.’

Recently, there have been a good number of cases where the patent system is accomplishing something quite unlike this ideal. ‘Patent trolls’ acquire patents of a broad and obvious kind, then wait for another company to release a successful product that arguably infringes on them. More often than not, the objective is simply to receive some kind of payment in return for ending the legal hassle. Of course, this interferes with the processes of innovation, as well as undermining the general credibility of the patent system. RIM and Vonage have both recently been targeted by such suits.

It seems sensible that patent offices should be more aggressive in their interpretations of what it means for an invention to be ‘novel’ and ‘non-obvious.’ Doing so would reduce the number of cases in which someone is unfairly granted rights over an idea that many other people have likely come up with, but not bothered to go through the process of trying to patent. It would also reduce the danger of patent trolling, particularly if the courts recognize that such behaviour can be predatory, and that the patent system ultimately exists to serve the public good.

PS. Slashdot has commented on the Supreme Court ruling. Most of these entries are also relevant.

Browser considerations

This post, which was linked to on Tony’s blog, got me thinking about web browser choice. All I want is something that displays pages properly without eating too much RAM. Good RSS handling is an advantage. I am likely to stick with Firefox for now, but it is good to assess the state of competition every once in a while.


Important OS X update

Mac users, make sure you get the latest security patch from Apple. It covers some distinct vulnerabilities in terms of wireless networking, as well as patching several dozen general system and application vulnerabilities. You can read more about it here.

To get it, just click the Apple icon in the upper-left corner of the screen and then choose ‘Software Update’ from the menu that comes down. While being on a Mac does make you safer, it certainly does not make you invulnerable.

When only the high-carbon option works

After an agonizing two hours of trying[1] to book Eurostar tickets, I gave up and got a flight to Paris from EasyJet. I am not sure if the booking problems were Eurostar’s or NatWest’s fault. If my bank is to blame, they have sunk even deeper in my estimation. If it was the train company, they lost two customers because their web interface is unreliable. It failed at every possible stage: listing train times, entering payment information, and processing my credit card.

I am leaving on the afternoon of April 26th (three days after my thesis submission) and returning on April 30th (a few weeks before exams). It would be nice to go for longer, but the middle of an Oxford term is not the time for an extended foreign jaunt.

[1] Over and over and over again, without success.

Serial numbers and used goods

Quad in St. Cross College, Oxford

One of the great things about the internet is the ability to deal with information that is far too diffuse and voluminous to be processed in other ways. Indeed, that is the principal way in which modern computing qualitatively changes what we are able to do, as opposed to altering the rate at which we can complete a particular task.

Given those characteristics, it surprises me that nobody has come up with a site that catalogs serial numbers for all the kinds of products that include them: from bicycles to cameras to mobile phones. Such a site would allow users to enter that information when they purchased a product. It would then be on hand for warranty claims and in the event of loss or theft. People purchasing such items online, or in used goods shops, could check the database to ensure that the products they are buying are not listed as stolen. As with eBay, it is much more efficient to have all these numbers stored in a single place than to have numerous separate databases. The chances of a person trawling through many sites are low, but one well-organized site could get masses of traffic. (See: network effect)

You could even imagine a system where online retailers like eBay are integrated with such a site. The listing for a camera would thus include a serial number linked to an entry in the database. If you bought the item, then received one with a different serial number from the one listed, you would be entitled to lodge a complaint and the seller would get flagged as a potential fraudster. I have personally avoided buying photographic equipment from eBay because I fear that a lot of it may be stolen. Having some simple protections like these in place would make me feel a lot better about it.
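As a rough sketch of what the core of such a site might look like, here is a toy in-memory registry in Python. The class and method names are hypothetical, and a real service would need accounts, proof of ownership, and a proper database. The point is only that the two checks described above (is this item listed as stolen, and does the item received match the listing) are simple lookups once all the numbers live in one place.

```python
from dataclasses import dataclass, field

@dataclass
class Registry:
    """Toy registry keyed by (product type, serial number)."""
    owners: dict = field(default_factory=dict)
    stolen: set = field(default_factory=set)

    @staticmethod
    def _key(product, serial):
        # Normalise case and whitespace so ' ab-123' and 'AB-123' match.
        return (product.strip().lower(), serial.strip().upper())

    def register(self, product, serial, owner):
        """Record a serial number at the time of purchase."""
        self.owners[self._key(product, serial)] = owner

    def report_stolen(self, product, serial):
        """Flag an item as lost or stolen."""
        self.stolen.add(self._key(product, serial))

    def is_stolen(self, product, serial):
        """Check a prospective purchase against the stolen list."""
        return self._key(product, serial) in self.stolen

    def matches_listing(self, product, listed_serial, received_serial):
        """True if the item received carries the serial number advertised."""
        return self._key(product, listed_serial) == self._key(product, received_serial)

# Example with made-up serial numbers.
registry = Registry()
registry.register("camera", "ab-123", "alice")
registry.report_stolen("Camera", " AB-123")
print(registry.is_stolen("camera", "ab-123"))
```

A retailer integrated with the registry could call something like `is_stolen` before accepting a listing, and a buyer could use something like `matches_listing` as the basis for a complaint.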

PS. For an example of an existing but limited serial number listing, see the stolen equipment registry over at Photo.net. It is unlikely that someone buying a cheap digital camera online will look at that (I knew it existed and it took me some searching around to find the URL), but perhaps someone buying an expensive tilt-shift lens for a medium format camera system will.

Waiting for SkypeIn in Canada

Canadian telecom regulators should hurry up and allow the allocation of SkypeIn numbers. The deal is that you pay about $50 a year to Skype for a phone number in an area code of your choice. People can then call it from within that area, as though it were a free local call. They would actually be calling a computer that forwards the call to your Skype account, on whatever computer or Skype-enabled phone you are using, anywhere in the world. You can also have it automatically redirect calls to another normal phone, though there is a per-minute charge for that.

The system seems really good because people in your designated area can call you without worrying about long distance charges. Also, people who don’t find the whole Skype system comprehensible can call you without any knowledge of how it all works. Supposedly, it is unavailable in Canada because it is incompatible with 911, but this doesn’t make a great deal of sense, since SkypeIn numbers receive calls, rather than initiate them.

With a combination of SkypeIn and Skype Unlimited (which costs $30 a year and includes unlimited calling to landlines), I could speak an unlimited amount to friends in North America for about $80 a year, with benefits such as being able to use any internet cafe that has Skype installed as though it were my home phone. I just need to wait for Canadian regulators to permit the final link in the chain.

PS. I realize that I could buy a SkypeIn number for New York or Seattle, which would be very cheap for friends in Canada to call. Losing the convenience of it being a local call, for them, is the reason I have not done so thus far, though you can attach SkypeIn numbers in up to ten area codes to a single Skype account.

Information saturation

Mansfield College, Oxford

There is no time when it is easier to get distracted from a task than when it is something long, complex, and challenging. My room is never cleaner than in the periods before exams, nor my emails so well managed as at times when I have some massive research project to complete. The number of possibilities on the web, from blogging to instant messaging, compounds the danger. So too do the special stresses involved in thesis writing.

This month’s issue of The Walrus includes an article called “Driven to Distraction” that addresses the issue of how many such temptations exist in a digital age. I subscribe to 127 different information feeds, most of which get updated more than once a day, and some of which are regularly updated more than twenty times a day. Beyond that, I have email, the manual screening of spam from blog comments and wiki pages, Facebook, constantly updating access logs for various sites, text messages on my cell phone, and news websites that I track.

Just as I have frequently used music and immersion in a laptop-free coffee shop environment to try to get some reading done, I am going to try to reduce the frequency with which I check my various feeds: staying logged out of Bloglines and email and checking each only a few times a day (or, failing that, only every couple of hours rather than virtually constantly). Maybe then I will be able to finish hammering out a new version of chapter two, as well as drafts of chapters three and four, before Dr. Hurrell departs for Brazil, leaving me to finish my thesis entirely on my own.