The search terms used by people arriving at our site through search.live.com are just... weird, though. Most are outright vulgar: searches for obscure pornography, celebrity names, drugs, sex aids... Here are a few examples:
/log/trunk Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR
188.8.131.52 http://search.live.com/result.aspx?q=breast+enhancement&mrt=en-us&FORM=LVSP /
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; Win64; x64; SV1)
/browser/media/tutorials Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2;
.NET CLR 1.1.4322)
These hits were to small, obscure pages on our site, such as svn changelogs. Then I noticed the source IP addresses fell within just a few IP ranges, so I ran a whois to see if one ISP connected them all:
arc@sobek ~/work/pysoy $ whois 184.108.40.206
OrgName: Microsoft Corp
Address: One Microsoft Way
NetRange: 220.127.116.11 - 18.104.22.168
You read that correctly. Microsoft, in a desperate attempt to make itself seem more important, or perhaps just to flood free software projects' websites with unwanted traffic, is running bots which act like normal web crawlers. Indeed, over 97% of the hits we got from search.live.com came from Microsoft's own IP subnets. Searching Google, I found that this story had already been covered by others more observant of their logs.
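For anyone who wants to run the same check against their own logs, here is a rough sketch of how a figure like that 97% could be computed. It assumes the access log has already been parsed into (source IP, referrer URL) pairs, since log formats vary. The function name `share_from_nets` and the netblocks are placeholders of mine (RFC 5737 documentation ranges), not the real ranges from the whois output.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical placeholder netblocks (RFC 5737 documentation ranges);
# substitute the ranges whois reports for your own log's source IPs.
SUSPECT_NETS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def share_from_nets(hits, referrer_host, nets):
    """Fraction of hits referred by referrer_host whose source IP lies in nets.

    hits is an iterable of (source_ip, referrer_url) string pairs.
    """
    total = inside = 0
    for ip, referrer in hits:
        # Only count hits that claim to come from the given search site.
        if urlparse(referrer).hostname != referrer_host:
            continue
        total += 1
        addr = ipaddress.ip_address(ip)
        if any(addr in net for net in nets):
            inside += 1
    return inside / total if total else 0.0


# Made-up sample data, purely for illustration.
sample_hits = [
    ("192.0.2.10", "http://search.live.com/result.aspx?q=example"),
    ("198.51.100.7", "http://search.live.com/result.aspx?q=example"),
    ("203.0.113.5", "http://search.live.com/result.aspx?q=example"),
    ("203.0.113.9", "http://www.google.com/search?q=pysoy"),
]
share = share_from_nets(sample_hits, "search.live.com", SUSPECT_NETS)
print(f"{share:.0%} of search.live.com hits came from the suspect netblocks")
```

Comparing the parsed hostname rather than doing a substring match keeps a referrer that merely mentions search.live.com in its query string from being counted.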
In response, I'm adding a special rule to block all future traffic from the offending netblocks, including MICROSOFT 22.214.171.124 - 126.96.36.199 and MICROSOFT-1BLK 188.8.131.52 - 184.108.40.206.
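The rule itself depends on where you do the blocking (firewall, web server, or application), so what follows is only a sketch of the common part: turning whois-style "first - last" ranges into CIDR networks and testing an address against them. The helper names `range_to_cidrs` and `is_blocked` are mine, and the ranges are placeholder documentation addresses rather than the netblocks named above.

```python
import ipaddress


def range_to_cidrs(net_range):
    """Convert a whois-style 'first - last' range into a list of CIDR networks."""
    first, last = (ipaddress.ip_address(part.strip())
                   for part in net_range.split("-"))
    return list(ipaddress.summarize_address_range(first, last))


# Hypothetical placeholder ranges (RFC 5737 documentation addresses);
# replace with the NetRange values from your own whois lookups.
BLOCKED_RANGES = [
    "192.0.2.0 - 192.0.2.255",
    "198.51.100.0 - 198.51.100.127",
]
BLOCKED_NETS = [net for r in BLOCKED_RANGES for net in range_to_cidrs(r)]


def is_blocked(ip, nets=BLOCKED_NETS):
    """Return True if ip falls inside any of the blocked networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in nets)


for net in BLOCKED_NETS:
    print(net)  # CIDR form, ready to paste into a firewall or server config
```

The CIDR strings printed at the end are what most firewalls and web servers expect in a deny rule, while `is_blocked` could serve for filtering inside the application instead.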