This guy sucks!

I first noticed this on reddit today, but eventually saw it in some pingbacks to my blog (at tekkie.wordpress.com). Somebody has a “blog” hosted in Germany that is ripping off other people’s blog posts and posting them as his own at www.indquery.com. He’s copied a bunch of my posts. In one case I saw he copied a Wikipedia entry. So far, of my posts, this guy’s copied in full (by title):

Straying into dangerous territory
Squeak is like an operating system
Redefining Computing, Part 2
Redefining Computing, Part 1
The real computer revolution
Having fun playing with fire,…er C
Seaside hosting redux
Coding like writing
Squeak/anti-virus problem solved

The blogger also has ads on his site, so maybe he’s making some money off of what I wrote. He never asked me for permission to reproduce my stuff. In any case, he sucks. My guess is this guy’s looking for what might be popular articles for wherever he is, and trying to attract traffic, taking all the credit for it.

To tell you the truth, I don’t mind if other people copy what I post, but at least have the decency to give me credit, and a link to this blog. That’s just basic etiquette. I do tend to put a lot of work into what I write, so I want to at least get credit for it. I’ve been doing it in the interest of sharing knowledge, not trying to profit from it in an immediately tangible way (though that thought has crossed my mind from time to time).

Anybody have any ideas about administrative action I can take to deal with this leech?

I guess I should start doing this now: Copyright 2007, Mark Miller, https://tekkie.wordpress.com. Sheesh!

Here’s the Netcraft info I found on the “blog”:

site: http://www.indquery.com
domain: indquery.com
IP Address: 72.18.203.18
Country: DE (Germany)
Domain Registry: Unknown
Organization: Unknown
Netblock owner: RadixDirect.com
nameserver: ns1.cpxadvertising.com
DNS Admin: cpxclick@gmail.com
Reverse DNS: serverpoint.com
Nameserver Organization: Unknown
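
The registry field above came back as Unknown. A WHOIS query is one way to fill that in, and it is the kind of lookup a commenter below did through whois.net. Here is a minimal sketch, assuming Python and the plain WHOIS protocol (RFC 3912) on TCP port 43. The function names and the sample response are made up for illustration, and the default server (the .com registry's WHOIS host) is an assumption, since other top-level domains use other servers.

```python
import socket


def whois_query(domain, server="whois.verisign-grs.com", port=43, timeout=10.0):
    """Send one WHOIS query line and read until the server closes the connection.

    The default server is an assumption that only makes sense for .com names.
    """
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def extract_field(whois_text, field):
    """Return the value of the first 'Field: value' line, or None if absent."""
    prefix = field.lower() + ":"
    for line in whois_text.splitlines():
        stripped = line.strip()
        if stripped.lower().startswith(prefix):
            return stripped[len(prefix):].strip()
    return None


# Hypothetical response text, only here to show the parsing step offline.
SAMPLE_RESPONSE = """\
   Domain Name: EXAMPLE.COM
   Registrar: Example Registrar, Inc.
   Name Server: NS1.EXAMPLE.COM
"""

if __name__ == "__main__":
    print(extract_field(SAMPLE_RESPONSE, "Registrar"))
```

With network access, `extract_field(whois_query("example.com"), "Registrar")` would do the live lookup. Keep in mind the registrar only tells you who sold the domain name, not who hosts the site.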

16 thoughts on “This guy sucks!”

  1. Mark –

    He is not trying to get credit, he is acting as a honeypot for search engines, to get those ads in front of faces and get click-through revenue. This is why, like Steve Ballmer, I hate Google. Google has single-handedly destroyed much of the value on the Web. Every time I search for something, 90% of the results I get are sites just like this. Most of them are old archives of mailing lists or Usenet, which are loaded with good keywords for non-adult content and written in a natural voice, making them great targets for search engines. But then I click the search result, and 2/3rds of the results are the exact same item… all with “Ads by Google” on them. Google has enabled these scum to exist (note that they never use Yahoo! or MSN branded ads, only Google’s!). And Google does NOTHING to prevent it. I work with a couple of companies to help with their search engine marketing. We had to cancel our work with Google AdSense (the program that lets people put Google ads on their site), because we were seeing our budget chewed up in a matter of hours, typically before the US business day started, with all of these referral clicks from these ad sites. Checking out the sites showed that none of them were legit, and none of those ads resulted in what the customer defined as a “conversion”. In other words, nearly $20/day was being dumped down the drain towards fraud. As soon as we stopped using AdSense and only used AdWords (sponsored results on the Google search pages themselves), we were able to drop the budget by 50%, saw the conversions go up to a normal number, and watched the daily budget last throughout the day.

    I have not yet found an effective defense against this, but I would recommend that you at least report the site to the ad provider (Google). For a great article on the disgusting relationship between these jerks and the ad vendors, check out this article from Microsoft Research:
    http://research.microsoft.com/research/pubs/view.aspx?type=Technical%20Report&id=1269&0sr=a

    J.Ja

  2. Okay well this helps fill in the picture some more. I knew whoever did this was out for money, but I didn’t know about how they were trying to game the search optimization techniques. I figured they were just trying to steal some traffic from my blog, and the blogs of others, and make some money while they were at it.

    Practically since I started this blog I’ve seen comments show up that get through the spam filter, and they seem to have a message that is kind of pertinent, but then if I check the web address they put in for themselves it’ll be to some commercial site. I’ve gotten some trackbacks like this, too. It used to be obvious. It would be a page with some bogus text on it, and a bunch of ads. As time passed they got more sneaky. Sometimes I couldn’t tell at first blush that it wasn’t another blog. It really took some inspection to realize they were just a honeypot.

    I would delete the comment if it wasn’t pertinent, but otherwise I’d just delete the link. Hence the comments policy I posted earlier.

    I knew this had something to do with trying to get the search engines to link to them, and I had heard the term “click fraud”, but I wasn’t clear on how it worked. What I was dealing with before felt manageable. I just had to be vigilant.

    I thought I’d heard Google was going to try to deal with click fraud. Or maybe that had to do with sites that put malware on people’s computers.

    What was offensive about this case is it felt like somebody had stolen something from me. Whoever copied these articles deliberately cut out the authors’ names, and when they were posted. They just captured the titles and the article text, though the links within the copied articles still point back to articles I’ve posted here. My blog stats show I’ve actually gotten some traffic from the copied articles.

    As I indicated earlier, I first noticed this on reddit. One of my articles reappeared in the rankings. It was rather high up, too, so it was probably getting about 1,000 hits. I thought, “Huh. That’s odd.” I looked at the comments for it on reddit, and someone complained it was a plagiarized copy of it, and was kind enough to refer to the original article here (whoever that was, thank you!).

    I know you’re trying to say there wasn’t even an intent to steal, just an attempt to game the system, but they deliberately stripped away any notions of originality as well. That’s why it felt like theft.

    I will complain to Google about this, for what it’s worth.

  3. > My blog stats show I’ve actually gotten some traffic from the copied articles

    he he, somehow I found your blog from those other posts – I was wondering why there wasn’t the opportunity to comment 🙂

    every cloud has a silver lining, but yeh, he sucks, I agree

  4. Hi, Bill. Welcome. 🙂

    Yeah, like I said, I would mind it less if my stuff was copied with the original authorship information kept intact. One of the goals I had with this blog is spreading ideas around. Still, linking to the original articles is preferable.

    I can kind of sympathize with those who supported the earlier versions of the GPL license in that they felt they could be assured that the work they donated for free to a project wouldn’t end up contributing directly to someone else’s bottom line, though people can indirectly profit from it. It’s more complicated than that, but I’ve heard that’s the basic attraction to it for some.

  5. @Justin:

    Just had a bit of an epiphany about this situation. You’re more right than I thought. I happened to be clicking around a couple articles on my blog and I noticed that my name and date were displayed on posts when I looked at the listing of articles. But when I clicked in to an article (say, to look at the comments) the name and date information disappeared from the post. So this could explain why these things weren’t captured for display on the imposter site, assuming whoever did it was just copying text indiscriminately.

    I really never noticed this before. Makes me wonder if there’s a setting for this, or maybe it has to do with the stylesheet. Hmm. Something to investigate.

  6. Mark –

    I would be willing to bet ten to one that he is feeding that through an RSS feed, and that since the feed keeps the name and date in separate fields from the main content, his scraper is deliberately ignoring those fields. Why manually copy/paste, when WordPress’s top posts list is a gold mine of quality content? And since those top post listings get linked to more often than other posts (statistically, since they have more exposure), he is piggybacking on whatever link love your page gets because his stuff seems similar.

    Funny how it works, isn’t it?

    J.Ja

  7. @Justin:

    What you say makes sense. Hadn’t thought about the RSS feed, though I agree. With the exception of a couple of the posts he copied, all of them have gotten a significant amount of hits after I posted them, usually because someone posts references to them on Reddit or Technorati. I agree RSS is one way, but he could also be crawling some tagging lists. I don’t think it’s just through Top Posts on WordPress. That would be easy, but it wouldn’t explain all of the articles he copied. I know that some of them didn’t make Top Posts.

    For example, one of the ones he duplicated was a rather obscure post I wrote about Squeak having a conflict with a particular piece of anti-virus software. That got hardly any hits, though I did get some foreign trackbacks.

    Like I said in my post, I imagine that all of the articles he duplicated had some sort of local interest. The server is in Germany, after all. Squeak is more popular in Europe than it is here.

  8. I also noticed the bastard after he copied a post that pinged my blog. According to whois, the bastard’s registered his domain through GoDaddy.com.

    I will contact GoDaddy and Google, so please do so as well, if possible. What annoys the hell out of me is that searching for my post titles on Google yields his website before mine. Shameless douchebaggery.

  9. I reported the website to google adsense as a copyright violator. You can do the same by visiting the website and clicking on the “Ads by google” link inside the ad box. This will take you to a feedback page, where you’ll find a link called ‘Also report a violation’, which will open up a list of checkboxes including ‘This website is using my copyrighted content’.

  10. @Muhammad:

    Thanks so much for the information! I have lodged my complaint with Google using the steps you laid out. I was wondering how I’d do this. I tried to complain by just going to google.com, but I couldn’t find anything direct like what you found. I was able to find their corporate postal address though, and was going to write them a letter. Now I don’t have to.

    I’ll also lodge a complaint with GoDaddy.com.

    I’m curious, how did you find out where the domain was registered? I know about Netcraft, but it doesn’t give me this information.

    Thanks again.

  11. You’re welcome. I found how to report this to google by searching–google’s website doesn’t provide any info as you’ve seen.

    As for GoDaddy, I found it out through whois.net. Alas, GoDaddy replied saying they only registered the domain, and aren’t hosting it, and that we should contact the domain host instead.

  12. @Muhammad:

    Hmm. I got a response from GoDaddy, asking me to confirm some information. They gave me a web page to go to for this. I haven’t followed up yet.

    Google responded asking me to send information in written form to their corporate address. They said they would only take action if the guy violated the DMCA (Digital Millennium Copyright Act), which as far as I’m concerned is only an American law, not international. So I’m losing hope that they’ll do anything. I’m still going to try to make a go of it. Things have gotten busier for me lately.

  13. What I do is click on the “Gooogle” link and then comment on the ad. There is a place where you can mark the site as violating Adsense policies. Under reasons, tell why, with the URI of both your page (that he copied) and his page (the copy of yours).

    I do not know whether anyone ever looks into those. It feels better to have contacted the specific part of Google that is involved.

  14. Pingback: Update on plagiarism situation « Tekkie
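
Comment 6 above guesses that the copies were pulled from the RSS feed, where the author and date sit in separate fields from the title and body. Here is a minimal sketch of that idea, assuming Python and a feed shaped like a typical WordPress RSS 2.0 feed (`dc:creator` for the author, `content:encoded` for the body). The sample item and the function names are hypothetical, and nothing here is known about how the actual scraper works.

```python
import xml.etree.ElementTree as ET

# Namespaces commonly used by WordPress feeds for author and full body.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "content": "http://purl.org/rss/1.0/modules/content/",
}

# Hypothetical feed with one item; all values are placeholders.
SAMPLE_FEED = """<rss version="2.0"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Example blog</title>
    <item>
      <title>Example post</title>
      <link>https://example.com/example-post/</link>
      <pubDate>Mon, 01 Jan 2007 12:00:00 +0000</pubDate>
      <dc:creator>Example Author</dc:creator>
      <content:encoded><![CDATA[<p>Body of the post.</p>]]></content:encoded>
    </item>
  </channel>
</rss>
"""


def parse_items(feed_xml):
    """Return every field of each feed item, attribution included."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "date": item.findtext("pubDate", default=""),
            "author": item.findtext("dc:creator", default="", namespaces=NS),
            "body": item.findtext("content:encoded", default="", namespaces=NS),
        })
    return items


def scrape_without_attribution(feed_xml):
    """Keep only title and body, the way a careless or deliberate scraper might."""
    return [{"title": i["title"], "body": i["body"]} for i in parse_items(feed_xml)]


if __name__ == "__main__":
    print(parse_items(SAMPLE_FEED)[0]["author"])
    print(scrape_without_attribution(SAMPLE_FEED)[0])
```

Because the body is its own field, any links inside it survive the copy while the author and date are dropped. That would match what was seen on the copied posts.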
