404 Tech Support

My Brush with Blog Scraping

While many people have resigned themselves to spam as a fact of life (they will get it, it will fill their inboxes, they will waste valuable time each and every day deleting it), having someone steal your written words to make their spam look legitimate is much worse, much more personal, and nothing you should resign yourself to.

The DMCA (Digital Millennium Copyright Act) is in place to protect people against the misuse of their original works by others. The particular copyright violation that I plan to discuss and provide information on is blog scraping. While blog scraping is primarily of concern to bloggers, it also has adverse effects on everyday users and readers. Blog scraping, or feed scraping, is the act of a site subscribing to your RSS feed and reposting your content. Most often, and this is the point of blog scraping, your content will now be surrounded with ads. By stealing your content, the scammer behind this attempts to make their blog look legitimate and to increase the range of people they can reach through search engine results.

This hurts the end user, a blog reader, because it fails to establish community. If someone performs a web search and gets a result from a site pilfering your content, they may be able to read what they need and move on. That’s not such a bad thing. In fact, that’s what I hope I am able to provide to the many people who end up here searching for help with a technical problem. I hope they are able to find their solution and get on with their lives. The problem arises when my post only tangentially solves your problem and you want to ask a question in the comments to see if I might be able to help. If you’re on my blog (http://www.404techsupport.com), then feel free, and I will reply to any comments and try to help. However, if you’re on a scammer’s site mirroring my content with ads all over the place, then:
A) If you post a comment, you won’t get a reply from me and your problem won’t get assistance. Even sending a ‘Thanks’ encourages a blogger to continue writing, but instead you would be thanking a scammer for stealing content.
B) If you click on a link or an ad on the scammer’s site, you are allowing them to profit through advertisement and stolen content.
C) My (blog’s) reputation will suffer because there are ads within the content, which make pages harder to read and may link to inappropriate or unrelated sites.
D) You do not get the community from related articles and other readers with similar interests who may post in comments to assist further.

Unfortunately, my blog (http://www.404techsupport.com) has been scraped. This means my content, including many articles that I feel proud of and would refer people to, was showing up on other sites, with credit largely attributed to someone else. It has been pilfered before by other illegitimate sites, but in those instances they only carried excerpts of a few articles and linked directly to the full article. These sites posted the excerpts, which were still a violation of copyright and intellectual property, but headed them with “404 tech support wrote:”, which linked back to the full article on my page. These were of less concern to me.

Today, however, I became painfully aware of a site that was copying my entries in full, along with entries from around the blog-o-sphere. As I looked around this site, searching for contact info, I saw that some of these posts were getting comments. People were actually posting “Thanks” to the scammer for posts that he had merely stolen, rather than to the authors who had created them and taken the time to put them down in words and share* them with the world. (*Share does not imply that rights and permissions to the material’s usage have been given up.)

Due to this blog scraping, you may notice a few things different with the site (http://www.404techsupport.com). I have added a disclaimer under the RSS feed in the right column. It specifies that you may freely subscribe to my blog to be notified when there is a new post. You may certainly read the new posts through the RSS feed, as it is intended. In fact, I encourage anyone who wants to subscribe to the RSS feed to do so. (You may also want to update your subscription, as feed addresses have changed recently. Trying things out through Feedburner, which was then bought out by Google…) You may not use the RSS feed to repost my articles on another website. You may certainly post links to my articles from your site. I would be grateful for that and honored that you think I offer something worth your recommendation. You may also add on to my posts if you have further steps, recommendations, or comments to add. You may also disagree with my posts and post a rebuttal on your own blog. I welcome all dialog regarding anything I write. If you wish to repost an article or parts of one, you should ask first. Unfortunately, this is one of those many cases where a few ruin it for the rest. I will most likely respond to you very quickly and grant permission if you are a legitimate blogger (anyone but a scammer/spammer/bot). I’ll also link to your post from the original post.

Another addition to my site is the logo for MyFreeCopyright in the left column. This service allows you to timestamp your original content as a public record so it can be undeniably claimed as your own should legal action be necessary to dispute Intellectual Property ownership. MyFreeCopyright subscribes to your feed, hashes the content of your posts (or other works), and creates a timestamp of creation verifying that the work is yours.
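To give a rough idea of what hashing and timestamping a post looks like, here is a minimal Python sketch. It is not MyFreeCopyright's actual implementation: the choice of SHA-256, the whitespace normalization, the 80-character excerpt, and the helper names normalize and fingerprint_post are all assumptions made for this example.

```python
import hashlib
from datetime import datetime, timezone


def normalize(text):
    """Collapse runs of whitespace so trivial reformatting does not change the hash."""
    return " ".join(text.split())


def fingerprint_post(title, body, created_at=None):
    """Return a record holding the title, an excerpt, a UTC timestamp, and a content hash.

    Hypothetical helper: SHA-256 is an assumption, not the service's documented algorithm.
    """
    if created_at is None:
        created_at = datetime.now(timezone.utc)
    clean = normalize(body)
    digest = hashlib.sha256(clean.encode("utf-8")).hexdigest()
    return {
        "title": title,
        "excerpt": clean[:80],
        "date": created_at.isoformat(),
        "hash": digest,
    }


if __name__ == "__main__":
    record = fingerprint_post(
        "My Brush with Blog Scraping",
        "Unfortunately, my blog has been scraped.",
    )
    for key, value in record.items():
        print(f"{key}: {value}")
```

Changing even one word of the body produces a completely different hash, which is what makes the record useful as a fingerprint of the exact text that existed at that time.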

Neither the Terms of Service nor the MyFreeCopyright registration is required to be posted, thanks to the DMCA and the Berne Convention. Even if you have not written anywhere that you hold the copyright to this original material, as the creator you hold the copyright regardless. I have posted this information, even though it is not necessary, because it strengthens the copyright claims and reaches to other countries that have strict requirements, such as visibly posting the phrase “All Rights Reserved” or “Copyright [date]”. It also acts as a Terms of Service, so that you can point anyone in between, such as the person running the ISP or the domain registration, to it and say: yes, this guy is violating my intellectual property rights and my copyright, and here are the visibly posted Terms of Service and copyright notice. MyFreeCopyright can then be used (by showing proof of the e-mail containing the title, excerpt, date, and hash of the article) to establish ownership.
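To make that last step concrete: anyone holding a recorded hash can recompute it from the article text and check that the two match. The sketch below assumes the record was made with SHA-256 over the UTF-8 text, which is only an assumption (use whatever algorithm your timestamping service actually documents), and matches_record is a hypothetical helper name.

```python
import hashlib


def matches_record(article_text, recorded_hash):
    """Return True if the article's SHA-256 hex digest equals the recorded one.

    Assumption: the record was made with SHA-256 over the UTF-8 encoded text.
    """
    digest = hashlib.sha256(article_text.encode("utf-8")).hexdigest()
    # Tolerate stray whitespace or uppercase hex in the recorded value.
    return digest == recorded_hash.strip().lower()


if __name__ == "__main__":
    original = "Unfortunately, my blog has been scraped."
    recorded = hashlib.sha256(original.encode("utf-8")).hexdigest()
    print(matches_record(original, recorded))              # True
    print(matches_record(original + " edited", recorded))  # False
```

A match only shows that the text is identical to what was recorded on that date. It is the dated record itself, such as the e-mail, that ties the text to you.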

I learned a lot researching the DMCA and what I might do about these issues, and I hope to point you in some useful directions if your blog has been scraped or you are facing similar issues with stolen Intellectual Property.

Learn about the DMCA, Intellectual Property, and your rights:


Find out what to do and about others confronting similar problems:


Take action:


A big thanks to all the sites above, as they helped me take control of my Intellectual Property.

Update: The site pilfering my content has now voluntarily removed it after I sent them a DMCA notification e-mail. There are many blogs still “contributing” to the site, however. Take action to help other bloggers!