How to Find Link-Worthy Data

by SEOMoz

You might be a little tired of hearing ‘content is king’.  And it’s increasingly difficult to make content stand out online.  But a few sites are leading the way with their innovative use of data.  There’s the Guardian Datablog, Information is Beautiful and the ubiquitous OK Trends to name but a few.

But sites like these are still in the minority.  So there’s ample opportunity to turn data into links.  But first you need to know…

How To Get Your Hands On Some Tasty Data

data cake

There’s data practically everywhere.  There are tonnes of different sources you can use.

APIs and Scraping

If you’ve got some developer resource available, you can pull data from a shed-load of APIs all over the web.  Mining Twitter and Facebook are obviously popular, but there are lots of other opportunities.

Programmable Web has a massive list of APIs you can tap into.  Speaking at a recent Distilled conference (Boston ProSEO), Dharmesh Shah suggested signing up to the RSS feed from Programmable Web – not because you need to know everything that’s coming out, but for the ideas it will trigger as you go along.  It can save you bucket-loads of time if you’re able to pluck out an idea from a while back that’ll work perfectly for a new project.

If there’s no API, scraping is always an option.  And even if the API is available, scraping can be preferable for doing things on the fly, and for the less technically-able (like me).  There are a couple of great resources that have already been written on this – check out the following:

And if you’re really getting into scraping, you should also check out ScraperWiki.  You can find out more about ScraperWiki here and here – especially for those who don’t code.
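
To give a feel for what scraping involves, here is a minimal sketch in Python using only the standard library.  It pulls every link and its anchor text out of a chunk of HTML.  The names (LinkCollector, extract_links) and the sample page are made up for illustration; in practice you would fetch the HTML yourself, and a dedicated scraping tool may well be a better fit.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, anchor text) pairs from <a> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> tag we are currently inside
        self._text = []      # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that sits inside a link with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, anchor text) pairs found in the HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page fragment, standing in for HTML you would fetch yourself.
SAMPLE_HTML = """
<ul>
  <li><a href="https://example.com/report-2010">Annual <b>report</b> 2010</a></li>
  <li><a href="/datasets/spending.csv">Spending data</a></li>
  <li><a name="top">No href here</a></li>
</ul>
"""

for href, text in extract_links(SAMPLE_HTML):
    print(href, "->", text)
```

The same pattern extends to table cells or list items: watch for the opening tag, gather the text, and store it when the tag closes.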


Surveys

This is a pretty simple one really.  You can create surveys using Mechanical Turk in the same way as Will’s Panda questionnaire.

If you’re using Mechanical Turk, there are some challenges you should be aware of with regard to how representative your sample is, i.e. are the people doing work via Mechanical Turk really representative of the intended population?  But these kinds of objections can often be worked around by being upfront about where your data has come from.  Don’t try to bury your sources – if people can’t find them, they won’t trust you.  And if they have to seriously dig to get them, somebody will out you.  Put them up front.  Be very transparent.

The beauty of using survey data is you can ask exactly what you want to ask.  There can be nothing more frustrating than having a great idea, and searching for hours to find a dataset to support it, only to abandon the project.

Open Data

This is a huge one.  Open data is a very hot topic, with more and more governments succumbing to pressure to open up their data.  As an example of how you can use open data, the following graphic by 97th Floor was created using a publicly available data source.  And Open Site Explorer shows 203 root domains linking to the page on which it appears (!).

where does the money go

Rather than searching through various government datasets, the Guardian Datablog have a search engine that allows you to search all of the open data sources from around the world.  And they’re continually adding to it as more and more countries open up their data.

For other publicly available datasets, the following sites have some fairly extensive lists:

Academic Papers

In a similar vein to open data, academic papers and journals can be a great source of valuable information.  The problem with academic papers is they aren’t written for the public.  They’re buried in the depths of the web and barely anyone outside academia reads them.  They tend to be very dry and completely inaccessible.  But they often contain really valuable content.  You just need to turn them into something appealing and easy to understand.

You’re not necessarily being rewarded for being the source of the information, but for digging it out and turning it into a much more consumable and enjoyable format.  It might take a bit of effort, but that’s where you’re adding the value.

Another great thing about these papers and journals is they’ve been properly researched in an academic fashion.  And you’re quoting very respected sources, which will give your content added weight.  Nothing like quoting a few .edus to add some gravitas.

To find academic journals, try Google Scholar or SpringerLink.


Google

One massively overlooked data source – especially by SEOs – is our old friend Google.  As well as providing lots of tools to process data, they’re a useful source as well.  For starters, they have this list of data sources you can explore.  But there’s also the headsmackingly obvious – Google Insights and Google Keyword Tool.

Yes, I’m serious.  Although we’re in a niche where everybody knows about them, the majority of the public still have no idea you can see behind Google and find out what everyone’s searching for and what the trends are.  When I first showed it to some of my friends, they were genuinely amazed.

There could be some really easy wins you could make without much effort at all.  For examples of simple things you could do, check out these 2 posts by David McCandless.  You could easily do a quick and dirty press release on online trends that could get some decent coverage.

google insights by david mccandless

Client Data

Client data is ideal but there can be a few difficulties.  The advantage of using client data is you can announce something genuinely new – that wasn’t previously in the public domain.  However, there are a number of things to be aware of when using internal data:

  • Some companies will be reluctant to give you access, mostly due to concerns about competitive intelligence
  • There may be delays in getting the data to you, which can impede your ability to deliver on time
  • The data will often have missing entries and errors, and may even be completely unusable
  • The dataset may be too small to be reliable (especially when you start segmenting)

It’s worth raising the above when you first discuss the possibility of using internal data, so you manage expectations.  If you do end up using the data, you have to be careful you don’t over-state your findings.  As mentioned previously, you should clearly state how you sourced your data, so as not to be misleading.  As long as you do this, you can still create something worthwhile.  It’s still a story – or at least it should be if you’re planning on putting it out there.
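
To put a rough number on “too small to be reliable”, here is a back-of-the-envelope sketch in Python.  It uses the standard margin-of-error formula for a proportion, which assumes a simple random sample.  Client data and Mechanical Turk responses usually aren’t random samples, so treat the result as a best case rather than a guarantee.  The function name and the sample sizes are mine, chosen purely for illustration.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion.

    Assumes a simple random sample; proportion=0.5 gives the widest margin.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


# Illustrative sample sizes: a full dataset, then ever smaller segments of it.
for n in (1000, 250, 50):
    print(f"n={n}: +/- {margin_of_error(n) * 100:.1f} percentage points")
```

With 1,000 responses the margin is roughly 3 percentage points either way; slice the data down to a segment of 50 and it balloons to nearly 14, which is why segmenting is where small datasets tend to fall apart.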

Anything I’ve Missed?

So there you have it – you need never be short of data again.  But if there are any major sources of data you think I’ve missed, be sure to add them to the comments below.

Marketing Ethics: Persuasion vs. Manipulation

by SEOMoz

The debate over “white hat” vs. “black hat” tactics in SEO seems to resurface every few months, followed soon after by a debate over whether those labels or the debate itself are even worth having. I thought it would be useful to step back a bit and look at the broader issues of ethics in marketing.

As marketers, our job is to persuade people, whether it’s to choose a certain product or buy it from a certain vendor. It’s not always clear, though, when persuasion becomes manipulation. I’m going to explore 5 scenarios in a white-board style format (read that: “crudely illustrated for your amusement”).

Scenario 1: Simple Alignment

The first scenario is what I’ll call “simple alignment” – the Customer wants X, your Client (employer, etc.) sells X, and you work to facilitate the process:

Illustration of Simple Alignment

The Customer is on one side of the wall, dreaming of a new car, and your client is on the other side, trying to sell that car. You (the green arrow) come in to bring the wall down. Alignment could just be the act of bringing Customer and Client together (like driving relevant traffic to a site). The end result is win-win.

Scenario 2: Simple Choice

In the “simple choice” scenario, the Customer wants either X or Y, but hasn’t made up their mind. So, you nudge them to make a choice that fits your objectives:

Illustration of Simple Choice

Is it unethical? On the one hand, the Customer wanted X or Y, so nudging them toward X is hardly a heinous crime. If you persuade them with features and benefits, this could be completely win-win. If you outright lie to drive them toward your Client, it’s a very different story.

Scenario 3: Competitive Choice

Scenarios (1) and (2) are based in an imaginary world where only one person actually sells anything. What if the Customer wants X, but your Client has a Competitor, and you steer the Customer toward buying from your Client?

Illustration of Competitive Choice

Obviously, the ethics of this situation can get complicated fast. Let’s say your Client makes 98% of their revenue selling pirated Justin Bieber CDs to al Qaeda, while their Competitor makes its money selling double-rainbows to puppies. I’d probably rather buy from the Competitor. On the other hand, as long as we’re not lying about our Client, the Competitor, or the products, this is still essentially an act of persuasion. The Customer wanted X and they ultimately bought X.

Scenario 4: Unknown Desire

Sometimes, Customers have no idea what they want – not in the sense of choosing between 2 or more options, but in the sense of not even knowing that an option exists:

Illustration of Unknown Desire

In some ways, this is the essence of much of modern marketing – it’s less about pushing us to choose from alternatives, and more about persuading us to want things we didn’t know existed. The iPhone is a great example – I didn’t know I wanted one until I tried it out. Until then, I had been suffering the delusion that my LG clamshell phone with no internet was all that I needed.

In all seriousness, this is a tough one. It’s the heart of modern consumerism, which many people would certainly say has gotten out of control. Is fulfilling an unknown desire inherently bad? No, of course not. Is manipulating people into wanting something by playing on their envy, fear, doubt, and uncertainty unethical? That’s a very different question.

Scenario 5: Altered Decision

Finally, what if we sell someone something they didn’t originally want at all? The Customer is looking for X and you convince them to buy Y:

Illustration of Altered Decision

In some cases, this may be like Scenario (4). The Customer thought they wanted the blue car until they saw it in red and loved it. In other cases, you may be aggressively pushing them to make a decision they later regret. Somewhere in between is the boundary between persuasion and manipulation.

It’s All About Intent

My point is simple – the ethics of marketing get complicated fast, and a lot of it boils down to intent. This is what makes Google’s job so hard – they can’t reach into our brains to see what we’re scheming, so they have to infer intent from action.

Take paid links, for example. Buying an ad to drive traffic to your site is perfectly acceptable to search engines. Buying an ad to build a juicy link back to your site and manipulate your ranking violates Google’s guidelines. If no one ever bought an ad just for SEO purposes, there would be no need to nofollow links. Since Google can’t judge our intent and paid links were abused, they have to assume that all paid links are suspect.

I’m not defending Google’s stance or claiming that Google’s guidelines are the same as ethical behavior. I’m simply saying that these situations are a lot grayer than we sometimes like to believe, especially when you consider the entirety of the internet.

Why Does It Matter?

So, why should this matter to you? Even if you’re not that concerned with ethical marketing, I think there’s something else at play here, and it directly affects your bottom line. Look at the 5 scenarios again:

  1. Simple Alignment
  2. Simple Choice
  3. Competitive Choice
  4. Unknown Desire
  5. Altered Decision

What’s the easiest type of sale to make? Usually, it’s going to be Scenario (1). You just need to help the Customer find your Client, or maybe you need to improve your CRO to bring down a few walls within your site. The Customer already wants what you’re selling.

On the other end, getting someone to completely change their mind may not just be unethical – it’s also extremely difficult. If you find yourself constantly having to change people’s minds, even to the point of manipulation, you may be targeting the wrong market.

It’s funny that Scenario (4) seems to be the current Holy Grail of marketing. Apple is the poster child for selling us things we didn’t even know we wanted. There are, admittedly, tremendous advantages – being the first to market means you get a great head-start and can put up barriers to entry. For most of us, though, it’s just not necessary or cost-effective. There may be plenty of Scenario (1) and (2) clients out there, and your money could be better spent finding them.

This post was inspired by a conversation with a UX colleague, Harry Brignull, and his work on what UX folks have come to call “Dark Patterns”. This post isn’t really about dark patterns, but it’s a pretty cool concept (and very cool name), so I’d encourage you to check it out.

Competitive Analysis in Under 60 Seconds Using Google Docs

by SEOMoz

Faced with a new client, and having established a list of keywords they need to target, you want to evaluate the competition to find out what sites are dominating the SERPs for these keywords. However… being an SEO you’re a busy guy (or gal), and you need it done right now. I’ve built a Google Docs tool to automagically do exactly that and this post will walk you through it.

The basis for this tool comes from a report in this linkbuilding post on YOUmoz which contained a neat little ‘SERP Saturation’ report. I don’t know how Stephen made his snazzy looking report (he’s now shared a few details in this comment), but in response to a few people asking about it, I thought I’d put together a tool. Here is Stephen’s report:

SERP Saturation Report

Cool, eh? We are going to produce something very similar, albeit not as pretty. We will automatically pull ranking data and tie into the Linkscape API to pull in some helpful metrics.

1. What does the report show?

So, what’s the report all about? It is a pretty standard report, and most SEOs will have put together similar reports in their time. It shows which domains are dominating the results pages for the specified list of keywords. It is an excellent way to quickly see who the main players are, and see a few metrics for them.

Ours will be sorted by the cumulative number of times a subdomain has appeared in the top 10 of the search results over all the keywords we specify, and will display the mozRank, Domain Authority and Linking Root Domains for each. We’ll show just the top 10 competitors in our report.
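
If you prefer to see the logic outside a spreadsheet, the counting boils down to something like the Python sketch below. This is not the spreadsheet’s actual formula, just the same idea written out; the function name and the sample rankings are made up for illustration.

```python
from collections import Counter


def serp_saturation(results_by_keyword, top_n=10):
    """Count how often each subdomain appears in the top results across keywords.

    results_by_keyword maps a keyword to its ranked list of subdomains.
    Returns the top_n (subdomain, appearances) pairs, most frequent first.
    """
    counts = Counter()
    for subdomains in results_by_keyword.values():
        # Only the first top_n results for each keyword count towards the total.
        counts.update(subdomains[:top_n])
    return counts.most_common(top_n)


# Made-up rankings (truncated to three results each to keep the example short).
SAMPLE_RESULTS = {
    "blue widgets": ["www.example.com", "shop.example.org", "www.example.net"],
    "cheap widgets": ["www.example.com", "www.example.net", "blog.example.org"],
    "widget reviews": ["www.example.com", "blog.example.org", "www.example.com"],
}

for subdomain, appearances in serp_saturation(SAMPLE_RESULTS):
    print(f"{appearances:>3}  {subdomain}")
```

Note that a subdomain ranking twice for the same keyword is counted twice here, which matches the “cumulative number of times” wording; count each keyword only once if you would rather measure breadth.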

You can just duplicate the Google Docs spreadsheet I provide below, and change almost any of this to add, modify or take away as per your needs.

2. How do you configure it?

You must configure it the first time you use it:

1) If you’ve not yet done so, get an SEOmoz API key. It’s free!

2) Open the Google Docs spreadsheet. In the File menu select ‘Make a copy’ so you have a version you can edit (call it “Report Template” or such).

3) Go to the ‘Config’ sheet at the bottom, and enter your SEOmoz API details.

4) If you’d like to change the template for which Google URL to query (it defaults to UK for me), you can do that here too.
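
As a rough idea of what a URL template does, here is a small Python sketch. The template string below is an assumption for illustration only (the value in the Config sheet may look different), and build_search_url is a made-up helper, not something from the spreadsheet.

```python
from urllib.parse import quote_plus

# Hypothetical template; the spreadsheet's own Config value may look different.
SEARCH_URL_TEMPLATE = "https://www.google.co.uk/search?q={query}&num=10"


def build_search_url(keyword, template=SEARCH_URL_TEMPLATE):
    """Drop a URL-encoded keyword into the search URL template."""
    return template.format(query=quote_plus(keyword))


print(build_search_url("blue widgets"))
# Swapping the template is all it takes to point at a different Google domain.
print(build_search_url("blue widgets", "https://www.google.com/search?q={query}&num=10"))
```

The encoding step matters: keywords with spaces, ampersands or accented characters would otherwise break the URL.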

3. How do you use it?

Open your report template spreadsheet you just made.

1) On the config tab, paste up to 50 keywords, one per row, starting at cell B7 (it’s indicated).

2) Open the ‘Report’ sheet.

3) Now select ‘Make a copy’ and give it a name (“Client X Report” or whatever). This step is *essential* or the fields will not update properly (I’m working on making this unnecessary – any clues?).

4. What should you see?

You should see a snazzy little report:

SERP Competitive Report

It shows everything I promised, and more even:

SERP Competitive Report Graph

A colourful and interactive, albeit slightly wonky, graph! What more could you want?!

5. Under the hood

You don’t need to read this section unless you are interested in how it works or need to edit it. Besides which, I’m mostly just going to refer you elsewhere! A big shout out to Tom Critchlow, whose prior work contributed heavily to this little tool. Firstly, you need to read:

How To Build Agile SEO Tools Using Google Spreadsheets

Which introduces how to scrape the SERPs for ranking data. I modified what Tom did slightly as I wanted a list of subdomains, rather than pages, so there is a bit of string cropping (and fudging!).
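
For comparison, here is how the “page URL to subdomain” step might look in Python, where a URL parser does the cropping for you. This is a sketch of the idea rather than what the spreadsheet does (that uses string formulas), and subdomain_of is a made-up name.

```python
from urllib.parse import urlsplit


def subdomain_of(url):
    """Return the lower-cased host part of a URL, e.g. 'blog.example.com'."""
    # urlsplit only recognises the host if the URL has a scheme or leading '//'.
    if "://" not in url:
        url = "//" + url
    host = urlsplit(url).hostname
    return host or ""


print(subdomain_of("https://Blog.Example.com:8080/some/page?x=1"))
print(subdomain_of("www.example.com/another-page"))
```

Letting a parser handle ports, paths and query strings avoids most of the fudging that manual string cropping tends to need.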

Next you need to read Ian Lurie’s post (which Tom also helped with):

Linkscape + Google Spreadsheets. Together, at last.

Again, I edited this. I changed the code around quite a bit, which you can see in the script editor. You end up with a function you can enter into a cell:

=getLinkscapeData(A1, 1)

The A1 is a cell reference to a URL, and the 1 is a dummy parameter to prevent annoying caching issues.
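
If the dummy parameter trick seems odd, this Python analogy may help. It is not the Apps Script code from the spreadsheet; it just uses a cached function to show why changing an otherwise unused argument forces a fresh lookup. The names get_metrics and cache_buster are made up for the example.

```python
from functools import lru_cache

call_count = 0  # how many times the "slow" lookup has really run


@lru_cache(maxsize=None)
def get_metrics(url, cache_buster=0):
    """Stand-in for a slow lookup; cache_buster is ignored except as a cache key."""
    global call_count
    call_count += 1
    return f"metrics for {url} (lookup #{call_count})"


first = get_metrics("www.example.com", 1)
second = get_metrics("www.example.com", 1)  # same arguments: served from the cache
third = get_metrics("www.example.com", 2)   # new dummy value: looked up again

print(first)
print(second)
print(third)
print("real lookups:", call_count)
```

The cache only sees the arguments, so bumping the dummy value makes the call look new even though the URL hasn’t changed.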

For a look at the full code for the Linkscape API interface, and some pointers on how to modify it to suit your needs I’ve put up a separate post on Using the Linkscape API with Google Docs, which includes a simpler example spreadsheet to try the code out with.

The rest of the spreadsheet is a few simple bits to filter and accumulate the necessary bits and pieces, along with a few tricks to try to sidestep some bugs in Google Apps. Nothing in the sheet is protected (there are a few hidden columns) so you can take a look at the workings. If you have specific questions, post them in the comments and I’ll try my best to answer.

This was my first real foray into Google Docs, so it might not be particularly elegant. Also the document seems to have trouble updating sometimes – if anyone has a solution that would be great. In the meantime, if you just ‘Make a copy’ it seems to force an update.

6. Wrap up

Ok, it isn’t in-depth analysis, but if you have a keyword list, and want a very quick peek at what domains are players, and their general stats, this tool gives you a quick and dirty look. Most importantly – it is free and open, so you can tweak it to your heart’s content.

Questions, comments or suggestions are very welcome – post below and I’ll get back to you.

How Will You View Your TV Content in 2011?

Web and other options are shaking up how we watch TV

By David Lieberman, USA TODAY

If you gave or got a TV set, game console, Blu-ray player or DVR for the holidays, you might become the kind of person who scares executives who run movie and television production studios, broadcast and cable channels, and cable and satellite systems.

Many of these devices now make it easy for people with home broadband networks to feed content from the Internet, including Hollywood movies and TV shows, onto their TVs.
