Social Network Spam and Author/Agent Rank

by SEOMoz

On Wednesday I presented at SMX on the panel called “Facebook, Twitter, and SEO”. I was excited to speak alongside Horst Joepen (SearchMetrics), Jim Yu (BrightEdge), and Michael Gray (Atlas Web Services). In my talk, I showed information from patents describing how a search engine might detect a person’s topical relevance and authority, and use that as a scale for deciding whether (and how much) link juice to pass from their social shares. Let’s explore some of these ideas a bit more.

What Factors Might Search Engines Look At?

There are three concepts I would like to introduce you to.

Topical Trustrank

The first concept you should be familiar with is “topical Trustrank”. The original Trustrank was first mentioned in this Yahoo patent from 2004. At the time it seemed underdeveloped, and worse, it was open to spam, since it relied on websites to label themselves (not unlike the meta keywords tag). The patent was granted in 2009 as a way to rank sites based on labels given to them by people, according to the article called Google Trustrank Patent Granted.

Another take on Trustrank is Topical Trustrank, which was introduced in 2006. Because Trustrank seemed heavily biased towards larger communities, which could attract more spam pages (without tripping a spam threshold, maybe?), Topical Trustrank aimed to build trust based on the topical relevance of the connecting sites (and, I would argue now, the topical relevance of those sharing links via social networks).
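To make the idea concrete, here is a minimal sketch of trust that only propagates along topically relevant links. This is my own illustration, not the algorithm from the Topical Trustrank paper; the damping factor and the `topic_sim` function are assumptions:

```python
# Sketch: trust flows from hand-picked seed sites, but only in
# proportion to how topically similar the linking and linked sites are.

def topical_trust(links, seeds, topic_sim, damping=0.85, iters=20):
    """links: {site: [sites it links to]}; seeds: {site: seed trust};
    topic_sim(a, b): 0..1 topical similarity between two sites."""
    trust = dict(seeds)
    for _ in range(iters):
        new = {s: (1 - damping) * seeds.get(s, 0.0) for s in links}
        for src, outs in links.items():
            if not outs:
                continue
            share = damping * trust.get(src, 0.0) / len(outs)
            for dst in outs:
                # A link passes trust only to the extent the two
                # sites cover the same topic.
                new[dst] = new.get(dst, 0.0) + share * topic_sim(src, dst)
        trust = new
    return trust
```

Under this model, a site linked from a trusted seed in the same niche accumulates far more trust than one linked from the same seed but on an unrelated topic.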

Author Rank

According to one Yahoo patent application, “…author rank is a ‘measure of the expertise of the author in a given area.'” Since this is delightfully vague, here are some specific areas (taken from Bill Slawski’s How Search Engines May Rank User Generated Content) that the search engines might look at to determine if you are authoritative:

  • A number of relevant/irrelevant messages posted;
  • Document goodness of all documents initiated by the author;
  • Total number of documents initiated or posted by the author within a defined time period;
  • Total number of replies or comments made by the author; and,
  • A number of [online] groups to which the author is a member.
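The patent does not say how these signals would be combined, but a naive weighted sum is enough to see the idea. The weights and field names below are entirely my own invention for illustration:

```python
# Hypothetical author-rank score built from the patent's listed factors.
# The formula and weights are illustrative assumptions, not from the patent.

def author_rank(signals, weights=None):
    """signals: dict of factor name -> count; missing factors count as 0."""
    weights = weights or {
        "relevant_msgs": 1.0,     # relevant messages posted
        "irrelevant_msgs": -2.0,  # irrelevant messages hurt more than relevant ones help
        "doc_goodness": 1.5,      # aggregate "document goodness" of initiated docs
        "docs_initiated": 0.5,    # documents initiated in the time period
        "replies": 0.3,           # replies/comments made by the author
        "groups": 0.2,            # groups the author belongs to
    }
    return sum(weights.get(k, 0.0) * v for k, v in signals.items())
```

Even with crude weights like these, an author posting mostly irrelevant messages scores far below one with a record of relevant, well-received contributions.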

We can take these and apply them to social as well. If they are calculating author rank based off of content taken from around the web, why would they not also use this author rank for your social shares? Here are some more questions a search engine might ask about a user (according to an email I received from Bill Slawski):

  • Do they contribute something new, useful, interesting?
  • Are they tweeting new articles, or recycling old articles? Are they sharing articles from just one site, or are they sharing articles from a number of different sites? What’s their engagement/CTR?
  • Do they participate in meaningful conversations with others?
  • Are they replying to others through @replies or other means (DMs, maybe?)? On what topics?
  • Do those others contribute something new, useful, interesting?
  • Are they themselves keeping the cycle going and replying to various others, or always responding to the same users?

Agent Rank

According to this article from Search Engine Land, Google applied for a patent around a way to determine an agent, or author’s, authority in a specific niche. According to the article:

Content creators could be given reputation scores, which could influence the rankings of pages where their content appears, or which they own, edit, or endorse.

Also according to the article, here are some of the goals of Agent Rank:

  • Identifying individual agents responsible for content can be used to influence search ratings.
  • The identity of agents can be reliably associated with content.
  • The granularity of association can be smaller than an entire web page, so agents can disassociate themselves from information appearing near the information for which the agent is responsible.
  • An agent can disclaim association with portions of content, such as advertising, that appear on the agent’s web site.
  • The same agent identity can be attached to content at multiple locations.
  • Multiple agents can make contributions to a single web page, where each agent is associated only with the content that they provided.

Does the following sound like the new rel=author markup that we’re seeing in the search results? I think it does:

“Tying a page to an author can influence the ranking of that page. If the author has a high reputation, content created by him or her may be considered to be more authoritative than similar content on other pages. If the agent reviewed or edited content instead of authoring it, the score for the content might be ranked differently.”

“An agent may have a high reputation score for certain kinds of content, and not for others – so someone working on a site involving celebrity news might have a strong reputation score for that kind of content, but not such a high score for content involving professional medical advice.”
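That per-topic reputation idea is easy to sketch. Everything below (the names, the numbers, the boost formula) is invented for illustration; the patent does not disclose a formula:

```python
# Sketch of per-topic agent reputation: an author strong on celebrity
# news gets a boost on celebrity pages but almost none on medical pages.

reputation = {
    # hypothetical author with topic-specific reputation scores (0..1)
    "alice": {"celebrity-news": 0.9, "medicine": 0.1},
}

def ranked_score(base_score, author, topic, boost=0.5):
    """Adjust a page's base relevance score by its author's reputation
    in that page's topic; unknown authors get no boost."""
    rep = reputation.get(author, {}).get(topic, 0.0)
    return base_score * (1 + boost * rep)
```

The key design point is that reputation is keyed by (author, topic) rather than by author alone, which matches the celebrity-news vs. medical-advice example in the quote.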

The article goes on to explain that authority scores will be hard to build up, but easy to harm. This would be one way to keep authors producing high quality content. Some more factors that may influence authority:

  • Quality of the response
  • Relevance of the response
  • The authority of those who respond to what you post

The Google Person Theory

How might search engines view my sharing and that of my followers?

Two weeks ago, Duane Forrester from Bing posted an interesting article showing how they might visualize whether someone is attempting to game their ranking signals by sharing a lot, or whether the rise in sharing is natural. According to the Information Retrieval Based on Historical Data patent (PDF):

A large spike in the quantity of back links may signal a topical phenomenon (e.g., the CDC web site may develop many links quickly after an outbreak, such as SARS), or signal attempts to spam a search engine (to obtain a higher ranking and, thus, better placement in search results) by exchanging links, purchasing links, or gaining links from documents without editorial discretion on making links.

If we take “back links” and replace it with social shares, we get this:

A large spike in the quantity of [social shares] may signal a topical phenomenon (e.g., the CDC web site may develop many links quickly after an outbreak, such as SARS), or signal attempts to spam a search engine (to obtain a higher ranking and, thus, better placement in search results) by exchanging [shares], purchasing [shares], or gaining [shares] from [others] without editorial discretion…
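Here is a rough sketch of the kind of spike check that substitution implies. The window and threshold values are my assumptions, and note that the spike alone cannot distinguish a genuine news event (the CDC/SARS case) from a spam attempt; it only flags that something unusual happened:

```python
# Flag when the recent share rate far exceeds the historical baseline.

def share_spike(daily_shares, window=7, threshold=5.0):
    """daily_shares: list of daily share counts, oldest first.
    Returns True if the average over the last `window` days is at
    least `threshold` times the earlier baseline average."""
    if len(daily_shares) <= window:
        return False  # not enough history to compare against
    baseline = daily_shares[:-window]
    recent = daily_shares[-window:]
    base_rate = sum(baseline) / len(baseline) or 1.0  # avoid divide-by-zero
    recent_rate = sum(recent) / len(recent)
    return recent_rate / base_rate >= threshold
```

A real system would then look at *where* the spike came from (source diversity, link/share quality) to decide between "topical phenomenon" and "spam" — which is exactly what the two images below illustrate.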

If you are automatically tweeting every interesting article that comes your way, and you have a large network of people who do the same in an attempt to game the signals, here is how Bing might visualize those manipulated ranking signals (the image below is an example of a “Like Farm”). Check out all of the hubs in the image below:

And here is an image of non-manipulated, truly viral signals. Check out the wide scatter of sources:
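One way a search engine might tell the two pictures apart is by measuring how concentrated the shares are among a few hub accounts. This is my own sketch, not anything from the patents; the `top_k` cutoff is an assumption:

```python
# A Like Farm funnels most shares through a handful of hub accounts;
# a genuinely viral story scatters across many independent sharers.
from collections import Counter

def hub_concentration(sharers, top_k=5):
    """sharers: list of account ids, one entry per share.
    Returns the fraction of all shares produced by the top_k accounts."""
    counts = Counter(sharers)
    top = sum(n for _, n in counts.most_common(top_k))
    return top / len(sharers)
```

On a Like Farm, a handful of hubs account for most of the shares (concentration near 1.0); on a truly viral story, the top five sharers are a tiny slice of the total.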

Some quick pieces of data to dissuade you from spamming or completely automating

We hear a lot of talk about automating your social stream. This seems like an oxymoron to me, since it undercuts the whole purpose of “social” media. Here is an interesting statistical graph for you: manual tweets get twice the clicks, on average!

Next, if you’re interested in whether automating your Twitter stream will increase your followers, take this next graph into account:

Key learning: Less automation = more followers

(All data gathered from Triberr – The Reach Multiplier)

How can I build my author trustrank with the search engines?

Here are some ways to benchmark and build your author presence in the eyes of the search engines:

Author microformats – if you own a website, you most definitely should implement the new rel=author microformat, validating through Google Plus. This is a fantastic way to claim your content directly with the search engines. Here is how to implement it on WordPress (via Joost de Valk), and here is the official Google page on authorship.
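For reference, the markup is essentially a link from your content to your Google Plus profile. The profile ID below is a placeholder, and Google also requires a link back from your profile to the site; see the official authorship page above for the exact, current requirements:

```html
<!-- On the article page: byline link to your Google Plus profile.
     YOUR_PROFILE_ID is a placeholder, not a real profile. -->
<a href="https://plus.google.com/YOUR_PROFILE_ID" rel="author">Your Name</a>

<!-- Alternative: a page-wide link element in the <head>. -->
<link rel="author" href="https://plus.google.com/YOUR_PROFILE_ID"/>
```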

Klout Topics – since we were talking about topical trustrank earlier, you might want an idea of which topics the search engines might consider you authoritative on. I think Klout Topics is a good place to start.

Gravatar – Ross Hudgens wrote a great post a few months ago called Generating Static Force Multipliers for Great Content wherein he talked about the importance of a consistent personal brand and image across the Internet. If you have the same photo across many different sites, how could the search engines not use this in determining trustworthiness?

KnowEm – KnowEm is a website where you can check whether your username has been taken across many different social networks. This is a great place to learn where you need to sign up to protect your username, and therefore your personal brand and author trust.


Author authority has long been a topic of discussion in SEO circles, and we’ve wondered, “Does Google have an author rank?” From these patents, I think it is obvious that they have the capability, and now, with Google Plus data available to Google and Facebook data available to Bing, both are going to make this even more of a priority.

I’d love to hear your thoughts.