-
It all started with a question about AuthorRank that Wil Reynolds honestly admitted he could not answer, and for which he asked the help of Bill Slawski, surely one of the best experts on patents related to the search industry (and not only that).
-
@wilreynolds @TroyEaves When does #authorrank become a legitimate Factor? //Depends on what you mean by “legitimate.”
-
His first answer could be described as “academic”: Bill recalled that showing a profile photo in the SERPs can lead to a better CTR, as some studies seem to confirm.
-
@wilreynolds @TroyEaves The act of showing authorship profile images in search results is a UI change that can result in more clicks.
-
Then Bill added:
-
@wilreynolds @TroyEaves Authorship markup could be used now as a signal in a determination of filtering of near duplicate content.
-
@wilreynolds @TroyEaves Under an authorship markup approach, plagiarism may become, or may be now, a negative ranking signal.
-
@wilreynolds @TroyEaves Authorship might also be used as a positive ranking signal in social (Search Plus Your World) results at this point.
-
The plagiarism detail was what especially caught my attention. (Oh yes, I was closely following the tweets Bill was publishing.)
So I jumped in with a somewhat “extreme case” question:
-
@bill_slawski Let say what you say is true (I agree with that idea): how Google discriminate plagiarism? (1/2) @wilreynolds @TroyEaves
-
@bill_slawski Especially plagiarism by a trusted author (we know so many cases in real life), apart timestamps? cc: @wilreynolds @TroyEaves
-
@gfiorelli1 @wilreynolds @TroyEaves In the Google Knol patent on author reputation, plagiarism might lessen reputation scores for authors…
-
@gfiorelli1 @wilreynolds @TroyEaves While Knol is no more, the ideas from that patent might live on via Google Plus and authorrank
-
But he immediately added a clarification:
-
@gfiorelli1 @wilreynolds @TroyEaves The Google Author Badge patent appft.uspto.gov/netacgi/np… insists it isn’t a method of detecting plagiaism
-
(Note: unfortunately, it is not possible to embed the US Patent and Trademark Office database pages here.)
-
@gfiorelli1 @wilreynolds @TroyEaves But the mention of plagiarism in the patent points to it as a problem that might be addressed elsewhere
-
Seeing that the topic was getting even more interesting, I started asking questions, maybe simple ones, but in keeping with my Socratic way of learning new things (“I know that I don’t know”).
-
@bill_slawski Let say, using other methods but the author badge, right? Maybe a combo of crossed analysis cc: @wilreynolds @TroyEaves
-
Bill’s answer was almost immediate:
-
@gfiorelli1 @wilreynolds @TroyEaves Authorship as one aspect of a quality score under a near dup content approach: appft.uspto.gov/netacgi/np…
-
@gfiorelli1 @wilreynolds @TroyEaves Note that near duplicate content patent is a new (continuation) version of an older patent
-
@gfiorelli1 @wilreynolds @TroyEaves The “quality score” language within its claims can refer to PageRank, as well as signals like authorship
-
After reading this last tweet, I asked:
-
@bill_slawski Can’t also refer to “engagement” values (i.e.: comments, trackbacks & social signals)? cc: @wilreynolds @TroyEaves
-
And to that question, Bill replied by citing another Google patent (the “grouptivity patent”):
-
@gfiorelli1 @wilreynolds @TroyEaves Content sharing under Google’s grouptivity patent filing seobythesea.com/2012/02/go… could influence ranking
-
And, about the duplicate content issue related to plagiarism, he said:
-
@gfiorelli1 Under that approach, it’s still more a matter of filtering near duplicate content rather than penalizing.
-
To which I replied, remembering that duplicated content is not really a matter of penalization but of filtering by Google:
-
@bill_slawski That approach would be also consistent to the more general approach to duplicated content Google have (Panda apart).
-
@gfiorelli1 And that kind of internal consistency is probably a good thing, building upon existing systems if effective and appropriate.
-
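To make the “filtering rather than penalizing” idea a bit more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not what the patent Bill cited describes: the word-shingle comparison, the Jaccard threshold, and the `quality` field (a stand-in for whatever a “quality score” combines, such as PageRank or authorship) are all hypothetical choices.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    text: str
    quality: float  # hypothetical stand-in for a "quality score"


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles (overlapping word windows) of a text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two shingle sets (1.0 means identical sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def filter_near_duplicates(pages: list, threshold: float = 0.6) -> list:
    """From each group of near duplicates, keep only the highest-quality page.

    Pages are visited from highest to lowest quality; a page is kept only if
    it is not a near duplicate of a page that was already kept.
    """
    kept = []  # list of (page, shingle set) pairs
    for page in sorted(pages, key=lambda p: p.quality, reverse=True):
        page_shingles = shingles(page.text)
        if all(jaccard(page_shingles, other) < threshold for _, other in kept):
            kept.append((page, page_shingles))
    return [page for page, _ in kept]


pages = [
    Page("original", "the quick brown fox jumps over the lazy dog near the river bank", 0.9),
    Page("scraper", "the quick brown fox jumps over the lazy dog near the river bank today", 0.2),
    Page("other", "authorship markup may become a ranking signal in social search", 0.5),
]
shown = [page.url for page in filter_near_duplicates(pages)]
```

In this toy example the “scraper” page is simply left out of `shown`; nothing is subtracted from its score, which is the difference between filtering and penalizing that Bill points to.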
That said, I moved the conversation to a topic which I, as an Italian working in a non-English Google world, feel very close to, translation plagiarism:
-
@bill_slawski But what is still making me wondering is the “sub-issue” of the translation plagiarism too.
-
@gfiorelli1 Plagiarism might impact reputation scores for “authors” under an authorship markup approach.
-
@gfiorelli1 This system needs syndication markup/meta data, including translated content.
-
@bill_slawski And that, again, would be logical. Maybe “common sense” should be the guide for all 🙂
-
@gfiorelli1 Authorrank/agent rank will likely also be calculated for pages without authorship markup at some point.
-
@bill_slawski As if Google was using a sort of Copyscape on steroid to detect the source of the original content & assign it to his author?
-
@gfiorelli1 Right. Source attribution meta tags might help seobythesea.com/2010/11/go… but an approach like copyscape? See: seobythesea.com/2008/11/go…
-
This is the post from his own blog that Bill cited:
-
Google to Help Content Creators Find Unauthorized Duplicated Text, Images, Audio, and Video?
Published on November 21, 2008, at 2:08 am. I’ve written in the past about many of the reasons why you might find the same content at different…
-
My conversation with Bill had to stop here: different time zones, things to do… you know, the daily routines 🙂
But Gabriella Sannino (aka @SEOcopy) joined the conversation with Bill with a technical note about Google filtering duplicated content:
-
@bill_slawski I like that reducing the number of Web pages or sites crawled, these techniques can be used to reduce storage requirements
-
@SEOcopy Yes, like not crawling mirrored sites, it’s a smart move making crawling more efficient, indexing faster, and retrieval simpler.
-
@SEOcopy Storage is growing to be less of a concern, but the computational expense is still something to consider.
-
@SEOcopy And getting attribution right, so that the right site is showing in search results for the right content is extremely important.
Thank you, Gianluca
Responding to questions with fairly complex answers can be tough when limited to 140 characters, but I love the opportunity to have meaningful discussions with people from around the globe on Twitter.
I think there are elements of Agent Rank and Authorship Markup that still aren’t in place and need to be addressed, and I was asked another really good question by @ShawnBishop about how Google might handle multiple rel=”author” links on a page.
The original Agent Rank patent does anticipate having multiple agents or authors on a page, from the possibility that an article or blog post might have been written by more than one author (somewhat like this post itself) to commenters on a post, as well as advertisers and even editors.
It would also be great for Google to introduce some kind of meta data that could be used to point to syndicated content on other pages like they have for Google News, and I think Google chose to experiment with that on Google News to get a sense of how they might use it elsewhere.
The question about translated content on other sites is another area that I believe Google will address somehow, and I’m looking forward to seeing how they decide best to display it in both social search and web search results. A really good translator does more than just convert content from one language into another, and I hope Google decides to present translated content in a way that credits both the original author and the translator.
These are still very early days for author rank/agent rank, and the idea of creating a reputation graph that might include different reputation scores for authors for different topics. Add to that the idea that this kind of reputation graph will also intersect with an “interest” graph that could take in both user behavior in how people browse and search and how they share content, and the number of signals that Google is using to rank objects, content, and entities on the web is growing considerably.
Definitely interesting times we live in.
It’s great to see Authorship and AuthorRank discussed in such detail. What it also signals, for me, is a shift away from the brute force method of SEO (aka – build tons of links from anyone and anywhere).
Multi-authorship is definitely something Google has given thought to, as Bill mentions. In fact, I’ve seen Google recognize a comment on my blog with a link back to a Google+ profile using the ?rel=author parameter. It didn’t attribute authorship of the post to that person, but the Rich Snippets Testing Tool did see the authorship parameter. It would be interesting to take that a step further, add the site you were commenting on to the “Contributor to” section, and see what happens.
I’m also interested in how Google will handle true co-authorship of content. For instance, the Agent Rank patent is actually authored by two Googlers. Attributing that work to both is likely fairly straight-forward but the display issues are actually a bit more difficult.
Authorship clearly assists in identifying the ‘source’ of content which can then be used to better identify those who are scraping and using that content. Google’s actually gotten far better at identifying scrapers lately. Whether it’s related to this or not, I don’t know.
Translation is a trickier issue. In discussing hreflang with a Googler, it was made clear that translated content is NOT the same. Meaning, you wouldn’t use a rel=canonical on a translated version of the same content. Whether it’s translation or syndication, the ability to make the authorship portable yet still valid seems important … and difficult.
The difference between scraping and syndication is really all about granting someone permission. I, personally, would enjoy an interface where Google prompted me to grant or deny the copies of my content it had found during their crawl.
What I’m most excited about though is the idea that AuthorRank will be granted by topic and that the link graph can be curated by experts in those fields.
Search, it’s never boring.