Friday May 02nd, 2008 | News, SEM, SEO

Google Crawls JavaScript, News at 11!


In case you haven’t noticed, Google is crawling JavaScript these days. Yeah, I know. Just when you think you’ve pretty much got it all figured out, Google has to go and mix it up a bit. With their continued shift toward more intrusive crawling, Google has been striving to pierce the JavaScript veil and peek inside previously hidden content, and it’s causing issues. While this is improving the crawlability of many sites with dynamic navigation, it’s also opening the door to some potentially damaging side effects, including the indexing of pages purposely hidden from search engines, which can lead to duplicate content issues and more.

JavaScript & SEO

For years, content and links embedded in JavaScript were virtually invisible to search engines. This meant that content inside the JavaScript neither added to nor took away from the keyword density of your copy, and links within JavaScript could not be followed or indexed by search engines. It was a major obstacle for the many websites whose main navigation was JavaScript-based, or whose content was dynamically populated by a JavaScript function.

But it was one of the few widely accepted SEO truths. Despite it being a real pain to work around at times, it was comforting to know that something about SEO was immutably true. Many elements of search engine optimization are rather nebulous. The lines of what works/doesn’t work are often rather gray and at times the engines seem to operate more on exceptions than on rules. So it was kind of nice to have something you could rely on to be forever true. But alas – the rules to the game have been changed.

Google changes the rules of the game

For years we’ve read about how Google was striving to find their way into and through JavaScript in their untiring pursuit to index all of the world’s information. The last year has seen some definite progress in their efforts.

Over the past several months we began noticing pages that were previously not indexed suddenly being crawled and indexed by Google, despite the fact that they were only accessible through JavaScript-embedded links and even form elements.

Intrigued by Googlebot’s ongoing trend toward more aggressive crawling, we began to pay pretty close attention. Over the months we’ve observed Google begin to crawl simple JavaScript links like pop-up window calls, JavaScript-driven navigation menus and even form drop-down elements with embedded links.

We’ve observed Google crawling:

  • JavaScript pop-up window calls
  • JavaScript iFrame triggers
  • dynamic JavaScript driven navigation elements
    • drop down menus
    • fly out menus
    • on-page window elements
    • simple JavaScript function calls
  • certain JavaScript driven widgets
  • form element embedded links
  • form submission results
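We don’t know how Googlebot actually does any of this, but the simplest cases in the list above (pop-up window calls and inline function calls) don’t require executing JavaScript at all. A rough sketch, assuming the URL appears as a quoted string inside a `javascript:` href or an inline event handler; the sample markup, the `openWin` function and the URLs are made up for illustration:

```python
import re

# Hypothetical page fragment; the markup and URLs are invented for illustration.
SAMPLE_HTML = """
<a href="javascript:window.open('/popup/size-chart.html')">Size chart</a>
<a href="#" onclick="openWin('/promo/details.html', 400, 300); return false;">Details</a>
<a href="/about.html">About</a>
"""

# A quoted string that looks like a site-relative or absolute URL.
URL_IN_QUOTES = re.compile(r"""['"]((?:https?://|/)[^'"\s]+)['"]""")

# Attribute values that carry script: javascript: hrefs and on* event handlers.
SCRIPT_ATTR = re.compile(
    r"""(?:href\s*=\s*"javascript:([^"]*)"|\bon\w+\s*=\s*"([^"]*)")""",
    re.IGNORECASE,
)


def find_script_urls(html):
    """Return URLs found inside javascript: hrefs and inline event handlers."""
    found = []
    for match in SCRIPT_ATTR.finditer(html):
        script = match.group(1) or match.group(2) or ""
        for url in URL_IN_QUOTES.findall(script):
            if url not in found:
                found.append(url)
    return found


print(find_script_urls(SAMPLE_HTML))
# ['/popup/size-chart.html', '/promo/details.html']
```

The plain `/about.html` link is skipped on purpose: the sketch only reports URLs that used to be “hidden” inside script, which are the ones worth reviewing.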

JavaScript was a comfortable obstacle

Anyone with a bit of SEO experience under their belt has developed workarounds for those early, JavaScript-driven drop-down and fly-out menus. Over the years, search engine savvy web developers have crafted creative workarounds using CSS and minimally intrusive JavaScript to produce website navigation that is at the same time both dynamic and search engine friendly.

We had nearly learned to live with it. In fact, in some cases we had learned to use it to our advantage: placing content we didn’t want crawled or indexed within JavaScript pop-up windows, writing it onto the page in document.write segments, or linking to pages using JavaScript-embedded links.

Duplicate content issues

While it’s great to see Google break through one of the staple obstacles to search engine accessibility, it’s also opening the doors to completely new problems: primarily the indexing of content not intended to be indexed, and specifically the indexing of duplicated content.

Pop-ups whose content was essentially invisible to search engines, and generally neglected by optimizers, are suddenly flung into the content mix. Form drop-down elements can now potentially lead search engines to nearly duplicated content.

The buzz has already worked its way through the community and is the subject of discussion in many of the industry’s forums and blog comment sections.
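To see how a drop-down turns into near-duplicate pages, it helps to list the URLs a crawler could reach by picking each option in turn. This is only a sketch under assumptions: it handles GET forms, varies one drop-down at a time, and the form markup, field name and values are invented for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

# Hypothetical form; the action, field name and values are invented for illustration.
SAMPLE_FORM = """
<form action="/products" method="get">
  <select name="sort">
    <option value="price">Price</option>
    <option value="name">Name</option>
  </select>
</form>
"""


class SelectCollector(HTMLParser):
    """Collect the action of a GET form and the option values of each <select>."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.current_select = None
        self.options = {}  # select name -> list of option values

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and (attrs.get("method") or "get").lower() == "get":
            self.action = attrs.get("action") or ""
        elif tag == "select":
            self.current_select = attrs.get("name")
            if self.current_select:
                self.options[self.current_select] = []
        elif tag == "option" and self.current_select and attrs.get("value") is not None:
            self.options[self.current_select].append(attrs["value"])

    def handle_endtag(self, tag):
        if tag == "select":
            self.current_select = None


def candidate_urls(html):
    """List one URL per option, varying a single drop-down at a time."""
    collector = SelectCollector()
    collector.feed(html)
    if collector.action is None:
        return []
    return [
        collector.action + "?" + urlencode({name: value})
        for name, values in collector.options.items()
        for value in values
    ]


print(candidate_urls(SAMPLE_FORM))
# ['/products?sort=price', '/products?sort=name']
```

Both URLs here would show the same products in a different order, which is exactly the kind of nearly duplicated content described above.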

So heads up, SEO community, no time to slack off: go check your clients’ sites and make sure all your ducks are in a row.

How has this affected your optimization?

Share your experience and discuss your observations and opinions in the comments below.
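One place to start that check, assuming you rely on robots.txt to keep pop-up or printer-friendly pages out of the crawl, is to confirm the rules really cover the URLs you used to hide behind JavaScript. A minimal sketch using Python’s standard `urllib.robotparser`; the robots.txt contents, paths and the example.com host are invented for illustration, and `site_maps()` needs Python 3.8 or later:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the paths and sitemap URL are invented for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /popup/
Disallow: /print/

Sitemap: http://www.example.com/sitemap.xml
"""

robots = RobotFileParser()
robots.parse(ROBOTS_TXT.splitlines())


def is_blocked(path, agent="Googlebot"):
    """Return True if the given agent may not fetch this path."""
    return not robots.can_fetch(agent, "http://www.example.com" + path)


for path in ["/popup/size-chart.html", "/about.html"]:
    print(path, "blocked" if is_blocked(path) else "crawlable")

# Sitemap: declarations found in the file
print(robots.site_maps())
```

In practice you would feed it the live file with `set_url()` and `read()` instead of a pasted string, then run every URL you found inside your scripts and forms through `is_blocked()`.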

Resources related to Google’s aggressive crawling

Google
Google Webmaster Blog – Google crawls HTML forms - April 11, 2008. Googlebot can fill in and process HTML forms, selecting options from drop-down menus, radio buttons and select boxes, to reach deeper content

Matt Cutts
on Google crawling HTML forms 2006 - as far back as mid-2006 Matt was giving us hints on how to fend off the ever-zealous Googlebot

on Google crawling HTML forms today – today we have several other tools, including the Sitemaps.org protocol and robots.txt Sitemap: declarations, to help corral the aggressive bot, and with that added control Google loosens the leash on its Googlebot to dig a little deeper here and there

Search Engine Journal
Google crawling JavaScript with clean URLs - Loren Baker chimes in with a comparison to Yahoo ignoring “nofollow” tags, along with some good quotes and some nice points in the comments section

Search Engine Round Table
Google crawling & indexing through JavaScript with clean URLs - highlighting the key points at issue in a world where Googlebot can crawl JavaScript – particularly duplicate content issues

Google indexing JavaScript, Crawling like a human? - some interesting thoughts on Googlebot’s ‘evolution’ toward more ‘human’ behavior

Web Master World
Googlebot executing Javascript? - speculation around Googlebot’s ability to process JavaScript, including executing JavaScript functions


Comments

  • Friday May 02nd, 2008
    Straderade

    Thanks for the informative posting! This could be a great opportunity but also cause huge problems with various clients. I’ll be on the look out for current and future websites.

    Do you feel Google will throw sites with duplicate content within Java out of search results?

    Again, thanks for the heads up! ;)


  • Monday September 28th, 2009
    dudemjk

    Hello, very nice explanation. I wonder if Google can crawl JavaScript whose output is links?

    thanks.




