JavaScript: Bot Experience and User Experience

Incorrect Application of JavaScript on Your Site Is Ruining Your SEO

Understanding JavaScript and its potential to affect your SEO campaign is an essential skill for a modern SEO expert. How JavaScript is implemented on a site, among other factors, determines whether search engines can crawl and analyze it. Hence, if a webmaster fails to incorporate JavaScript properly, pages may fail to be indexed and ranked.

The most important question about JavaScript’s impact on SEO is whether search engines can locate the content and understand its meaning. Webmasters should also be mindful of whether their website can be indexed on mobile platforms. If search engines cannot do this, what should you do? Before we get into optimizing your JavaScript, let’s start with the basics.

 

Back to Basics

When webmasters are creating a website, there are 3 major components that they need to take note of, namely:

  • HTML

Also known as Hypertext Markup Language, it serves as the structure, backbone, and “skeletal system” of the website. It organizes the content and defines the static elements: headings, lists, paragraphs, etc.

  • CSS

Cascading Style Sheets, otherwise known as CSS, are in charge of the website’s design, aesthetics, and style. Basically, CSS is the page’s presentation layer.

  • JavaScript

This is the component in charge of interactivity and the main element of the dynamic web. It is either placed inside the HTML document within <script> tags or referenced from an external file. Today, a variety of JavaScript frameworks and libraries are available, such as jQuery, ReactJS, and EmberJS.

  • What You Need To Know About AJAX

AJAX stands for Asynchronous JavaScript and XML. It is a set of techniques that enables web applications to communicate with a server without disturbing the current page. “Asynchronous” means that other code and functions can keep running while the asynchronous request is in progress. XML used to be the primary format for passing data; today, AJAX is used to pass many types of data, including JSON.

The most common use of AJAX is to update the content or layout of a web page without refreshing the current page. Historically, whenever a user loaded a page, all of its assets had to be requested from the server and then rendered. With AJAX, only the assets that differ between pages need to be loaded, so the user does not have to refresh the entire page. This improves the overall user experience considerably.

A good way to think about AJAX is as a series of mini server calls. A well-known example is Google Maps: you do not have to reload the entire page to see a different place. Instead, Google Maps makes mini server calls to fetch the assets it needs, and the page updates without a full reload.
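To make the idea of a mini server call concrete, here is a minimal sketch using the browser's standard fetch API. The endpoint /fragments/reviews and the element id "reviews" are hypothetical names chosen for illustration, and the fetchImpl parameter is only there so the function can be exercised outside a browser.

```javascript
// Fetch one fragment from the server and swap it into an existing element,
// leaving the rest of the page (and its URL) untouched.
async function updateSection(url, target, fetchImpl = fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Request for ${url} failed with status ${response.status}`);
  }
  // Only this element changes; no full page reload takes place.
  target.innerHTML = await response.text();
  return target;
}

// Browser usage (hypothetical endpoint and element id):
// updateSection('/fragments/reviews', document.getElementById('reviews'));
```

Note that content loaded this way exists only in the DOM after the script runs, which is why the rest of this article cares about what bots can render.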

  • The DOM (Document Object Model)

As an SEO professional, you may already know what the DOM is. If you do not, the DOM is what Google uses to analyze, inspect, and understand web pages. You can see the DOM whenever you choose “Inspect Element” in a browser. A simple way of looking at it: the DOM is what the browser builds from the HTML document it receives so that it can start rendering the page.

The entire process starts with the browser receiving the HTML document. It then parses the content of that document and fetches additional resources such as CSS and JavaScript files. The DOM is built from this parsing of content and resources. It can be understood as the structured, organized version of the page’s code.

Today, the DOM is usually very different from the initial HTML document because of dynamic HTML. Dynamic HTML is a page’s ability to change the content it shows depending on the user’s input, environmental conditions, and other variables. Simply put, dynamic HTML leverages HTML, CSS, and JavaScript together.

  • Headless Browsing

Headless browsing is basically fetching and rendering web pages without the user interface. It matters because Google and Baidu use headless browsing to better understand the content of a page and the user’s experience.

Examples of scripted headless browsers are PhantomJS and Zombie.js. They are commonly used to automate web interaction for research purposes and to render static HTML snapshots for initial requests.

 

JavaScript, SEO, and Fixing the Issues

Usually, there are three main reasons to be concerned about the JavaScript on your site:

  • Crawlability – The ability of bots to crawl your website.
  • Obtainability – The ability of bots to access your site’s information and parse its content.
  • Perceived Site Latency – Closely tied to the Critical Rendering Path.

 

  • Crawlability

One of the main jobs of web developers and webmasters is to make sure that bots can find their URLs and understand their site’s structure. Two key elements are in play:

  1. Not blocking search engines from your site’s JavaScript
  2. Proper internal linking, which means not using JavaScript as a replacement for HTML anchor tags

Why You Should Unblock JavaScript

When JavaScript is blocked, search engines cannot receive your site’s full experience. This means that the search engine cannot see what end users see on your site. This can reduce your site’s appeal to search engines, and can even lead them to interpret it as cloaking.

One of the best ways to see whether Googlebot is blocked from any resources is to use Fetch as Google, Fetch and Render, and TechnicalSEO.com’s robots.txt testing tool. Once you have identified the blocked resources, the best thing any web developer or webmaster can do is unblock them and give Google the access it needs.
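As a rough illustration of what such a robots.txt test does, the sketch below checks whether a resource path is disallowed for a given crawler. It is a simplified assumption-laden version: it handles only User-agent, Allow, and Disallow lines with plain prefix matching, ignores the * and $ wildcards inside paths, and the function names are made up for this example. Use a real testing tool for anything that matters.

```javascript
// Parse robots.txt text into groups of { agents, rules }.
function parseRobots(text) {
  const groups = [];
  let current = null;
  let lastWasAgent = false;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, '').trim(); // drop comments
    const separator = line.indexOf(':');
    if (separator === -1) continue;
    const field = line.slice(0, separator).trim().toLowerCase();
    const value = line.slice(separator + 1).trim();
    if (field === 'user-agent') {
      // Consecutive User-agent lines share one group of rules.
      if (!lastWasAgent) {
        current = { agents: [], rules: [] };
        groups.push(current);
      }
      current.agents.push(value.toLowerCase());
      lastWasAgent = true;
    } else if ((field === 'allow' || field === 'disallow') && current) {
      // An empty Disallow value blocks nothing, so it is skipped.
      if (value) current.rules.push({ allow: field === 'allow', path: value });
      lastWasAgent = false;
    } else {
      lastWasAgent = false;
    }
  }
  return groups;
}

// Returns true when the path is disallowed for the given crawler name.
function isBlocked(robotsTxt, userAgent, path) {
  const groups = parseRobots(robotsTxt);
  const agent = userAgent.toLowerCase();
  // Prefer a group that names the agent; otherwise fall back to "*".
  const group =
    groups.find((g) => g.agents.some((a) => a !== '*' && agent.includes(a))) ||
    groups.find((g) => g.agents.includes('*'));
  if (!group) return false;
  // Longest matching prefix wins; on a tie, Allow wins.
  let best = null;
  for (const rule of group.rules) {
    if (!path.startsWith(rule.path)) continue;
    if (!best || rule.path.length > best.path.length ||
        (rule.path.length === best.path.length && rule.allow)) {
      best = rule;
    }
  }
  return best ? !best.allow : false;
}
```

For example, with a rule such as "Disallow: /assets/js/" under "User-agent: *", calling isBlocked(robotsTxt, 'SomeBot', '/assets/js/app.js') reports the script as blocked, which is exactly the situation to fix.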

Internal Linking

Internal linking should be one of a webmaster’s top priorities whenever they are building or cleaning up their site’s architecture. Internal links should be implemented with regular anchor tags within the HTML or the DOM.

More importantly, do not use JavaScript’s onclick events as a replacement for internal links. Although the end URLs might still be found and crawled through the JavaScript code or XML sitemaps, they will not be associated with your site’s global navigation.

Internal Linking is one of the main signals for search engines to understand your site’s architecture, and for them to know the importance of your pages. You should not underestimate the power of internal links because sometimes, they can override “SEO hints” like canonical tags.
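A quick way to spot onclick-based navigation is to scan a page's markup for it. The sketch below is a rough, regex-based audit written for this article (the function name and return shape are assumptions); it can be fooled by unusual markup, such as a ">" inside an attribute value, so treat its output as a hint and not a verdict.

```javascript
// Separate links a crawler can follow (<a href="/path">) from elements that
// navigate only through an onclick handler.
function auditLinks(html) {
  const crawlable = [];
  const suspicious = [];
  for (const match of html.matchAll(/<(\w+)\b([^>]*)>/g)) {
    const [tag, name, attrs] = match;
    // Require whitespace before "href" so "location.href=" inside an
    // onclick value is not mistaken for a real href attribute.
    const href = /(?:^|\s)href\s*=\s*["']([^"']*)["']/i.exec(attrs);
    const hasOnclick = /(?:^|\s)onclick\s*=/i.test(attrs);
    const isRealUrl =
      href && href[1] !== '' && href[1] !== '#' &&
      !href[1].toLowerCase().startsWith('javascript:');
    if (name.toLowerCase() === 'a' && isRealUrl) {
      crawlable.push(href[1]);
    } else if (hasOnclick) {
      suspicious.push(tag);
    }
  }
  return { crawlable, suspicious };
}
```

Anything that lands in the suspicious list is a candidate for conversion to a regular anchor tag with a real href.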

URL Structure

Normally, AJAX sites and other JavaScript-based websites use fragment identifiers (the hash symbol, #) within their URLs.

  • Not Recommended
    • Only the Hash (#) – This is not crawlable. It is mainly used for anchor links, the links that let a user jump to a piece of content within a page. The important thing to remember is that anything after the hash (#) in a URL is never sent to the server; instead, the browser scrolls to the first element with a matching ID. Google also recommends avoiding “#” in URLs.
    • The Hashbang (#!) – Hashbang URLs were originally a hack to support crawlers. A few years back, Google and Bing developed a complex AJAX solution in which the hashbang (#!) URL served the user experience and co-existed with an “_escaped_fragment_” HTML-based experience for bots. Google has since rescinded this recommendation and prefers to receive the exact experience that users have. The escaped fragment scheme involves two experiences:
      • Original Experience – This URL must either contain a hashbang (#!) to indicate that there is an escaped fragment, or the page must include a meta element indicating that an escaped fragment exists.
      • Escaped Fragment – This replaces the hashbang (#!) in the URL with “_escaped_fragment_” and serves the HTML snapshot. It is also called the ugly URL because it is long and looks like a hack.
  • Recommended
    • pushState History API – pushState is navigation-based and is part of the History API (think of your browsing history). pushState updates the URL in the address bar, and only what needs to change on the page is updated. It allows JavaScript sites to use clean URLs. Google currently supports it when it is used for browser navigation in client-side or hybrid rendering.
      • A good use of pushState is “infinite scroll”, where the URL automatically updates as the user reaches a new part of the page. If the user refreshes the page, they land on the same part once the refresh finishes. With pushState, however, they do not need to refresh at all, because the URL updates automatically when they reach new content.
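The difference between the two URL styles can be sketched in a few lines. The first function shows what the server receives once the browser has stripped the fragment; the second shows navigation with history.pushState. The function names, the render callback, and the example paths are hypothetical, and the historyImpl parameter exists only so the code can run outside a browser.

```javascript
// The part of a URL that actually reaches the server: the browser removes
// the fragment (everything from "#") before sending the request.
function serverVisibleUrl(rawUrl) {
  const url = new URL(rawUrl);
  return url.origin + url.pathname + url.search;
}

// Client-side navigation with a real path instead of a fragment.
// `render` is a hypothetical function that updates the page content.
function navigateTo(path, render, historyImpl = window.history) {
  // Updates the address bar without reloading the page.
  historyImpl.pushState({ path }, '', path);
  render(path);
}

// Back/forward buttons are handled through the popstate event:
// window.addEventListener('popstate', () => render(location.pathname));
```

With a hashbang URL, the server sees the same address for every view, while each pushState path is a distinct URL that a server (and a crawler) can request directly, provided the server is set up to answer it.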

  • Obtainability

Search engines are known to use headless browsing to render the DOM and gain a better understanding of the page’s content and the user’s experience. This means that Google can process JavaScript and use the DOM instead of just the HTML document.

On the other hand, there are still situations in which search engines have difficulty comprehending a site’s JavaScript. It is important for a webmaster or web developer to understand how bots crawl and interact with their site’s content. If you are not sure, run tests.

For search engine bots that execute JavaScript, a few things can keep them from obtaining a site’s content:

  • If your site requires some action from the user, search engines probably do not see it.
    • Google’s bots do not click, type, or perform any other activity that requires a user’s action. So, if content on your website depends on such actions, Google probably does not get the same experience as the end user. That is why you should make sure that bots and actual users receive the same experience.
  • If your JavaScript takes more than five seconds to load, search engines may not see your page.
    • There is no known fixed timeout value; however, sites should aim to load in five seconds or less.
  • If there are errors in your JavaScript, both browsers and search engines may miss sections of your page because the code is not executed properly.
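On the last point, one defensive pattern is to run each section's rendering code independently, so that an error in one widget does not stop the rest of the page from appearing. This is only a sketch of that idea; the function name and the section names in the usage note are made up for illustration.

```javascript
// Run each section renderer in isolation so that one thrown error does not
// prevent the remaining sections from rendering.
function renderSections(renderers) {
  const failures = [];
  for (const [name, render] of Object.entries(renderers)) {
    try {
      render();
    } catch (error) {
      // Record the failure for logging and keep going.
      failures.push({ name, message: error.message });
    }
  }
  return failures;
}
```

Called with, say, header, reviews, and footer renderers, a crash inside reviews would still leave the header and footer in the DOM for both users and bots, and the returned list tells you what to fix.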

Making Sure Search Engines Get Your Content

  1. Test

The most common way to handle JavaScript is to just let it be and let Google’s algorithm do its work. Giving Google the same experience that users have is Google’s preferred choice. In 2014, Google first announced that it was able to better understand the web, including JavaScript and other elements. However, industry experts have speculated that Google could crawl JavaScript websites well before that announcement. Hence, if you can see your website’s content in the DOM, there is a high probability that Google is parsing it.

A recent study conducted by Bartosz Goralewicz tested a combination of JavaScript libraries and frameworks to determine how Google interacts with them. The test concluded that Google can interact with many forms of JavaScript, and showed that some forms are more challenging for it than others.

Studies like these help SEO practitioners understand how to build their websites effectively and take a more proactive role. However, it is still better for webmasters and web developers to make a habit of testing and experimenting with small sections of their own website to find the solution that suits them. You can test by:

  1. Confirming that your site’s content can be viewed in the DOM
  2. Testing a group of pages to check whether Google can index their content
  • Search for some exact quotes from your content
  • Fetch the page with Fetch as Google and see whether the content appears.
    • An important note: Fetch as Google usually captures the page around the load event or before a timeout. It is a great way to check whether Google can see your content and whether robots.txt is blocking your JavaScript. This method is still subject to error, but it is a really good first step.
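The first check, confirming that content is visible in the DOM, can be sketched as a comparison between the raw HTML the server sends and the markup after JavaScript has run. The function name and the status labels below are assumptions made for this example.

```javascript
// Classify each phrase by where it first becomes visible:
// in the raw HTML, only in the rendered DOM, or nowhere.
function locateContent(phrases, rawHtml, renderedHtml) {
  const report = {};
  for (const phrase of phrases) {
    if (rawHtml.includes(phrase)) {
      report[phrase] = 'in source HTML';
    } else if (renderedHtml.includes(phrase)) {
      report[phrase] = 'only after JavaScript runs';
    } else {
      report[phrase] = 'missing';
    }
  }
  return report;
}

// In a browser console, the two inputs could be gathered like this:
// const rendered = document.documentElement.outerHTML;
// const raw = await fetch(location.href).then((r) => r.text());
```

Phrases reported as appearing only after JavaScript runs depend on the crawler rendering the page, so those are the ones worth verifying with a fetch-and-render tool.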

After all the testing, what should you do if search engine bots still cannot index and obtain your content? If your testing uncovers such errors, you can opt to try HTML snapshots.

  2. HTML Snapshot

HTML snapshots are basically fully rendered pages that can be returned to search engine bots. They are a controversial topic when it comes to Google; however, it is important to understand them because there are cases where they are your only option.

When search engines and other sites such as Facebook cannot process your JavaScript, it is better to return an HTML snapshot than to have your content not indexed, or not understood at all.

Additionally, take note that Google wants the same experience that users have. So, only provide snapshots if it is absolutely necessary and other experts or forums could not help you.

Whenever you consider using HTML snapshots, remember two things. First, Google has already deprecated this AJAX crawling recommendation. Although it is still supported in some ways, Google recommends avoiding it. This direction is understandable because Google wants the same experience as users.

Second, there is the risk of cloaking. If you return an HTML snapshot and the search engine finds that it does not represent the experience on your website, it is considered a cloaking risk.
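As a sketch of how snapshot serving might be decided on the server, the function below returns a pre-rendered snapshot to known crawlers and the normal JavaScript app shell to everyone else. Everything here is an assumption for illustration: the function name, the Map of snapshots keyed by path, and the short list of user-agent substrings. The snapshot must contain the same content users see, or the cloaking risk described above applies.

```javascript
// User-agent substrings for a few crawlers; an illustrative, incomplete list.
const BOT_PATTERN = /googlebot|bingbot|facebookexternalhit/i;

// Decide which body to send: a pre-rendered snapshot for known crawlers
// (when one exists for the path), otherwise the regular app shell.
function selectBody(userAgent, path, snapshots, appShell) {
  const snapshot = snapshots.get(path);
  if (BOT_PATTERN.test(userAgent || '') && snapshot !== undefined) {
    return { body: snapshot, servedSnapshot: true };
  }
  return { body: appShell, servedSnapshot: false };
}
```

Keeping the decision in one small function makes it easy to log how often snapshots are served and to remove the branch once crawlers handle the site's JavaScript on their own.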

Despite these drawbacks, HTML snapshots have their advantages:

  1. They help search engines and crawlers understand the experience on the website, because some types of JavaScript are harder to process than others.
  2. Other search engines and crawlers should be able to understand the user’s experience. Bing, like some other search engines, has not announced that it can crawl and index JavaScript. For JavaScript-heavy websites, HTML snapshots may be the only solution. However, verify that this is really the case before using them.

  • Site Latency

When browsers receive the HTML document and create the DOM, most of the resources on the page are loaded in the order they appear in the document. Simply put, if your HTML document has a large file at the top, the browser will load that file first.

Google’s critical rendering path concept is aimed at improving user experience: the browser should load whatever the user needs as soon as possible. However, if your HTML document has unnecessary resources or JavaScript files blocking the page’s ability to load, you have “render-blocking JavaScript”. This means that your JavaScript is keeping the page from appearing to load faster (this appearance is known as perceived latency).

Use page speed measuring tools to test whether you have render-blocking JavaScript issues. If you do, there are three potential solutions:

  1. Inline: Add the JavaScript directly into the HTML document.
  2. Async: Add the “async” attribute to the script tag to make your JavaScript asynchronous.
  3. Defer: Lower the placement of the JavaScript in the document. However, remember that scripts should be arranged in order of priority. So, if a script is among the top priorities on your page, avoid deferring it to a lower position.
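To get a first impression before reaching for a page speed tool, the sketch below lists external scripts in a page's head that are likely to block rendering. It is a rough, regex-based check with an assumed function name: it treats a script as non-blocking when it carries the standard async or defer attribute or is a module script (which browsers defer by default), and it does not look at inline scripts or stylesheets.

```javascript
// List external scripts inside <head> that have neither "async" nor "defer"
// and are not module scripts; these pause HTML parsing while they load.
function findRenderBlockingScripts(html) {
  const head = /<head\b[^>]*>([\s\S]*?)<\/head>/i.exec(html);
  if (!head) return [];
  const blocking = [];
  for (const match of head[1].matchAll(/<script\b([^>]*)>/gi)) {
    const attrs = match[1];
    const src = /(?:^|\s)src\s*=\s*["']([^"']+)["']/i.exec(attrs);
    const isAsyncOrDeferred = /(?:^|\s)(?:async|defer)(?=\s|=|$)/i.test(attrs);
    const isModule = /(?:^|\s)type\s*=\s*["']module["']/i.test(attrs);
    if (src && !isAsyncOrDeferred && !isModule) {
      blocking.push(src[1]);
    }
  }
  return blocking;
}
```

Each path it returns is a candidate for one of the three fixes above: inline it if it is small and critical, mark it async if it is independent of the page, or move it lower if it can wait.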

Key Takeaway

The most important goal of any practitioner’s SEO campaign is to have their website crawled, indexed, and ranked. JavaScript is one of the main components of a website, but it can also break your goal of ranking on the first page of the SERPs.

Always make sure that the content of your website is crawlable and obtainable, and that the site delivers optimal latency. Although this leans toward technical SEO, the content of this article can still be immensely helpful for your SEO campaign. Keep all of this in mind, and you will have a great time optimizing your website.

Are you now able to use your JavaScript properly? Do you still need help with any other issues on your website? Tell me in the comments below, and let’s help each other out.

P.S.

SEO Hacker will be hosting a no-nonsense SEO conference: the SEO Summit 2017. I want to invite you to attend the Summit if you want to improve your business’ online presence. Renowned experts will be sharing their in-depth knowledge of SEO, digital marketing, conversion rate optimization, and much more.

About Sean

Sean is a Filipino motivational speaker and leadership speaker in the Philippines. He is the head honcho and editor-in-chief of SEO Hacker. He provides SEO services for companies in the Philippines and abroad. Connect with him on Facebook, LinkedIn, or Twitter. Check out his new project, Aquascape Philippines.