Back to Basics
When webmasters are creating a website, there are 3 major components that they need to take note of, namely:
- HTML – Short for Hypertext Markup Language, it serves as the structure, backbone, and “skeletal system” of the website. It organizes the content and defines static elements such as headings, lists, and paragraphs.
- CSS – Short for Cascading Style Sheets, it is in charge of the website’s design, aesthetics, and style. In short, it is the page’s presentation layer.
- What You Need To Know About AJAX
The most commonly known use of AJAX is to update the content or layout of a web page without refreshing the current page. Historically, whenever a user loaded a page, every element of the page had to be transferred from the server to the browser before rendering could begin. With AJAX, only the elements or assets that differ between pages need to be loaded, so the user does not have to refresh the entire page. This improves the overall user experience considerably.
A helpful way to think about AJAX is as a series of mini server calls. A well-known example of AJAX in use is Google Maps: you do not have to refresh the entire page just to view a different place. Instead, Google Maps makes mini server calls to get the elements or assets it needs, and the page updates without a full reload.
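The idea of loading only what differs can be sketched with a toy model. This is not how a browser or Google Maps is implemented; the region names and contents below are made up for illustration, and each "page" is simply a mapping of region name to content.

```python
# Toy model of a partial (AJAX-style) update.
# The region names and contents are hypothetical, not taken from a real site.

def regions_to_fetch(current_page: dict, next_page: dict) -> dict:
    """Return only the regions whose content differs from what is already shown."""
    return {
        name: content
        for name, content in next_page.items()
        if current_page.get(name) != content
    }


current = {"header": "Site header", "map": "Tiles for Manila", "footer": "Site footer"}
target = {"header": "Site header", "map": "Tiles for Cebu", "footer": "Site footer"}

changed = regions_to_fetch(current, target)
current.update(changed)  # apply the partial update in place, no full reload
print(changed)  # {'map': 'Tiles for Cebu'}
```

Only the map region is transferred here; the header and footer stay untouched, which is the saving AJAX offers over a full page load.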
- The DOM (Document Object Model)
As an SEO professional, you might already know what the DOM is. If you do not, the DOM is what Google uses to analyze, inspect, and understand web pages. You can see the DOM whenever you use “Inspect Element” in a browser. A simple way of looking at the DOM is as the structure the browser builds when it receives the HTML document so that it can start rendering the page.
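To get a feel for how an HTML document becomes a tree, here is a minimal sketch using Python’s standard `html.parser` module. It is far simpler than a real browser’s DOM: it assumes well-formed HTML, and the sample markup and the `dom_outline` helper are illustrative names, not part of any browser API.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not increase nesting.
VOID_TAGS = {"br", "img", "meta", "link", "input", "hr"}


class OutlineBuilder(HTMLParser):
    """Record each element and text node, indented by its nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.lines = []

    def handle_starttag(self, tag, attrs):
        self.lines.append("  " * self.depth + tag)
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.lines.append("  " * self.depth + repr(text))


def dom_outline(html):
    """Return an indented outline of the document (assumes well-formed HTML)."""
    builder = OutlineBuilder()
    builder.feed(html)
    builder.close()
    return builder.lines


SAMPLE = "<html><body><h1>Title</h1><p>Hello <b>world</b></p></body></html>"
for line in dom_outline(SAMPLE):
    print(line)
```

The printed outline nests `h1` and `p` under `body`, which mirrors the parent and child relationships you see in the “Inspect Element” panel.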
- Headless Browsing
Headless browsing is the process of fetching and rendering web pages without a user interface. It has drawn attention recently because Google and Baidu use headless browsing to better understand the content of a page and the user’s experience.
Examples of scripted headless browsers are PhantomJS and Zombie.js. They are commonly used to automate web interaction for research purposes and to render static HTML snapshots for initial requests.
- Crawlability – The capability of bots to crawl your website.
- Obtainability – The capability of bots to access your site’s information, and to parse through its content.
- Perceived Site Latency – Also known as the Critical Rendering Path
One of the main jobs of web developers and webmasters is to make sure that bots can find their URLs and understand their site’s structure. Two key elements are in play:
One of the best ways to find out what Googlebot is blocked from is to use Fetch as Google, Fetch and Render, and TechnicalSEO.com’s robots.txt testing tool. Once you have identified the blocked resources, the best thing any web developer or webmaster can do is unblock them and give Google the access it needs.
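As a rough local check alongside those tools, Python’s standard `urllib.robotparser` can report which URLs a given robots.txt disallows. The rules and URLs below are invented for illustration, and this parser does not reproduce every detail of Googlebot’s own matching (wildcards, for instance), so treat the result as a first pass only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content and resource URLs, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /assets/js/
Disallow: /private/
"""

RESOURCES = [
    "https://www.example.com/index.html",
    "https://www.example.com/assets/js/app.js",
    "https://www.example.com/assets/css/site.css",
]


def blocked_resources(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


print(blocked_resources(ROBOTS_TXT, RESOURCES))
# ['https://www.example.com/assets/js/app.js']
```

A blocked script like the one flagged here is exactly the kind of resource worth unblocking, since the bot may need it to render the page.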
Internal linking should be one of a webmaster’s top priorities when building or cleaning up a site’s architecture. Internal links should be implemented with regular anchor tags within the HTML or the DOM.
Internal Linking is one of the main signals for search engines to understand your site’s architecture, and for them to know the importance of your pages. You should not underestimate the power of internal links because sometimes, they can override “SEO hints” like canonical tags.
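A quick way to audit this is to separate links written as regular anchor tags from elements that only navigate through a click handler. The sketch below uses Python’s standard `html.parser`; the markup, the `go()` function inside it, and the `audit_links` helper are all hypothetical, and the check is a heuristic rather than a statement of how any search engine classifies links.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Split elements into regular anchor links and click-handler-only links."""

    def __init__(self):
        super().__init__()
        self.crawlable = []  # href values of regular <a href="..."> links
        self.suspect = []    # tags that rely on onclick instead of a real href

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        href = (attr.get("href") or "").strip()
        if tag == "a" and href and not href.startswith(("#", "javascript:")):
            self.crawlable.append(href)
        elif "onclick" in attr:
            self.suspect.append(tag)


def audit_links(html):
    """Return (crawlable hrefs, tags that depend on onclick for navigation)."""
    audit = LinkAudit()
    audit.feed(html)
    audit.close()
    return audit.crawlable, audit.suspect


SAMPLE = """
<nav>
  <a href="/services/">Services</a>
  <a href="#" onclick="go('/pricing/')">Pricing</a>
  <span onclick="go('/contact/')">Contact</span>
</nav>
"""
crawlable, suspect = audit_links(SAMPLE)
print(crawlable)  # ['/services/']
print(suspect)    # ['a', 'span']
```

Only the first link exposes a real URL in its `href`; the other two depend on a user action, so a bot may never discover the pages behind them.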
- Not Recommended
- Only the Hash (#) – This is not crawlable. It is mainly used for anchor links, which let a user jump to a piece of content within a given page. The important thing to remember is that anything after the hash (#) in the URL is never sent to the server; it only causes the page to scroll immediately to the first element with a matching ID. Google also recommends avoiding the use of “#” in URLs.
- The Hashbang (#!) – Hashbang URLs were originally a hack to support crawlers. A few years back, Google and Bing developed a complex AJAX solution in which the hashbang (#!) URL, which delivers the user experience, co-existed with an HTML-based “_escaped_fragment_” experience for bots. Google has since rescinded this recommendation and now prefers to receive the exact experience users have. Within this scheme, there are two experiences:
- Original Experience – This URL must either possess a Hashbang (#!) within the URL to indicate there is an escaped fragment or a meta element which indicates there exists an escaped fragment.
- Escaped Fragment – This replaces the hashbang (#!) in the URL with “_escaped_fragment_” and serves the HTML snapshot. It is also called the Ugly URL because it is long, and somehow looks like a hack.
- A great use of pushState is “infinite scroll”, where the URL automatically updates as the user reaches a new part of the page. Because the URL always reflects the content currently in view, a user who refreshes the page lands on the same part once the refresh is finished, and the page never has to reload just to change the URL.
- If your site requires some actions from the users, search engines probably don’t see it.
- Google’s bots do not have the capability to click, write, or do any other activity that requires a user’s actions. So, if your website has elements like this, Google probably does not have the same experience as the end user. That is why it is important for you to be mindful that both the bots and actual users should have the same experience.
- There is no known timeout value for websites; however, they should aim to load in 5 seconds or less.
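To make the hash (#) and hashbang (#!) items in the list above concrete, here is a small sketch using Python’s standard `urllib.parse`. The first helper shows what is left of a URL once the fragment is dropped, which is the part that reaches the server. The second maps a hashbang URL to its “_escaped_fragment_” form; its character escaping is simplified and is an assumption of this sketch, not a complete implementation of the retired scheme. The URLs are made up.

```python
from urllib.parse import quote, urldefrag


def url_sent_to_server(url):
    """Drop the fragment, which is not included in the request to the server."""
    return urldefrag(url).url


def escaped_fragment_url(url):
    """Map a hashbang URL to its _escaped_fragment_ form (simplified escaping)."""
    base, marker, state = url.partition("#!")
    if not marker:
        return url  # not a hashbang URL, nothing to map
    joiner = "&" if "?" in base else "?"
    return base + joiner + "_escaped_fragment_=" + quote(state, safe="=/")


print(url_sent_to_server("https://www.example.com/guide#pricing"))
# https://www.example.com/guide
print(escaped_fragment_url("https://www.example.com/#!key=value"))
# https://www.example.com/?_escaped_fragment_=key=value
```

The second output shows why this form earned the name “Ugly URL”, and the first shows why a plain hash cannot identify a separate page to a crawler.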
Making Sure Search Engines Get Your Content
The studies conducted by different experts help SEO practitioners understand how to build their websites effectively and take a more proactive role in how their sites perform. Still, webmasters and web developers should make a habit of testing and experimenting with small sections of their website to find the solution that suits them. You can test by:
- Confirming that your site’s content can be viewed in the DOM
- Testing a group of pages to check whether Google can index their content
- Checking whether some quotes from your content can be found
- Fetching the pages with Fetch as Google to see if the content appears
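One part of these checks can be approximated offline: given the raw HTML a server returns, see which quotes from your content are already present before any script runs. The sketch below uses Python’s standard `html.parser`; the sample page and the helper names are hypothetical. It inspects only the server-delivered HTML, not the rendered DOM, so a quote reported as missing is a hint that the content depends on script execution, not proof that Google cannot see it.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, ignoring the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def visible_text(html):
    """Return the text of the raw HTML with whitespace collapsed."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    return " ".join(" ".join(extractor.parts).split())


def missing_quotes(html, quotes):
    """Return the quotes that do not appear in the raw HTML's text."""
    text = visible_text(html).lower()
    return [q for q in quotes if " ".join(q.lower().split()) not in text]


# Hypothetical page: the heading is static, the price is injected by script.
RAW_HTML = """
<html><body>
  <h1>Pricing plans</h1>
  <div id="app"></div>
  <script>document.getElementById("app").textContent = "Starter plan from $10";</script>
</body></html>
"""
print(missing_quotes(RAW_HTML, ["Pricing plans", "Starter plan from $10"]))
# ['Starter plan from $10']
```

Quotes that show up as missing here are good candidates to verify with the search and fetch checks in the list above.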
After all the testing, what should you do if search engine bots still cannot index and get your content? If your testing turns up errors, you can opt to try HTML snapshots.
- HTML Snapshot
These are fully rendered pages that can be returned to search engine bots. HTML snapshots are a controversial topic when it comes to Google; however, it is important to understand them because there are cases when using them is your only option.
Additionally, take note that Google wants the same experience that users have. So, only provide snapshots if it is absolutely necessary and other experts or forums have not been able to help you.
Whenever you consider using HTML snapshots, always remember that Google has already “devalued” this AJAX recommendation. Although Google still supports it in some ways, it recommends avoiding it. This direction is understandable, because Google wants to have the same experience as the users.
The second thing to take note of is the possibility of cloaking. If you return an HTML snapshot and the search engine finds that it does not represent the experience on your website, it is considered a cloaking risk.
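Before serving a snapshot, you can run a rough self-check that compares its text with the text users actually see. The sketch below uses Python’s standard `difflib`; the sample strings are invented and the 0.9 threshold is an arbitrary assumption. Search engines do not publish how they detect cloaking, so this is only a sanity check on your own side.

```python
from difflib import SequenceMatcher


def snapshot_similarity(snapshot_text, rendered_text):
    """Return a 0.0 to 1.0 word-level similarity between two page texts."""
    snapshot_words = snapshot_text.lower().split()
    rendered_words = rendered_text.lower().split()
    return SequenceMatcher(None, snapshot_words, rendered_words).ratio()


# Hypothetical texts: what users see versus two candidate snapshots.
rendered = "Affordable web design packages for small businesses"
good_snapshot = "Affordable web design packages for small businesses"
bad_snapshot = "Cheap loans casino pills best deals online"

THRESHOLD = 0.9  # arbitrary cut-off chosen for this sketch
for name, snapshot in [("good", good_snapshot), ("bad", bad_snapshot)]:
    score = snapshot_similarity(snapshot, rendered)
    print(name, "ok" if score >= THRESHOLD else "review before serving")
```

A snapshot that scores low against the rendered page deserves a manual review, since a large mismatch is what makes a snapshot look like cloaking.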
Despite all the drawbacks, HTML snapshots have their own advantages:
- Site Latency
Whenever a browser receives the HTML document and proceeds to create the DOM, most of the resources included in the page are loaded in the order they appear in the document. Simply put, if your HTML document has a large file at the top, the browser will load that file first.
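To see the order in which a page references its resources, the sketch below lists scripts, stylesheets, and images as they appear in the document and flags the ones that typically hold up rendering. It uses Python’s standard `html.parser`; the sample markup and file names are made up, and the blocking flag is a simplification (for example, it ignores module scripts and preloading).

```python
from html.parser import HTMLParser


class ResourceOrder(HTMLParser):
    """List external resources in document order with a rough blocking flag."""

    def __init__(self):
        super().__init__()
        self.resources = []  # tuples of (kind, url, blocking)

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "script" and attr.get("src"):
            # A script without async or defer pauses HTML parsing while it loads.
            blocking = "async" not in attr and "defer" not in attr
            self.resources.append(("script", attr["src"], blocking))
        elif tag == "link" and (attr.get("rel") or "").lower() == "stylesheet":
            if attr.get("href"):
                self.resources.append(("stylesheet", attr["href"], True))
        elif tag == "img" and attr.get("src"):
            self.resources.append(("image", attr["src"], False))


def resource_order(html):
    """Return the page's external resources in the order they are referenced."""
    parser = ResourceOrder()
    parser.feed(html)
    parser.close()
    return parser.resources


SAMPLE = """
<html><head>
<script src="/js/big-bundle.js"></script>
<link rel="stylesheet" href="/css/site.css">
<script src="/js/analytics.js" async></script>
</head><body><img src="/img/hero.jpg"></body></html>
"""
for kind, url, blocking in resource_order(SAMPLE):
    print(kind, url, "blocking" if blocking else "non-blocking")
```

In this sample, the large bundle sits first and is flagged as blocking, so it is the first candidate to defer, load asynchronously, or move lower in the document.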
Always make sure that the content of your website is crawlable and obtainable, and that your site latency is optimal. Although this leans toward technical SEO, the content of this article can still be immensely helpful for your SEO campaign. Keep everything mentioned here in mind, and you will have a great time optimizing your website.