Understanding The Google AMP Cache


March 7th, 2017

Accelerated Mobile Pages (AMP) is a way to build web pages for static content that render fast. It is an open content ecosystem that aims to give users a great reading and browsing experience.

Speed has always been a key quality criterion for Google, as page load speed is the first impression that a user gets of a website.
The AMP ecosystem relies heavily on caching. Hence, understanding the AMP cache is imperative for every SEO and online marketer who advocates implementing AMP to their clients.


AMP in action consists of three different parts:
• AMP HTML
• AMP JS
• Google AMP cache

AMP HTML

AMP HTML is a subset of HTML for authoring content pages such as news articles. It helps improve UX by reducing page load time.

AMP JS

To avoid delays in page rendering, AMP JS manages resource handling and asynchronous loading, and keeps third-party JavaScript out of the critical path.

Google AMP Cache

The Google AMP Cache is a proxy-based content delivery network for delivering all valid AMP documents. It fetches AMP HTML pages, caches them, and improves page performance automatically.

What is a Cache server?

A cache server is a dedicated network server or service acting as a server that saves Web pages or other Internet content locally. By placing previously requested information in temporary storage, or cache, a cache server both speeds up access to data and reduces demand on an enterprise’s bandwidth.
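
The idea can be sketched in a few lines of Python. This is a toy in-memory illustration only, not how any real cache server is built; the `PageCache` class, the `fake_origin` function and the example URL are hypothetical names made up for this sketch.

```python
from typing import Callable, Dict


class PageCache:
    """Toy in-memory cache that sits in front of a slower origin fetch."""

    def __init__(self, fetch_from_origin: Callable[[str], str]) -> None:
        self._fetch = fetch_from_origin   # hypothetical origin fetcher
        self._store: Dict[str, str] = {}  # temporary storage: URL -> page body
        self.origin_requests = 0          # counts trips to the origin

    def get(self, url: str) -> str:
        # Serve from local storage when the page was requested before.
        if url in self._store:
            return self._store[url]
        # Otherwise go to the origin once and remember the result.
        self.origin_requests += 1
        body = self._fetch(url)
        self._store[url] = body
        return body


def fake_origin(url: str) -> str:
    """Stand-in for a real HTTP request to the publisher's server."""
    return f"<html><body>content of {url}</body></html>"


page_cache = PageCache(fake_origin)
first = page_cache.get("https://example.com/article")
second = page_cache.get("https://example.com/article")
```

Both calls return the same page, but only the first one reaches the origin. That is where the faster access and the bandwidth saving come from.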

What is Google AMP Cache?

The Google AMP Cache is a cache of validated AMP documents published to the web. It is available for anyone to use. Google products, including Google Search, serve valid AMP documents and their resources from the cache to provide a fast user experience across the mobile web.

 
The concerns regarding AMP Cache are:

• The major concern for many is that their pages are served from a cache server that is outside their control.
• Many are also worried that the cached version served to the user may differ from the original version.
• Analytics attribution is another concern regarding AMP.

But in fact, there is no reason to worry about the AMP Cache delivery network.

• Every time AMP content is served from the AMP Cache, the cache automatically updates its copy, and the updated version is served to the next user.
• The AMP Cache uses the “stale-while-revalidate” model. It uses the origin server's caching headers, such as max-age, to judge the freshness of the content.
• The AMP Cache model attributes traffic to the publisher through the <amp-analytics> tag. AMP supports a large number of analytics providers.
• When the AMP page is requested from the AMP cache, it is validated first. The page will not be served to the user if any problems are detected. Hence, it ensures quality pages are displayed on the user’s device.
• The main reason for the fast, user-friendly experience is the pre-rendering feature. Because pages are pre-rendered, they are served instantly.
• Pre-rendering can undoubtedly be done from origin servers instead of the AMP Cache, but that uses up bandwidth, CPU resources and battery. This may affect speed and defeat the main purpose of hosting AMP pages.
• The AMP Cache also provides security against XSS, as it does not allow cross-site scripting in the main document.
• Moreover, AMP is a win-win for users, publishers and advertisers. When you search on a search platform, the AMP pages in the results come from the cache, so the search term is not associated with them.
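
The “stale-while-revalidate” behaviour described in the list above can be sketched as follows. This is a simplified illustration under stated assumptions, not Google's implementation: the class and function names are hypothetical, the refresh happens synchronously here (a real cache would normally revalidate in the background), and an injectable clock replaces real time so the example is deterministic.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


def parse_max_age(cache_control: str, default: int = 0) -> int:
    """Extract max-age (in seconds) from a Cache-Control header value."""
    match = re.search(r"max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else default


@dataclass
class Entry:
    body: str
    fetched_at: float
    max_age: int


class StaleWhileRevalidateCache:
    def __init__(self,
                 fetch: Callable[[str], Tuple[str, str]],
                 clock: Callable[[], float]) -> None:
        self._fetch = fetch  # hypothetical: returns (body, Cache-Control value)
        self._clock = clock  # injectable clock, in seconds
        self._entries: Dict[str, Entry] = {}

    def _refresh(self, url: str) -> Entry:
        body, cache_control = self._fetch(url)
        entry = Entry(body, self._clock(), parse_max_age(cache_control))
        self._entries[url] = entry
        return entry

    def get(self, url: str) -> str:
        entry = self._entries.get(url)
        if entry is None:
            return self._refresh(url).body  # cold cache: must fetch first
        body = entry.body                   # always answer from the cache
        if self._clock() - entry.fetched_at > entry.max_age:
            self._refresh(url)              # stale: update for the next user
        return body


# Demo with a fake origin that returns a new version on every fetch.
now = [0.0]
versions = iter(["v1", "v2", "v3"])


def origin(url: str) -> Tuple[str, str]:
    return next(versions), "public, max-age=60"


swr_cache = StaleWhileRevalidateCache(origin, lambda: now[0])
results = [swr_cache.get("https://example.com/a")]   # cold fetch
now[0] = 30.0
results.append(swr_cache.get("https://example.com/a"))  # still fresh
now[0] = 90.0
results.append(swr_cache.get("https://example.com/a"))  # stale, then refreshed
results.append(swr_cache.get("https://example.com/a"))  # next user
```

In the demo, the third request arrives after max-age has expired. It is still answered with the cached copy, and the refreshed version goes to the request after it, which matches the "updated version is served to the next user" behaviour.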

Optimizations, Modifications And HTML Sanitization

The Google AMP Cache performs optimizations and modifications, such as the following:
• Validates that the content meets all the AMP norms.
• Caches images and fonts along with the document.
• Optimizes image dimensions to prevent memory issues and poor responsiveness.
• Removes EXIF data and other image metadata that is invisible or can create issues.
• Converts GIF, PNG, and JPEG format images to WebP in browsers that support WebP.
• Serves over a secure channel (HTTPS) and uses the latest web protocols (SPDY, HTTP/2).
• In making the above transformations, the Google AMP Cache disregards the “Cache-Control: no-transform” header.
• All HTML comments are removed.
• Tag and attribute names are rewritten in lowercase.
• Attribute values are consistently quoted and escaped.
• All tags are closed, except for HTML5 void elements.
• Whitespace inside tags is removed.
• Text is escaped.
• Encoded text characters are simplified, using UTF-8 equivalent characters.
• Elements that can only be in the body get moved into the body.
• Outbound links are made absolute so that they continue to work when the document is served from the Google AMP Cache origin instead of the publisher origin.
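
Several of the sanitization steps in this list can be illustrated with Python's standard library. The sketch below is only an approximation written for this article, not the Google AMP Cache's actual code: the `Sanitizer` class, the `sanitize` function and the publisher URL are hypothetical, and it covers just comment removal, lowercase names, quoted attributes, escaped text, void elements and absolute links.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin

# HTML5 void elements never get a closing tag.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
                 "link", "meta", "source", "track", "wbr"}


class Sanitizer(HTMLParser):
    def __init__(self, base_url: str) -> None:
        super().__init__(convert_charrefs=True)
        self.base_url = base_url  # publisher URL used to resolve links
        self.out = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        parts = [tag]
        for name, value in attrs:
            if value is None:  # attribute without a value
                parts.append(name)
                continue
            if tag == "a" and name == "href":
                value = urljoin(self.base_url, value)  # make link absolute
            parts.append(f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + ">")

    def handle_endtag(self, tag):
        if tag not in VOID_ELEMENTS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))  # text is escaped

    # handle_comment is not overridden, so HTML comments are dropped.


def sanitize(html: str, base_url: str) -> str:
    parser = Sanitizer(base_url)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


messy = '<P CLASS=intro><!-- note --><A HREF="/about">Tom &amp; Co</A><BR></P>'
clean = sanitize(messy, "https://publisher.example/news/story.html")
# clean == '<p class="intro"><a href="https://publisher.example/about">Tom &amp; Co</a><br></p>'
```

The relative link `/about` is resolved against the publisher's URL, so it keeps pointing at the publisher's site even when the document is served from a different origin.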

The Google AMP Cache not only optimizes web pages but also validates them. Regarding ads, last year the Google AMP team laid out four core principles: speed, beauty, security, and how to bring all three together.

If the AMP pages are being served from the cache then how will the ads be displayed?

The AMP Project announced the open source AMP for Ads initiative in July to work on the UX regarding ads on AMP pages. The initiative’s goal is to fix the foundation of digital advertising on the web, applying the principles of AMP to advertising and making ads faster, more beautiful and secure.

Since the announcement, AMP Ads have come a long way. Publishers across the world, such as The Washington Post, The Guardian and USA Today, have been testing the AMP Ads experience for publishers and users.

If the AMP pages are being served from the cache, then what about the ads on the AMP pages?

There is still a long way to go, and Google is still exploring the use of AMP for ads.
According to Google, in the future:
• ads will be statically analyzable to be safe and behave well,
• they will be able to use a common set of functionality to significantly reduce bandwidth consumption,
• CPU usage will be limited to on-screen ads, maximizing battery life,
• and ads will be coordinated with the page making sure that primary content and functionality can always be buttery smooth at 60 frames per second.

Google says it is trying to build a user-experience-first ecosystem for advertising on the web and, looking at the success of AMP in publishing, it might just work.

About The Author

Bharati Ahuja is the founder of WebPro Technologies, a web solutions company based in India that focuses on building a quality web presence for businesses. Bharati Ahuja is also an SEO trainer and speaker, web entrepreneur, blog writer and internet marketing consultant.

Website: http://www.webpro.in/about-bharati-ahuja/
