As you may already know, Google grabs a copy of each web page it crawls and files it away in its cache. There are exceptions to this behavior, as some website owners ask Google not to cache their content, either through robots.txt or by password protecting their sites. But the vast majority of the data Google crawls is cached and accessible via the Cached link on the search result page or by using the cache operator.
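The cache operator is just a prefix on an ordinary search query, so a cache lookup can be written as a search URL. Here is a minimal Python sketch of that idea. The function name cache_query_url and the page example.com/some/page are made up for illustration, and the https://www.google.com/search?q=... address is assumed to be the search endpoint you would normally type the query into.

```python
from urllib.parse import urlencode


def cache_query_url(page: str) -> str:
    """Build a search URL that asks for the cached copy of `page`.

    `page` is the address without the scheme, e.g. "example.com/some/page".
    The endpoint below is an assumption, not something taken from the article.
    """
    # The operator is written as a prefix, with no space after the colon.
    query = f"cache:{page}"
    # urlencode percent-encodes the colon and slashes so the query survives as one parameter.
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # Hypothetical page, used only to show the shape of the result.
    print(cache_query_url("example.com/some/page"))
```

Typing the query straight into the search box does the same job; the sketch only spells out what ends up in the address bar.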
The Google Cache Banner
For example, to see Google’s cache of this blog, enter the following in Google’s search box: cache:awads.net/wp. You will notice a banner at the top of the page that carries some really important information.
Most likely you rarely pay attention to the banner and just blow right past it. If you look closely, you will notice the following: “This cached page may reference images which are no longer available.” This means that your browser is actually contacting the original web server that hosts the page and fetching all the images directly from there. It also means that the website hosting the original page knows about your visit and can log and track your IP address, even when you view the page through the Google cache.
So, if you were striving for anonymity by viewing the Google cached page, you just blew your cover! And if you were striving for maximum page load speed by viewing the cached version, you just fetched the images (and other stuff) directly from the external website, making the page load more slowly in your browser (unless you want to view the images, of course).
Cached Text Only
But don’t give up just yet. Notice the “Click here for the cached text only” link in the Google cache page header. This gives you the option to view only the data that Google has captured, without any external references, without any style sheets, without any JavaScript, without any Flash, without any Java applets… just plain old HTML and text. Viewing a web page with JavaScript disabled also makes for safer browsing (there is even a Firefox add-on, NoScript, that does just that). Moreover, using the cached text only, you communicate only with the Google server, bypassing any connection with the external server where the original page is stored.
When you click on the cached text only link, all Google does is append &strip=1 to the cache URL. Notice that the banner at the top of the page looks different now.
The &strip=1 parameter strips out all the “bells and whistles”, leaving you with a page that could look quite different from the original, but still has the “meat” that makes it useful to read.
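Since the text-only view is nothing more than an extra query parameter, the change is easy to express in code. The Python sketch below appends strip=1 to a cache URL and leaves the URL alone if the parameter is already there. The function name cached_text_only and the host cache.example are placeholders invented for this example; a real cache URL is whatever the Cached link points to.

```python
from urllib.parse import parse_qsl, urlsplit, urlunsplit


def cached_text_only(cache_url: str) -> str:
    """Return `cache_url` with strip=1 added, i.e. the cached-text-only view."""
    parts = urlsplit(cache_url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    # Already the stripped version: nothing to do.
    if ("strip", "1") in params:
        return cache_url
    # Append to the existing query string so its original encoding is preserved.
    new_query = f"{parts.query}&strip=1" if parts.query else "strip=1"
    # Rebuild the URL; any #fragment stays after the query string.
    return urlunsplit(parts._replace(query=new_query))


if __name__ == "__main__":
    # Placeholder URL, not a real cache address.
    print(cached_text_only("http://cache.example/search?q=cache:example.com/contact"))
```

Splitting the URL first, instead of gluing &strip=1 onto the end of the string, keeps the parameter in the right place when the address ends in a #fragment.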
Cut and Paste
So, you can browse most of the web safely and anonymously using a quick cut and paste and a URL modification. For example, the Google query site:awads.net inurl:contact returns one result. Instead of clicking the Cached link, right-click it and copy the URL to the clipboard (Firefox: Copy Link Location, IE: Copy Shortcut), then paste it into the address bar of your browser. Append &strip=1 to the end of the URL and hit Enter. You will be taken directly to the stripped version of the cached page.
Greasemonkey
By now, you’re probably saying: but that’s a lot of work, this copy and paste business. Well, again, don’t give up just yet. There are a couple of Greasemonkey scripts (Firefox only for now) that will make the whole experience with Google cache seamless, easy and fast.
Greasemonkey is a Firefox add-on. It allows you to customize the way a web page displays and functions using small bits of JavaScript. You can download it from the Firefox add-ons site.
After installing Greasemonkey you need to feed it user scripts. You can pick from hundreds of user scripts available at userscripts.org. Here are two that work with Google cache:
I have found the above user scripts very handy and especially useful when used together.
Hi Eddie
Reached your blog from blogs.oracle.com Nice article…bringing out the details…i read this message many times but never so much carefully !
Google Rocks !
Thanks Sidhu
Yes, Google rocks indeed. I’m always interested in the different ways I can use Google to access all kinds of information.