PowWeb Forums - The Perfect Community for the Perfect Host  

Old 2-25-06, 05:12 PM   #1
produke
 
Join Date: Jul 2005
Location: USA
Posts: 119
Reputation: 41
Use htaccess to dramatically speed up your site!

I was looking into speeding up my site, specifically by sending headers that set custom caching rules.

Previously I was using PHP to send the necessary headers, but now I just have this in my /htdocs/.htaccess file.

Code:
### turn on the Expires engine
ExpiresActive On

### expires after a month in the client's cache
ExpiresByType image/gif A2592000
ExpiresByType image/png A2592000
### image/jpeg is the registered MIME type for JPEG files (image/jpg will not match them)
ExpiresByType image/jpeg A2592000
ExpiresByType image/x-icon A2592000
ExpiresByType application/pdf A2592000
ExpiresByType application/x-javascript A2592000
ExpiresByType text/plain A2592000

### expires after about 4.8 hours
ExpiresByType text/css A17200


Please note that the "A" before the numbers above stands for access: the clock starts when a client requests the file. You can also use "M" for modification, which starts the clock at the file's last-modified time instead.
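To make the A/M difference concrete, here is a minimal Python sketch of the arithmetic. The function name expiry_time is made up for illustration and is not part of Apache. The access time and modification time are taken from the favicon.ico headers in the wget capture further down.

```python
from datetime import datetime, timedelta, timezone

def expiry_time(code, accessed, modified):
    """Return the Expires time for a mod_expires code such as 'A2592000'.

    Hypothetical helper for illustration only; it mirrors the A/M rule
    described above, not Apache's actual implementation.
    """
    base_letter, seconds = code[0].upper(), int(code[1:])
    if base_letter == "A":
        base = accessed   # clock starts when the client requests the file
    elif base_letter == "M":
        base = modified   # clock starts at the file's last modification
    else:
        raise ValueError(f"unknown base {base_letter!r}")
    return base + timedelta(seconds=seconds)

# Values from the favicon.ico response in the wget capture
accessed = datetime(2006, 2, 25, 20, 59, 12, tzinfo=timezone.utc)
modified = datetime(2005, 12, 13, 12, 20, 26, tzinfo=timezone.utc)

print(expiry_time("A2592000", accessed, modified))  # 30 days after the request
print(expiry_time("M2592000", accessed, modified))  # 30 days after the last edit
```

With "A" the result lines up with the Expires: Mon, 27 Mar 2006 20:59:12 GMT header in the capture. With "M" the same file would already count as expired, because it was last modified more than 30 days before the request, so "M" only makes sense for files that change on a schedule.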

Then, from a Linux/BSD/Macintosh shell, run wget against the page whose headers you want to see.

When I use wget to do this, I specify the following flags:

-S // prints out the headers returned from the server
-p // also requests anything required to view the page (like CSS and images)
--spider // tells wget not to save anything, just to crawl


But don't take my word for it.. the proof is in the pudding.. below.

These are the results when not using Expires
Code:
[produke1@corruptu:~]$ wget -S -p --spider http://www.example.com
--21:00:19-- http://www.example.com/
=> `www.example.com/index.html'
Resolving www.example.com... done.
Connecting to www.example.com[34.34.34.34]:80... connected.
HTTP request sent, awaiting response...
1 HTTP/1.1 200 OK
2 Date: Sat, 25 Feb 2006 21:00:39 GMT
3 Server: Apache/1.3.34 (Unix) FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7d PowWeb/1.1
4 X-Powered-By: PHP/4.4.2
5 Keep-Alive: timeout=5, max=5
6 Connection: Keep-Alive
7 Content-Type: text/html
200 OK

Loading robots.txt; please ignore errors.
--21:00:20-- http://www.example.com/robots.txt
=> `www.example.com/robots.txt'
Connecting to www.example.com[34.34.34.34]:80... connected.
HTTP request sent, awaiting response...
1 HTTP/1.1 200 OK
2 Date: Sat, 25 Feb 2006 21:00:40 GMT
3 Server: Apache/1.3.34 (Unix) FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7d PowWeb/1.1
4 Last-Modified: Wed, 22 Feb 2006 23:23:13 GMT
5 ETag: "129ba06"
6 Accept-Ranges: bytes
7 Content-Length: 31
8 Keep-Alive: timeout=5, max=5
9 Connection: Keep-Alive
10 Content-Type: text/plain; charset=iso-8859-1
200 OK

--21:00:20-- http://www.example.com/favicon.ico
=> `www.example.com/favicon.ico'
Connecting to www.example.com[34.34.34.34]:80... connected.
HTTP request sent, awaiting response...
1 HTTP/1.1 200 OK
2 Date: Sat, 25 Feb 2006 21:00:43 GMT
3 Server: Apache/1.3.34 (Unix) FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7d PowWeb/1.1
4 Last-Modified: Tue, 13 Dec 2005 12:20:26 GMT
5 ETag: "17c22bd"
6 Accept-Ranges: bytes
7 Content-Length: 894
8 Keep-Alive: timeout=5, max=5
9 Connection: Keep-Alive
10 Content-Type: image/x-icon
200 OK

--21:00:23-- http://www.example.com/inc/css/example.css
=> `www.example.com/inc/css/example.css'
Connecting to www.example.com[34.34.34.34]:80... connected.
HTTP request sent, awaiting response...
1 HTTP/1.1 200 OK
2 Date: Sat, 25 Feb 2006 21:00:53 GMT
3 Server: Apache/1.3.34 (Unix) FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7d PowWeb/1.1
4 Last-Modified: Thu, 23 Feb 2006 02:55:10 GMT
5 ETag: "b57d48"
6 Accept-Ranges: bytes
7 Content-Length: 17547
8 Keep-Alive: timeout=5, max=5
9 Connection: Keep-Alive
10 Content-Type: text/css
200 OK

--21:00:34-- http://www.example.com/inc/js/script.js
=> `www.example.com/inc/js/script.js'
Connecting to www.example.com[34.34.34.34]:80... connected.
HTTP request sent, awaiting response...
1 HTTP/1.1 200 OK
2 Date: Sat, 25 Feb 2006 21:00:54 GMT
3 Server: Apache/1.3.34 (Unix) FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7d PowWeb/1.1
4 Last-Modified: Wed, 22 Feb 2006 11:50:47 GMT
5 ETag: "1cb6dc7"
6 Accept-Ranges: bytes
7 Content-Length: 3898
8 Keep-Alive: timeout=5, max=5
9 Connection: Keep-Alive
10 Content-Type: application/x-javascript
200 OK

--21:00:34-- http://www.example.com/inc/i/btn-send.png
=> `www.example.com/inc/i/btn-send.png'
Connecting to www.example.com[34.34.34.34]:80... connected.
HTTP request sent, awaiting response...
1 HTTP/1.1 200 OK
2 Date: Sat, 25 Feb 2006 21:00:57 GMT
3 Server: Apache/1.3.34 (Unix) FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7d PowWeb/1.1
4 Last-Modified: Thu, 16 Feb 2006 12:07:03 GMT
5 ETag: "b57d55"
6 Accept-Ranges: bytes
7 Content-Length: 608
8 Keep-Alive: timeout=5, max=5
9 Connection: Keep-Alive
10 Content-Type: image/png
200 OK


These are the results when using Expires

Code:
[produke1@corruptu:~]$ wget -S -p --spider http://www.example.com
--20:58:33-- http://www.example.com/
=> `www.example.com/index.html'
Resolving www.example.com... done.
Connecting to www.example.com[34.34.34.34]:80... connected.
HTTP request sent, awaiting response...
1 HTTP/1.1 200 OK
2 Date: Sat, 25 Feb 2006 20:58:53 GMT
3 Server: Apache/1.3.34 (Unix) FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7d PowWeb/1.1
4 X-Powered-By: PHP/4.4.2
5 Keep-Alive: timeout=5, max=5
6 Connection: Keep-Alive
7 Content-Type: text/html
200 OK

Loading robots.txt; please ignore errors.
--20:58:34-- http://www.example.com/robots.txt
=> `www.example.com/robots.txt'
Connecting to www.example.com[34.34.34.34]:80... connected.
HTTP request sent, awaiting response...
Read error (Connection reset by peer) in headers.
Retrying.

--20:58:41-- http://www.example.com/robots.txt
(try: 2) => `www.example.com/robots.txt'
Connecting to www.example.com[34.34.34.34]:80... connected.
HTTP request sent, awaiting response...
1 HTTP/1.1 416 Requested Range Not Satisfiable
2 Date: Sat, 25 Feb 2006 20:59:06 GMT
3 Server: Apache/1.3.34 (Unix) FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7d PowWeb/1.1
4 Cache-Control: max-age=2592000
5 Expires: Mon, 27 Mar 2006 20:59:06 GMT
6 Last-Modified: Wed, 22 Feb 2006 23:23:13 GMT
7 ETag: "129ba06"
8 Accept-Ranges: bytes
9 Content-Length: 0
10 Content-Range: bytes */31
11 Keep-Alive: timeout=5, max=5
12 Connection: Keep-Alive
13 Content-Type: text/plain; charset=iso-8859-1
20:58:46 ERROR 416: Requested Range Not Satisfiable.

--20:58:46-- http://www.example.com/favicon.ico
=> `www.example.com/favicon.ico'
Connecting to www.example.com[34.34.34.34]:80... connected.
HTTP request sent, awaiting response...
1 HTTP/1.1 200 OK
2 Date: Sat, 25 Feb 2006 20:59:12 GMT
3 Server: Apache/1.3.34 (Unix) FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7d PowWeb/1.1
4 Cache-Control: max-age=2592000
5 Expires: Mon, 27 Mar 2006 20:59:12 GMT
6 Last-Modified: Tue, 13 Dec 2005 12:20:26 GMT
7 ETag: "17c22bd"
8 Accept-Ranges: bytes
9 Content-Length: 894
10 Keep-Alive: timeout=5, max=5
11 Connection: Keep-Alive
12 Content-Type: image/x-icon
200 OK

--20:58:53-- http://www.example.com/inc/css/example.css
=> `www.example.com/inc/css/example.css'
Connecting to www.example.com[34.34.34.34]:80... connected.
HTTP request sent, awaiting response...
1 HTTP/1.1 200 OK
2 Date: Sat, 25 Feb 2006 20:59:19 GMT
3 Server: Apache/1.3.34 (Unix) FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7d PowWeb/1.1
4 Cache-Control: max-age=17200
5 Expires: Sun, 26 Feb 2006 01:45:59 GMT
6 Last-Modified: Thu, 23 Feb 2006 02:55:10 GMT
7 ETag: "b57d48"
8 Accept-Ranges: bytes
9 Content-Length: 17547
10 Keep-Alive: timeout=5, max=5
11 Connection: Keep-Alive
12 Content-Type: text/css
200 OK

--20:59:08-- http://www.example.com/inc/js/script.js
=> `www.example.com/inc/js/script.js'
Connecting to www.example.com[34.34.34.34]:80... connected.
HTTP request sent, awaiting response...
1 HTTP/1.1 200 OK
2 Date: Sat, 25 Feb 2006 20:59:28 GMT
3 Server: Apache/1.3.34 (Unix) FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7d PowWeb/1.1
4 Cache-Control: max-age=2592000
5 Expires: Mon, 27 Mar 2006 20:59:28 GMT
6 Last-Modified: Wed, 22 Feb 2006 11:50:47 GMT
7 ETag: "1cb6dc7"
8 Accept-Ranges: bytes
9 Content-Length: 3898
10 Keep-Alive: timeout=5, max=5
11 Connection: Keep-Alive
12 Content-Type: application/x-javascript
200 OK

--20:59:08-- http://www.example.com/inc/i/btn-send.png
=> `www.example.com/inc/i/btn-send.png'
Connecting to www.example.com[34.34.34.34]:80... connected.
HTTP request sent, awaiting response...
1 HTTP/1.1 200 OK
2 Date: Sat, 25 Feb 2006 20:59:28 GMT
3 Server: Apache/1.3.34 (Unix) FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7d PowWeb/1.1
4 Cache-Control: max-age=2592000
5 Expires: Mon, 27 Mar 2006 20:59:28 GMT
6 Last-Modified: Thu, 16 Feb 2006 12:07:03 GMT
7 ETag: "b57d55"
8 Accept-Ranges: bytes
9 Content-Length: 608
10 Keep-Alive: timeout=5, max=5
11 Connection: Keep-Alive
12 Content-Type: image/png
200 OK


From these examples you should be able to see the difference: with Expires enabled, every static file comes back with Cache-Control: max-age and Expires headers.

Basically, Expires lets you tell each visitor's web browser to keep a local copy (on their own computer) of everything you mark as cacheable.

The way I have my htaccess set up (above), all the images on the site will be saved for 30 days on the user's computer.

To REALLY speed up your site, add a line to the Expires section in htaccess for text/html that expires every hour or two. The only reason I do not do this is that I have PHP displaying the current date and time, and I am using some random images in a creative way.
Old 2-25-06, 08:44 PM   #2
mixerson
 
Join Date: Jan 2005
Location: Northeast
Posts: 185
Reputation: 52
If there are any HTTP cache-control header experts out there, please confirm or deny this, but...

As I understand it, the "Expires" header tells the browser cache (or proxy server) to keep a page for at most a certain time, not at least a certain time.

If you want the browser to cache a page or image as long as possible, you don't have to send out any special headers. You should only send an "expires" header if you want to force a browser to get a fresh copy of a frequently updated page, instead of the cached copy.
Old 2-26-06, 04:15 AM   #4
produke
 
Join Date: Jul 2005
Location: USA
Posts: 119
Reputation: 41
Did you look at the wget capture?

It sure doesn't look like Apache sends any default Expires headers.. in fact, there are none.

Do you mean default as in ExpiresDefault? That directive sets an expiry for all file types instead of naming specific ones.

Adding the Expires header is better than no Expires header. It forces some browsers to cache a local copy, whereas no Expires header isn't bad, just slower... and dependent on client browser settings.
Old 2-26-06, 10:23 AM   #5
extras
on hiatus
 
Join Date: Mar 2004
Location: Canada
Posts: 5,815
Reputation: 314
Quote:
Originally Posted by produke
Adding the Expires header is better than no Expires header. It forces some browsers to cache a local copy, whereas no Expires header isn't bad, just slower... and dependent on client browser settings.
Which browser?
IE, for example, caches the data (depending on the setting) without an Expires header.

How it is supposed to work:
http://www.w3.org/Protocols/rfc2616/...3.html#sec13.4
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html

Also, you may need to test many times to evaluate the effect more accurately.
Take a look at the access_log, too.
Old 2-26-06, 10:36 AM   #6
mixerson
 
Join Date: Jan 2005
Location: Northeast
Posts: 185
Reputation: 52
As keyplyr mentioned, sending your own headers just overrides the default behavior, which is usually to cache whatever is likely to be needed again. The rationale for the default cache behavior is given in section 13.2.2 of the HTTP documentation at www.w3.org
Quote:
Since origin servers do not always provide explicit expiration times, HTTP caches typically assign heuristic expiration times, employing algorithms that use other header values (such as the Last-Modified time) to estimate a plausible expiration time.
In other words, if you don't send any Expires header, the browser or proxy cache will assign an expiration time for you. Setting a long expiration time may cause an item to be cached longer than without the Expires header, but it doesn't turn on the cache.

There are certain types of requests, such as authenticated pages, that are not cached by default. You can use the cache-control headers to override the default behavior and request that they be cached.

In the headers you captured (without Expires), you can see the entries:
Quote:
4 Last-Modified: Wed, 22 Feb 2006 23:23:13 GMT
5 ETag: "129ba06"
The Last-Modified and ETag headers are used by the default cache mechanism. Looking at the Last-Modified header, a page that hasn't been modified for a year is less likely to change than one modified yesterday, so it will be cached longer. If you just changed a page, and you know that you won't change it again for a long time, you could set a long expiration time to request that it be cached longer. The ETag header is used as a kind of checksum, so the server can check whether a file has changed since the last request and, if it hasn't, just return a "use the cached copy" response instead of sending back the entire file.
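The "use the cached copy" exchange can be sketched in a few lines of Python. This is a simplified model of the server-side check, not Apache's code: the function name validate is invented here, and the sketch ignores weak validators, ETag lists and the other corner cases in the HTTP spec. The ETag and dates come from the robots.txt headers quoted above.

```python
from email.utils import parsedate_to_datetime

def validate(request_headers, etag, last_modified):
    """Return 304 if the client's cached copy is still current, else 200.

    Simplified sketch: an If-None-Match header is compared against the
    current ETag; otherwise If-Modified-Since is compared against the
    file's Last-Modified time. A request with neither gets a full 200.
    """
    client_etag = request_headers.get("If-None-Match")
    if client_etag is not None:
        return 304 if client_etag == etag else 200
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        if parsedate_to_datetime(last_modified) <= parsedate_to_datetime(since):
            return 304
    return 200

ETAG = '"129ba06"'
LAST_MODIFIED = "Wed, 22 Feb 2006 23:23:13 GMT"

print(validate({}, ETAG, LAST_MODIFIED))                         # first visit
print(validate({"If-None-Match": ETAG}, ETAG, LAST_MODIFIED))    # unchanged file
print(validate({"If-None-Match": '"old"'}, ETAG, LAST_MODIFIED)) # file has changed
```

The 304 answer carries no body, so a revalidation costs one round trip but not the download.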

I only use Expires headers to force short expiration times for frequently updated pages, but my browser cache is full of images and documents from my sites that don't send any Expires headers.

I think that if you remove the Expires stuff from your .htaccess file, you'll find that your browser will still cache files from your site.
Old 2-26-06, 12:35 PM   #7
produke
 
Join Date: Jul 2005
Location: USA
Posts: 119
Reputation: 41
Quote:
I think that if you remove the Expires stuff from your .htaccess file, you'll find that your browser will still cache files from your site.

Yes, but with Expires you have a much higher level of control than without. Without it, you can't be sure whether your image is cached or not... there is a lot going on. When you specify Expires, everyone (including search engines) likes it more.

Code:
AddHandler application/x-httpd-php .htm
Options +FollowSymLinks -Indexes -ExecCGI
DirectoryIndex index.htm
AddDefaultCharset ISO-8859-1
AddCharset ISO-8859-1 .css
DefaultLanguage en
ServerSignature Off

ExpiresActive On
### expire GIF images after a month in the client's cache
ExpiresByType image/gif A2592000
ExpiresByType image/png A2592000
ExpiresByType image/jpeg A2592000
ExpiresByType image/x-icon A2592000
ExpiresByType application/pdf A2592000
ExpiresByType application/x-javascript A2592000
ExpiresByType text/plain A2592000
ExpiresByType text/css A10800

There is a reason PowWeb has mod_expires on the Apache servers..

I think you'll find that if you actually do some testing with this, you will see the benefits.. if not, please don't quote the Apache documentation...

Let me know if you guys have any other tips specific to using htaccess.
A good tip is how this method lets the client cache even the favicon for a set period.


PDF WITHOUT HTACCESS
Code:
Date: Sun, 26 Feb 2006 16:11:02 GMT
Server: Apache/1.3.34 (Unix) FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7d PowWeb/1.1
Last-Modified: Mon, 20 Feb 2006 05:52:15 GMT
ETag: "1aa1fe1-36fa1"
Accept-Ranges: bytes
Content-Length: 225185
Connection: close
Content-Type: application/pdf

WITH HTACCESS
Date: Sun, 26 Feb 2006 16:12:43 GMT
Server: Apache/1.3.34 (Unix) FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7d PowWeb/1.1
Cache-Control: max-age=2592000
Expires: Tue, 28 Mar 2006 16:12:43 GMT
Last-Modified: Mon, 20 Feb 2006 05:52:15 GMT
ETag: "1aa1fe1"
Accept-Ranges: bytes
Content-Length: 225185
Connection: close
Content-Type: application/pdf
Content-Language: en

CSS WITHOUT HTACCESS
Code:
Date: Sun, 26 Feb 2006 16:08:55 GMT
Server: Apache/1.3.34 (Unix) FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7d PowWeb/1.1
Last-Modified: Sun, 26 Feb 2006 15:41:13 GMT
ETag: "b57d48-45b4"
Accept-Ranges: bytes
Content-Length: 17844
Connection: close
Content-Type: text/css

WITH HTACCESS
Date: Sun, 26 Feb 2006 16:21:45 GMT
Server: Apache/1.3.34 (Unix) FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7d PowWeb/1.1
Cache-Control: max-age=10800
Expires: Sun, 26 Feb 2006 19:21:45 GMT
Last-Modified: Sun, 26 Feb 2006 16:21:38 GMT
ETag: "b57d48"
Accept-Ranges: bytes
Content-Length: 17828
Connection: close
Content-Type: text/css; charset=iso-8859-1
Content-Language: en

JPEG WITHOUT HTACCESS
Code:
Date: Sun, 26 Feb 2006 16:28:48 GMT
Server: Apache/1.3.34 (Unix) FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7d PowWeb/1.1
Last-Modified: Wed, 22 Feb 2006 12:16:56 GMT
ETag: "b57d54-45e7"
Accept-Ranges: bytes
Content-Length: 17895
Connection: close
Content-Type: image/jpeg

WITH HTACCESS
Date: Sun, 26 Feb 2006 16:23:52 GMT
Server: Apache/1.3.34 (Unix) FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7d PowWeb/1.1
Cache-Control: max-age=2592000
Expires: Tue, 28 Mar 2006 16:23:52 GMT
Last-Modified: Wed, 22 Feb 2006 12:16:56 GMT
ETag: "b57d54"
Accept-Ranges: bytes
Content-Length: 17895
Connection: close
Content-Type: image/jpeg
Content-Language: en

JAVASCRIPT WITHOUT HTACCESS
Code:
Date: Sun, 26 Feb 2006 16:28:19 GMT
Server: Apache/1.3.34 (Unix) FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7d PowWeb/1.1
Last-Modified: Sun, 26 Feb 2006 07:39:29 GMT
ETag: "1cb6dc7-f2e"
Accept-Ranges: bytes
Content-Length: 3886
Connection: close
Content-Type: application/x-javascript

WITH HTACCESS
Date: Sun, 26 Feb 2006 16:25:54 GMT
Server: Apache/1.3.34 (Unix) FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7d PowWeb/1.1
Cache-Control: max-age=2592000
Expires: Tue, 28 Mar 2006 16:25:54 GMT
Last-Modified: Sun, 26 Feb 2006 07:39:29 GMT
ETag: "1cb6dc7"
Accept-Ranges: bytes
Content-Length: 3886
Connection: close
Content-Type: application/x-javascript
Content-Language: en

PNG WITHOUT HTACCESS
Code:
Date: Sun, 26 Feb 2006 16:27:52 GMT
Server: Apache/1.3.34 (Unix) FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7d PowWeb/1.1
Last-Modified: Thu, 16 Feb 2006 14:37:34 GMT
ETag: "6db4e0-141"
Accept-Ranges: bytes
Content-Length: 321
Connection: close
Content-Type: image/png

WITH HTACCESS
Date: Sun, 26 Feb 2006 16:26:42 GMT
Server: Apache/1.3.34 (Unix) FrontPage/5.0.2.2635 mod_ssl/2.8.25 OpenSSL/0.9.7d PowWeb/1.1
Cache-Control: max-age=2592000
Expires: Tue, 28 Mar 2006 16:26:42 GMT
Last-Modified: Thu, 16 Feb 2006 14:37:34 GMT
ETag: "6db4e0"
Accept-Ranges: bytes
Content-Length: 321
Connection: close
Content-Type: image/png
Content-Language: en
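For anyone comparing dumps like these by eye, a short Python sketch can list which headers the .htaccess rules add. The helper names parse_headers and added_headers are made up for this example, and the sample text is an abridged copy of the PNG dumps above (Server, Accept-Ranges, Content-Length and Connection lines left out to keep it short).

```python
def parse_headers(dump):
    """Turn 'Name: value' lines into a dict (later duplicates overwrite)."""
    headers = {}
    for line in dump.strip().splitlines():
        name, _, value = line.partition(":")  # split on the first colon only
        headers[name.strip()] = value.strip()
    return headers

def added_headers(without, with_rules):
    """Return the headers present only in the second dump."""
    before, after = parse_headers(without), parse_headers(with_rules)
    return {name: value for name, value in after.items() if name not in before}

# Abridged from the PNG dumps above
WITHOUT = """
Date: Sun, 26 Feb 2006 16:27:52 GMT
Last-Modified: Thu, 16 Feb 2006 14:37:34 GMT
ETag: "6db4e0-141"
Content-Type: image/png
"""

WITH = """
Date: Sun, 26 Feb 2006 16:26:42 GMT
Cache-Control: max-age=2592000
Expires: Tue, 28 Mar 2006 16:26:42 GMT
Last-Modified: Thu, 16 Feb 2006 14:37:34 GMT
ETag: "6db4e0"
Content-Type: image/png
Content-Language: en
"""

for name, value in added_headers(WITHOUT, WITH).items():
    print(f"{name}: {value}")
```

Cache-Control and Expires come from the Expires rules, while Content-Language comes from the DefaultLanguage en line in the same .htaccess. The sketch only reports new header names, so it does not flag that the ETag value also changed between the two dumps.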
Old 2-26-06, 02:06 PM   #8
extras
on hiatus
 
Join Date: Mar 2004
Location: Canada
Posts: 5,815
Reputation: 314
Produke, I appreciate your effort.
But you should provide more data to back up your claim.
Also, testing with wget isn't representative in this case.

To see the effect of caching, you should check the access_log (or use a header-capturing utility)
to confirm all requests coming from the browser and how the server responded.
With cached data, the browser normally doesn't make any request.
Even on reload, the browser sends a request with If-Modified-Since, and the server replies with a 304 response telling it to use the cached data.
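Checking the access_log by hand gets tedious, so here is a minimal Python sketch that counts responses per file and status code. It assumes the log is in Apache's Common Log Format; the sample lines, the 192.0.2.1 address and the function name status_counts are invented for illustration and are not taken from a real log.

```python
import re
from collections import Counter

# Matches the quoted request line and the status code that follows it,
# e.g. "GET /expire/aaa.html HTTP/1.1" 304
LOG_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')

def status_counts(log_text):
    """Count (path, status) pairs in Common Log Format text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LOG_LINE.search(line)
        if match:
            counts[(match["path"], int(match["status"]))] += 1
    return counts

# Hypothetical log lines: a first visit followed by a reload
SAMPLE_LOG = """\
192.0.2.1 - - [26/Feb/2006:14:00:01 -0500] "GET /expire/aaa.html HTTP/1.1" 200 512
192.0.2.1 - - [26/Feb/2006:14:00:01 -0500] "GET /expire/aaa.gif HTTP/1.1" 200 1024
192.0.2.1 - - [26/Feb/2006:14:00:09 -0500] "GET /expire/aaa.html HTTP/1.1" 304 -
"""

for (path, status), count in sorted(status_counts(SAMPLE_LOG).items()):
    print(path, status, count)
```

A file that is served from the browser cache without revalidation simply does not show up for the second page view, which is the difference this kind of test is looking for.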

I took the time to test your code, and the result shows no difference whatsoever in how the browser and server behaved that would influence performance.

How I tested:
I used FireFox with LiveHTTPHeader.
I created an "expire" directory where I put a .htaccess file with the code you suggested.
Then I put a simple HTML file, aaa.html, in that directory as well as in /htdocs.
I cleared the browser cache, requested aaa.html, and reloaded the page.
I cleared the browser cache and repeated the test.

In both cases, FF sent a request with If-Modified-Since, indicating that it had cached the data,
and the server responded with the usual 304 response code.
http://pow.check-these.info/expire.html

So, with or without the Expires header, FF cached the data and used it, even on reload.
Now, I want you to do a similar test showing a case where the Expires header really makes a difference.

You can use the following URLs for testing/comparison.

Without Expire code:
http://pow.check-these.info/aaa.html
http://pow.check-these.info/aaa2.html
With Expire code:
http://pow.check-these.info/expire/aaa.html
http://pow.check-these.info/expire/aaa2.html
Old 3-1-06, 06:39 AM   #9
produke
 
Join Date: Jul 2005
Location: USA
Posts: 119
Reputation: 41
Quote:
Minimizing round trips over the Web to revalidate cached items can make a huge difference in browser page load times. Perhaps the most dramatic illustration of this occurs when a user returns to a site for the second time, after an initial browser session. In this case, all page objects will have to be revalidated, each costing valuable fractions of a second (not to mention consuming bandwidth and server cycles). On the other hand, utilizing proper cache control allows each of these previously viewed objects to be served directly out of the browser's cache without going back to the server. The effect of adding cache control rules to page objects is often visible at page load time, even with a high bandwidth connection, and users may note that your sites appear to paint faster and that "flashing" is reduced between subsequent page loads. Besides improved user perception, the Web server will be offloaded from responding to cache revalidation requests, and thus will be able to better serve new traffic.

However, in order to enjoy the benefits of caching, a developer needs to take time to write out a set of carefully crafted cache control policies that categorize a site's objects according to their intended lifetimes.


Quote:
The most cacheable representation is one with a long freshness time set. Validation does help reduce the time that it takes to see a representation, but the cache still has to contact the origin server to see if it’s fresh. If the cache already knows it’s fresh, it will be served directly.

In order to make the best use of any cache, including effectively using a browser cache, we need to provide some indication of when a resource is no longer valid and should therefore be reacquired. More specifically, we need the ability to indicate caching rules for Web page objects, ranging from setting appropriate expiration times to indicating when a particular object should not be cached at all. Fortunately, we have all of these tools at our disposal in the form of HTTP cache controls rules.

The key to cache awareness lies in understanding the two concepts that govern how caches behave: freshness and validation. Freshness refers to whether or not a cached object is up-to-date, or in more technical terms, whether or not a cached resource is in the same state as that same resource on the origin server. If the browser or other Web cache lacks sufficient information to confirm that a cached object is fresh, it will always err on the side of caution and treat it as possibly out-of-date or stale. Validation is the process by which a cache checks with the origin server to see whether one of those potentially stale cached object is fresh or not. If the server confirms that the cached object is still fresh, the browser will use the local resource; if not, a fresh copy must be served.

Once the data is downloaded to the cache, it is "stamped," indicating where it came from and at what time it was accessed. It may also be stamped with a third piece of information: when it needs to be reacquired. But, since most sites do not stamp their data with this explicit cache control information, we'll assume that our example lacks this information.

The user follows the link to page2.html, which has never been visited before and which references image1.gif, image3.gif, and image4.gif. In this case, the browser downloads the markup for the new page but the question is: should it re-download image1.gif and image3.gif even though it already has them cached? The obvious answer would be no, but, how can we be sure that the images have not changed since we downloaded page1.html? Without cache control information, the truth is that we can't. Therefore, the browser would need to revalidate the image by sending a request to the server in order to check if each image has been modified. If it has not been changed, the server will send a quick 304 Not Modified response that instructs the browser to go ahead and use the cached image. But, if it has been modified, a fresh copy of the image will have to be downloaded.

From this basic example, it is apparent that, even when CSS, images, and JavaScript are fresh, we may not get the caching benefit we expect, since the browser still has to make a round trip to the server before it can reuse the cached copy.

The default "Automatic" setting in Internet Explorer partially reduces this continual chatter between browser and server by skipping revalidation of cached objects during a single browser session. You will notice that page load time is generally much quicker when revisiting the same page during the same browser session. To see the performance penalty that would otherwise be incurred by all those 304 Not Modified responses, instead select "Every visit to the page."

I hope these quotes help explain how using mod_expires dramatically helps speed up your site.
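The freshness and validation rules in those quotes boil down to a small decision, sketched here in Python. This follows the quoted article's model, where a cached object without explicit expiry information is treated as possibly stale; as mixerson pointed out earlier, real browsers may apply a heuristic lifetime instead. The function name cache_action and its return strings are made up for this example.

```python
def cache_action(age_seconds, max_age=None, has_validator=True):
    """Decide what a cache does with a stored object, per the quoted model.

    age_seconds   -- how long ago the object was stored
    max_age       -- explicit lifetime from Expires / Cache-Control, if any
    has_validator -- whether Last-Modified or ETag was stored with it
    """
    if max_age is not None and age_seconds < max_age:
        return "serve from cache"   # known fresh: no request at all
    if has_validator:
        return "revalidate"         # conditional GET; server answers 304 or 200
    return "refetch"                # nothing to validate with: full download

one_hour = 3600
thirty_days = 2592000

print(cache_action(one_hour, max_age=thirty_days))        # image with A2592000
print(cache_action(one_hour))                             # no Expires header
print(cache_action(thirty_days + 1, max_age=thirty_days)) # lifetime has run out
```

The saving claimed for Expires is the gap between the first and second cases: a round trip per object, not the download itself.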

Code:
ExpiresActive On
ExpiresDefault "now plus 5 second"
ExpiresByType image/gif A2592000
ExpiresByType image/png A2592000
ExpiresByType image/jpeg A2592000
ExpiresByType image/x-icon A2592000
ExpiresByType application/pdf A2592000
ExpiresByType application/x-javascript A2592000
ExpiresByType text/plain A2592000
ExpiresByType application/x-shockwave-flash A2592000
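The ExpiresDefault line above uses mod_expires' longer "base plus N unit" spelling, while the ExpiresByType lines use the short letter-plus-seconds form. A rough Python sketch of how the two relate is below. The function parse_expires is invented for this example, and the unit lengths (a month counted as 30 days, a year as 365 days) are an assumption about how mod_expires does the arithmetic, so check them against your Apache version.

```python
# Assumed unit lengths in seconds (month = 30 days, year = 365 days)
UNIT_SECONDS = {
    "year": 365 * 86400,
    "month": 30 * 86400,
    "week": 7 * 86400,
    "day": 86400,
    "hour": 3600,
    "minute": 60,
    "second": 1,
}

def parse_expires(spec):
    """Convert e.g. 'access plus 1 month' to ('A', 2592000).

    Sketch only: accepts 'access', 'now' or 'modification' as the base,
    an optional 'plus', then number/unit pairs (singular or plural).
    """
    words = spec.lower().split()
    if not words or words[0] not in ("access", "now", "modification"):
        raise ValueError(f"unknown base in {spec!r}")
    base, rest = words[0], words[1:]
    if rest and rest[0] == "plus":
        rest = rest[1:]
    if len(rest) % 2:
        raise ValueError(f"expected number/unit pairs in {spec!r}")
    total = 0
    for number, unit in zip(rest[0::2], rest[1::2]):
        total += int(number) * UNIT_SECONDS[unit.rstrip("s")]
    return ("M" if base == "modification" else "A", total)

print(parse_expires("now plus 5 second"))
print(parse_expires("access plus 1 month"))
print(parse_expires("modification plus 1 hour 30 minutes"))
```

Under those assumptions, "access plus 1 month" is the same rule as the A2592000 used for the image types.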

Extras.. wget is more reliable and accurate than LiveHTTPHeader any day of the week.. I guess a CGI script or tcpdump would be more verbose.. I'm curious why you advise against using wget?
Old 3-1-06, 10:28 AM   #10
extras
on hiatus
 
Join Date: Mar 2004
Location: Canada
Posts: 5,815
Reputation: 314
Produke, use tcpdump if you want, but look at the fact that FF (and IE) use the cache without an Expires header.
And they use the cache even on reload (in this case, the browser checks the freshness, though).

In other words, to prove the merit of added Expire header, you should run at least 4 tests.
#1 Without Expire header, 1st run after clearing the browser cache.
#2 Without Expire header, 2nd run.
#3 With Expire header, 1st run after clearing the browser cache.
#4 With Expire header, 2nd run.

And if #2 and #4 show different results, then you can say it improves.
If not, it has no effect.

I did exactly that, and I didn't see any differences.
Both #2 and #4 used cached data, didn't make any request for the image.
So, adding Expire header had no impact, in that test.

It's time for you to do the same and show the result.

As for wget vs browser, you need to understand that we are concerned about how browsers act.
Wget is a good tool for testing certain things, but not for this one, as it may act differently.
And to see how browsers act, you need to use a tool like LiveHTTPHeader or look at the access_log.
(Tcpdump or Ethereal can be used, too. But the access_log is far easier to check.)
Old 3-2-06, 07:27 PM   #11
produke
 
Join Date: Jul 2005
Location: USA
Posts: 119
Reputation: 41

extras- I appreciate you motivating me to prove what I know, as that helps others, which is the point. So thanks. And although I would love to use Ethereal or tcpdump to give you uncompromising data, I just don't have the time at the moment (in the middle of web site launches). Instead, I wrote a simple PHP script that consists of a form with a single input box for entering any web address. The script then calls a bash shell script with the value from the input box, which runs the following commands to retrieve the headers returned for the requested web address.

Code:
HEAD -fe URL
GET -fe -m head URL
wget -S --spider -v URL
w3m -dump_head URL
lynx -head -dump URL

I learned how to write these scripts thanks to an incredible resource of tools/scripts/source code at http://check-these.info/

With that said, I am now 100% clear and convinced that the method I outlined in my first post will dramatically speed up your site for visitors!

Unfortunately, because the PowWeb forum people DISABLED img posting, I cannot include the image that details exactly how this method works so well.

I had to post it on my development site, http://www.produke.com


You will notice from these images how Expires works.

With Expires OFF, your browser does cache the images, but what you fail to realize is that because your browser has no expiry info for your images, it must re-check that an image in its cache has not been modified.. it does this for every file, every time you access the page! SLOWWWW!

With Expires ON, your browser caches the images, and it knows that it doesn't have to check for a new version until the Expires information says the copy has expired. FASSSSTTT!

It's pretty simple, and I personally have seen a dramatic increase in my site's performance, and so have my clients.

Here's the latest relevant htaccess I am using

Code:
ExpiresActive On
ExpiresDefault "A1"

### 1 month
ExpiresByType image/x-icon A2592000
ExpiresByType application/x-shockwave-flash A2592000

### 1 week
ExpiresByType text/plain A604800
ExpiresByType application/pdf A604800
ExpiresByType image/gif A604800
ExpiresByType image/png A604800
ExpiresByType image/jpeg A604800

### 1 hour
ExpiresByType application/x-javascript A3600
ExpiresByType text/css A3600

### 1 second
ExpiresByType text/html A1
Old 3-2-06, 08:42 PM   #12
extras
on hiatus
 
Join Date: Mar 2004
Location: Canada
Posts: 5,815
Reputation: 314
Quote:
Originally Posted by produke
With expires OFF, your browser does cache the images, but what you fail to realize is that because your browser has no expires info on your images, your browser must re-check that an image in its cache has not been modified.. it does this for every file, every time you access the page! SLOWWWW!
I don't think so.
I've tested with FireFox and IE (5), and both of them used cache without making any request for the server, regardless of the Expires header.

See for yourself. I made a test case for you.
Go to this directory and click aaa.html, then aaa2.html, and finally showlog.cgi
http://pow.check-these.info/expire/

Repeat the same for another directory without the Expires header.
http://pow.check-these.info/noexpire/

You can also see the .htaccess in each directory.
The showlog.cgi will show you that aaa.gif isn't requested again when you request aaa2.html.
In other words, browsers use the cached aaa.gif without checking its freshness.

But I'm not saying that it will never make any difference.
If you find a case I can confirm, let me know.
So far, the test method you presented wasn't enough, IMO.
Old 3-2-06, 10:45 PM   #13
produke
 
Join Date: Jul 2005
Location: USA
Posts: 119
Reputation: 41
extras-

It is not a short-term gain in bandwidth.. This method is a long-term way to guarantee speedier rendering for your visitors. I would suggest that you read up on Caches in general and about the expires header to better understand the result of this method. IOW, I don't have the expertise to really explain the inner-workings of this apache module.. GRTFM but not apaches online docs.. I would recommend you look at the source code for the mod_expires and google it.

Another side benefit that I failed to mention is that any proxies that are operating between you and your visitor will see and USE the expires header.

You could also assume that search engines use this information.

The special part of this is that the headers are sent by the server.. not by a meta tag in the html file. This is key to getting search engine robots and other hard/soft-ware to really use this header.

It is ONLY meant to speed up sites that have repeat-visitors. I am using this method on a site that has a lot of members, who access the page at varying frequencies.

It lets your visitors' browsers skip the request that asks whether a file has been modified, and it tells their cache how long to keep each object before the browser sends a GET request for it with an If-Modified-Since header. Without Expires, the server (your website) has to answer each of those requests with a 304.. all of which is unnecessary if you use Expires.
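
To make that sequence concrete, here is a simplified Python sketch of the decision a cache makes for one stored file. It is only an illustration under stated assumptions: it looks at Expires and Last-Modified alone, ignores Cache-Control and any browser-specific heuristics, and the function name and return strings are invented for the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def cache_action(headers, now):
    """Decide what a very simple cache does with a stored response.

    headers: dict of the response headers saved with the cached copy.
    now: timezone-aware datetime of the new page view.
    """
    expires = headers.get("Expires")
    if expires is not None and now < parsedate_to_datetime(expires):
        # Still fresh: nothing is sent to the server at all.
        return "use cache, no request"
    if "Last-Modified" in headers:
        # Conditional GET: the server answers 304 if nothing changed.
        return "GET with If-Modified-Since: " + headers["Last-Modified"]
    # Nothing to validate against: fetch the file again.
    return "plain GET"


if __name__ == "__main__":
    stored = {
        "Expires": "Sat, 25 Mar 2006 21:00:39 GMT",
        "Last-Modified": "Tue, 13 Dec 2005 12:20:26 GMT",
    }
    print(cache_action(stored, datetime(2006, 3, 1, tzinfo=timezone.utc)))
    print(cache_action(stored, datetime(2006, 4, 1, tzinfo=timezone.utc)))
```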

I might get back to you on this with some raw packets, but I just can't right now. The testing involved in proving that this does what Apache designed and implemented it to do would be a lengthy exercise in packet capturing and study.
Old 3-3-06, 01:30 PM   #14
extras
on hiatus
 
Join Date: Mar 2004
Location: Canada
Posts: 5,815
Reputation: 314
Produke, you are not giving any hard evidence.
Read what I wrote again, and do the test to see for yourself, please.

So far, you haven't shown anything that we can retest/confirm, unlike what I've done for you.
And my tests show that your claim isn't true at all.

If you have time to post text without any supporting data, please do the test I showed.
It only takes a few minutes.
Old 3-3-06, 04:25 PM   #15
produke
 
Join Date: Jul 2005
Location: USA
Posts: 119
Reputation: 41
The test you recommended would not be worthwhile to do; it would not provide enough evidence.

Only a packet-level program and subsequent analysis would be good enough evidence.

But I say, why do what apache programmers have already done?
Old 3-3-06, 04:53 PM   #16
extras
on hiatus
 
Join Date: Mar 2004
Location: Canada
Posts: 5,815
Reputation: 314
Quote:
Originally Posted by produke
the test you recommended would not be worthwhile to do, that test would not provide enough evidence.
Well, it's enough to show that your claim isn't true.
The access_log shows all requests browsers make (unless logging server is having trouble).
So, we can say that regardless of the Expires header, the cache is used and browsers (at least IE5 and FF) are not making any request for the cached image in this test.
Very simple and easy to understand.

Quote:
Only a packet level program and subsequent analysis would be good enough evidence.
But I say, why do what apache programmers have already done?
I guess you don't want to do any test that might show your claim isn't true.
I was hoping that you could come up with solid data and test conditions that prove your theory and that we can retest/confirm.
Too bad.
Old 3-8-06, 02:23 AM   #17
produke
 
Join Date: Jul 2005
Location: USA
Posts: 119
Reputation: 41
No, you are right, but I just don't have the time to test right now. I stated earlier that I might do a thorough test when I have the (free) time.

I did some initial testing to see if the test was doable in a short time or not, and for this I just used ethereal, IE, and firefox, and their respective temporary file folders. I also used an OpenBSD box with Squid running to view the effects expires has on an intermediary proxy. This is no small test, don't pretend it is.

I think the confusion is occurring because I didn't make clear that this is a client-side improvement in speed. Ultimately, if you use Expires, your visitors will see your webpage rendered faster because their browsers will be able to determine when to fetch from the cache, when to send an If-Modified-Since header, when to fetch from the server, and when to clear an item from the cache.


Quote:
Originally Posted by apacheref.com
mod_expires controls the setting of the HTTP Expires header field in server responses. The expiration date can be set relative to either the time of the source document's last modification, or the time of the last client access. This information informs the client and intermediate HTTP proxies about the document's validity and persistence. If cached, the document may be fetched from the cache rather than from the source until the expiration date has passed. After that time, the cache copy is considered ``expired'' and invalid, and a new copy must be obtained from the source.
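
As a rough illustration of the two base times the quote describes ("A" for client access, "M" for last modification), this Python sketch computes the Expires value each one would produce. The function name is hypothetical, and the example timestamps are just sample values, not output from a real server.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expires_header(base, seconds, accessed, modified):
    """Return the Expires date for an "A<seconds>" or "M<seconds>" rule.

    accessed / modified must be timezone-aware UTC datetimes.
    """
    start = accessed if base == "A" else modified
    return format_datetime(start + timedelta(seconds=seconds), usegmt=True)


if __name__ == "__main__":
    accessed = datetime(2006, 2, 25, 21, 0, 39, tzinfo=timezone.utc)
    modified = datetime(2005, 12, 13, 12, 20, 26, tzinfo=timezone.utc)
    # Same 30-day rule, two different starting points.
    print(expires_header("A", 2592000, accessed, modified))
    print(expires_header("M", 2592000, accessed, modified))
```

With "M", a file that has not changed for longer than the interval is already expired on every request, which is why the access-based form is the one used in this thread.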
Old 3-8-06, 04:51 AM   #18
produke
 
Join Date: Jul 2005
Location: USA
Posts: 119
Reputation: 41

Ok extras, I did a brief test to stop the debate. I think this really clears up what exactly Expires does, and shows how using it can dramatically speed up the rendering of your webpages for visitors. There shouldn't be any more doubts, but hey, I'm here for ya!

KEY:

Clean Start == I browsed to google, deleted all files, cookies, and history, and set up my test page as the default homepage. Then I closed the browser, turned on ethereal, and opened the browser.
Refreshed == After the clean start, I restarted ethereal, and then refreshed the current test page.
Closed and Reopened == Without clearing the cache, I closed the browser, started ethereal, then reopened the browser.


Expires Off
Clean Start
Code:
me->S GET / HTTP/1.1
S->me HTTP/1.0 200 OK (text/html)
me->S GET /ij/y.js HTTP/1.1
S->me HTTP/1.0 200 OK (application/x-javascript)
me->S GET /ic/y.css HTTP/1.1
S->me HTTP/1.0 200 OK (text/css)
me->S GET /ii/bg-hnav.png HTTP/1.1
me->S GET /ii/logo1.png HTTP/1.1
me->S GET /ii/bg.jpg HTTP/1.1
me->S GET /ii/encrypted.png HTTP/1.1
S->me HTTP/1.0 200 OK (JPEG JFIF image)
me->S GET /ii/logo2.png HTTP/1.1
me->S GET /ii/btn-qsend.png HTTP/1.1
me->S GET /ii/player.jpg HTTP/1.1
S->me HTTP/1.0 200 OK (image/png)
S->me HTTP/1.0 200 OK (image/png)
S->me HTTP/1.0 200 OK (JPEG JFIF image)
me->S GET /ii/bg-input.png HTTP/1.1
S->me HTTP/1.0 200 OK (image/png)
Expires On
Clean Start
Code:
No. Time Source Destination Protocol Info
me->S GET / HTTP/1.1
S->me HTTP/1.0 200 OK (text/html)
me->S GET /ij/y.js HTTP/1.1
S->me HTTP/1.0 200 OK (application/x-javascript)
me->S GET /ic/y.css HTTP/1.1
S->me HTTP/1.0 200 OK (text/css)
me->S GET /ii/logo2.png HTTP/1.1
me->S GET /ii/logo1.png HTTP/1.1
me->S GET /ii/bg.jpg HTTP/1.1
me->S GET /ii/bg-hnav.png HTTP/1.1
S->me HTTP/1.0 200 OK (image/png)
S->me HTTP/1.0 200 OK (image/png)
S->me HTTP/1.0 200 OK (JPEG JFIF image)
S->me HTTP/1.0 200 OK (image/png)
me->S GET /ii/encrypted.png HTTP/1.1
me->S GET /ii/btn-qsend.png HTTP/1.1
me->S GET /ii/player.jpg HTTP/1.1
S->me HTTP/1.0 200 OK (image/png)
S->me HTTP/1.0 200 OK (image/png)
S->me HTTP/1.0 200 OK (JPEG JFIF image)
me->S GET /ii/bg-input.png HTTP/1.1
S->me HTTP/1.0 200 OK (image/png)
Expires Off
Refreshed
Code:
me->S GET / HTTP/1.1
S->me HTTP/1.0 200 OK (text/html)
me->S GET /ij/y.js HTTP/1.1
S->me HTTP/1.0 304 Not Modified
me->S GET /ic/y.css HTTP/1.1
S->me HTTP/1.0 304 Not Modified
me->S GET /ii/bg-hnav.png HTTP/1.1
me->S GET /ii/logo1.png HTTP/1.1
me->S GET /ii/logo2.png HTTP/1.1
me->S GET /ii/bg.jpg HTTP/1.1
S->me HTTP/1.0 304 Not Modified
S->me HTTP/1.0 304 Not Modified
S->me HTTP/1.0 304 Not Modified
S->me HTTP/1.0 304 Not Modified
me->S GET /ii/encrypted.png HTTP/1.1
me->S GET /ii/player.jpg HTTP/1.1
me->S GET /ii/btn-qsend.png HTTP/1.1
me->S GET /ii/bg-input.png HTTP/1.1
S->me HTTP/1.0 304 Not Modified
S->me HTTP/1.0 304 Not Modified
S->me HTTP/1.0 304 Not Modified
S->me HTTP/1.0 304 Not Modified
me->S GET /ii/logo1.png HTTP/1.1
me->S GET /ii/bg-hnav.png HTTP/1.1
me->S GET /ii/logo2.png HTTP/1.1
me->S GET /ii/bg.jpg HTTP/1.1
S->me HTTP/1.0 304 Not Modified
S->me HTTP/1.0 304 Not Modified
S->me HTTP/1.0 304 Not Modified
S->me HTTP/1.0 304 Not Modified
me->S GET /ii/encrypted.png HTTP/1.1
S->me HTTP/1.0 304 Not Modified
Expires On
Refreshed
Code:
me->S GET / HTTP/1.1
S->me HTTP/1.0 200 OK
me->S GET /ij/y.js HTTP/1.1
S->me HTTP/1.0 304 Not Modified
me->S GET /ic/y.css HTTP/1.1
S->me HTTP/1.0 304 Not Modified
me->S GET /ii/bg-hnav.png HTTP/1.1
me->S GET /ii/logo1.png HTTP/1.1
me->S GET /ii/logo2.png HTTP/1.1
me->S GET /ii/bg.jpg HTTP/1.1
S->me HTTP/1.0 304 Not Modified
S->me HTTP/1.0 304 Not Modified
S->me HTTP/1.0 304 Not Modified
S->me HTTP/1.0 304 Not Modified
me->S GET /ii/encrypted.png HTTP/1.1
me->S GET /ii/btn-qsend.png HTTP/1.1
me->S GET /ii/player.jpg HTTP/1.1
me->S GET /ii/bg-input.png HTTP/1.1
S->me HTTP/1.0 304 Not Modified
S->me HTTP/1.0 304 Not Modified
S->me HTTP/1.0 304 Not Modified
S->me HTTP/1.0 304 Not Modified
me->S GET /ii/logo1.png HTTP/1.1
me->S GET /ii/logo2.png HTTP/1.1
me->S GET /ii/encrypted.png HTTP/1.1
me->S GET /ii/bg.jpg HTTP/1.1
S->me HTTP/1.0 304 Not Modified
S->me HTTP/1.0 304 Not Modified
S->me HTTP/1.0 304 Not Modified
S->me HTTP/1.0 304 Not Modified
me->S GET /ii/bg-hnav.png HTTP/1.1
S->me HTTP/1.0 304 Not Modified
Expires Off
Closed and Reopened
Code:
me->S GET / HTTP/1.1
S->me HTTP/1.0 200 OK
me->S GET /ij/y.js HTTP/1.1
S->me HTTP/1.0 304 Not Modified
me->S GET /ic/y.css HTTP/1.1
S->me HTTP/1.0 304 Not Modified
me->S GET /ii/bg-hnav.png HTTP/1.1
me->S GET /ii/logo2.png HTTP/1.1
me->S GET /ii/bg.jpg HTTP/1.1
me->S GET /ii/logo1.png HTTP/1.1
S->me HTTP/1.0 304 Not Modified
S->me HTTP/1.0 304 Not Modified
S->me HTTP/1.0 304 Not Modified
S->me HTTP/1.0 304 Not Modified
me->S GET /ii/btn-qsend.png HTTP/1.1
me->S GET /ii/encrypted.png HTTP/1.1
me->S GET /ii/player.jpg HTTP/1.1
S->me HTTP/1.0 304 Not Modified
me->S GET /ii/bg-input.png HTTP/1.1
S->me HTTP/1.0 304 Not Modified
S->me HTTP/1.0 304 Not Modified
S->me HTTP/1.0 304 Not Modified
Expires On
Closed and Reopened
Code:
me->S GET / HTTP/1.1
S->me HTTP/1.0 200 OK (text/html)
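
For comparing the captures above without counting lines by hand, a small Python sketch like the following can tally them. It assumes the capture has been reduced to one "me->S ..." or "S->me ..." summary per line, as in this post; the function name tally is invented for the example.

```python
from collections import Counter


def tally(capture):
    """Count client requests and server status codes in a capture summary."""
    counts = Counter()
    for line in capture.strip().splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "me->S":
            counts["requests"] += 1
        elif parts[0] == "S->me" and len(parts) > 2:
            counts[parts[2]] += 1  # status code, e.g. "200" or "304"
    return counts


# The "Expires On / Closed and Reopened" capture from this post.
EXPIRES_ON_REOPENED = """
me->S GET / HTTP/1.1
S->me HTTP/1.0 200 OK (text/html)
"""

if __name__ == "__main__":
    print(dict(tally(EXPIRES_ON_REOPENED)))
```

Run over the "Expires Off / Closed and Reopened" capture, the same tally comes to 11 requests, 10 of them answered with a 304, against the single request shown with Expires on.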
Old 3-8-06, 04:53 AM   #19
HalfaBee
 
Join Date: Feb 2002
Location: Sydney, Australia
Posts: 7,130
Reputation: 333
Quote:
Originally Posted by produke
I was looking into speeding up my site.. specifically by using headers to specify custom cacheing..

Previously I was using php to send the necessary headers, but now I just have this in my /htdocs/.htaccess file.


Using php was probably the thing slowing your site down.
Files that don't need php should never be run through php on a shared server.
__________________
I don't suffer from laziness, I enjoy every minute!
Edit your php.ini here
http://members.powweb.com/member/cgi...nt/PHPplus.bml
Old 3-8-06, 05:28 AM   #20
produke
 
Join Date: Jul 2005
Location: USA
Posts: 119
Reputation: 41
Yeah I hate that... I do love that php is server-side; I don't ever want to require javascript from site visitors.. So any speedier cgi or shell script solutions are really what I'm looking for.

Expires doesn't work well with dynamic pages like htm pages that are parsed for php.. That's why I am using headers in my php.

I need php to display the current date and random pictures, and for other stuff. I don't want to get off-topic, but please let me know if I can do anything better/faster.



Here's the php I am currently using on htm files - [ index.htm ]
Code:
<?php
ob_start();
// An Expires date in the past tells browsers not to reuse the cached page
header("Expires: Mon, 20 Dec 2004 01:00:00 GMT");
header("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT");
header('Content-Type: text/html; charset=iso-8859-1');
header('Content-Language: en-us');
ob_end_flush();

// Weighted random picture: $wt[n] is the weight of picture n
$wt[0] = 3; $wt[1] = 3; $wt[2] = 5; $wt[3] = 1;
$wt[4] = 3; $wt[5] = 1; $wt[6] = 3; $wt[7] = 5;
for ( $a=0; $a<count($wt); $item[$a] = $a, $a++ )
    for ( $b=1; $b<=$wt[$a]; $b++ )
        $pick[] = $a;
$rpic = '<span id="p'.$item[$pick[rand(0,count($pick)-1)]].'" class="randImg"></span>';

// Current date and last-modified date shown on the page
putenv("TZ=America/Indianapolis");
$mdate = '<span id="cT">'.date("D M j Y g:i:s A T").'</span>';
$lmodd = "Updated: ".date("F d, Y", getlastmod());
?>

Here's the htaccess I am currently using - [ .htaccess ]
Code:
AddHandler application/x-httpd-php .htm
Options +FollowSymLinks -Indexes -ExecCGI
DirectoryIndex index.htm

### sets it to iso-8859-1
AddDefaultCharset On
DefaultLanguage en-us
ErrorDocument 404 /includes/error/error.htm

ExpiresActive On
ExpiresDefault "A60"

### 1 month
ExpiresByType image/x-icon A2592000
ExpiresByType application/x-shockwave-flash A2592000

### 1 week
ExpiresByType text/plain A604800
ExpiresByType application/pdf A604800
ExpiresByType image/gif A604800
ExpiresByType image/png A604800
ExpiresByType image/jpeg A604800

### 1 hour
ExpiresByType application/x-javascript A3600
ExpiresByType text/css A3600

### fix no www
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.
RewriteCond %{HTTP_HOST} !powweb\.com$
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]


I also use a combination of javascript and CSS to preload my main images

HEAD - [ index.htm ]
Code:
<!-- downloads javascript before css to start the preloading of images faster -->
<script src="/ij/y.js" type="text/javascript"></script>
<link href="/ic/y.css" media="all" rel="stylesheet" type="text/css" />


JAVASCRIPT - [ y.js ]
Code:
function initjs() {
  var d=document;
  if(d.images){
    if(!d.PL) d.PL=new Array();
    var i,j=d.PL.length,a=initjs.arguments;
    for(i=0; i<a.length; i++)
      if (a[i].indexOf("#")!=0){
        d.PL[j]=new Image;
        d.PL[j++].src=a[i];
      }
  }
}


BODY - [ index.htm ]
Code:
<body onload="initjs('/ii/bg-hnav.png','/ii/logo.png','/ii/logo1.png','/ii/encrypted.png','/ii/bg.jpg');">


I don't use any IMG tags, instead I use CSS to apply a background image to an A tag - [ index.htm ]
Code:
<a href="/" id="logo" title="Home"></a>

CSS - [ y.css ]
Code:
a#logo {position:absolute; display:block; top:9px; left:1px; width:207px; height:43px; background:#FFF url(/ii/logo.png) no-repeat; z-index:110; cursor:pointer;}

But for my javascript to work I need - [ index.htm ]
Code:
<div id="ns">
  <img src="/ii/bg-hnav.png" alt="ns" />
  <img src="/ii/logo.png" alt="ns" />
  <img src="/ii/logo1.png" alt="ns" />
  <img src="/ii/encrypted.png" alt="ns" />
  <img src="/ii/bg.jpg" alt="ns" />
</div>

And then I use this CSS - [ y.css ]
Code:
#ns {position:absolute; bottom:0; left:0; display:none; visibility:hidden; height:0; width:0; overflow:hidden;}

If you see anywhere I can improve please tell me! With all of this in place, my site is sooo much faster than before.. which is important because powweb is so darn slow!
Old 3-8-06, 10:50 AM   #21
extras
on hiatus
 
Join Date: Mar 2004
Location: Canada
Posts: 5,815
Reputation: 314
Good. You finally did the test.

Looking at your test result, it's clear to me that you didn't need tcpdump or Ethereal.
Checking the raw log or using a utility like LiveHTTPHeaders (or the equivalent for IE) is enough.
Both can show all requests made by the browser and the response code for each.

Also, as I said, there is no difference in how browsers react to the initial request and to the refresh.
Although you didn't check, there is no difference for a request for another page that uses the same resources, either.

But you are showing new data (at last ....) that it affects the behavior of the browser (IE?) after closing and reopening.
So, that might be the "condition" where the Expires header makes a difference. I'll test that later.

I've been wondering about the default cache behavior of IE & FF.
I know that IE had (still has?) the option to check freshness on a) each request, b) each start up
or c) Never, or something like that.
I tried to find that option in IE5 the other day, but I couldn't.
Maybe it's not there anymore, and the default is set to c) Never. (The default behavior in the past was b) on each startup.)
If so, Expires will have an impact on someone who visits the site repeatedly and closes the browser in between.

I searched MSDN and some other places for the default cache behavior of IE,
but I haven't found anything so far.
In the old FF I use, there is no option (but maybe I can set it using the config file...)


In short, for visitors like me (my machine is on 24/7, with FF always open), it wouldn't make any difference, unfortunately.

For someone who turns off the machine each night, or someone using an unstable OS that needs to be rebooted often, it may have an impact on repeat visits.

However, you should be aware of the side effect of using it.
When you want to modify some items, you have no means of getting the change to visitors unless you use different names and change all links and references to them (or you are willing to wait for the expiration time).
It can be a burden for some sites, in some cases.
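
One common way to handle the renaming is to derive the new name from the file's content, so the name only changes when the file does. A minimal Python sketch under that assumption; the function name versioned_name and the 8-character hash length are arbitrary choices for the example, and the links in your pages still have to be updated to the new name:

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path, content):
    """Return path with a short content hash inserted before the extension.

    path: URL path of the file, e.g. "/ii/logo.png"
    content: the file's bytes
    """
    p = PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()[:8]
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


if __name__ == "__main__":
    # Different content gives a different name, so cached copies of the
    # old name are simply never requested again.
    print(versioned_name("/ii/logo.png", b"old image bytes"))
    print(versioned_name("/ii/logo.png", b"new image bytes"))
```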

Last edited by extras : 3-8-06 at 12:23 PM.
Old 3-8-06, 12:24 PM   #22
mixerson
 
Join Date: Jan 2005
Location: Northeast
Posts: 185
Reputation: 52
Quote:
Extras: I know that IE had (still have?) the option to check the freshness on a) each request, b) each start up
or c) Never, or something like that.
I tried to find that option in IE5 the other day, but I couldn't.
In IE6/Windows, go to the Tools/Internet Options dialog, then click the Settings button in the Temporary Internet Files section.

The options for "Check for newer version of stored pages" are:
  • Every visit to the page
  • Every time you start Internet Explorer
  • Automatically
  • Never
I don't remember what the default is, but I assume it was "Automatically"
Old 3-8-06, 12:28 PM   #23
extras
on hiatus
 
Join Date: Mar 2004
Location: Canada
Posts: 5,815
Reputation: 314
Thank you.
As I don't use IE very often anymore, I was lost in it.

I guess the default is "Automatically", too.
So, I'll find out what it does, after closing and reopening.
Old 3-9-06, 05:03 PM   #24
produke
 
Join Date: Jul 2005
Location: USA
Posts: 119
Reputation: 41
Hey extras- I just wanted to warn you about relying on server logs, and especially about relying on LiveHTTP.

LiveHTTP is a cool extension, no doubt, but you seem to be unaware that it regularly misses requests. It definitely should not be relied upon for serious testing.

The ONLY way to be sure about the packets is to use a program like ethereal.

When I use ethereal (or tethereal) for this type of work, I use the built-in filter to specify that it should only capture (or display) packets that use the HTTP protocol. The filter is quite simply "http"

And as for relying on the server logs, ops is sooo slow! And even apache is known to miss some requests.

A huge benefit of using a packet-level capture program is that you can also see info such as content-length, content-type, acknowledgement and sequence info, etc..

I used to be a computer security consultant, that's how I know about this random stuff.
Old 3-9-06, 05:32 PM   #25
extras
on hiatus
 
Join Date: Mar 2004
Location: Canada
Posts: 5,815
Reputation: 314
I know the situation of the raw log, here.
(You will see some posts of mine treating the subject if you search.)
It wasn't very reliable in the past, especially after the clustering setup was introduced.
But it's more reliable now, and if it misses, it doesn't always miss the same type of request.
So, you will definitely know if something is wrong with logging by repeating the tests.
So far, during the tests I've done, I didn't see any missing entry.

As for LiveHTTPHeaders, I've never noticed that happening.
As I often cross-check with the log or other means, I don't agree with your statement
"it regularly misses requests.", again presented without any supporting evidence.
But I'll take note, and I'll recheck things if necessary.

I don't deny the virtue of tcpdump or Ethereal or any other utilities.
I use them when I need to.
In this case, there was no real need, as the raw log supplies all the info needed.

And I just finished the test on Win/Linux using IE/FF/Konqueror.
Wait for the next post.

Last edited by extras : 3-9-06 at 07:06 PM.
Old 3-9-06, 05:36 PM   #26
extras
on hiatus
 
Join Date: Mar 2004
Location: Canada
Posts: 5,815
Reputation: 314
Testing method:

I accessed the following URLs, then closed the browser and revisited them.
I tested with Firefox and Konqueror.
With Expiration header on gif
http://pow.check-these.info/expire/aaa.html
http://pow.check-these.info/expire/aaa2.html
http://pow.check-these.info/expire/showlog.cgi
Without Expiration header
http://pow.check-these.info/noexpire/aaa.html
http://pow.check-these.info/noexpire/aaa2.html
http://pow.check-these.info/noexpire/showlog.cgi



FireFox/Linux old version:

With Expiration headers

Fresh access:
[09/Mar/2006:10:51:00 -0800] "GET /expire/aaa.html HTTP/1.1" 200 586
[09/Mar/2006:10:51:00 -0800] "GET /expire/aaa.gif HTTP/1.1" 200 10563
[09/Mar/2006:10:51:02 -0800] "GET /expire/aaa2.html HTTP/1.1" 200 586
[09/Mar/2006:10:51:04 -0800] "GET /expire/showlog.cgi HTTP/1.1" 200 3427
Close and reopen:
[09/Mar/2006:10:52:32 -0800] "GET /expire/showlog.cgi HTTP/1.1" 200 3622


Without Expiration headers

Fresh access:
[09/Mar/2006:10:51:14 -0800] "GET /noexpire/aaa.html HTTP/1.1" 200 586
[09/Mar/2006:10:51:14 -0800] "GET /noexpire/aaa.gif HTTP/1.1" 200 10563
[09/Mar/2006:10:51:16 -0800] "GET /noexpire/aaa2.html HTTP/1.1" 200 586
[09/Mar/2006:10:51:19 -0800] "GET /noexpire/showlog.cgi HTTP/1.1" 200 2432
Close and reopen:
[09/Mar/2006:10:52:55 -0800] "GET /noexpire/showlog.cgi HTTP/1.1" 200 2629


Konqueror/Linux

With Expiration headers

Fresh access:
[09/Mar/2006:10:29:49 -0800] "GET /expire/aaa.html HTTP/1.1" 200 586
[09/Mar/2006:10:29:51 -0800] "GET /expire/aaa.gif HTTP/1.1" 200 10563
[09/Mar/2006:10:29:54 -0800] "GET /expire/aaa2.html HTTP/1.1" 200 586
[09/Mar/2006:10:30:00 -0800] "GET /expire/showlog.cgi HTTP/1.1" 200 594
Close and reopen:
[09/Mar/2006:10:31:14 -0800] "GET /expire/aaa.html HTTP/1.1" 200 586
[09/Mar/2006:10:31:19 -0800] "GET /expire/aaa2.html HTTP/1.1" 200 586
[09/Mar/2006:10:31:21 -0800] "GET /expire/showlog.cgi HTTP/1.1" 200 1342


Without Expiration headers

Fresh access:
[09/Mar/2006:10:30:24 -0800] "GET /noexpire/aaa.html HTTP/1.1" 200 586
[09/Mar/2006:10:30:24 -0800] "GET /noexpire/aaa.gif HTTP/1.1" 200 10563
[09/Mar/2006:10:30:27 -0800] "GET /noexpire/aaa2.html HTTP/1.1" 200 586
[09/Mar/2006:10:30:30 -0800] "GET /noexpire/showlog.cgi HTTP/1.1" 200 655
Close and reopen:
[09/Mar/2006:10:32:12 -0800] "GET /noexpire/aaa.html HTTP/1.1" 200 586
[09/Mar/2006:10:32:14 -0800] "GET /noexpire/aaa2.html HTTP/1.1" 200 586
[09/Mar/2006:10:32:16 -0800] "GET /noexpire/showlog.cgi HTTP/1.1" 200 1466

IE5.5 Windows

With Expiration headers

Fresh access:
[09/Mar/2006:11:32:23 -0800] "GET /expire/aaa.html HTTP/1.1" 200 586
[09/Mar/2006:11:32:23 -0800] "GET /expire/aaa.gif HTTP/1.1" 200 10563
[09/Mar/2006:11:32:25 -0800] "GET /expire/aaa2.html HTTP/1.1" 200 586
[09/Mar/2006:11:32:27 -0800] "GET /expire/showlog.cgi HTTP/1.1" 200 6173
Close and reopen:
[09/Mar/2006:11:33:06 -0800] "GET /expire/aaa.html HTTP/1.1" 304 -
[09/Mar/2006:11:33:10 -0800] "GET /expire/aaa2.html HTTP/1.1" 304 -
[09/Mar/2006:11:33:13 -0800] "GET /expire/showlog.cgi HTTP/1.1" 200 6697


Without Expiration headers

Fresh access:
[09/Mar/2006:11:32:40 -0800] "GET /noexpire/aaa.html HTTP/1.1" 200 586
[09/Mar/2006:11:32:40 -0800] "GET /noexpire/aaa.gif HTTP/1.1" 200 10563
[09/Mar/2006:11:32:43 -0800] "GET /noexpire/aaa2.html HTTP/1.1" 200 586
[09/Mar/2006:11:32:46 -0800] "GET /noexpire/showlog.cgi HTTP/1.1" 200 5315

Close and reopen:
[09/Mar/2006:11:33:23 -0800] "GET /noexpire/aaa.html HTTP/1.1" 304 -
[09/Mar/2006:11:33:23 -0800] "GET /noexpire/aaa.gif HTTP/1.1" 304 -
[09/Mar/2006:11:33:25 -0800] "GET /noexpire/aaa2.html HTTP/1.1" 304 -
[09/Mar/2006:11:33:27 -0800] "GET /noexpire/showlog.cgi HTTP/1.1" 200 6081


FireFox Windows

With Expiration headers

Fresh access:
[09/Mar/2006:11:34:20 -0800] "GET /expire/aaa.html HTTP/1.1" 200 586
[09/Mar/2006:11:34:20 -0800] "GET /expire/aaa.gif HTTP/1.1" 200 10563
[09/Mar/2006:11:34:22 -0800] "GET /expire/aaa2.html HTTP/1.1" 200 586
[09/Mar/2006:11:34:25 -0800] "GET /expire/showlog.cgi HTTP/1.1" 200 7445

Close and reopen:
[09/Mar/2006:11:35:01 -0800] "GET /expire/showlog.cgi HTTP/1.1" 200 7629


Without Expiration headers

Fresh access:
[09/Mar/2006:11:34:35 -0800] "GET /noexpire/aaa.html HTTP/1.1" 200 586
[09/Mar/2006:11:34:35 -0800] "GET /noexpire/aaa.gif HTTP/1.1" 200 10563
[09/Mar/2006:11:34:37 -0800] "GET /noexpire/aaa2.html HTTP/1.1" 200 586

Close and reopen:
[09/Mar/2006:11:34:38 -0800] "GET /noexpire/showlog.cgi HTTP/1.1" 200 6839



Conclusion:
For Firefox (both Linux and Win2K) and Konqueror/Linux, the Expiration headers had no effect.

However, it was interesting to see that FF used the cache without checking freshness for html and gif,
while Konqueror didn't use the cache for html but used cached data without checking freshness for gif, after closing and reopening the browser.


On IE5.5, when the cache checking is set to "Automatic", the Expiration header DOES
make a difference by skipping the request.
And it will reduce the load time IF there are many items in the page.
(When there are only a small number of items answered with a 304 response,
I don't think it changes a lot)


My recommendation:
As the majority of people are still using MSIE with the default configuration,
the use of Expiration headers may reduce bandwidth and hits, and improve page loading time for visitors, if they turn off the machine (or close the browser).

For users of Firefox, and for people who keep their machine (and browser) open 24/7,
there is no benefit, as far as I could see.

Personally, I wouldn't bother, as neither the bandwidth nor the requests are my concern,
and frequent visitors of my main site use Firefox, which uses cached data without the Expiration header.


Note:
These tests are limited to .html and .gif.
Some other file types may benefit from Expires much more.

If you find something interesting, please report.
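
For anyone repeating the test, a short Python sketch like this can summarize a pasted log by path and status code. It assumes lines in the same shape as the showlog.cgi output above; the regular expression and the function name statuses_by_path are written for this example only.

```python
import re
from collections import defaultdict

# Matches lines such as:
# [09/Mar/2006:11:33:23 -0800] "GET /noexpire/aaa.gif HTTP/1.1" 304 -
LINE = re.compile(
    r'\[(?P<time>[^\]]+)\] "GET (?P<path>\S+) HTTP/[\d.]+" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def statuses_by_path(log_text):
    """Map each requested path to the list of status codes it received."""
    result = defaultdict(list)
    for match in LINE.finditer(log_text):
        result[match.group("path")].append(int(match.group("status")))
    return dict(result)


# IE5.5, without Expiration headers, after close and reopen (from above).
SAMPLE = """
[09/Mar/2006:11:33:23 -0800] "GET /noexpire/aaa.html HTTP/1.1" 304 -
[09/Mar/2006:11:33:23 -0800] "GET /noexpire/aaa.gif HTTP/1.1" 304 -
[09/Mar/2006:11:33:25 -0800] "GET /noexpire/aaa2.html HTTP/1.1" 304 -
[09/Mar/2006:11:33:27 -0800] "GET /noexpire/showlog.cgi HTTP/1.1" 200 6081
"""

if __name__ == "__main__":
    for path, codes in statuses_by_path(SAMPLE).items():
        print(path, codes)
```

A path that is missing from the result was served from the cache without any request, which is the case being compared here.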
Old 3-9-06, 07:24 PM   #27
produke
 
Join Date: Jul 2005
Location: USA
Posts: 119
Reputation: 41
extras- your information is completely useless without the full (at least relevant) htaccess file used.

I went ahead and repeated the procedure I used before, this time with FF on windows, and found you are correct. The FF cache may or may not use the Expires header, but even without it, FF does not send conditional requests (and so gets no 304s) when the browser is reopened.

And why in the world would you keep FF on linux open 24/7? I think you might be the only one that does that. For one thing, FF isn't made with AJAX, you actually have to close and reopen to update the program. And for another thing, windows crashes or has to reboot ALL THE TIME due to software upgrades and bad programming, and FF is more resource intensive on windows than in a linux/BSD environment.

Also, I'm curious why you would prefer to use Konqueror.. I find it to be slow due to its multi-purpose code of filemanager/browser

If you are using linux I would strongly suggest using Opera instead.. It is very nice! And my fav linux is archlinux, though I am also running slackware and gentoo.

-------------------------------

I repeated this same test using a Squid proxy in between my browser and the target site. I tried the tests using the proxy in full mode (meaning I had to configure the proxy in my browser) and also in transparent mode (where all browsers use the proxy without knowing it).

I found that the Squid proxy uses the Expires header as its #1 indicator of a cached file's freshness! This is big news for anyone now using mod_expires...

Quote:
Originally Posted by Squid Documentation
12.20 How does Squid decide when to refresh a cached object?

When checking the object freshness, we calculate these values:

* OBJ_DATE is the time when the object was given out by the origin server. This is taken from the HTTP Date reply header.
* OBJ_LASTMOD is the time when the object was last modified, given by the HTTP Last-Modified reply header.
* OBJ_AGE is how much the object has aged since it was retrieved:

OBJ_AGE = NOW - OBJ_DATE

* LM_AGE is how old the object was when it was retrieved:

LM_AGE = OBJ_DATE - OBJ_LASTMOD

* LM_FACTOR is the ratio of OBJ_AGE to LM_AGE:

LM_FACTOR = OBJ_AGE / LM_AGE

* CLIENT_MAX_AGE is the (optional) maximum object age the client will accept as taken from the HTTP/1.1 Cache-Control request header.
* EXPIRES is the (optional) expiry time from the server reply headers.

These values are compared with the parameters of the refresh_pattern rules. The refresh parameters are:

* URL regular expression
* CONF_MIN: The time (in minutes) an object without an explicit expiry time should be considered fresh. The recommended value is 0, any higher values may cause dynamic applications to be erroneously cached unless the application designer has taken the appropriate actions.
* CONF_PERCENT: A percentage of the objects age (time since last modification age) an object without explicit expiry time will be considered fresh.
* CONF_MAX: An upper limit on how long objects without an explicit expiry time will be considered fresh.

The URL regular expressions are checked in the order listed until a match is found. Then the algorithms below are applied for determining if an object is fresh or stale.
Squid-1.1 and Squid-1.NOVM algorithm

Code:
if (CLIENT_MAX_AGE)
    if (OBJ_AGE > CLIENT_MAX_AGE)
        return STALE
if (OBJ_AGE <= CONF_MIN)
    return FRESH
if (EXPIRES) {
    if (EXPIRES <= NOW)
        return STALE
    else
        return FRESH
}
if (OBJ_AGE > CONF_MAX)
    return STALE
if (LM_FACTOR < CONF_PERCENT)
    return FRESH
return STALE

Kolics Bertold has made an excellent flow chart diagram showing this process.
FLOWCHART

For Squid-2 the refresh algorithm has been slightly modified to give the EXPIRES value a higher precedence, and the CONF_MIN value lower precedence:

Code:
if (EXPIRES) {
    if (EXPIRES <= NOW)
        return STALE
    else
        return FRESH
}
if (CLIENT_MAX_AGE)
    if (OBJ_AGE > CLIENT_MAX_AGE)
        return STALE
if (OBJ_AGE > CONF_MAX)
    return STALE
if (OBJ_DATE > OBJ_LASTMOD) {
    if (LM_FACTOR < CONF_PERCENT)
        return FRESH
    else
        return STALE
}
if (OBJ_AGE <= CONF_MIN)
    return FRESH
return STALE

Expires is definitely used in Squid, and this should explain why using the Expires header will drastically improve the speed of your site for visitors who are behind a proxy (think libraries, schools, corporate buildings, peoplePC dialup users, etc. etc..)
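
To try the Squid-2 rules quoted above with your own numbers, here is a direct Python transcription of that pseudocode. It is a sketch of the documented algorithm, not Squid's actual source. Assumptions: every time value is in seconds (the documentation gives CONF_MIN in minutes), conf_percent is a fraction such as 0.2 rather than a percentage, and the default parameter values are placeholders picked for the example.

```python
def squid2_is_fresh(now, obj_date, obj_lastmod=None, expires=None,
                    client_max_age=None, conf_min=0, conf_percent=0.2,
                    conf_max=259200):
    """Return True for FRESH, False for STALE, following the quoted rules."""
    # An explicit Expires value is checked first and decides on its own.
    if expires is not None:
        return expires > now
    obj_age = now - obj_date
    if client_max_age is not None and obj_age > client_max_age:
        return False
    if obj_age > conf_max:
        return False
    # Last-Modified heuristic, used only when Expires is absent.
    if obj_lastmod is not None and obj_date > obj_lastmod:
        lm_factor = obj_age / (obj_date - obj_lastmod)
        return lm_factor < conf_percent
    return obj_age <= conf_min


if __name__ == "__main__":
    # With Expires in the future the object is fresh, whatever its age.
    print(squid2_is_fresh(now=1000, obj_date=0, expires=2000, conf_max=10))
    # Without Expires the answer depends on Last-Modified and the config.
    print(squid2_is_fresh(now=1300, obj_date=1000, obj_lastmod=0))
```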
Old 3-9-06, 08:29 PM   #28
extras
on hiatus
 
Join Date: Mar 2004
Location: Canada
Posts: 5,815
Reputation: 314
Quote:
Originally Posted by produke
extras- your information is completely useless without the full (at least relevant) htaccess file used.
Didn't you check the DirectoryIndex?
http://pow.check-these.info/expire/
http://pow.check-these.info/noexpire/

I made the .htaccess visible for you.
http://pow.check-these.info/expire/.htaccess
http://pow.check-these.info/noexpire/.htaccess

Do you understand that I made the setup totally transparent for you from the beginning?
And I've already told you that you can see the .htaccess.
So, my info is totally relevant, well prepared, extremely user friendly and thorough.
Quote:
Originally Posted by extras
See for yourself. I made a test case for you.
Go to this directory and click on aaa.html, then aaa2.html, and finally showlog.cgi
http://pow.check-these.info/expire/

Repeat the same for another directory without the Expires header.
http://pow.check-these.info/noexpire/

You can also see the .htaccess in each directory.

Quote:
And why in the world would you keep FF on linux open 24/7? I think you might be the only one that does that.
I'm lazy.

Quote:
For one thing, FF isn't made with AJAX, you actually have to close and reopen to update the program.
I don't even like Javascript. So, you can understand that I don't care at all about AJAX.

Quote:
And for another thing, windows crashes or has to reboot ALL THE TIME due to software upgrades and bad programming, and FF is more resource intensive on windows than a linux/BSD environment.
I rarely boot my Win2K machine, but it's very stable when I use it.
Maybe because I don't allow spyware like Windows auto update to run.

Quote:
Also, I'm curious why you would prefer to use Konqueror.. I find it to be slow due to its multi-purpose code of filemanager/browser
Because I'm lazy. Konqueror is there without installing.
And I only open it when I need it, for example, to cross check something.

Quote:
If you are using linux I would strongly suggest using Opera instead.. It is very nice! And my fav linux is archlinux, though I am also running slackware and gentoo.
I use Opera in FreeBSD, in the VMware player, as it's lighter.

-------------------------------
I'll read about the Squid test. Thank you for the info.

I think there is a very simple way to achieve the same (or a pretty similar) effect as the Expires header in IE.
Any user who would like to speed up the load time should simply choose the "Never" option for checking for newer versions of stored pages, I guess.
I think that will make IE act just like FF for the cache, and it's pretty easy to do.


As for the effectiveness of the Expires header, I'm a bit curious about dynamic pages.
Old 3-9-06, 08:48 PM   #29
produke
 
Join Date: Jul 2005
Location: USA
Posts: 119
Reputation: 41
The Expires header does not work on dynamic pages.

Of course you can set it up to work on dynamic pages, but you must first convert the dynamic page into a static one. I would guess by adding a handler, a CGI program that does the conversion? I honestly don't know.

For me it's just not viable, as all my html files contain dynamic content.

As I showed earlier, I use PHP to output headers on all my .htm files (handled by PHP) because I don't want them cached by a browser. Since I use strict XHTML+CSS and I don't include any inline JS or CSS, the .htm file is extremely small, and yet the CSS, JS, and all other page items get cached!

My htm pages, for instance, fit in just 4 packets! Without compression!
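
The split described above (dynamic pages never cached, static assets cached for a long time) can be sketched as a function that picks response headers by content type. This is a minimal Python sketch, not the PHP from the earlier post; the function name, the list of cacheable types, the single one-month lifetime, and the exact no-cache header values are all assumptions chosen for illustration.

```python
from email.utils import formatdate

# Content types treated as static and safe to cache (an assumed list).
CACHEABLE_TYPES = {"text/css", "application/x-javascript", "image/gif", "image/png"}

ONE_MONTH = 2592000  # seconds, the same value as A2592000


def cache_headers(content_type, now):
    """Return caching headers for a response of the given type.

    `now` is a Unix timestamp for the moment the response is sent.
    """
    if content_type in CACHEABLE_TYPES:
        # Static asset: let the client keep it until now + one month.
        return {"Expires": formatdate(now + ONE_MONTH, usegmt=True)}
    # Dynamic page: ask clients and proxies not to reuse a stored copy
    # without checking back with the server.
    return {
        "Cache-Control": "no-cache, must-revalidate",
        "Pragma": "no-cache",
        "Expires": formatdate(0, usegmt=True),  # a date in the past
    }
```

The small dynamic page is fetched on every view, while the heavier CSS and JS it links to are served from the client's cache.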
Old 5-8-06, 08:26 AM   #30
produke
 
Join Date: Jul 2005
Location: USA
Posts: 119
Reputation: 41
Reset client cache with ETag

So I ran into a problem after I had been using the Expires tips outlined here..

Basically the problem was that there was no way to 'reset' a client's cache for something like a swf, pdf, etc.. The clients stick to the Expires rules you set for them. So, one way that I am now testing to reset clients' caches is to use the FileETag directive.

PowWeb's default setup uses FileETag INode. So when I change that to something like FileETag Size, the headers for the various files get a new ETag, which should cause the cache to be reset.

You can try it out by adding
Code:
FileETag Size
in your .htaccess file.
Old 5-9-06, 10:05 PM   #31
produke
 
Join Date: Jul 2005
Location: USA
Posts: 119
Reputation: 41
Ok, so the ETag thing might not work to force the client to recache.

I've tried
Code:
### the closing tag must match the opening <FilesMatch>
<FilesMatch "file\.htm$">
ExpiresActive Off
</FilesMatch>
And that succeeded in turning off Expires for just that file. But it didn't force a recache!
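
A likely reason neither change forced a recache: while a stored copy is still within its Expires lifetime, the browser normally uses it without contacting the server, so it never sees the new headers. A minimal Python sketch of that decision, under a simplified model that ignores manual reloads and browser-specific heuristics (the function name and return strings are hypothetical):

```python
def cache_action(now, expires_at):
    """Decide what a browser cache does with a stored response.

    Both arguments are Unix timestamps; `expires_at` comes from the
    Expires header that was sent when the file was first fetched.
    """
    if now < expires_at:
        # Still fresh: the stored copy is used and the server is never
        # contacted, so a new ETag or a removed Expires header goes unseen.
        return "use-cached-copy"
    # Stale: only now does the browser send a conditional request and
    # compare Last-Modified / ETag with the server.
    return "revalidate-with-server"
```

Server-side header changes only reach a client once its stored copy has gone stale, or when the URL itself changes.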

So I ended up doing the following-

from
HTML Code:
<script src="/ij/y.js" type="text/javascript"></script>
<link href="/ic/y.css" rel="stylesheet" type="text/css" media="all" />

to
HTML Code:
<script src="/ij/y.js?v=1" type="text/javascript"></script>
<link href="/ic/y.css?v=1" rel="stylesheet" type="text/css" media="all" />

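And then when I make a change to the files, I just bump the v= value, for example to v=1a or v=2. The browser treats the new URL as a different resource, so it fetches a fresh copy instead of using the cached one.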
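
Bumping v= by hand can also be automated by deriving the value from the file's contents, so it changes exactly when the file does. A minimal Python sketch of that idea; the helper name and the choice of a truncated SHA-256 digest are assumptions, not something used in this thread:

```python
import hashlib


def versioned_url(url, content):
    """Append a v= parameter derived from the file's bytes.

    Hypothetical helper: the version is the first 8 hex digits of a
    SHA-256 digest of `content`, so any change to the file yields a
    different URL and therefore a fresh download.
    """
    version = hashlib.sha256(content).hexdigest()[:8]
    # Use '&' if the URL already carries a query string.
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}v={version}"
```

A template would call something like `versioned_url("/ic/y.css", css_bytes)` when writing the `<link>` tag, where `css_bytes` is the stylesheet read from disk.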





When I am actively developing a live site, I change the Expires code to
Code:
### 1 second
ExpiresByType application/x-javascript A1
ExpiresByType text/css A1

This lets my browser check the Last-Modified (or ETag) of the JavaScript and CSS files 1 second after I last viewed them. This means I don't have to hit the refresh button! But more importantly, it means my clients don't.
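
With a one-second lifetime, nearly every page view triggers a conditional request rather than a blind reuse of the cached file. A minimal Python sketch of the server's side of that check, assuming only the ETag is compared and Last-Modified is ignored (the function and parameter names are hypothetical):

```python
def revalidate(if_none_match, current_etag):
    """Return the status a server would send for a conditional GET.

    `if_none_match` is the ETag the browser stored earlier, or None on
    a first request; `current_etag` is the file's ETag right now.
    """
    if if_none_match is not None and if_none_match == current_etag:
        # Unchanged: headers only, the browser keeps its stored copy.
        return 304
    # Changed, or nothing cached yet: the full file is sent again.
    return 200
```

An unchanged file costs only a small 304 response, while an edited file is picked up on the next view.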
All times are GMT -4. The time now is 05:14 AM.


Contents ©PowWeb, Inc. ~ vBulletin, Copyright © 2000-2007 Jelsoft Enterprises Limited.