Search the Web:

Google

Sunday, January 27, 2008

How Http Communication Works

The driving force behind Ajax is the interaction between the client (web browser) and the server. Previously, the understanding of this communication was limited to those who developed purely on the server-side using languages such as Perl and C. Newer technologies such as ASP.NET, PHP, and JSP encouraged more of a mix of client- and server-side techniques for software engineers interested in creating web applications, but they often lacked a full understanding of all client-side technologies (such as JavaScript). Now the pendulum has swung in the other direction, and client-side developers need to understand more about server-side technology in order to create Ajax solutions.

HTTP PRIMER

Central to a good grasp of Ajax techniques is hypertext transmission protocol (HTTP), the protocol to transmit web pages, images, and other types of files over the Internet to your web browser and back. Whenever you type a URL into the browser, an "http://" is prepended to the address, indicating that you will be using HTTP to access the information at the given location.

HTTP consists of two parts: a request and a response. When you type a URL in a web browser, the browser creates and sends a request on your behalf. This request contains the URL that you typed in as well as some information about the browser itself. The server receives this request and sends back a response. The response contains information about the request as well as the data located at the URL (if any). It's up to the browser to interpret the response and display the web page (or other resource).

HTTP Requests

The format of an HTTP request is as follows:
<request-line>
<headers>
<blank>
[<request-body>]

In an HTTP request, the first line must be a request line indicating the type of request, the resource to access, and the version of HTTP being used. Next, a section of headers indicate additional information that may be of use to the server. After the headers is a blank line, which can optionally be followed by additional data (called the body).

There are a large number of request types defined in HTTP, but the two of interest to Ajax developers are GET and POST. Anytime you type a URL in a web browser, the browser sends a GET request to the server for that URL, which basically tells the server to get the resource and send it back. Here's what a GET request for http://www.example.com/ might look like:

GET / HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.6)
Gecko/20050225 Firefox/1.0.1
Connection: Keep-Alive

The first part of the request line specifies this as a GET request. The second part of that line is a forward slash (/), indicating that the request is for the root of the domain. The last part of the request line specifies to use HTTP version 1.1 (the alternative is 1.0). And where is the request sent? That's where the second line comes in.

The second line is the first header in the request, Host. The Host header indicates the target of the request. Combining Host with the forward slash from the first line tells the server that the request is for www.example.com/. (The Host header is a requirement of HTTP 1.1; the older version 1.0 didn't require it.) The third line contains the User-Agent header, which is accessible to both server- and client-side scripts and is the cornerstone of most browser-detection logic. This information is defined by the browser that you are using (in this example, Firefox 1.0.1) and is automatically sent on every request. The last line is the Connection header, which is typically set to Keep-Alive for browser operations (it can also be set to other values, but that's beyond the scope of this book). Note that there is a single blank line after this last header. Even though there is no request body, the blank line is required.


If you were to request a page under the http://www.example.com/ domain, such as http://www.example.com/books, the request would look like this:
GET /books/ HTTP/1.1
Host: http://www.example.com/
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.6)
Gecko/20050225 Firefox/1.0.1
Connection: Keep-Alive

Note that only the first line changed, and it contains only the part that comes after http://www.example.com/ in the URL.
Sending parameters for a GET request requires that the extra information be appended to the URL itself. The format looks like this:
URL ? name1=value1&name2=value2&..&nameN=valueN

This information, called a query string, is duplicated in the request line of the HTTP request, as follows:
GET /books/?name=ajax%20book HTTP/1.1
Host: http://www.example.com/
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.6)
Gecko/20050225 Firefox/1.0.1
Connection: Keep-Alive

Note that the text "ajax book" had to be encoded, replacing the space with %20, in order to send it as a parameter to the URL. This is called URL encoding and is used in many parts of HTTP. (JavaScript has built-in functions to handle URL encoding and decoding; these are discussed later in the chapter). The name-value pairs are separated with an ampersand. Most server-side technologies will decode the request body automatically and provide access to these values in some sort of logical manner. Of course, it is up to the server to decide what to do with this data.

The POST request, on the other hand, provides additional information to the server in the request body. Typically, when you fill out an online form and submit it, that data is being sent through a POST request.
Here's what a typical POST request looks like:
POST / HTTP/1.1
Host: http://www.example.com/
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.6)
Gecko/20050225 Firefox/1.0.1
Content-Type: application/x-www-form-urlencoded
Content-Length: 40
Connection: Keep-Alive
name=ajax%20book&publisher=smrithi

You should note a few differences between a POST request and a GET request. First, the request line begins with POST instead of GET, indicating the type of request. You'll notice that the Host and User-Agent headers are still there, along with two new ones. The Content-Type header indicates how the request body is encoded. Browsers always encode post data as application/x-www-form-urlencoded, which is the MIME type for simple URL encoding. The Content-Length header indicates the byte length of the request body. After the Connection header and the blank line is the request body. As with most browser POST requests, this is made up of simple name-value pairs, where name is Professional Ajax and publisher is Wiley. You may recognize that this format is the same as that of query string parameters on URLs.
As mentioned previously, there are other HTTP request types, but they follow the same basic format as GET and POST. The next step is to take a look at what the server sends back in response to an HTTP request.


HTTP Responses

The format of an HTTP response, which is very similar to that of a request, is as follows:
<status-line>
<headers>
<blank line>
[<response-body>]

As you can see, the only real difference in a response is that the first line contains status information instead of request information. The status line tells you about the requested resource by providing a status code. Here's a sample HTTP response:

HTTP/1.1 200 OK
Date: Sat, 31 Dec 2005 23:59:59 GMT
Content-Type: text/html;charset=ISO-8859-1
Content-Length: 122

<html>
<head>
<title>Example Homepage</title>
</head>
<body>
<!-- body goes here -->
</body>
</html>

In this example, the status line gives an HTTP status code of 200 and a message of OK. The status line always contains the status code and the corresponding short message so that there isn't any confusion. The most common status codes are:

200 (OK): The resource was found and all is well.

304 (NOT MODIFIED): The resource has not been modified since the last request. This is used most often for browser cache mechanisms.

401 (UNAUTHORIZED): The client is not authorized to access the resource. Often, this will cause the browser to ask for a user name and password to log in to the server.

403 (FORBIDDEN): The client failed to gain authorization. This typically happens if you fail to log in with a correct user name and password after a 401.

404 (NOT FOUND): The resource does not exist at the given location.

Following the status line are some headers. Typically, the server will return a Date header indicating the date and time that the response was generated. (Servers typically also return some information about themselves, although this is not required.) The next two headers should look familiar as well, as they are the same Content-Type and Content-Length headers used in POST requests. In this case, the Content-Type header specifies the MIME type for HTML (text/html) with an encoding of ISO-8859-1 (which is standard for the United States English resources). The body of the response simply contains the HTML source of the requested resource (although it could also contain plain text or binary data for other types of resources). It is this data that the browser displays to the user.
Note that there is no indication as to the type of request that asked for this response; however, this is of no consequence to the server. It is up to the client to know what type of data should be sent back for each type of request and to decide how that data should be used.

1 comment:

addy said...

Hey!!
Nice way of putting it up....
good keep it up....
:-)

LinkDirectory.com is a brand NEW 100% SEO friendly and an Entirely human edited link directory.
DaniWeb IT Discussion Community
Blog Search, Blog Directory
Directory of Computers/Tech Blogs