HTTP: A Must-Know Protocol (Part 1)

Hypertext Transfer Protocol (HTTP) is an application-level protocol for distributed, collaborative, hypermedia information systems. Basically, HTTP is a TCP/IP-based communication protocol that is used to deliver data (HTML files, image files, query results, etc.) on the World Wide Web. The default port is TCP 80, but other ports can be used as well. It provides a standardized way for computers to communicate with each other.

The HTTP specification defines how a client's request data is constructed and sent to the server, and how the server responds to these requests.


TCP/IP is responsible for breaking the data into small packets and delivering them to the correct destination, that is, the IP address they are supposed to reach.

Basic Features of HTTP :


HTTP is connectionless: The HTTP client, i.e., a browser, opens a connection and sends an HTTP request, and the server sends its response back over that same connection. In HTTP/1.0 the connection is then closed, so the client and the server stay connected only for the duration of a single request/response exchange. The server never opens a connection back to the client.

HTTP is media independent: Any type of data can be sent over HTTP as long as both the client and the server know how to handle the data content. Both the client and the server are required to specify the content type of the data they send, using the appropriate MIME type.

HTTP is stateless: Statelessness goes hand in hand with HTTP being connectionless. The server and the client are aware of each other only during the current request. Afterwards, both of them forget about each other. Due to this nature of the protocol, neither the client nor the server can retain information between different requests across web pages at the protocol level.

HTTP/1.0 uses a new connection for each request/response exchange, whereas an HTTP/1.1 connection may be used for one or more request/response exchanges.
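
To make the media-independence point concrete, here is a minimal Python sketch of how a program might pick the MIME type to announce for a file. The helper name content_type_for and the fallback to application/octet-stream are choices made for this illustration, not something HTTP itself prescribes.

```python
import mimetypes


def content_type_for(filename: str) -> str:
    """Guess a MIME type from the file name, with a generic binary fallback."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"


if __name__ == "__main__":
    for name in ("index.html", "logo.png", "data.unknownext"):
        print(name, "->", content_type_for(name))
```

The guess is based only on the file extension, so a real application may need to inspect the content or let the caller override the value.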

HTTP Architecture 


HTTP is based on the client-server architecture model and a stateless request/response protocol that operates by exchanging messages across a reliable TCP/IP connection.

An HTTP "client" is a program (a web browser or any other client) that establishes a connection to a server for the purpose of sending one or more HTTP request messages. An HTTP "server" is a program (generally a web server such as Apache Web Server or Internet Information Services (IIS)) that accepts connections in order to serve HTTP requests by sending HTTP response messages.
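
The two roles can be tried out on a single machine. The sketch below uses only Python's standard library: a tiny server that answers every GET with "Hello, World!" and a client that sends two requests over one HTTP/1.1 connection. The names HelloHandler and fetch_twice, the paths /first and /second, and the use of 127.0.0.1 with an OS-chosen port are assumptions made for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the connection stay open between requests
    client_ports = []              # source port of every request the server sees

    def do_GET(self):
        HelloHandler.client_ports.append(self.client_address[1])
        body = b"Hello, World!"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def fetch_twice():
    """Start a local server, send two GETs over one connection, return what happened."""
    HelloHandler.client_ports.clear()
    server = ThreadingHTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
    try:
        results = []
        for path in ("/first", "/second"):
            conn.request("GET", path)
            response = conn.getresponse()
            results.append((response.status, response.read()))
        return results, list(HelloHandler.client_ports)
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
```

fetch_twice() returns the two (status, body) pairs together with the client port the server saw for each request. If both ports are equal, both exchanges travelled over the same TCP connection.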



URL

At the heart of web communication is the request message, which is sent to a Uniform Resource Locator (URL). A URL defines the application protocol or scheme to be used, the host to which the request is sent, the port number, the resource path, and query data if needed.
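
As a rough illustration, the parts of a URL can be pulled apart with Python's urllib.parse module. The URL used here and the helper name describe_url are made up for the example.

```python
from urllib.parse import parse_qs, urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into scheme, host, port, path and query data."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,  # None when the URL does not name a port
        "path": parts.path,
        "query": parse_qs(parts.query),
    }


if __name__ == "__main__":
    print(describe_url("http://www.example.com:8080/search/results?q=http&page=2"))
```

When the port is omitted, the client falls back to the default for the scheme, which is 80 for http.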





HTTP Methods (Verbs)


URLs reveal the identity of the particular host with which we want to communicate, but the action that should be performed on the host is specified via HTTP verbs. Of course, there are several actions that a client would like the host to perform. HTTP has formalized a few verbs that capture the essentials and are universally applicable to all kinds of applications.

These request verbs are:

GET: fetch an existing resource. The URL contains all the necessary information the server needs to locate and return the resource.
POST: create a new resource. POST requests usually carry a payload that specifies the data for the new resource.
PUT: update an existing resource. The payload may contain the updated data for the resource.
DELETE: delete an existing resource.

The above four verbs are the most popular, and most tools and frameworks explicitly expose these request verbs. PUT and DELETE are sometimes considered specialized versions of the POST verb, and they may be packaged as POST requests with the payload containing the exact action: create, update or delete.

There are some lesser used verbs that HTTP also supports:

HEAD: this is similar to GET, but without the message body. It's used to retrieve the server headers for a particular resource, generally to check if the resource has changed, via timestamps.
TRACE: used to retrieve the hops that a request takes on its round trip to the server. Each intermediate proxy or gateway would inject its IP or DNS name into the Via header field. This can be used for diagnostic purposes.
OPTIONS: used to retrieve the server capabilities. On the client-side, it can be used to modify the request based on what the server can support.
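
The verbs above map directly onto how a client builds its requests. The following sketch constructs one request object per verb with Python's urllib.request.Request without sending anything over the network. The endpoint http://www.example.com/articles, the resource id 42 and the JSON payload are hypothetical.

```python
import json
from urllib.request import Request

BASE_URL = "http://www.example.com/articles"  # hypothetical endpoint


def build_requests() -> list:
    """Build (but do not send) one request per HTTP verb."""
    payload = json.dumps({"title": "HTTP basics"}).encode("utf-8")
    json_headers = {"Content-Type": "application/json"}
    return [
        Request(f"{BASE_URL}/42", method="GET"),      # fetch an existing resource
        Request(BASE_URL, data=payload, headers=json_headers, method="POST"),  # create
        Request(f"{BASE_URL}/42", data=payload, headers=json_headers, method="PUT"),  # update
        Request(f"{BASE_URL}/42", method="DELETE"),   # delete an existing resource
        Request(f"{BASE_URL}/42", method="HEAD"),     # headers only, no body
        Request(BASE_URL, method="OPTIONS"),          # ask for server capabilities
    ]


if __name__ == "__main__":
    for request in build_requests():
        print(request.get_method(), request.full_url)
```

Passing any of these objects to urllib.request.urlopen would actually send the request. That step is left out here because the endpoint is made up.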

Status Codes


The Status-Code element in a server response is a 3-digit integer where the first digit defines the class of response and the last two digits do not have any categorization role. There are 5 values for the first digit:

1xx: Informational : It means the request has been received and the process is continuing.

2xx: Success : It means the action was successfully received, understood, and accepted.

3xx: Redirection : It means further action must be taken in order to complete the request.

4xx: Client Error : It means the request contains incorrect syntax or cannot be fulfilled.

5xx: Server Error : It means the server failed to fulfill an apparently valid request.
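
Because only the first digit carries the class, a client can classify any status code with integer division. This is a small sketch; the function name status_class is chosen for the example.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code: int) -> str:
    """Return the class name for a 3-digit status code, based on its first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]


if __name__ == "__main__":
    for code in (200, 301, 404, 502):
        print(code, status_class(code))
```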

A few of the most commonly used status codes are:

200 OK : Standard response for successful HTTP requests. The actual response will depend on the request method used. In a GET request, the response will contain an entity corresponding to the requested resource. In a POST request, the response will contain an entity describing or containing the result of the action.

400 Bad Request : The server cannot or will not process the request due to something that is perceived to be a client error (e.g., malformed request syntax, invalid request message framing, or deceptive request routing).

401 Unauthorized : The request requires user authentication.

404 Not Found : The requested resource could not be found but may be available again in the future. Subsequent requests by the client are permissible.

500 Internal Server Error : The server is unable to process the request due to an internal error.

502 Bad Gateway : The server, while acting as a gateway or proxy, received an invalid response from the upstream server it contacted to fulfill the request.
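
Python's standard library already knows these codes through the http.HTTPStatus enumeration, so the reason phrases do not need to be typed by hand. The helper name describe is an assumption for this sketch.

```python
from http import HTTPStatus


def describe(code: int) -> str:
    """Format a numeric status code together with its standard reason phrase."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase}"


if __name__ == "__main__":
    for code in (200, 400, 401, 404, 500, 502):
        print(describe(code))
```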



HTTP Message Structure 


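HTTP requests and HTTP responses use the generic message format of RFC 822 for transferring the required data. This generic message format consists of the following four items: a start-line, zero or more header fields, an empty line (CRLF) indicating the end of the header fields, and an optional message body.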


Message Start-Line : A start-line will have the following generic syntax:

start-line = Request-Line | Status-Line

Examples of a Request-Line and a Status-Line are shown below:

GET /hello.htm HTTP/1.1     (This is Request-Line sent by the client)

HTTP/1.1 200 OK             (This is Status-Line sent by the server)


Header Fields

HTTP header fields provide required information about the request or response, or about the object sent in the message body. There are four types of HTTP message headers:

General-header: These header fields have general applicability for both request and response messages.

Request-header: These header fields have applicability only for request messages.

Response-header: These header fields have applicability only for response messages.


Entity-header: These header fields define meta information about the entity-body or, if no body is present, about the resource identified by the request.

Following are the examples of various header fields:

User-Agent: curl/7.16.3 libcurl/7.16.3 OpenSSL/0.9.7l zlib/1.2.3
Host: www.example.com
Accept-Language: en, mi
Date: Mon, 27 Jul 2009 12:28:53 GMT
Server: Apache
Last-Modified: Wed, 22 Jul 2009 19:15:56 GMT
ETag: "34aa387-d-1568eb00"
Accept-Ranges: bytes
Content-Length: 51
Vary: Accept-Encoding
Content-Type: text/plain
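
Since header fields follow the RFC 822 layout, they can be read with the parser that Python's http.client module builds on the email package. In this sketch the sample bytes reuse a few of the header lines listed above, and the helper name read_headers is made up.

```python
import io
from http.client import parse_headers

RAW_HEADERS = (
    b"Host: www.example.com\r\n"
    b"Accept-Language: en, mi\r\n"
    b"Content-Length: 51\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"  # the empty line ends the header section
)


def read_headers(raw: bytes):
    """Parse a block of header lines into a case-insensitive mapping."""
    return parse_headers(io.BytesIO(raw))


if __name__ == "__main__":
    headers = read_headers(RAW_HEADERS)
    for name, value in headers.items():
        print(f"{name}: {value}")
```

Header names are matched case-insensitively, so headers["host"] and headers["Host"] return the same value.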

Message Body

The message body part is optional for an HTTP message, but if it is available, it is used to carry the entity-body associated with the request or response. If an entity-body is present, the Content-Type and Content-Length header lines usually specify its nature.

A message body carries the actual HTTP request data (including form data, uploaded files, etc.) and HTTP response data from the server (including files, images, etc.). Shown below is the simple content of a message body:

Example of message body :

<html>
<body>
<h1>Hello, World!</h1>
</body>
</html>
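
One detail that is easy to get wrong is that Content-Length counts bytes, not characters. The sketch below derives both body-related headers for a text body. The function name body_headers and the choice of UTF-8 as the encoding are assumptions for this example.

```python
def body_headers(body: str, content_type: str = "text/html; charset=utf-8") -> dict:
    """Return the Content-Type and Content-Length headers for a text body."""
    encoded = body.encode("utf-8")  # the length must be measured after encoding
    return {
        "Content-Type": content_type,
        "Content-Length": str(len(encoded)),
    }


if __name__ == "__main__":
    print(body_headers("<h1>Hello, World!</h1>"))
    print(body_headers("héllo", "text/plain; charset=utf-8"))
```

In the second call the body has five characters but six bytes, because "é" takes two bytes in UTF-8.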



HTTP Request Format :


An HTTP client sends an HTTP request to a server in the form of a request message whose format is similar to the HTTP message structure as explained above.

Request-Line : Request-Line begins with a method token, followed by the Request-URI and the protocol version, and ending with CRLF. The elements are separated by space SP characters.

Request-Line = Method SP Request-URI SP HTTP-Version CRLF

The request method indicates the method to be performed on the resource identified by the given Request-URI. The method is case-sensitive and should always be written in uppercase. Common methods include GET, POST, PUT and DELETE.

Request-URI is a Uniform Resource Identifier and identifies the resource upon which to apply the request.
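
The Request-Line grammar above translates almost directly into code. Here is a minimal parsing sketch, assuming a well-formed line with single spaces. The names RequestLine and parse_request_line are made up for the example.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str
    target: str   # the Request-URI
    version: str


def parse_request_line(line: str) -> RequestLine:
    """Split 'Method SP Request-URI SP HTTP-Version CRLF' into its three parts."""
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed Request-Line: {line!r}")
    method, target, version = parts
    if not version.startswith("HTTP/"):
        raise ValueError(f"unexpected HTTP-Version: {version!r}")
    return RequestLine(method, target, version)


if __name__ == "__main__":
    print(parse_request_line("GET /hello.htm HTTP/1.1\r\n"))
```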

Request Header Fields : The request-header fields allow the client to pass additional information about the request, and about the client itself, to the server. These fields act as request modifiers. A few of the important request header fields are:
1) Content-Type
2) Authorization
3) Accept-Language
4) User-Agent
5) Accept-Encoding
6) Connection

Example of a request message in HTTP :

GET /hello.htm HTTP/1.1
User-Agent: Mozilla/4.0 (compatible; MSIE5.01; Windows NT)
Host: www.tutorialspoint.com
Accept-Language: en-us
Accept-Encoding: gzip, deflate
Connection: Keep-Alive
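
On the wire, every line of such a request ends with CRLF and an empty line closes the header section. The following sketch assembles the raw bytes by hand. The function name build_request is an assumption, and real clients normally leave this step to a library.

```python
def build_request(method: str, target: str, headers: dict, body: bytes = b"") -> bytes:
    """Serialize a request: Request-Line, header lines, empty line, optional body."""
    lines = [f"{method} {target} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"  # the final CRLF pair is the empty line
    return head.encode("ascii") + body


if __name__ == "__main__":
    raw = build_request(
        "GET",
        "/hello.htm",
        {"Host": "www.tutorialspoint.com", "Connection": "Keep-Alive"},
    )
    print(raw.decode("ascii"))
```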


Response in HTTP 


The response structure is the same as the HTTP message structure. Let's look at each element in a bit more detail.

Message Status-Line : 
A Status-Line consists of the protocol version followed by a numeric status code and its associated textual phrase. The elements are separated by space SP characters.

Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF

HTTP Version :A server supporting HTTP version 1.1 will return the following version information:

HTTP-Version = HTTP/1.1

Status Code : The Status-Code element is a 3-digit integer where the first digit defines the class of response and the last two digits do not have any categorization role.
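
A Status-Line can be taken apart in the same spirit. Note that the reason phrase may itself contain spaces, so the line is split at most twice. The helper name parse_status_line is chosen for this sketch.

```python
def parse_status_line(line: str):
    """Split 'HTTP-Version SP Status-Code SP Reason-Phrase CRLF' into its parts."""
    parts = line.rstrip("\r\n").split(" ", 2)  # at most 2 splits keeps the phrase whole
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError(f"malformed Status-Line: {line!r}")
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) == 3 else ""
    return version, code, reason


if __name__ == "__main__":
    print(parse_status_line("HTTP/1.1 200 OK\r\n"))
    print(parse_status_line("HTTP/1.1 404 Not Found\r\n"))
```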

Response Header Fields : The response-header fields allow the server to pass additional information about the response which cannot be placed in the Status-Line. These header fields give information about the server and about further access to the resource identified by the Request-URI. A few of these fields are listed below:

  • Content-Type
  • Content-Length
  • Server
  • Connection

Example of response message :

HTTP/1.1 200 OK
Date: Mon, 27 Jul 2009 12:28:53 GMT
Server: Apache/2.2.14 (Win32)
Last-Modified: Wed, 22 Jul 2009 19:15:56 GMT
Content-Length: 88
Content-Type: text/html
Connection: close

<html>
<body>
<h1>Hello, World!</h1>
</body>
</html>
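
A client that receives a response has to find the empty line that separates the header section from the body. Here is a minimal sketch of that step, assuming the whole response is already in memory. The names split_response and build_sample_response, and the shortened sample message, are made up for the example.

```python
def split_response(raw: bytes):
    """Split a raw response into its Status-Line, a header dict, and the body bytes."""
    head, separator, body = raw.partition(b"\r\n\r\n")  # first empty line ends the headers
    if not separator:
        raise ValueError("no empty line between headers and body")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return status_line, headers, body


def build_sample_response() -> bytes:
    """Create a small response whose Content-Length matches its body."""
    body = b"<html><body><h1>Hello, World!</h1></body></html>"
    head = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/html\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


if __name__ == "__main__":
    print(split_response(build_sample_response()))
```

This keeps only the last value when a header name repeats, which is enough for a sketch but not for headers that may legitimately appear more than once.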

HTTP - Header Fields


HTTP header fields provide required information about the request or response, or about the object sent in the message body. There are four types of HTTP message headers:

General-header: These header fields have general applicability for both request and response messages.
Client Request-header: These header fields have applicability only for request messages.
Server Response-header: These header fields have applicability only for response messages.
Entity-header: These header fields define meta information about the entity-body or, if no body is present, about the resource identified by the request.

General Header: A few of the general header fields are listed below:

  • Cache-Control
  • Connection
  • Date
  • Pragma
  • Trailer

This covers the basic concepts of the HTTP protocol. If you have any questions or want to add anything, please comment below.

