Introduction To HTTP

14 min read · Jan 17, 2023

HTTP, or Hypertext Transfer Protocol, is the foundation of communication on the World Wide Web. It is a set of rules for transferring files, such as text, images, and videos, between web servers and clients, which are typically web browsers.

HTTP enables the transfer of data between different computers and networks, allowing users to access and share information from anywhere in the world. The protocol is based on a client-server model, where a client, such as a web browser, sends a request to a server, which then sends back a response.

One of the key features of HTTP is a standardized message format that both the client and the server understand. Every message consists of a start line, headers, and an optional message body. In a request, the start line is the request line, which contains the method, the URI, and the version of HTTP being used; in a response, it is the status line. The headers provide additional information about the request or response, such as the type of content being sent and any cookies associated with the request. The message body, which is optional, contains the actual data being transferred.
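
To make the message layout concrete, here is a minimal Python sketch that splits a raw HTTP/1.1 request into its request line, headers, and body. The parse_request function, the /comments path, and the sample payload are made up for illustration, and the parsing is deliberately simplified (real servers handle many more edge cases).

```python
def parse_request(raw: str) -> dict:
    """Split a raw HTTP/1.1 request into its request line, headers, and body."""
    # The headers and the body are separated by one empty line (CRLF CRLF).
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")

    # The request line holds the method, the request target (URI), and the version.
    method, target, version = lines[0].split(" ", 2)

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()

    return {
        "method": method,
        "target": target,
        "version": version,
        "headers": headers,
        "body": body,
    }


# Hypothetical request used only to exercise the parser.
raw_request = (
    "POST /comments HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "Content-Type: text/plain\r\n"
    "Content-Length: 5\r\n"
    "\r\n"
    "hello"
)
request = parse_request(raw_request)
print(request["method"], request["target"], request["version"])
print(request["headers"]["content-type"])
print(request["body"])
```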

Another important aspect of HTTP is its support for various methods, or ways in which a client can request information from a server. The most commonly used methods are GET, which retrieves information from the server, and POST, which sends information to the server. Other methods include PUT, which updates information on the server, and DELETE, which deletes information from the server.

HTTP itself is a request-response protocol: for every request the client sends, the server returns a response. Applications can still use it in different ways. In a one-way, or fire-and-forget, style, the client sends a request and ignores the response, apart from perhaps checking that the request was accepted. In a two-way style, the client sends a request, waits for the response, and acts on its contents. The two-way style is commonly used for interactive applications, such as online shopping and social media.

In conclusion, HTTP is a fundamental protocol that allows for the transfer of data between web servers and clients. It provides a standardized message format, supports various methods, and enables different types of communication. Without HTTP, the World Wide Web as we know it would not be possible.

Versions

There are several versions of the HTTP protocol that have been developed and standardized over the years. The main versions are:

HTTP/0.9: The first version of HTTP, which was released in 1991. This version is considered to be very basic and only supports a single method, GET, which is used to retrieve information from the server.
HTTP/1.0: The first official version of HTTP, which was released in 1996. This version added support for additional methods, such as POST and HEAD, and introduced the concept of headers and status codes.
HTTP/1.1: For many years the most widely used version of HTTP, which was first standardized in 1997 (RFC 2068) and revised in 1999 (RFC 2616). This version made several improvements over HTTP/1.0, such as persistent connections, better support for caching, and the introduction of the Host header.
HTTP/2: A major revision of HTTP, which was published in 2015 (RFC 7540). This version introduced a number of new features, including multiplexing, server push, and header compression, which significantly improve the performance and efficiency of communication over the web.
HTTP/3: The latest version of HTTP, which was published as a standard in 2022 (RFC 9114). This version runs over the QUIC protocol, which is built on UDP rather than TCP, and aims to improve the security and performance of web communication.

It’s worth noting that HTTP/2 and HTTP/3 are not supported by every client and server, and many websites still serve traffic over HTTP/1.1.

Methods

HTTP defines several methods, also known as verbs, that indicate the desired action to be performed on the identified resource. The most common HTTP methods are:

GET: Retrieves a representation of the resource identified by a URI (Uniform Resource Identifier). This is the most widely used method.
POST: Submits data to the server for processing, typically to create a new resource or update an existing one.
PUT: Completely replaces the current representation of a resource with the data in the request.
DELETE: Deletes the specified resource from the server.
HEAD: Works like GET, but retrieves only the headers of a resource rather than the resource itself.
OPTIONS: Retrieves the communication options, such as the allowed methods, that are available for a resource.
PATCH: Applies partial modifications to a resource, rather than replacing it entirely.
CONNECT: Establishes a tunnel to a destination server, typically through a proxy.
TRACE: Asks the server to echo back the request message it received, typically for debugging purposes.

These are the most widely used methods, but there are less common ones, such as the WebDAV methods PROPFIND, PROPPATCH, MKCOL, COPY, MOVE, LOCK, and UNLOCK.

POST vs PUT and GET vs HEAD
The main difference between the HTTP methods POST and PUT is the way in which they handle the resource being acted upon.

POST is used to submit an entity to a specified resource, often causing a change in state or side effects on the server. It’s used to create a new resource, or to update an existing one by sending data to the server.
PUT, on the other hand, is used to completely replace the current representation of a resource with new data. It’s typically used to update an existing resource, or to create a new resource if one does not already exist.

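PUT is also idempotent: calling it once or several times in succession leaves the server in the same state. POST offers no such guarantee, so repeating a POST may create another resource or trigger the same side effect again.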
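
The difference is easier to see with a toy example. The Python sketch below uses a hypothetical in-memory ResourceStore class as a stand-in for a server; it is not how any particular framework works, only an illustration of repeating the same POST versus repeating the same PUT.

```python
import itertools


class ResourceStore:
    """Hypothetical in-memory stand-in for a server-side collection."""

    def __init__(self):
        self.items = {}
        self._ids = itertools.count(1)

    def post(self, data):
        # POST: the server picks the identifier, so every call adds a new item.
        new_id = next(self._ids)
        self.items[new_id] = data
        return new_id

    def put(self, item_id, data):
        # PUT: the client names the resource and its representation is replaced.
        self.items[item_id] = data


store = ResourceStore()
store.post({"name": "alice"})
store.post({"name": "alice"})    # same payload, but a second resource appears
count_after_posts = len(store.items)

store.put(10, {"name": "bob"})
store.put(10, {"name": "bob"})   # same request repeated, same final state
count_after_puts = len(store.items)

print(count_after_posts, count_after_puts)
```

Two identical POST calls leave two resources behind, while two identical PUT calls leave exactly one.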

The main difference between the HTTP methods GET and HEAD is the amount of information returned in the response.

GET is used to retrieve information from the server and it returns the entire representation of the resource, including the headers and the body.
HEAD, on the other hand, is used to retrieve only the headers of a specified resource, with no body. It can be used to check whether a resource exists or to retrieve its metadata without requesting the entire resource.
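
As a rough illustration, the following Python sketch models a server-side handler that answers both methods. The handle function and the PAGES table are hypothetical names, and no real network traffic is involved; the point is only that HEAD produces the same headers as GET with an empty body.

```python
# Hypothetical content served by the toy handler below.
PAGES = {"/example": b"<html><body>Hello</body></html>"}


def handle(method: str, path: str):
    """Return (status, headers, body) for a GET or HEAD request."""
    if method not in ("GET", "HEAD"):
        return 405, {"Allow": "GET, HEAD"}, b""

    content = PAGES.get(path)
    if content is None:
        return 404, {"Content-Length": "0"}, b""

    headers = {
        "Content-Type": "text/html; charset=UTF-8",
        "Content-Length": str(len(content)),
    }
    # HEAD sends the same headers GET would send, but leaves the body out.
    body = content if method == "GET" else b""
    return 200, headers, body


get_status, get_headers, get_body = handle("GET", "/example")
head_status, head_headers, head_body = handle("HEAD", "/example")
print(get_headers == head_headers, len(get_body), len(head_body))
```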

In summary, PUT is used to update or create a resource by replacing the current representation, while POST is used to create or update a resource by submitting data to the server. GET returns the entire representation of a resource, while HEAD returns only the headers.

Headers
HTTP headers are used to provide additional information about the request or response message. Both the request and response headers are composed of a set of fields, each with its own name and value.

Here is an example of a request header:

GET /example HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:120.0) Gecko/20100101 Firefox/120.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Connection: keep-alive
Cookie: session=abcdefghijklmnopqrstuvwxyz

The first line is the request line, which contains the method (GET), the URI (/example), and the version of HTTP (HTTP/1.1) being used.
The Host field indicates the hostname and, optionally, the port number of the server being requested. This field is required by HTTP/1.1.
The User-Agent field is used to identify the client software and version.
The Accept field is used to indicate the types of content that the client is able to handle.
The Accept-Language field is used to indicate the preferred language for the response.
The Accept-Encoding field is used to indicate the types of content encoding that the client is able to handle.
The Connection field is used to indicate whether the client wants to maintain a persistent connection to the server.
The Cookie field is used to send one or more cookies to the server.

Now let’s take a look at an example of a response header:

HTTP/1.1 200 OK
Server: Apache/2.4.38 (Debian)
Date: Mon, 11 Jan 2021 12:00:00 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 1234
Connection: keep-alive

The first line is the status line, which contains the version of HTTP (HTTP/1.1), the status code (200), and the reason phrase (OK).
The Server field is used to indicate the software and version of the server that generated the response.
The Date field is used to indicate the date and time of the response.
The Content-Type field is used to indicate the media type of the response body.
The Content-Length field is used to indicate the length of the response body in bytes.
The Connection field is used to indicate whether the server wants to maintain a persistent connection to the client.
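
Reading these fields in code is straightforward. The sketch below is one possible approach in Python, not the only one: it splits off the status line by hand and then reuses the standard library's e-mail header parser, since HTTP header fields share the same "Name: value" layout. The sample response is a shortened copy of the example above.

```python
from email.parser import Parser

raw_response_head = (
    "HTTP/1.1 200 OK\r\n"
    "Server: Apache/2.4.38 (Debian)\r\n"
    "Content-Type: text/html; charset=UTF-8\r\n"
    "Content-Length: 1234\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)

# The status line is the first line: version, status code, reason phrase.
status_line, _, header_block = raw_response_head.partition("\r\n")
version, code, reason = status_line.split(" ", 2)

# Parse the remaining "Name: value" lines; lookups are case-insensitive.
headers = Parser().parsestr(header_block, headersonly=True)

print(version, int(code), reason)
print(headers["content-type"])
print(int(headers["Content-Length"]))
```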

There are many other headers that can be used depending on the context, such as authentication headers, caching headers, location headers and so on.

There are several resources on the internet that provide comprehensive guides on all possible HTTP headers. Here are a few examples:

The HTTP/1.1 specification published by the Internet Engineering Task Force (IETF) provides detailed information on the standard headers. The 1999 edition can be found at: https://tools.ietf.org/html/rfc2616. Note that it has since been obsoleted, most recently by RFC 9110 (HTTP Semantics) and RFC 9112 (HTTP/1.1), which are now the official standard.
The Mozilla Developer Network (MDN) provides a comprehensive guide on all possible HTTP headers, including examples and explanations of how they are used. The guide can be found at: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers
The W3C website provides a guide on all possible headers for HTTP/1.1 and HTTP/2, including explanations of how they are used, and examples. The guide can be found at: https://www.w3.org/Protocols/HTTP/
The OWASP website provides a guide on all possible headers and how to use them securely. The guide can be found at: https://owasp.org/www-community/HttpHeaders
The IANA website provides a list of all official headers and their meanings. The list can be found at: https://www.iana.org/assignments/message-headers/message-headers.xhtml

All of these resources are considered reputable and provide detailed information on all possible headers. They are a great reference for web developers and security professionals.

Response codes

HTTP defines a set of standard response codes that indicate the status of a request. These codes are grouped into classes based on their first digit. Here is a list of the most common response codes:

1xx (Informational): The request was received, but the server is still processing it.

100 Continue: The server has received the request headers and the client should proceed to send the request body.
101 Switching Protocols: The server agrees to switch to the protocol the client asked for in its Upgrade header, such as upgrading from HTTP/1.1 to WebSocket.

2xx (Successful): The request was successfully received, understood, and accepted.

200 OK: The request was successful and the server has returned the requested resource.
201 Created: The request was successful and a new resource was created as a result.
204 No Content: The request was successful, but there is no resource to return.

3xx (Redirection): The request needs further action to be fulfilled, such as following a redirect.

301 Moved Permanently: The resource has been permanently moved to a new URI.
302 Found: The resource has been temporarily moved to a new URI.
304 Not Modified: The resource has not been modified since the last request and the client can use the cached version.

4xx (Client Error): The request contains bad syntax or cannot be fulfilled by the server.

400 Bad Request: The request is malformed or invalid.
401 Unauthorized: The request requires user authentication.
403 Forbidden: The server refuses to fulfill the request.
404 Not Found: The requested resource could not be found.
405 Method Not Allowed: The request method is not allowed for the requested resource.
429 Too Many Requests: The user has sent too many requests in a given amount of time.

5xx (Server Error): The server failed to fulfill a valid request.

500 Internal Server Error: The server encountered an unexpected condition that prevented it from fulfilling the request.
503 Service Unavailable: The server is currently unable to handle the request due to a temporary overload or maintenance.

These are the most common response codes, but there are many less common ones, such as 102 Processing, 207 Multi-Status, and 308 Permanent Redirect. Each of these codes is designed to convey different information about the request, and it’s important for the client to interpret its meaning correctly and take the appropriate action. The resources mentioned in the previous section also provide comprehensive lists of all HTTP response codes.
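
Because the class of a response code is simply its first digit, a client can branch on it with integer division. Here is a small Python sketch using the standard library's HTTPStatus enum; the describe function and the STATUS_CLASSES table are names chosen for this example.

```python
from http import HTTPStatus

# The first digit of the status code identifies its class.
STATUS_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe(code: int) -> str:
    """Return a label such as '404 Not Found (Client Error)'."""
    status = HTTPStatus(code)  # raises ValueError for codes it does not know
    status_class = STATUS_CLASSES[code // 100]
    return f"{code} {status.phrase} ({status_class})"


for code in (200, 301, 404, 503):
    print(describe(code))
```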

HTTPS

HTTPS (Hypertext Transfer Protocol Secure) is an extension of the standard HTTP protocol that provides secure communication over the internet. It uses TLS (Transport Layer Security), the successor to the now-deprecated SSL (Secure Sockets Layer), to encrypt the data being transferred between the client and the server. This helps to protect sensitive information, such as login credentials or credit card numbers, from being intercepted or tampered with by a third party.

When a client makes a request to an HTTPS website, the browser first establishes an SSL/TLS connection with the server. Once the connection is established, the browser sends the request to the server over an encrypted channel. The server then processes the request and sends the response back to the browser over the same encrypted channel.

Here’s an example of how HTTPS works:

A user visits a website that uses HTTPS, such as https://www.example.com.
The browser establishes an SSL/TLS connection with the server by performing the SSL/TLS Handshake.
The browser sends a request to the server, such as a GET request to retrieve the website’s home page.
The server receives the request and processes it.
The server sends the response back to the browser, which includes the website’s home page and any additional resources (images, scripts, etc.)
The browser receives the response and decrypts the data using the session keys generated during the SSL/TLS Handshake.
The browser then renders the website’s home page and resources for the user to view.
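
From a program's point of view, most of these steps are handled by the TLS library. The following Python sketch shows roughly what a client does, using only the standard library; the fetch_home_page function is a name made up for this example, and the final call is left commented out because it needs network access.

```python
import http.client
import ssl

# A default context verifies the server certificate against the system's
# trusted authorities and checks that it matches the requested hostname.
context = ssl.create_default_context()
context.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocol versions


def fetch_home_page(host: str):
    """GET / over HTTPS and return (status, body)."""
    connection = http.client.HTTPSConnection(host, 443, context=context, timeout=10)
    try:
        # The TLS handshake runs when the connection is opened, before the
        # request is sent; everything after that travels encrypted.
        connection.request("GET", "/")
        response = connection.getresponse()
        return response.status, response.read()
    finally:
        connection.close()


# Example call (requires network access):
# status, body = fetch_home_page("www.example.com")
```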

In contrast, when a client makes a request to an HTTP website, the request and response are sent in plain text, which makes it easy for a third party to intercept and read the data.

It’s worth noting that in order for a website to use HTTPS, it needs to have a valid SSL/TLS certificate installed on the server. This certificate is used to authenticate the server’s identity and to establish the encrypted connection.

It’s also worth noting that HTTPS is increasingly important: browsers require it for many features, such as geolocation, camera access, and storage access, and search engines use it as a ranking signal.

Cookies

Cookies are small pieces of data that a website’s server asks the user’s browser to store on the user’s device. They are used to remember information about the user’s browsing activity and preferences, such as login state, shopping cart contents, or language preference. Cookies are sent by the website’s server to the user’s browser and are included in subsequent requests made to the same website. This allows the website to remember information about the user’s previous visits and to provide a more personalized experience.

Here’s an example of how cookies work:

A user visits a website that requires the user to log in. The user enters their login credentials and submits the form.
The website’s server receives the login request and verifies the credentials. If they are valid, the server sends a response to the browser, along with a “Set-Cookie” header. This header contains the name and value of the cookie, and any additional attributes such as the expiration date and domain.
The browser stores the cookie on the user’s device.
On the next request, the browser sends the cookie back to the server as a “Cookie” header.
The server receives the request and checks the cookie. If the cookie is valid, the server uses the information stored in the cookie to personalize the user’s experience or to keep the user logged in.
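
The exchange above can be sketched with Python's standard http.cookies module. The cookie name session, the value abc123, and the theme cookie are placeholders invented for this example; a real server would generate an unpredictable session identifier.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie header after a successful login.
outgoing = SimpleCookie()
outgoing["session"] = "abc123"           # hypothetical session identifier
outgoing["session"]["path"] = "/"
outgoing["session"]["max-age"] = 3600    # expire after one hour
outgoing["session"]["httponly"] = True   # hide the cookie from page scripts
outgoing["session"]["secure"] = True     # only send it over HTTPS
set_cookie_value = outgoing["session"].OutputString()
print("Set-Cookie:", set_cookie_value)

# Server side, on the next request: read the Cookie header the browser sent.
incoming = SimpleCookie()
incoming.load("session=abc123; theme=dark")
session_id = incoming["session"].value
print(session_id, incoming["theme"].value)
```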

Cookies are used for a variety of purposes, such as to remember login credentials, to keep track of shopping cart contents, to show targeted ads, or to remember user preferences. Cookies can be deleted by the user at any time, and many web browsers allow users to control the use of cookies by setting their own preferences. For example, users can block all cookies, block only third-party cookies, or block cookies from specific websites. Some browsers also allow users to view and delete individual cookies.

It’s worth noting that cookies raise privacy concerns, as they allow websites to track a user’s browsing activity and personalize the experience without the user’s explicit consent. Many websites now ask users to accept the use of cookies before they can access the website’s content, and some browsers now block third-party cookies by default.

Additionally, cookies can be divided into session cookies and persistent cookies. Session cookies are kept in the browser’s memory only for the duration of the user’s visit and are deleted when the browser is closed. Persistent cookies are stored on the user’s device, survive a browser restart, and are removed once their expiration date has passed.

Cookies are widely used in web development and are an important feature of many web applications. However, it’s important to be aware of their privacy implications and to provide users with control over their use.

Caching
HTTP caching is a technique used to temporarily store a copy of a resource on a client’s device, so that it can be quickly retrieved the next time the same resource is requested. Caching can reduce the amount of data that needs to be transferred over the network, which can improve the performance and responsiveness of a website.

There are two types of caching that can occur with HTTP: browser caching and proxy caching.

Browser caching: When a browser requests a resource from a server, the server can include “Cache-Control” and “Expires” headers in the response. These headers provide information to the browser about how long the resource can be cached and when it should be considered stale. The browser will then cache the resource and use the cached copy for subsequent requests, rather than requesting the resource again from the server.
Proxy caching: In addition to browser caching, resources can also be cached by intermediaries known as proxies. These are servers that sit between the client and the origin server and are used to improve performance and reduce network traffic. When a proxy receives a request for a resource, it can check its cache to see if a copy of the resource is already stored. If it is, the proxy can return the cached copy to the client, rather than forwarding the request to the origin server.

Caching can improve the performance of a website by reducing the number of requests that need to be sent to the server, but it can also cause problems if the cached resources become stale. To control caching, the server can set response headers such as Cache-Control, Expires, ETag, and Last-Modified. The client can then revalidate a cached copy by sending the request headers If-None-Match (with the stored ETag) or If-Modified-Since (with the stored date), and the server answers 304 Not Modified if the copy is still current. The Cache-Control header, for example, can be used to specify whether a resource is cacheable, how long it can be cached, and whether it should be revalidated before being used.
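
To illustrate revalidation, here is a simplified Python sketch of a server that answers with 304 Not Modified when the client's cached copy is still current. The respond and make_etag functions are hypothetical, the 60-second max-age is an arbitrary choice, and the comparison ignores details such as weak validators or multiple ETags in If-None-Match.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a validator from the content; any stable fingerprint would do."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, request_headers: dict):
    """Return (status, headers, body), using 304 when the cached copy is current."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}
    if request_headers.get("If-None-Match") == etag:
        # The client's copy is still valid: send headers only, no body.
        return 304, headers, b""
    return 200, headers, body


page = b"<h1>Hello</h1>"  # hypothetical resource

# First visit: no cached copy, so the full response is returned.
status1, headers1, body1 = respond(page, {})

# Revalidation: the client sends back the ETag it stored earlier.
status2, _, body2 = respond(page, {"If-None-Match": headers1["ETag"]})

# The resource changed on the server, so the old ETag no longer matches.
status3, _, body3 = respond(b"<h1>Changed</h1>", {"If-None-Match": headers1["ETag"]})

print(status1, status2, status3)
```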

It’s important to note that caching can cause stale data to be served to the user in case of dynamic content or if the server didn’t set the correct headers. Therefore, it’s important to test and tune caching headers to ensure that the website performance is optimal while providing the correct and updated content to the users.

Compression
HTTP compression is a technique used to reduce the amount of data that needs to be transferred over the network by compressing the response body before it is sent to the client. This can improve the performance of a website by reducing the amount of time it takes to load resources, especially for large files or slow connections.

There are several different algorithms that can be used for HTTP compression, such as GZIP, DEFLATE, and Brotli (br). The most commonly used algorithm is GZIP.

The process of compression works as follows:

The client sends a request to the server and includes an “Accept-Encoding” header, which specifies the compression algorithms that the client supports.
The server receives the request and checks the “Accept-Encoding” header to see if the client supports the compression algorithm that the server wants to use.
If the client supports the algorithm, the server compresses the response body using the specified algorithm before sending it back to the client.
The server also includes a “Content-Encoding” header in the response, which specifies the algorithm that was used to compress the response body.
The client receives the response, decompresses the response body using the specified algorithm, and then processes the response as usual.
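
The steps above can be sketched in a few lines of Python with the standard gzip module. The compress_response and read_response functions are names invented for this example, and the Accept-Encoding parsing is intentionally simplified (it ignores quality values such as "gzip;q=0").

```python
import gzip


def compress_response(body: bytes, accept_encoding: str):
    """Server side: return (headers, payload), using gzip only if the client allows it."""
    # Simplified parsing: keep the encoding names and drop any ";q=" parameters.
    offered = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    }
    if "gzip" in offered:
        return {"Content-Encoding": "gzip"}, gzip.compress(body)
    return {}, body


def read_response(headers: dict, payload: bytes) -> bytes:
    """Client side: undo the encoding named in Content-Encoding, if any."""
    if headers.get("Content-Encoding") == "gzip":
        return gzip.decompress(payload)
    return payload


original = b"Hello, HTTP! " * 200  # repetitive text compresses well

headers, payload = compress_response(original, "gzip, deflate, br")
restored = read_response(headers, payload)
print(len(original), len(payload), restored == original)

# A client that does not offer gzip receives the body unchanged.
plain_headers, plain_payload = compress_response(original, "identity")
```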

It’s worth noting that compression adds some overhead, as the server needs to compress the data and the client needs to decompress it, but it can save a lot of bandwidth, especially for large files and slow connections. It’s also important to note that not all content benefits from compression: formats that are already compressed, such as most images and videos, gain little or nothing.

Compression is configured on the server and is transparent to the user, since the browser negotiates and decodes it automatically. It is applied only when the client indicates that it supports it. Implementing HTTP compression can significantly improve the performance of a website, and it’s a recommended practice for web development.

I hope you enjoyed this introduction to HTTP. Please leave some claps if you enjoyed the content.


Written by JAVING