Since its development in the early 1990s, the World Wide Web has become a central part of most people’s lives, enabling opportunities for work, research, commerce, and social connection, as well as an endless stream of cat videos. By opening up a web browser on an internet-enabled device, you can access a nearly endless universe of information.
One of the principles behind the design of the Web is openness: the disclosure and standardization of the rules by which web applications should operate and interact with each other. This openness enables interoperability, allowing web applications from different vendors that adhere to the agreed-upon rules to work correctly and reliably with one another.
The rich media experience supported on the modern internet has evolved from its original text-based concept. Yet the underlying mechanism for retrieving information from a website is still much the same as originally conceived. The key to this communication is called HTTP, a simple protocol that supports the Web.
What is HTTP and why do we need it?
HTTP stands for Hypertext Transfer Protocol. It’s a formally defined set of rules for communication between a client (the network resource requesting data or services) and a server (the resource that receives and responds to the request).
In other words: If the Internet is the infrastructure connecting web clients and servers, HTTP is the “language” they speak to each other over that connection. This is how we make web pages load and YouTube videos play.
Standardized computer network protocols ensure that hardware and software produced by different vendors can work together reliably. HTTP does this for web communications: it specifies the rules for resource requests and responses between web clients and servers.
HTTP is an application layer protocol in the seven-layer OSI networking model, which standardizes the communication functions of telecommunications or computing systems regardless of the underlying internal structure and technology. The definition and ongoing development of this protocol are now the responsibility of an international organization called the Internet Engineering Task Force (IETF).
HTTP is most commonly used with a web browser client (such as Chrome, Safari, or Edge) and a web server running on a computer system located somewhere on the Internet. HTTP supports many other web applications and services as well.
Important terminology
Some of the vocabulary used in this discussion may benefit from a bit more definition and discussion. These terms include:
- Protocol: An agreed-upon set of rules, conventions, and data structures that determine how resources exchange data.
- Server: A networked provider of resources or services to clients. For example, a web server may deliver web page content along with other services to web browsers that request those services.
- Client: A networked requester of services from a server. For example, a web browser (the client) may request to download a web page from a web server or to upload and send an email message via a mail server.
- Proxy: Outside of the internet, a proxy is a person who has the authority to act on behalf of another. For example, someone with power of attorney can act as a proxy for, or represent the legal interests of, another person. In the digital world, a proxy (or proxy server) is an application that acts as an intermediary for clients seeking resources from other servers. Common purposes include masking the identity of the requesting client, caching responses, and filtering traffic. Paid privacy services like iVPN, which users connect to for a fee in order to keep their data and activity protected online, play a similar intermediary role.
- Request: Generally, a request is the act of formally asking for something. In the context of HTTP, a request is a message sent from a client to a web server that specifies an action to be performed on a resource.
- URL: A Uniform Resource Locator, or URL, is a unique identifier for a web resource, specifying its location on the web and the mechanism that is to be used to retrieve it. URLs are sometimes called web addresses and are commonly used to link web pages to each other.
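- TLS/SSL: SSL refers to Secure Sockets Layer, a protocol for encrypting and securing communications over a computer network. Its successor, the more secure TLS (Transport Layer Security) protocol, was first standardized in 1999. The last SSL version, SSL 3.0, was formally deprecated in 2015 after a serious vulnerability (POODLE) was disclosed in 2014. TLS is what secures HTTP communications in the HTTPS protocol today, although the name “SSL” is still widely used informally.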
- HTML: HyperText Markup Language is used to define and format documents intended to be displayed in web browsers. It is one of the fundamental standards of the World Wide Web.
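To make the URL definition above a little more concrete, here is a minimal sketch that splits a URL into its parts with Python’s standard `urllib.parse` module. The helper name `describe_url` and the small default-port table are illustrative assumptions, not part of any standard API.

```python
from urllib.parse import urlsplit

# Well-known default ports for the two web schemes.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_url(url: str) -> dict:
    """Split a URL into the pieces a client needs to locate a resource."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,      # mechanism used to retrieve the resource
        "host": parts.hostname,      # server that holds the resource
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path or "/",   # location of the resource on that server
        "query": parts.query,        # optional parameters after "?"
    }


if __name__ == "__main__":
    print(describe_url("http://neeva.com/learn"))
```

The scheme tells the client which protocol to speak, the host and port tell it where to connect, and the path names the resource being asked for.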
How does HTTP work?
The World Wide Web uses HTTP to communicate between web clients and servers: requests for web pages, updates to web resources, status of requests, and so on. HTTP defines how the web client and server interact with each other. It specifies in detail the requests a client is allowed to make of a server, as well as how a server should respond. This covers both the syntax of the allowable communications (the permitted grammar) and their semantics (the intended meaning of requests and responses). It also defines the capabilities available to web clients as they interact with web servers.
In the client-server model, the client always initiates requests of a server. In web applications, a web browser is usually the client and a web server is the server. For example, a browser might ask “Send me the web page http://neeva.com/learn.” The server then sends back the appropriate response, perhaps “Here’s the HTML for that web page. Good luck!” or, alternatively, “That page doesn’t exist, sorry.” Requests and responses are addressed to the appropriate resources and are transmitted as HTTP messages over TCP/IP; an HTML document, when one is returned, travels in the body of the response message.
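The exchange described above can be sketched as plain text, since HTTP/1.1 messages are text lines separated by CRLF. The snippet below builds a GET request and parses a canned response without touching the network. The function names and the sample response are illustrative assumptions, and real clients handle many cases this sketch ignores.

```python
def build_get_request(host: str, path: str) -> str:
    """Return the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, target, version
        f"Host: {host}",          # required header in HTTP/1.1
        "Connection: close",
        "",                       # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines)


def parse_response(raw: str) -> tuple[int, dict, str]:
    """Split a raw response into (status code, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    _version, code, *_reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), headers, body


if __name__ == "__main__":
    print(build_get_request("neeva.com", "/learn"))
    sample = (
        "HTTP/1.1 404 Not Found\r\n"
        "Content-Type: text/html\r\n"
        "\r\n"
        "<h1>That page doesn't exist, sorry.</h1>"
    )
    print(parse_response(sample))
```

The status code on the first line of the response is how the server says “here it is” (200) or “that page doesn’t exist” (404).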
The HTTP/1.1 specification defines eight “verbs” by which a web client can request to retrieve, update, or manage content on a web server resource (a ninth, PATCH, was standardized later). These are known as HTTP request methods, and they can be grouped by purpose:
- GET, HEAD: For retrieving a resource (typically web content)
- POST, PUT: For updating a resource
- DELETE: For removing a resource
- TRACE, OPTIONS: For debugging messages and querying what a server supports
- CONNECT: For creating a network connection
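A few of these request methods can be tried against a throwaway local server. The sketch below uses Python’s standard `http.server` and `http.client` modules to store a resource with PUT, retrieve it with GET, and remove it with DELETE. The in-memory `STORE`, the `/note` path, and the choice of status codes are assumptions made for illustration, not a recommended server design.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

STORE = {}  # hypothetical in-memory "resources", keyed by URL path


class ResourceHandler(BaseHTTPRequestHandler):
    def _reply(self, status: int, body: bytes = b"") -> None:
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):  # retrieve a resource
        if self.path in STORE:
            self._reply(200, STORE[self.path])
        else:
            self._reply(404)

    def do_PUT(self):  # create or update a resource
        length = int(self.headers.get("Content-Length", 0))
        STORE[self.path] = self.rfile.read(length)
        self._reply(204)

    def do_DELETE(self):  # remove a resource
        existed = STORE.pop(self.path, None) is not None
        self._reply(204 if existed else 404)

    def log_message(self, *args):  # keep the demo output quiet
        pass


def run_demo() -> list:
    """Send PUT, GET, DELETE, GET and return the (method, status) pairs."""
    server = HTTPServer(("127.0.0.1", 0), ResourceHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    statuses = []
    try:
        steps = [("PUT", b"hello"), ("GET", None), ("DELETE", None), ("GET", None)]
        for method, body in steps:
            conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
            conn.request(method, "/note", body=body)
            response = conn.getresponse()
            response.read()
            statuses.append((method, response.status))
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return statuses


if __name__ == "__main__":
    print(run_demo())
```

The final GET is expected to fail with 404 because the DELETE before it removed the resource, which shows how the method alone changes what the same URL means to the server.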
A user can request a web page by typing an HTTP URL into a browser’s address bar. This causes the browser to formulate a request to the web server for fetching the HTML document at the specified URL. This typically uses the “GET” HTTP Request Method. If the requested document exists, and if the client user has permission to access it, a copy of it is returned to the browser, which is responsible for parsing and properly displaying the document to the user.
There may be several complicating factors here:
- Application security may prevent access to the requested resource.
- Cascading Style Sheets (CSS) may affect the layout.
- Additional linked sub-pages may need to be retrieved.
- Embedded code such as JavaScript may be retrieved and executed.
- All of this likely passes through web proxies which relay the requests and responses in both directions.
HTTP vs HTTPS
The internet is not a private communications channel. HTTP requests and responses are transferred in plain text. They can be easily intercepted and read by people or software other than their intended recipients. This might not be a big deal for your cat videos, but when dealing with passwords, credit card numbers, or other sensitive information, bad things can happen. The evolution of ecommerce in particular created an early need for more secure communication online.
Another problem with HTTP is “spoofing,” in which a malicious website impersonates the site being addressed and intercepts the communications. HTTP alone cannot guarantee the identity of the remote server, so without authentication a client can be fooled into communicating with an impostor.
HTTPS (an acronym for HTTP Secure) is an enhancement to HTTP. It allows for secure authentication of the remote server and the encrypted transfer of HTTP data. HTTPS relies on a digital certificate from a trusted third party (known as the “certificate authority”) to secure the connection and ensure that the site is legitimately who it says it is. It also uses TLS to encrypt the HTTP requests and responses, preventing sneaky scammers from stealing sensitive information.
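As a rough illustration of the two guarantees described above, Python’s standard `ssl` module exposes them as settings on a TLS context. The sketch below inspects the defaults that `ssl.create_default_context()` applies and shows how such a context would be handed to an HTTPS connection. The function names are illustrative, and `fetch_status` needs network access, so it is defined here but not called.

```python
import http.client
import ssl


def describe_default_tls_policy() -> dict:
    """Report how Python's default TLS context treats the remote server."""
    context = ssl.create_default_context()
    return {
        # The server must present a certificate chaining to a trusted CA.
        "verifies_certificate": context.verify_mode == ssl.CERT_REQUIRED,
        # The certificate must match the host name being contacted.
        "checks_hostname": context.check_hostname,
    }


def fetch_status(host: str, path: str = "/") -> int:
    """Make one HTTPS GET request and return the status code (uses the network)."""
    context = ssl.create_default_context()
    conn = http.client.HTTPSConnection(host, context=context, timeout=10)
    try:
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()


if __name__ == "__main__":
    print(describe_default_tls_policy())
```

If certificate or host name verification fails, the connection attempt raises an error before any HTTP data is exchanged, which is the behavior that protects against impostor sites.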
Both HTTP and HTTPS are commonly supported today. A link that you enter or click on may begin with http:// or https://, designating which protocol is to be used. Modern web browsers typically display a padlock or similar indicator in the address bar when using a secure HTTPS connection. They omit that graphic when using the less secure HTTP.
Is HTTP safe?
HTTP on its own is not safe for sensitive data: it neither encrypts traffic nor verifies the server’s identity. That matters because the privacy of our online activity is increasingly a priority for many individuals and businesses, and much of what we do online involves personal, work, financial, and other sensitive data. HTTPS ensures that you connect with the intended web service, and that the communication between your browser and the web service cannot be read by eavesdroppers.
Some modern browsers warn a user when they try to access a website using the HTTP protocol, calling it “not secure.” Because of this, more websites are specifically supporting HTTPS, and redirecting HTTP requests to a more secure HTTPS communications channel.
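The redirect mentioned above is usually expressed as a 3xx status code plus a `Location` header pointing at the https:// version of the same address. Here is a minimal sketch of that rewrite. The function names are hypothetical, the choice of 301 is one common option, and the code assumes the HTTPS site listens on the default port, so any explicit port in the original URL is dropped.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def https_redirect_target(url: str) -> Optional[str]:
    """Return the https:// equivalent of an http:// URL, or None if not needed."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return None  # already secure, or not a web URL at all
    host = parts.hostname or ""
    # Assumption: the secure site uses the default HTTPS port (443).
    return urlunsplit(("https", host, parts.path or "/", parts.query, parts.fragment))


def redirect_response(url: str) -> Optional[tuple]:
    """Sketch of the status code and headers a server might send back."""
    target = https_redirect_target(url)
    if target is None:
        return None
    return 301, {"Location": target}


if __name__ == "__main__":
    print(redirect_response("http://neeva.com/learn"))
```

A browser that receives this response repeats the request against the `Location` URL, so the user ends up on the encrypted connection without retyping the address.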