If you are citizen of an European Union member nation, you may not use this service unless you are at least 16 years old.
You already know Dokkio is an AI-powered assistant to organize & manage your digital files & messages. Very soon, Dokkio will support Outlook as well as One Drive. Check it out today!

how the web works intro

Page history last edited by Andrew Hill 12 years, 7 months ago

Intro | How links work | Client requests | Server response | Caching | Server-side programmes | Negotiation | Cookies | Logs | Authentication | TCP and IP | FAQ

Introduction

If you are looking for a really simple, and I mean simple, guide to how the web works, start with this BBC animation . There is also a summary on the Web architecture page of this site.

About this tutorial

This is an attempt to give a basic understanding of how the web works and was written because I saw so many articles on various news groups that plainly showed that people needed to learn this. Not knowing of any one place to find this info, I decided to collect it into an article. Hope you find it useful!

It covers the HTTP protocol, which is used to transmit and receive web pages, as well as some server workings and scripting technologies. It is assumed that you already know how to make web pages and preferably some HTML as well.. It is also assumed that you have some basic knowledge of URLs. (A URL is the address of a document, what you need to be able to get hold of the document.)

I'm not entirely happy with this, so feedback would be very welcome. If you're not really a technical person and this tutorial leaves you puzzled or does not answer all your questions I'd very much like to hear about it. Corrections and opinions are also welcome.

Some background

When you browse the web the situation is basically this: you sit at your computer and want to see a document somewhere on the web, to which you have the URL.

Since the document you want to read is somewhere else in the world and probably very far away from you some more details are needed to make it available to you. The first detail is your browser. You start it up and type the URL into it (at least you tell the browser somehow where you want to go, perhaps by clicking on a link).

However, the picture is still not complete, as the browser can't read the document directly from the disk where it's stored if that disk is on another continent. So for you to be able to read the document the computer that contains the document must run a web server. A web server is a just a computer program that listens for requests from browsers and then execute them.

So what happens next is that the browser contacts the server and requests that the server deliver the document to it. The server then gives a response which contains the document and the browser happily displays this to the user. The server also tells the browser what kind of document this is (HTML file, PDF file, ZIP file etc) and the browser then shows the document with the program it was configured to use for this kind of document.

The browser will display HTML documents directly, and if there are references to images, Java applets, sound clips etc in it and the browser has been set up to display these it will request these also from the servers on which they reside. (Usually the same server as the document, but not always.) It's worth noting that these will be separate requests, and add additional load to the server and network. When the user follows another link the whole sequence starts anew.

These requests and responses are issued in a special language called HTTP, which is short for HyperText Transfer Protocol. What this article basically does is describe how this works. Other common protocols that work in similar ways are FTP and Gopher, but there are also protocols that work in completely different ways. None of these are covered here, sorry. (There is a link to some more details about FTP in the references.)

It's worth noting that HTTP only defines what the browser and web server say to each other, not how they communicate. The actual work of moving bits and bytes back and forth across the network is done by TCP and IP, which are also used by FTP and Gopher (as well as most other internet protocols).

When you continue, note that any software program that does the same as a web browser (ie: retrieve documents from servers) is called a client in network terminology and a user agent in web terminology. Also note that the server is properly the server program, and not the computer on which the server is an application program. (Sometimes called the server machine.)