How does a website render in the browser
June 10, 2021
This series of posts is written for newcomers who have put all their effort into learning a new programming language and have had very little time left to get a grasp of the Internet and basic programming concepts. This is pretty common; I had, and still have, many gaps in my programming education. My main goal is to share knowledge in a simple way so that everyone can pick up at least the basic concepts. That's why I will try to explain things at the highest level of abstraction possible. At the same time, I'll fill my own gaps by writing on this blog. Remember that everyone makes mistakes, so if I misspelled a word or misunderstood a concept, feel free to jump in and correct me. Let's get started!
First I have to make one thing clear. A substantial part of newcomers code mindlessly. I did that, and there's a big chance that you did too. It's completely fine. We have a lot of time to fill the gaps in our knowledge. The first of the many gaps I had was that I was coding front-end with close to zero familiarity with how the Internet works. This happens because we focus on learning languages and use up whatever time is left to get a first coding job or to learn some new, hot technology. You can still code without knowing what routing is and earn a decent living, but for me, after a while it would feel dull not to get as much as I can out of a job that I do 40 or more hours a week.
0. Creation of the Internet
The history of the Internet started in the 1960s as a project funded by the US Department of Defense. It went through many stages of development and many research groups before it finally evolved into the network that we use every day. It opened up in the 1980s, mostly to universities and private companies, when ARPANET computers switched to the TCP/IP protocols that we use to this day. In April 1993 CERN put the web into the public domain. The World Wide Web was the creation of CERN employee Tim Berners-Lee. Thanks to him and CERN, the web became part of public infrastructure. The release of the World Wide Web included:
- a basic client, which at that time was limited to 2 options: Mosaic or the CERN browser
- a basic server
- a library of common code
1. DNS lookup
For a basic user, the Internet starts with typing the domain name of the website you are looking for into the browser's address bar. Our machine's first job is to do a DNS lookup, which means asking a DNS server for the IP address of the website. Every machine connected to the Internet has its own IP address, and in order to retrieve the files that make up a website we have to know the IP address of the website's server. In the simplest terms, DNS (Domain Name System) is a database that maps the names of websites to their IP addresses. It is a human-friendly layer over the IP addresses that machines use to communicate with each other. It is much easier for us to type google.com than its IP address. How can you manually check the IP address of, for example, google.com? Assuming you are a Linux user, you can find the IP address of any website by opening a terminal (it also works on macOS, I have tried it). Just type:
nslookup google.com   # <- insert domain name
The result that I got:
Server: 127.0.0.53
Address: 127.0.0.53#53

Non-authoritative answer:
Name: google.com
Address: 220.127.116.11 <- this is google.com's IP address
Name: google.com
Address: 2a00:1450:401b:805::200e
Now that you know Google's IP address, you can type the IPv4 address from the output above into the browser's address bar to open google.com!
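The same lookup can also be done from code. Here is a minimal Python sketch that asks the operating system's resolver through the standard library's `socket.getaddrinfo`. The helper name `resolve` is my own and not part of any library.

```python
import socket


def resolve(hostname):
    """Return the unique IP addresses the system resolver reports for hostname."""
    # proto=IPPROTO_TCP keeps the result to one entry per address.
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]          # sockaddr is (ip, port) or (ip, port, flow, scope)
        if ip not in addresses:
            addresses.append(ip)
    return addresses
```

On a machine with Internet access, calling `resolve("google.com")` should return a list with IPv4 and/or IPv6 addresses. The exact values will probably differ from mine, because they depend on where and when you ask.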
Many machines connected to the Internet host some part of the DNS database. These machines are called DNS servers. None of them hosts the whole database; each DNS server maintains only a piece of it. It is important to note that a DNS query may take a little while, because if one DNS server doesn't know the IP address of the domain you are looking for, the request is passed on to another server.
The concept of a cache is important to understand. A DNS query may complete immediately if your computer has cached earlier DNS results. By cache we mean temporarily saved data that is kept in the computer's memory. That's why, after you have opened a website and closed the tab, you can still load the same website again in almost no time.
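To make the idea of a cache more concrete, here is a small Python sketch. It is not how any real operating system or browser implements its DNS cache; the class name `DnsCache` and its parameters are made up for illustration. It only shows the principle: remember an answer for a limited time and skip the slow lookup while the answer is still fresh.

```python
import time


class DnsCache:
    """A tiny in-memory cache: hostname -> (ip, expiry time)."""

    def __init__(self, lookup, ttl_seconds=60.0, clock=time.monotonic):
        self._lookup = lookup      # function that performs the slow, real lookup
        self._ttl = ttl_seconds    # how long an answer stays fresh (assumed value)
        self._clock = clock        # injectable, so the sketch is easy to test
        self._entries = {}

    def resolve(self, hostname):
        now = self._clock()
        entry = self._entries.get(hostname)
        if entry is not None and entry[1] > now:
            return entry[0]                    # cache hit: no query needed
        ip = self._lookup(hostname)            # cache miss or expired: ask again
        self._entries[hostname] = (ip, now + self._ttl)
        return ip
```

In this sketch the time limit is one fixed number. In real DNS, each record comes with its own time to live (TTL) chosen by the owner of the domain.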
Of course, DNS is much more complicated than that, but for now this should be enough. If you want to read more, check out this website.
2. Browser makes a connection to a web server
Let's now go back in time a little, to the grandparent of all networks, ARPANET. ARPANET was a research network sponsored by the US Department of Defense. The creators of ARPANET developed the TCP/IP Reference Model, named after its two primary protocols. Over time, the network connected hundreds of universities and public institutions. Today we use ARPANET's successor, the Internet, every day, and the World Wide Web runs on top of it.
Here comes another term that we need to explain to better understand what the Internet actually is. After receiving an IP address from the DNS server, our machine builds a connection with the server at that IP address. This is done using the Transmission Control Protocol, or TCP. TCP is an Internet protocol that provides reliable, ordered delivery of a stream of data. Data is a very generic term; what TCP actually does is send packets, which are the smallest units of information transmitted over the Internet. TCP has many great features built in, like asking for missing pieces (packets) or putting them back in the correct order.
In its most basic form, TCP follows 4 steps while sending data:
- The protocol assigns a unique ID to each packet. Thanks to that, the protocol can identify and distinguish each packet.
- If everything goes right (the packet order is right and the recipient receives the data), the recipient sends a confirmation to the sender.
- After receiving the confirmation, the sender sends the next set of data, and so on until the task is complete.
- If something goes wrong, the sender keeps trying to send the data to the recipient.
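The four steps can be sketched as a small simulation in Python. This is a heavily simplified model and not real TCP, which numbers bytes, sends many packets at once, and relies on timers. The functions `send_reliably` and `make_lossy_channel` are hypothetical names, and the "network" is just a function that sometimes pretends to lose a packet.

```python
def send_reliably(chunks, channel, max_attempts=5):
    """Send chunks one at a time, resending each until it is acknowledged.

    `channel(packet)` stands in for the network: it returns the acknowledged
    packet ID, or None if the packet was lost on the way.
    """
    log = []
    for packet_id, data in enumerate(chunks):
        packet = (packet_id, data)              # step 1: give the packet an ID
        for attempt in range(1, max_attempts + 1):
            ack = channel(packet)
            if ack == packet_id:                # step 2: recipient confirmed it
                log.append((packet_id, attempt))
                break                           # step 3: move on to the next packet
            # step 4: no confirmation arrived, so the loop tries again
        else:
            raise ConnectionError(f"packet {packet_id} was never acknowledged")
    return log


def make_lossy_channel(received, drop_first_attempt_of):
    """Build a fake channel that loses the first attempt for the given packet IDs."""
    already_dropped = set()

    def channel(packet):
        packet_id, data = packet
        if packet_id in drop_first_attempt_of and packet_id not in already_dropped:
            already_dropped.add(packet_id)
            return None                         # simulate a lost packet
        received[packet_id] = data              # recipient stores the data by ID
        return packet_id                        # and sends a confirmation back

    return channel
```

For example, with `received = {}` and `channel = make_lossy_channel(received, drop_first_attempt_of={1})`, calling `send_reliably(["He", "ll", "o"], channel)` needs two attempts for packet 1, and the recipient still ends up with every piece, stored under its ID.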
3. Browser sends an HTTP request to the server
Once the connection is established, the browser can send an HTTP request to the server it connected to. By sending a request we can finally send or receive data (or, as I said before, packets). In our task of getting a website to load on our screen, the browser sends GET requests in order to receive the static files that make up the website we are looking for. Headers let us send additional information about the resource, for example its type. We can find them in both the requests and the responses exchanged between the client and the server. Requests can also contain a body. For example, a body can be found in the requests that we send from various forms on the Internet.
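To see what such a request looks like on the wire, here is a minimal Python sketch that builds the text of a GET request by hand. The function name `build_get_request` and the chosen headers are my assumptions for the example; a real browser sends many more headers.

```python
def build_get_request(host, path="/"):
    """Return the raw text of a minimal HTTP/1.1 GET request."""
    request_line = f"GET {path} HTTP/1.1"   # method, path, protocol version
    headers = [
        f"Host: {host}",                    # which website we want (required in HTTP/1.1)
        "Accept: text/html",                # the type of resource we would like to get
        "Connection: close",                # ask the server to close the connection afterwards
    ]
    # Lines are separated by CRLF; an empty line marks the end of the headers.
    # A GET request normally has no body, so nothing follows the empty line.
    return "\r\n".join([request_line, *headers]) + "\r\n\r\n"
```

Printing `build_get_request("example.com", "/index.html")` shows the request line first, then one header per line, then an empty line.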
4. Server sends back an HTTP response
The server then handles the request that we've sent and sends us an HTTP response through its response handler. A "response handler" is code that is executed after a request is made. If everything goes right, the server agrees to send the files of the website with the response 200 OK. After that, the server starts sending small chunks of data. In the case of responses, the body contains the resource that we asked for.
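A response has the same shape as a request: a status line, headers, an empty line, and then the body. Here is a rough Python sketch that takes a complete response apart. The function name `parse_response` is hypothetical, and the sketch ignores real-world details such as chunked or compressed bodies.

```python
def parse_response(raw):
    """Split raw HTTP response bytes into status code, reason, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")        # the empty line separates headers from body
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, status_code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return int(status_code), reason, headers, body
```

Given a response that starts with `HTTP/1.1 200 OK`, the function returns `200` and `"OK"` together with the headers and the body, which in our case would be the HTML of the website.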
5. Browser displays the content it received
First, the HTML is received and parsed. The browser looks for additional tags like <link> and performs additional GET requests in order to receive CSS, JS, and/or other resources like images. The browser parses the received files in the following order:
- HTML - the DOM structure is generated
- CSS - the parsed stylesheet is applied to the DOM structure. The website is ready to be rendered in the browser.
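The step where the browser scans the HTML for more things to download can be imitated with Python's built-in `html.parser` module. This is only a sketch of the idea and nothing like a real browser engine. The names `ResourceFinder` and `find_resources` are made up for this example.

```python
from html.parser import HTMLParser


class ResourceFinder(HTMLParser):
    """Collect URLs of extra resources referenced by an HTML document."""

    def __init__(self):
        super().__init__()
        self.resources = []   # (kind, url) pairs in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.resources.append(("css", attrs["href"]))
        elif tag == "script" and attrs.get("src"):
            self.resources.append(("js", attrs["src"]))
        elif tag == "img" and attrs.get("src"):
            self.resources.append(("image", attrs["src"]))


def find_resources(html):
    """Return every stylesheet, script, and image URL found in the HTML text."""
    finder = ResourceFinder()
    finder.feed(html)
    finder.close()
    return finder.resources
```

For a page that references `style.css`, `app.js`, and `logo.png` (hypothetical file names), `find_resources` returns those three URLs, and each of them would cost the browser one more GET request.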