253
votes

I have done some web-based projects, but I haven't thought much about the load and execution sequence of an ordinary web page. Now I need to know the details. It's hard to find answers on Google or SO, so I created this question.

A sample page is like this:

<html>
 <head>
  <script src="jquery.js" type="text/javascript"></script>
  <script src="abc.js" type="text/javascript">
  </script>
  <link rel="stylesheet" type="text/css" href="abc.css">
  <style>h2{font-weight:bold;}</style>
  <script>
  $(document).ready(function(){
     $("#img").attr("src", "kkk.png");
  });
 </script>
 </head>
 <body>
    <img id="img" src="abc.jpg" style="width:400px;height:300px;"/>
    <script src="kkk.js" type="text/javascript"></script>
 </body>
</html>

So here are my questions:

  1. How does this page load?
  2. What is the sequence of the loading?
  3. When is the JS code executed? (inline and external)
  4. When is the CSS executed (applied)?
  5. When does $(document).ready get executed?
  6. Will abc.jpg be downloaded? Or does it just download kkk.png?

I have the following understanding:

  1. The browser loads the html (DOM) at first.
  2. The browser starts to load the external resources from top to bottom, line by line.
  3. If a <script> is met, the loading will be blocked and wait until the JS file is loaded and executed and then continue.
  4. Other resources (CSS/images) are loaded in parallel and executed if needed (like CSS).

Or is it like this:

The browser parses the html (DOM) and gets the external resources in an array or stack-like structure. After the html is loaded, the browser starts to load the external resources in the structure in parallel and execute, until all resources are loaded. Then the DOM will be changed corresponding to the user's behaviors depending on the JS.

Can anyone give a detailed explanation of what happens when you've got the response for an HTML page? Does this vary between browsers? Are there any references on this question?

Thanks.

EDIT:

I did an experiment in Firefox with Firebug, and the result is shown in the following image: [image]

7
Steve Souders has done a great deal of work in this field. Google for steve+souders+high+performance and have a look. - anddoutoi
I don't mean performance tuning. I want to know the detail. - Zhu Tao
By reading his work my understanding of how "it" works in detail increased tenfold, so it's still a valid comment. I'm not allowed by copyright to quote his whole book here, so I still suggest you look his work up. - anddoutoi
A great description of the order in which things happen is here - Gerrat

7 Answers

286
votes

According to your sample,

<html>
 <head>
  <script src="jquery.js" type="text/javascript"></script>
  <script src="abc.js" type="text/javascript">
  </script>
  <link rel="stylesheet" type="text/css" href="abc.css">
  <style>h2{font-weight:bold;}</style>
  <script>
  $(document).ready(function(){
     $("#img").attr("src", "kkk.png");
  });
 </script>
 </head>
 <body>
    <img id="img" src="abc.jpg" style="width:400px;height:300px;"/>
    <script src="kkk.js" type="text/javascript"></script>
 </body>
</html>

the execution flow is roughly as follows:

  1. The HTML document gets downloaded
  2. The parsing of the HTML document starts
  3. HTML Parsing reaches <script src="jquery.js" ...
  4. jquery.js is downloaded, parsed and run
  5. HTML parsing reaches <script src="abc.js" ...
  6. abc.js is downloaded, parsed and run
  7. HTML parsing reaches <link href="abc.css" ...
  8. abc.css is downloaded and parsed
  9. HTML parsing reaches <style>...</style>
  10. Internal CSS rules are parsed and defined
  11. HTML parsing reaches <script>...</script>
  12. Internal Javascript is parsed and run
  13. HTML Parsing reaches <img src="abc.jpg" ...
  14. abc.jpg is downloaded and displayed
  15. HTML Parsing reaches <script src="kkk.js" ...
  16. kkk.js is downloaded, parsed and run
  17. Parsing of HTML document ends

Note that downloads may be asynchronous and non-blocking, depending on the browser. For example, Firefox has a setting that limits the number of simultaneous requests per domain.

Also depending on whether the component has already been cached or not, the component may not be requested again in a near-future request. If the component has been cached, the component will be loaded from the cache instead of the actual URL.

When parsing ends and the DOM is ready, the DOMContentLoaded event fires, and this is when the $(document).ready() callback containing $("#img").attr("src","kkk.png"); is run. So:

  1. Document is ready, the ready callback is invoked.
  2. Javascript execution hits $("#img").attr("src", "kkk.png");
  3. kkk.png is downloaded and loads into #img

The $(document).ready() event fires once the DOM has been fully parsed. It does not wait for images and other resources to finish loading; that is what the window onload event does. Read more about it: http://docs.jquery.com/Tutorials:Introducing_$(document).ready()

Edit - This portion elaborates more on the parallel or not part:

By default, and from my current understanding, the browser usually handles each page with three components: the HTML parser, Javascript/DOM, and CSS.

The HTML parser is responsible for parsing and interpreting the markup language and thus must be able to make calls to the other 2 components.

For example when the parser comes across this line:

<a href="#" onclick="alert('test');return false;" style="font-weight:bold">a hypertext link</a>

The parser will make 3 calls, two to Javascript and one to CSS. Firstly, the parser will create this element and register it in the DOM namespace, together with all the attributes related to this element. Secondly, the parser will call to bind the onclick event to this particular element. Lastly, it will make another call to the CSS thread to apply the CSS style to this particular element.

Execution is top-down and single-threaded. Javascript may look multi-threaded, but it is in fact single-threaded. This is why parsing of the main HTML page is suspended while an external Javascript file is loaded.
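
A minimal sketch of that single-threaded behaviour in plain JavaScript. The names `events` and `busyWait` are made up for illustration. A timer scheduled for 0 ms cannot run while synchronous code is still executing, which is the same reason a blocking script holds up the parser.

```javascript
// Hypothetical log of what ran, in order.
const events = [];

// Hypothetical helper: keeps the single JavaScript thread busy for roughly `ms` milliseconds.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // Spin: nothing else can run on this thread in the meantime.
  }
}

// Ask for a callback "as soon as possible".
setTimeout(() => events.push('timer callback'), 0);

events.push('sync start');
busyWait(50);
events.push('sync end');

// The timer callback has not run yet; it has to wait for the current script to finish.
console.log(events.join(' -> ')); // prints "sync start -> sync end"
```

The timer callback only runs after this script has finished, however long `busyWait` takes.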

However, CSS files can be downloaded simultaneously because CSS rules are always being applied - meaning that elements are always repainted with the freshest CSS rules defined - which makes them non-blocking.

An element is only available in the DOM after it has been parsed. Thus, a script that works with a specific element is always placed after that element, or inside the window onload event.

A script like this will not work (with jQuery):

<script type="text/javascript">/* <![CDATA[ */
  alert($("#mydiv").html());
/* ]]> */</script>
<div id="mydiv">Hello World</div>

Because when the script runs, the #mydiv element has not been parsed yet. Instead this would work:

<div id="mydiv">Hello World</div>
<script type="text/javascript">/* <![CDATA[ */
  alert($("#mydiv").html());
/* ]]> */</script>

OR

<script type="text/javascript">/* <![CDATA[ */
  $(document).ready(function(){
    alert($("#mydiv").html());
  });
/* ]]> */</script>
<div id="mydiv">Hello World</div>
36
votes

1) HTML is downloaded.

2) HTML is parsed progressively. When a request for an asset is reached the browser will attempt to download the asset. A default configuration for most HTTP servers and most browsers is to process only two requests in parallel. IE can be reconfigured to download an unlimited number of assets in parallel. Steve Souders has been able to download over 100 requests in parallel on IE. The exception is that script requests block parallel asset requests in IE. This is why it is highly suggested to put all JavaScript in external JavaScript files and put the request just prior to the closing body tag in the HTML.

3) Once the HTML is parsed the DOM is rendered. CSS is rendered in parallel to the rendering of the DOM in nearly all user agents. As a result it is strongly recommended to put all CSS code into external CSS files that are requested as high as possible in the <head></head> section of the document. Otherwise the page is rendered up to the occurrence of the CSS request position in the DOM and then rendering starts over from the top.

4) Only after the DOM is completely rendered and requests for all assets in the page are either resolved or time out does JavaScript execute from the onload event. IE7, and I am not sure about IE8, does not time out assets quickly if an HTTP response is not received from the asset request. This means an asset requested by JavaScript inline to the page, that is JavaScript written into HTML tags that is not contained in a function, can prevent the execution of the onload event for hours. This problem can be triggered if such inline code exists in the page and fails to execute due to a namespace collision that causes a code crash.

Of the above steps, the one that is most CPU intensive is the parsing of the DOM/CSS. If you want your page to be processed faster, write efficient CSS by eliminating redundant instructions and consolidating CSS instructions into the fewest possible element references. Reducing the number of nodes in your DOM tree will also produce faster rendering.

Keep in mind that each asset you request from your HTML or even from your CSS/JavaScript assets is requested with a separate HTTP header. This consumes bandwidth and requires processing per request. If you want to make your page load as fast as possible then reduce the number of HTTP requests and reduce the size of your HTML. You are not doing your user experience any favors by averaging page weight at 180k from HTML alone. Many developers subscribe to some fallacy that a user makes up their mind about the quality of content on the page in 6 nanoseconds and then purges the DNS query from his server and burns his computer if displeased, so instead they provide the most beautiful possible page at 250k of HTML. Keep your HTML short and sweet so that a user can load your pages faster. Nothing improves the user experience like a fast and responsive web page.

12
votes

Open your page in Firefox and get the HTTPFox addon. It will tell you all that you need.

Found this on archivist.incutio.com:

http://archivist.incutio.com/viewlist/css-discuss/76444

When you first request a page, your browser sends a GET request to the server, which returns the HTML to the browser. The browser then starts parsing the page (possibly before all of it has been returned).

When it finds a reference to an external entity such as a CSS file, an image file, a script file, a Flash file, or anything else external to the page (either on the same server/domain or not), it prepares to make a further GET request for that resource.

However, the HTTP/1.1 standard (RFC 2616) specifies that the browser should not make more than two concurrent requests to the same domain. So it puts each request to a particular domain in a queue, and as each entity is returned it starts the next one in the queue for that domain.

The time it takes for an entity to be returned depends on its size, the load the server is currently experiencing, and the activity of every single machine between the machine running the browser and the server. The list of these machines can in principle be different for every request, to the extent that one image might travel from the USA to me in the UK over the Atlantic, while another from the same server comes out via the Pacific, Asia and Europe, which takes longer. So you might get a sequence like the following, where a page has (in this order) references to three script files, and five image files, all of differing sizes:

  1. GET script1 and script2; queue request for script3 and images1-5.
  2. script2 arrives (it's smaller than script1): GET script3, queue images1-5.
  3. script1 arrives; GET image1, queue images2-5.
  4. image1 arrives, GET image2, queue images3-5.
  5. script3 fails to arrive due to a network problem - GET script3 again (automatic retry).
  6. image2 arrives, script3 still not here; GET image3, queue images4-5.
  7. image3 arrives; GET image4, queue image5, script3 still on the way.
  8. image4 arrives, GET image5.
  9. image5 arrives.
  10. script3 arrives.

In short: any old order, depending on what the server is doing, what the rest of the Internet is doing, and whether or not anything has errors and has to be re-fetched. This may seem like a weird way of doing things, but it would quite literally be impossible for the Internet (not just the WWW) to work with any degree of reliability if it wasn't done this way.
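
The queueing described above can be sketched as a small simulation in plain JavaScript. This is only a toy model under stated assumptions: `simulateQueue`, the resource names, and the durations are invented for illustration, the per-domain limit is a parameter, and retries and caching are ignored. It models when downloads finish, not when scripts execute.

```javascript
// Simulate a per-domain request queue with at most `limit` requests in flight.
// `resources` is a list of { name, duration } in page order (duration in ms, hypothetical).
function simulateQueue(resources, limit) {
  const pending = resources.slice(); // not yet requested, in page order
  const inFlight = [];               // requests currently being downloaded
  const finished = [];               // completion order with virtual timestamps
  let now = 0;                       // virtual clock in ms

  while (pending.length > 0 || inFlight.length > 0) {
    // Start as many queued requests as the limit allows.
    while (inFlight.length < limit && pending.length > 0) {
      const next = pending.shift();
      inFlight.push({ name: next.name, finishAt: now + next.duration });
    }
    // Advance the clock to the earliest completion and record it.
    inFlight.sort((a, b) => a.finishAt - b.finishAt);
    const done = inFlight.shift();
    now = done.finishAt;
    finished.push({ name: done.name, time: now });
  }
  return finished;
}

// Hypothetical page: three scripts followed by two images, two requests at a time.
const order = simulateQueue(
  [
    { name: 'script1', duration: 300 },
    { name: 'script2', duration: 100 },
    { name: 'script3', duration: 900 }, // slow response
    { name: 'image1', duration: 200 },
    { name: 'image2', duration: 150 },
  ],
  2
);

console.log(order.map((r) => `${r.time}ms ${r.name}`).join('\n'));
```

With these made-up durations the resources finish as script2, script1, image1, image2, script3, which is not the order they appear in the page.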

Also, the browser's internal queue might not fetch entities in the order they appear in the page - it's not required to by any standard.

(Oh, and don't forget caching, both in the browser and in caching proxies used by ISPs to ease the load on the network.)

6
votes

If you're asking this because you want to speed up your web site, check out Yahoo's page on Best Practices for Speeding Up Your Web Site.

2
votes

AFAIK, the browser (at least Firefox) requests every resource as soon as it parses it. If it encounters an img tag it will request that image as soon as the img tag has been parsed. And that can be even before it has received the totality of the HTML document... that is it could still be downloading the HTML document when that happens.

For Firefox, there are browser queues that apply, depending on how they are set in about:config. For example, it will not attempt to download more than 8 files at once from the same server; the additional requests will be queued. I think there are per-domain limits, per-proxy limits, and other settings, which are documented on the Mozilla website and can be set in about:config. I read somewhere that IE has no such limits.

The jQuery ready event is fired as soon as the main HTML document has been downloaded and its DOM parsed. Then the load event is fired once all linked resources (CSS, images, etc.) have been downloaded and parsed as well. This is made clear in the jQuery documentation.
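
The difference between the two events can be observed without jQuery. The sketch below is a minimal example, assuming it runs from an inline script in the page before parsing has finished. `trackLifecycle` is a made-up helper name, and it relies on the native DOMContentLoaded event, which is what the jQuery ready event corresponds to in current browsers.

```javascript
// Attach listeners for the two lifecycle events and record the order they fire in.
// `doc` and `win` are passed in so the function is not tied to the global objects.
function trackLifecycle(doc, win, log) {
  doc.addEventListener('DOMContentLoaded', () => {
    log.push('DOMContentLoaded: DOM parsed');
  });
  win.addEventListener('load', () => {
    log.push('load: images, CSS and other resources finished');
  });
  return log;
}

// In a browser, use the real document and window and print the log once everything has loaded.
if (typeof document !== 'undefined' && typeof window !== 'undefined') {
  const log = trackLifecycle(document, window, []);
  window.addEventListener('load', () => console.log(log.join('\n')));
}
```

On a page with large images, the first entry should appear well before the second.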

If you want to control the order in which all that is loaded, I believe the most reliable way to do it is through JavaScript.

1
votes

Dynatrace AJAX Edition shows you the exact sequence of page loading, parsing and execution.

1
votes

The chosen answer does not seem to apply to modern browsers, at least not to Firefox 52. What I observed is that requests for resources such as CSS and JavaScript are issued before the HTML parser reaches the element, for example:

<html>
  <head>
    <!-- prints the date before parsing and blocks HTML parsing -->
    <script>
      console.log("start: " + (new Date()).toISOString());
      for(var i=0; i<1000000000; i++) {};
    </script>

    <script src="jquery.js" type="text/javascript"></script>
    <script src="abc.js" type="text/javascript"></script>
    <link rel="stylesheet" type="text/css" href="abc.css">
    <style>h2{font-weight:bold;}</style>
    <script>
      $(document).ready(function(){
        $("#img").attr("src", "kkk.png");
      });
    </script>
  </head>
  <body>
    <img id="img" src="abc.jpg" style="width:400px;height:300px;"/>
    <script src="kkk.js" type="text/javascript"></script>
  </body>
</html>

What I found is that the start times of the requests for the CSS and JavaScript resources were not blocked. It looks like Firefox scans the HTML and identifies key resources (img resources are not included) before it starts to parse the HTML.
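
As a toy illustration of that idea, the sketch below scans raw HTML text for external script and stylesheet URLs without building a DOM or running any script. It is not Firefox's actual implementation, which this answer does not describe. `scanForResources` and the regular expression are assumptions made for the example, and images are left out only to mirror the observation above.

```javascript
// Toy "look-ahead" scan: collect src/href URLs of <script> and <link> tags
// from raw HTML text, in document order, without executing anything.
function scanForResources(html) {
  const pattern = /<(script|link)\b[^>]*?\b(?:src|href)="([^"]+)"/gi;
  return Array.from(html.matchAll(pattern), (match) => match[2]);
}

// Hypothetical page modelled on the sample above, with a slow inline script first.
const sampleHtml = `
<head>
  <script>for (var i = 0; i < 1000000000; i++) {}</script>
  <script src="jquery.js"></script>
  <script src="abc.js"></script>
  <link rel="stylesheet" href="abc.css">
</head>
<body>
  <img id="img" src="abc.jpg">
  <script src="kkk.js"></script>
</body>`;

const discovered = scanForResources(sampleHtml);
console.log(discovered.join(', '));
```

A scan like this can find `jquery.js`, `abc.js`, `abc.css` and `kkk.js` immediately, so their requests could start while the slow inline script is still blocking the real parser.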