Founder & CEO of TruStory. I have a passion for understanding things at a fundamental level and sharing it as clearly as possible.
In Part 1 of this post, I talked about what modules are, why developers use them, and the various ways to incorporate them into your programs.
In this second part, I’ll tackle what exactly it means to “bundle” modules: why we bundle modules, the different ways to do so, and the future of modules in web development.
On a high level, module bundling is simply the process of stitching together a group of modules (and their dependencies) into a single file (or group of files) in the correct order.
As with all aspects of web development, the devil is in the details. :)
When you divide your program into modules, you typically organize those modules into different files and folders. Chances are, you’ll also have a group of modules for the libraries you’re using, like Underscore or React.
As a result, each of those files has to be included in your main HTML file in a <script> tag, which is then loaded by the browser when a user visits your home page. Having separate <script> tags for each file means that the browser has to load each file individually: one… by… one.
…Which is bad news for page load time.
To get around this problem, we bundle, or “concatenate,” all our files into one big file (or a couple of files, as the case may be) in order to reduce the number of requests. When you hear developers talking about the “build step” or “build process,” this is what they’re talking about.
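To make the “correct order” point concrete, here is a minimal sketch of concatenation. The file names and their contents are made up for illustration, and real build tools do far more than join strings.

```javascript
// Hypothetical sources, standing in for separate files on disk.
var sources = {
  'underscore-lite.js': 'var sum = function (list) { return list.reduce(function (a, b) { return a + b; }, 0); };',
  'stats.js': 'var average = function (list) { return sum(list) / list.length; };',
  'app.js': 'var result = average([2, 4, 6]);'
};

// Join the files in the given order into one string (the "bundle").
function concatenate(order) {
  return order.map(function (name) {
    return '// ' + name + '\n' + sources[name];
  }).join('\n');
}

// Evaluate a bundle inside its own function scope and return `result`.
function runBundle(bundle) {
  return new Function(bundle + '\nreturn result;')();
}

// Dependencies first, application code last.
var goodBundle = concatenate(['underscore-lite.js', 'stats.js', 'app.js']);
var goodResult = runBundle(goodBundle); // 4

// With app.js first, `average` has not been assigned yet when it is called.
var badBundle = concatenate(['app.js', 'stats.js', 'underscore-lite.js']);
var badOrderFailed = false;
try {
  runBundle(badBundle);
} catch (err) {
  badOrderFailed = true;
}
```

Three files become one request either way, but only the first ordering actually runs.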
Another common way to speed up page loads is to “minify” the bundled code. Minification is the process of removing unnecessary characters from source code (e.g. whitespace, comments, and newline characters) in order to reduce the overall size of the content without changing the functionality of the code.
Less data means less time spent downloading the files and less work for the browser to process them. If you’ve ever seen a file with “min” in its name, like “underscore-min.js”, you probably noticed that the minified version is pretty tiny (and unreadable) compared to the full version.
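As a rough illustration of what “removing unnecessary characters” means, here is a toy minifier. It is only a sketch: it uses regular expressions, so it would break on code with comment-like text inside strings or on expressions like a - -b. Real minifiers parse the code instead. The function name naiveMinify and the sample source are assumptions for this example.

```javascript
// Sample input, written as an array of lines for readability.
var source = [
  '// Average of a list of numbers',
  'function average(numbers) {',
  '  /* add everything up first */',
  '  var total = 0;',
  '  for (var i = 0; i < numbers.length; i++) {',
  '    total = total + numbers[i];',
  '  }',
  '  return total / numbers.length;',
  '}'
].join('\n');

// Toy minifier: strips comments and whitespace the parser does not need.
function naiveMinify(code) {
  return code
    .replace(/\/\*[\s\S]*?\*\//g, '')          // remove block comments
    .replace(/\/\/.*$/gm, '')                  // remove line comments
    .replace(/\s+/g, ' ')                      // collapse runs of whitespace
    .replace(/\s*([{}();,=+*\/-])\s*/g, '$1')  // drop spaces around punctuation
    .trim();
}

var minified = naiveMinify(source);

// The minified code is shorter but still behaves the same.
var averageFromMinified = new Function(minified + '\nreturn average([1, 2, 3]);')();
```

The output keeps the same behavior while dropping every character the JavaScript engine does not need.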
Task runners like Gulp and Grunt make concatenation and minification straightforward for developers, ensuring that human-readable code stays exposed for developers while machine-optimized code gets bundled for browsers.
However, if you’re adhering to non-native module systems that browsers can’t interpret like CommonJS or AMD (or even native ES6 module formats), you’ll need to use a specialized tool to convert your modules into properly-ordered browser-friendly code. That’s where Browserify, RequireJS, Webpack, and other “module bundlers” or “module loaders” come into play.
In addition to bundling and/or loading your modules, module bundlers offer a ton of additional features like auto-recompiling code when you make a change or producing source maps for debugging.
Let’s walk through some common module bundling methods:
As you know from Part 1, CommonJS loads modules synchronously, which would be fine except that it’s not practical for browsers. I mentioned that there was a workaround to this — one of them is a module bundler called Browserify. Browserify is a tool that compiles CommonJS modules for the browser.
For example, let’s say you have this main.js file that imports a module to calculate the average of an array of numbers:
So in this case, we have one dependency (myDependency). Running the command browserify main.js -o bundle.js tells Browserify to recursively bundle up all the required module(s), starting at main.js, into a single file called bundle.js.
Browserify does this by jumping in to parse the AST for each require call in order to traverse the entire dependency graph of your project. Once it’s figured out how your dependencies are structured, it bundles them all in the right order into a single file. At that point, all you have to do is insert a single <script> tag with your “bundle.js” file into your HTML to ensure that all of your source code is downloaded in one HTTP request. Bam! Bundled to go.
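The graph traversal can be sketched in a few lines. Note the shortcut: this sketch finds require calls with a regular expression rather than by parsing an AST as Browserify does, and it reads from an in-memory table of made-up files instead of the disk.

```javascript
// Hypothetical project: file name -> source code.
var files = {
  'main.js': "var a = require('./a'); var b = require('./b');",
  './a': "var c = require('./c'); module.exports = c + 1;",
  './b': "var c = require('./c'); module.exports = c + 2;",
  './c': "module.exports = 40;"
};

// Collect the names passed to require(...) in one file.
function findRequires(source) {
  var pattern = /require\(\s*['"]([^'"]+)['"]\s*\)/g;
  var found = [];
  var match;
  while ((match = pattern.exec(source)) !== null) {
    found.push(match[1]);
  }
  return found;
}

// Depth-first walk from the entry file. A file is added to the list only
// after all of its dependencies, which gives a safe bundling order.
function bundleOrder(entry) {
  var visited = {};
  var order = [];
  function visit(name) {
    if (visited[name]) { return; }
    visited[name] = true;
    findRequires(files[name]).forEach(function (dep) { visit(dep); });
    order.push(name);
  }
  visit(entry);
  return order;
}

var order = bundleOrder('main.js'); // ['./c', './a', './b', 'main.js']
```

The shared dependency './c' appears once, ahead of both files that need it, and the entry file comes last.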
Similarly, if you have multiple files with multiple dependencies, you simply tell Browserify what your entry file is and sit back while it does its magic.
The final product: bundled files prepped and ready for tools like Minify-JS to minify the bundled code.
If you’re using AMD, you’ll want to use an AMD loader like RequireJS or Curl. A module loader (vs. a bundler) dynamically loads modules that your program needs to run.
As a reminder, one of the main differences between AMD and CommonJS is that AMD loads modules asynchronously. In this sense, with AMD, you technically don’t need a build step where you bundle your modules into one file, since you’re loading your modules asynchronously. That means you progressively download only the files that are strictly necessary to execute the program, instead of downloading all the files at once when the user first visits the page.
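To show what “asynchronously” means here, below is a toy loader in the spirit of AMD. It is a sketch under heavy assumptions: the define function mimics the named form of AMD’s define, the load helper is hypothetical, and the network fetch is simulated with a resolved Promise rather than a real script request.

```javascript
// Toy registry of AMD-style definitions: name -> { deps, factory }.
var definitions = {};
var pending = {}; // name -> Promise, so each module is loaded and run once

function define(name, deps, factory) {
  definitions[name] = { deps: deps, factory: factory };
}

// Simulates fetching a module, then runs its factory once all of its
// dependencies have resolved. Nothing here blocks the caller.
function load(name) {
  if (!pending[name]) {
    pending[name] = Promise.resolve().then(function () {
      var def = definitions[name];
      var depPromises = def.deps.map(function (dep) { return load(dep); });
      return Promise.all(depPromises).then(function (resolved) {
        return def.factory.apply(null, resolved);
      });
    });
  }
  return pending[name];
}

define('math', [], function () {
  return {
    average: function (list) {
      return list.reduce(function (a, b) { return a + b; }, 0) / list.length;
    }
  };
});

define('main', ['math'], function (math) {
  return math.average([93, 95, 88, 0, 91]);
});

var finished = false;
var mainPromise = load('main').then(function (value) {
  finished = true; // runs later, after the dependencies have arrived
  return value;
});
```

The call to load returns immediately with a Promise, and the program only continues inside the callback once “math” has been delivered.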
In reality, however, the overhead of high-volume requests over time for every user action doesn’t make much sense in production. Most web developers still use build tools to bundle and minify their AMD modules for optimal performance, such as the RequireJS optimizer, r.js.
Overall, the difference between AMD and CommonJS when it comes to bundling is this: during development, AMD apps can get away without a build step. At least, until you push the code live, at which point optimizers like r.js can step in to handle it.
For an interesting discussion on CommonJS vs. AMD, check out this post at Tom Dale’s blog :)
So far as bundlers go, Webpack is the new kid on the block. It was designed to be agnostic to the module system you use, allowing developers to use CommonJS, AMD, or ES6 as appropriate.
You might be wondering why we need Webpack when we already have other bundlers like Browserify and RequireJS that get the job done and do a pretty darn good job at it. Well, for one, Webpack provides some useful features like “code splitting” — a way to split your codebase into “chunks” which are loaded on demand.
For example, if you have a web app with blocks of code that are only required under certain circumstances, it might not be efficient to put the whole codebase into a single massive bundled file. In this case, you could use code splitting to extract code into bundled chunks that can be loaded on demand, avoiding trouble with big up-front payloads when most users only need the core of your application.
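A back-of-the-envelope sketch shows why this helps. This is not Webpack’s API, only the arithmetic behind the idea. The module names, sizes, and the route field are invented for illustration.

```javascript
// Hypothetical modules: `route` is null for code every visitor needs,
// otherwise the name of the screen that needs it. Sizes are invented.
var appModules = [
  { name: 'core', bytes: 40000, route: null },
  { name: 'charts', bytes: 120000, route: 'reports' },
  { name: 'editor', bytes: 90000, route: 'admin' }
];

// Group modules into a "main" chunk plus one chunk per route.
function splitIntoChunks(mods) {
  var chunks = { main: [] };
  mods.forEach(function (mod) {
    var key = mod.route === null ? 'main' : mod.route;
    chunks[key] = chunks[key] || [];
    chunks[key].push(mod);
  });
  return chunks;
}

function totalBytes(mods) {
  return mods.reduce(function (sum, mod) { return sum + mod.bytes; }, 0);
}

var chunks = splitIntoChunks(appModules);
var singleBundleBytes = totalBytes(appModules); // 250000: everyone pays for everything
var upFrontBytes = totalBytes(chunks.main);     // 40000: the rest loads on demand
```

In this made-up app, a visitor who never opens the reports or admin screens downloads 40 KB instead of 250 KB.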
Code splitting is just one of many compelling features Webpack offers, and the Internet is full of strong opinion pieces on whether Webpack or Browserify is better. Here are just a few of the more level-headed discussions that I found useful for wrapping my head around the issue:
Back already? Good! Because next up I want to talk about ES6 modules, which in some ways could reduce the need for bundlers in the future. (You’ll see what I mean momentarily.) First, let’s understand how ES6 modules are loaded.
The most important difference between the current JS module formats (CommonJS, AMD) and ES6 modules is that ES6 modules are designed with static analysis in mind. What this means is that when you import modules, the import is resolved at compile time, that is, before the script starts executing. This allows us to remove exports that are not used by other modules before we run the program. Removing unused exports can lead to significant space savings, which leaves the browser with less code to download and parse.
One common question that comes up is: how is this any different from the dead code elimination that happens when you use something like UglifyJS to minify your code? The answer is, as always, “it depends.”
(NOTE: Dead code elimination is an optimization step which removes unused code and variables — think of it as removing the excess baggage that your bundled program doesn’t need to run, *after* it’s been bundled).
Sometimes, dead code elimination could work exactly the same between UglifyJS and ES6 modules, and other times not. There’s a cool example at Rollup’s wiki if you want to check it out.
What makes ES6 modules different is the different approach to dead code elimination, called “tree shaking”. Tree shaking is essentially dead code elimination reversed. It only includes code that your bundle needs to run, rather than excluding code your bundle doesn’t need. Let’s look at an example of tree shaking:
Let’s say we have a utils.js file with the functions below, each of which we export using ES6 syntax:
Next, let’s say we don’t know which utils functions we want to use in our program, so we go ahead and import all of the exports in main.js like so:
And then we later end up only using the each function:
The “tree shaken” version of our main.js file would look like this once the modules have been loaded:
Notice how the only exports included are the ones we use: each.
Meanwhile, if we decide to use the filter function instead of the each function, we wind up looking at something like this:
The tree shaken version looks like:
Notice how this time both each and filter are included. This is because filter is defined to use each, so we need both exports for the module to work.
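The “include only what is needed” logic can be sketched as a reachability walk. This is a toy model, not how Rollup is implemented: instead of analyzing real source code, it takes a hand-written table saying which exports each function uses. The each and filter entries follow the example above, while map and reduce are hypothetical extras.

```javascript
// Stand-in for utils.js: each export lists the other exports its body uses.
var utilsExports = {
  each: { uses: [] },
  filter: { uses: ['each'] },
  map: { uses: ['each'] },    // hypothetical extra export
  reduce: { uses: ['each'] }  // hypothetical extra export
};

// Start from the names main.js actually uses and keep only what is
// reachable from them. Everything else never makes it into the bundle.
function treeShake(exportsTable, usedNames) {
  var kept = {};
  function mark(name) {
    if (kept[name]) { return; }
    kept[name] = true;
    exportsTable[name].uses.forEach(function (dep) { mark(dep); });
  }
  usedNames.forEach(function (name) { mark(name); });
  return Object.keys(exportsTable).filter(function (name) {
    return kept[name] === true;
  });
}

var usingEach = treeShake(utilsExports, ['each']);     // ['each']
var usingFilter = treeShake(utilsExports, ['filter']); // ['each', 'filter']
```

Using filter drags each along with it, while map and reduce are left out in both cases.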
Pretty slick, huh?
I challenge you to play around and explore tree shaking in Rollup.js’s live demo and editor.
Ok, so we know that ES6 modules are loaded differently than other module formats, but we still haven’t talked about the build step for when you’re using ES6 modules.
Unfortunately, ES6 modules still require some extra work, since there isn’t a native implementation for how browsers load ES6 modules just yet.
Here are a couple of the options for building/converting ES6 modules to work in the browser, with #1 being the most common approach today:
As web developers, we have to jump through a lot of hoops. It’s not always easy to convert our beautiful ES6 modules into something browsers can interpret.
The question is, when will ES6 modules run in the browser without all this overhead?
The answer, thankfully, is “sooner rather than later.”
ECMAScript currently has a specification for a solution called the ECMAScript 6 module loader API. In short, this is a programmatic, Promise-based API that is supposed to dynamically load your modules and cache them so that subsequent imports do not reload a new version of the module.
It’ll look something like this:
Alternately, you could also define modules by adding the type="module" attribute directly to the script tag, as in <script type="module">.
If you haven’t checked out the repo for the module loader API polyfill yet, I strongly encourage you to at least take a peek.
Moreover, if you want to test-drive this approach, check out SystemJS, which is built on top of the ES6 Module Loader polyfill. SystemJS dynamically loads any module format (ES6 modules, AMD, CommonJS and/or global scripts) in the browser and in Node. It keeps track of all loaded modules in a “module registry” to avoid re-loading modules that were previously loaded. Not to mention that it also automatically transpiles ES6 modules (if you simply set an option) and has the ability to load any module type from any other type! Pretty neat.
The rising popularity of ES6 modules has some interesting consequences:
With HTTP/1, each TCP connection can only handle one request at a time. That’s why loading multiple resources means either waiting in line or opening multiple connections. With HTTP/2, everything changes. HTTP/2 is fully multiplexed, meaning multiple requests and responses can happen in parallel. As a result, we can serve multiple requests simultaneously with a single connection.
Since the cost per HTTP request is significantly lower with HTTP/2 than with HTTP/1, loading a bunch of modules isn’t going to be a huge performance issue in the long run. Some argue that this means module bundling isn’t going to be necessary anymore. It’s certainly possible, but it really depends on the situation.
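A deliberately crude model makes the trade-off easier to see. It assumes every request costs one round trip and ignores bandwidth, server time, and connection setup. The figures (30 modules, a 100 ms round trip, 6 parallel HTTP/1 connections) are illustrative assumptions, not measurements.

```javascript
// Requests go out in "rounds": each round carries as many requests as can
// be in flight at once, and each round costs one round trip.
function estimateLoadMs(fileCount, roundTripMs, requestsInFlight) {
  var rounds = Math.ceil(fileCount / requestsInFlight);
  return rounds * roundTripMs;
}

var moduleCount = 30;
var roundTripMs = 100;

// HTTP/1 with 6 connections, one request at a time on each.
var http1Unbundled = estimateLoadMs(moduleCount, roundTripMs, 6);           // 500

// HTTP/2 with all requests multiplexed over a single connection.
var http2Unbundled = estimateLoadMs(moduleCount, roundTripMs, moduleCount); // 100

// One bundled file over either protocol.
var bundled = estimateLoadMs(1, roundTripMs, 6);                            // 100
```

Under these assumptions, multiplexing closes most of the gap between unbundled and bundled code, which is exactly why some people question the build step.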
For one, module bundling offers benefits that HTTP/2 doesn’t account for, like removing unused exports to save space. If you’re building a website where every tiny bit of performance matters, bundling may give you incremental advantages in the long run. That said, if your performance needs aren’t so extreme, you could potentially save time at minimal cost by skipping the build step altogether.
Overall, we’re still pretty far away from having a majority of websites serving their code over HTTP/2. I’m inclined to predict that the build process is here to stay at least for the near term.
PS: There are other differences with HTTP/2 as well, and if you’re curious, here’s a great resource.
Once ES6 becomes the module standard, do we really need other non-native module formats?
I doubt it.
Chances are, quite a while ;)
Plus, there are many people who like having “flavors” to choose from, so the “one truthful approach” may not ever become a reality.
I hope this two-part post helped clear up some of the jargon developers use when talking about modules and module bundling. Go ahead and check out part 1 if you found any of the terms above confusing.
As always, talk to me in the comments and feel free to ask questions!
Happy bundling :)