As application code shifts more and more to the client, the old application patterns are starting to re-emerge. One of these old patterns is the need for modularity. I did some research on module loaders for JavaScript. The current favorite is the well-thought-out AMD standard. It will be interesting to watch how it develops as a consensus/committee-based design. It does appear that some of the people involved have learned from prior attempts to solve the problem, and that helps a lot.
AMD is just a standard, so there are a few implementations. The current favorite AMD implementation is require.js. It is a clean project led by James Burke. The documentation is good and the project has a lot of visible support on the internet. Also, I’ve read some of his online interactions. It is tough to help out on an open source project and respond to criticism. However, James really handles it well. This is really important. Many online discussions drift towards negative behavior during technical debates.
Anyways, as I’m learning more about this and how these JS apps are being developed, I’m just struck by how their ‘new’ ideas are just rediscoveries. These are the same problems that early microcomputer devs had to deal with in the 1980s.
How does this tie to AMD loaders? Well, they missed a few of the lessons. They haven’t quite made the conceptual jump that they really need to manage memory. I’m not talking about malloc, I’m talking ‘module’ or code memory management. This may seem odd, because that is surely one of the intents they are addressing in the first place. I think it is because the browsers do not offer the capability to control this yet.
For example, in the early days of microcomputing, there was a popular system called UCSD Pascal, or the UCSD p-System. It was a cool virtual machine implementation that emulated a 16-bit stack machine on just about any computer of the day. One thing that blew me away back then was the fact that even a few games were written in the system (Wizardry, Sundog, and a few others). The history and popularity of that system are the subject of another post. (Apple Pascal and How I learned Pascal)
One of the key attributes of the system was the concept of a ‘segment’ manager. This is the part that is relevant to this discussion. It managed what code was in memory at the given moment. Some other languages and systems had different names (overlay manager), but the concept was the same.
Here is how it worked:
Since machines in the late ’70s were just starting to have 64k of total memory, memory was at a premium. At some point, some programs needed more memory for a portion of their work than a 64k system could provide. In order to run those programs, a code module manager (the segment manager) was implemented in the p-System. You could load or unload segments as needed for an application. For example, if you needed to switch to a spreadsheet from word processor mode, you would associate the code for the spreadsheet with a segment. If the user switched to spreadsheet mode and caused that code to run, the magic happened. The system would unload an unused segment from memory, load the spreadsheet segment from disk, and then execute it. Here is the interesting part. When you finished using that code, the system would automatically unload the code segment from memory so the space could be used by another portion of the program. This system allowed people to finally write programs that were larger than the machines they used.
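To make the idea concrete, here is a minimal sketch of that load/evict cycle in JavaScript. It is not how the p-System was actually implemented; the `SegmentManager` class, the `disk` map, and the `maxResident` limit are all hypothetical names made up for illustration, and "disk" is simulated with factory functions. As a simplification, a finished segment is only marked idle and is evicted when room is needed.

```javascript
// Hypothetical sketch of a p-System-style segment manager (illustration only).
class SegmentManager {
  constructor(disk, maxResident) {
    this.disk = disk;               // Map: segment name -> factory that "loads" the code
    this.maxResident = maxResident; // how many segments fit in memory at once
    this.resident = new Map();      // segment name -> { code, inUse }
    this.log = [];                  // record of load/unload events
  }

  // Run the code in a segment, loading it from "disk" first if needed.
  call(name, ...args) {
    let seg = this.resident.get(name);
    if (!seg) {
      if (!this.disk.has(name)) throw new Error(`unknown segment: ${name}`);
      this.evictIfFull();
      seg = { code: this.disk.get(name)(), inUse: 0 };
      this.resident.set(name, seg);
      this.log.push(`load ${name}`);
    }
    seg.inUse += 1;
    try {
      return seg.code(...args);
    } finally {
      seg.inUse -= 1; // idle again, so it may be evicted later
    }
  }

  // Unload one idle segment when memory is full.
  evictIfFull() {
    if (this.resident.size < this.maxResident) return;
    for (const [name, seg] of this.resident) {
      if (seg.inUse === 0) {
        this.resident.delete(name);
        this.log.push(`unload ${name}`);
        return;
      }
    }
    throw new Error("out of memory: every resident segment is in use");
  }
}

// Assumed example segments: a word processor and a spreadsheet.
const disk = new Map([
  ["wordProcessor", () => (text) => text.toUpperCase()],
  ["spreadsheet", () => (a, b) => a + b],
]);

const manager = new SegmentManager(disk, 1); // room for only one segment
const shout = manager.call("wordProcessor", "hello");
const sum = manager.call("spreadsheet", 2, 3);
```

With room for a single segment, switching to the spreadsheet forces the word processor out first, so `manager.log` ends up recording a load, an unload, and a second load.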
This was done in the early 1980s, yet even today this type of system is not commonplace. The segment manager was so important back then that it was replicated in the Mac Toolbox (which was originally in Pascal) until OS X. Almost all PC Pascal implementations had it as well (notably Turbo Pascal and Delphi). These systems did not appear in any of the ‘C’ based environments.
Eventually, the need for these was lessened as some of the functionality was replaced by dynamic loaders on Unix or the DLL system on Windows. Computers had lots and lots of memory for code. Still, the choice is not optimal today. For example, modern Unix or WinNT based kernels may un-associate part of a program’s address space from physical RAM, but they still must maintain the book-keeping and page tables for that code. If you look at the address space of a typical desktop program in memory, there are so many modules that this linking process can actually be responsible for the slow startup on today’s ‘modern’ systems. The segment system was much cleaner and smaller.
So back to JavaScript. Let’s say you bring up a dialog that isn’t used often (let’s say a file or photo importer). All of those resources get loaded into the little memory that your phone or iPad has. Unfortunately, there is really no way to unload JavaScript from a web page other than loading a completely new web page. I did some Google searches and found no real discussion of this topic. The closest discussion I could find was on re-binding variable references in order to hide code. Nobody knows if that code gets unloaded when the GC runs. I think it is a safe bet that it doesn’t, and this is yet another reason why browser memory footprints are so large.
It turns out that this is a hard problem if you don’t think about it from the beginning. For example, Flash ran into the memory bloat issue (ActionScript is essentially JavaScript, after all). It was a huge problem for years. In Flash, they finally tackled it in 2008 in Flash 10 (which wasn’t in wide deployment until early 2009). If you are a JavaScript person, I suggest you look at the link I just embedded. Look at the long list of steps. I guarantee that HTML5 or HTML6 will be implementing this feature within a few years. I think a possible catalyst might be general usage of tools that show browser memory consumption (like the Chrome memory profiler).
Anyways, the segment concept was one of those little things that got lost in the march of progress. This happens a lot when technologies are created to solve a particular problem (JavaScript) and evolve into an older category (desktop-style UI). Still, if I were to give JavaScript developers a message, it would be this: “You are not the first to solve this problem. Take a moment and look to the past. A solution might have already existed.”
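For what it’s worth, the reference re-binding approach looks roughly like the sketch below. The registry, `use`, `unload`, and the photo importer are all hypothetical names invented for this example. The idea is that the registry holds the only strong reference to the module, so deleting the entry makes the code unreachable. Whether a given engine then actually reclaims the memory is up to that engine, and this sketch does not prove that it does.

```javascript
// Hypothetical registry that holds the only strong reference to each loaded module.
const loadedModules = new Map();
let importerLoads = 0; // counts how many times the importer code was "loaded"

// Stand-in for fetching and evaluating a rarely used dialog's code.
function loadPhotoImporter() {
  importerLoads += 1;
  return { importPhoto: (name) => `imported ${name}` };
}

// Load a module on first use and hand back the cached copy afterwards.
function use(name, factory) {
  if (!loadedModules.has(name)) loadedModules.set(name, factory());
  return loadedModules.get(name);
}

// Drop the registry's reference so the module becomes unreachable from it.
function unload(name) {
  return loadedModules.delete(name);
}

// Use the dialog once, without keeping our own reference to it, then unload it.
const result = use("photoImporter", loadPhotoImporter).importPhoto("cat.jpg");
unload("photoImporter");
```

The catch is that any other lingering reference (an event handler, a closure, a cached variable) keeps the code alive, which is exactly the kind of book-keeping a segment manager did for you.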
Safari GCs functions once they aren’t referenced:
http://f.cl.ly/items/2F1d0L1a293p153A3A0E/index.html
Woohoo, UCSD pascal 🙂 Nice write-up.
Segmentation had some other benefits, too. For example in security: segments could have certain permissions, and you could have small segments with only the specific capabilities that the code in question needs, adhering to the principle of least privilege. The move to flat memory spaces (or at least one memory space per process) mostly lost all of that. All libraries in a process can do everything, so if, say, your jpeg library gets pwned, the whole thing goes down. There has been some more recent research in capability-based architectures to bring this back, as well as projects to fit this into current CPU architectures such as capsicum-linux.org.