New Version of Simplix, My Hobby Operating System

Some of you may remember this post about Simplix, my hobby operating system. The first version, published in Sept., was not able to do much and was really buggy. Over the next 9 months, as I was studying the inner workings of modern operating systems, I also wanted to experiment with some of the concepts and ideas I was learning. I decided to apply my newly acquired knowledge to a new version of Simplix I was secretly working on. This new version contains a lot of improvements:

  • Greatly improved the performance of the page allocator.
  • New high performance memory allocator (kmalloc/kfree)
  • Better handling of software exceptions: Kill the current process and display debug info.
  • New system calls. Simplix now supports exit, fork, waitpid, getpid, getppid, time, stime, sleep and brk.
  • New scheduling algorithm. This algorithm is ridiculously simple and not particularly efficient or elegant. It should however be fair to interactive jobs, while doing its best to accommodate CPU intensive tasks.
  • Implemented a small set of user space libraries, including string manipulations (string.h) and a trivial implementation of malloc and free copied directly from the book “The C Programming Language” by Brian W. Kernighan and Dennis M. Ritchie.
  • Much cleaner source tree, improved source code documentation, etc.

As you have probably noticed by now, this new version of Simplix still does not provide any I/O facility for user space tasks. I/O is probably the most complex part of an operating system, so I decided to put it off for a little while longer. In this version however, I decided to write a few sample programs:

  • A Unix time counter implemented as a kernel thread.
  • Another kernel thread that finds and prints prime numbers.
  • A user task that computes the first 10,000 decimals of the number PI.
  • A kernel thread that prints live information about the system.
  • A program that creates a lot of user space tasks, each of them sleeps for one second before exiting.

You can already take a look at the complete and up-to-date source code, and even download it. Compiling Simplix requires a not-too-ancient version of GCC, make, and a few basic command line tools available on almost all Unix systems (objcopy, dd, etc.). If you don’t feel like trying it out yourself, I put together a very short Flash video showing the system booting and running inside Bochs. You can also put the kernel binary on a floppy and try it on a real PC with a floppy drive. Cheers!

Note: If you can’t see the video below, it’s probably because you are reading this article using a news reader. If that’s the case, open this article in a web browser to view the video.

Posted in System Programming | Comments Off

gedit, An Awesome Text Editor

TextMate has gained a lot of traction in the past few years, especially among web developers. Ruby developers swear by it for some reason. I don’t have MacOS X at home so I don’t use TextMate. Instead, my favorite text editor is gedit, the official text editor of the GNOME desktop environment. Unlike TextMate, gedit does not cost a thing, and is released under the GNU General Public License (GPL). It is extremely light weight and easily extensible via plugins, which can be written either in C or Python. On Ubuntu, the most popular plugins can be installed via the package gedit-plugins. Additional third party plugins can be downloaded here. Some of my favorite plugins include the following:

  • File browser pane
  • Find in documents
  • Draw spaces
  • Reopen tabs
  • Save without trailing space
  • Symbol browser
  • Tab converter

I also use the Darkmate theme for syntax highlighting. Finally, TextMate aficionados will enjoy this article explaining how to configure gedit to look and behave just like their editor of choice. Now, if you really insist on spending $64 (that’s how much a single user license for TextMate costs at the time of this writing), I would recommend you donate this amount to a charity of your choice. They really need it. Cheers!

Posted in Uncategorized | Comments Off

Ubuntu 8.04 (Hardy Heron) First Impressions

Last night, I downloaded and installed the Ubuntu 8.04 (Hardy Heron) Release Candidate on my laptop (an HP dv2000 with a dual core AMD64 2.8GHz CPU and 2GB RAM). I had used Ubuntu in the past (versions 7.04 and 7.10) on that same machine, but finally turned to Windows Vista because my hardware was not very well supported on Linux (mainly my Broadcom wireless card). However, last night, I fell in love with Ubuntu all over again. The install went very smoothly, and everything worked right out of the box. I activated the proprietary drivers for both my video card and my wireless card, and voila! I was off and running in about 30 minutes. Awesome!

Posted in Uncategorized | 1 Comment

JavaScript: The Good Parts

Douglas Crockford just published his first book titled JavaScript: The Good Parts. After reading this book, some of you may be left with the impression that Douglas is always complaining about some aspect of this very popular programming language. However, having been a user of the JavaScript language for about 7 years, and having used it extensively in small web sites and large web applications, all I can tell you is that I could not agree more with the author.

It is really unfortunate that we live in an imperfect world. As such, there is no perfect programming language, and there will probably never be. However, by gaining a deep understanding of the philosophy and the inner workings of a programming language, and by sticking to a subset of that language (what the author refers to as the “good parts”), we can all become better programmers by constructing more reliable and more maintainable programs.

In JavaScript: The Good Parts, Douglas extensively describes that good subset of the JavaScript language, occasionally warning readers to avoid the bad parts. I consider Douglas’ book a must-buy for anybody who’s serious about developing professional applications for the web. It’s definitely well worth the read!

Posted in Web Development | 1 Comment

YUI Compressor Version 2.3 Now Available

This new version of the compressor fixes a few bugs and implements a few additional micro optimizations. Please refer to the CHANGELOG file for a complete list of changes, and don’t hesitate to report any issue you may experience with this version of the YUI Compressor.

Download version 2.3 of the YUI Compressor

Posted in Web Development | 19 Comments

Happy New Year From Beautiful Rio de Janeiro

I wish all of you a happy new year from beautiful Rio de Janeiro! I’m still enjoying hot (95°F) and humid weather before heading back to the US in a few days. May the new year bring you happiness and success in all of your endeavors. Cheers!


Posted in Uncategorized | 1 Comment

High Performance Ajax Applications - Video Presentation

A few days ago, I gave a talk at Yahoo! about High Performance Ajax Applications. Eric Miraglia, from the YUI team, and Ricky Montalvo, from the Yahoo! Developer Network, were kind enough to shoot the video, edit it, and put it on the YUI Blog. In this talk, I cover the following topics:

  • Developing for high performance
  • High performance page load
  • High performance JavaScript
  • High performance DHTML
  • High performance layout and CSS
  • High performance Ajax
  • Performance measurement tools

Follow along by downloading the PowerPoint slides, or by looking at the slides on Slideshare. I’m looking forward to reading your comments and answering your questions in the comments section of this blog!

Posted in Web Development | 7 Comments

The Problem With innerHTML

The innerHTML property is extremely popular because it provides a simple way to completely replace the contents of an HTML element. Another way to do that is to use the DOM Level 2 API (removeChild, createElement, appendChild), but using innerHTML is by far the easiest and most efficient way to modify the DOM tree. However, innerHTML has a few problems of its own that you need to be aware of:

  • Improper handling of the innerHTML property can enable script-injection attacks on Internet Explorer when the HTML string contains a script tag marked as deferred: <script defer>...</script>
  • Setting innerHTML will destroy existing HTML elements that have event handlers attached to them, potentially creating a memory leak on some browsers.

There are a few other minor drawbacks worth mentioning:

  • You don’t get back a reference to the element(s) you just created, forcing you to add code to retrieve those references manually (using the DOM APIs…)
  • You can’t set the innerHTML property on all HTML elements on all browsers (for instance, Internet Explorer won’t let you set the innerHTML property of a table row element)

I am more concerned with the security and memory issues associated with using the innerHTML property. Obviously, this problem is nothing new, and very bright people have already figured out ways to work around some of these problems.

Douglas Crockford wrote a purge function that takes care of breaking some circular references caused by attaching event handlers to HTML elements, allowing the garbage collector to release all the memory associated with these HTML elements.

Removing the script tags from the HTML string is not as easy as it seems. A regular expression should do the trick, although it’s hard to know whether it covers all possible cases. Here is the one I came up with:

/<script[^>]*>[\S\s]*?<\/script[^>]*>/ig
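As a quick sanity check, here is a minimal sketch that applies the expression to a sample string. The stripScripts name and the sample markup are made up for illustration; they are not part of YUI.

```javascript
// Hypothetical helper (not part of YUI): removes <script>...</script> blocks
// using the regular expression above.
function stripScripts(html) {
    return html.replace(/<script[^>]*>[\S\s]*?<\/script[^>]*>/ig, "");
}

var sample = '<p>Hello</p><SCRIPT defer>alert("x");</SCRIPT><p>World</p>';
var cleaned = stripScripts(sample);
// cleaned === '<p>Hello</p><p>World</p>'

// Known gap: a script tag that is never closed is left untouched.
var unclosed = stripScripts('<p>Hi</p><script>alert("x");');
// unclosed === '<p>Hi</p><script>alert("x");'
```

The second call shows one case the expression does not cover, which is a good reason not to treat it as a complete sanitizer.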

Now, let’s put these two techniques together in a single setInnerHTML function. (Update: Thanks to those who commented on this article. I fixed the errors/holes you mentioned, and also decided to bind the setInnerHTML function to YAHOO.util.Dom.)

YAHOO.util.Dom.setInnerHTML = function (el, html) {
    el = YAHOO.util.Dom.get(el);
    if (!el || typeof html !== 'string') {
        return null;
    }
    // Break circular references.
    (function (o) {
        var a = o.attributes, i, l, n, c;
        if (a) {
            l = a.length;
            for (i = 0; i < l; i += 1) {
                n = a[i].name;
                if (typeof o[n] === 'function') {
                    o[n] = null;
                }
            }
        }
        a = o.childNodes;
        if (a) {
            l = a.length;
            for (i = 0; i < l; i += 1) {
                c = o.childNodes[i];
                // Purge child nodes.
                arguments.callee(c);
                // Removes all listeners attached to the element via YUI's addListener.
                YAHOO.util.Event.purgeElement(c);
            }
        }
    })(el);
    // Remove scripts from HTML string, and set innerHTML property
    el.innerHTML = html.replace(/<script[^>]*>[\S\s]*?<\/script[^>]*>/ig, "");
    // Return a reference to the first child
    return el.firstChild;
};

Voila! Let me know if there is anything else that should be part of this function, or if I missed anything obvious in the regular expression.

Update: There are obviously many more ways to inject malicious code in a web page. The setInnerHTML function merely normalizes the <script> tag execution behavior across all A-grade browsers. If you are going to inject HTML code that cannot be trusted, make sure you sanitize it first on the server side. There are many libraries available for this.

Update: IE8 has a new toStaticHTML function attached to the window object that removes any potentially executable content from an HTML string!
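Here is a rough sketch of how toStaticHTML could be used when it is available. The sanitizeHTML wrapper is a hypothetical name of my own, and the fallback only strips script tags with the regular expression from this article, so it is much weaker than what IE8 provides.

```javascript
// Hypothetical wrapper: uses IE8's window.toStaticHTML when it exists,
// otherwise falls back to stripping <script> blocks only.
function sanitizeHTML(html) {
    if (typeof window !== "undefined" && window.toStaticHTML) {
        return window.toStaticHTML(html);
    }
    // Fallback: weaker than toStaticHTML, it only removes script blocks.
    return html.replace(/<script[^>]*>[\S\s]*?<\/script[^>]*>/ig, "");
}

var safe = sanitizeHTML('<p>a</p><script>alert(1);</script>');
```

Untrusted HTML should still be sanitized on the server side; this wrapper is only a second line of defense.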

Posted in Web Development | 30 Comments

Adding Back Button and Bookmarking Support to Your DHTML Slide Show

YUI 2.4.0, which was just released today, comes with a minor update to its history library. To celebrate this new release, I thought I would write a short article demonstrating how to use the YUI Browser History Manager to add back button and bookmarking support to a DHTML slide show.

Let’s start with a slightly modified version of Christian Heilmann’s maintainable, unobtrusive DHTML slide show. First, import the YUI Browser History Manager code and its dependencies:

<script type="text/javascript" src=".../2.4.0/build/yahoo-dom-event/yahoo-dom-event.js"></script>
<script type="text/javascript" src=".../2.4.0/build/history/history-min.js"></script>

Then, add the necessary static markup to the page:

<iframe id="yui-history-iframe" src="img/aston-martin.jpg"></iframe>
<input id="yui-history-field" type="hidden">

Note that the asset loaded in the IFrame does not have to be an HTML document (here, we load the first visible image in the slide show). This trick is useful to avoid an additional server round-trip, which would degrade the performance of your site.

Don’t forget to hide the IFrame by adding the following style declaration:

#yui-history-iframe {
  position:absolute;
  top:0; left:0;
  width:1px; height:1px;
  visibility:hidden;
}

Our application is composed of only one module, the slide show, which we will refer to using the identifier “slideshow”. The state of the “slideshow” module will encode the 0-based index of the currently visible slide. The next step is to figure out the initial state of our slide show module:

initialState = YAHOO.util.History.getBookmarkedState("slideshow") || "0";

The “slideshow” module can now be registered with the Browser History Manager, passing in the onStateChange callback, which will be executed when the state of the “slideshow” module changes:

YAHOO.util.History.register("slideshow", initialState, function (state) {
    showSlide(parseInt(state, 10));
});
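Since the state can come straight from a bookmarked URL, it may be worth validating it before using it as an index. The following is only a sketch; parseSlideIndex is a hypothetical helper and is not part of the YUI Browser History Manager.

```javascript
// Hypothetical helper: turns a state string into a valid 0-based slide index,
// falling back to the first slide when the state is malformed or out of range.
function parseSlideIndex(state, slideCount) {
    var index = parseInt(state, 10);
    if (isNaN(index) || index < 0 || index >= slideCount) {
        return 0;
    }
    return index;
}

var fromBookmark = parseSlideIndex("2", 5);    // 2
var fromGarbage = parseSlideIndex("abc", 5);   // 0 (not a number)
var fromTooLarge = parseSlideIndex("7", 5);    // 0 (out of range)
```

The callback could then call showSlide(parseSlideIndex(state, slides.length)) instead of trusting the raw value.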

The initialization routine (initSlideShow) needs to be slightly modified to add a history entry instead of just showing the next slide when the user hits the “previous” or “next” links:

function initSlideShow () {
    currentSlideIndex = parseInt(YAHOO.util.History.getCurrentState("slideshow"), 10);
    slides = YAHOO.util.Dom.get("slides").getElementsByTagName("li");
    YAHOO.util.Dom.addClass(slides[currentSlideIndex], "current");
    YAHOO.util.Event.addListener(["prev", "next"], "click", function (evt) {
        YAHOO.util.Event.stopEvent(evt);
        var newSlideIndex = this.id === "next" ?
            currentSlideIndex + 1 :
            currentSlideIndex - 1;
        if (newSlideIndex >= slides.length) {
            newSlideIndex = 0;
        } else if (newSlideIndex < 0) {
            newSlideIndex = slides.length - 1;
        }
        YAHOO.util.History.navigate("slideshow", newSlideIndex.toString());
    });
}
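The wrap-around logic in the click handler can also be written as a small standalone function, which makes it easier to test. This is an equivalent formulation using the modulo operator; wrapIndex is a name I made up for the example.

```javascript
// Hypothetical helper: wraps an index into the range [0, count - 1].
// The double modulo handles negative values, since -1 % 5 is -1 in JavaScript.
function wrapIndex(index, count) {
    return ((index % count) + count) % count;
}

var afterLast = wrapIndex(5, 5);     // 0: "next" on the last slide goes to the first
var beforeFirst = wrapIndex(-1, 5);  // 4: "previous" on the first slide goes to the last
var unchanged = wrapIndex(3, 5);     // 3
```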

Call initSlideShow when the Browser History Manager is ready:

YAHOO.util.History.onReady(function () {
    initSlideShow();
});

Finally, initialize the Browser History Manager:

YAHOO.util.History.initialize("yui-history-field", "yui-history-iframe");

The final version is available here. Cheers!

Posted in Web Development | 6 Comments

Introducing CrossFrame, a Safe Communication Mechanism Across Documents and Across Domains

The mashup problem

According to my coworker Douglas Crockford, mashups are the most interesting advancement in software development in decades. They are also unsafe in the current generation of browsers. Lately, Douglas has been spending some time convincing the main browser vendors that mashups need to be made safe. He wrote a proposal, and even mentioned Google Gears as a potential solution to the problem. While fixing the browser is the right thing to do, web developers are confronted with this problem today, and cannot afford to wait 5 years for a definitive solution.

Existing solutions to the mashup problem

One way mashups (or widgets, badges and gadgets, take your pick…) can be made safe is by sandboxing them in an IFrame pointing to another domain. (Note: another way would be to run the untrusted code through ADsafe, and provide some safe API to do useful things on the page.) The problem is that the Same Origin Policy isolates them so completely that they are then unable to cooperate with the page containing them or with each other. Several hacks have been exploited to achieve reasonably secure client-side cross-domain communication. The most popular ones use the URL fragment identifier or the Flash LocalConnection object.

Why the need for another technique?

CrossFrame is a variant of the URL fragment identifier mechanism. In the original technique, the containing page sets the URL fragment identifier of an embedded IFrame (usually via its src attribute), and the IFrame must poll to detect changes in the value of its location.hash property. This technique can be further built upon to allow for 2-way communications between an IFrame and its containing page, or between two distinct IFrames.

The original URL fragment identifier technique has many limitations, many of which can be worked around except maybe for the following:

  • It unnecessarily consumes CPU cycles by requiring the receiver to poll.
  • It creates “fake” history entries on Safari and Opera.
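To make the polling cost concrete, here is a minimal sketch of what the receiving side of the original technique might look like. The createHashPoller function is hypothetical, and it assumes the sender encoded the message with encodeURIComponent. A plain object stands in for window.location so the example can run outside a browser.

```javascript
// Hypothetical receiver: returns a function that checks a location-like
// object for a changed hash and reports the decoded message.
function createHashPoller(loc, onMessage) {
    var lastHash = loc.hash;
    return function poll() {
        if (loc.hash !== lastHash) {
            lastHash = loc.hash;
            // Drop the leading "#" and decode the message.
            onMessage(decodeURIComponent(lastHash.substring(1)));
        }
    };
}

// In a browser, this would run inside the IFrame, for example:
//   setInterval(createHashPoller(window.location, handleMessage), 100);
var fakeLocation = { hash: "" };
var received = [];
var poll = createHashPoller(fakeLocation, function (msg) {
    received.push(msg);
});

poll();                                 // nothing changed, no message
fakeLocation.hash = "#hello%20world";   // the containing page updates the fragment
poll();                                 // change detected
poll();                                 // no new change, no duplicate
// received is now ["hello world"]
```

The timer has to keep firing even when no message is ever sent, which is where the wasted CPU cycles come from.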

How does CrossFrame work?

While CrossFrame also has limitations of its own, I find it to be a much cleaner and simpler approach. Here is how it works:

In order to communicate with the mashup hosted in domain Y, the page, hosted in domain X, dynamically creates a hidden IFrame and points it to a special proxy file hosted in domain Y, using the URL fragment identifier to convey the message (step 1). When the special proxy file is loaded in the hidden IFrame, it reads its URL fragment identifier and passes it to a globally accessible function defined in the IFrame hosting the mashup (step 2), using parent.frames['mashup'] to get to it. The same technique can also be used by the mashup to communicate with the page (the proxy will use parent.parent to get to the page). Finally, when all is said and done, the hidden IFrame is automatically removed from the DOM by the library.

This, however, cannot work on Opera, which does not allow us to query any property of a window pointing to a different domain (so getting parent.parent, for example, will throw an exception). CrossFrame takes care of this by using, on Opera only, the HTML 5 way of sending messages across frames and across domains.
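The message-passing part of steps 1 and 2 can be sketched as a pair of functions: one builds the proxy URL on the sending side, the other reads the fragment on the proxy side. The function names, the example URL and the "target&message" layout of the fragment are assumptions made for this illustration; the actual format used by the library may differ.

```javascript
// Hypothetical sender side (step 1): append the target and the message to
// the proxy URL as its fragment identifier.
function buildProxyUrl(proxyUrl, target, message) {
    return proxyUrl + "#" + encodeURIComponent(target) +
        "&" + encodeURIComponent(message);
}

// Hypothetical proxy side (step 2): read the fragment back.
function parseProxyFragment(hash) {
    var parts = hash.replace(/^#/, "").split("&");
    return {
        target: decodeURIComponent(parts[0]),
        message: decodeURIComponent(parts[1] || "")
    };
}

var url = buildProxyUrl("http://y.example/proxy.html", "frames['mashup']", "a&b=c");
var parsed = parseProxyFragment(url.substring(url.indexOf("#")));
// parsed.target is "frames['mashup']" and parsed.message is "a&b=c"
```

Because encodeURIComponent escapes the "&" character, a message that itself contains "&" survives the round trip.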

How to use CrossFrame?

In order to use the CrossFrame library, place the proxy file (proxy.html, included in the downloadable archive) on your web server so you can receive CrossFrame notifications for that domain. Make sure that the proxy file gets cached properly by web browsers, for example using a .htaccess file similar to this one:

<Files proxy.html>
    ExpiresActive on
    ExpiresDefault "access plus 1 year"
</Files>

Then, import the necessary code and its dependencies in your page:

<script type="text/javascript" src=".../2.3.1/build/yahoo-dom-event/yahoo-dom-event.js"></script>
<script type="text/javascript" src="cross-frame.js"></script>

To receive messages, subscribe to the onMessage event:

YAHOO.util.CrossFrame.onMessageEvent.subscribe(
    function (type, args, obj) {
        var message = args[0];
        var domain = args[1];
        // Do something with the incoming message...
    }
);

To send a message, call YAHOO.util.CrossFrame.send():

YAHOO.util.CrossFrame.send(".../proxy.html",
                           "frames['mashup']",
                           "message");

Here is a demo showing the CrossFrame library in action.

Limitations

The CrossFrame library does not support chunking (i.e. the ability to pass a large message in several smaller chunks), so the size of the messages that may be sent is limited by the maximum length of a URL (which varies across browsers). However, chunking is not impossible to implement. For more information, you may want to look at Dojo’s XHR IFrame proxy implementation, which I believe supports chunking.
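For the curious, here is one way chunking could be approached. This is only a sketch under my own assumptions (numbered chunks carrying the total count); it is not how CrossFrame or Dojo do it, and toChunks and createAssembler are hypothetical names.

```javascript
// Hypothetical sender side: encode the message, then split it into
// numbered chunks no longer than maxLength characters each.
function toChunks(message, maxLength) {
    var encoded = encodeURIComponent(message), chunks = [], i, total;
    total = Math.ceil(encoded.length / maxLength) || 1;
    for (i = 0; i < total; i += 1) {
        chunks.push({
            index: i,
            total: total,
            data: encoded.substring(i * maxLength, (i + 1) * maxLength)
        });
    }
    return chunks;
}

// Hypothetical receiver side: returns the decoded message once every chunk
// has arrived (in any order), and null until then.
function createAssembler() {
    var parts = [], count = 0;
    return function (chunk) {
        if (parts[chunk.index] === undefined) {
            parts[chunk.index] = chunk.data;
            count += 1;
        }
        return count === chunk.total ? decodeURIComponent(parts.join("")) : null;
    };
}

var chunks = toChunks("hello world & more", 5);
var receive = createAssembler();
var result = null, k;
// Deliver the chunks in reverse order to show that ordering does not matter.
for (k = chunks.length - 1; k >= 0; k -= 1) {
    result = receive(chunks[k]);
}
// result is "hello world & more"
```

Each chunk would then be sent as a separate message, with maxLength chosen conservatively for the browsers you need to support.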

Also, the user may experience a short delay the first time a message gets sent to a specific domain. This is due to the server round trip necessary to download the proxy file. However, this can easily be mitigated by preloading the proxy file for that specific domain.

Conclusion: The dangers of temporary solutions

There is a danger associated with this kind of “hack”. First of all, browser vendors may decide to change their security policies and mimic Opera’s behavior, for example. If this happens, CrossFrame will stop working for those browsers. Furthermore, I do not recommend using hacks because they slow down the rate of innovation on the web (they make the task of developing web browsers even more complicated than it already is, and also make your application less maintainable). Therefore, as paradoxical as it may seem, I do not recommend using CrossFrame (or any of those ugly hacks for that matter).

Posted in Web Development | 7 Comments