Coding in web browsers
robhu has a good point over here.

At the moment we write Javascript in web pages, which is then compiled down by the various JIT methods that Firefox/IE/Webkit use to make it super fast.

Seeing as what's run clearly isn't the actual JS itself, but bytecode, why not have a standardised bytecode that all browsers would support, which would then mean you could write your code in any language you liked, providing there was a compiler to convert it to the standardised bytecode?

At the moment Google uses GWT to convert Java into Javascript, which then gets converted into the running code (and MS used to have something similar). Wouldn't it be handy if the intermediate step wasn't necessary?

> developers can use whatever language they want and just compile it to the opcode language used by the browser.

Compile? What is this 'compile'?

For stuff that's included in the HTML you wouldn't, obviously.

But for stuff that's linked in, there's no reason that the equivalent of this couldn't work:
<script type="byte/code" language="bytecode" src="http://myserver/somecode.byt"></script>

What is compile? For languages like Java or C# that use a bytecode, there are two things that you can call "compile".

First is turning the source into platform-independent bytecode; this happens upfront.

Second is "Just in Time compilation" during execution: when the runtime needs to use a method that has not been used before, it has to turn the bytecode for that method into something that can actually execute on whatever CPU it is running on.

I'd say that Andrew is referring to the first of these.
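A toy illustration of those two stages, in JavaScript. Everything here is made up for the example: the opcode names (`PUSH`, `ADD`, `MUL`), the helper functions, and the expression-tree input. A real JIT emits machine code; this sketch only translates the bytecode into a JavaScript closure on first call and caches it.

```javascript
// Stage 1 (ahead of time): flatten a tiny expression tree into portable bytecode.
// The opcode names are invented for this sketch.
function compileToBytecode(node, out = []) {
  if (typeof node === "number") {
    out.push(["PUSH", node]);
  } else {
    // node is assumed to be { op: "+" | "*", left, right }
    compileToBytecode(node.left, out);
    compileToBytecode(node.right, out);
    out.push([node.op === "+" ? "ADD" : "MUL"]);
  }
  return out;
}

// Stage 2 (at run time): on the first call, translate the bytecode into
// something the host can run directly, then reuse that translation.
function makeLazyFunction(bytecode) {
  let compiled = null;
  let translations = 0;
  const fn = () => {
    if (compiled === null) {
      translations += 1;
      const steps = bytecode.map(([op, arg]) => {
        if (op === "PUSH") return (stack) => { stack.push(arg); };
        if (op === "ADD") return (stack) => { stack.push(stack.pop() + stack.pop()); };
        if (op === "MUL") return (stack) => { stack.push(stack.pop() * stack.pop()); };
        throw new Error("unknown opcode: " + op);
      });
      compiled = () => {
        const stack = [];
        for (const step of steps) step(stack);
        return stack.pop();
      };
    }
    return compiled();
  };
  fn.translationCount = () => translations;
  return fn;
}

// (2 + 3) * 4
const tree = { op: "*", left: { op: "+", left: 2, right: 3 }, right: 4 };
const bytecode = compileToBytecode(tree); // stage 1, done once upfront
const run = makeLazyFunction(bytecode);   // stage 2 happens inside the first run()
console.log(run()); // 20
```

The bytecode array is the part that could be shipped to any browser; only stage 2 would differ per engine.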

How do you get there from here, though?


It'd need to be part of HTML6, obviously :->

And once we have standardised bytecode, the next logical step would presumably be to improve performance by creating CPUs that can execute it directly. In 20 years we'll all be back where we started.

Hey we could do this now - just implement the VM in Javascript ;-)
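Joking aside, a VM in JavaScript is not much code. Here is a minimal sketch of a stack-based interpreter; the instruction set, the numeric opcodes, and the `runVM` name are all invented for illustration and do not correspond to any real browser bytecode. The step limit is a crude stand-in for the way browsers stop runaway scripts.

```javascript
// A toy stack VM. The instruction set is hypothetical.
const OP = { PUSH: 0, ADD: 1, SUB: 2, LOAD: 3, STORE: 4, JNZ: 5, HALT: 6 };

function runVM(code, maxSteps = 10000) {
  const stack = [];
  const vars = []; // numbered variable slots
  let pc = 0;      // index of the next instruction
  for (let steps = 0; steps < maxSteps; steps++) {
    const op = code[pc++];
    switch (op) {
      case OP.PUSH: stack.push(code[pc++]); break;
      case OP.ADD: { const b = stack.pop(), a = stack.pop(); stack.push(a + b); break; }
      case OP.SUB: { const b = stack.pop(), a = stack.pop(); stack.push(a - b); break; }
      case OP.LOAD: stack.push(vars[code[pc++]] || 0); break;
      case OP.STORE: vars[code[pc++]] = stack.pop(); break;
      case OP.JNZ: { const target = code[pc++]; if (stack.pop() !== 0) pc = target; break; }
      case OP.HALT: return vars;
      default: throw new Error("bad opcode " + op + " at " + (pc - 1));
    }
  }
  throw new Error("step limit exceeded");
}

// Sum 5 + 4 + 3 + 2 + 1. Slot 0 holds the counter n, slot 1 holds the total.
const program = [
  OP.PUSH, 5, OP.STORE, 0,                       // n = 5
  OP.PUSH, 0, OP.STORE, 1,                       // total = 0
  OP.LOAD, 1, OP.LOAD, 0, OP.ADD, OP.STORE, 1,   // (index 8) total = total + n
  OP.LOAD, 0, OP.PUSH, 1, OP.SUB, OP.STORE, 0,   // n = n - 1
  OP.LOAD, 0, OP.JNZ, 8,                         // loop back to index 8 while n != 0
  OP.HALT,
];
const finalVars = runVM(program);
console.log(finalVars[1]); // 15
```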

Google have already proposed this, and I believe it's coming in Chrome. I'll see if I can dig out a link.

They have their "Native Client" stuff, but I don't believe they've proposed it as a standard.

Also - Native Client is not the same sort of thing. It's a way of running native code 'safely'. But we're not talking about running native code.

The bytecode that each browser turns the Javascript into is obviously different for each browser, because each browser has different internals. So in your plan, each browser would still have to translate the bytecode into different bytecode in order to execute it. In which case, why not just call the existing Javascript language the bytecode, which also has the advantage of being human-readable, and make it the target for your compilers from other languages?

(The answer is because Javascript is lacking some functionality, not specified with sufficient rigour, in too many versions, with varying degrees of implementation in the browsers, and it takes too long to get changes into the spec, but I don't see how another spec for bytecode is going to be any different in this respect. You can only get round these problems by owning and implementing the spec yourself -- in which case you're re-inventing Flash or Silverlight.)

Edited at 2010-11-11 01:22 pm (UTC)

The bytecode each browser uses for Javascript is obviously going to be different - but I wasn't suggesting that they use these same engines for whatever standardised version came about.

Heck, I'd be happy with either Java bytecode or IL if they were suitable.

Compiling Java into JS seems terribly suboptimal to me, although I'm prepared to be told that actually it's a good fit. Having something that can be implemented in a standard way by the big 3 would be good.

It's the big 4, I'm afraid. Safari and Chrome have not dissimilar usage figures (especially when you include all the people using Safari on iPhones, iPod Touches and iPads).

But my point is that it's not possible to have the same bytecode run natively in multiple browsers, or indeed in the same browser on different OS platforms. It will have to be run through a VM that translates the bytecode into actual native code in any case, so why not just call the Javascript interpreter the VM?

In what way is it suboptimal? There's a slight time penalty in translating from a higher-level language rather than from bytecode, and there's a bit of a semantic gap between Java and Javascript, but it isn't necessarily terribly inefficient. If the Javascript emitted is low-level enough, all the potential optimisation in the original source code can be exposed and there's almost no work for the host JIT to do.

Similarly, it's possible to write C code that's low-level enough that it's virtually impossible to do better by writing in assembly.
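To make "low-level enough" a bit more concrete, here is a sketch of what a hypothetical Java-to-JavaScript compiler might emit for a simple integer loop. This is not GWT's actual output; it assumes typed arrays are available in the target browser, and the `sum` function and sample data are invented for the example. The `| 0` coercions also show one piece of the semantic gap: Java's `int` addition wraps at 32 bits, while a plain JavaScript `+` does not.

```javascript
// Hypothetical compiler output for a Java method like:
//   static int sum(int[] a) { int t = 0; for (int i = 0; i < a.length; i++) t += a[i]; return t; }
function sum(a) {
  var t = 0;
  for (var i = 0, n = a.length | 0; i < n; i = (i + 1) | 0) {
    // "| 0" truncates to a signed 32-bit integer, matching Java int overflow.
    t = (t + a[i]) | 0;
  }
  return t;
}

const data = new Int32Array([1, 2, 3, 2147483647]);
console.log(sum(data)); // -2147483643, the same wrapped result Java's int arithmetic gives
```

Code written this way leaves the types obvious (everything is a 32-bit integer), so the host engine has little to guess.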

> The bytecode that each browser turns the Javascript into is obviously different for each browser

Currently yes. But there's no reason why that should be so. In fact, isn't that the proposal?

Something like how the Java and .Net bytecodes are standardised and designed to be platform-independent, which is kinda the point. They can be executed by different engines. Those engines may have different internals, but executing that standard bytecode is their entire purpose.

.Net bytecode is less obviously "platform-independent" than Java. But there are 32-bit and 64-bit Windows runtimes that work with the same bytecode. And Mono on Linux runs that same bytecode.

Edited at 2010-11-11 02:50 pm (UTC)

Exactly this exists already. Check out this:

It will probably use LLVM bytecode at some point (I think it already supports that). At the moment, it executes x86.

The hard part is the sandboxing. But it seems they have done well on this in the Native Client.

Yup, thanks. I hope it gets picked up by other browsers.

Reminds me of this recent rant: "we can write any program we like so long as it's in Javascript".

A bytecode standard is not a bad idea.

I would imagine that browsers currently have no defences against malicious bytecode, since they generate it themselves and so don't expect it to be malicious. Their defences against forkbombs, buffer overflows, and the like are likely in the Javascript compiler, not the bytecode interpreter. There'd be huge security issues with giving third parties on the internet direct control of the browser internals as they are.

Which is where Google's Native Client stuff comes in - which does sandbox things.

The defence against forkbombs is that Javascript has little-to-no threading ability. Long-running scripts get shut down within a minute.

Security in bytecode systems is a problem that has been solved more than once, so it doesn't appear to be that hard. JS has no pointers, hence no buffer overflows, and I'd expect the bytecode to be likewise. In Java or .Net bytecode, security is mainly about gating access to the system API by untrusted bytecode.
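A rough sketch of that gating idea, in JavaScript. The names (`makeSandbox`, `runUntrusted`, `setText`, `readFile`) and the "list of host calls" format are invented for this example; real Java or .Net security models are far more involved. The point is only that untrusted code never touches the system directly: it asks for a named host function, and the runtime refuses anything not on the allowed list.

```javascript
// Build a gate that only exposes an explicit whitelist of host functions.
function makeSandbox(allowed) {
  const table = Object.freeze({ ...allowed });
  return function callHost(name, args) {
    // Own-property check, so inherited names like "constructor" are refused too.
    if (!Object.prototype.hasOwnProperty.call(table, name)) {
      throw new Error("access denied: " + name);
    }
    return table[name](...args);
  };
}

// Stand-in for a DOM-like API: records what the untrusted code asked for.
const log = [];
const callHost = makeSandbox({
  setText: (id, text) => { log.push(id + "=" + text); return true; },
});

// Untrusted "bytecode" here is just a list of [hostFunctionName, ...arguments].
function runUntrusted(instructions, gate) {
  return instructions.map(([name, ...args]) => gate(name, args));
}

runUntrusted([["setText", "title", "hi"]], callHost); // allowed
try {
  runUntrusted([["readFile", "/etc/passwd"]], callHost); // not on the list
} catch (e) {
  console.log(e.message); // access denied: readFile
}
```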

Edited at 2010-11-11 03:21 pm (UTC)

Why give it access to browser internals? Just provide access to the things JS already has access to, like the DOM.

This post is the top item on Hacker News atm btw :)

Yeah - I submitted it, and I've been astounded at its popularity. I've never posted my own journal there before, so I'm pretty happy :->

the problem here is not compiling to javascript but compiling to javascript *semantics*. if your lists, strings or dictionaries work differently, you have to re-implement them (in javascript)

llvm would be a nice idea but is still relatively young and unsuitable for jit compilation or mobile devices.

anything higher up (like an object system or datatypes like a string or a list or a hash) would present the same problems as compiling to javascript.
