The Essence of Google Dart: Building Applications, Snapshots, Isolates
With thousands of programming languages floating around, why is Google introducing Google Dart? What can it possibly add? The short answer: the Google Dart team wanted a language well suited to modern application development, both on the server and the (mobile) client.
InfoQ takes a look at the most interesting aspects of Dart for application development, with a focus on the Dart VM and some of the notable language features.
Dart is an Application Language: Snapshots And Initialization
Is an application's startup time really relevant? How many times a day do users restart their IDE or word processor? With the rise of memory-constrained mobile devices, application startup happens a lot; the Out Of Memory (OOM) killer process is very trigger-happy on mobile OSes and will kill suspended applications without hesitation. iOS's multitasking model and the prominent physical Home button have also shortened the average life span of mobile apps. Before iOS 4, pressing the Home button always killed the running application; with iOS 4 the situation has become a bit more complicated, but applications still have to be prepared to die at any time, be it at the hand of the user or the OOM process.
This behavior won't stay on mobile OSes. "Sudden Termination" and "Automatic Termination" are application properties introduced in recent OS X versions that declare an application can handle being killed at any point (eg. when the available memory is low) and then restarted, all transparent to the user.
Slow startup has plagued Java GUI applications since Java 1.0. Booting up a large Java application is a huge amount of work: thousands of classes need to be read, parsed, loaded and linked; before Java 1.6, that process included generating the stack map of methods for bytecode verification. And once classes are loaded, they still need to be initialized, which includes running static initializers.
That's a whole lot of work for a modern Java GUI application - just to show an initial GUI. The introduction of a SplashScreen API in Java 6 shows that the problem hasn't been solved and that it's affecting developers and their users.
Snapshots vs Smalltalk Images
Dart addresses this with the heap snapshot feature, which is similar to Smalltalk's image system. An application's heap is walked and all objects are written to a file. At the moment, the Dart distribution ships with a tool that fires up a Dart VM, loads an application's code, and takes a snapshot of the heap just before calling main. The Dart VM can use such a snapshot file to quickly load an application.
The snapshot facility is also used to serialize object graphs sent between Isolates in the Dart VM.
In the initial tech preview of Dart, there doesn't seem to be a Dart language API for initiating a snapshot, although there shouldn't be a fundamental reason for that.
Technical Details of Snapshots
The Dart team put a lot of effort into the snapshot format. First off, it can be moved between machines, whether they're 32 bit, 64 bit or something else. The format is also made to be quick to read into memory, with a focus on minimizing extra work like pointer fixups.
For details see runtime/vm/snapshot.cc, and runtime/vm/snapshot_test.cc for some uses of the Snapshot system, ie. writing out full snapshots, reading them back in, starting Isolates from a snapshot, etc.
Snapshots vs Smalltalk Images
Smalltalk's images are not universally popular; Gilad Bracha wrote about the problems of Smalltalk images in practice. Smalltalk development usually takes place in an image which is then stripped of unused code and frozen for deployment. Dart's snapshots are different because they're optional and need to be generated by loading up an application and then taking a snapshot. Dart's lack of dynamic code evaluation and other code loading features can allow the stripping process to be more thorough.
Currently Snapshots are used in message passing between Isolates; objects sent across in messages are serialized using SnapshotWriter and read in on the other side.
In any case, the snapshot facility is in the Dart VM and tools, and as with many other features of Dart, it's up to the community to come up with uses for it.
Even without snapshots, Dart has been designed to avoid initialization at startup if possible. Classes are declarative, ie. no code is executed to create them. Libraries can define final top level elements, ie. functions and variables outside of a class, but they must be compile time constants (see section 10.1 in the language spec).
Compare this to static initializers in Java, or to languages that rely on various metaprogramming methods at startup to generate data structures, object systems and the like. Dart is optimized for applications that start up quickly.
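As a rough illustration of the compile-time-constant rule, the sketch below declares top level constants and a top level function in a library. It is written in present-day Dart syntax, which may differ from the tech preview discussed here, and the names maxConnections, retryDelaysMs and totalRetryDelayMs are invented for the example.

```dart
// Top-level compile-time constants: the values are fixed when the program
// is compiled, so no initializer code has to run at startup.
// All names here are hypothetical and exist only for this sketch.
const int maxConnections = 8;
const List<int> retryDelaysMs = [100, 200, 400];

// A top-level function, declared outside of any class.
int totalRetryDelayMs() =>
    retryDelaysMs.fold(0, (sum, delay) => sum + delay);

void main() {
  print('$maxConnections connections, ${totalRetryDelayMs()} ms of retries');
}
```

Nothing in this library needs to execute before main is called, which is the property the article contrasts with Java's static initializers.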
Dart doesn't come with a Reflection mechanism at the moment, although one based on Mirrors (PDF) is supposed to come to the language in the near future, possibly with the ability to construct code using an API and load it in a new Isolate, bringing metaprogramming to Dart.
The Units of Concurrency, Security and the Application: Isolates
The basic unit of concurrency in Dart is the Isolate. Every Isolate is single threaded. In order to do work in the background or use multiple cores or CPUs, it's necessary to launch a new Isolate.
Other single or green threaded languages have similar process herding solutions. Ruby's Phusion Passenger is an example that also tries to fix the overhead of loading the same code in multiple processes: Phusion Passenger loads up a Rails application and then uses the OS's fork call to quickly create multiple processes with the same program contents, thus avoiding parsing and initializing the same application many times over. Dart's snapshot feature would be another way to solve the problem.
The first tech preview of Dart uses one thread per Isolate, although other modes are being considered, ie. multiplexing multiple Isolates onto one thread or having Isolates run in different OS processes, which would also allow them to run on different machines.
Splitting up an application into independent processes (or Isolates) helps with reliability: if one Isolate crashes, other Isolates are unaffected and a clean restart of the Isolate is possible. Erlang's supervision trees are helpful with this model in that they make it possible to monitor the life and death of groups of processes and to write custom policies to handle their death.
This interview with the creators of Akka and Erjang gives a good overview of the advantages of Erlang's model.
Untrusted code can be run in its own Isolate. All communication with it must take place over message passing, which will be enhanced with a capability-style mechanism that makes it possible to restrict who can talk to which ports in Isolates. An Isolate must be given a port to send messages to; without one, it can't do anything.
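The port-based communication described above can be sketched roughly as follows. This uses the dart:isolate library as it exists in present-day Dart (Isolate.spawn, ReceivePort, SendPort), which may differ from the tech preview API; sumWorker and sumInBackground are hypothetical names made up for this example.

```dart
import 'dart:isolate';

// Entry point of the worker Isolate. It shares no memory with its parent;
// the only way to hand back a result is to send a message to the port it
// was given. Both function names are hypothetical.
void sumWorker(List<Object> args) {
  final replyTo = args[0] as SendPort;
  final numbers = args[1] as List<int>;
  replyTo.send(numbers.fold<int>(0, (a, b) => a + b));
}

// Launches a new Isolate, passes it a port to reply on, and waits for the
// first message that arrives on that port.
Future<int> sumInBackground(List<int> numbers) async {
  final inbox = ReceivePort();
  await Isolate.spawn<List<Object>>(
      sumWorker, <Object>[inbox.sendPort, numbers]);
  return await inbox.first as int;
}

Future<void> main() async {
  print(await sumInBackground([1, 2, 3, 4]));
}
```

The worker only knows about the single SendPort it received, so it has no way of reaching any other part of the application.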
Compartmentalization of Memory
Another benefit of splitting an application into Isolates: each Isolate's heap is standalone; all objects in it clearly belong only to it, unlike the objects in a shared memory environment. The key benefit: if an Isolate was launched for a task and the task is done, the whole Isolate can be deallocated in one go; no GC run is necessary.
What's more: if an application is split into Isolates, that means the application's used memory is split into smaller heaps, ie. smaller than the total amount of memory the application uses. Each heap is governed by its own GC, with the effect that a full GC run in one Isolate only stops the world in that Isolate; the other Isolates won't notice. Good news for GUI apps as well as server applications that are sensitive to GC pauses: time sensitive components are unaffected by one or a few messy, garbage-spewing Isolates that keep the GC busy. Hence, having one heap per Isolate improves modularity: each Isolate controls its own GC pause behavior and is not affected by some other component.
While GCs in Java and .NET have been improving a lot, GC pauses are still an important issue for GUI applications and time sensitive server applications. Solutions like Azul's GC have managed to make pauses manageable or even nearly disappear, but they need either special hardware or access to low level OS infrastructure, as in their x86-based Zing. Realtime GCs do exist, but they also slow down execution in exchange for predictable pauses.
Splitting up the memory into separate heaps means that GC implementations can be simple yet still be fast enough. Of course, it all depends on the developer - to benefit from these characteristics, an application must be split into multiple Isolates.
No more Dependency Injection Ceremony: Interfaces and Factories in Dart
"Program To Interface" is common advice, but in practice it gets a bit harder, as someone has to call new with an actual class name. In the Java world this issue has led to the creation of Dependency Injection (DI) frameworks. Adopting a DI framework first means injecting a dependency on the specific DI framework into a project.
What problem does DI solve? Calling new on a specific class hardcodes the class, creating problems for testing and for the flexibility of the code. After all, if all code is written to an interface, the specific implementation shouldn't matter and someone should choose the right implementation for a use case.
Dart now ships with one DI solution, making it unnecessary to choose from a host of different options. It does so in the language by linking an interface to code that can instantiate an object for it. All flexibility that's required can be hidden in that Factory, whether it's deciding which class to instantiate or whether to allocate a new object at all or just return a cached object.
The interface refers to a factory by name, which can be provided by a library; different implementations of a factory can live in their own libraries and it's up to the developer to include the best implementation.
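A minimal sketch of the idea, assuming present-day Dart, where a factory constructor on an abstract class plays the role that the interface-plus-factory pairing plays in the tech preview. The exact preview syntax is not reproduced here, and Greeter and its two implementations are hypothetical.

```dart
// Present-day Dart sketch; Greeter and its implementations are hypothetical.
abstract class Greeter {
  // One shared instance per language, created on first use.
  static final Map<String, Greeter> _cache = {};

  // Callers write Greeter('de') and never name a concrete class. The factory
  // decides which implementation to use and whether to allocate at all.
  factory Greeter(String language) {
    return _cache.putIfAbsent(language, () {
      if (language == 'de') return _GermanGreeter();
      return _EnglishGreeter();
    });
  }

  String greet(String name);
}

class _EnglishGreeter implements Greeter {
  @override
  String greet(String name) => 'Hello, $name!';
}

class _GermanGreeter implements Greeter {
  @override
  String greet(String name) => 'Hallo, $name!';
}

void main() {
  final a = Greeter('de');
  final b = Greeter('de');
  print(a.greet('Dart'));
  print(identical(a, b)); // the second call returned the cached object
}
```

Calling code depends only on Greeter; swapping in a test double or a different implementation means changing the factory, not the call sites.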
For details, see the official Dart language specification; for a quick overview, see 'Idiomatic Dart' on the Dart website.
Namespacing in Dart is done with the library mechanism, which is different to Java where the class names are the only way to namespace things like methods or variables. One consequence: libraries in Dart can contain top level elements other than classes, ie. variables and functions outside of classes.
#import("foo.dart", "foo") will import the library and make all its elements available with the prefix "foo.".
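For comparison, here is a small sketch of a prefixed import in present-day Dart, where the directive is spelled import '...' as prefix; rather than the #import form shown above. It uses the standard dart:math library; the hypotenuse function is invented for the example.

```dart
// Every element of dart:math is reachable only through the "math." prefix.
import 'dart:math' as math;

// Hypothetical top-level function, living outside of any class.
double hypotenuse(double a, double b) => math.sqrt(a * a + b * b);

void main() {
  print(hypotenuse(3, 4));
}
```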
The key word in "Optional Typing" is "Optional". The developer can add type annotations to the code, but these annotations have no impact at all on the behavior of the code. As a matter of fact, it's possible to specify nonsensical types - the code will still execute fine.
Having the types in the code allows for the various type checkers to do their work. The editor shipped with Dart has a type checker and can highlight type errors as warnings. Dart also comes with a checked mode in which the type annotations are used to check the code and violations will be reported as warnings or errors.
The optional type annotations make it possible to have type information in the code where it's useful for documentation purposes; no more hunting for documentation that explains that an argument must implement a certain list of methods in order to be considered an acceptable duck. The presence of interfaces, ie. named sets of methods with method signatures, and optional type annotations makes it possible to document APIs.
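The documentation value of annotations can be sketched as below, with one annotated and one unannotated function. This is present-day Dart, which treats annotations more strictly than the tech preview behavior described here, so the sketch only illustrates the difference in what a signature tells the reader. Duck, Robot and both functions are hypothetical.

```dart
class Duck {
  String speak() => 'Quack';
}

class Robot {
  String speak() => 'Beep';
}

// Annotated: the signature itself documents that a Duck is expected.
String describeDuck(Duck duck) => 'The duck says ${duck.speak()}';

// Unannotated: the parameter is dynamic, so any object with a speak()
// method is accepted and the call is resolved at runtime.
String describeAnything(thing) => 'It says ${thing.speak()}';

void main() {
  print(describeDuck(Duck()));
  print(describeAnything(Robot()));
}
```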
Crucially, the language is always dynamic and arguments can be specified as dynamic, ie. of type
Runtime Extensibility and Mutability - or lack thereof
Let's get it out of the way: No Monkeypatching. No eval. No Reflection at the moment, although a Mirror-based system is in the works (for details see this paper introducing Mirrors). The plan seems to be to limit construction of new code to a new Isolate, not the currently running process.
Dart allows some dynamic magic with the noSuchMethod feature, similar to Ruby's
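A rough sketch of noSuchMethod in present-day Dart: the class below declares no save method, yet calls to it are intercepted and recorded. CallRecorder and recordSave are hypothetical, and the Invocation API shown is the current one, which may not match the tech preview.

```dart
// Hypothetical class that records every call to a method it does not declare.
class CallRecorder {
  final List<Symbol> calls = [];

  @override
  dynamic noSuchMethod(Invocation invocation) {
    calls.add(invocation.memberName);
    // Return the number of positional arguments, just to show that the
    // handler can produce a result for the caller.
    return invocation.positionalArguments.length;
  }
}

// The receiver must be typed dynamic; otherwise the analyzer rejects calls
// to methods the class does not declare.
int recordSave(dynamic recorder) => recorder.save('a.txt', 42) as int;

void main() {
  final recorder = CallRecorder();
  print(recordSave(recorder));
  print(recorder.calls);
}
```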
Closed Classes, no Eval
Languages like Ruby allow classes to be changed, even at runtime, which is referred to as open classes. Not having this feature helps with performance: all members are known at compile time, which makes it possible to analyze code and remove functionality that's never referenced. Refer to the 'Criticisms' section below to see the current status and what current solutions exist in other languages.
Future Language Features
An async/await-style extension is being considered to facilitate writing I/O code. Many of the I/O APIs in Dart are async, and hence some support with making that easier is welcome. The reason to stay away from adding features like Coroutines, Fibers and their variants, is to avoid adding synchronization features. Once coroutines are in the system, it's possible to schedule and interleave their execution and in order to write correct code, it's necessary to synchronize shared resources. Hence the focus on single threading; concurrency is done with Isolates: explicit communication, no sharing, Isolates can be locked away etc.
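To show what such an extension buys, the sketch below writes the same asynchronous step twice: once with a callback on a Future and once with async/await as it exists in present-day Dart. The syntax under consideration at the time of the tech preview may have looked different, and fetchGreeting and both length functions are hypothetical.

```dart
// Hypothetical async API: completes with a string after a short delay.
Future<String> fetchGreeting() =>
    Future.delayed(const Duration(milliseconds: 10), () => 'hello');

// Callback style: the continuation is passed to then().
Future<int> greetingLengthWithCallbacks() =>
    fetchGreeting().then((greeting) => greeting.length);

// async/await style: reads like sequential code, but still runs on a single
// thread without blocking it.
Future<int> greetingLengthWithAwait() async {
  final greeting = await fetchGreeting();
  return greeting.length;
}

Future<void> main() async {
  print(await greetingLengthWithCallbacks());
  print(await greetingLengthWithAwait());
}
```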
Nothing riles up developers more than a new programming language. A quick look at some common criticisms.
Certain characteristics of Dart make these optimizations possible, in particular the closed classes, which mean that all functions are known at compile time. The lack of eval means that, at compile time, the compiler knows which functions are used and, more importantly, which are not. The latter can be safely removed from the output.
eval or other features, the code will break.
Google Dart doesn't need to rely on the programmer to stick to the rules; the language's restrictions provide the necessary guarantees to the compiler.
Another example of a language that makes use of Google Closure's Advanced Compilation is ClojureScript (that's Clojure with a 'j'). ClojureScript is also meant to be an application language and lacks
Why not use static typing for runtime optimization?
Why is the typing optional and, when it's present, why isn't it used to improve the generated code? Surely, knowing that something's an int must help optimize the generated code.
As it turns out, the team behind Dart knows about these ideas; they've done a VM or two in the past, Google V8 and Oracle's HotSpot being just two examples.
Using static type information in Dart doesn't help with the runtime code for several reasons. One is that the types the developer specifies have no impact on the semantics at all and, as a matter of fact, can be totally incorrect. If that's the case, the program will run fine, although you'll get warnings from the type checker. What's more, since the given types can be nonsensical, the VM cannot use them for optimizations as is, because they're unreliable. Generating, say, int specific code just because the developer specified it is wrong if the actual objects at runtime are really Strings.
The static type system is an aid for tooling and documentation, but it has nothing to do with the executed code.
There is another reason why the static types aren't very helpful in generating optimized code: Dart is interface-based. Operators for, say, ints are actual method calls - method calls on an interface. Dart isn't kidding: int, for example, is actually an interface, not a class.
Calling interface methods means resolving them at runtime, based on the actual object and its class. Concepts like (polymorphic) inline caches at callsites can help remove the overhead of method lookup. StrongTalk and its direct descendant HotSpot use feedback-based optimization to figure out what code is actually executed and generate optimized code. V8 has also gained these optimizations recently in the form of Crankshaft.
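The inline cache idea can be modeled in a few lines. This is only a toy model of a monomorphic cache at a single call site, written in present-day Dart; it is not how the Dart VM, HotSpot or V8 implement it, and CallSite, Method and lookupDescribe are hypothetical names.

```dart
// Toy model of a monomorphic inline cache; all names are hypothetical.
typedef Method = Object? Function(Object receiver);

class CallSite {
  CallSite(this.slowLookup);

  // Slow path: a full method lookup based on the receiver's class.
  final Method Function(Type receiverType) slowLookup;

  Type? _cachedType;
  Method? _cachedMethod;
  int slowLookups = 0;

  Object? invoke(Object receiver) {
    final type = receiver.runtimeType;
    if (type != _cachedType) {
      // Cache miss: do the expensive lookup and remember the result.
      slowLookups++;
      _cachedType = type;
      _cachedMethod = slowLookup(type);
    }
    // Cache hit (or freshly filled cache): call the remembered method.
    return _cachedMethod!(receiver);
  }
}

// Stand-in for the runtime's method tables.
Method lookupDescribe(Type type) {
  if (type == String) return (r) => 'string of length ${(r as String).length}';
  if (type == int) return (r) => 'int ${r as int}';
  throw UnsupportedError('no method for $type');
}

void main() {
  final site = CallSite(lookupDescribe);
  print(site.invoke('a'));
  print(site.invoke('bb')); // same receiver type: served from the cache
  print(site.invoke(3)); // different type: one more slow lookup
  print(site.slowLookups);
}
```

As long as a call site keeps seeing the same receiver class, the lookup cost is paid once; a polymorphic cache extends the same idea to a small set of classes.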
Where's my favorite language feature?
Google has released Dart at a rather early stage. It's easy to get fooled by the language spec, IDE, VM, DartC etc., but the clear message from the Dart team is: now is the time to try Dart and provide feedback. A lot of features are already planned but haven't been finished or implemented yet; Reflection and Mixins are but a few ideas that have been mentioned as potential future features.
If a feature isn't in the Dart repository or language spec, now is the time to provide feedback and suggest fixes or changes to the language and runtime environment.
However, the initial release of Dart is a technology preview and the language, APIs and tools are very much a work in progress. Now is the time to give the Dart team feedback and actually have a chance to have an impact on the language. The language will change; some of the proposed and planned changes were mentioned in this article.
Some have already started experimenting with Dart, for instance a Java port has been started with the JDart project, which makes heavy use of Java 7's
While the initial development of Google Dart was done in secret, the whole project, source, tools, ticket system, etc. is now out in the open. It remains to be seen if and where Dart will be adopted. As mentioned, the Dart VM comes with features that will make Dart appealing to both client and server developers.