A New Hope: Polyglotism


OK, so this isn’t necessarily anything new, but I had to go with the running joke of the two blog posts this post is more or less a follow-up to. If you haven’t already read them, go read Yegge’s Dynamic Languages Strike Back, and Beust’s Return Of The Statically Typed Languages.

So let’s see. Distilled, Steve thinks that static languages have reached the ceiling for what’s possible to do, and that dynamic languages offer more flexibility and power without actually sacrificing performance and maintainability. He backs this up with several research papers that point to very interesting runtime performance improvement techniques that really can help dynamic languages perform exceptionally well.

On the other hand, Cedric believes that Scala is bad because of implicits and pattern matching, that it’s common sense not to allow people to use the languages they like, that tools for dynamic languages will never be as good as the ones for static ones, that Java generics aren’t really a problem, that dynamic language performance will improve but that this doesn’t matter, that static languages really haven’t failed at all, and that Java is still the best language of choice and will continue to be for a long time.

Now, these two bloggers obviously have different opinions, and it’s really hard to actually see which parts are facts and which are opinions. So let me try to sort out some facts first:

Dynamic languages have been around for a long time. As long as statically typed languages, in fact. Lisp was the first one.

There have been extremely efficient dynamic language implementations. Some of the Common Lisp implementations are on par with C performance, and Strongtalk also achieved incredible numbers. As several commenters have noted, Strongtalk’s performance did not come from the optional type tags.

None of the dynamic languages in wide use today are even on the same map with regard to performance. There are several approaches to fixing this, but we can’t know how well they will work out in practice.

Java’s type system is not very strong, and not very static, as these definitions go. From a type-theoretic standpoint Java offers neither static type safety nor any complete guarantees.
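
One well-known hole of this kind is array covariance. The following is a minimal sketch, not taken from either of the posts discussed here; the class and method names are made up for illustration. The assignment compiles without complaint, and the type error only surfaces when the program runs.

```java
public class ArrayCovariance {
    // Returns true if the JVM rejected the store at run time.
    static boolean storeIsRejectedAtRuntime() {
        // Legal at compile time: String[] is treated as a subtype of Object[].
        Object[] objects = new String[1];
        try {
            // Also compiles, but the array is really a String[], so the
            // check is deferred to run time.
            objects[0] = Integer.valueOf(42);
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("store rejected at run time: " + storeIsRejectedAtRuntime());
    }
}
```

The compiler accepts every line, so the static check has told us nothing about whether the store is valid.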

There is a good reason for these holes in Java. In particular, Java was created to give lots of hints to the compiler so the compiler can catch errors where the programmer is inconsistent. This is one of the reasons that you very often find yourself writing the same type name twice, including the type arguments (generics). If the programmer makes a mistake on one side, the compiler will be able to catch this error very easily. It is a redundancy in the syntax that makes Java programs very verbose, but helps against certain kinds of mistakes.
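
As a small illustration of that redundancy, here is a sketch written in the pre-Java 7 style that was current when this post was written. The class name, method name and map contents are invented for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RedundantTypes {
    static Map<String, List<Integer>> buildScores() {
        // The full parameterized type is spelled out on both sides of the assignment.
        Map<String, List<Integer>> scores = new HashMap<String, List<Integer>>();
        List<Integer> values = new ArrayList<Integer>();
        values.add(1);
        scores.put("alice", values);

        // If the two sides disagree, the compiler rejects the line, for example:
        // Map<String, List<Integer>> bad = new HashMap<String, List<String>>();
        return scores;
    }

    public static void main(String[] args) {
        System.out.println(buildScores());
    }
}
```

The commented-out line is the kind of slip the repetition is meant to catch: the declared type and the constructed type do not match, so the program does not compile.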

Really strong type systems like those Haskell and OCaml use provide extremely strong compile time guarantees. This means that if the compiler accepts your program, you will never see any runtime errors from the type system. This allows these compilers to generate very efficient code, because they know more about the state of the application at most points in time, compared to the compiler for Java, which knows some things, but not nearly as much as Haskell or OCaml.

The downside of really strong type systems is that they disallow some extremely common expressions – these are things you can intuitively imagine, but they can’t be expressed within the constraints of such a type system. One solution to these problems is to add higher kinds, but these have a tendency to create more complexity and also suffer from some of the same problems.

So, we have three categories of languages here. The strongly statically checked ones, like Haskell. The weakly statically checked ones, like Java. And the dynamically checked ones, like Ruby. The way I look at these, they are good at very different things. They don’t even compete in the same leagues. And comparing them is not really a valid point of reasoning. The one thing that I am totally sure of is that we need better tools. And the most important tool in my book is the language. It’s interesting, many Java programmers talk so much about tools, but they never seem to think about their language as a tool. For me, the language is what shapes my thinking, and thus it’s definitely much more important than which editor I’m using.

I think Cedric has a point in that dynamic language tool support will never be as good as that for statically typed languages – at least not when you’re defining “good” to be the things that current Java tools are good at. Steve thinks that the tools will be just as good, but different. I’m not sure. To a degree I know that no tool can ever be completely safe and complete, as long as the language includes things like external configuration, reflection and so on. There is no way for a tool to cover all the dynamic aspects of Java, but using the common mainstream parts of the language will give you most of what the tools offer. As always this is a tradeoff. You might get better IDE support for Java right now, but you will be able to express things in Ruby that you just can’t express in Java because the abstractions will become too large.
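
The reflection point can be made concrete with a minimal sketch. The class and method names here are hypothetical, and the assumption is that the strings would come from something like external configuration in a real system. A tool that analyses the source statically has no reliable way to know that greet is ever called, so a rename refactoring or a "find usages" search can miss it.

```java
import java.lang.reflect.Method;

public class ReflectiveCall {
    public static String greet() {
        return "hello";
    }

    // Looks up and invokes a public static no-argument method by name.
    static Object callByName(String className, String methodName) {
        try {
            Class<?> cls = Class.forName(className);
            Method method = cls.getMethod(methodName);
            return method.invoke(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("lookup failed: " + className + "." + methodName, e);
        }
    }

    public static void main(String[] args) {
        // In a real system these strings might come from a configuration file.
        String className = ReflectiveCall.class.getName();
        System.out.println(callByName(className, "greet"));
    }
}
```

Nothing in the call site names greet as an identifier; the link between caller and callee exists only as a string at run time.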

This is the point where I’m going to do a copout. These discussions are good, to the degree that we are working on improving our languages (our tools). But there is a fuzzy line in these discussions, where you end up comparing apples and oranges. These languages are all useful, for different things. A good programmer uses his common sense to provide the best value possible. That includes choosing the best language for the job. If Ruby allows you to provide functionality five times faster than the equivalent functionality with Java, you need to think about whether this is acceptable or not. On the one hand, Java has IDEs that make maintainability easier, but with the Ruby codebase you will end up maintaining a fifth of the size of the Java codebase. Is that tradeoff acceptable? In some cases yes, in some cases no.

In many cases the best solution is a hybrid one. There is a reason that Google allows more than one language (C++, Java, Python and JavaScript). This is because the languages are good at different things. They have different characteristics, and you can get a synergistic effect by combining them. A polyglot system can be greater than the sum of its parts.

I guess that’s the message of this post. Compare languages, understand your most important tools. Have several different tools for different tasks, and understand the failings of your current tools. Reason about these failings in comparison to the tasks they should do well, instead of just comparing languages to languages.

Be good polyglot programmers. The world will not have a new big language again, and you need to rewire your head to work in this environment.


13 Comments

  1. Bob Aman

    A minor point. OCaml can’t guarantee type safety if you unmarshal something. It just assumes that you know what you’re doing and that you’re treating some deserialized value as the correct type. If you make a mistake, you get a segfault. Generally in OCaml, it’s a good idea to manually specify the type of values that are being deserialized so that the compiler can more reliably infer the types.

    May 15th, 2008

  2. Bill Shirley

    I miss Objective-C.

    (Not that it’s gone, but that I’m not using it.)

    May 15th, 2008

  3. Philip Schwarz

    Hi Ola,

    you said:

    “Java’s type system is not very strong, and not very static…

    So, we have three categories of languages here. The strongly statically checked ones, like Haskell. The weakly statically checked ones, like Java. And the dynamically checked ones, like Ruby.”

    Can you clarify (for me) your labelling of Java’s typing system in the context of the following definitions from On Understanding Types, Data Abstraction, and Polymorphism (1985) Luca Cardelli, Peter Wegner:

    Programming languages in which the type of every expression can be determined by static program analysis are said to be statically typed.

    Static typing is a useful property, but the requirement that all variables and expressions are bound to a type at compile time is sometimes too restrictive. It may be replaced by the weaker requirement that all expressions are guaranteed to be type-consistent although the type itself may be statically unknown; this can be generally done by introducing some run-time type checking.

    Languages in which all expressions are type-consistent are called strongly typed languages. If a language is strongly typed its compiler can guarantee that the programs it accepts will execute without type errors.

    In general, we should strive for strong typing, and adopt static typing whenever possible.

    Note that every statically typed language is strongly typed but the converse is not necessarily true.

    Static typing allows type inconsistencies to be discovered at compile time and guarantees that executed programs are type-consistent. It facilitates early detection of type errors and allows greater execution-time efficiency. It enforces a programming discipline on the programmer that makes programs more structured and easier to read.

    Thanks.

    Philip Schwarz.

    May 16th, 2008

  4. Christophe

    Philip,

    I am not Ola, but I can tell you readily that your book is wrong on many points.

    “Note that every statically typed language is strongly typed but the converse is not necessarily true.”

    That’s wrong. Statically typed languages can easily be weak, due in particular to one property: typecasting. Because of casting, even when your static analysis tells you that your surface types match correctly, you are never sure that your underlying types are correct as well, since they can easily be cast to other types. C and Java both allow typecasting, and are both statically typed languages, so they are weak statically typed languages. Haskell, OCaml and Ruby don’t allow such casting, so they are strong languages (the first two being static, while the last one being dynamic).

    “In general, we should strive for strong typing, and adopt static typing whenever possible.”

    Modern practice has proven this piece of advice to be misleading. Strong typing is good to have, and I do think indeed that you should strive to have it. Static typing is something completely different, whose usefulness (at least without the presence of type inference to shorten the written expressions) isn’t unquestioned.

    May 16th, 2008

  5. Greg

    “The downside of really strong type systems is that they disallow some extremely common expressions”

    Eh? An example? My experience has been that this is very rare with the sophisticated type systems we have these days, and it’s almost always because I’ve made a mistake with my design. (sometimes even though that behaviour was possible, it wasn’t desirable, hence the type-checker not being extended to facilitate it) To the point where it seems likely that any exceptions are mistakes I don’t yet realise I made.

    May 20th, 2008

  6. Avdi

    What I find ironic is that both SteveY and Cedric almost seem to be in agreement on the idea that picking one language and sticking to it is ultimately a good idea, no matter how it may chafe. Although I wonder if Yegge will stick to that conclusion if and when he leaves Google.

    Frankly, Cedric lost me after he started complaining about pattern matching in Scala. Advocating sophisticated type systems in languages like Haskell and Scala is one thing; but advocating Java as your paragon of static typing is like trying to introduce people to authentic Mexican food by taking them to Taco Bell.

    May 21st, 2008

  7. Curt Sampson

    I liked Avdi’s last sentence in his comment.

    After four years of Java and four years of Ruby after that (all full-time programming), switching to Haskell made me notice that Java has one interesting property: it’s not possible to create a type that does not include null amongst its set of values. It turns out that this was the source of a lot of pain.

    In one of the language continuums, I’d place Java at one end, Ruby in the middle, and Haskell at the other.

    May 29th, 2008

  8. Philip Schwarz

    you said: “Java’s type system is not very strong, and not very static…”

    Here is how Venkat Subramaniam classifies Java (in Programming Groovy):

    STRONG
    |
    |
    Ruby | Java
    /Groovy | /C++
    |
    |
    DYNAMIC——–|——-STATIC
    |
    |
    JavaScript| C/C++
    /Perl |
    |
    WEAK

    June 3rd, 2008

  9. Anonymous

    I’ll try again…let’s see if this displays OK
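    .............................
    ............STRONG...........
    ..............|..............
    ..............|..............
    .......Ruby...|..Java........
    ......./Groovy|../C++........
    ..............|..............
    ..............|..............
    .DYNAMIC------|------STATIC..
    ..............|..............
    ..............|..............
    ....JavaScript|..C/C++.......
    ......../Perl.|..............
    ..............|..............
    ............WEAK.............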

    June 3rd, 2008

  10. Philip Schwarz

    Doh! typing mistake: The C++ in the top right quarter should of course be C#.

    June 3rd, 2008

  11. Ahmet Yetgin

    You may find this funny, but I believe that from the language (not libraries) point of view JavaScript has the best usability, flexibility, readability and maintainability of them all. I am not talking about the mess of client-side (web browser) programming here, so don’t be harsh on me.

    I have used all four parts of the diagram above, and this is just a personal opinion :)

    With the right tools, and support; OO and functional programming compatibility; in a world where marketing and buzzwords and programming heroes mean nothing; I believe that (kind of) option would have higher chance to survive XD

    June 6th, 2008

  12. dterror

    From what I can understand, polyglot programming is all over the place, say on a LAMP web application where you’d normally have even more APIs other than LAMP, like memcached for example. Also, a more sophisticated example would be Google’s AppEngine, which gives the user a ‘domain layer’ in the form of a sandboxed Python VM and exposes lots of useful APIs for things like the Picasa image processing engine, Google Account authentication and others.

    But this kind of polyglotism doesn’t interest me very much (maybe AppEngine), and what I think could be a great nail for polyglot programming would be managed runtimes, like all languages running on top of the JVM. Now this is the part I don’t get very well: how would that be better than, say, interfacing with an RDBMS through a language binding? I saw Bini’s example of interfacing Ruby and Erlang, but that wasn’t exactly in the same runtime and I suspect there’d be an overhead in having both languages go through Java. I might be totally wrong here, this is not a statement, it’s a question.

    Furthermore, is it possible to, say, create a class in JRuby and subclass it in Jython? I really don’t know, but this would give polyglot programming a new meaning to me.

    December 15th, 2008

  13. DTERROR: Maybe F-Script (http://www.fscript.org) is a good example of what you’re interested in. You create classes in Objective-C and manipulate them in F-Script. Typically you only manipulate instances, but IIRC in the latest version you can also subclass etc.

    December 26th, 2008
