Languages of the future


Martin Fowler writes about one language, and neatly encapsulates what I think about the subject in the way I would have written it, if I were actually a good writer. Go read.



Introducing TIJuAVA – Java with Type Inference


Every time I’ve written Java code lately, I’ve been painfully aware of how much unnecessary code I end up writing. Most of this is Java’s fault. This blog post is a very small thought experiment. TIJuAVA does not exist as software. Yet. If I someday have the time I would love to implement it, but there are more pressing needs right now.

So, what are the rules? Basically, all valid Java programs are valid TIJuAVA programs. Some valid TIJuAVA programs are not valid Java programs. Simply put, the main difference is that you don’t need to declare a type for any local variables or member variables. Type declarations are only necessary in method declarations. You can declare types for local variables and member variables if you want to, and in certain very unlikely circumstances you will need to.

Let’s take a very simple example. This code is taken from the JRuby source code, but I have added one or two things to make it easier to showcase:

package org.jruby.util.collections;

import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;

public class IdentitySet {
    private items = new ArrayList();

    public void add(Object item) {
        items.add(item);
    }

    public void remove(Object item) {
        iter = items.iterator();
        while (iter.hasNext()) {
            storedItem = iter.next();
            if (item == storedItem) {
                iter.remove();
            }
        }
    }

    public boolean contains(Object item) {
        iter = items.iterator();
        while (iter.hasNext()) {
            storedItem = iter.next();
            if (item == storedItem) {
                return true;
            }
        }
        return false;
    }

    private Collection getItems() {
        return items;
    }

    private void something(java.util.AbstractSet inp) {
        val1 = inp;
        for (iter = val1.iterator(); iter.hasNext();) {
            System.err.println(iter.next());
        }
    }
}

This code doesn’t really show all that can be done with this approach, and if I were to show a real example, this blog would be unbearably filled with code. So, this is just a tidbit.
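For comparison, here is a rough sketch of the plain Java that the TIJuAVA example might correspond to once the types are filled in. The chosen types (ArrayList, Iterator, Object, AbstractSet) are an assumption about what an inference pass would pick from the initializers, return types and method signatures. The generic parameters and the missing package line are additions that only keep the sketch self-contained and warning-free.

```java
import java.util.AbstractSet;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;

// Plain-Java counterpart of the TIJuAVA example. The declared types are
// assumptions about what an inference pass would choose.
class IdentitySet {
    // From the initializer: new ArrayList()
    private ArrayList<Object> items = new ArrayList<Object>();

    public void add(Object item) {
        items.add(item);
    }

    public void remove(Object item) {
        // From the return type of ArrayList.iterator()
        Iterator<Object> iter = items.iterator();
        while (iter.hasNext()) {
            // From the return type of Iterator.next()
            Object storedItem = iter.next();
            if (item == storedItem) {
                iter.remove();
            }
        }
    }

    public boolean contains(Object item) {
        Iterator<Object> iter = items.iterator();
        while (iter.hasNext()) {
            Object storedItem = iter.next();
            if (item == storedItem) {
                return true;
            }
        }
        return false;
    }

    private Collection<Object> getItems() {
        return items;
    }

    private void something(AbstractSet<?> inp) {
        // From the parameter type in the method signature
        AbstractSet<?> val1 = inp;
        for (Iterator<?> iter = val1.iterator(); iter.hasNext();) {
            System.err.println(iter.next());
        }
    }
}
```

Every type added here repeats something already stated elsewhere in the class, which is the redundancy the thought experiment is aimed at.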

The TIJuAVA system would need to be implemented as a two-pass Java compiler. Basically, the first pass finds all variable names that need to have a type inferred, and then walks through the information it’s got, based on method signatures and the methods called on the variable. In almost all cases it will be possible to come to one conclusion on which type to use. The compiler would then generate regular Java bytecode, basically the same bytecode that would have been generated had you written the types by hand.
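The inference step can be sketched as a toy, under heavy assumptions. The ToyInference class, its Assignment type and the hand-written RETURN_TYPES table are all hypothetical, and the sketch only looks at initializers and return types, not at the methods later called on a variable. It collects the variables first, then repeats a resolution step until no new type can be found.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of the inference step; not a real compiler pass.
class ToyInference {
    // Hypothetical description of `variable = new Name()` (receiver == null)
    // or `variable = receiver.name()`.
    static final class Assignment {
        final String variable;
        final String receiver;
        final String name;

        Assignment(String variable, String receiver, String name) {
            this.variable = variable;
            this.receiver = receiver;
            this.name = name;
        }
    }

    // Hand-written signature table: "DeclaringType.method" -> return type.
    static final Map<String, String> RETURN_TYPES = new LinkedHashMap<String, String>();
    static {
        RETURN_TYPES.put("ArrayList.iterator", "Iterator");
        RETURN_TYPES.put("Iterator.next", "Object");
    }

    static Map<String, String> infer(List<Assignment> assignments) {
        Map<String, String> types = new LinkedHashMap<String, String>();
        // Pass 1: collect every variable that needs a type.
        for (Assignment a : assignments) {
            types.put(a.variable, null);
        }
        // Pass 2: resolve types, repeating until nothing new is learned.
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Assignment a : assignments) {
                if (types.get(a.variable) != null) {
                    continue; // already resolved
                }
                String inferred;
                if (a.receiver == null) {
                    inferred = a.name; // constructor call names the type
                } else {
                    String receiverType = types.get(a.receiver);
                    inferred = receiverType == null
                            ? null
                            : RETURN_TYPES.get(receiverType + "." + a.name);
                }
                if (inferred != null) {
                    types.put(a.variable, inferred);
                    changed = true;
                }
            }
        }
        return types; // unresolved variables keep a null type
    }

    public static void main(String[] args) {
        // Deliberately out of order, to show that resolution does not
        // depend on the order of the assignments.
        System.out.println(infer(Arrays.asList(
                new Assignment("storedItem", "iter", "next"),
                new Assignment("iter", "items", "iterator"),
                new Assignment("items", null, "ArrayList"))));
    }
}
```

A variable whose type stays null after the loop would be one of the unlikely cases where an explicit declaration is still needed.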

Of course, most people use IDEs to write code nowadays. Wizards and code generators and what not. So why something like this? Well, even though your IDE writes your code for you, it is still there, and you still have to understand it at some level. If not when writing, you would still need to read it. And boy do type declarations clutter things. Especially generics. And here is one interesting tidbit. Generic types would also be possible to infer in most cases.
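To make the generics clutter concrete, here is a small plain-Java sketch; the GenericClutter class and its method are made up for illustration. Every local declaration in the method body is fully determined by the method signature, the initializers and the return types of the calls, which is what would make these types candidates for inference.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

class GenericClutter {
    // Groups the lengths of the given words by their first character.
    // Only this signature would still need types under the proposed rules.
    static Map<Character, List<Integer>> lengthsByInitial(List<String> words) {
        // Type repeated twice, although the return type already states it.
        Map<Character, List<Integer>> result = new HashMap<Character, List<Integer>>();
        // Iterator<String> follows from List<String>.iterator().
        for (Iterator<String> iter = words.iterator(); iter.hasNext();) {
            // String follows from Iterator<String>.next().
            String word = iter.next();
            if (word.isEmpty()) {
                continue; // no first character to group by
            }
            Character initial = word.charAt(0);
            // List<Integer> follows from Map<Character, List<Integer>>.get().
            List<Integer> lengths = result.get(initial);
            if (lengths == null) {
                lengths = new ArrayList<Integer>();
                result.put(initial, lengths);
            }
            lengths.add(word.length());
        }
        return result;
    }
}
```

In the hypothetical TIJuAVA form, the declarations of result, iter, word, initial and lengths would shrink to bare assignments.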

Another thing that could be easily added is some kind of in-place literal syntax for lists and maps. This would be more like a macro feature, but the list syntax would mostly just be a call to Arrays.asList, which isn’t too bad.
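As a minimal sketch of what such literals might expand to: the Literals class is hypothetical, and so is the literal syntax mentioned in the comments, since the post does not define one. The list case is a single call to Arrays.asList; the map case is written here as a step-by-step expansion.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Literals {
    // What a list literal such as [1, 2, 3] (hypothetical syntax)
    // might expand to.
    static List<Integer> listLiteral() {
        return Arrays.asList(1, 2, 3);
    }

    // What a map literal such as {"one": 1, "two": 2} (hypothetical syntax)
    // might expand to; this sketch builds the map one entry at a time.
    static Map<String, Integer> mapLiteral() {
        Map<String, Integer> map = new HashMap<String, Integer>();
        map.put("one", 1);
        map.put("two", 2);
        return map;
    }
}
```

One caveat with this expansion: Arrays.asList returns a fixed-size list backed by an array, so calling add or remove on the result throws UnsupportedOperationException. A literal meant to be growable would need to copy into an ArrayList.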

An objection that I anticipate is from people who think that the code will be less readable once the type pointers are removed. This should be more of a problem when you have large methods, but everyone these days uses refactorings, so they won’t have methods with a LOC over 20. And if that’s the case, the local variables should be easily understood from the operations that are used on them.

So. Someday, when I have time, this may be reality. If anyone is interested, that is.



The Dark Ages of programming languages


We seem to be living in the dark ages of programming languages. I’m not saying this to bash everything; I’m actually being totally objective right now. Obviously, our situation right now is much better than it was 10 years ago. Or even 5 years ago. I would actually say that it’s really much better now than 1 year ago. But programming is still way too painful in almost all cases. We are doing so much stuff by hand that obviously should be done by computer.

I spend quite a lot of time learning new languages now and then, to try to find something that’s really good for me. So far, the best contestants are Ruby, Erlang, OCaml and Lisp, but all of those have their share of problems too. They just suck less than the alternatives.

  • Ruby… I really like Ruby. Ruby is such an improvement that I really want to do almost everything in it nowadays. I think in Ruby half the time and in Lisp the other half. But it’s not enough. It is still clunky. I want tail calls. I want real macros. I want blazing speed and complete integration with good libraries for everything and more. I’m just a sucker for power, and I want more of it in Ruby.
  • Erlang and OCaml. These languages are really great. For specific applications. Specifically, Erlang is totally superior for concurrent programming. And OCaml is incredibly fast, very typesafe and has great GUI libraries. So, if I was asked to do something massively concurrent I would probably choose Erlang, and OCaml if it was GUI programming. But otherwise… Well, Erlang does have some neat functional properties, but not any nice macro support. It doesn’t have a central code repository and many other things you expect from a general purpose language. OCaml suffers from the same things.
  • Lisp is the love of my life. But as so many people before me have noted, all the implementations are bad in some way or another. Scheme is lovely; for research. Common Lisp is so powerful, but it needs users. Lots of them, creating libraries for every little data format there can be, creating competing implementations of particularly important APIs; like databases.

Conclusion. Nothing is good enough, right now. I see two paths ahead. Two ways that could actually end in the “100-year language”.

The first path is one new language. This language will be based on all the best features of all current languages, plus a good amount of research output. I have a small list of what this language would need to be successful as the next big one:

  • It needs to be multiparadigm. I’m not saying it can’t choose one paradigm as the base, but it should be possible to program in it in a functional, object-oriented, aspect-oriented or imperative style. It should be possible to build a declarative library so you can do logic programming without leaving the language.
  • It should have static type inference where possible. It should also allow optional type hints. This is so important for creating great implementations. It can also increase readability in some cases.
  • It needs all the trappings of functional languages; closures, first-class functions and lambdas. This is essential, to avoid locking the language into an evolutionary corner.
  • It needs garbage collection. Possibly several competing GC implementations, running evolutionary algorithms to find out which one is best suited for long running processes of the program in question.
  • A JIT VM. It seems almost a given right now that Virtual Machines are a big win. They can also be made incredibly fast.
  • Another JIT VM.
  • A non-VM implementation. Having several competing implementations for different purposes is important to allow competition and experimentation with new implementation features.
  • Great integration with legacy languages (Java, Ruby (note, I’m counting on all Rubyists moving to this new language when it gets out, making Ruby legacy), Cobol). This is obvious. There are too many things lying around, bitrotting, that we will never get rid of.
  • The language and at least one production quality implementation needs to be totally open-source. No lock-in of the language should be possible.
  • Likewise, good company support is essential. A language needs money to be developed.
  • A centralized code/library repository. This is one of Java’s biggest failings. Installing a new library in Java is painful. We need something like CPAN, ASDF, RubyGems.
  • The language needs great, small and very orthogonal libraries. The libraries included with the language need to be great, since they have to be small but still pack all the most needed punch.
  • Concurrency must be a breeze. There should be facilities in the language itself for making this obvious. (Like Erlang or Gambit Scheme).
  • It should be natural to do meta-programming in it (in the manner of Ruby).
  • It should be natural to solve problems bottom-up, by implementing DSLs inside or outside the language.
  • The language needs a powerful macro facility that isn’t too hard to use.
  • Importantly, for the macro facility, the language needs to have a well-defined syntax tree of the simplest possible kind, but it also needs to have optional syntax.

So, that’s what I deem necessary (but maybe not sufficient) for a really useful, good, long term programming language. When I read this list, it doesn’t seem that probable that this language will show up any time soon, though. Actually, it seems kinda unrealistic.

So maybe the other way ahead is the right one? The other way I envision is that languages become easier and easier to create, and languages have their strengths in different places. Along this path I envision the descendants of Ruby and Erlang exploiting what they’re good at and eschewing everything else. But for this strategy to work, the first thing implemented in each language needs to be a seamless way to integrate with other languages. Maybe there will come an extremely good glue language (not like Perl or Ruby, but a language that will only serve as glue between programming languages), and all languages will implement good support for that language. For example, you could code a base concurrent framework in Erlang, which uses G (the glue language) to implement some enterprise functionality in Java sandboxes, and in some places Ruby, through G, would implement a DSL, with subparts where Ruby uses G to run Prolog knowledge engines.

If I had to choose between the two futures, I am frankly more inclined towards the one-language one. But the multi-language way seems much more probable. And since I’m trying to choose a way now, I’m placing my bets on the second option. We are not ready to implement G yet, but I do think that as many p-language techs as possible should do their best to learn how languages can cooperate in different ways, to prepare for this project.