Should languages be multi-lingual?


I’m currently sitting in the Beijing ThoughtWorks office, and for some reason language is on my mind… =)

One of the discussions related to DDD that have turned up several times the last few months at conferences
is how you handle ubiquitous language when your domain is not in English. Since most programming languages are based on English, you end up mixing English and Swedish for example, if you are working with a Swedish domain. Of course, the benefits of working with these concepts in Swedish are very hard to argue against. But the dichotomy between the programming language and the domain language is definitely something that hurts my eyes, so I’m generally not very fond of that approach.

In fact, I haven’t heard anyone come up with a good solution to this problem, and this post is not really a solution either.

One of the things I’ve proposed to make this situation better is to create an external DSL that is fully in the domain language. The implementation of that DSL can then be implemented in English. The main benefit is that there is a clear separation.between the domain language and the programming language. On the other hand, the overhead of creating the DSL and also the complexities involved in translating the domain concepts into programming language concepts can become problematic too.

One interesting idea in Cucumber is the idea that you can easily add new natural languages to write the features in. When it comes to user stories at the level of testing that Cucumber provides, it’s really important to use the right language. So it got me thinking, could you use the same kind of approach in a general programming language too?

As an experiment I took a small example program for Ioke, and translated it into Mandarin, with simplified Chinese characters. Of course I used Google Translate for this, so the translation is probably not very good, but the end result is still interesting. I’m not going to try to get this into my blog, so take a look at the file at github instead: http://github.com/olabini/ioke/blob/master/examples/chinese/account.ik. As you can see there is nothing in there that even reeks of English. If you don’t understand Chinese characters it is probably hard to see what’s happening here. Basically an Account object is created, with a “transfer” method and a “print” method. Further down, two instances of this Account object is created, some transfers are made, and then the objects are printed. But provided my translation is not too crappy, this code should make sense to someone reading Chinese.

Now, this is actually extremely simple to implement in Ioke, since it relies on several of the features Ioke handles very easily. That everything is a message really helps, and having everything be first class means I can alias methods and things like that without any worry. Obviously your language also need to handle non-ascii identifiers correctly, but that should be standard in this day and age.

When thinking about it, something similar to do this can be created in languages like Lisp, Smalltalk, Factor, Io and Haskell – but most other languages would struggle. If you have keywords in your language, it’s really a killer – you would need to branch your parser to make it happen.

Of course, this approach only works when you can simply translate from one word to another. If the writing system is right to left, or top to bottom, it’s much more tricky to create a good translation.

I’m also not sure if this is actually a really good idea or not. It might be. The other thing I’ve been thinking about is how to handle multilingual editing. What if you want to be able to switch back and forth between languages? How can you handle identifiers with more than one name. Would you want to?

Lots of unanswered questions here. But it’s still funny to think about. Communication is the main goal, as usual.



Communication over Implementation


Last week I wrote a post about some of the statements that percolate in my mind when designing Ioke. What I didn’t mention was that these ideas are things I use to judge other programming languages too. I would say that this philosophy pretty much captures my views on programming languages. (The post in question is here: The Ioke Philosophy). So, this post is of course not complete, and I don’t think I would ever be able to write something that is totally complete.

One of the things missing – and I did allude to it in the post – was a statement that has grown on me a bit. I did a presentation about Ioke last week, and at that point I decided I needed to talk about this some more. The statement in question is what I call Communication over Implementation. This turns out to be pretty important for programming languages in general, at least in my experience.

One of the things I’m fond of saying when talking about programming languages, is that programming languages – just like natural languages – are about communication. And we don’t necessarily always think clearly about who we are communicating with. The immediate and intuitive reaction to programming languages is that they are supposed to communicate with the compiler/interpreter/cpu. That is of course true, but it is also incidental in many cases. There are many ways in which you can communicate with the machine to get it to achieve something. So the question becomes what other parties should you consider when communicating.

The next most obvious party would be yourself. If you ever need to read your code, you need to write code in such a way that you can read it later on. This constrains the way you write code quite severely. There are reasons we don’t write much code in assembly language or JVM bytecodes anymore. Yes, at some level these descriptions are extremely nice communication towards the executing machine, but they are so bad at communicating with human stakeholders that the balance generally ends up in favor of more readable languages.

When communicating with human stakeholders the thing I focus most on is intent. If your code/text communicates the intent of what you are trying to do in a good way, this makes it easier to read. There are many movements that focus on how to do this well, where domain domain design and clean code are the two that immediately comes to mind for me.

So coming back to the title. For me, implementation is a special sort of communication – that kind of communication that is supposed to be functional and describe what should actually be done in deterministic instructions to a machine. As long as this communication is functional enough – meaning that the machine does more or less the right thing – there is much leeway in how the code can be written to make other kinds of communication easier. And that is the core of this argument. A language should make it easy to communicate with other stakeholders than the machine, since those other forms of communication with code is actually much more important than only the implementation pieces. Yes, if the implementation works your program might run for a while – but if no one can read the code it can’t be maintained, it can’t be understood except in a black box way, and the utility of the system will be limited.

Go the other way. If you have a program that communicates badly with the machine (but still well enough according to the above definition), but it is written in a clearly communicating way, this means that it is easier to grow the system, it is easier to fix it or implement it more correctly. It can also easier be replaced since the program communicates what it is doing.

There are exceptions to this principle. But we seem to to favor languages that are focused on implementation and only incidentally on communication. This is the wrong choice and it need to be fixed. Communication is at the core of programming, and should also be the focus of it.