Simplified finders in DataMapper


It seems I’m spending lots of time reading about DataMapper currently, while planning the first refactoring of Ribs. While reading I keep seeing Sam’s examples of simplified finders compared to the ActiveRecord versions. A typical example of the kind of thing I’m talking about is something like this with ActiveRecord:

Exibition.find(:all, :conditions => ["run_time > ? AND run_time < ?", 2, 5])

Which with DataMapper would be:

Exibition.all(:run_time.gt => 2, :run_time.lt => 5)

Oh, and yeah, the typo is directly from the DataMapper documentation. So what’s wrong here? Well, in most cases it probably does exactly what you want it to. And that’s the problem in itself. If you use these simplified finders with more than one argument, you will get subpar SQL queries in some cases. The conditions are passed as a Hash, and Ruby 1.8 does not guarantee the iteration order of a Hash. Simply put, if you don’t control the order of the clauses separated by AND, you might get queries that perform substantially worse than they should. Of course, that doesn’t happen often, but it’s important to keep it in mind.
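
To make the ordering concern concrete, here is a minimal sketch of a finder helper that takes its conditions as an ordered array and emits the AND clauses in exactly the order given. This is not DataMapper's implementation: the `Condition` struct, the `OPERATORS` table and the `where_clause` helper are hypothetical names invented for this illustration.

```ruby
# Hypothetical condition record: field name, operator key, and bound value.
Condition = Struct.new(:field, :operator, :value)

# Maps operator keys to SQL comparison operators (illustrative subset only).
OPERATORS = { :gt => ">", :lt => "<", :eq => "=" }.freeze

# Builds a parameterized WHERE fragment, preserving the caller's clause order.
# Returns the SQL fragment and the bind values as a two-element array.
def where_clause(conditions)
  fragments = conditions.map do |c|
    sql_op = OPERATORS.fetch(c.operator) # raises KeyError on unknown operators
    "#{c.field} #{sql_op} ?"
  end
  [fragments.join(" AND "), conditions.map(&:value)]
end

sql, binds = where_clause([
  Condition.new(:run_time, :gt, 2),
  Condition.new(:run_time, :lt, 5)
])
puts sql            # run_time > ? AND run_time < ?
puts binds.inspect  # [2, 5]
```

Because the input is an Array, the caller decides which clause comes first, regardless of how the Ruby version in use orders Hash entries.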

And incidentally, I really do hate the methods added to Symbol.
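
For readers who have not seen how such a finder is wired up, the following is a rough sketch of the kind of Symbol extension I mean. It is not DataMapper's code: the `SketchOperator` struct and the `gt`/`lt` definitions are assumptions made purely for illustration. It shows why the methods end up visible on every Symbol in the process, not only on the ones passed to a finder.

```ruby
# Hypothetical value object returned by the Symbol extensions below.
SketchOperator = Struct.new(:target, :operator)

# Reopening Symbol makes these methods available on every symbol,
# everywhere in the program, not just inside finder calls.
class Symbol
  def gt
    SketchOperator.new(self, :gt)
  end

  def lt
    SketchOperator.new(self, :lt)
  end
end

# The hash-style finder call then receives operator objects as hash keys.
conditions = { :run_time.gt => 2, :run_time.lt => 5 }
conditions.each do |op, value|
  puts "#{op.target} #{op.operator} #{value}"
end

# Unrelated symbols now respond to the same methods.
puts :anything.respond_to?(:gt)
```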


22 Comments

  1. I don’t see the typo you mention?

    Either way, I agree. Conditions order is very important. Which is why we’re changing over to an Array. This example would be written as:

    Exhibition.all(:run_time.gt(2), :run_time.lt(5))

    I’m not against getting rid of the Symbol extensions, but I’m wondering what you might have in mind?

    Exhibition.all(DM::Operator.new(:runtime, :gt, 2), DM::Operator.new(:runtime, :lt, 5))

    That might satisfy some, but it ranks pretty low on the “beauty” gauge. And that really does matter for adoption…

    September 8th, 2008

  2. Erik Hetzner

    Why do people continue to mangle the Ruby syntax to create horrible little “DSL”s? Maybe it looks good in a blog post to somebody who doesn’t understand Ruby, but it is meaningless. Furthermore, you are hacking the semantics of Ruby to make the syntax work the way that you want. This will come back to bite you.

    If you understand Ruby, you see:

    Exhibition.all(:run_time.gt(2), :run_time.lt(5))

    What does this mean? It is a method, with two arguments. The args are the results of evaluating the method gt on the symbol object :run_time. Admittedly this is better than the alternative syntax that Ola first mentioned. Why does the symbol class have these new methods? No reason, except to help out DataMapper. So the DataMapper clutters up the core language just so it can have sexy syntax? What do these methods return? Who knows. Presumably some new object.

    Look, I’m sorry that you are not using Lisp. It kind of sucks that you are stuck with the Ruby syntax and cannot create macros to have new syntax. But that is how it is. Ruby syntax is what it is, and you are stuck with it.

    Your ‘bad’ example is really not that bad. Imagine how it could be if we used class methods on a class that we include as Op:

    Exhibition.all(Op.gt(:run_time, 2), Op.lt(:run_time, 5))

    Ok, it is a bit wordier. But we know what it means! And it can be composed:

    Exhibition.all(Op.or(Op.gt(:run_time, 5), Op.lt(:run_time,2)))

    We start needing special purpose symbols?

    Exhibition.all(:or, :run_time.lt(2), :run_time.gt(5))

    What about functions?

    Exhibition.all(:date.eq(:curdate)) ?

    Compare: Exhibition.all(Op.eq(:date, Func.new('curdate')))

    And what about all the other comparators?

    Exhibition.all(:run_time.eq(1), :text.like('%this%'), :a.gte(1), :b.isnotnull)

    Now you’ve really cluttered up symbol.

    Why not expose to the users what you are doing, which (presumably) is creating a tree of conditionals which will be transformed into a SQL statement? If you expose this, users could begin doing things like generating the trees programmatically, reusing them, composing them, extending them, etc? Hiding what happens behind magic symbol methods means that only the most dedicated of users will ever look into what is actually happening. And you are going to run into places where what can be expressed in the underlying semantics cannot be expressed in your hacked syntax without further hacks.

    September 8th, 2008

  3. @Sam: Check the spelling of “Exibition”.

    September 8th, 2008

  4. @David: Ah, got it. Thanks.

    @Erik: Op and Func are much more likely to collide I think. Which is really what it all boils down to. And yeah, you could create the DataMapper::Query::Operator instances directly. The symbols are just short-cuts to that.

    Which matters.

    I can’t emphasize that enough. It may suck. It may violate good programming. But it’s reality. Ignore it and watch your project sail down the drain. People coming to Ruby and Rails are doing so because of the syntax. You can’t over-estimate the impact that has.

    So… what to do? Zoo.all(Zoo.name.eq("Bob")) is the cleanest non-Symbol variant I’ve been able to imagine. And it’s OK. But still… believe me, I’m a lot more 1RR, DI, SoC, blah blah blah than most, and you still won’t find me advocating for killing a syntax convenience (Symbol extensions) that in over a year has *never* been reported to have conflicted with another library just because “it’s bad”.

    So like I said, I’m open to alternatives… but it’s got to be nice. It’s got to make sense. And ditching Symbols ain’t gonna happen before 1.0 because it would break too many projects (though it may move into an optional plugin to give people a migration path).

    September 9th, 2008

  5. typo fixed on the datamapper site. Thanks for catching that!

    September 9th, 2008

  6. Erik Hetzner

    @Sam: Op & Func were just examples. They could be:

    DataMapper::SQL::Syntax::Base::Operation

    and users can define shortcuts:

    Op = DataMapper::SQL::Syntax::Base::Operation

    or whatever suits their fancy.

    I don’t mean to pick on DataMapper. In fact I’d never heard of DataMapper before this post. I do mean to pick on DSLs which twist the semantics to get a new kind of syntax.

    RoR actually has very few of these kinds of tricks, as far as I can tell. It largely uses Ruby syntax the way it was meant to be used. Yes, it adds some methods to String, etc. But these methods logically belong in String. Methods to help out various packages do not belong in the Symbol class. RoR does a lot of other magic, but nothing that I can recall in it radically violates what one would expect from Ruby syntax.

    It *is* bad, & I can guarantee you it will come back to bite you. You say that it is necessary to uptake, but even if this were true, it is not good for the long term of Ruby. It will drive people away in the end, as it will cause confusion as every project uses the syntax of Ruby in some different way, & this causes strange errors that are almost impossible to track down. This will not lead to satisfied Ruby users.

    September 9th, 2008

  7. I think the problem with these is always that you globally pollute the symbol space for very local benefits. I don’t know how exactly these macros work in Lisp, but I understand that you do create a global macro, no?

    I wonder why you don’t keep the extensions local to the place where they are really needed, i.e., the Ex(h)ibition class:

    Exhibition.all().with(:runtime).gt(2).and(:runtime).lt(5).find();

    Depending on your preference, you could also do “.with(:runtime, :gt, 5)”, or similar. This doesn’t pollute Symbol and it looks comparably nice. You can even skip the .find() call if you implement your own, lazily loaded enumeration.

    Interestingly, Java syntax can also come pretty close to that. Exchange the “:symbol” clauses to strings, and you can implement identical functionality in Java.

    I always like to look into the supported methods of an instance as a way of interactive programming in the shell, and this pollution of basic classes that does not make sense outside of a very narrow domain is really nasty. Plus, if everyone was to do this, you’d get bizarre effects, particularly on popular classes like Symbol and common method names like ‘gt’.

    September 9th, 2008

  8. Sam,

    Glad to hear condition order will be fixed, and it’s a good way of fixing it.

    I don’t have a specific example in mind for a better syntax for the symbol extensions, but you have gotten several in these comments. I definitely agree with the comments that the pollution of the Symbol namespace is not really nice. Or actually, I would be fine with it if it was something optional that you could require separately – while there exists some other way of doing it.

    Just providing a DataMapperOperations module that can be included into relevant classes/modules would be fine. This module can provide methods so you can do:

    Exhibition.all(gt(:run_time, 2), lt(:run_time, 5))

    Maybe it’s just my Lisp heritage that makes the prefix notation feel nice – but it means you get much more flexibility – as some people already mentioned – for adding stuff like aggregate functions and other things in the same syntax.

    The focus on adoption over solidness of the implementation is something I don’t agree with. In fact, I’ll write a blog post about that today. =)

    September 9th, 2008

  9. Sam again:
    I do understand your reasoning. And I totally believe that no one has reported any clashes. For me personally it’s as simple as this: I won’t use a library that pollutes the central classes with things that are as specific as these are. So that means I won’t get any clashes. And I think that I’m not the only one that reasons about it this way – so yeah, you drive adoption with a nice DSLish syntax, but that adoption may face maintainability issues further down the line. And you’re also turning away some users.

    I realize it’s there to stay. But as a middle ground I would really like to have it separate, so you can choose if you want it or not.

    September 9th, 2008

  10. Adam:

    Quick work! Good on you.

    September 9th, 2008

  11. Erik:

    Agreed about your points for Rails. Interestingly, for a project that uses tricks all over the place, it’s actually quite good at only polluting namespaces with suitable stuff.

    September 9th, 2008

  12. Martin:

    I like that style of finding much more. It looks quite a lot like Hibernate actually – with the difference that you can do without the parentheses.

    Exhibition.all.filter.run_time.gt(2).and.run_time.lt(5)

    Quite nice in my opinion.

    Macros in Common Lisp don’t need to be global. They can be part of packages, which gives you namespacing. And you can also define local macros much like you can define a local function.

    September 9th, 2008

  13. Anatoly Medvedkov

    @Sam:
    “Zoo.all(Zoo.name.eq("Bob")) is the cleanest non-Symbol variant I’ve been able to imagine.”

    Look at the Python Storm ORM; they have a pretty solution for this problem. Your example could look like this:

    Zoo.all(Zoo.name == "Bob")

    September 9th, 2008

  14. Anatoly:

    We could go with operators, yes. The “not” state is a minor nuisance, but Zoo.name.not == 'Bob' is OK.

    Martin:

    That syntax doesn’t allow for clean nesting. It would with some minor additions of course, but at the end of the day, I’m not a huge fan. Instead of fixed parameters, it allows for limitless mutations of the Query receiver. I’m having a hard time putting into words why I dislike that approach. It just feels wrong.

    Ola:

    I give. If I’m going to attract more .NET/Java developers (which is definitely my goal), I should put my money where my mouth is.

    I’ll go with the module to include like/eq/gt/lt/etc functions to generate a DataMapper::Query::UnboundCondition. The DataMapper::Query::Path (Zoo.exhibits.name.eq("Monkeys")) already works, so it should be a fairly slight modification.

    Symbolic operators will be an optional module.

    class Symbol
    include DataMapper::Symbol::Operators
    end

    The comment that really won me over was the “Software Engineering” comment. And yes, I’ve been frustrated by that as well…

    Sorry to be so defensive. Honestly I’m just tired of the endless refactors and want to ship this thing. ;-)

    September 9th, 2008

  15. What about something like this? Would it be feasible?

    Exhibition.all { run_time > 2 && run_time < 5 }

    Just a thought… I really haven’t considered whether this would work for DataMapper at all :)

    September 18th, 2008

  16. Looks like my comment was stripped of some characters as I was wrongly assuming that Textile was available. Here it is in full:

    What about something like this? Would it be feasible?

    Exhibition.all { run_time > 2 && run_time < 5 }

    Just a thought… I really haven’t considered whether this would work for DataMapper at all :)

    September 18th, 2008

  17. Gah – this blog commenting system doesn’t seem to like urls… sorry for the spam :/

    What about something like this? Would it be feasible?

    Exhibition.all { run_time > 2 && run_time < 5 }

    Just a thought… I really haven’t considered whether this would work for DataMapper at all :)

    September 18th, 2008

  18. Last attempt! :(

    “That’s similar to how I did a small Poisson library (which was really just a toy project). poisson dot rubyforge dot org”

    “The issue I did encounter with it is that "!=" is not a method. However, it will be a method in Ruby 1.9, and you could easily do something like "run_time.not > 2" for the time being.”

    September 18th, 2008

  19. Bill

    >
    > Exibition.find(… :conditions => [" … ", …])

    But… DataMapper *does* support this syntax! Look here:

    http://datamapper.org/doku.php?id=docs:finders

    Search that page for:

    zoos = Zoo.all(:conditions => ["id = ?", 34])

    > if you don’t control the order of the clauses
    > separated by AND, you might get queries
    > that perform substantially worse than they
    > should.

    BTW, starting with Ruby 1.9, hashes are ordered, so you won’t have this problem.

    January 20th, 2009

  20. Bill:

    Yes, I know DataMapper supports that syntax. That doesn’t make it less of a problem that they have the other syntax – and you can’t avoid this pollution of symbols.

    Right. And if I don’t want to upgrade to 1.9? Also, I must say that having a stable ordering for a hash always seemed a bit weird to me. Not sure why that decision was made.

    January 20th, 2009

  21. Bill:

    Simply put, the problem is that there are too many implicit things going on. That’s a general problem with many Ruby DSLs – only the things that don’t matter should be implicit, but when something matters … well, you don’t want it to be hidden.

    January 20th, 2009
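
The includable-module alternative proposed in the comments (prefix functions such as `gt(:run_time, 2)` that build an explicit condition tree) might be sketched as follows. Everything here is hypothetical: the `ConditionHelpers` module, the `Node` struct, the `any_of`/`all_of` names and the `to_sql` rendering are illustrative and not part of DataMapper. The point is only that nothing is added to Symbol and that conditions compose as ordinary values.

```ruby
# Illustrative subset of comparison operators and their SQL spelling.
COMPARISONS = { :gt => ">", :lt => "<", :eq => "=" }.freeze

# Hypothetical tree node: an operator plus its operands.
# Leaf nodes hold [field, value]; :and / :or nodes hold child nodes.
Node = Struct.new(:op, :operands) do
  # Renders the tree as a SQL fragment with inline values. There is no
  # escaping here; a real implementation would use bind parameters.
  def to_sql
    if COMPARISONS.key?(op)
      field, value = operands
      "#{field} #{COMPARISONS[op]} #{value}"
    else
      joiner = op == :or ? " OR " : " AND "
      "(" + operands.map(&:to_sql).join(joiner) + ")"
    end
  end
end

# Mixed into only the classes that want the helpers, so core classes stay untouched.
module ConditionHelpers
  def gt(field, value)
    Node.new(:gt, [field, value])
  end

  def lt(field, value)
    Node.new(:lt, [field, value])
  end

  def any_of(*nodes)
    Node.new(:or, nodes)
  end

  def all_of(*nodes)
    Node.new(:and, nodes)
  end
end

class ExhibitionQueries
  include ConditionHelpers

  # Conditions are plain objects, so they can be stored, reused and nested.
  def short_or_long
    any_of(lt(:run_time, 2), gt(:run_time, 5))
  end
end

puts ExhibitionQueries.new.short_or_long.to_sql  # (run_time < 2 OR run_time > 5)
```

Since `and` and `or` are reserved words in Ruby, the sketch uses `any_of` and `all_of` for the combinators; the operand order in the output follows the argument order exactly.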
