The General Problem Solver, part 2


In the last post I described a simple version of the General Problem Solver. This post will take a look at a more complete version that fixes some of the problems I discussed earlier. The first problem to solve is that it can sometimes be a bit hard to debug the output from our solver. If it fails we only know that it failed, not how far it got, and so on. So lets begin by creating a custom debug framework for us (lib/ch04/dbg.rb):

$dbg_ids = []

def dbg(id, format, *args)
  if $dbg_ids.include?(id)
    $stderr.puts(format%args)
  end
end

def debug(*ids)
  $dbg_ids = ($dbg_ids + ids).uniq
end

def undebug(*ids)
  $dbg_ids -= ids
end

def dbg_indent(id, indent, format, *args)
  if $dbg_ids.include?(id)
    $stderr.print "  "*indent
    $stderr.puts(format%args)
  end
end

This code uses a global variable to decide which debug statements should be used. We can turn on and off specific ids using the debug and undebug methods, while we can use dbg and dbg_indent to actually write something out.

Notice that I’ve used the sprintf operator (%) to make formatting of strings built into the system itself.

At this point it’s time to look at GPS2, which provide solutions for the “running around the block”, “prerequisite clobbers sibling”, “leaping before you look” and “recursive subgoal” problems.

One of the more interesting changes is that this version will not print anything, but instead return a status list that describes the actions taken. This will allow some more flexibility in how the code is used, later on.

Before starting out there are some things we need to add to make the output reasonable. We do that by representing operations a little bit differently, by including in the add-list a composite action that will include the symbol executing, and the name of the action. We also need an equivalent of the some-function. Finally, we need to have something that can create new operations including the execution part in the add list.

This code can be seen in lib/ch04/gps2.rb:

require 'gps'
require 'gps_no_clobber'
require 'dbg'

module Enumerable
  def some?
    self.each do |v|
      result = yield v
      return result if result
    end
  end
end

class GPS::Op
  def convert!
    unless self.add_list.any? do |l|
        l.is_a?(Array) && l.first == :executing
      end
      self.add_list << [:executing, self.action]
    end
    self
  end
end

GPS::SCHOOL_OPS.each do |op|
  op.convert!
end

class GPS2 < GPS
  class << self
    def op(action, parts = {})
      GPS::Op.new(action,
             parts[:preconds],
             parts[:add_list],
             parts[:del_list]).convert!
    end
  end
end

In this case I’ve chosen to make a subclass of GPS2, and use the existing helpers in gps and gps_no_clobber, although at the end most methods will be overridden.

I define the method some? on Enumerable; it’s actually really similar to a Ruby version of find/detect. The only difference is that we need to save the result of the block and actually return that instead of the value we sent in to the block.

I open up the GPS::Op class and add convert!. Notice that I’ve named it with a bang, since it will possibly modify the current instance. I’ve also decided to not follow convention by returning nil from a bang method. Instead I return self because it makes some of my code a bit simpler later on. The convert! method will check if the add-list already contains an array where the first element is :executing, otherwise it will add one of these. This will be more interesting later.

The gps2 file also converts all the available school operations.

Finally we provide a new method called GPS2::op, that will create a new converted operation for us. Notice that I’m using the convention of opening up the self class with class << self. I almost always prefer this way.

The next step is to take a look at the new solve-method. This will become slightly different since now all methods will take a state argument to keep track of where in the evaluation we are. The only place that doesn’t take a state is the solve-method, since it uses the initial state given to the initialize method. All methods except solve will also take a parameter for the current goal stack, that will help us keep track of which goals we have achieved so far.

The method solve looks like this (lib/ch04/gps2.rb):

  def solve(*goals)
    achieve_all([[:start]] + @state, goals, []).grep(Array)
  end

The original Lisp code uses remove-if #’atom to filter out all noise from the state, but the Ruby is much simplified by just using grep and sending in an Array. You did know that grep can take as argument anything that implements ===, right?

To keep track of the fact that we have started, we add the list [:start] to the front of the current state.

There are three more methods that need to be updated to make this implementation run. The first two are achieve_all and achieve (lib/ch04/gps2.rb):

  def achieve_all(state, goals, goal_stack)
    current_state = goals.inject(state) do |current_state, g|
      return [] if current_state.empty?
      achieve(current_state, g, goal_stack)
    end
    if goals.subset_of?(current_state)
      current_state
    else
      []
    end
  end

  def achieve(state, goal, goal_stack)
    dbg_indent(:gps, goal_stack.length, "Goal: %p", goal)

    if state.include?(goal)
      state
    elsif goal_stack.include?(goal)
      []
    else
      @ops.find_all do |op|
        appropriate?(goal, op)
      end.some? do |op|
        apply(state, goal, op, goal_stack)
      end || []
    end
  end

These are a fair bit more complicated than the first implementation. This is because we need to handle some things that the first implementation really didn’t handle well at all.

The achieve_all method is definitely a bit more messy. That’s because we need to update the current state for each call to achieve. In Norvig’s book, the implementation actually used a state variable that was updated after each iteration, but I felt that explicit modification didn’t feel right here, especially not in the presence of inject, which is really a perfect tool for this kind of updating. The only complication is that we want to halt everything if any call to achieve returns an empty state (this means that it couldn’t continue). If a goal couldn’t be achieved, it’s obvious we can just jump out directly.

The use here of inject is a bit different since it doesn’t explicitly build up something, but instead creates a new variable each time that is a modified version of the original. This is a refactoring that I would like to see in more code bases, since it’s a very nice way of getting rid of explicit mutation.

After building up the current state by calling achieve on all goals, we also need to make sure that all goals are still part of the current state. We do this using the subset? method defined in the code for the first GPS version.

The achieve method is the first place we use our debugging system. The length of the goal stack gives us an obvious indentation parameter, since it will get longer and longer the deeper we recurse. I use %p instead of %s since I really want to get the inspected output here.

There are three ways achieve can go. The first is if the goal is already in the current state. That means we have already achieved that goal, and all is fine.

If the goal is in the goal stack, that means we are in the second iteration of a recursive goal search – not good. That means we have failed that goal, so we return an empty array.

Finally, if none of the above situations are true, we need to find all appropriate operations for the current goal – using the same appropriate? method as the first GPS. We then use some? on all appropriate operations. The difference between find/detect and some? is obvious here – if we’d used find, the result would be an instance of Op, while some? will return the current state returned from the apply-method called inside the block.

At the end of the evaluation, if some? doesn’t find anything, it’s necessary to return an empty array instead of nil, since I’m using the empty array to represent failure every. (I will discuss this in more detail in a comparison post between Common Lisp and Ruby).

The missing method in GPS2 is now apply. It looks like this (lib/ch04/gps2.rb):

  def apply(state, goal, op, goal_stack)
    dbg_indent(:gps, goal_stack.length, "Consider: %p", op.action)

    state2 = achieve_all(state, op.preconds, [goal] + goal_stack)

    unless state2.empty?
      dbg_indent(:gps, goal_stack.length, "Action: %p", op.action)
      (state2 - op.del_list) + op.add_list
    end
  end

As you can see, we use dbg_indent in two different places here to make it obvious what’s happening while actually applying an operation. This is also the place where we first recurse to make sure all the operations preconditions have been satisfied. If we get back a state that is not empty, we can return a new state that is modified by the delete-list and then the add-list.

Wow, that was a bit more complicated, but not too bad. Let’s take a look at how this version works with the problems we established for GPS earlier (lib/ch04/gps2_problem1.rb):

require 'gps2'
require 'pp'

gps = GPS2.new([:son_at_home,
                :car_needs_battery,
                :have_money,
                :have_phone_book],
              GPS::SCHOOL_OPS)
pp gps.solve(:son_at_school)

If you run it, you’ll see that the result matches more or less the steps we got from the first GPS. I use pp to display the result of the operation. The main difference is that the start condition is in there too.

Something very simple, like lib/ch04/gps2_problem2.rb also works fine:

require 'gps2'
require 'pp'

gps = GPS2.new([:son_at_home,
                :car_works],
              GPS::SCHOOL_OPS)
pp gps.solve(:son_at_school)

The problem the last version used to get wrong now works correctly (lib/ch04/gps2_problem3.rb):

require 'gps2'
require 'pp'

gps = GPS2.new([:son_at_home,
                :car_needs_battery,
                :have_money,
                :have_phone_book],
              GPS::SCHOOL_OPS)
pp gps.solve(:have_money, :son_at_school)

and lib/ch04/gps2_problem4.rb:

require 'gps2'
require 'pp'

gps = GPS2.new([:son_at_home,
                :car_needs_battery,
                :have_money,
                :have_phone_book],
              GPS::SCHOOL_OPS)
pp gps.solve(:son_at_school, :have_money)

Leaping before you look also fails as it should (lib/ch04/gps2_problem5.rb):

require 'gps2'
require 'pp'

gps = GPS2.new([:son_at_home,
                :car_needs_battery,
                :have_money],
              GPS::SCHOOL_OPS)
pp gps.solve(:son_at_school)

Really simple problems that require no action also works fine (lib/ch04/gps2_problem6.rb):

require 'gps2'
require 'pp'

gps = GPS2.new([:son_at_home],
              GPS::SCHOOL_OPS)
pp gps.solve(:son_at_home)

Monkey and bananas

So that we can see that the G in GPS is there for a reason, let’s see if we can apply the new version of GPS to a new domain. This problem is a classic AI problem – monkey and bananas. The basic idea is that a monkey is standing at the doorway to a room, there are bananas hanging from a rope in the middle of the room – out of the monkeys reach – and there is a chair near the door that the monkey can push into the room and stand upon to reach the bananas. The first thing we need to do is define the operators for this domain (lib/ch04/gps2_monkey1.rb):

require 'gps2'
require 'pp'

$banana_ops = [
              GPS2::op(:climb_on_chair,
                       :preconds => [:chair_at_middle_room,
                                     :at_middle_room,
                                     :on_floor],
                       :add_list => [:at_bananas, :on_chair],
                       :del_list => [:at_middle_room, :on_floor]),
              GPS2::op(:push_chair_from_door_to_middle_room,
                       :preconds => [:chair_at_door,
                                     :at_door],
                       :add_list => [:chair_at_middle_room,
                                     :at_middle_room],
                       :del_list => [:chair_at_door, :at_door]),
              GPS2::op(:walk_from_door_to_middle_room,
                       :preconds => [:at_door, :on_floor],
                       :add_list => [:at_middle_room],
                       :del_list => [:at_door]),
              GPS2::op(:grasp_bananas,
                       :preconds => [:at_bananas, :empty_handed],
                       :add_list => [:has_bananas],
                       :del_list => [:empty_handed]),
              GPS2::op(:drop_ball,
                       :preconds => [:has_ball],
                       :add_list => [:empty_handed],
                       :del_list => [:has_ball]),
              GPS2::op(:eat_bananas,
                       :preconds => [:has_bananas],
                       :add_list => [:empty_handed, :not_hungry],
                       :del_list => [:has_bananas, :hungry])
             ]

Nothing really strange here, really. So let’s use it and see if GPS works for it (lib/ch04/gps2_monkey1.rb):

gps = GPS2.new([:at_door,
                :on_floor,
                :has_ball,
                :hungry,
                :chair_at_door],
               $banana_ops)
pp gps.solve(:not_hungry)

This will generate the expected output:

[[:start],
 [:executing, :push_chair_from_door_to_middle_room],
 [:executing, :climb_on_chair],
 [:executing, :drop_ball],
 [:executing, :grasp_bananas],
 [:executing, :eat_bananas]]

And since we didn’t make any changes to the GPS problem, the claims of being generic does have some validity.

Maze searching

Another classic AI problem is maze searching. We can easily define a maze as numbered squares where one numbered square can lead to zero or more other squares. To make definitions really easy, we define a few simple methods that make it simple to define the operations for us (lib/ch04/maze.rb):

require 'gps2'
require 'pp'

module Maze
  class << self
    def make_ops(pair)
      [make_op(pair[0], pair[1]),
       make_op(pair[1], pair[0])]
    end

    def make_op(here, there)
      GPS2::op([:move, :from, here, :to, there],
               :preconds => [[:at, here]],
               :add_list => [[:at, there]],
               :del_list => [[:at, here]])
    end
  end

  Ops = [[1,2], [2,3], [3,4], [4,9], [9,14], [9,8],
         [8,7], [7,12], [12,13], [12,11], [11,6],
         [11,16], [16,17], [17,22], [21,22], [22,23],
         [23,18], [23,24], [24,19], [19,20], [20, 15],
         [15,10], [10,5], [20,25]].inject([]) do |sum, obj|
    sum + make_ops(obj)
  end
end

Here we first have make_ops, that just make all passages from one square to another symmetric – meaning we won’t have to define the passage from 1 to 2 AND the one from 2 to 1. The make_op method make use of some interesting properties of our code that was there without any real thought given to it. Namely that goals can be represented as arrays instead of symbols. In fact it doesn’t matter how you represent the goals, as long as you always use the same representation for the same goal. You can mix and match strings, symbols, regexps and arrays if you would like too. In this case the only goals we have are of the form [:at, 1], but it gives us a bit more representation power to do it like this.

Finally, a large array of pairs of numbers is transformed into operations using inject and calls to make_ops. The end result is all the operations we need to define this particular maze.

So, a first problem for this maze would probably look at how to get out of it, so lets see it (lib/ch04/maze_problem1.rb):

require 'maze'

gps = GPS2.new([[:at, 1]],
               Maze::Ops)
pp gps.solve([:at, 25])

The initial state is simple, we are at the first square, and want to get to the last (which is number 25). The output from running this problem gives us this output:

[[:start],
 [:executing, [:move, :from, 1, :to, 2]],
 [:executing, [:move, :from, 2, :to, 3]],
 [:executing, [:move, :from, 3, :to, 4]],
 [:executing, [:move, :from, 4, :to, 9]],
 [:executing, [:move, :from, 9, :to, 8]],
 [:executing, [:move, :from, 8, :to, 7]],
 [:executing, [:move, :from, 7, :to, 12]],
 [:executing, [:move, :from, 12, :to, 11]],
 [:executing, [:move, :from, 11, :to, 16]],
 [:executing, [:move, :from, 16, :to, 17]],
 [:executing, [:move, :from, 17, :to, 22]],
 [:executing, [:move, :from, 22, :to, 23]],
 [:executing, [:move, :from, 23, :to, 24]],
 [:executing, [:move, :from, 24, :to, 19]],
 [:executing, [:move, :from, 19, :to, 20]],
 [:at, 25],
 [:executing, [:move, :from, 20, :to, 25]]]

Except for that spurious [:at, 25], everything looks like we would expect here. We have a start move and then moving from square to square until we get to the last square.

The problem with the extra element is that the way we defined solve leaves everything that is an Array in there. What we really want to do is to leave everything that denotes an action in there. As you can see in lib/ch04/gps3.rb:

require 'gps2'

class GPS2
  def action?(x)
    x == [:start] || (x.is_a?(Array) && x.first == :executing)
  end

  def solve(*goals)
    achieve_all([[:start]] + @state, goals, []).find_all do |state|
      action?(state)
    end
  end
end

We first define what an action is and then find everything that satisfies that criteria.

What’s interesting with the current representation of the GPS, and also the way our maze domain looks like is that we can use the output for several different things. Say for example that we want to get a list of the path to the exit. Then we can use the code in lib/ch04/find_path.rb:

require 'maze'
require 'gps3'

module Maze
  class << self
    def find_path(start, finish)
      results = GPS2.new([[:at, start]], Maze::Ops).
        solve([:at, finish])
      unless results.empty?
        [start] + ((results - [[:start]]).map do |action|
          destination(action)
        end)
      end
    end

    def destination(action)
      action[1][4]
    end
  end
end

pp Maze.find_path(1, 25)
pp Maze.find_path(1, 1)
pp(Maze.find_path(1, 25) == Maze.find_path(25,1).reverse)

Running this first gives us the paths from 1 to 25, then from 1 to 1, and finally shows that our expectations of reversing the path is correct:

[1, 2, 3, 4, 9, 8, 7, 12, 11, 16, 17, 22, 23, 24, 19, 20, 25]
[1]
true

The blocks world

A final domain that probably has gotten more attention than any other in AI circles is the blocks world. The basic point of it is that you have a child’s set of building blocks on a table. The problems usually involve moving them in different ways, putting them on top of each other in different order, and so on. The assumption for our current modeling of it will be that you can only have one block on top of another. The only action you can take in this world is to move a block that has nothing on top of itself on top of either the table or on top of a block that has nothing on top of it.

First of, creating the operators for this domain (lib/ch04/blocks.rb):

require 'gps3'

module Blocks
  class << self
    def make_ops(blocks)
      ops = []
      blocks.each do |a|
        blocks.each do |b|
          unless a == b
            blocks.each do |c|
              unless [a, b].include?(c)
                ops << move_op(a, b, c)
              end
            end
            ops << move_op(a, :table, b)
            ops << move_op(a, b, :table)
          end
        end
      end
      ops
    end

    # Make an operator to move A from B to C
    def move_op(a, b, c)
      GPS2.op([:move, a, :from, b, :to, c],
              :preconds => [[:space, :on, a], [:space, :on, c], [a, :on, b]],
              :add_list => move_ons(a, b, c),
              :del_list => move_ons(a, c, b))
    end

    def move_ons(a, b, c)
      b == :table ? [[a, :on, c]] :
        [[a, :on, c], [:space, :on, b]]
    end
  end
end

This implementation is trickier than the others. The style of operations we want is something along the lines of “Move block A from TABLE to B”. This code accomplishes all available variations, while checking that there are spaces on top of A and B, and that A is on top of TABLE. The guards in make_ops need to make sure that we don’t create operations from a block to itself.

Now that we can make operators, lets see a simple problem. We have two blocks a and b, that lie on a table, and we want to put a on top of b. That would look like this (lib/ch04/blocks_problem1.rb):

require 'blocks'
require 'pp'

gps = GPS2.new([[:a, :on, :table],
                [:b, :on, :table],
                [:space, :on, :a],
                [:space, :on, :b],
                [:space, :on, :table]],
               Blocks.make_ops([:a, :b]))

pp gps.solve([:a, :on, :b], [:b, :on, :table])

This gives us the expected output:

[[:start], [:executing, [:move, :a, :from, :table, :to, :b]]]

Lets consider the case where we have a on top of b, and want to move it so b is on top of a. This time with debug output (lib/ch04/blocks_problem2.rb):

require 'blocks'
require 'pp'

gps = GPS2.new([[:a, :on, :b],
                [:b, :on, :table],
                [:space, :on, :a],
                [:space, :on, :table]],
               Blocks.make_ops([:a, :b]))

pp gps.solve([:b, :on, :a])

This also gives us what we expect:

Goal: [:b, :on, :a]
Consider: [:move, :b, :from, :table, :to, :a]
  Goal: [:space, :on, :b]
  Consider: [:move, :a, :from, :b, :to, :table]
    Goal: [:space, :on, :a]
    Goal: [:space, :on, :table]
    Goal: [:a, :on, :b]
  Action: [:move, :a, :from, :b, :to, :table]
  Goal: [:space, :on, :a]
  Goal: [:b, :on, :table]
Action: [:move, :b, :from, :table, :to, :a]
[[:start],
 [:executing, [:move, :a, :from, :b, :to, :table]],
 [:executing, [:move, :b, :from, :table, :to, :a]]]

One of the more interesting features of my implementation compared to Norvigs is that he has some ordering problems that needs extra code in achieve_all to handle. I haven’t figured out why, but my code doesn’t have that problem, so I’ll not show the fixed code (since there was no need to fix it). An example that caused these problems for Norvig looks like this with the Ruby code (lib/ch04/blocks_problem3.rb):

require 'blocks'
require 'pp'

gps = GPS2.new([[:a, :on, :b],
                [:b, :on, :c],
                [:c, :on, :table],
                [:space, :on, :a],
                [:space, :on, :table]],
               Blocks.make_ops([:a, :b, :c]))

pp gps.solve([:c, :on, :b], [:b, :on, :a])

This problem represent a block world where a is on top of b, b is on top of c and c is on top of table, and you want to reverse the ordering of them.

Another problem Norvig has that doesn’t show up in this code is a few examples that do to many operations to achieve something. The file lib/ch04/blocks_problem4.rb is one of those:

require 'blocks'
require 'pp'

gps = GPS2.new([[:c, :on, :a],
                [:a, :on, :table],
                [:b, :on, :table],
                [:space, :on, :b],
                [:space, :on, :c],
                [:space, :on, :table]],
               Blocks.make_ops([:a, :b, :c]))

pp gps.solve([:c, :on, :table])

Where the optimal solution takes one action, the lisp code does two actions (moving c from a to b, and then from b to table). This is because of ordering problems. Topological sort might solve it, for example. There is another example that takes four operations instead of two (lib/ch04/blocks_problem5.rb):

require 'blocks'
require 'pp'

gps = GPS2.new([[:c, :on, :a],
                [:a, :on, :table],
                [:b, :on, :table],
                [:space, :on, :b],
                [:space, :on, :c],
                [:space, :on, :table]],
               Blocks.make_ops([:a, :b, :c]))

pp gps.solve([:c, :on, :table], [:a, :on, :b])

But as mentioned before, it doesn’t seem like the Ruby code exhibit this problem, so I decided to not implement Norvigs fixes for them.

Interestingly, there are problems that can’t be solved no matter which order you put the goals. Take the example where b and a is on the table, and c is on top of a. You want to have c on the table, b on top of c and a on top of b. The current GPS implementation can’t handle this no matter which order you try (lib/ch04/blocks_problem6.rb):

require 'blocks'
require 'pp'

gps = GPS2.new([[:c, :on, :a],
                [:a, :on, :table],
                [:b, :on, :table],
                [:space, :on, :c],
                [:space, :on, :b],
                [:space, :on, :table]],
               Blocks.make_ops([:a, :b, :c]))

pp gps.solve([:a, :on, :b], [:b, :on, :c])
pp gps.solve([:b, :on, :c], [:a, :on, :b])

This is called the Sussman anomaly, and solutions will be covered later in the book.

Conclusion

It’s obvious that this version of GPS is much more powerful than the original one. But there is also some things that still show up as problems. Even though we introduced several measures there are still cases where a correct solution can’t be found. For example, we can take a look at the getting a child to school problem again, adding a new operator for taking a taxi there. Say that we want to get the son to school without using any money and define it like this (lib/ch04/gps2_problem7.rb):

require 'gps3'
require 'pp'

gps = GPS2.new([:son_at_home,
                :have_money,
                :have_works],
              GPS::SCHOOL_OPS +
               [
                GPS2::op(:taxi_son_to_school,
                         :preconds => [:son_at_home,
                                       :have_money],
                         :add_list => [:son_at_school],
                         :del_list => [:son_at_home,
                                       :have_money])
               ])
debug :gps
pp gps.solve(:son_at_school, :have_money)

Here something funny happens. If you look at the debug output:

Goal: :son_at_school
Consider: :drive_son_to_school
  Goal: :son_at_home
  Goal: :car_works
  Consider: :shop_installs_battery
    Goal: :car_needs_battery
Consider: :taxi_son_to_school
  Goal: :son_at_home
  Goal: :have_money
Action: :taxi_son_to_school
Goal: :have_money
[]

There are two ways of solving this: first solve the have_money goal and then the get son to school. In that case the have_money goal is solved trivially, and then the money is spent on a taxi. If the other ordering is tried the child is taxied to school and then the have_money goal fails. There is a valid solution (driving the son to school) but it’s never tried.

Another problem with the approach is that the descriptive power is not general and powerful enough. The conditions we use for operators in GPS is not really abstract enough.

GPS also assumes that you know everything about the world (“perfect information”), and that everything is either true or false. Probability is nowadays one of the better approaches available for AI, and the GPS doesn’t take fuzzy knowledge into account.

Finally, interacting goals, where you have several active goals at the same time is something most people have, but GPS takes one goal at a time.

All of these problems made it problematic to continue to use the GPS approach. Another thing that had a large impact was the realization how NP-hard problems affected problem solvers like GPS. At the end of the day, GPS was an interesting early approach to solve the problem, and it’s definitely been part of AI heritage available today.


2 Comments, Comment or Ping

  1. Out of curiosity, is the GPS a suitable tool for solving “Who owns the Zebra?”-style puzzles?

    September 16th, 2008

  2. I’d just like to say thanks for this great paipr series, I’m having totally indecent amounts of fun playing with the code. These articles have really clarified some ruby idioms I wasn’t familiar with before.

    September 18th, 2008

Reply to “The General Problem Solver, part 2”