Eliza


Eliza is probably one of the most famous AI programs ever made; it was first described in 1966. It’s interesting to note that after we have looked at the implementation it’s obvious that there isn’t much intelligence in this program at all, even though it does feel a bit spooky sometimes when having conversations with it.

The understanding Eliza displays is a result of quite simple pattern matching rules and have nothing to do with understanding. It’s also pattern matching that will take up most of this blog post. The algorithm implemented here is quite basic and something more advanced will be prepared for later chapters.

The Ruby code will have some differences compared to the original Lisp code, and I’ll note these as we go along. The most important one is that I’ll represent patterns and input strings as arrays of strings that are separated using split. Norvig uses (READ) so the Lisp code works on Lisp lists and atoms. This Ruby code will be a bit more complex in some cases due to this.

The file lib/ch05/pat.rb contains a first stab at simple pattern matching:

require 'enumerator'

class Pattern
  def initialize(pattern)
    @pattern = pattern
  end

  # Does pattern match input? Any variable can match anything
  def match(input, pattern = @pattern)
    pattern.enum_for(:zip, input).all? do |p, i|
      (p.is_a?(Array) && i.is_a?(Array) && match(i, p)) ||
        variable?(p) ||
        p == i
    end
  end

  def variable?(pattern)
    pattern.is_a?(String) && pattern[0] == ??
  end
end

Also note the cute trick combining zip and all?. The Enumerable#zip method will allow you to combine two or more arrays, and getting an enumerator for it makes it possible to first combine the array and then call all? on it. This should be quite efficient too. This will match simple patterns against input, using variables that begins with a question mark. This version really only returns true or false. To use it, do something like this:

p Pattern.new(%w(I need a ?X)).match(%w(I need a vacation))

At this stage, really simple matching works, but anything complicated will fail. We also need a way of getting hold of the variables captured in the matching, so that we can replace them in answers from Eliza.

The second version (in lib/ch05/pat2.rb) uses the same algorithm, but captures variables:

require 'enumerator'

class Pattern
  def initialize(pattern)
    @pattern = pattern
  end

  # Does pattern match input? Any variable can match anything
  def match(input, variables = [], pattern = @pattern)
    pattern.enum_for(:zip, input).all? do |p, i|
      (p.is_a?(Array) && i.is_a?(Array) && match(i, variables, p)) ||
        (variable?(p) && (variables << [p, i])) ||
        p == i
    end && variables
  end

  def variable?(pattern)
    pattern.is_a?(String) && pattern[0] == ??
  end
end

As you can see, the main difference is that match takes an optional array for variables. If the match succeeds it returns all variables, otherwise false. So something like this:

p Pattern.new(%w(I need a ?Y)).match(%w(I need a vacation))

would return [[“?Y”, “vacation”]].

The next version will also include a primitive form of unificiation, which makes it possible to match the same variable twice (lib/ch05/pat3.rb):

require 'pat2'

class Pattern
  # Match pattern against input in the context of the bindings
  def match(input, variables = {}, pattern = @pattern)
    pattern.enum_for(:zip, input).all? do |p, i|
      match_recurse(p, i, variables) ||
        match_variable(p, i, variables) ||
        p == i
    end && variables
  end

  def match_recurse(p, i, bindings)
    p.is_a?(Array) && i.is_a?(Array) && match(i, bindings, p)
  end

  def match_variable(p, i, bindings)
    variable?(p) &&
      ((bindings[p].nil? && (bindings[p] = i)) ||
       bindings[p] == i)
  end
end

This version is different mainly in that it returns a Hash instead of an Array. Norvig uses Lisp lists for this, but it doesn’t really make sense to define lots of extra functions for working with it when we already have a nice Hash in Ruby. Using a Hash also makes it really easy to check for previous examples of the same variable. Try it with a pattern such as %w(?Y should be ?Y).

The algorithm is slightly tricky, specifically since it actually modifies the hash as it recurses. This could easily bite you, but it’s still a step in the right direction.

Now, for making Eliza work out well enough for us, we actually need to have segment matching. This will allow matching of none or several atoms in a sweep. Adding this functionality actually transforms the code quite a lot, so let’s take a look at a first stab for this (lib/ch05/pat4.rb):

require 'pat3'

class Pattern
  # Match pattern against input in the context of the bindings
  def match(input, variables = {}, pattern = @pattern)
    case
    when variables.nil?
      nil
    when variable?(pattern)
      match_variable(pattern, input, variables)
    when pattern == input
      variables
    when segment_pattern?(pattern)
      match_segment(pattern, input, variables)
    when pattern.is_a?(Array) && input.is_a?(Array)
      match(input[1..-1],
            match(input[0], variables, pattern[0]),
            pattern[1..-1])
    else
      nil
    end
  end

  def match_variable(p, i, bindings)
    (bindings[p].nil? && bindings.merge(p => i)) ||
      (bindings[p] == i && bindings)
  end

  def segment_pattern?(p)
    p.is_a?(Array) && p[0][0] == ?*
  end

  def match_segment(pattern, input, bindings, start = 0)
    var = "?#{pattern[0][1..-1]}"
    pat = pattern[1..-1]
    if pat.nil? || pat.empty?
      match_variable(var, input, bindings)
    else
      pos = input[start..-1].index(pat[0])
      return nil unless pos
      b2 = match(input[(start+pos)..-1], bindings, pat)
      if b2.nil?
        match_segment(pattern, input, bindings, start+1)
      else
        match_variable(var, input[0...(start+pos)], b2)
      end
    end
  end
end

Here we aren’t modifying the Hash anymore, instead using a recursive version with match_variable merging a new array. Note how functional in style match_variable is. No modifications at all are happening there.

There are now several different cases to keep in mind. The first decision I made was to mirror Norvigs original algorithm that walks through the head of each array, instead of zipping everything up. Since we’re working with stuff that will be able to match even though the length of arrays are different, zip is really not the way to go anymore.

Another difference is the addition of the segment_pattern? and match_segment. Norvig uses a representation that includes a nested list such as (?* ?Y) to denote a segment. Since I don’t want to parse text, I decided to make a much easier choice, such that *Y denotes a segment, that will save into the ?Y variable. In my opinion, it makes both the code and the specifications easier to read.

Since we’re doing a recursive algorithm, the variables parameter is used for the base case of failure – if anything returns false or nil for variables, we fail match immediately.

So what is match_segment doing, really? This is the most tricky bit. Basically it needs to check what the next part of the pattern is to match (you can’t have two segments after each other without anything in between). Once that’s been found, match_segment tries to run the rest of a full match on the rest of the text. If that fails it tries to eat one more atom and recurses, otherwise it associates the current stretch of atoms with the variable.

You can try it out like this:

p Pattern.new(%w(*p need *x)).
  match(%w(Mr Hulot and I need a vacation))
p Pattern.new(%w(*x is a *y)).
  match(%w(what he is is a fool))

The first example shows the simple case, while the second one needs to do some backtracking. But the current version is not totally correct. Specifically, a case such as this would fail:

p Pattern.new(%w(*x a b *x)).
  match(%w(1 2 a b a b 1 2 a b))

The problem is that the algorithm successfully matches the first part of the string but can’t do anything with the rest, since ?X has different values unless the segment matching proceeds further. The fix is to first call match_variable so we can try to match the segment again in the worst case. It looks like this (lib/ch05/pat5.rb):

require 'pat4'

class Pattern
  def match_segment(pattern, input, bindings, start = 0)
    var = "?#{pattern[0][1..-1]}"
    pat = pattern[1..-1]
    if pat.nil? || pat.empty?
      match_variable(var, input, bindings)
    else
      pos = input[start..-1].index(pat[0])
      return nil unless pos

      b2 = match(input[(start+pos)..-1],
                 match_variable(var, input[0...(start+pos)], bindings),
                 pat)

      if !b2
        match_segment(pattern, input, bindings, start+1)
      else
        b2
      end
    end
  end
end

With this code the previous example works fine.

That is all the code there is to the pattern matching for our version of Eliza. Now for the actual implementation. Remember that this version is slightly less complicated compared with the original version. The list of rules in this code is not a full set either. (lib/ch05/eliza1.rb):

require 'pat5'

module Enumerable
  def some?
    self.each do |v|
      result = yield v
      return result if result
    end
    nil
  end
end

class Array
  def random_elt
    self[rand(self.length)]
  end
end

class Rule
  attr_reader :pattern, :responses

  def initialize(pattern, *responses)
    @pattern, @responses = pattern, responses
  end

  ELIZA_RULES =
    [
     Rule.new(Pattern.new(%w(*x hello *y)),
              "How do you do. Please state your problem"),
     Rule.new(Pattern.new(%w(*x I want *y)),
              "What would it mean if you got ?y?",
              "Why do you want ?y?",
              "Suppose you got ?y soon"),
     Rule.new(Pattern.new(%w(*x if *y)),
              "Do you really think it's likely that ?y? ",
              "Do you wish that ?y?",
              "What do you think about ?y?",
              "Really-- if ?y?"),
     Rule.new(Pattern.new(%w(*x no *y)),
              "Why not?",
              "You are being a bit negative",
              "Are you saying \"NO\" just to be negative?"),
     Rule.new(Pattern.new(%w(*x I was *y)),
              "Were you really?",
              "Perhaps I already knew you were ?y?",
              "Why do you tell me you were ?y now?"),
     Rule.new(Pattern.new(%w(*x I feel *y)),
              "Do you often feel ?y?"),
     Rule.new(Pattern.new(%w(*x I felt *y)),
              "What other feelings do you have?")
    ]
end

module Eliza
  class << self
    def run(rules = Rule::ELIZA_RULES)
      while true
        print "eliza> "
        puts eliza_rule(gets.downcase.split, rules)
      end
    end

    def eliza_rule(input, rules)
      rules.some? do |rule|
        result = rule.pattern.match(input)
        if result
          switch_viewpoint(result).
            inject(rule.responses.random_elt) do |sum, repl|
            sum.gsub(repl[0], repl[1].join(" "))
          end
        end
      end
    end

    def switch_viewpoint(words)
      replacements = [%w(I you),
                      %w(you I),
                      %w(me you),
                      %w(am are)]
      hash = {}
      words.each do |key, val|
        hash[key] = replacements.inject(val) do |sum, repl|
          sum.map { |val| val == repl[0] ? repl[1] : val}
        end
      end
      hash
    end
  end
end

if __FILE__ == $0
  Eliza.run
end

This is quite a lot of code, so let’s take it in chunks. First off I need some methods we defined in earlier chapters, namely Enumerable#some? and Array#random_elt. I’ve re-added them here for ease. Then we have the definition of a Rule, which is a pattern and a list of possible responses. It’s a very simple class definition, that takes all the arguments as parameters. Finally, there is the definition of a choice of rules from the original Eliza program. As you can see there are several variations in most cases, so the program can vary in its output.

The Eliza module has three different methods. The run-method takes the rules to use and goes into an infinite loop where it first prints a prompt, then reads a line of input, downcases this and finally splits it, sends it into the eliza_rule method and prints the result.

The eliza_rule method tries to match against at least one rule. If it succeeds in matching it first calls switch_viewpoint on the result (remember that the result here is a Hash of extracted segments from the input). It then replaces all instances of variables in one of the rules replacement text with the variable information.

The switch_viewpoint method just makes sure that the extracted strings from variables is turned into the other viewpoint, replacing “I” with “you”, “me” with “you”, etc.

The last lines of code just runs the Eliza program if this is the main program file that’s running. Try it out!

Now, wasn’t that quite easy? The magic really disappear once you’ve looked at this tiny implementation. And it’s obvious that there are many things that don’t work as well as it should. Pattern matching solves many problems, but you need a quite comprehensive suite of rules to make it really believable.

Eliza was one of the first programs to try to look like it understood English. But really understanding English is a much harder problem that is still at the core of AI.

For some more fun, I’ve encoded the full set of Eliza rules in lib/ch05/eliza2.rb.


6 Comments, Comment or Ping

  1. Ola, really enjoying this series, thank you. I don’t have any formal CS training or experience with algorithmic and functional coding styles, so this is making a nice entry point for me!

    The pattern matching code doesn’t quite behave right when there is unconsumed input, e.g. pattern = %w(?x need a ?y), input = %w(I need a vacation now). The first three versions decide that it’s a match when there is unconsumed input, while the second two fall over.

    I’ve forked your repo at http://github.com/alexscordellis/paipr/ and introduced an example and fixes.

    September 22nd, 2008

  2. I think there are two small problems with this implementation of eliza and the rules: the first is that the apostrophes shouldn’t be escaped in the rules in eliza2.rb, and the second is that “I” in the input gets somehow eaten before it can be matched against any of the rules. This results in the following behaviour:

    eliza> Last night I dreamt about electric sheep
    Do you feel strongly about discussing such things?
    eliza> Often I dream about electric sheep
    How do you feel about electric sheep in reality?

    Here the first input failed to match against the pattern %w(*x I dreamt *y) but the second matched against %w(*x dream about *y).

    September 22nd, 2008

  3. Alex:

    Yeah, those problems are symptomatic of the approach in those implementations, and I’m not going to fix them. =). They should fall over on the last ones though, so that’s correct.

    Ilkka:

    Totally correct. I forgot about the case for rules – the I is upcase while I downcase all input to make it easier. I’ve pushed a fix that downcase all patterns too, to make it consistent – that fixes this problem.

    September 23rd, 2008

  4. Steven

    I’m confused by some syntax I’ve never seen before. what does ?? mean.

    irb > ??
    => 63

    I don’t get how this ends up matching one of the variables which–I think– is the string “?X”

    tack

    October 9th, 2008

  5. Steven

    Ok, I get it ?#{somestring} returns the number that represents the character

    October 9th, 2008

Reply to “Eliza”