Written Homework #2 Solutions

1. [35] Consider the following state-space graph with Start State S and Goal state G, heuristic function values and path costs noted:

Assuming that successor states are generated in alphabetical order (in stack-based algorithms, placed on the Open List in alpha order either at the start or at the end), and ties (for priority-queue based algorithms) broken in alphabetical order, in what order are the nodes in this graph expanded by each of the following search algorithms? Also for each, what is the cost of the path found? If the path found is not optimal, briefly explain why the algorithm didn't find the optimal path. Do not remove repeated states (i.e. state C).

  • Breadth-First Search: I'll do this one for you: expanded=SABCEG1, path (SAG1) cost is 13 (not optimal), the open list still has C (via B), F (via B), G3 (via C), G1 (via E) on it. You still need to tell me *why* it didn't find the optimal answer. Here's how I got this answer: we start by expanding S, which places A, B, C (in that order) on the open list. Then we expand A, placing E then G1 at the end of the open list; then expand B, placing C and F at the end; then expand C, placing G3 at the end; then expand E, placing G1 at the end; then expand G1, finishing with S-A-G1.
    The reason that we don't find the optimal path is that breadth-first search is only optimal when the shallowest solution is optimal (i.e., when the cost to a node is purely a function of its depth).
  • Depth-First Search: SAEG1 (12). DFS is not optimal.
  • Uniform Cost Search: SBACFDCG2 (9)
  • Iterative Deepening: SSABCSAEG1 (12). ID is optimal only when breadth-first is optimal.
  • Greedy Best-First Search: SBCG3 (cost is either 16 or 14, depending on which "C" you chose). Greedy search is not optimal.
  • A*: SBCCFDG2 (9)
  • IDA*: SBCFDCSBCFDG2 (9) (remember depth-first search within the f-limit)
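
The breadth-first versus uniform-cost difference can be seen on a much smaller graph. The sketch below uses a hypothetical four-node graph (NOT the graph from this problem; the node names and costs are made up for illustration) in which the shallowest path to G is not the cheapest one. Both searches test for the goal when a node is expanded.

```python
import heapq
from collections import deque

# Hypothetical weighted graph (not the homework graph): S reaches G directly
# at cost 10, or through A and B at total cost 3.
GRAPH = {
    "S": [("A", 1), ("G", 10)],
    "A": [("B", 1)],
    "B": [("G", 1)],
    "G": [],
}

def breadth_first(start, goal):
    """Return (path, cost) of the first goal path taken off a FIFO open list."""
    frontier = deque([([start], 0)])
    while frontier:
        path, cost = frontier.popleft()
        if path[-1] == goal:
            return path, cost
        for nxt, step in GRAPH[path[-1]]:
            frontier.append((path + [nxt], cost + step))
    return None

def uniform_cost(start, goal):
    """Return (path, cost) of the first goal path taken off a cost-ordered open list."""
    frontier = [(0, [start])]
    while frontier:
        cost, path = heapq.heappop(frontier)
        if path[-1] == goal:
            return path, cost
        for nxt, step in GRAPH[path[-1]]:
            heapq.heappush(frontier, (cost + step, path + [nxt]))
    return None

print(breadth_first("S", "G"))  # shallowest path
print(uniform_cost("S", "G"))   # cheapest path
```

On this graph breadth-first returns the one-step path S-G at cost 10, while uniform cost returns S-A-B-G at cost 3.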

2. [15] Problem 3.17 from the book. (five parts)

a. Any path, no matter how bad it appears, might lead to an arbitrarily large reward (NEGATIVE COST). Therefore, one would need to exhaust all possible paths to be sure of finding the best one.

b. Suppose the greatest possible reward is r. Then if we also know the maximum depth of the state space (e.g. when the state space is a tree), then any path with d levels remaining can be improved by at most rd, so any paths worse than rd less than the best path can be pruned. For state spaces with loops, this won't help, because you can go around the loop any number of times, picking up r each time (see next part).

[Note: the above answer is Stuart Russell's. If I am reading it correctly, I think the pruning part is buggy... But the core idea, that such info allows us to still search in a tree, is correct.]

c. The agent should plan to go around this loop forever (unless it can find another loop with an even better reward).

d. The value of a scenic loop is lessened each time one revisits it; a novel site is a great reward, but seeing the same one for the tenth time in an hour is tedious, not rewarding. To accommodate this, we would have to expand the state space to include a memory---a state is now not just represented by the current location, but by the current location and a bag of already-visited locations. The reward for visiting a new location is now a diminishing function of the number of times it has been seen before.

e. Real domains with looping behavior include eating junk food, going to class, watching certain movies, etc.

3. [8] Problem 4.2 (four parts)

w=0 gives f(n) = 2g(n). This behaves exactly like uniform cost search because the constant POSITIVE factor will make no difference in the order that nodes are visited.

w=1 gives A* search

w=2 gives f(n) = 2h(n), i.e., greedy best-first search.

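We can rewrite the equation as f(n) = (2-w)[g(n) + (w/[2-w])h(n)]. For w < 2 the outside factor is positive, so it does not change the order in which nodes are expanded, and the search behaves like A* with the heuristic (w/[2-w])h(n). When 0 <= w <= 1 the factor w/(2-w) is at most 1, so this scaled heuristic is less than or equal to h(n); if h(n) is admissible, the scaled heuristic also underestimates and is therefore admissible.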
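
The rewriting step can be checked by expanding the bracket. The block below is just that algebra written out, assuming h is admissible and writing h*(n) for the true lowest cost from n to a goal (the notation used in problem 8).

```latex
\begin{align*}
(2-w)\left[\, g(n) + \frac{w}{2-w}\, h(n) \right]
  &= (2-w)\, g(n) + w\, h(n) = f(n), \\
0 \le w \le 1
  \;&\Longrightarrow\; \frac{w}{2-w} \le 1
  \;\Longrightarrow\; \frac{w}{2-w}\, h(n) \le h(n) \le h^*(n).
\end{align*}
```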

4. [8] Problem 5.4 (two parts)

a. Crossword puzzle construction can be solved in many ways. One simple way is depth-first search. Each successor fills in a word in the puzzle with one of the words in the dictionary. It is better to go a word at a time, to minimize the number of steps.

b. As a CSP, there are even more choices. You could have a variable for each box in the crossword puzzle---in this case the value of each variable is a letter, and the constraints are that the letters must form words. This approach is feasible with a most-constraining-value heuristic. Alternatively, we could have each string of consecutive horizontal or vertical boxes be a single variable, and the domain of each variable is the set of dictionary words of the right length. The constraints would say that two intersecting words must have the same letter in the connecting box. Solving a problem in this formulation requires fewer steps, but the domains are bigger (assuming a really big dictionary) and there are fewer constraints. Both formulations are feasible in practice.

 

UNDERGRADS (481): Only those taking 481 need to do problems 5, 6 and 7.

5. [9] (a) Describe a search space where iterative deepening performs much worse than depth-first search.

Take a full search tree of depth d with branching factor b and only one goal state: the leftmost node at depth d. DFS finds the goal in time O(d), while IDS needs time O(b^d).
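
To make the gap concrete, here is a small sketch that counts visited nodes for both searches on such a tree. The representation (a node is the tuple of child indices on the path from the root) and the function names are my own choices for illustration.

```python
def children(node, b, d):
    """Children of a node in a full b-ary tree of depth d (leaves have none)."""
    if len(node) == d:
        return []
    return [node + (i,) for i in range(b)]

def depth_limited(node, goal, limit, b, d, counter):
    """Depth-first search down to `limit` more levels, counting visited nodes."""
    counter[0] += 1
    if node == goal:
        return True
    if limit == 0:
        return False
    # any() stops at the first child whose subtree contains the goal.
    return any(depth_limited(c, goal, limit - 1, b, d, counter)
               for c in children(node, b, d))

def dfs_visits(b, d):
    """Nodes visited by plain DFS when the goal is the leftmost leaf."""
    counter = [0]
    depth_limited((), (0,) * d, d, b, d, counter)
    return counter[0]

def ids_visits(b, d):
    """Nodes visited by iterative deepening on the same tree and goal."""
    counter = [0]
    for limit in range(d + 1):
        if depth_limited((), (0,) * d, limit, b, d, counter):
            break
    return counter[0]

print(dfs_visits(3, 4), ids_visits(3, 4))
```

With b=3 and d=4, DFS visits the 5 nodes on the leftmost branch, while iterative deepening visits 1 + 4 + 13 + 40 nodes in the failed iterations plus 5 in the last one, 63 in total.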


          (b) Construct a finite search tree for which it is possible that depth-first search uses more memory than breadth-first search. (Be sure to show the goal node(s) in your tree.)

In the worst case, depth-first search requires memory proportional to the depth of the tree, while breadth-first search requires O(b^g), where g is the depth of the shallowest goal node. In a deep tree (large d) with a sufficiently small branching factor (small b) and a shallow goal node (small g), depth-first search requires more memory than breadth-first search.
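
One way to check an answer like this is to measure the open list directly. The sketch below assumes a full binary tree of depth 10 whose only goal is the right child of the root, with successors generated left to right; the tree encoding (tuples of child indices) and the function name are hypothetical.

```python
from collections import deque

def max_frontier(b, d, goal, depth_first):
    """Largest open-list size seen before the goal is taken off the list."""
    frontier = deque([()])  # the root is the empty path
    largest = 1
    while frontier:
        # Stack (LIFO) for depth-first, queue (FIFO) for breadth-first.
        node = frontier.pop() if depth_first else frontier.popleft()
        if node == goal:
            return largest
        if len(node) < d:
            kids = [node + (i,) for i in range(b)]
            # For the stack, push in reverse so child 0 is expanded first.
            frontier.extend(reversed(kids) if depth_first else kids)
        largest = max(largest, len(frontier))
    return largest

goal = (1,)  # right child of the root
print(max_frontier(2, 10, goal, depth_first=True))
print(max_frontier(2, 10, goal, depth_first=False))
```

Here depth-first search dives into the left subtree first and its stack grows to 11 nodes, while breadth-first search never holds more than 3.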


          (c) Is there any tree and distribution of goal nodes for which depth-first search always requires more memory than breadth-first search? Briefly justify your answer.

There is no case where depth-first search always requires more memory than breadth-first search, since, depending on the order in which successors are generated, depth-first search can go straight to the shallowest goal node.

6. [15]  Problem 4.9 (there are three parts)

The misplaced tiles heuristic is exact for the problem where a tile can move from square A to square B. As this is itself a relaxation of the condition that a tile can move from A to B only if B is blank, Gaschnig's heuristic cannot be less than the misplaced tiles heuristic. As it is also admissible (being the exact cost of a relaxation of the original problem), it is also more accurate.

If we permute 2 adjacent tiles in the goal state, we have a state where misplaced tiles and Manhattan distance both return 2, but Gaschnig's heuristic returns 3.

To compute Gaschnig's heuristic, repeat the following until the goal state is reached: Let B be the location of the blank. If B is occupied by tile X (not the blank) in the goal state, then move X to B. Otherwise, move any misplaced tile to B. (This is actually an optimal solution to the relaxed problem.)
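
The procedure above can be sketched in a few lines. The board encoding is an assumption of mine: a state is a tuple listing the tile on each square in row-major order, with 0 standing for the blank. The two helper heuristics are included only for comparison.

```python
def gaschnig(state, goal):
    """Moves needed when any tile may jump straight into the blank square."""
    state = list(state)
    moves = 0
    while tuple(state) != tuple(goal):
        blank = state.index(0)
        if goal[blank] != 0:
            # The blank sits where tile X belongs: move X into the blank.
            src = state.index(goal[blank])
        else:
            # The blank is already home: move any misplaced tile into it.
            src = next(i for i, t in enumerate(state) if t != goal[i])
        state[blank], state[src] = state[src], 0
        moves += 1
    return moves

def misplaced(state, goal):
    """Number of tiles (not counting the blank) that are off their goal square."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def manhattan(state, goal, width=3):
    """Sum of horizontal plus vertical distances of each tile from its goal square."""
    total = 0
    for i, tile in enumerate(state):
        if tile != 0:
            j = goal.index(tile)
            total += abs(i // width - j // width) + abs(i % width - j % width)
    return total

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)
SWAPPED = (2, 1, 3, 4, 5, 6, 7, 8, 0)  # tiles 1 and 2 exchanged
print(misplaced(SWAPPED, GOAL), manhattan(SWAPPED, GOAL), gaschnig(SWAPPED, GOAL))
```

For the swapped state this gives 2, 2 and 3, matching the example in the second part.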

7. [10] In the "Four-Queens puzzle", we try to place 4 queens on a 4x4 chess board so that none can capture any other (that is, only one queen can be on any row, column, and diagonal of the array). Suppose we try to solve this problem with the following problem space: The start node is labeled by an empty 4x4 array; the successor function creates new 4x4 arrays containing one additional legal placement of a queen anywhere in the array; the goal predicate is satisfied iff there are four queens in the array (legally positioned).

  • Invent an admissible heuristic function for this problem based on the number of queen placements remaining to achieve the goal. (Note that all goal nodes are precisely four steps from the start node!)

    Let L: # of Legal spaces left; R: # of queens Remaining. The heuristic function can be h(n) = R-(L/16) or, maybe a little better, R-((L-R)/16). Many other answers are also possible (they must be admissible). Also, if L<R then h(n) should be infinite, or very large (> 4 :-), and if L=R=1, then this state is guaranteed to reach a goal and h(n) could be 0 (underestimating on purpose).
     
  • Use your h function in an A* search to a goal node. Draw the search tree consisting of all 4x4 arrays produced by the search and label each array by its value of g and h. (Note that symmetry considerations allow us to generate only three successors of the start node).

    The following picture uses the more complex h fn above where h(n) = R-((L-R)/16) except when L<R h(n)= 1000 and when L=R=1 then h(n) = 0.
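
If you want to check the h values by machine rather than by hand, the more complex h can be sketched as follows. The representation (a list of (row, col) queen positions) and the helper names are assumptions for illustration only.

```python
N = 4  # board size and number of queens

def attacks(a, b):
    """True if squares a and b (row, col) share a row, column, or diagonal."""
    return a[0] == b[0] or a[1] == b[1] or abs(a[0] - b[0]) == abs(a[1] - b[1])

def legal_squares(queens):
    """Squares where one more queen could legally be placed."""
    return [(r, c) for r in range(N) for c in range(N)
            if all(not attacks((r, c), q) for q in queens)]

def h(queens):
    """h(n) = R - (L - R)/16, with the two special cases from the text."""
    L = len(legal_squares(queens))  # legal spaces left
    R = N - len(queens)             # queens remaining
    if L < R:
        return 1000                 # dead end: not enough room for the rest
    if L == R == 1:
        return 0
    return R - (L - R) / 16

print(h([]))                # empty board: L=16, R=4
print(h([(0, 0)]))          # queen in a corner: L=6, R=3
print(h([(0, 0), (1, 2)]))  # only one legal square left for two queens
```

The three calls give 3.25, 2.8125 and 1000 respectively.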

GRADUATES (681): Only those taking 681 need to do problems 8 – 11.

8. [9] Prove that if a heuristic function h is consistent/monotonic, then it is admissible.

  • Best to do an inductive proof. Recall that " h is consistent" means
              h(n) <= c(n,a,n') + h(n') where n' is a direct successor of n via action a.
    Recall that we want to show that "h is admissible" meaning
            h(n) <= h*(n) where h*(n) is the actual lowest cost from node n to a goal node.
    Let k be the number of steps on a lowest-cost path from n to a goal. We use induction on k.

    For k=1, let n' be the goal node. We know h(n')=0. We want to show h(n) <= h*(n), and in this case we know that h*(n) is c(n,a,n'). Since h is consistent we know h(n) <= c(n,a,n') + h(n'),
    and thus h(n) <= c(n,a,n') = h*(n).

    Assume the claim holds for every node whose lowest-cost path to a goal has at most k steps, and show that it holds for a node n whose lowest-cost path has k+1 steps.
    Let n' be the successor of n on that path, so n' is k steps from the goal and, by the induction hypothesis, h(n') <= h*(n'). We know by construction that h*(n) = c(n,a,n') + h*(n'). So
    to show h(n) <= h*(n) it suffices to show that
    h(n) <= c(n,a,n') + h*(n').
    We know that h is consistent, hence h(n) <= c(n,a,n') + h(n').
    Since h(n') <= h*(n'),
    h(n) <= c(n,a,n') + h(n') <= c(n,a,n') + h*(n') = h*(n) [so h(n) <= h*(n)].
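
    A numerical check is no substitute for the proof, but it can help to see the two definitions side by side. The sketch below uses a hypothetical four-node graph and hypothetical h values (neither comes from the homework), tests the consistency inequality on every edge, and computes h* with Dijkstra's algorithm run backwards from the goal.

```python
import heapq

# Hypothetical graph with goal "G": node -> list of (successor, step cost).
GRAPH = {
    "S": [("A", 2), ("B", 5)],
    "A": [("B", 2), ("G", 6)],
    "B": [("G", 3)],
    "G": [],
}
H = {"S": 6, "A": 5, "B": 3, "G": 0}  # hypothetical heuristic values

def is_consistent(graph, h):
    """Check h(n) <= c(n,a,n') + h(n') on every edge."""
    return all(h[n] <= cost + h[m] for n in graph for m, cost in graph[n])

def true_costs(graph, goal):
    """h*(n) for every node, by Dijkstra on the reversed graph."""
    reverse = {n: [] for n in graph}
    for n in graph:
        for m, cost in graph[n]:
            reverse[m].append((n, cost))
    best = {goal: 0}
    queue = [(0, goal)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for prev, cost in reverse[node]:
            if dist + cost < best.get(prev, float("inf")):
                best[prev] = dist + cost
                heapq.heappush(queue, (dist + cost, prev))
    return best

h_star = true_costs(GRAPH, "G")
print(is_consistent(GRAPH, H), all(H[n] <= h_star[n] for n in GRAPH))
```

    On this example h is consistent, and it is also admissible at every node, as the proof says it must be.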

 

9. [16] Problem 6.1 in the book; parts b, c, d, and e.

Dark outlines would not be expanded in optimal alpha/beta ordering.

10. [5] 6.12, part a.

Basically, if MAX has just won the trick, then MAX gets to play again; otherwise play alternates. Thus the successors of a MAX node could be either MIN or MAX nodes (and the same thing for MIN).

11. [4] The minimax algorithm returns the best move for MAX under the assumption that MIN plays optimally. What happens when MIN plays suboptimally?

The outcome for MAX can only be the same or better if MIN plays suboptimally compared to MIN playing optimally. If a deterministic model is available for MIN's irrationality, then it can be applied to the game tree in the same way as the optimal policy. However, in all real games, the opponent is only "reasonable" in some unspecified way and a pure minimax strategy can do far worse than some other schemes. Suppose MAX assumes MIN is rational and minimax says MIN will win. In such cases, all moves are losing and are "equally good", including those that lose immediately! A better algorithm would make moves for which it will be very difficult for MIN to find the winning line. Notice also that minimax never "sets traps".
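
A tiny example shows both halves of this answer. The two-ply tree below is hypothetical (the branch names and leaf values are made up): MAX picks a branch, then MIN picks a leaf in it.

```python
# Hypothetical two-ply game tree: MAX picks a branch, then MIN picks a leaf.
TREE = {"left": [3, 12], "right": [2, 100]}

def minimax_choice(tree):
    """Branch that maximises the worst-case (MIN plays optimally) leaf value."""
    return max(tree, key=lambda move: min(tree[move]))

def outcome(tree, max_move, min_policy):
    """Leaf value reached when MAX plays max_move and MIN picks with min_policy."""
    return min_policy(tree[max_move])

move = minimax_choice(TREE)
print(move, outcome(TREE, move, min))  # MIN plays optimally
print(outcome(TREE, move, max))        # MIN blunders: MAX does no worse
print(outcome(TREE, "right", max))     # what a trap-setting MAX could have had
```

Minimax picks "left" and guarantees 3; if MIN blunders MAX gets 12, which is no worse than the guarantee. A MAX that somehow knew MIN would blunder could have gone "right" and collected 100, which is exactly the kind of trap that pure minimax never sets.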