Written Homework #2 Solutions
1. [35] Consider the following
state-space graph with Start State S and Goal state G, heuristic function
values and path costs noted:

Assuming that successor states
are generated in alphabetical order (in stack-based algorithms, placed
on the Open List in alpha order either at the start or at the end),
and ties (for priority-queue based algorithms) broken in alphabetical
order, in what order are the nodes in this graph expanded by each of
the following search algorithms? Also for each, what is the cost of
the path found? If the path found is not optimal, briefly explain why
the algorithm didn't find the optimal path. Do not remove repeated states
(i.e. state C).
- Breadth-First Search: I'll do this one for you: expanded=SABCEG1,
path (SAG1) cost is 13 (not optimal); the open list still has C (via
B), F (via B), G3 (via C), and G1 (via E) on it. You still need to tell
me *why* it didn't find the optimal answer. Here's how I got this
answer: we start by expanding S, which places A, B, C (in that order)
on the open list. Then we expand A, placing E then G1 at the end of
the open list; then expand B, placing C and F at the end of the open
list; then expand C, placing G3 at the end; then expand E, placing G1
at the end; then expand G1, finishing with S-A-G1.
The reason that we don't find the optimal path
is that breadth-first search is only optimal when the shallowest solution
is optimal (i.e., when the cost to a node is purely a function of
its depth).
- Depth-First Search: expanded=SAEG1, cost 12. DFS is not optimal.
- Uniform Cost Search: expanded=SBACFDCG2, cost 9.
- Iterative Deepening: expanded=SSABCSAEG1, cost 12. ID is optimal only
when breadth-first is optimal.
- Greedy Best-First Search: expanded=SBCG3 (cost is either 16 or 14,
depending on which "C" you chose). Greedy search is not optimal.
- A*: expanded=SBCCFDG2, cost 9.
- IDA*: expanded=SBCFDCSBCFDG2, cost 9 (remember: depth-first search
within the f-limit).
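The breadth-first trace above can be replayed in a few lines. This is a minimal sketch, not the course's reference code: the `SUCCESSORS` table holds only the successor lists stated in the walk-through (the graph figure is not reproduced here, so path costs and the successors of F and G3 are omitted), and it assumes the goal test is applied when a node is expanded.

```python
from collections import deque

# Successors exactly as stated in the BFS walk-through; nodes the trace
# never expands (F, G3) are left without entries on purpose.
SUCCESSORS = {
    "S": ["A", "B", "C"],
    "A": ["E", "G1"],
    "B": ["C", "F"],
    "C": ["G3"],
    "E": ["G1"],
}
GOALS = {"G1", "G2", "G3"}


def bfs_tree_search(start):
    """Breadth-first tree search; repeated states are NOT removed."""
    open_list = deque([(start, [start])])  # FIFO queue of (state, path)
    expanded = []
    while open_list:
        state, path = open_list.popleft()
        expanded.append(state)
        if state in GOALS:  # goal test at expansion time
            return expanded, path, [s for s, _ in open_list]
        for child in SUCCESSORS.get(state, []):  # already in alphabetical order
            open_list.append((child, path + [child]))
    return expanded, None, []


expanded, path, remaining = bfs_tree_search("S")
print("".join(expanded))  # SABCEG1
print("-".join(path))     # S-A-G1
print(remaining)          # ['C', 'F', 'G3', 'G1']
```

Because repeated states are kept, C appears twice on the queue (once via S, once via B), which is why a second C is still waiting when G1 is reached.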
2. [15] Problem 3.17 from the book (five parts).
a.
Any path, no matter how bad it appears, might lead to an arbitrarily
large reward (NEGATIVE COST). Therefore, one would need to exhaust
all possible paths to be sure of finding the best one.
b.
Suppose the greatest possible reward is r.
Then if we also know the maximum depth of the state space (e.g. when
the state space is a tree), any path with d levels remaining can be
improved by at most rd, so any path that is more than rd worse than
the best path can be pruned. For state spaces with loops,
this won't help, because you can go around the loop any number of
times, picking up r each time (see next part).
[Note: the above answer is Stuart Russell's. If I am reading it correctly,
I think the pruning part is buggy... But the core idea, that such info
allows us to still search in a tree, is correct.]
c.
The agent should plan to go around this loop forever (unless it can find
another loop with an even better reward).
d.
The value of a scenic loop is lessened each time one revisits it;
a novel site is a great reward, but seeing the same one for the tenth
time in an hour is tedious, not rewarding. To accommodate this, we
would have to expand the state space to include a memory---a state
is now not just represented by the current location, but by the current
location and a bag of already-visited locations. The reward for visiting
a new location is now a diminishing function of the number of times
it has been seen before.
e.
Real domains with looping behavior include eating junk food, going
to class, watching certain movies, etc.
3. [8] Problem 4.2 (four parts)
Here f(n) = (2-w)g(n) + w h(n).
w=0 gives f(n) = 2g(n). This behaves exactly like uniform cost search
because the constant POSITIVE factor makes no difference in the
order in which nodes are visited.
w=1 gives f(n) = g(n) + h(n), i.e., A* search.
w=2 gives f(n) = 2h(n), i.e., greedy best-first search.
We can rewrite the equation as f(n) = (2-w)[g(n) + (w/[2-w])h(n)]. For
w < 2 the outside factor is positive, so it does not change the node
ordering. In particular, when w <= 1 the factor w/(2-w) multiplying h(n)
is at most 1, so the effective heuristic is no larger than h(n); it
still underestimates and is therefore admissible (given that h itself
is admissible).
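A quick numeric check of the three special cases and of the factored form, using f(n) = (2-w)g(n) + w h(n), which is what the factored equation above expands to. This is only a sketch: the (name, g, h) triples are made-up values chosen for illustration, not nodes from any graph in this homework.

```python
def f(g, h, w):
    """Evaluation function f(n) = (2-w)g(n) + w*h(n)."""
    return (2 - w) * g + w * h


def f_factored(g, h, w):
    """Same function written as (2-w)[g + (w/(2-w))h]; requires w < 2."""
    return (2 - w) * (g + (w / (2 - w)) * h)


# Hypothetical frontier entries: (name, g, h)
frontier = [("a", 3, 5), ("b", 1, 9), ("c", 4, 2)]


def order(w):
    """Names sorted by f for the given w, i.e. the expansion priority."""
    ranked = sorted(frontier, key=lambda n: f(n[1], n[2], w))
    return [name for name, _, _ in ranked]


print(order(0))  # ['b', 'a', 'c']  same as sorting by g (uniform cost)
print(order(1))  # ['c', 'a', 'b']  sorting by g + h (A*)
print(order(2))  # ['c', 'a', 'b']  same as sorting by h (greedy)
```

The factor of 2 at w=0 and w=2 never shows up in the orderings, which is the point of the "constant positive factor" remark.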
4. [8] Problem 5.4 (two parts)
a.
Crossword puzzle construction can be solved in many ways. One simple
way is depth-first search. Each successor fills in a word in the puzzle
with one of the words in the dictionary. It is better to go a word
at a time, to minimize the number of steps.
b.
As a CSP, there are even more choices. You could have a variable for
each box in the crossword puzzle; in this case the value of each
variable is a letter, and the constraints are that the letters must
form words. This approach is feasible with a most-constraining-value
heuristic. Alternatively, we could have each string of consecutive horizontal
or vertical boxes be a single variable, and the domain of each variable
is the set of words in the dictionary of the right length. The constraints would
say that two intersecting words must have the same letter in the connecting
box. Solving a problem in this formulation requires fewer steps, but
the domains are bigger (assuming a really big dictionary) and there
are fewer constraints. Both formulations are feasible in practice.
UNDERGRADS (481): Only those taking 481 need to do problems 5, 6 and 7.
5. [9] (a) Describe a search
space where iterative deepening performs much worse than depth-first
search.
A search tree that is a full tree of depth d with branching factor
b and only one goal state: the leftmost node at depth d. DFS finds
the goal in time O(d), while IDS needs time O(b^d).
(b) Construct a finite
search tree for which it is possible that depth-first search
uses more memory than breadth-first search. (Be sure to show the goal
node(s) in your tree.)
In the worst case,
depth-first search requires memory proportional to the depth of the
tree, while breadth-first search requires O(b^g), where g is the depth
of the shallowest goal node. In a deep tree (large d) with a sufficiently
small branching factor (small b) and a shallow goal node (small g),
depth-first search can require more memory than breadth-first search.
(c) Is there any tree
and distribution of goal nodes for which depth-first search always requires
more memory than breadth-first search? Briefly justify your answer.
There is no case
where depth-first search always requires more memory than breadth-first
search, since depth-first search can always go straight to the shallowest
goal node.
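The gap claimed in part (a) can be seen by counting visited nodes on a small instance. The sketch below assumes an implicit complete tree whose nodes are tuples of child indices, with the single goal at the leftmost leaf; the function names and the choice b=2, d=3 are illustrative only.

```python
def depth_limited(node, goal, b, limit, counter):
    """Left-to-right depth-limited DFS; counter[0] counts visited nodes."""
    counter[0] += 1
    if node == goal:
        return True
    if len(node) == limit:  # depth of a node = length of its index tuple
        return False
    return any(depth_limited(node + (i,), goal, b, limit, counter)
               for i in range(b))


def dfs_visits(b, d):
    """Nodes visited by plain DFS when the goal is the leftmost leaf."""
    counter = [0]
    depth_limited((), (0,) * d, b, d, counter)
    return counter[0]


def ids_visits(b, d):
    """Nodes visited by iterative deepening on the same tree."""
    counter = [0]
    goal = (0,) * d
    for limit in range(d + 1):  # restart with limits 0, 1, ..., d
        if depth_limited((), goal, b, limit, counter):
            break
    return counter[0]


print(dfs_visits(2, 3))  # 4
print(ids_visits(2, 3))  # 15
```

DFS walks straight down the left edge (d+1 nodes), while iterative deepening first exhausts every complete tree of depth 0 through d-1 before the final pass.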
6. [15] Problem 4.9 (there are three parts)
The misplaced tiles heuristic is exact for the problem where a tile can
move from square A to square B. As this is itself a relaxation of
the condition that a tile can move from A to B only if B is blank,
Gaschnig's heuristic cannot be less than the misplaced tiles heuristic.
As it is also admissible (being the exact cost of a relaxation of the
original problem), it is at least as accurate.
If we permute 2 adjacent tiles in the goal state, we have a state where
misplaced tiles and Manhattan distance both return 2, but Gaschnig's
heuristic returns 3.
To compute Gaschnig's heuristic, repeat the following until the goal
state is reached: Let B be the location of the blank. If B is occupied
by tile X (not the blank) in the goal state, then move X to B. Otherwise,
move any misplaced tile to B. The number of moves made is the heuristic
value. (This is actually an optimal solution to the relaxed problem.)
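The procedure above translates almost line for line into code. The following is a sketch under stated assumptions: states are 9-tuples read row by row, 0 stands for the blank, and the goal layout (blank in the last square) is chosen only for illustration.

```python
def misplaced(state, goal):
    """Number of non-blank tiles that are not on their goal square."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)


def manhattan(state, goal, width=3):
    """Sum of row + column distances of each non-blank tile to its goal square."""
    total = 0
    for i, tile in enumerate(state):
        if tile != 0:
            j = goal.index(tile)
            total += abs(i // width - j // width) + abs(i % width - j % width)
    return total


def gaschnig(state, goal):
    """Moves needed when any tile may jump directly into the blank square."""
    state, goal = list(state), list(goal)
    moves = 0
    while state != goal:
        b = state.index(0)  # B: current location of the blank
        if goal[b] != 0:
            x = state.index(goal[b])  # tile X that belongs on square B
        else:
            # Blank is already home: move any misplaced tile into it.
            x = next(i for i, t in enumerate(state) if t != 0 and t != goal[i])
        state[b], state[x] = state[x], state[b]
        moves += 1
    return moves


GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)     # assumed goal layout, 0 = blank
SWAPPED = (2, 1, 3, 4, 5, 6, 7, 8, 0)  # two adjacent tiles permuted
print(misplaced(SWAPPED, GOAL), manhattan(SWAPPED, GOAL), gaschnig(SWAPPED, GOAL))  # 2 2 3
```

On the swapped state the blank is already home, so one tile must first be parked in the blank before the two tiles can be placed, giving three moves instead of two.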
7. [10] In the "Four-Queens
puzzle", we try to place 4 queens on a 4x4 chess board so that
none can capture any other (that is, only one queen can be on any row,
column, and diagonal of the array). Suppose we try to solve this problem
with the following problem space: The start node is labeled by an empty
4x4 array; the successor function creates new 4x4 arrays containing
one additional legal placement of a queen anywhere in the array; the
goal predicate is satisfied iff there are four queens in the array (legally
positioned).
- Invent an admissible heuristic function for this problem based on
the number of queen placements remaining to achieve the goal. (Note
that all goal nodes are precisely four steps from the start node!)
Let L = # of legal spaces left and R = # of queens
remaining. The heuristic function can be h(n) = R-(L/16) or, maybe a
little better, R-((L-R)/16). Many other answers are also possible (they
must be admissible). Also, if L<R then h(n) should be infinite, or very
large (> 4 :-), and if L=R=1, then this state is guaranteed to reach
a goal and h(n) could be 0 (underestimating on purpose).
- Use your h function in an A* search to a goal node. Draw the search
tree consisting of all 4x4 arrays produced by the search and label
each array by its value of g and h. (Note that symmetry considerations
allow us to generate only three successors of the start node).
The following picture uses the more complex h fn above, where
h(n) = R-((L-R)/16), except that h(n) = 1000 when L<R and h(n) = 0
when L=R=1.
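To label nodes of the search tree by hand, it can help to compute h mechanically. The sketch below implements the more complex h described above; the board representation (a set of (row, col) pairs) and the helper name `legal_squares` are assumptions made for this example.

```python
def legal_squares(queens, n=4):
    """Squares not attacked by (or occupied by) any queen in `queens`."""
    def attacked(r, c):
        return any(r == qr or c == qc or abs(r - qr) == abs(c - qc)
                   for qr, qc in queens)
    return [(r, c) for r in range(n) for c in range(n) if not attacked(r, c)]


def h(queens, n=4):
    """h(n) = R - (L - R)/16, with the two special cases from the solution."""
    R = n - len(queens)                # queens remaining
    L = len(legal_squares(queens, n))  # legal spaces left
    if L < R:
        return 1000                    # dead end: too few legal squares
    if L == R == 1:
        return 0
    return R - (L - R) / 16


print(h(set()))     # 3.25    (L=16, R=4)
print(h({(0, 0)}))  # 2.8125  (L=6,  R=3)
```

Since every goal is exactly R placements away and L >= R whenever the general formula is used, the value never exceeds R, which is what admissibility requires here.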
GRADUATES (681): Only those taking 681 need to do problems 8-11.
8. [9] Prove that if a heuristic
function h is consistent/monotonic, then it is admissible.
- Best to do an inductive proof. Recall that "h is consistent" means
h(n) <= c(n,a,n') + h(n') where n' is a direct successor
of n via action a.
Recall that we want to show that "h is admissible", meaning
h(n) <= h*(n) where h*(n) is the actual lowest cost from
node n to a goal node.
Let k be the number of steps on the lowest-cost path from n to a goal.
Base case: for k=1, let n' be the goal node. We know h(n')=0.
We want to show h(n) <= h*(n), and in this case we know that h*(n)
is c(n,a,n'). Since h is consistent we know h(n) <= c(n,a,n') +
h(n'), and thus h(n) <= c(n,a,n') = h*(n).
Inductive step: assume the claim holds for every node whose lowest-cost
path to a goal has at most k steps, and show it holds for a node n that
is k+1 steps from a goal.
Let n' be the successor of n on that path, so n' is k steps from the
goal and, by the inductive hypothesis, h(n') <= h*(n'). We know by
construction that h*(n) = c(n,a,n') + h*(n'). So
to show h(n) <= h*(n) it suffices to show that
h(n) <= c(n,a,n') + h*(n').
We know that h is consistent, hence h(n) <= c(n,a,n') + h(n').
Since h(n') <= h*(n'),
h(n) <= c(n,a,n') + h(n') <= c(n,a,n') + h*(n') = h*(n) [so
h(n) <= h*(n)].
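The claim can also be sanity-checked numerically on a small example. This is an illustration, not a substitute for the proof: the graph, its edge costs, and the heuristic values below are all hypothetical, and h* is computed with Dijkstra's algorithm run backwards from the goal.

```python
import heapq


def true_costs(graph, goal):
    """h*(n) for every node that can reach `goal` (Dijkstra on reversed edges)."""
    reverse = {}
    for n, edges in graph.items():
        for m, cost in edges.items():
            reverse.setdefault(m, {})[n] = cost
    dist = {goal: 0}
    queue = [(0, goal)]
    while queue:
        d, n = heapq.heappop(queue)
        if d > dist.get(n, float("inf")):
            continue  # stale queue entry
        for m, cost in reverse.get(n, {}).items():
            if d + cost < dist.get(m, float("inf")):
                dist[m] = d + cost
                heapq.heappush(queue, (d + cost, m))
    return dist


def is_consistent(graph, h, goal):
    """h(goal) = 0 and h(n) <= c(n,a,n') + h(n') on every edge."""
    return h[goal] == 0 and all(h[n] <= cost + h[m]
                                for n, edges in graph.items()
                                for m, cost in edges.items())


def is_admissible(graph, h, goal):
    """h(n) <= h*(n) for every node that can reach the goal."""
    h_star = true_costs(graph, goal)
    return all(h[n] <= h_star[n] for n in h_star)


# Hypothetical graph: node -> {successor: step cost}
GRAPH = {"S": {"A": 2, "B": 5}, "A": {"B": 1, "G": 6}, "B": {"G": 2}, "G": {}}
H = {"S": 4, "A": 3, "B": 2, "G": 0}

print(is_consistent(GRAPH, H, "G"), is_admissible(GRAPH, H, "G"))  # True True
```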
9. [16] Problem 6.1 in the
book; parts b, c, d, and e.
Dark outlines would not be expanded in optimal alpha/beta ordering.

10. [5] 6.12, part a.
Basically, if MAX has just won the trick, then MAX gets to play again;
otherwise play alternates. Thus the successors of a MAX node could be
either MIN or MAX nodes (and the same holds for MIN).
11. [4] The minimax algorithm
returns the best move for MAX under the assumption that MIN plays optimally.
What happens when MIN plays suboptimally?
The outcome for MAX can only be the same or better if MIN plays suboptimally
compared to MIN playing optimally. If a deterministic model is available
for MIN's irrationality, then it can be applied to the game tree in
the same way as the optimal policy. However, in all real games, the
opponent is only "reasonable" in some unspecified way and
a pure minimax strategy can do far worse than some other schemes.
Suppose MAX assumes MIN is rational and minimax says MIN will win.
In such cases, all moves are losing and are "equally good",
including those that lose immediately! A better algorithm would make
moves for which it will be very difficult for MIN to find the winning
line. Notice also that minimax never "sets traps".