CSCI 3202
Artificial Intelligence

Assignment 2

Assigned: Thu Sep 2
Due: Thu Sep 9

In this assignment, you will implement a simulation of the Wumpus World, build several agents that interact with the world, and collect statistics on their performance.

You  may use whatever computer language you find convenient.  You do not need to implement a graphical interface to the simulation.    The assignment is broken into 4 parts to guide you along.  You must do all four parts.

Part 1

Write a function that creates a Wumpus World according to the rules described in this assignment and in Section 7.2 of Russell & Norvig.  Your simulation should represent the 16 rooms of the Wumpus World, and should (a) pick a room for the wumpus, (b) pick a room for the gold (not the same room as the wumpus is in), (c) generate pits with probability .2 (but never put pits in room [1,1] or in the room containing the gold), and (d) determine the sensations (breeze, stench, and glitter) in each room.

(The Wumpus World we describe should be the same as that of Russell & Norvig, except that we allow you an unlimited number of arrows, and we added the constraint that there can be no pits or wumpuses in room [1,1] or in the room containing the gold.  If you see any other differences or ambiguities, email Prof. Mozer.)

Creating the Wumpus World defines the world state, which consists of:  (a) an array specifying the contents of each room, (b) the [x,y] coordinates of the room occupied by the agent, (c) the direction the agent is facing (up, down, left, right), and (d) the sensations available to the agent (breeze, stench, glitter, bump, scream).

Part 2

Write a function that takes the current world state and an action of the agent, and returns the new world state and a payoff resulting from the action. Remember, when the agent tries to walk off the board, it does not move, and it senses a bump.  And when the wumpus is killed, a scream is generated in all rooms.

The actions are:  move forward in the current direction, turn left, turn right, grab object in room, and fire an arrow.  Firing an arrow will shoot the arrow in the direction the agent is facing, and the arrow will continue until it hits a wall or hits the wumpus.  If the wumpus is hit it will die.  Killing the wumpus removes the wumpus and all stenches from the world.  "Grabbing the object" will have no effect unless the room contains gold, in which the gold is picked up.

The payoffs resulting from an action are as follows:  +1000 if the gold is picked up; otherwise, -1000 if the agent enters a room containing a pit or the wumpus; otherwise, -10 if an arrow was shot; otherwise -1 for whatever action was taken.

Part 3

Write a simulation involving a very simple agent.  The rules of the agent should be as follows:  (1)  If the agent senses glitter in the current room, it should pick up the gold.  (2) If the agent did not shoot an arrow on the last time step, and the agent senses a stench in the current room, it should shoot an arrow. (3) Otherwise, the agent should choose one of the remaining three actions (turn left, turn right, move forward) at random.

The simulation should terminate when:  (1) a payoff of -1000 is received, (2) a payoff of +1000 is received, or (3) 1000 actions have been performed.

You should compute the cumulative payoff over the course of the simulation, which is the sum of the payoffs received up to and including the action at which the simulation terminates.

You should run the simulation 10000 times to determine the average cumulative payoff of your agent.  For each run, create a new Wumpus World according to the rules given in part 1, and return the agent to the starting room.  Running your simulation just one time doesn't tell you much about your agent's performance, because that one environment that was created might happen to be a very simple environment (e.g., the gold is in [1,2]) or an impossible environment (e.g., the gold is surrounded by pits).  By running the simulation a large number of times, you get an expected measure of performance across the class of environments specified by the Wumpus World.  (By analogy, suppose you are thrown into a strange city and your task is to navigate to a particular goal location.  Even if you are generally good at directions, some cities my just be very difficult to navigate, and so it will take you a long time to get to your goal.  The fact that it took you a long time in one particular city doesn't necessarily imply that you will have a difficult time in another city.  By testing out your navigation abilities in a variety of cities, you can get a better sense of how good a navigator you are in general.)

Part 4

Design a more intelligent agent than the dumb one in Part 3.  You can optimize your agent by trying out different programs and seeing which one produces the highest average cumulative payoff (over 10000 runs).

You may not build in knowledge of the current environment (e.g., the fact that there is a pit in [1,2], or that the gold can be found in [3,3]).  However, you can build in knowledge of the class of environments (e.g., the fact that the agent starts in [1,1] and that the world has a 4x4 array of rooms, and that pits are placed at random with probability .2 in any room.)

What to hand in

On the due date, you are expected to bring to class hardcopies of your work.  The work should include the code and a one page write up.  The write up should provide the average cumulative payoff of the simple agent (part 3) and of your more intelligent agent (part 4).   You should describe the program (algorithm) of your more intelligent agent.  You may choose to describe the program as a ranked list of mutually exclusive conditiona-action rules (i.e., the top ranked rule is chosen if it applies).  The explanation of your program should be easy for a reader to understand; do not simply include the code for the agent.

Warnings

We are expecting you to implement the Wumpus World exactly as it is specified in the text (which is consistent with how we've described it in this assignment).  Consequently, the performance statistic you compute for part 3 should match the statistics computed by your classmates.  You are allowed to check with your classmates to make sure you are getting comparable statistics. (Remember, they may not be exactly the same, due to different random number generation sequences.)  You may also talk to your classmates to make sure you have implemented the Wumpus World according to specification.

No late assignments will be accepted.  They are due in class on the due date.  If there are circumstances beyond your control that will prevent you from handing the assignment in on time, let Professor Mozer know in advance.