CSCI 3202: Artificial Intelligence, Fall 2000

Problem Set 4 Partial Solutions

Problem 4.3: Here is a short Scheme program to find the answer to 4.3.

(define (log2 n) (/ (log n) (log 2))) (define (info p q) (if (or (= p 0) (= q 0)) 0 (* -1 (+ (* p (log2 p)) (* q (log2 q)))))) (define (info-gain ys1 ns1 ys2 ns2 ys3 ns3) (let* ((tot1 (+ ys1 ns1)) (tot2 (+ ys2 ns2)) (tot3 (+ ys3 ns3))) (- (info (/ ys1 tot1) (/ ns1 tot1)) (+ (* (/ tot2 tot1) (info (/ ys2 tot2) (/ ns2 tot2))) (* (/ tot3 tot1) (info (/ ys3 tot3) (/ ns3 tot3)))))))

We find:

>>> (info-gain 100 100 70 40 30 60) 0.06665370714512742 >>> (info-gain 100 100 12 0 88 100) 0.06276448651079403 >>> (info-gain 100 100 13 0 87 100) 0.06826219558661883

So N = 13 is the smallest value of N that would make the second attribute more informative than the first.