Template-based procedures for neural network interpretation
Although neural networks often achieve impressive learning and generalization
performance, their internal workings are typically all but impossible to
decipher. This opacity is one of the disadvantages of connectionism
compared to more traditional, rule-oriented
approaches to Artificial Intelligence. Without a thorough understanding of
network behavior, confidence in a system's results is lowered, and transfer of
learned knowledge to other processing systems--including humans--is precluded.
Methods that address the opacity problem by casting network weights in
symbolic terms are commonly referred to as rule extraction techniques.
This work describes a principled approach to symbolic rule
extraction from standard multilayer feedforward networks based on the notion
of weight templates, parameterized regions of weight space
corresponding to specific symbolic expressions. With an appropriate choice of
representation, we show how template parameters may be efficiently identified
and instantiated to yield the optimal match to a unit's actual weights.
Depending on the requirements of the application domain, the approach can
accommodate n-ary disjunctions and conjunctions with O(k) complexity,
simple n-of-m expressions with O(k^2) complexity, or more
general classes of recursive n-of-m expressions with O(k^(L+2)) complexity,
where k is the number of inputs to a unit and L is the recursion level of the
expression class. Compared to other approaches in the literature, our method
of rule extraction offers benefits in simplicity, computational performance,
and overall flexibility. Simulation results on a variety of problems
demonstrate the application of our procedures as well as the strengths and
weaknesses of our general approach.
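
To make the notion of an n-of-m expression concrete, the following is a minimal illustrative sketch, not the paper's extraction procedure. It assumes binary (0/1) inputs and a hard-threshold unit, and shows one weight setting under which such a unit computes an n-of-m expression over its first m inputs. The helper names (n_of_m, template_unit, unit_fires, agrees) and the magnitude parameter are hypothetical and introduced only for this example.

```python
from itertools import product


def n_of_m(n, literals):
    """Symbolic n-of-m expression: true when at least n literals are true."""
    return sum(literals) >= n


def template_unit(n, m, k, magnitude=1.0):
    """Weights and bias for a threshold unit over k binary inputs.

    Assumption for illustration: the first m inputs share one positive
    weight magnitude, the remaining k - m inputs are ignored (weight 0),
    and the bias places the threshold halfway between n - 1 and n
    active inputs.
    """
    weights = [magnitude] * m + [0.0] * (k - m)
    bias = -(n - 0.5) * magnitude
    return weights, bias


def unit_fires(weights, bias, inputs):
    """Hard-threshold unit: fires when the net input is positive."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias > 0


def agrees(n, m, k):
    """Check exhaustively that the unit matches the symbolic expression."""
    weights, bias = template_unit(n, m, k)
    return all(
        unit_fires(weights, bias, x) == n_of_m(n, x[:m])
        for x in product((0, 1), repeat=k)
    )


if __name__ == "__main__":
    k = 4
    print("disjunction (1-of-3):", agrees(1, 3, k))
    print("conjunction (3-of-3):", agrees(3, 3, k))
    print("2-of-3:", agrees(2, 3, k))
```

In this sketch a disjunction is the special case n = 1 and a conjunction is the special case n = m, which is why both fit the same n-of-m form. The exhaustive check over all 2^k input patterns is only practical for small k and is used here purely to confirm the correspondence.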