"A stochastic algorithm for feature selection in pattern recognition."

S. GADAT, L. YOUNES

Abstract:

We introduce a new model formalizing feature selection from a large dictionary of variables that can be computed from a signal or an image. Features are extracted according to an efficiency criterion, on the basis of specified classification or recognition tasks. This is done by estimating a probability distribution $P$ on the complete dictionary that assigns larger weights to the more efficient, or more informative, components. The estimation is carried out by a stochastic gradient descent algorithm that uses the probability as a state variable and optimizes a multi-task goodness-of-fit criterion for classifiers based on variables randomly chosen according to $P$. The method is tested on several pattern recognition problems, including face detection, handwritten digit recognition, and spam classification.
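
To make the idea of a stochastic gradient descent on $P$ concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single binary task, a nearest-centroid classifier, a score-function (likelihood-ratio) gradient estimate, and a projection onto the probability simplex mixed with a small uniform floor. The names `project_to_simplex`, `subset_error`, `learn_feature_weights` and the parameters `k`, `lr`, `eps` are all hypothetical choices made for illustration.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def subset_error(X, y, feats):
    """Training error of a nearest-centroid classifier using only `feats`.
    (Assumed stand-in for the goodness-of-fit criterion; labels are 0/1.)"""
    Z = X[:, feats]
    c0 = Z[y == 0].mean(axis=0)
    c1 = Z[y == 1].mean(axis=0)
    closer_to_1 = np.linalg.norm(Z - c1, axis=1) < np.linalg.norm(Z - c0, axis=1)
    return float(np.mean(closer_to_1.astype(int) != y))

def learn_feature_weights(X, y, k=3, steps=500, lr=0.002, eps=0.1, seed=0):
    """Stochastic gradient descent with the distribution P as state variable."""
    rng = np.random.default_rng(seed)
    n_feat = X.shape[1]
    uniform = np.full(n_feat, 1.0 / n_feat)
    P = uniform.copy()
    baseline = 0.5  # running average of the error, used to reduce variance
    for _ in range(steps):
        # Draw k features independently according to P.
        feats = rng.choice(n_feat, size=k, replace=True, p=P)
        err = subset_error(X, y, feats)
        # Score-function estimate of the gradient of E_P[err] with respect to P.
        counts = np.bincount(feats, minlength=n_feat)
        grad = (err - baseline) * counts / P
        # Gradient step, projection, then a uniform floor so P stays positive.
        P = (1.0 - eps) * project_to_simplex(P - lr * grad) + eps * uniform
        baseline = 0.9 * baseline + 0.1 * err
    return P
```

A call such as `P = learn_feature_weights(X, y)` with `X` of shape `(n_samples, n_features)` and binary labels `y` returns a probability vector over the features. Under these assumptions, features whose sampled subsets tend to give lower error should receive more mass, although the estimate is noisy and depends on the step size.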