Jx-WFST : A Wrapper Feature Selection Toolbox

March 4, 2021

"Toward Talent Scientist: Sharing and Learning Together" --- Jingwei Too


Introduction

  • This toolbox offers more than 40 wrapper feature selection methods

  • The A_Main file provides examples of how to apply these methods to a benchmark dataset

  • The source code of these methods is written based on the pseudocode and papers

  • The main goals of this toolbox are:

    • Sharing knowledge on wrapper feature selection
    • Assisting others in data mining projects

Usage

The main function jfs performs feature selection. You may switch the algorithm by replacing 'pso' with the abbreviation of another method

  • If you wish to use particle swarm optimization ( see example 1 ) then you may write
FS = jfs('pso',feat,label,opts);
  • If you want to use slime mould algorithm ( see example 2 ) then you may write
FS = jfs('sma',feat,label,opts);

Input

  • feat : feature matrix ( Instances x Features )
  • label : label vector ( Instances x 1 )
  • opts : parameter settings
    • N : number of solutions / population size ( for all methods )
    • T : maximum number of iterations ( for all methods )
    • k : k-value in k-nearest neighbor
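
As a rough illustration of the expected input shapes, the sketch below builds a small synthetic dataset and an opts structure. The synthetic data and the helper variables num_inst and num_feat are made up for illustration; the toolbox examples load ionosphere.mat instead.

```matlab
% Synthetic data (illustrative only): 100 instances, 8 features
rng(1);                                  % fix the random seed for repeatability
num_inst = 100;
num_feat = 8;
feat  = rand(num_inst, num_feat);        % Instances x Features
label = [ones(50,1); 2 * ones(50,1)];    % Instances x 1

% Common parameter settings
opts.N = 10;     % number of solutions / population size
opts.T = 100;    % maximum number of iterations
opts.k = 5;      % k-value in k-nearest neighbor

% Hold-out partition (20% validation), as used in the toolbox examples
opts.Model = cvpartition(label, 'HoldOut', 0.2);

% Basic shape checks before calling jfs
assert(size(feat,1) == size(label,1), 'feat and label need the same number of rows');
assert(size(label,2) == 1, 'label should be a column vector');
```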

Output

  • Acc : accuracy of validation model
  • FS : feature selection model ( It contains several results )
    • sf : index of selected features
    • ff : selected features
    • nf : number of selected features
    • c : convergence curve
    • t : computational time (s)
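
The fields sf, ff and nf are closely related. The sketch below shows the relationship one would expect from their descriptions, using a hand-made result instead of a real jfs call. The data and the selected indices are hypothetical, and the assumption that ff equals feat(:, sf) is inferred from the field descriptions rather than taken from the source code.

```matlab
% Illustrative data: 6 instances, 5 features
feat = magic(6);
feat = feat(:, 1:5);

% Hypothetical result: features 1, 3 and 4 were selected
sf = [1 3 4];

% Assumed relationship between the output fields
FS.sf = sf;              % index of selected features
FS.ff = feat(:, sf);     % selected features (reduced feature matrix)
FS.nf = numel(sf);       % number of selected features

disp(size(FS.ff));       % rows = instances, columns = FS.nf
```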

Notation

Some methods have their own specific parameters ( for example PSO, GA, DE ). If you do not set them, the default settings are used

  • you may open the .m file to view or change the parameters
  • you may use opts to set the parameters of a method ( see example 1 )
  • you may also change the fitness function in the jFitnessFunction file
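
To give an idea of what a wrapper fitness function typically evaluates, here is a minimal sketch. It is not the content of the jFitnessFunction file. It assumes a commonly used cost that weighs the validation error of a kNN classifier against the fraction of selected features; the weight alpha, the synthetic data and the candidate mask are all illustrative assumptions.

```matlab
% Sketch of a wrapper fitness evaluation (NOT the toolbox's jFitnessFunction).
% Assumed cost: alpha * error_rate + (1 - alpha) * (selected / total features)
rng(1);
feat  = [randn(60,4); randn(60,4) + 1.5];   % synthetic two-class data
label = [ones(60,1); 2 * ones(60,1)];

alpha = 0.99;                 % assumed weight on classification error
k     = 5;                    % k-value in k-nearest neighbor
HO    = cvpartition(label, 'HoldOut', 0.2);
mask  = logical([1 0 1 1]);   % candidate solution: true = feature kept

% Train kNN on the training split using only the selected features
xtrain = feat(training(HO), mask);
ytrain = label(training(HO));
xvalid = feat(test(HO), mask);
yvalid = label(test(HO));
Mdl    = fitcknn(xtrain, ytrain, 'NumNeighbors', k);

% Error rate on the validation split
pred = predict(Mdl, xvalid);
err  = mean(pred ~= yvalid);

% Lower cost is better: few errors and few features
cost = alpha * err + (1 - alpha) * (sum(mask) / numel(mask));
fprintf('error = %.3f, cost = %.4f\n', err, cost);
```

If a candidate mask selects no feature at all, a common convention is to assign the worst cost ( for example 1 ) instead of training a model.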

Example 1 : Particle Swarm Optimization ( PSO )

% Common parameter settings
opts.k  = 5;      % k-value in k-nearest neighbor
opts.N  = 10;     % number of solutions
opts.T  = 100;    % maximum number of iterations
% Parameters of PSO
opts.c1 = 2;
opts.c2 = 2;
opts.w  = 0.9;

% Load dataset
load ionosphere.mat;

% Ratio of validation data
ho = 0.2;
% Divide data into training and validation sets
HO = cvpartition(label,'HoldOut',ho); 
opts.Model = HO; 

% Perform feature selection 
FS = jfs('pso',feat,label,opts);

% Define index of selected features
sf_idx = FS.sf;

% Accuracy  
Acc = jknn(feat(:,sf_idx),label,opts); 

% Plot convergence
plot(FS.c); grid on;
xlabel('Number of Iterations'); 
ylabel('Fitness Value');
title('PSO');
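
The parameters c1, c2 and w in this example are presumably the usual PSO cognitive coefficient, social coefficient and inertia weight. The sketch below shows one standard PSO update step with these values. Keeping positions in [0, 1] and selecting a feature when its position exceeds 0.5 is an assumed decoding scheme for illustration, not necessarily the one used inside the toolbox.

```matlab
rng(1);
N  = 10;                      % population size
D  = 6;                       % number of features
c1 = 2;                       % cognitive coefficient (same value as opts.c1)
c2 = 2;                       % social coefficient    (same value as opts.c2)
w  = 0.9;                     % inertia weight        (same value as opts.w)

X     = rand(N, D);           % positions in [0, 1]
V     = zeros(N, D);          % velocities
pbest = X;                    % personal bests (initially the start positions)
gbest = X(1, :);              % global best (chosen arbitrarily for this sketch)

% One standard PSO velocity and position update
r1 = rand(N, D);
r2 = rand(N, D);
V  = w * V + c1 * r1 .* (pbest - X) + c2 * r2 .* (repmat(gbest, N, 1) - X);
X  = X + V;
X  = min(max(X, 0), 1);       % keep positions inside [0, 1]

% Assumed decoding: a feature is selected when its position exceeds 0.5
selected = X > 0.5;
disp(find(selected(1, :)));   % selected feature indices of the first particle
```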

Example 2 : Slime Mould Algorithm ( SMA )

% Common parameter settings
opts.k  = 5;      % k-value in k-nearest neighbor
opts.N  = 10;     % number of solutions
opts.T  = 100;    % maximum number of iterations

% Load dataset
load ionosphere.mat; 

% Ratio of validation data
ho = 0.2;
% Divide data into training and validation sets
HO = cvpartition(label,'HoldOut',ho); 
opts.Model = HO; 

% Perform feature selection 
FS = jfs('sma',feat,label,opts);

% Define index of selected features
sf_idx = FS.sf;

% Accuracy  
Acc = jknn(feat(:,sf_idx),label,opts); 

% Plot convergence
plot(FS.c); grid on; 
xlabel('Number of Iterations');
ylabel('Fitness Value'); 
title('SMA');

Example 3 : Whale Optimization Algorithm ( WOA )

% Common parameter settings
opts.k  = 5;      % k-value in k-nearest neighbor
opts.N  = 10;     % number of solutions
opts.T  = 100;    % maximum number of iterations
% Parameter of WOA
opts.b = 1;

% Load dataset
load ionosphere.mat; 

% Ratio of validation data
ho = 0.2;
% Divide data into training and validation sets
HO = cvpartition(label,'HoldOut',ho); 
opts.Model = HO; 

% Perform feature selection 
FS = jfs('woa',feat,label,opts);

% Define index of selected features
sf_idx = FS.sf;

% Accuracy  
Acc = jknn(feat(:,sf_idx),label,opts); 

% Plot convergence
plot(FS.c); grid on; 
xlabel('Number of Iterations'); 
ylabel('Fitness Value'); 
title('WOA');
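
In the WOA literature, b is the constant that defines the shape of the logarithmic spiral, and opts.b presumably plays that role here. The following sketch shows a single spiral move toward the best solution. The two positions are hypothetical, and clipping to [0, 1] is an assumption made for this illustration.

```matlab
rng(1);
b     = 1;                       % spiral shape constant, as in opts.b
Xbest = [0.8 0.2 0.6 0.4];       % hypothetical best solution so far
X     = [0.1 0.9 0.5 0.3];       % hypothetical current whale position

l    = 2 * rand() - 1;           % random number in [-1, 1]
dist = abs(Xbest - X);           % distance to the best solution
Xnew = dist .* exp(b * l) .* cos(2 * pi * l) + Xbest;   % spiral move
Xnew = min(max(Xnew, 0), 1);     % keep the position inside [0, 1]

disp(Xnew);
```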

Requirement

  • MATLAB 2014 or above
  • Statistics and Machine Learning Toolbox
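
A quick way to check both requirements from the command window is sketched below. It assumes that "MATLAB 2014" means release R2014a, which corresponds to MATLAB version 8.3; the variable has_stats is introduced only for this illustration.

```matlab
% Assumption: "MATLAB 2014" is read as R2014a, i.e. MATLAB version 8.3
if verLessThan('matlab', '8.3')
    warning('MATLAB R2014a or newer is expected.');
end

% Check that the Statistics and Machine Learning Toolbox is installed
has_stats = ~isempty(ver('stats'));
if ~has_stats
    warning('Statistics and Machine Learning Toolbox was not found.');
end
```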

List of available wrapper feature selection methods

  • Note that the methods are adapted so that they can be used for feature selection tasks
  • The extra parameters are the parameter(s) other than population size and maximum number of iterations
  • Click on the name of a method to view its extra parameter(s)
  • Use opts to set the specific parameter(s)

| No. | Abbreviation | Name | Year | Extra Parameters |
|-----|--------------|------|------|------------------|
| 43 | 'mpa' | Marine Predators Algorithm | 2020 | Yes |
| 42 | 'gndo' | Generalized Normal Distribution Optimization | 2020 | No |
| 41 | 'sma' | Slime Mould Algorithm | 2020 | No |
| 40 | 'mrfo' | Manta Ray Foraging Optimization | 2020 | Yes |
| 39 | 'eo' | Equilibrium Optimizer | 2020 | Yes |
| 38 | 'aso' | Atom Search Optimization | 2019 | Yes |
| 37 | 'hgso' | Henry Gas Solubility Optimization | 2019 | Yes |
| 36 | 'hho' | Harris Hawks Optimization | 2019 | No |
| 35 | 'pfa' | Path Finder Algorithm | 2019 | No |
| 34 | 'pro' | Poor And Rich Optimization | 2019 | Yes |
| 33 | 'boa' | Butterfly Optimization Algorithm | 2018 | Yes |
| 32 | 'epo' | Emperor Penguin Optimizer | 2018 | Yes |
| 31 | 'tga' | Tree Growth Algorithm | 2018 | Yes |
| 30 | 'abo' | Artificial Butterfly Optimization | 2017 | Yes |
| 29 | 'ssa' | Salp Swarm Algorithm | 2017 | No |
| 28 | 'wsa' | Weighted Superposition Attraction | 2017 | Yes |
| 27 | 'sbo' | Satin Bower Bird Optimization | 2017 | Yes |
| 26 | 'ja' | Jaya Algorithm | 2016 | No |
| 25 | 'csa' | Crow Search Algorithm | 2016 | Yes |
| 24 | 'sca' | Sine Cosine Algorithm | 2016 | Yes |
| 23 | 'woa' | Whale Optimization Algorithm | 2016 | Yes |
| 22 | 'alo' | Ant Lion Optimizer | 2015 | No |
| 21 | 'hlo' | Human Learning Optimization | 2015 | Yes |
| 20 | 'mbo' | Monarch Butterfly Optimization | 2015 | Yes |
| 19 | 'mfo' | Moth Flame Optimization | 2015 | Yes |
| 18 | 'mvo' | Multiverse Optimizer | 2015 | Yes |
| 17 | 'tsa' | Tree Seed Algorithm | 2015 | Yes |
| 16 | 'gwo' | Grey Wolf Optimizer | 2014 | No |
| 15 | 'sos' | Symbiotic Organisms Search | 2014 | No |
| 14 | 'fpa' | Flower Pollination Algorithm | 2012 | Yes |
| 13 | 'foa' | Fruitfly Optimization Algorithm | 2012 | No |
| 12 | 'ba' | Bat Algorithm | 2010 | Yes |
| 11 | 'fa' | Firefly Algorithm | 2010 | Yes |
| 10 | 'cs' | Cuckoo Search Algorithm | 2009 | Yes |
| 09 | 'gsa' | Gravitational Search Algorithm | 2009 | Yes |
| 08 | 'abc' | Artificial Bee Colony | 2007 | Yes |
| 07 | 'hs' | Harmony Search | - | Yes |
| 06 | 'de' | Differential Evolution | 1997 | Yes |
| 05 | 'aco' | Ant Colony Optimization | - | Yes |
| 04 | 'acs' | Ant Colony System | - | Yes |
| 03 | 'pso' | Particle Swarm Optimization | 1995 | Yes |
| 02 | 'ga' / 'gat' | Genetic Algorithm | - | Yes |
| 01 | 'sa' | Simulated Annealing | - | Yes |
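
Because every method is reached through the same jfs call, several abbreviations from the list can be tried in a loop. The sketch below is one possible way to do this. It assumes the toolbox folder ( with jfs, jknn and ionosphere.mat ) is on the MATLAB path, that ionosphere.mat provides feat and label as in the examples, and that methods without explicitly set extra parameters fall back to their defaults. The variable methods_to_try is introduced only for this illustration.

```matlab
% Abbreviations to compare (taken from the list of available methods)
methods_to_try = {'pso', 'sma', 'woa'};

% Assumes the toolbox (jfs, jknn, ionosphere.mat) is on the MATLAB path
if exist('jfs', 'file') == 2
    % Common parameter settings
    opts.k = 5;
    opts.N = 10;
    opts.T = 100;

    % Load dataset (provides feat and label) and create the hold-out split
    load ionosphere.mat;
    opts.Model = cvpartition(label, 'HoldOut', 0.2);

    for i = 1:numel(methods_to_try)
        % Feature selection followed by validation with the selected features
        FS  = jfs(methods_to_try{i}, feat, label, opts);
        Acc = jknn(feat(:, FS.sf), label, opts);
        fprintf('%-4s : %2d features, accuracy = %.2f\n', ...
            methods_to_try{i}, FS.nf, Acc);
    end
else
    disp('jfs was not found; add the toolbox folder to the MATLAB path first.');
end
```

Since these methods are stochastic, results from a single run can vary; repeating each method several times gives a fairer comparison.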