[Cliques] = FindNeighborhood(MI,sizeconstraint,MIthreshold)
FindNeighborhood: Finds the neighborhood of each variable from the pairwise mutual information values.
Neighbors of a variable are the variables whose mutual information with it is above MIthreshold.
Each neighborhood is constrained to contain at most sizeconstraint variables; when more variables qualify, those with the highest mutual information are kept.
INPUTS:
MI: Square matrix (NumbVars x NumbVars) in which MI(i,j) is the mutual information between variables i and j
sizeconstraint: Maximum number of neighbors allowed for each variable
MIthreshold: Value that MI(i,j) must exceed for variable j to be a neighbor of variable i
OUTPUTS:
Cliques: Structure of the model as a list of cliques that defines the neighborhood.
Each row of Cliques is a clique. The first value is the number of neighbors of variable i.
The second is the number of new variables (always one: variable i).
Then the neighbor variables are listed, and finally the new variables (variable i).
Rows shorter than the longest one are padded with zeros on the right.
When Cliques is empty, the model is learned from the data.
Last version 8/26/2008. Roberto Santana and Siddarta Shakya (roberto.santana@ehu.es)
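The row layout described above can be unpacked as in the following minimal sketch. The Cliques matrix here is hand-written for illustration (it is not output taken from a real run), and the variable names nNeigh, nNew, neighbors and newvars are assumptions introduced for this example only.

```matlab
% Hypothetical Cliques matrix for 4 variables (zero-padded rows).
% Row format: [number of neighbors, number of new variables, neighbors..., new variable]
Cliques = [2 1 2 3 1;    % variable 1 has neighbors 2 and 3
           1 1 1 2 0;    % variable 2 has neighbor 1
           1 1 1 3 0;    % variable 3 has neighbor 1
           0 1 4 0 0];   % variable 4 has no neighbors

nRows     = size(Cliques,1);
neighbors = cell(nRows,1);
newvars   = cell(nRows,1);

for i = 1:nRows
    nNeigh = Cliques(i,1);                               % first value: number of neighbors
    nNew   = Cliques(i,2);                               % second value: number of new variables
    neighbors{i} = Cliques(i, 3:2+nNeigh);               % neighbor variables come next
    newvars{i}   = Cliques(i, 3+nNeigh:2+nNeigh+nNew);   % new variables are listed last
    fprintf('Variable %d: neighbors [%s]\n', newvars{i}, num2str(neighbors{i}));
end
```

Reading the two counts first is what allows the trailing zero padding to be ignored safely, since a zero is never a valid variable index.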
function [Cliques] = FindNeighborhood(MI,sizeconstraint,MIthreshold)
% [Cliques] = FindNeighborhood(MI,sizeconstraint,MIthreshold)
% FindNeighborhood: Finds the neighborhood of each variable from the pairwise
%                   mutual information values
%                   Neighbors are those variables whose mutual information is above MIthreshold
%                   Each variable neighborhood is constrained to be at most of size sizeconstraint
% INPUTS:
% MI: Square matrix (NumbVars x NumbVars); MI(i,j) is the mutual information between variables i and j
% sizeconstraint: Maximum number of neighbors allowed for each variable
% MIthreshold: Value that MI(i,j) must exceed for variable j to be a neighbor of variable i
% OUTPUTS
% Cliques: Structure of the model in a list of cliques that defines the neighborhood
%          Each row of Cliques is a clique. The first value is the number of neighbors of variable i.
%          The second is the number of new variables (one new variable, i).
%          Then, neighbor variables are listed and finally new variables (variable i) are listed
%          Rows shorter than the longest one are padded with zeros
%          When Cliques is empty, the model is learned from the data
%
% Last version 8/26/2008. Roberto Santana and Siddarta Shakya (roberto.santana@ehu.es)

NumbVars = size(MI,1);
Cliques = [];   % Defined up front so that an empty MI returns an empty Cliques

%for i=2:3:29,
%  Cliques(i-1,:) = [2,1,i,i+1,i-1];
%  Cliques(i,:) = [2,1,i-1,i+1,i];
%  Cliques(i+1,:) = [2,1,i-1,i,i+1];
%end,
%return

for i=1:NumbVars
  % Candidate neighbors of i (includes i itself if MI(i,i) > MIthreshold)
  candidates = find(MI(i,:)>MIthreshold);
  ncandidates = size(candidates,2);
  if(ncandidates > sizeconstraint)
    % Keep the sizeconstraint candidates with the highest mutual information.
    % The candidates are shuffled before the (stable) sort so that ties are
    % broken at random; additive noise of magnitude 1e-200 is lost to
    % floating-point rounding and would leave ties ordered by index.
    shuffled = candidates(randperm(ncandidates));
    [~,VarOrder] = sort(MI(i,shuffled),'descend');
    Cliques(i,1:sizeconstraint+3) = [sizeconstraint,1,shuffled(VarOrder(1:sizeconstraint)),i];
  elseif(ncandidates > 0)
    Cliques(i,1:ncandidates+3) = [ncandidates,1,candidates,i];
  else  % No neighbor for variable i
    Cliques(i,1:3) = [0,1,i];
  end
end
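
A small usage sketch may help to see the three branches of the loop at work. It assumes that FindNeighborhood.m is on the MATLAB path; the mutual information matrix is made up for illustration and has no tied values, so the result does not depend on the random tie-breaking.

```matlab
% Hypothetical symmetric mutual information matrix for 5 variables (illustrative values).
MI = [0    0.5  0.3  0.2  0;
      0.5  0    0.1  0    0;
      0.3  0.1  0    0    0;
      0.2  0    0    0    0;
      0    0    0    0    0];

sizeconstraint = 2;      % at most two neighbors per variable
MIthreshold    = 0.15;   % only interactions above this value count

Cliques = FindNeighborhood(MI, sizeconstraint, MIthreshold);
disp(Cliques)
% Expected rows:
%   2 1 2 3 1   variable 1: three candidates (2,3,4), the two strongest are kept
%   1 1 1 2 0   variable 2: single neighbor 1
%   1 1 1 3 0   variable 3: single neighbor 1
%   1 1 1 4 0   variable 4: single neighbor 1
%   0 1 5 0 0   variable 5: no neighbor
```

Note that the resulting neighborhoods are not necessarily symmetric: variable 1 is a neighbor of variable 4, but variable 4 is dropped from the neighborhood of variable 1 by the size constraint.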