[run_structures,maxgen,nruns] = ReadStructures(namefile,n)

ReadStructures: Reads all structures stored in the file 'namefile'. These are the structures of the probabilistic models learned in nruns runs of an EDA with a maximum of maxgen generations.

INPUTS
namefile: File that contains the structures.
          The file contains two columns of numerical values, where a row of negative values -g -r means that the preceding positive rows were the edges learned at generation g of run r.
          (In the file, vertices are numbered from 0 to n-1.)
          EXAMPLE:
               2  0
               8  0
              14  3
              -1 -1
              13  8
               0 10
               3  5
               9 10
              -1 -2
          The structure learned at run 1, generation 1 contains edges (3,1), (9,1) and (15,4). The structure learned at run 2, generation 1 contains edges (14,9), (1,11), (4,6) and (10,11).
n: Number of variables

OUTPUTS
run_structures{1} = indexmatrix(n,n): Associates an index to each possible edge in the network, e.g. indexmatrix(1,2) = 1. Number of edges m = n*(n+1)/2.
run_structures{2} = AllBigMatrices{nruns}(m,maxgen): For each run, records whether edge i appeared in generation j.
run_structures{3} = AllSumMatrices(m,maxgen): = \sum_{i=1}^{nruns} AllBigMatrices{i}, i.e. the number of runs in which edge i appeared in generation j.
run_structures{4} = AllContactMatrix{maxgen}(n,n): The number of runs in which edge i,j was present in generation k.
run_structures{5} = SumAllContactMatrix(n,n): = \sum_{k=1}^{maxgen} AllContactMatrix{k}, i.e. the total number of times edge i,j was present in all the structures learned in all generations of all runs.
maxgen: Maximum number of generations
nruns: Number of runs of the algorithm

EXAMPLE
[run_structures,maxgen,nruns] = ReadStructures('ProteinStructsExR.txt',20)

Last version 8/26/2008. Roberto Santana (roberto.santana@ehu.es)
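The file layout described above can be tried without touching the disk. The following is a minimal sketch, not the toolbox code: it splits the example rows into edge blocks at each negative marker row and shifts the vertices to 1-based numbering. The variable names `blocks`, `markers` and `current` are hypothetical. How the two marker columns map to run and generation is left to ReadStructures itself; the sketch only records them.

```matlab
% Example rows from the description (vertices numbered from 0).
AuxFile = [ 2  0;  8  0; 14  3; -1 -1; ...
           13  8;  0 10;  3  5;  9 10; -1 -2];

blocks  = {};   % one edge list per marker row (hypothetical name)
markers = [];   % marker rows with the sign flipped (hypothetical name)
current = [];   % edges collected since the previous marker

for i = 1:size(AuxFile,1)
    if AuxFile(i,1) < 0
        % A negative row closes the block of edges read so far.
        blocks{end+1}    = current + 1;    % shift to 1-based vertex numbers
        markers(end+1,:) = -AuxFile(i,:);
        current = [];
    else
        current(end+1,:) = AuxFile(i,:);   % ordinary edge row
    end
end

disp(blocks{1});   % edges of the first block, 1-based
disp(blocks{2});   % edges of the second block, 1-based
```

The two blocks reproduce the edge lists quoted in the description: (3,1), (9,1), (15,4) for the first and (14,9), (1,11), (4,6), (10,11) for the second.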
function [run_structures,maxgen,nruns] = ReadStructures(namefile,n)
% [run_structures,maxgen,nruns] = ReadStructures(namefile,n)
% ReadStructures: Reads all structures stored in the file 'namefile' which correspond to
%                 structures of the probabilistic models learned in nruns runs of an EDA with
%                 a maximum of maxgen generations.
% INPUTS
% namefile: file that contains the structures
%           The file contains two columns of numerical values
%           where negative values -g -r mean that the preceding positive
%           values were the edges learned at generation g of run r.
%           (In the file, vertices are numbered from 0 to n-1)
%           EXAMPLE:
%                     2  0
%                     8  0
%                    14  3
%                    -1 -1
%                    13  8
%                     0 10
%                     3  5
%                     9 10
%                    -1 -2
%           The structure learned at run 1, generation 1 contains edges
%           (3,1), (9,1) and (15,4). The structure learned at run 2,
%           generation 1 contains edges (14,9), (1,11), (4,6) and (10,11).
% n: Number of variables
% OUTPUTS
% run_structures{1} = indexmatrix(n,n): Associates an index to each possible edge in the network.
%                     e.g. indexmatrix(1,2) = 1, number of edges m = n*(n+1)/2;
% run_structures{2} = AllBigMatrices{nruns}(m,maxgen): For each run contains whether the edge i appeared in generation j
% run_structures{3} = AllSumMatrices(m,maxgen): = \sum_{i=1}^{nruns} AllBigMatrices{i},
%                     i.e. the number of runs in which each edge i appeared in generation j
% run_structures{4} = AllContactMatrix{maxgen}(n,n): The number of runs in which edge i,j
%                     was present in generation k.
% run_structures{5} = SumAllContactMatrix(n,n): = \sum_{k=1}^{maxgen} AllContactMatrix{k},
%                     i.e. total number of times edge i,j was present in all the structures
%                     learned in all generations of all runs.
% maxgen: maximum number of generations
% nruns: number of runs of the algorithm
% EXAMPLE
% [run_structures,maxgen,nruns] = ReadStructures('ProteinStructsExR.txt',20)
%
% Last version 8/26/2008. Roberto Santana (roberto.santana@ehu.es)

[m,indexmatrix] = Find_indexmatrix(n);

AuxFile = load(namefile);        % The file is read
Cycle = size(AuxFile,1);         % Its length is calculated

% Marker rows are negative, so the minimum of each column gives the largest index
nruns = -1*min(AuxFile(:,1));
maxgen = -1*min(AuxFile(:,2));

AllStruct = cell(nruns,maxgen);  % One edge list per (run, generation)
AuxStruct = [];                  % Edges read since the last marker row
a = 1;

for i=1:Cycle
  if(AuxFile(i,1)<0)
    % Marker row: store the edges collected so far and start a new block
    run = -1*AuxFile(i,1);
    gen = -1*AuxFile(i,2);
    AllStruct{run,gen} = AuxStruct;
    a = 1;
    AuxStruct = [];
  else
    AuxStruct(a,:) = AuxFile(i,:);
    a = a+1;
  end
end

% The matrices containing the occurrence of edges at each generation are constructed
AllSumMatrices = zeros(m,maxgen);

for j=1:maxgen
  AllContactMatrix{j} = zeros(n,n);
end

for i=1:nruns
  BigMatrix = zeros(m,maxgen);
  for j=1:size(AllStruct,2)
    if ~isempty(AllStruct{i,j})
      edges = AllStruct{i,j};
      for k=1:size(edges,1)
        % Vertices are 0-based in the file, hence the +1; counts are kept symmetric
        AllContactMatrix{j}(edges(k,1)+1,edges(k,2)+1) = AllContactMatrix{j}(edges(k,1)+1,edges(k,2)+1) + 1;
        AllContactMatrix{j}(edges(k,2)+1,edges(k,1)+1) = AllContactMatrix{j}(edges(k,2)+1,edges(k,1)+1) + 1;
        edgeindex = indexmatrix(edges(k,1)+1,edges(k,2)+1);
        BigMatrix(edgeindex,j) = 1;
      end
    end
  end
  AllBigMatrices{i} = BigMatrix;
  AllSumMatrices = AllSumMatrices + BigMatrix;
end

SumAllContactMatrix = zeros(n,n);
for j=1:maxgen
  SumAllContactMatrix = SumAllContactMatrix + AllContactMatrix{j};
end

run_structures{1} = indexmatrix;
run_structures{2} = AllBigMatrices;
run_structures{3} = AllSumMatrices;
run_structures{4} = AllContactMatrix;
run_structures{5} = SumAllContactMatrix;
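The symmetric counting that fills AllContactMatrix in the listing above can be checked in isolation. This is a small sketch under stated assumptions: it takes the three edges of the first block from the header example, picks n = 15 only so that the largest vertex fits, and uses the hypothetical names `C`, `r` and `c`. It does not call Find_indexmatrix or any other toolbox function.

```matlab
n = 15;                        % assumed size, just large enough for vertex 14
edges = [2 0; 8 0; 14 3];      % 0-based edges of one structure (header example)

C = zeros(n,n);                % contact counts for a single generation
for k = 1:size(edges,1)
    r = edges(k,1) + 1;        % shift to 1-based row index
    c = edges(k,2) + 1;        % shift to 1-based column index
    C(r,c) = C(r,c) + 1;       % count the edge in both directions
    C(c,r) = C(c,r) + 1;       % so that the matrix stays symmetric
end

disp(C(3,1));                  % count for edge (3,1)
disp(isequal(C, C'));          % symmetry check
```

Because each edge is entered twice, the sum of all entries of C is twice the number of edges, which is a quick sanity check on matrices returned in run_structures{4} and run_structures{5}.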