In this work, we introduce the Perceptron Average neural network fusion strategy and implement a number of other fusion strategies to classify breast masses in mammograms as malignant or benign with both balanced and imbalanced input features. We numerically compare various fixed and trained fusion rules, i.e., the Majority Vote, Simple Average, Weighted Average, and Perceptron Average, when applying them to a binary statistical pattern recognition problem. Judging from the experimental results, the Weighted Average approach outperforms the other fusion strategies with balanced input features, while the Perceptron Average is superior and achieves the goals with the lowest standard deviation with imbalanced ensembles. We concretely analyze the results of the above fusion strategies, state the advantages of fusing the component networks, and provide our particular broad-sense perspective on information fusion in neural networks.

}, keywords = {Biological organs, Breast cancers, Component neural networks (CNN), Image segmentation, Information fusions, Learning algorithms, Linear systems, Mammography, Mathematical models, Multilayer neural networks, Pattern recognition, Posterior probabilities, Tumors}, isbn = {0780383591}, doi = {10.1109/IJCNN.2004.1381010}, url = {http://www.scopus.com/inward/record.url?eid=2-s2.0-10844231826\&partnerID=40\&md5=2be794a5832413fed34152d61dd49388}, author = {Y Wu and J He and Y Man and J I Arribas} } @article {409, title = {Cost functions to estimate a posteriori probabilities in multiclass problems}, journal = {IEEE Transactions on Neural Networks}, volume = {10}, year = {1999}, pages = {645-656}, abstract = {The problem of designing cost functions to estimate a posteriori probabilities in multiclass problems is addressed in this paper. We establish necessary and sufficient conditions that these costs must satisfy in one-class one-output networks whose outputs are consistent with probability laws. We focus our attention on a particular subset of the corresponding cost functions; those which verify two usually interesting properties: symmetry and separability (well-known cost functions, such as the quadratic cost or the cross entropy are particular cases in this subset). Finally, we present a universal stochastic gradient learning rule for single-layer networks, in the sense of minimizing a general version of these cost functions for a wide family of nonlinear activation functions.

}, keywords = {Cost functions, Estimation, Functions, Learning algorithms, Multiclass problems, Neural networks, Pattern recognition, Probability, Problem solving, Random processes, Stochastic gradient learning rule}, issn = {10459227}, doi = {10.1109/72.761724}, url = {http://www.scopus.com/inward/record.url?eid=2-s2.0-0032643080\&partnerID=40\&md5=d528195bd6ec84531e59ddd2ececcd46}, author = {Jes{\'u}s Cid-Sueiro and J I Arribas and S Urban-Munoz and A R Figueiras-Vidal} }