Robust Clustering of Banks in Argentina

Authors

  • José M. Vargas, Universidad Nacional de Córdoba, Facultad de Ciencias Económicas (Córdoba, Argentina)
  • Margarita Díaz, Universidad Nacional de Córdoba, Facultad de Ciencias Económicas (Córdoba, Argentina)
  • Fernando García, Universidad Nacional de Córdoba, Facultad de Ciencias Económicas (Córdoba, Argentina)

DOI:

https://doi.org/10.55444/2451.7321.2018.v56.n1.29385

Keywords:

robust clustering, projection pursuit, common principal components, robust K-means, influence measures, theory of the firm

Abstract

The purpose of this paper is to classify and characterize 64 banks active in Argentina as of 2010, applying robust techniques to information gathered over the period 2001–2010. Seven variables were selected following the strategy criteria established in Wang (2007) and Werbin (2010). In agreement with bank theory, four “natural” clusters were obtained, named “Personal”, “Commercial”, “Typical” and “Other banks”. To understand this grouping, a projection-pursuit-based robust principal component analysis was conducted on the whole data set, showing that essentially three variables account for the formation of the different clusters. To reveal the inner structure of each group, we used the R package mclust to fit a finite Gaussian mixture to the data. The fitted components showed an approximately similar structure, which warranted a common principal components analysis as in (Boente and Rodrigues, 2002). This analysis allowed us to identify three variables that suffice for grouping and characterizing each cluster. Boente’s influence measures were used to detect extreme cases in the common principal components analysis.
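
As an illustration of what a robust K-means step can look like, the following is a minimal sketch of trimmed K-means, in which a fixed fraction of the points farthest from their nearest center is left out of each center update. The abstract does not state which robust K-means variant the authors used, so this is a generic example and not their procedure. The function name trimmed_kmeans, its parameters, the starting centers and the toy data are all assumptions made for illustration; a practical run would also try several starting configurations.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def trimmed_kmeans(points, init_centers, trim_fraction=0.1, max_iter=100):
    """Hypothetical helper: K-means that ignores the farthest points.

    Returns (centers, labels); trimmed points receive the label -1.
    """
    n = len(points)
    n_keep = n - int(trim_fraction * n)  # number of points kept per iteration
    centers = [tuple(c) for c in init_centers]
    labels = None
    for _ in range(max_iter):
        # Nearest center index and squared distance for every point.
        nearest = []
        for p in points:
            dists = [sq_dist(p, c) for c in centers]
            j = min(range(len(centers)), key=dists.__getitem__)
            nearest.append((j, dists[j]))
        # Keep the n_keep points closest to their center; trim the rest.
        order = sorted(range(n), key=lambda i: nearest[i][1])
        new_labels = [-1] * n
        for i in order[:n_keep]:
            new_labels[i] = nearest[i][0]
        if new_labels == labels:  # assignments unchanged: converged
            break
        labels = new_labels
        # Recompute each center as the mean of its kept members.
        for j in range(len(centers)):
            members = [points[i] for i in range(n) if labels[i] == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers, labels


# Toy data (invented for this sketch): two tight groups and one extreme case.
group_a = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.3), (0.1, -0.2), (-0.2, -0.1)]
group_b = [(10.0, 10.0), (10.2, 9.9), (9.8, 10.1), (10.1, 10.3), (9.9, 9.8)]
points = group_a + group_b + [(100.0, 100.0)]

centers, labels = trimmed_kmeans(
    points, init_centers=[points[0], points[5]], trim_fraction=0.1
)
print(labels)
```

In this toy run the isolated point at (100, 100) is trimmed (label -1), so it does not pull either center away from the two groups, which is the behavior that motivates trimming in the presence of extreme cases.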

References

Boente, G., and L. Orellana (2001). “A robust approach to common principal components”. Statistics in Genetics and in the Environmental Sciences. Birkhäuser, Basel, pp. 117–147.

Boente, G., A. M. Pires, and I. M. Rodrigues (2002). “Influence functions and outlier detection under the common principal components model: A robust approach”. Biometrika 89.4, pp. 861–875.

Boente, G., A. M. Pires, and I. M. Rodrigues (2010). “Detecting influential observations in principal components and common principal components”. Computational Statistics and Data Analysis 54, pp. 2967–2975.

Chen, Edwin (2010). R Implementation of gap-statistics. [Retrieved from https://github.com/echen/gap-statistic/blob/master/gap-statistic.R].

Croux, C. personal website. [Retrieved from http://www.econ.kuleuven.be/public/NDBAE06/programs/#pca].

Croux, C., P. Filzmoser, and M. R. Oliveira (2005). “Algorithms for projection-pursuit robust principal component analysis”. Department of Decision Sciences and Information Management (KBI) KBI 0624.

Croux, C., and A. Ruiz-Gazen (1996). “A fast algorithm for robust principal components based on projection pursuit”. Compstat: Proceedings in Computational Statistics. Ed. by A. Prat. Physica-Verlag, Heidelberg, pp. 211–217.

Croux, C., and A. Ruiz-Gazen (2005). “High breakdown estimators for principal components: The projection-pursuit approach revisited”. Journal of Multivariate Analysis 95, pp. 206–226.

Ding, Chris, and Xiaofeng He (2004). “K-means clustering via principal component analysis”. Proceedings of the 21st International Conference on Machine Learning. Banff, Canada.

Ercan, H., and S. Sayaseng (2016). “The cluster analysis of the banking sector in Europe”. Economics and Management of Global Value Chains, pp. 111–127.

Farnè, M., and A. Vouldis (2017). “Business models of the banks in the euro area”. Working Paper Series European Central Bank 2070.

Filzmoser, P. et al. (2012). Package “pcaPP”. Peter Filzmoser, Heinrich Fritz and Klaudius Kalcher. [Retrieved from http://www.statistik.tuwien.ac.at/public/filz/].

Flury, B. K. (1984). “Common principal components in K groups”. Journal of the American Statistical Association 79, pp. 892–898.

Flury, B. K. (1988). Common principal components and related multivariate models, Wiley, New York.

Fraley, C., and A. Raftery (2007). “Model-based methods of classification: Using mclust software in chemometrics”. Journal of Statistical Software 18.6. [Retrieved from: http://www.jstatsoft.org/].

Fraley, C., A. Raftery, and L. Scrucca. R package mclust. Maintainer: Luca Scrucca (luca@stat.unipg.it).

Gordaliza, A. (1991a). “Best approximations to random variables based on trimming procedures”. Journal of Approximation Theory 64.2, pp. 162–180.

Gordaliza, A. (1991b). “On the breakdown point of multivariate location estimators based on trimming procedures”. Statistics & Probability Letters 11.5, pp. 387–394.

Hartigan, J. A. (1975). Clustering Algorithms. John Wiley & Sons, Inc.

Hennig, C. R package fpc. c.hennig@ucl.ac.uk. [Retrieved from http://www.homepages.ucl.ac.uk/~ucakche/].

Kassani, S. H., P. H. Kassani, and S. E. Najafi (2015). “Introducing a hybrid model of DEA and data mining in evaluating efficiency. Case study: Bank branches”. Academic Journal of Research in Economics and Management 3.2, pp. 72–80.

Kondo, Y. (2011). “Robustification of the sparse K-means clustering algorithm”. University of British Columbia.

Kondo, Yumi, Matias Salibian-Barrera, and Ruben Zamar (2016). “RSKC: An R package for a robust and sparse K-means clustering algorithm”. Journal of Statistical Software 72.5, pp. 1–26. [Retrieved from https://doi.org/10.18637/jss.v072.i05].

Li, G., and Z. Chen (1985). “Projection-pursuit approach to robust dispersion matrices and principal components: Primary theory and Monte Carlo”. Journal of the American Statistical Association 80.391, pp. 759–766.

Lloyd, S. P. (1982). “Least squares quantization in PCM”. IEEE Transactions on Information Theory 28.2, pp. 129–136.

Sørensen, C. K., and J. M. Puigvert Gutiérrez (2006). “Euro area banking sector integration using hierarchical cluster analysis techniques”. Working Paper Series European Central Bank 627.

Tibshirani, R., G. Walther, and T. Hastie (2001). “Estimating the number of clusters in a data set via the gap statistic”. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 63.2, pp. 411–423.

Wang, D. (2007). “Three Essays on Bank Technology, Cost Structure, and Performance”. PhD dissertation. State University of New York at Binghamton.

Werbin, Eliana (2010). “Los determinantes de la rentabilidad de los bancos en Argentina (2005 – 2007)”. PhD thesis, Universidad Nacional de Córdoba.

Witten, D. M., and R. Tibshirani (2010). “A framework for feature selection in clustering”. Journal of the American Statistical Association 105.490, pp. 713–726.

Published

2018-12-01

How to Cite

Vargas, J. M., Díaz, M., & García, F. (2018). Robust Clustering of Banks in Argentina. Revista de Economía y Estadística, 56(1), 21–41. https://doi.org/10.55444/2451.7321.2018.v56.n1.29385

Section

Articles