Abstract

Financial risk data almost always suffer from class imbalance: the categories are asymmetric, with a major (majority) class and a minor (minority) class. If all of the data are fed into the training sample without adjustment, the resulting model may classify the major class accurately while its accuracy on the minor class is unacceptably low. Many risk assessment models have been developed, but most rely solely on sampling methods to handle class-imbalanced data, which may distort the information in the data. To address class imbalance in credit risk assessment effectively, this paper proposes a novel credit risk assessment model built with a granular computing technique, which provides better insight into the essence of the data. In addition, to compensate for a shortcoming of granular computing and improve the efficiency of the model, this paper adds a new index, “% of minor class (PM),” to prevent minor-class data from spreading into major-class granules. Finally, the paper compares methods for dealing with class-imbalanced data in terms of the area under the receiver operating characteristic curve (AUC) and the G-mean. The results show that the proposed granular computing credit assessment model achieves better results than the sampling-based models.
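
The accuracy problem described above can be illustrated with a minimal Python sketch. It is not the model from this paper: the data are made up, the function names are hypothetical, and `percent_minor` is only one plausible reading of the PM index (the share of minor-class samples inside a single granule), since the abstract does not give its formula. The G-mean is computed in its usual form, the square root of the product of the two per-class accuracies.

```python
from math import sqrt


def class_accuracies(y_true, y_pred, minor_label=1):
    """Return (major-class accuracy, minor-class accuracy)."""
    minor_total = sum(1 for t in y_true if t == minor_label)
    major_total = len(y_true) - minor_total
    minor_hits = sum(
        1 for t, p in zip(y_true, y_pred) if t == minor_label and p == t
    )
    major_hits = sum(
        1 for t, p in zip(y_true, y_pred) if t != minor_label and p == t
    )
    return major_hits / major_total, minor_hits / minor_total


def g_mean(y_true, y_pred, minor_label=1):
    """Geometric mean of the major-class and minor-class accuracies."""
    major_acc, minor_acc = class_accuracies(y_true, y_pred, minor_label)
    return sqrt(major_acc * minor_acc)


def percent_minor(granule_labels, minor_label=1):
    """Assumed reading of PM: percentage of minor-class samples in a granule."""
    minor_count = sum(1 for label in granule_labels if label == minor_label)
    return 100.0 * minor_count / len(granule_labels)


# Made-up data: 95 major-class (0) and 5 minor-class (1) samples,
# scored by a trivial classifier that always predicts the major class.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100

overall_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
major_acc, minor_acc = class_accuracies(y_true, y_pred)

print(overall_accuracy)        # high overall accuracy
print(major_acc, minor_acc)    # every minor-class sample is missed
print(g_mean(y_true, y_pred))  # the G-mean exposes the failure
print(percent_minor([0, 0, 0, 1]))
```

In this toy case the overall accuracy looks strong even though no minor-class sample is identified, which is why a measure such as the G-mean, which collapses to zero when either class is missed entirely, is more informative for imbalanced data.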
