Dissertation credit risk

Published in August 2017

In the present environment of tighter regulation and control of financial risk, there is a revolution in risk management: it is becoming more quantitative in its approach to all risks. Two data sets include 10 variables. In recent years, bank loans have provided around 80% of funding to China’s domestic non-financial sector. At the same time, the private sector has limited access to bank funding. Csv data sets provided for Assignment 4. Before interest rate deregulation in 1995, interest rates on loans and deposits were set by the central bank and followed by all commercial banks. We will discuss this at the evaluation stage. In the previous evaluation stage, we showed that the decision tree performs better. Overall, for Task 1 you need to report on the output of each analysis in sub-task activities a to f and briefly comment on the important aspects of each analysis and its relevance to credit risk scoring in determining whether to approve a loan with an appropriate credit risk rating or to decline a loan application. However, the appropriate analytical process for achieving this objective is not so simply stated. The deregulation of interest rates is expected to enhance capital allocation efficiency. Low growth areas, meanwhile, tended to have high quotas that the banks could not meet through their deposits alone; therefore, they had to turn to the PBC for funds, leaving the central bank to redistribute funds from high deposit areas to low deposit areas. Are there any interesting relationships between the potential predictor variables and your target variable credit risk? Its reformulation as a true central bank was approved by China’s State Council in September 1983, and its former responsibility of lending to state-owned commercial and industrial enterprises was transferred to the newly formed ICBC in 1984. They were allowed to expand nationwide in 1993.
Open market operations (OMO) are conducted mainly by trading securities and foreign currencies with 40 designated commercial banks and other partners. It is often used to indicate the magnitude of the credit risk. According to the authors, the KMV approach of estimating expected default loss (EDL) follows three steps, based on option pricing theory. Now we have a decision tree that shows credit institutions which attributes matter most in predicting credit risk. At the end of the course, you will be able to understand and correctly use the basic tools of credit risk management, both from a theoretical and, most of all, a practical point of view. Continuing reform of the PBC is intended to transform it into a true central bank. Risk attribute should have its role set to ‘label’. If the borrower defaults, you will face losses in your portfolio. The loss may be complete or partial and can arise in a number of circumstances (Wikipedia. In this assignment, we will use only two models: a decision tree and a neural network. In terms of data understanding, all attributes have valid types and ranges. Chiu and Lewis (2006) observe that a strict quota system was implemented by the central government; the loan quota listed the number of firms in each province to receive bank loans, and SOEs had priority. The other is creditrisk_score. In the case that the credit score is less than or equal to 518, the next best predictor is the Debt_Income_Ratio attribute. It was recently reclassified as a “large state-owned bank” in April 2007, but remains at only approximately a quarter the size of the other SOCBs in terms of total assets. For example, interest rates are determined by the central bank instead of the market. Are there any variables with unusual patterns? Explore, Modify, Model, Assessment) and CRISP-DM (Cross Industry Standard Process for Data Mining) are the three major attempts to standardize the data mining process (Azevedo, 2008).
In the second step, the firm’s default point is calculated from the firm’s liabilities. How consistent are the characteristics of the creditrisk_train. As for DO NOT LEND, its maximum confidence is 0. You should refer to the data dictionary for creditrisk_train. The three attributes Loan_Amt, Liquid_Assets and Num_Credit_Lines did not contribute to the prediction. Risk to a nominal variable with two values (Good and Bad). In Figure 2-8, the CreditRisk Scoring data set is linked to the unlabelled data port (unl). More specifically, “CreditMetrics™ is a tool for assessing portfolio risk due to changes in debt value caused by changes in obligor credit quality”. “In the first step, the market value and volatility of the firm are estimated from the market value of its stock and the book value of its liabilities. In fact, “regions with low growth potential tended to be highly dependent on SOEs, and, as such, received relatively large amounts of loans”. Therefore, Chiu and Lewis (2006) find that private firms are less likely to receive funding from the banking system. For example, the class precision (or true positive rate) of pred. For example, even though the SOEs’ share of industrial output had declined to 28% in 1999 from 78% in 1978, they still received half of total investment funds in that year (Bajona and Chu, 2004, p. To show the results, the label predictions (lab) port and the decision tree model (mod) port are connected to res ports. Csv datasets? We will do this in a rigorous way, but also have fun: there is no need to be boring. (2001) shows that 91% of initial capital and 62% of established business financing for private firms came from internal sources in 1998. The yield spread between a credit-risky security and its corresponding default-free security is usually regarded as the compensation for the credit risk the risky asset assumes. You have two data sets.
Secondly, we need to check whether there are any missing values in the data sets. Unfortunately, lending money is a risky business: there is no 100% guarantee that you will get all your money back. Only the Very Low value has 100% confidence. Fortunately, there are no missing values (Figure 2-2), so we do not need to replace or impute missing values. Banks do not have the freedom to choose competitive strategies. A survey by Garnaut et al. 93% and 95. The PBC operated as a ‘monobank’ even after China’s economic reforms began in 1978. 622 (62. The Main Task of Credit Risk Analysis. The objective of credit risk analysis can be simply stated as: to forecast the ability and willingness of a borrower to meet its debt obligations when due. 2% confidence. Concentration of risk refers to additional portfolio risk resulting from increased exposure to one obligor or groups of correlated obligors”. In relation to loan expectations and prediction, Brice notes that several banks treat default risk as comprising two components: (1) expected loss, which involves an actuarial calculation of the anticipated average loss on loans within a particular asset class over a given time period; and (2) unexpected loss, which is the change (volatility) of those expected losses from one year to the next. However, we cannot ignore the other attributes: as their ranges change in the real world, they may become potential predictor variables. The confidence will change according to the criterion. The PBC’s open market operations have been the main tool for influencing the money supply and commercial bank liquidity since 1999. 8%) so that there is no prediction for DO NOT LEND. Csv, which is a training data set containing the previous history, the borrower’s financial information and the target variable (Credit. The formal loan quota system was removed in 1998.
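The split between expected and unexpected loss described above can be illustrated with a minimal sketch. It assumes that expected loss is estimated as the average yearly loss and unexpected loss as the standard deviation of yearly losses; the function name and the loss figures are hypothetical and are not taken from Brice or from the assignment data.

```python
from statistics import mean, pstdev


def expected_and_unexpected_loss(annual_losses):
    """Return (expected loss, unexpected loss) for a series of yearly losses.

    Assumption: expected loss is the average yearly loss, and unexpected
    loss is the year-to-year volatility (population standard deviation).
    """
    expected = mean(annual_losses)
    unexpected = pstdev(annual_losses)
    return expected, unexpected


# Hypothetical yearly loss rates (fraction of exposure) for one asset class.
losses = [0.010, 0.020, 0.030]
el, ul = expected_and_unexpected_loss(losses)
print(f"expected loss: {el:.4f}, unexpected loss: {ul:.4f}")
```

Under this reading, a portfolio with a stable loss rate has a low unexpected loss even when its expected loss is high.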
According to Brice, the use of these systems enters a new phase where they are widely considered a key factor of any credit risk management framework. Figure 2-2 and Figure 2-3 show the metadata for the two data sets, respectively. However, “the historical burden of prior bad loans plus ongoing protection of many SOEs continued to hamper full commercialization of the SOCBs. Under the credit plan, the big four SOCBs were forced to make loans to politically motivated projects, regardless of the creditworthiness of the borrowers. In high growth areas where deposits were high, the banks’ lending options were constrained and they typically kept large excess reserves with the People’s Bank. Confidence attributes have been created by RapidMiner, along with a prediction attribute. Firstly, we added the basic decision tree in the main process (Figure 2-4). P. Deposit rates were usually set relatively low, sometimes even below the inflation rate. How about consistency between creditrisk_train and creditrisk_score? Csv data set. The interesting thing is that the maximum value of confidence for Moderate is 0. When funds are distributed administratively, SOEs certainly have priority. For example, the Months_In_Job (months in current job) attribute has a proper range between 2 and 102 months, with an overall average of about 27 months. Moreover, Brice argues that many banks have started to apply modern portfolio theory (MPT) with its main thrust: that diversification can reduce the risk of a portfolio. A growing number of joint-stock commercial banks have been established subject to the central bank’s approval, with majority government ownership and minority private and foreign ownership. All values in the scoring data set are in the range of the training data set. The circles in the neural network graph are nodes, and the lines between nodes are called neurons. The thicker and darker the neuron is between nodes, the stronger the affinity between the nodes (Matthew, 2012).
Because we selected four attributes instead of all the predictor variables, we see a similar result. The largest such institution, the Bank of Communications, successfully went public on the Hong Kong Stock Exchange on June 23, 2005. Are there any missing values, or variables with unusual patterns? Regulation of interest rates has prevented capital from being allocated efficiently. Csv (see Table 1 below). Ideally, the curve will climb quickly toward the top-left, meaning the model correctly predicted the cases (Sayad, 2011). (Hint: identify the variables that will allow you to split the data set into subgroups). 41%. That’s why there is no prediction in the DO NOT LEND value. We need to consider data consolidation, cleaning and transformation to ensure that the data sets remain consistent. 93% accuracy. BOC and ABC financed only about one-fifth and one-fourth of their policy lending by deposits. Using Select Attributes in RapidMiner, the RowNo attribute has been excluded from the analysis (Figure 2-1). Most of the banks are almost entirely government-owned and few new banks have entered the market. As recently as 1998, these institutions were entirely government-owned. Policy lending quotas were set without any reference to the bank’s ability to meet the quotas. Moderate is 94. , no rigid assumptions), and their ability to generalize (Haykin, 2009). It began relaxing its control by setting the range of interest rates instead. 2%; CCB, 58%; and ICBC, 25%. One of the main reasons to choose a decision tree is its relative power, ease of use, robustness with a variety of data and levels of measurement, and ease of interpretability (Barry, 2006). It is a rather simple and useful method to indicate credit risk. That is why all values in Credit. In the case of the confidence for Bad from 0. RapidMiner calculates a 95. Imagine that you are a bank and a main part of your daily business is to lend money.
Considering the overall accuracy is 97. DO NOT LEND and High values are mapped to Bad credit risk, and Moderate, Low and Very Low to Good. Even though they have similar processes, CRISP-DM is the popular methodology in the field of data mining. The remaining industry assets are held by city commercial banks operating at the city level, urban and rural credit cooperatives, three policy banks, foreign banks, and non-bank financial institutions. The central bank now influences interest rates by setting the floor on loan interest rates and the ceiling on deposit interest rates. China’s central bank, the People’s Bank of China (PBC), historically controlled all lending and deposit-taking activities in China. Csv, which is the data set to be predicted. Chiu and Lewis (2006) estimate that SOEs still received 45% of SOCBs’ short-term loans and probably about 60% of SOCBs’ medium- to long-term loans for fixed assets in 2003. CreditMetrics™ “seeks to construct what cannot be directly observed: the volatility of value due to credit quality changes”. Therefore, the big four were forced to make loans regardless of the creditworthiness of the borrowers. The risk is primarily that of the lender and includes lost principal and interest, disruption to cash flows, and increased collection costs. RapidMiner is completely convinced that Applicant ID 88858 is going to be Very Low (100%), while applicant 628458 is going to be Low with 98. As the 2008 financial crisis has shown us, a correct understanding of credit risk and the ability to manage it are fundamental in today's world. Risk). The central government, thereby, was forced to commercialize its banks. Even though applicant 628458 has 1. Identify which (variables) attributes can be omitted from your credit risk data mining model and why. Loan and deposit interest rates are not determined by market supply and demand. 2%).
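The regrouping of the five credit risk classes into Good and Bad can be sketched as a simple lookup. The text performs this step with the Map operator in RapidMiner; the Python below is only an illustration of the same mapping, and the names `BAD_CLASSES`, `GOOD_CLASSES` and `to_binary_risk` are hypothetical.

```python
# Grouping stated in the text: DO NOT LEND and High are Bad;
# Moderate, Low and Very Low are Good.
BAD_CLASSES = {"DO NOT LEND", "High"}
GOOD_CLASSES = {"Moderate", "Low", "Very Low"}


def to_binary_risk(credit_risk):
    """Map a five-level credit risk label to 'Good' or 'Bad'."""
    if credit_risk in BAD_CLASSES:
        return "Bad"
    if credit_risk in GOOD_CLASSES:
        return "Good"
    # Anything else is treated as a data error rather than guessed.
    raise ValueError(f"unknown credit risk label: {credit_risk!r}")


labels = ["Very Low", "High", "Moderate", "DO NOT LEND", "Low"]
binary = [to_binary_risk(label) for label in labels]
print(binary)
```

Which values count as Good or Bad is a business decision, so the two sets would be adjusted rather than the function.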
Even though the prediction for Credit Risk is Moderate, confidence in the neural network is only 0. The AUC (Area Under the Curve) is almost 1, because the overall accuracy is 97. 2%), while that of the decision tree is 0. This will be a quite unconventional course. Interest rates on loans to SOEs and the private sector were set at the same level, below the market-clearing price. We do not need one of them, because they are duplicates. In Task 1 of this Assignment 4 you are required to follow the six step CRISP DM process and make use of the data mining tool RapidMiner to analyse and report on the creditrisk_train. Each of the big four focuses on different sectors in the national economy. In Figure 2-6, the prediction for the class ‘DO NOT LEND’ is 100%, because there are no other class frequencies in that leaf. Looking at the two lift charts, the decision tree is the clear choice for our prediction of credit risk. Banks therefore had access to large amounts of low-cost household deposits while being able to offer favorable rates to SOE borrowers. 8% at confidence Very Low, this applicant is predicted as the Low credit risk. Csv and creditrisk_score. Banks charge the same below-market interest rates on loans to SOEs and to the private sector, which creates excess demand for loans. Most predictive model operators expect the training stream to supply a ‘label’ attribute. However, Wastgaard and Wijst point out that the unexpected loss is the primary driver of the amount of economic capital required for credit risk. We can also see four predictor attributes: Credit_Score, Late_Payments, Months_In_Job and Debt_Income_Ratio (Figure 2-9). The interesting thing is that in the decision tree using the accuracy criterion only four variables influenced the prediction of the credit risk: Credit_Score, Debt_Income_Ratio, Late_Payments and Months_In_Job. The graph demonstrated that the decision tree is more accurate than the neural network.
And, in the third step, a mapping is determined between the distance to default and the default rate based on historical default experience of companies with different distance-to-default values”. In the case of the decision tree with the accuracy criterion, it is simpler to analyze. The label attribute has five values: Very Low, Low, Moderate, High and DO NOT LEND, which will be predicted in the scoring data set. Agricultural Bank of China (ABC) focuses on rural lending to agriculture and industry; Bank of China (BOC) handles foreign exchange and new policy initiatives; China Construction Bank (CCB) focuses on new investment and infrastructure; Industrial and Commercial Bank of China (ICBC) focuses on industrial lending. 174). To reduce a financial institution’s credit risk, the lender may perform a credit check on potential borrowers to determine whether to lend at an appropriate level of risk or to decline the loan application. A series of reforms that took place after 1994 were intended to commercialize the state-owned commercial banks and to prepare the banks for foreign competition. The central government used to incorporate the big four in its credit plan to finance the state-owned enterprises (SOEs) and provide financial support to poor economic regions. Since October 2004, the PBC no longer sets a ceiling on RMB loan interest rates. Org). Through the Select Attributes operator, the four predictor variables, the id variable and the target variable are selected. According to the Basel Accord, a global regulation framework for financial institutions, credit risk is one of the three fundamental risks a bank or any other regulated financial institution has to face when operating in the markets (the two other risks being market risk and operational risk). To use lift charts and ROC curves for evaluating models, we need to convert the target variable credit. 54%, leaving us with a 5.
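The distance-to-default measure used in the KMV steps (the number of standard deviations between the firm value and the default point) can be sketched as follows. This is a simplified form commonly used for illustration, not necessarily the exact formula the cited authors apply; the function name and the input figures are hypothetical.

```python
def distance_to_default(asset_value, default_point, asset_volatility):
    """Number of standard deviations between firm value and default point.

    Assumption: one standard deviation of firm value is approximated by
    asset_value * asset_volatility (volatility given as a fraction, e.g. 0.2).
    """
    if asset_value <= 0 or asset_volatility <= 0:
        raise ValueError("asset value and volatility must be positive")
    return (asset_value - default_point) / (asset_value * asset_volatility)


# Hypothetical firm: assets worth 100, default point at 60, 20% volatility.
dd = distance_to_default(100.0, 60.0, 0.2)
print(f"distance to default: {dd:.2f} standard deviations")
```

In the third step, a value like this would be looked up against historical default rates for firms with a similar distance to default; that mapping needs empirical data and is not reproduced here.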
The next step is to apply the decision tree model to the scoring data. 41% for the decision tree and neural network, respectively. Before applying models such as the decision tree and neural network in this assignment, as a target variable, Credit. The non-financial sector in China is financed mainly through bank loans. When the confidence goes down to 0. When it comes to artificial neural networks, they have been shown to be very promising computational systems in many forecasting and business classification applications due to their ability to learn from the data, their nonparametric nature (i. The qualitative analysis is represented by the credit ratings assigned by the rating agencies, such as Moody’s, Standard & Poor’s and many others. B) Conduct an exploratory analysis of the creditrisk_train. As we can see, Credit_Score is the best predictor of which Credit_Risk class a borrower belongs to. Gradually, the PBC has introduced greater flexibility in interest rates. Csv might influence differences in credit scores and credit risk ratings and possible approval or rejection of loan applications? In particular, the accuracy criterion has an overall 97. As Figure 2-16 shows, the accuracy rate for this model is 95. These are typical situations in which credit risk manifests itself. In Task 1 and 2 of Assignment 4 you are required to consider all of the business understanding, data understanding, data preparation, modelling, evaluation and deployment phases of the CRISP DM process. Wastgaard and Wijst (2002) describe the theoretical background behind the methodology used to evaluate the credit risk that a bank faces. If Debt_Income_Ratio is greater than about 10%, the borrower’s credit risk is predicted to belong to the ‘DO NOT LEND’ class. The credit plan prevented the allocation of credit from being determined by market forces.
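The low-score branch of the decision tree described in the text can be written out as a rule. This is a minimal sketch that covers only the branch the text describes (credit score at or below 518, then a split on Debt_Income_Ratio at about 10%). It assumes the ratio is stored in percent; the function name is hypothetical, and the other branches of the tree are not reproduced.

```python
def low_score_branch(credit_score, debt_income_ratio,
                     score_split=518, ratio_split=10.0):
    """Predict the class for applicants in the low credit score branch.

    Returns None when the applicant falls outside this branch, because
    the rest of the tree is not described in enough detail to reproduce.
    """
    if credit_score > score_split:
        return None
    # Within the low-score branch, Debt_Income_Ratio is the next predictor.
    if debt_income_ratio > ratio_split:
        return "DO NOT LEND"
    return "High"


print(low_score_branch(500, 15.0))  # high ratio in the low-score branch
print(low_score_branch(500, 5.0))   # low ratio in the low-score branch
```

The leaves are not pure: the text reports 163 'High' records and one 'DO NOT LEND' record in the lower-ratio leaf, so a rule like this gives the majority class, not a certainty.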
“All three quickly ranked amongst the world’s top ten commercial banks in terms of market value and, as of July 23, 2007, ICBC’s rising share price made it the biggest lender in the world by market capitalization” (Ren, 2007). In the previous assignment, we discussed CRISP-DM through the analysis of the survival rate of passengers on the Titanic. Shih (2004), for example, shows that although the province of Liaoning had a high concentration of SOEs it did not receive the lending rates enjoyed by the neighboring province of Jilin, a province whose leaders appear to have enjoyed much closer ties to the central government elite than did Liaoning’s. This will provide you with a business understanding of the dataset you will be analysing in Assignment 4. Despite facing increasing domestic and foreign competition, China’s four largest state-owned commercial banks (SOCBs) remain dominant players in China’s banking industry. According to Brice, these banks develop portfolio measurement and management systems that monitor a range of variables including geographic exposure, industry concentrations, product clustering, a secured versus unsecured mix, and risk rating. Credit Risk in the Banking Industry in China. The banking sector has always played a key role in the national economy. When confidence for Bad is 1 (100%), 209 out of 209 are predicted accurately. Decision trees are a simple but powerful form of multiple variable analysis. Firstly, let’s consider the neural network. While we were preparing the data, we decided to use only four predictor variables. However, we need to keep in mind that the deployment phase can be as simple as generating a report or as complex as implementing a repeatable data mining process.
Although the training data predicts that if Debt_Income_Ratio is less than 10%, the borrowers belong to the ‘High’ credit risk class, the model is not 100% confident in that prediction, because there are 163 ‘High’ frequencies and one ‘DO NOT LEND’ frequency (Figure 2-6). The ROC chart shows the false positive rate (1-specificity) on the X-axis, the probability of predicting target=1 when its true value is 0, against the true positive rate (sensitivity) on the Y-axis, the probability of predicting target=1 when its true value is 1. Policy Loans. The central government originally incorporated the four major banks into its credit plan to finance its state-owned enterprises (SOEs). According to the document, “the primary reason to have a quantitative portfolio approach to credit risk management is so that one can systematically address concentration risk. More recently, the PBC’s major roles involve determining the required reserve ratio, the base interest rates for deposits and loans, and the discount rate. However, it does not influence the overall true positive rate. SOE managers were not made accountable for non-repayment of loans in the past and, even if an SOE had previously defaulted on loans, the banks still lacked the authority to independently cut off new lending to that SOE”. In terms of applicant 863682, there is a great difference. Using these two values, plus the firm’s volatility, a measure is constructed that represents the number of standard deviations from expected firm value to the default point. Under each year’s credit plan, the SOCBs received loan quotas for every region. In the same way, we can predict other credit risks by following the nodes and leaves of the decision tree. This essentially forced the banks to make new loans to cover defaulted interest payments, reporting phantom interest profits in the process. For example, the risk rating system is used as a basis for such activities as loan review, loan pricing, and loan provisioning.
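The two rates plotted on the ROC chart can be computed for a single confidence threshold as in the sketch below. It assumes a binary target where 1 stands for the class of interest (for example Bad) and a list of model confidences for that class; the function name and the sample data are hypothetical and are not output from RapidMiner.

```python
def roc_point(y_true, scores, threshold):
    """Return (false positive rate, true positive rate) at one threshold."""
    tp = fp = tn = fn = 0
    for truth, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and truth == 1:
            tp += 1
        elif predicted == 1 and truth == 0:
            fp += 1
        elif predicted == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # sensitivity
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # 1 - specificity
    return fpr, tpr


# Hypothetical true labels and confidences for four applicants.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
fpr, tpr = roc_point(y_true, scores, threshold=0.5)
print(fpr, tpr)
```

Sweeping the threshold from 1 down to 0 and plotting each (fpr, tpr) pair traces the ROC curve; lowering the threshold raises both rates.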
Which values belong to Good or Bad depends on the decision of the real business. Thus, we can be sure that the four predictor variables are enough to predict the credit risk. 972, which means there will be false positive predictions. Mo (1999) estimates that CCB borrowed about one-half of its policy lending from the central bank, the other half coming from deposits. As a result, we do not need any data cleansing. In 1997, J. The input nodes represent each predictor attribute, while the output nodes represent each target value: Moderate, High, Low, DO NOT LEND and Very Low. Moreover, it reduced the predictor variables from seven to four. Brice (1992) surveys the international use of risk rating systems in the context of management of credit risk. The commercial banking industry in China has been relatively stable. This step assesses the degree to which the selected model meets the business objectives and, if so, to what extent (Efraim et al, p. 972 (97. When we get to the Evaluation process, we will discuss how this uncertainty translates into confidence percentages and how to validate these confidences. E. Like the decision tree, we can see similar metadata for the scoring data set predictions. The next step is to add predictive model operators to the training data set. 87 (87%) to 1 (100%), RapidMiner predicts Bad credit risk with 100%. Moreover, he discusses how capital requirements are set so that they reflect the credit risk in off-balance sheet items. However, the predictions and confidences differ slightly. Many foreign financial firms are therefore poised to open services in China because of its huge market potential. Other criteria will be applied at the evaluation stage. Three of the four, BOC, CCB, and ICBC, had initial public offerings in Hong Kong.
One of the nice side-effects of setting an attribute’s role to ‘id’ rather than removing it using Select Attributes is that it makes each record easier to match back to individual people later, when viewing predictions in the results perspective (Matthew, 2012). Luo (1993) estimated policy lending as a percent of total lending, and found the following: BOC, 67%; ABC, 51. In the evaluation process, we will validate decision trees. One is creditrisk_train. The authors discuss alternative methods to calculate the parameters in Credit Risk Capital (CRC) and Risk Adjusted Return on Capital (RAROC). 7). The approach uses a contingent claims pricing theory and is particularly appropriate for an off-balance contract that has either a positive or a negative value of contingency”. Surprisingly, the true positive rate of pred. Of course, if we change the accuracy criterion to another criterion such as gain_ratio, information_gain or gini_index, the three attributes are used for the prediction. Credit risk refers to the risk that a borrower will default on any type of debt by failing to make payments it is obligated to make. Risk attribute are missing. 01 (1%), the false positive rate is only about 16%. Therefore, loan interest rates were not determined by expected return on capital. In Figure 2-14, like the decision tree, 888858 and 628458 have similar confidences, in which they are going to be Very Low and Low, respectively. For each methodology, we will analyse its strengths as well as its weaknesses. DO NOT LEND is 0%. Firstly, in the data sets, there are two unique identifiers. For instance, in terms of the Liquid_Assets attribute, the range from 834 to 24297 in the scoring data set is a subset of that of the training data set, in which the range is between 830 and 24699 (Figure 2-2 and Figure 2-3). 46% false positive rate for this value. Lastly, as data transformation, the Application.
A) Research the concepts of credit risk and credit scoring in determining whether a financial institution should lend at an appropriate level of risk or not lend to a loan application. Comment on your findings in relation to determining the credit risk of loan applicants. Whenever business needs change or predictor variables are added or modified, we have to cycle through the CRISP-DM process again. )”. The four largest state-owned commercial banks (SOCBs) possess market power and account for over half of total industry assets. Morgan “issued a Technical Document that describes CreditMetrics™ as a framework for quantifying credit risk in portfolios of traditional credit products, fixed income instruments, and market-driven instruments subject to counterparty default (swaps, forwards, etc. We can see the graphical view of the neural network model. This overall accuracy rate reflects the class precision rates for each possible value in the Credit_Risk attribute. Hull (1989) presents “a general approach to valuing a financial institution’s contracts when there is credit risk. Figure 2-19 shows how to change the old values to the new values. In order to do this, the Map operator is used. Csv and creditrisk_score. Other political factors also affected commercial bank lending. And ultimately it cannot be described in its entirety because it is based to a large degree on judgment and because it pertains to the future, which by definition is uncertain. 498 (49. ID attribute has been used as an id, which is implemented by Set Role in RapidMiner. In RapidMiner, there are four criteria on which attributes will be selected for splitting: gain_ratio, information_gain, gini_index and accuracy. Only 7% of initial capital and 18% of business financing was contributed by loans from financial institutions. In this step, we will use the accuracy criterion.
Many more formal numerical measures for valuing credit risk have been replacing the qualitative tradition in measuring credit quality. Comment on what variables in the data set creditrisk_train. As part of the WTO accession commitments, China had to open its banking industry to foreign competition by 2007. The ROC chart is similar to the gain or lift charts in that it provides a means of comparison between classification models. In this assignment, we used Cross-Validation, which is a statistical method of evaluating and comparing learning algorithms by dividing data into two segments: one used to learn or train a model and the other used to validate the model (Payam, 2008). Figure 2-5 shows the preliminary tree using the accuracy criterion. In 2006, almost 87% of China’s non-financial sector’s funding was through bank loans. Or, in a somewhat less extreme scenario, if the credit quality of your counterparty deteriorates according to some rating system, the loan will become more risky. ICBC financed all of its policy lending by borrowing from the central bank. The data dictionary for the two data sets is shown in Table 1. 41% accuracy rate for this model. 93%, it is natural.
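The idea behind cross-validation, repeatedly splitting the records into a training segment and a validation segment, can be sketched in plain Python. This is an illustration of k-fold splitting in general, not of RapidMiner's Cross-Validation operator; the function name, the number of folds and the interleaved assignment of records to folds are assumptions.

```python
def k_fold_indices(n_records, k):
    """Yield (training indices, validation indices) for each of k folds.

    Records are assigned to folds in an interleaved way (0, k, 2k, ...),
    so every record is used for validation exactly once.
    """
    folds = [list(range(i, n_records, k)) for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i
                    for idx in fold]
        yield training, validation


# Hypothetical example: 10 records split into 5 folds.
splits = list(k_fold_indices(10, 5))
for training, validation in splits:
    print("train:", training, "validate:", validation)
```

A model would be trained on each training segment and scored on the matching validation segment, and the reported accuracy would be the average over the folds.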