
    Lending Club loan dataset for granting models

    <p>Lending Club offers peer-to-peer (P2P) loans through a technological platform for various personal finance purposes and is today one of the companies that dominate the US P2P lending market. The original dataset is publicly available on <a href="https://www.kaggle.com/datasets/wordsforthewise/lending-club">Kaggle</a> and corresponds to all the loans issued by Lending Club between 2007 and 2018. The present version of the dataset is intended for constructing a granting model, that is, a model designed to decide whether to grant a loan based on information available at the time of the loan application. Consequently, our dataset contains only a selection of variables from the original one, namely those known at the moment the loan request is made. Furthermore, the target variable of a granting model represents the final status of the loan, which is either "default" or "fully paid". Thus, we filtered out from the original dataset all the loans in transitory states. Our dataset comprises 1,347,681 records or obligations (approximately 60% of the original) and it was also cleaned for completeness and consistency (less than 1% of our dataset was filtered out).</p> <p><strong>TARGET VARIABLE</strong></p> <p>The dataset includes a target variable based on the final resolution of the credit: the default category corresponds to the event charged off and the non-default category to the event fully paid. It does not consider other values in the loan status variable since this variable represents the state of the loan at the end of the considered time window. Thus, there are no loans in transitory states. The original dataset includes the target variable “loan status”, which contains several categories ('Fully Paid', 'Current', 'Charged Off', 'In Grace Period', 'Late (31-120 days)', 'Late (16-30 days)', 'Default'). 
However, in our dataset, we just consider loans that are either “Fully Paid” or “Default” and transform this variable into a binary variable called “Default”, with a 0 for fully paid loans and a 1 for defaulted loans.</p> <p><strong>EXPLANATORY VARIABLES</strong></p> <p>The explanatory variables that we use correspond only to the information available at the time of the application. Variables such as the interest rate, grade, or subgrade are generated by the company as a result of a credit risk assessment process, so they were filtered out from the dataset as they must not be considered in risk models to predict the default in granting of credit.</p> <h1><strong>FULL LIST OF VARIABLES</strong></h1> <p><strong>Loan identification variables:</strong></p> <ul> <li> <p>id: Loan id (unique identifier). </p> </li> <li> <p>issue_d: Month and year in which the loan was approved.</p> </li> </ul> <p><strong>Quantitative variables:</strong></p> <ul> <li> <p>revenue: Borrower's self-declared annual income during registration. </p> </li> <li> <p>dti_n: Indebtedness ratio for obligations excluding mortgage. Monthly information. This ratio has been calculated considering the indebtedness of the whole group of applicants. It is estimated as the ratio calculated using the co-borrowers’ total payments on the total debt obligations divided by the co-borrowers’ combined monthly income.</p> </li> <li> <p>loan_amnt: Amount of credit requested by the borrower. </p> </li> <li> <p>fico_n: Defined between 300 and 850, reported by Fair Isaac Corporation as a risk measure based on historical credit information reported at the time of application. This value has been calculated as the average of the variables “fico_range_low” and “fico_range_high” in the original dataset.</p> </li> <li> <p>experience_c: Binary variable that indicates whether the borrower is new to the entity. 
This variable is constructed from the credit date of the previous obligation in LC and the credit date of the current obligation; if the difference between dates is positive, it is not considered as a new experience with LC.</p> </li> </ul> <p><strong>Categorical variables:</strong></p> <ul> <li> <p>emp_length: Categorical variable with the employment length of the borrower (includes a “no information” category). </p> </li> <li> <p>purpose:  Credit purpose category for the loan request. </p> </li> <li> <p>home_ownership_n: Homeownership status provided by the borrower in the registration process. Categories defined by LC: “mortgage”, “rent”, “own”, “other”, “any”, “none”.  We merged the categories “other”, “any” and “none” as “other”.</p> </li> <li> <p>addr_state: Borrower's state of residence in the USA. </p> </li> <li> <p>zip_code: Zip code of the borrower's residence.</p> </li> </ul> <p><strong>Textual variables</strong></p> <ul> <li> <p>title: Title of the credit request description provided by the borrower.</p> </li> <li> <p>desc: Description of the credit request provided by the borrower.</p> </li> </ul> <p>We cleaned the textual variables. First, we removed all those descriptions that contained the default description provided by Lending Club on its web form (“Tell your story. What is your loan for?”). Moreover, we removed the prefix “Borrower added on DD/MM/YYYY >” from the descriptions to avoid any temporal background on them. Finally, as these descriptions came from a web form, we replaced all HTML entities with their corresponding characters (e.g. “&amp;amp;” was replaced with “&amp;”, “&amp;lt;” was replaced with “&lt;”, etc.).</p> <h1><strong>RELATED WORKS</strong></h1> <p>This dataset has been used in the following academic articles:</p> <ul> <li>Sanz-Guerrero, M., Arroyo, J. (2024). Credit Risk Meets Large Language Models: Building a Risk Indicator from Loan Descriptions in P2P Lending. arXiv preprint arXiv:2401.16458. 
<a href="https://doi.org/10.48550/arXiv.2401.16458">https://doi.org/10.48550/arXiv.2401.16458</a></li> <li>Ariza-Garzón, M.J., Arroyo, J., Caparrini, A., Segovia-Vargas, M.J. (2020). Explainability of a machine learning granting scoring model in peer-to-peer lending. IEEE Access 8, 64873 - 64890. <a href="https://doi.org/10.1109/ACCESS.2020.2984412">https://doi.org/10.1109/ACCESS.2020.2984412</a></li> </ul>
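The preprocessing rules described above (binary target, averaged FICO score, merged home-ownership categories, cleaned descriptions) can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the authors' code: the function names are hypothetical, the mapping of the raw status 'Charged Off' to the default class follows the TARGET VARIABLE paragraph and is an assumption about the raw labels, and the prefix pattern assumes the date is written with two, two and four digits as in “DD/MM/YYYY”.

```python
import html
import re

# Assumption: only these two final statuses are kept; "Charged Off" is
# treated as the default event, as stated in the TARGET VARIABLE paragraph.
FINAL_STATUS_TO_DEFAULT = {"Fully Paid": 0, "Charged Off": 1}

# Placeholder text of the Lending Club web form, quoted in the description.
PLACEHOLDER = "Tell your story. What is your loan for?"

# Assumed shape of the prefix: "Borrower added on DD/MM/YYYY >".
PREFIX_RE = re.compile(r"Borrower added on \d{2}/\d{2}/\d{4} >\s*")


def default_flag(loan_status):
    """Return 0 or 1 for a final status, or None for a transitory one."""
    return FINAL_STATUS_TO_DEFAULT.get(loan_status)


def fico_n(fico_range_low, fico_range_high):
    """Average of the two FICO range bounds from the original dataset."""
    return (fico_range_low + fico_range_high) / 2


def home_ownership_n(raw):
    """Merge the categories "other", "any" and "none" into "other"."""
    value = raw.strip().lower()
    return "other" if value in {"other", "any", "none"} else value


def clean_desc(desc):
    """Clean one description; return None if it holds the form placeholder."""
    if PLACEHOLDER in desc:
        return None
    text = PREFIX_RE.sub("", desc)      # drop the dated prefix
    return html.unescape(text).strip()  # turn HTML entities into characters
```

Rows for which `default_flag` returns `None` correspond to loans in transitory states and would be dropped before modelling.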

    Start Up Financing

    [Description] Start-up creation is the most distinctive feature of the entrepreneurial knowledge-based economy. It is also essential for economic growth and especially important in the current context of high unemployment rates among young graduates, which are expected to increase in the next few decades. There are other books on the creation of start-up companies, designed to be of value to academics wishing to exploit the commercial value of a new technology or business solution, but none of these existing titles focuses on start-up creation in the construction industry. In the second edition of this extremely successful title, the editors present a state-of-the-art review of advanced technologies and their application in several areas of the built environment, covering energy efficiency, structural performance, and air and water quality, to inspire the creation of start-up companies from university research. Part One begins with the key factors behind successful start-up companies from university research, including the development of a business plan, start-up financing, and the importance of intellectual property. Part Two focuses on the use of Big Data, intelligent decision support systems, and the Internet of Things, and their use in the energy efficiency of the built environment. Finally, Part Three is an entirely new section that focuses on several smartphone applications for the smart built environment. While in the first edition the section concerning apps for smart buildings had just two chapters, one on app programming basics and a second presenting a case study on building security, in this second edition the core of the book is app development, which constitutes 50% of the book.

    The Business Models and Economics of Peer-to-Peer Lending

    This is a research report. This paper reviews peer-to-peer (P2P) lending, its development in the UK and other countries, and assesses the business and economic policy issues surrounding this new form of intermediation. P2P platform technology allows direct matching of borrowers and lenders, and lenders’ diversification over a large number of borrowers, without the loans having to be held on an intermediary balance sheet. P2P lending has developed rapidly in both the US and the UK, but it still represents a small fraction, less than 1%, of the stock of bank lending. In the UK, but not elsewhere, it is an important source of loans for smaller companies. We argue that P2P lending is fundamentally complementary to, and not competitive with, conventional banking. We therefore expect banks to adapt to the emergence of P2P lending, either by cooperating closely with third-party P2P lending platforms or by offering their own proprietary platforms. We also argue that the full development of the sector requires much further work addressing the risks and the business and regulatory issues in P2P lending, including risk communication, orderly resolution of platform failure, control of liquidity risks and minimisation of fraud, security and operational risks. This will depend on developing reliable business processes, promoting transparency and standardisation to the fullest extent possible, and appropriate regulation that serves the needs of customers.