From the time we wake up to the minute our head hits the pillow, we make about 35,000 conscious and unconscious decisions a day. That’s a lot of processing in a 24-hour period. As part of that process, some decisions are intuitive: we’ve been in a situation before and know what to expect. Our minds take shortcuts to save time for the tasks that take a lot more brainpower. As for new decisions, it might take some time to adjust, weigh all the information and decide on a course of action. But after the new situation presents itself over and over again, it becomes easier and easier to process.

Similarly, using traditional data is intuitive. Lenders have been using the same types of data in consumer creditworthiness decisions for decades. Throwing in a new data asset might take some getting used to. For those who are wondering whether to use alternative credit data, specifically alternative financial services (AFS) data, here are some facts to make that decision easier. In a recent webinar, Experian’s Vice President of Analytics, Michele Raneri, and Data Scientist, Clara Gharibian, shed some light on AFS data from the leading source in this data asset, Clarity Services. Here are some insights and takeaways from that event.

What is Alternative Financial Services?
An alternative financial service is a financial service provided outside of traditional banking institutions; the category includes online and storefront lending, short-term unsecured and short-term installment loans, marketplace lending, car title loans and rent-to-own products. As part of the digital age, many non-traditional loans are also moving online, where consumers can access credit with a few clicks on a website or in an app.

AFS data provides insight into consumers across the spectrum of thick- to thin-file credit histories. This data set, which holds information on more than 62 million consumers nationwide, is also meaningful and predictive, a direct answer for lenders who are looking for more information on the consumer. In fact, in a recent State of Alternative Credit Data whitepaper, Experian found that 60 percent of lenders report that they decline more than 5 percent of applications because they have insufficient information to make a loan decision. Having more information on that 5 percent would make a measurable impact for both the lender and the consumer.

AFS data is also rich in detail. Inquiry data, for example, provides insight into activity within the alternative financial services industry. There are also stability indicators in this data, such as number of employers, unique home phones and ZIP codes. These interaction points indicate the stability or volatility of a consumer, which may be helpful in decision-making during the underwriting stage.

AFS consumers tend to be younger and less likely to be married compared to the U.S. average and traditional credit data on File One℠. These consumers also tend to have lower VantageScore® credit scores, lower debt, higher bad rates and much lower spend. These statistics paint a picture of the emerging consumer: millennials, immigrants with little to no credit history, and subprime or near-prime consumers who are demonstrating better credit management. There may also be older consumers who haven’t engaged with traditional credit in a while, or those who hit a major life circumstance and had nowhere else to turn.
Still others who have turned to nontraditional lending may have preferred the experience of online lending and did not realize that many of these trades do not impact their traditional credit file. Regardless of their individual circumstances, consumers who leverage alternative financial services have historically had one thing in common: their performance on these products did nothing to further their access to traditional, and often lower-cost, sources of credit. Through Experian’s acquisition and integration of Clarity Services, the nation’s largest alternative finance credit bureau, lenders can gain access to powerful and predictive supplemental credit data that better detects risk while benefiting consumers with a more complete credit history. Alternative finance data can be used across the lending cycle, from prospecting to decisioning and from account review to collections. Alternative data gives lenders an expanded view of consumer behavior, which enables more complete and confident lending decisions. Find out more about Experian’s alternative credit data: www.experian.com/alternativedata.
With scarce resources and limited experience available in the data science field, a majority of organizations are partnering with outside firms to fill gaps within their teams. A report compiled by Hexa Research found that the data analytics outsourcing market is set to expand at a compound annual growth rate of 30 percent between 2016 and 2024, reaching annual revenues of more than $6 billion. With data science becoming a necessity for success, outsourcing these specific skills will be the way of the future. When working with outside firms, you may be given the option between offshore and onshore resources. But how do you decide? Let’s discuss a few things you can consider.

Offshore
A well-known benefit of using offshore resources is lower cost. Offshore resources provide a larger pool of talent, which includes those who have specific analytical skills that are becoming rare in North America. By partnering with outside firms, you also expose your organization to global best practices by learning from external resources who have worked in different industries and locations. If a partner is investing research and development dollars into specific data science technology or new analytics innovations, you can use this knowledge and apply it to your business. With every benefit, however, there are challenges. Time zone differences and language barriers are things to consider if you’re working on a project that requires a large amount of collaboration with your existing team. Security issues need to be addressed differently when using offshore resources. Lastly, reputational risk also can be a concern for your organization. In certain cases, there may be a negative perception — both internally and externally — of moving jobs offshore, so it’s important to consider this before deciding.

Onshore
While offshore resources can save your organization money, there are many benefits to hiring onshore analytical resources. Many large projects require cross-functional collaboration. If collaboration is key to the projects you’re managing, onshore resources can more easily blend with your existing resources because of time zone similarities, reduced communication barriers and stronger cultural fit with your organization. In the financial services industry, there also are regulatory guidelines to consider. Offshore resources often may have the skills you’re looking for but don’t have a complete understanding of the regulatory landscape, which can lead to larger problems in the future. Hiring resources with this type of knowledge will help you conduct the analysis in a compliant manner and reduce your overall risk.

All of the above
Many of our clients — and we ourselves — find that an all-of-the-above approach is both effective and efficient. In certain situations, some timeline reductions can be made by having both onshore and offshore resources working on a project. Teams can include up to three different groups:
- Local resources who are closest to the client and the problem
- Resources in a nearby foreign country whose time zone overlaps with that of the local resources
- More analytical team members around the world whose tasks are accomplished somewhat more independently
Carefully focusing on how the partnership works and how the external resources are managed is even more important than where they are located. Read 5 Secrets to Outsourcing Data Science Successfully to help you manage your relationship with your external partner.
If your next project calls for experienced data scientists, Experian® can help. Our Analytics on Demand™ service provides senior-level analysts, either offshore or onshore, who can help with analytical data science and modeling work for your organization.
What if you had an opportunity to boost your credit score with a snap of your fingers? With the announcement of Experian Boost™, this will soon be the new reality. As part of an increasingly customizable and instant consumer reality in the marketplace, Experian is innovating in the credit space to allow consumers to contribute information to their credit profiles via access to their online bank accounts. For decades, Experian has been a leader in educating consumers on credit: what goes into a credit score, how to raise it and how to maintain it. Now, as part of our mission to be the consumer’s bureau, Experian is ushering in a new age of consumer empowerment with Boost.

Building on an already established, full-fledged suite of consumer products, Experian Boost is the next-generation offering: a free online platform that places control in consumers’ hands to influence their credit scores. The platform will feature a sign-in verification, during which consumers grant read-only permission for Experian Boost to connect to their online bank accounts to identify utility and telecommunications payments. After they verify their data and confirm that they want the account information added to their credit file, consumers will receive an instantly updated FICO® Score.

The history of credit information spans several centuries, from a group of London tailors swapping information on customers to credit files kept on index cards and read out to subscribers over the telephone. Even with the credit industry now firmly in the digital age, Experian Boost is a significant step forward for a credit bureau. This new capability educates consumers on what types of payment behavior impact their credit score while also empowering them to add information to change it. This is a big win-win for consumers and lenders alike. As Experian takes this next big step as a traditional credit bureau, adding these data sources is a new and innovative way to help consumers gain access to the quality credit they deserve while promoting fair and responsible lending across the industry.

Early analysis of Experian Boost’s impact on U.S. consumer credit scores showed promising results. These statistics provide an encouraging vision of the future for all consumers, especially those who have a limited credit history. The benefit to lenders in adding these new data points will be a more complete view of the consumer with which to make more informed lending decisions. Only positive payment histories will be collected through the platform, and consumers can elect to remove the new data at any time.

Experian Boost will be available to all credit-active adults in early 2019, but consumers can visit www.experian.com/boost now to register for early access. By signing up for a free Experian membership, consumers will receive a free credit report immediately and will be one of the first to experience the new platform. Experian Boost will apply to most leading consumer credit scores used by lenders. To learn more about the platform, visit www.experian.com/boost.
“We don’t know what we don’t know.” It’s a truth that seems to be on the minds of just about every financial institution these days. The market, not to mention the customer base, seems to be evolving more quickly now than ever before. Mergers, acquisitions and partnerships, along with new competitors entering the space, are a daily headline. Customers expect the seamless user experience and instant gratification they’ve come to know from companies like Amazon in just about every interaction they have, including with their financial institutions. Broadly, financial institutions have been slow to respond, both in the products they offer their customers and prospects and in how they present those products. Not surprisingly, only 26% of customers feel like their financial institutions understand and appreciate their needs. So, it’s not hard to see why there might be uncertainty as to how a financial institution should respond or what it should do next.

But what if you could know what you don’t know about your customer and industry data? Sound too good to be true? It’s not—it’s exactly what Experian’s Ascend Analytical Sandbox was built to do. “At OneMain we’ve used Sandbox for a lot of exploratory analysis and feature development,” said Ryland Ely, a modeler and Sandbox user at Experian client OneMain Financial. For example, “we’ve used a loan amount model built on Sandbox data to try and flag applications where we might be comfortable with the assigned risk grade but we’re concerned we might be extending too much or too little credit,” he said.

The first product built on Ascend, Experian’s big data platform, the Analytical Sandbox is an analytics environment that can have enterprise-wide impact. It provides users instant access to near real-time customer data, actionable analytics and intelligence tools, along with a network of industry and support experts, to drive the most value out of their data and analytics. Developed with scalability, flexibility, efficiency and security top of mind, the Sandbox is a hybrid-cloud system that leverages the high availability and security of Amazon Web Services. This eliminates the need, time and infrastructure costs associated with creating an internally hosted environment. Additionally, our web-based interface speeds access to data and tools in your dedicated Sandbox, all behind the protection of Experian’s firewall. In addition to being supported by a revolutionized tech stack backed by an $825 million annual investment, Sandbox enables use of industry-leading business intelligence tools like SAS, RStudio, H2O, Python, Hue and Tableau.

Where the Ascend Sandbox really shines is in the amount and quality of the data that’s put into it. Built by the world’s largest global information services provider, the Sandbox brings the full power of Experian’s 17+ years of full-file historical tradeline data, boasting a data accuracy rate of 99.9%. The Sandbox also allows users the option to incorporate additional data sets, including commercial small business data and, soon, real estate data, among others. Alternative data assets add coverage of the more than 50 million consumers who use some form of alternative financial services, along with rental and utility payment data. In addition to including Experian’s data on the 220+ million credit-active consumers, small business and other data sets, the Sandbox also allows companies to integrate their own customer data into the system.
All data is depersonalized and pinned to allow companies to fully leverage the value of Experian’s patented attributes, scores and models. Ascend Sandbox allows companies to mine the data for business intelligence to define strategy and translate those findings into data visualizations to communicate and win buy-in throughout their organization. But here is where customers are really finding the value in this big data solution: taking those business intelligence insights and being able to move the resulting models and strategies from the Sandbox directly into a production environment. After all, amassing data is worthless unless you’re able to use it. That’s why 15 of the top financial institutions globally are using the Experian Ascend Sandbox not just for benchmarking and data visualization, but also for risk modeling, score migration, share-of-wallet analysis, market entry, cross-sell and much more. Moreover, clients are seeing time savings, deeper insights and reduced compliance concerns as a result of consolidating their production data and development platform inside Sandbox. “Sandbox is often presented as a tool for visualization or reporting, sort of creating summary statistics of what’s going on in the market. But as a modeler, my perspective is that it has application beyond just those things,” said Ely. To learn more about the Experian Ascend Analytical Sandbox and hear more about how OneMain Financial is getting value out of the Sandbox, watch this on-demand webinar.
Your model is only as good as your data, right? Actually, there are many considerations in developing a sound model, one of which is data. Yet if your data is bad or dirty or doesn’t represent the full population, can it be used? This is where sampling can help. When done right, sampling can lower your cost to obtain the data needed for model development. When done well, sampling can turn a tainted and underrepresented data set into a sound and viable model development sample.

First, define the population to which the model will be applied once it’s finalized and implemented. Determine what data is available and what population segments must be represented within the sampled data. The more variability in internal factors — such as changes in marketing campaigns, risk strategies and product launches — and external factors — such as economic conditions or competitor presence in the marketplace — the larger the sample size needed. A model developer often will need to sample over time to incorporate seasonal fluctuations in the development sample. The most robust samples are pulled from data that best represents the full population to which the model will be applied. It’s important to ensure your data sample includes customers or prospects declined by the prior model and strategy, as well as approved but nonactivated accounts. This ensures full representation of the population to which your model will be applied. Also, consider the number of predictors or independent variables that will be evaluated during model development, and increase your sample size accordingly. When it comes to spotting dirty or unacceptable data, the golden rule is to know your data and know your target population. Spend time evaluating your intended population and group profiles across several important business metrics. Don’t underestimate the time needed to complete a thorough evaluation.

Next, select data that aptly represents the full population within the sampled data set. Determine the sampling methodology that will best support the model development and business objectives. Sampling generates a smaller data set for use in model development, allowing the developer to build models more quickly. Reducing the data set’s size decreases the time needed for model computation and saves storage space without losing predictive performance. Once the data is selected, weights are applied so that each record appropriately represents the full population to which the model will be applied. Several traditional techniques can be used to sample data:
- Simple random sampling — Each record is chosen by chance, and each record in the population has an equal chance of being selected.
- Random sampling with replacement — Each record chosen by chance is returned to the pool and remains eligible for subsequent selections.
- Random sampling without replacement — Each record chosen by chance is removed from subsequent selections.
- Cluster sampling — Records from the population are sampled in groups, such as region, over different time periods.
- Stratified random sampling — This technique allows you to sample different segments of the population at different proportions. In some situations, stratified random sampling is helpful in selecting segments of the population that aren’t as prevalent as other segments but are equally vital within the model development sample.
Learn more about how Experian Decision Analytics can help you with your custom model development needs.
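To make the stratified sampling and weighting steps above concrete, here is a minimal Python sketch. It is an illustration only, not Experian’s methodology: the pandas DataFrame, the segment column and the sampling fractions are hypothetical.

# A minimal sketch of stratified random sampling with post-sampling weights.
# Assumes a pandas DataFrame `population` with a hypothetical segment column;
# names and sampling fractions are illustrative only.
import pandas as pd

def stratified_sample(population: pd.DataFrame, segment_col: str, fractions: dict, seed: int = 42) -> pd.DataFrame:
    """Sample each segment at its own rate and attach a weight so the
    sampled records re-represent the full population."""
    pieces = []
    for segment, frac in fractions.items():
        stratum = population[population[segment_col] == segment]
        sampled = stratum.sample(frac=frac, random_state=seed)  # without replacement
        sampled = sampled.assign(weight=1.0 / frac)             # inverse-probability weight
        pieces.append(sampled)
    return pd.concat(pieces, ignore_index=True)

# Example: oversample a rare but vital segment (e.g., bad accounts) relative to goods.
# fractions = {"bad": 1.0, "good": 0.10}
# development_sample = stratified_sample(population, "performance", fractions)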
I believe it was George Bernard Shaw who once said something along the lines of, “If economists were laid end to end, they’d never come to a conclusion, at least not the same conclusion.” It often feels the same way when it comes to big data analytics around customer behavior. As you look at new tools to put your customer insights to work for your enterprise, you likely have questions coming from across your organization. Models always seem to take forever to develop; how sure are we that the results are still accurate? What data did we use in this analysis, and do we need to worry about compliance or security? To answer these questions, and in an effort to make the best use of customer data, the most forward-thinking financial institutions are turning to analytical environments, or sandboxes, to solve their big data problems. But what functionality is right for your financial institution? In your search for a sandbox solution to solve the business problem of big data, make sure you keep these top four features in mind.

Efficiency: Building an internal data archive with effective business intelligence tools is expensive, time-consuming and resource-intensive. That’s why investing in a sandbox makes the most sense when it comes to drawing the value out of your customer data. By providing immediate access to the data environment at all times, the best systems can reduce the time from data input to decision by at least 30%. Another way the right sandbox can help you achieve operational efficiencies is by direct integration with your production environment. Pretty charts and graphs are great and can be very insightful, but the best sandbox goes beyond business intelligence and should allow you to immediately put models into action.

Scalability and Flexibility: In implementing any new software system, scalability and flexibility are key when it comes to integration with your native systems and the system’s capabilities. This is even more imperative when implementing an enterprise-wide tool like an analytical sandbox. Look for systems that offer a hosted, cloud-based environment, like Amazon Web Services, that ensures operational redundancy, as well as browser-based access and system availability. The right sandbox will leverage a scalable software framework for efficient processing. It should also be programming-language agnostic, allowing use of all industry-standard programming languages and analytics tools like SAS, R Studio, H2O, Python, Hue and Tableau. Moreover, you shouldn’t have to pay for software suites that your analytics teams aren’t going to use.

Support: Whether you have an entire analytics department at your disposal or a lean, start-up style team, you’re going to want the highest level of support when it comes to onboarding, implementation and operational success. The best sandbox solution for your company will have a robust support model in place to ensure client success. Look for solutions that offer hands-on instruction, flexible online or in-person training and analytical support. Look for solutions and data partners that also offer the consultative help of industry experts when your company needs it.

Data, Data and More Data: Any analytical environment is only as good as the data you put into it. It should, of course, include your own client data. However, relying exclusively on your own data can lead to incomplete analysis, missed opportunities and reduced impact.
When choosing a sandbox solution, pick a system that will include the most local, regional and national credit data, in addition to alternative data and commercial data assets, on top of your own data. The optimum solutions will have years of full-file, archived tradeline data, along with attributes and models, for the most robust results. Be sure your data partner has accounted for opt-outs, excludes data precluded by legal or regulatory restrictions and also anonymizes data files when linking your customer data. Data accuracy is also imperative here. Choose a big data partner who is constantly monitoring and correcting discrepancies in customer files across all bureaus. The best partners will have data accuracy rates at or above 99.9%. Solving the business problem around your big data can be a daunting task. However, investing in analytical environments or sandboxes can offer a solution. Finding the right solution and data partner is critical to your success. As you begin your search for the best sandbox for you, be sure to look for solutions that offer the right combination of operational efficiency, flexibility and support, all combined with the most robust national data, along with your own customer data. Are you interested in learning how companies are using sandboxes to make it easier, faster and more cost-effective to drive actionable insights from their data? Join us for this upcoming webinar. Register for the Webinar
This is an exciting time to work in big data analytics. Here at Experian, we have more than 2 petabytes of data in the United States alone. In the past few years, because of high data volume, more computing power and the availability of open-source code algorithms, my colleagues and I have watched excitedly as more and more companies are getting into machine learning. We’ve observed the growth of competition sites like Kaggle, open-source code sharing sites like GitHub and various machine learning (ML) data repositories. We’ve noticed that on Kaggle, two algorithms win over and over at supervised learning competitions:
- If the data is well-structured, teams that use Gradient Boosting Machines (GBM) seem to win.
- For unstructured data, teams that use neural networks win pretty often.
Modeling is both an art and a science. Those winning teams tend to be good at what the machine learning people call feature generation and what we credit scoring people call attribute generation. We have nearly 1,000 expert data scientists in more than 12 countries, many of whom are experts in traditional consumer risk models — techniques such as linear regression, logistic regression, survival analysis, CART (classification and regression trees) and CHAID analysis. So naturally I’ve thought about how GBM could apply in our world.

Credit scoring is not quite like a machine learning contest. We have to be sure our decisions are fair and explainable and that any scoring algorithm will generalize to new customer populations and stay stable over time. Increasingly, clients are sending us their data to see what we could do with newer machine learning techniques. We combine their data with our bureau data and even third-party data, we use our world-class attributes and develop custom attributes, and we see what comes out. It’s fun — like getting paid to enter a Kaggle competition! For one financial institution, GBM armed with our patented attributes found a nearly 5 percent lift in KS when compared with traditional statistical techniques.

At Experian, we use the Extreme Gradient Boosting (XGBoost) implementation of GBM, which, out of the box, has regularization features we use to prevent overfitting. But it’s missing some features that we and our clients count on in risk scoring. Our Experian DataLabs team worked with our Decision Analytics team to figure out how to make it work in the real world. We found answers for a couple of important issues:
- Monotonicity — Risk managers count on the ability to impose what we call monotonicity. In application scoring, applications with better attribute values should score as lower risk than applications with worse values. For example, if consumer Adrienne has fewer delinquent accounts on her credit report than consumer Bill, all other things being equal, Adrienne’s machine learning score should indicate lower risk than Bill’s score.
- Explainability — We were able to adapt a fairly standard “Adverse Action” methodology from logistic regression to work with GBM.
There has been enough enthusiasm around our results that we’ve just turned it into a standard benchmarking service. We help clients appreciate the potential for these new machine learning algorithms by evaluating them on their own data. Over time, the acceptance and use of machine learning techniques will become commonplace among model developers as well as internal validation groups and regulators.
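For readers who want to see what imposing monotonicity can look like in practice, below is a minimal sketch using XGBoost’s monotone_constraints parameter on synthetic data. It illustrates the general capability discussed above; the feature names, data and settings are hypothetical, not Experian’s production configuration or attributes.

# A minimal sketch of imposing monotonicity in an XGBoost risk model.
# Feature names, data and constraint directions are hypothetical.
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.random((1000, 2))  # columns stand in for [delinquent_accounts, utilization]
y = (X[:, 0] + X[:, 1] + rng.normal(0, 0.1, 1000) > 1.0).astype(int)

dtrain = xgb.DMatrix(X, label=y, feature_names=["delinquent_accounts", "utilization"])
params = {
    "objective": "binary:logistic",
    "max_depth": 3,
    "eta": 0.1,
    # "1" forces predictions to be non-decreasing in that feature: more delinquencies
    # or higher utilization can never lower the predicted risk.
    "monotone_constraints": "(1,1)",
    # Out-of-the-box regularization that helps prevent overfitting.
    "lambda": 1.0,
}
booster = xgb.train(params, dtrain, num_boost_round=100)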
Whether you’re a data scientist looking for a cool place to work or a risk manager who wants help evaluating the latest techniques, check out our weekly data science video chats and podcasts.
If your company is like many financial institutions, it’s likely the discussion around big data and financial analytics has been an ongoing conversation. For many financial institutions, data isn’t the problem, but rather what could or should be done with it. Research has shown that only about 30% of financial institutions are successfully leveraging their data to generate actionable insights, and customers are noticing. According to a recent study from Capgemini, only 30% of US customers and 26% of UK customers feel like their financial institutions understand their needs. No matter how much data you have, it’s essentially just ones and zeroes if you’re not using it. So how do banks, credit unions and other financial institutions that capture and consume vast amounts of data use that data to innovate, improve the customer experience and stay competitive? The answer, you could say, is written in the sand.

The most forward-thinking financial institutions are turning to analytical environments, also known as sandboxes, to solve the business problem of big data. Like the name suggests, a sandbox is an environment that contains all the materials and tools one might need to create, build and collaborate around their data. A sandbox gives data-savvy banks, credit unions and FinTechs access to depersonalized credit data from across the country. Using custom dashboards and data visualization tools, they can manipulate the data with predictive models for different micro- and macro-level scenarios. The added value of a sandbox is that it becomes a one-stop-shop data tool for the entire enterprise. This saves the time normally spent in the back-and-forth of acquiring data specific to a project or particular data sets. The best systems utilize the latest open-source technology in artificial intelligence and machine learning to deliver intelligence that can inform regional trends, consumer insights and market opportunities, supporting everything from industry benchmarking to market entry and expansion research, campaign performance, vintage analysis, reject inferencing and much more.

An analytical sandbox gives you the data to create actionable analytics and insights across the enterprise right when you need it, not months later. The result is the ability to empower your customers to make financial decisions when, where and how they want. Keeping them happy keeps your financial institution relevant and competitive. Isn’t it time to put your data to work for you? Learn more about how Experian can solve your big data problems. >> Interested to see a live demo of the Ascend Sandbox? Register today for our webinar “Big Data Can Lead to Even Bigger ROI with the Ascend Sandbox.”
Machine learning (ML), the newest buzzword, has swept into the lexicon and captured the interest of us all. Its recent, widespread popularity has stemmed mainly from the consumer perspective. Whether it’s virtual assistants, self-driving cars or romantic matchmaking, ML has rapidly positioned itself in the mainstream. Though ML may appear to be a new technology, its use in commercial applications has been around for some time. In fact, many of the data scientists and statisticians at Experian are considered pioneers in the field of ML, going back decades. Our team has developed numerous products and processes leveraging ML, from our world-class consumer fraud and ID protection to credit data products like our Trended 3D™ attributes. In fact, we were just highlighted in the Wall Street Journal for how we’re using machine learning to improve our internal IT performance.

ML’s ability to consume vast amounts of data to uncover patterns and deliver results that are not otherwise humanly possible is what makes it unique and applicable to so many fields. This predictive power has now sparked interest in the credit risk industry. Unlike fraud detection, where ML is well-established and used extensively, credit risk modeling has until recently taken a cautionary approach to adopting newer ML algorithms. Because of regulatory scrutiny and a perceived lack of transparency, ML hasn’t experienced the same broad acceptance in credit risk modeling as it has in other applications.

When it comes to credit risk models, delivering the most predictive score is not the only consideration for a model’s viability. Modelers must be able to explain and detail the model’s logic, or its “thought process,” for calculating the final score. This means taking steps to ensure the model’s compliance with the Equal Credit Opportunity Act, which forbids discriminatory lending practices. Federal laws also require adverse action responses to be sent by the lender if a consumer’s credit application has been declined. This requires that the model be able to highlight the top reasons for a less-than-optimal score. And so, while ML may be able to deliver the best predictive accuracy, its ability to explain how the results are generated has always been a concern. ML has been stigmatized as a “black box,” where data mysteriously gets transformed into the final predictions without a clear explanation of how. However, this is changing. Depending on the ML algorithm applied to credit risk modeling, we’ve found risk models can offer the same transparency as more traditional methods such as logistic regression. For example, a gradient boosting machine (GBM) is designed as a predictive model built from a sequence of decision tree submodels. The very nature of GBMs’ decision tree design allows statisticians to explain the logic behind the model’s predictive behavior. We believe model governance teams and regulators in the United States may become comfortable with this approach more quickly than with deep learning or neural network algorithms, since GBMs are represented as sets of decision trees that can be explained, while neural networks are represented as long sets of cryptic numbers that are much harder to document, manage and understand. In future blog posts, we’ll discuss the GBM algorithm in more detail and how we’re using its predictability and transparency to maximize credit risk decisioning for our clients.
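To make the transparency point concrete, the short sketch below fits a small GBM in scikit-learn on synthetic data and prints the splits of one of its constituent decision trees. It is a generic illustration of how a GBM’s trees can be inspected, not the modeling approach or attributes Experian uses; the feature names are hypothetical.

# A minimal sketch showing that a GBM is a sequence of small decision trees
# whose splits can be read directly, which underpins its explainability.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import export_text

rng = np.random.default_rng(0)
X = rng.random((500, 3))
y = (X[:, 0] - X[:, 2] + rng.normal(0, 0.1, 500) > 0).astype(int)

gbm = GradientBoostingClassifier(n_estimators=50, max_depth=2, learning_rate=0.1)
gbm.fit(X, y)

# Each boosting stage is an ordinary regression tree; print the first one's rules.
first_tree = gbm.estimators_[0, 0]
print(export_text(first_tree, feature_names=["num_delinquencies", "income", "utilization"]))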
The August 2018 LinkedIn Workforce Report states some interesting facts about data science and the current workforce in the United States. Demand for data scientists is off the charts, but there is a data science skills shortage in almost every U.S. city — particularly in the New York, San Francisco and Los Angeles areas. Nationally, there is a shortage of more than 150,000 people with data science skills. One way companies in financial services and other industries have coped with the skills gap in analytics is by using outside vendors. A 2017 Dun & Bradstreet and Forbes survey reported that 27 percent of respondents cited a skills gap as a major obstacle to their data and analytics efforts. Outsourcing data science work makes it easier to scale up and scale down as needs arise. And surprisingly, more than half of respondents said the third-party work was superior to their in-house analytics. At Experian, we have participated in quite a few outsourced analytics projects. Here are a few of the lessons we’ve learned along the way:

Manage expectations: Everyone has their own management style, but to be successful, you must be proactively involved in managing the partnership with your provider. Doing so will keep them aligned with your objectives and prevent quality degradation or cost increases as you become more tied to them.

Communication: Creating open and honest communication between executive management and your resource partner is key. You need to be able to discuss what is working well and what isn’t. This will help ensure your partner has a thorough understanding of your goals and objectives and will properly manage any bumps in the road.

Help external resources feel like part of the team: When you’re working with external resources, either offshore or onshore, they are typically in a different location. This can make them feel like they aren’t part of the team and therefore not directly tied to the business goals of the project. To help bridge the gap, holding regular status meetings via video conference can help everyone feel like part of the team. Within these meetings, providing information on the goals and objectives of the project is key. This way, they can hear the message directly from you, which will make them feel more involved and provide a clear understanding of what they need to do to be successful. Being able to put faces to names, as well as having direct communication with you, will help external employees feel included.

Drive engagement through recognition programs: Research has shown that employees are more engaged in their work when they receive recognition for their efforts. While you may not be able to provide a monetary award, recognition is still a big driver of engagement. It can be as simple as recognizing a job well done during your video conference meetings, providing certificates of excellence or sending a simple thank-you card to those who are performing well. Either way, taking the extra time to make your external workforce feel appreciated will produce engaged resources that will help drive your business goals forward.

Industry training: Your external resources may have the skills needed to perform the job successfully, but they may not have specific industry knowledge geared toward your business. Work with your partner to determine where they have expertise and where you can work together to provide training. Ensure your external workforce has a solid understanding of the business line they will be supporting.
If you’ve decided to augment your staff for your next big project, Experian® can help. Our Analytics on Demand™ service provides senior-level analysts, either onshore or offshore, who can help with analytical data science and modeling work for your organization.
As I mentioned in my previous blog, model validation is an essential step in evaluating a recently developed predictive model’s performance before finalizing and proceeding with implementation. An in-time validation sample is created by setting aside a portion of the total model development sample so that predictive accuracy can be measured on data not used to develop the model. However, if few records in the target performance group are available, splitting the total model development sample into development and in-time validation samples will leave too few records in the target group for use during model development. An alternative approach to generating a validation sample is to use a resampling technique. There are many different types and variations of resampling methods. This blog will address a few common techniques.

Jackknife technique — An iterative process whereby an observation is removed from each subsequent sample generation. So if there are N observations in the data, jackknifing calculates the model estimates on N different samples, each having N - 1 observations. The model then is applied to each sample, and an average of the model predictions across all samples is derived to generate an overall measure of model performance and prediction accuracy. The jackknife technique can be broadened so that a group of observations, rather than a single observation, is removed from each subsequent sample generation, while giving each observation in the data set an equal opportunity for inclusion and exclusion.

K-fold cross-validation — Generates multiple validation data sets from the holdout sample created for the model validation exercise, i.e., the holdout data is split into K subsets. The model then is applied iteratively, with each subset held out once as the validation set while the model is fit on the remaining K - 1 subsets. Again, an average of the predictions across the multiple validation samples is used to create an overall measure of model performance and prediction accuracy.

Bootstrap technique — Generates subsets from the full model development data sample, with replacement, producing multiple samples generally of equal size. Thus, with a total sample size of N, this technique generates random samples of size N such that a single observation can be present in multiple subsets while another observation may not be present in any of the generated subsets. The generated samples are combined into a simulated larger data sample that then can be split into a development and an in-time, or holdout, validation sample.

Before selecting a resampling technique, it’s important to check and verify data assumptions for each technique against the data sample selected for your model development, as some resampling techniques are more sensitive than others to violations of data assumptions. Learn more about how Experian Decision Analytics can help you with your custom model development.
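For readers who want a starting point, here is a minimal Python sketch of these resampling ideas using scikit-learn on synthetic data. The model, the AUC metric and the data are placeholders, and the techniques described above can be implemented in many other ways.

# A minimal sketch of common resampling techniques for validation.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, LeaveOneOut, cross_val_score
from sklearn.utils import resample

rng = np.random.default_rng(0)
X = rng.random((300, 4))
y = (X[:, 0] + rng.normal(0, 0.2, 300) > 0.5).astype(int)
model = LogisticRegression()

# K-fold cross-validation: each fold serves once as the validation set while the
# model is fit on the remaining K - 1 folds; the scores are then averaged.
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
print("5-fold mean AUC:", cross_val_score(model, X, y, cv=kfold, scoring="roc_auc").mean())

# Jackknife: the classic single-observation form is leave-one-out.
# loo_scores = cross_val_score(model, X, y, cv=LeaveOneOut())  # costly for large N

# Bootstrap: draw a sample of size N with replacement, so some records repeat
# and others are left out entirely.
X_boot, y_boot = resample(X, y, replace=True, n_samples=len(y), random_state=0)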
An introduction to the different types of validation samples

Model validation is an essential step in evaluating and verifying a model’s performance during development before finalizing the design and proceeding with implementation. More specifically, during a predictive model’s development, the objective of a model validation is to measure the model’s accuracy in predicting the expected outcome. For a credit risk model, this may be predicting the likelihood of good or bad payment behavior, depending on the predefined outcome. Two general types of data samples can be used to complete a model validation. The first is known as the in-time, or holdout, validation sample and the second is known as the out-of-time validation sample. So, what’s the difference between an in-time and an out-of-time validation sample?

An in-time validation sample sets aside part of the total sample made available for the model development. Random partitioning of the total sample is completed upfront, generally separating the data into a portion used for development and the remaining portion used for validation. For instance, the data may be randomly split, with 70 percent used for development and the other 30 percent used for validation. Other common data subset schemes include an 80/20, a 60/40 or even a 50/50 partitioning of the data, depending on the quantity of records available within each segment of your performance definition. Before selecting a data subset scheme to be used for model development, you should evaluate the number of records available in your target performance group, such as number of bad accounts. If you have too few records in your target performance group, a 50/50 split can leave you with insufficient performance data for use during model development. A separate blog post will present a few common options for creating alternative validation samples through a technique known as resampling. Once the data has been partitioned, the model is created using the development sample. The model is then applied to the holdout validation sample to determine the model’s predictive accuracy on data that wasn’t used to develop the model. The model’s predictive strength and accuracy can be measured in various ways by comparing the known and predefined performance outcome to the model’s predicted performance outcome.

The out-of-time validation sample contains data from an entirely different time period or customer campaign than what was used for model development. Validating model performance on a different time period is beneficial to further evaluate the model’s robustness. Selecting a data sample from a more recent time period having a fully mature set of performance data allows the modeler to evaluate model performance on a data set that may more closely align with the current environment in which the model will be used. In this case, a more recent time period can be used to establish expectations and set baseline parameters for model performance, such as population stability indices and performance monitoring. Learn more about how Experian Decision Analytics can help you with your custom model development needs.
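As a simple illustration of creating an in-time (holdout) validation sample, here is a minimal Python sketch of a 70/30 random partition on synthetic data; the data and split choices are placeholders rather than a recommended configuration.

# A minimal sketch of a 70/30 in-time (holdout) partition for model development.
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-ins: X holds candidate predictors, y the predefined
# good (0) / bad (1) performance flag.
rng = np.random.default_rng(0)
X = rng.random((10_000, 5))
y = (rng.random(10_000) < 0.05).astype(int)  # roughly 5% bads

X_dev, X_val, y_dev, y_val = train_test_split(
    X, y,
    test_size=0.30,   # 30 percent set aside as the in-time validation sample
    stratify=y,       # keep the good/bad mix the same in both portions
    random_state=42,
)
# The model is developed on (X_dev, y_dev) and its predictive accuracy measured
# on (X_val, y_val). An out-of-time sample would instead come from a different,
# more recent time period or campaign.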
According to our recent research for the State of Alternative Credit Data, more lenders are using alternative credit data to determine if a consumer is a good or bad credit risk. In fact, when it comes to making decisions:
- More than 50% of lenders verify income, employment and assets as well as check public records before making a credit decision.
- 78% of lenders believe factoring in alternative data allows them to extend credit to consumers who otherwise would be declined.
- 70% of consumers are willing to provide additional financial information to a lender if it increases their chance for approval or improves their interest rate.
The alternative financial services space continues to grow with products like payday loans, rent-to-own products, short-term loans and more. By including alternative financial data, all types of lenders can explore both universe expansion and risk mitigation.
State of Alternative Credit Data
The traditional credit score has ruled the financial services space for decades, but it’s clear the way in which consumers manage their money and credit has evolved. Today’s consumers are utilizing different types of credit via various channels. Think fintech. Think short-term loans. Think check-cashing services and payday loans. So, how do lenders gain more visibility into a consumer’s creditworthiness in 2018? Alternative credit data has surfaced to provide a more holistic view of all consumers – those on the traditional credit file and those who are credit invisible or just emerging. In an all-new report, Experian dives into “The State of Alternative Credit Data,” providing in-depth coverage of how alternative credit data is defined, regulatory implications, consumer personas attached to the alternative financial services industry, and how this data complements traditional credit data files.

“Alternative credit data can take the shape of alternative finance data, rental, utility and telecom payments, and various other data sources,” said Paul DeSaulniers, Experian’s senior director of Risk Scoring and Trended/Alternative Data and Attributes. “What we’ve seen is that when this data becomes visible to a lender, suddenly a much more comprehensive consumer profile is formed. In some instances, this helps them offer consumers new credit opportunities, and in other cases it might illuminate risk.”

In a national Experian survey, 53% of consumers said they believe some of these alternative sources, like utility bill payment history, savings and checking account transactions, and mobile phone payments, would have a positive effect on their credit score. Of the lenders surveyed, 80% said they rely on a credit report plus additional information when making a lending decision. They cited assessing a consumer’s ability to pay, underwriting insights and being able to expand their lending universe as the top three benefits of using alternative credit data. The paper goes on to show how layering in alternative finance data could allow lenders to identify the consumers they would like to target, as well as suppress those who are higher risk.

“Additional data fields prove to deliver a more complete view of today’s credit consumer,” said DeSaulniers. “For the credit invisible, the data can show lenders should take a chance on them. They may suddenly see a steady payment behavior that indicates they are worthy of expanded credit opportunities.”

An “unscoreable” individual is not necessarily a high credit risk — rather, they are an unknown credit risk. Many of these individuals pay rent on time and in full each month and could be great candidates for traditional credit. They just don’t have a credit history yet. The in-depth report also explores the future of alternative credit data. With more than 90 percent of the data in the world having been generated in just the past five years, there is no doubt more data sources will emerge in the coming years. Not all will make sense in assessing credit decisions, but there will definitely be new ways to capture consumer-permissioned data to benefit both consumer and lender. Read Full Report
In my first blog post on the topic of customer segmentation, I shared with readers that segmentation is the process of dividing customers or prospects into groupings based on similar behaviors. The more similar or homogeneous the customer grouping, the less variation there is within each segment’s custom model development sample. A thoughtful segmentation analysis contains two phases: generation of potential segments and evaluation of those segments. Although several potential segments may be identified, not all segments will necessarily require a separate scorecard. Separate scorecards should be built only if there is real benefit to be gained through the use of multiple scorecards applied to partitioned portions of the population. The meaningful evaluation of the potential segments is therefore an essential step. There are many ways to evaluate the performance of a multiple-scorecard scheme compared with a single-scorecard scheme. Regardless of the method used, separate scorecards are justified only if a segment-based scorecard significantly outperforms a scorecard based on the broader population. To do this, Experian® builds a scorecard for each potential segment and evaluates the performance improvement compared with the broader population scorecard. This step is then repeated for each potential segmentation scheme. Once potential customer segments have been evaluated and the segmentation scheme finalized, the next step is to begin the model development. Learn more about how Experian Decision Analytics can help you with your segmentation or custom model development needs.
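As an illustration of the evaluation step described above, the minimal sketch below builds a scorecard for each candidate segment and compares it with a single scorecard built on the broader population, scored on the same segment’s validation records. The data, the two-segment scheme and the AUC metric are hypothetical stand-ins, not Experian’s actual evaluation methodology.

# A minimal sketch: compare segment-level scorecards against one broad scorecard.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5_000
X = rng.random((n, 4))
segment = (X[:, 0] > 0.5).astype(int)  # hypothetical two-segment scheme
y = ((X[:, 1] + 0.5 * segment * X[:, 2] + rng.normal(0, 0.3, n)) > 0.8).astype(int)

X_dev, X_val, y_dev, y_val, seg_dev, seg_val = train_test_split(
    X, y, segment, test_size=0.3, random_state=0)

# One scorecard built on the broader population.
broad_model = LogisticRegression().fit(X_dev, y_dev)

# One scorecard per candidate segment, each compared with the broad scorecard
# on that segment's validation records.
for s in (0, 1):
    seg_model = LogisticRegression().fit(X_dev[seg_dev == s], y_dev[seg_dev == s])
    mask = seg_val == s
    auc_segment = roc_auc_score(y_val[mask], seg_model.predict_proba(X_val[mask])[:, 1])
    auc_broad = roc_auc_score(y_val[mask], broad_model.predict_proba(X_val[mask])[:, 1])
    print(f"segment {s}: segment scorecard AUC={auc_segment:.3f}, broad scorecard AUC={auc_broad:.3f}")

Separate scorecards would be justified only if the segment-level lift were material and stable, not merely positive in a single comparison.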