Written Practical Report using RapidMiner Software
Assignment 4 relates to course learning objectives 1, 2 and 4 and the associated MBA program learning goals and skills at level 3: Global Context, Problem Solving, Change, Critical Thinking and Written Communication.

1. Demonstrate applied knowledge of people, markets, finances, technology and management in a global context of business intelligence practice (data warehouse design, data mining process, data visualisation and performance management) and resulting organisational change, and how these apply to the implementation of business intelligence in organisation systems and business processes.
2. Identify and solve complex organisational problems creatively and practically through the use of business intelligence, and critically reflect on how evidence based decision making and sustainable business performance management can effectively address real world problems.
4. Demonstrate the ability to communicate effectively in a clear and concise manner in written report style for senior management, with correct and appropriate acknowledgment of main ideas presented and discussed.

The key frameworks, concepts and activities covered in modules 2 to 12, and more specifically modules 6 to 12, are particularly relevant for this assignment.

This assignment consists of three tasks (1, 2 and 3) and builds on the research and analysis you conducted in Assignment 2. Task 1 is concerned with developing and evaluating a model of the key factors that affect credit risk ratings for loan applications, in order to determine whether or not to approve a loan. Task 2 is concerned with the key opportunities and challenges associated with the implementation and utilisation of business intelligence systems. Task 3 is concerned with performance management and provides you with the opportunity to design and build an interactive sales performance dashboard with drill down capability using Tableau 8.0 Desktop or pivot tables.

Task 1 (Worth 40 marks)

In Task 1 of this Assignment 4 you are required to follow the six step CRISP-DM process and make use of the data mining tool RapidMiner to analyse and report on the creditrisk_train.csv and creditrisk_score.csv data sets provided for Assignment 4. You should refer to the data dictionary for creditrisk_train.csv (see Table 1 below). In Tasks 1 and 2 of Assignment 4 you are required to consider all of the business understanding, data understanding, data preparation, modelling, evaluation and deployment phases of the CRISP-DM process.

Table 1 Data Dictionary for creditrisk_train.csv

Variable               Description
Row.No                 Unique identifier for each row (integer)
Application.ID         Unique identifier for loan application (integer)
Credit.Score           Credit score given to the loan application. This is a measure of the creditworthiness of the applicant.
                       http://en.wikipedia.org/wiki/Credit_score_in_the_United_States
                       http://www.buzzle.com/articles/credit-score-rating-scale.html
Late.Payments          History of late payments with existing loans
Months.In.Job          Months in current job
Debt.To.Income.Ratio   The percentage of a consumer’s gross income that goes toward paying debts
                       http://en.wikipedia.org/wiki/Debt_to_income_ratio
Loan.Amount            Loan amount requested
Liquid.Assets          Liquid assets
Num.Credit.Lines       Number of credit lines
Credit.Risk            Credit risk rating (Very Low, Low, Moderate, High, Do not lend)
                       http://www.dico.com/design/Publications/En/By-law5-CommercialLendingPractices-May2005-UpdatedMay2008/CreditRiskRatings.pdf

a) Research the concepts of credit risk and credit scoring in determining whether a financial institution should lend to a loan applicant at an appropriate level of risk, or not lend at all. This will provide you with a business understanding of the dataset you will be analysing in Assignment 4.
Identify which attributes (variables) can be omitted from your credit risk data mining model and why. Comment on your findings in relation to determining the credit risk of loan applicants.

b) Conduct an exploratory analysis of the creditrisk_train.csv data set. Are there any missing values or variables with unusual patterns? How consistent are the characteristics of the creditrisk_train.csv and creditrisk_score.csv data sets? Are there any interesting relationships between the potential predictor variables and your target variable, credit risk? (Hint: identify the variables that will allow you to split the data set into subgroups.) Comment on which variables in creditrisk_train.csv might influence differences in credit scores and credit risk ratings, and the possible approval or rejection of loan applications.

c) Run a decision tree analysis using RapidMiner. Consider which variables you will want to include in this analysis and report on the results. (Hint: identify your target variable and your predictor variables.) Comment on the results of your final model.

d) Run a neural network analysis using RapidMiner. Again, consider which variables you will want to include in this analysis and report on the results. (Hint: identify your target variable and your predictor variables.) Comment on the results of your final model.

e) Based on the results of the decision tree analysis and the neural network analysis, what are the key variables and rules for predicting either good credit risk or bad credit risk? (Hint: with RapidMiner you will need to validate your models on the creditrisk_train.csv data using a number of validation processes for the two models you generated previously using decision trees and neural networks.)
Comment on your two predictive models for credit risk scoring in relation to a false positive/false negative (confusion) matrix, a lift chart and a ROC chart. (Note: for the evaluation operator reports, a lift chart and a ROC chart, you will need to convert the target variable Credit.Risk to a nominal variable with two values, Good and Bad.) Comment on the results of your final model.

Overall for Task 1 you need to report the output of each analysis in sub-task activities a to e and briefly comment on the important aspects of each analysis and their relevance to customer segmentation, behaviours and propensity to default on a loan (approx. 1500 words). Note that you will find the North textbook an invaluable reference for the data mining process activities. Note also that the important outputs from your statistical analyses in RapidMiner should be included as appendices in your report to support your conclusions regarding each analysis; they are not included in the word count.

Task 2 (Worth 15 marks)

For the deployment phase of the CRISP-DM process, discuss the key opportunities and challenges, including socio-technical change management, associated with the implementation and utilisation of a business intelligence system which supports improved evidence based decision making in organisations (approx. 1500 words).

Task 3 (Worth 35 marks)

Scenario

Global Bike International (GBI) is a world class bicycle company serving both professional and amateur cyclists. The company sells bicycles and accessories. In the touring bike category, GBI’s handcrafted bicycles have won numerous design awards and are sold in over 10 countries. GBI’s signature composite frames are world-renowned for their strength, low weight and easy maintenance. GBI bikes are consistently ridden in the Tour de France and other major international road races. GBI produces two models of their signature road bikes, a deluxe and a professional model.
The key difference between the two models is the type of wheels used: aluminium for the deluxe model and carbon composite for the professional model. GBI’s off-road bikes are also recognized as incredibly tough and easy to maintain. GBI off-road bikes are the preferred choice of world champion off-road racers and have become synonymous with performance and strength in one of the most gruelling sports in the world. GBI produces two types of off-road bike, a men’s and a women’s model. The basic difference between the two models is the smaller size and ergonomic shaping of the women’s frame. GBI also sells an accessories product line comprising helmets, t-shirts and other riding accessories. GBI partners with only the highest quality suppliers of accessories, which help enhance riders’ performance and comfort while riding GBI bikes. The Figure below displays the GBI range of products.

Traditionally GBI was a wholesaler that sold its bikes to retailers, who then resold the bikes to the end consumers. Recently GBI has decided to sell its bikes to the end consumer via the internet.

Organisational Structure

Rules have been kept simple: GBI’s headquarters are located in Dallas and the European subsidiary company (GBI Europe) is based in Heidelberg, Germany. With regard to the GBI sales process, there are two sales organisations for America (Eastern US and Western US) and two for Germany (Northern Germany and Southern Germany). All sales organisations have a wholesale distribution channel responsible for delivering the products to the customers. However, only one sales organisation is required in each country to support internet sales. The diagram below displays the GBI organisation to support the sales process.

Dashboard

GBI management require a sales dashboard to give them greater insight into their sales data and to help them understand trends and sales performance. They want the flexibility to visualize sales data in a number of different ways.
They want to be able to get a quick overview of the data, then zoom and filter on particular aspects, and then get further details as required. The specific information they are concerned with is the following four sales performance reports:

1. Sales Revenue and Sales Gross Profit by Week, Month, and Year
2. Sales Revenue and Sales Gross Profit by Product/Product Category
3. Sales Revenue and Sales Gross Profit by sales organisation
4. Sales Revenue and Sales Gross Profit by country

Each morning the CEO of GBI needs an overview of how the company is performing. He has a very busy schedule and needs the information to be displayed in less than 5 seconds.

The data has been extracted from GBI’s SAP enterprise resource planning system and has been made available in spreadsheet format. You can re-organise the spreadsheet as you determine necessary to support the dashboard. The GBI Sales spreadsheet data set is available on the course study desk.

Your Task 3 is to:

(a) create a dashboard to satisfy the GBI management requirements for the four specified sales performance reports:

1. Sales Revenue and Sales Gross Profit by Week, Month, and Year
2. Sales Revenue and Sales Gross Profit by Product/Product Category
3. Sales Revenue and Sales Gross Profit by sales organisation
4. Sales Revenue and Sales Gross Profit by country

(b) provide a rationale for the graphic design and functionality provided in your dashboard for GBI in terms of how it meets GBI management requirements for the four specified sales performance reports (approx. 1000 words).

You will need to submit your Tableau workbook in .twbx format, which contains your dashboard, as a separate document to your main report for Assignment 4.
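
The four sales performance reports all apply the same aggregation to different grouping fields, which is also what a Tableau view or a pivot table computes behind the scenes. The Python sketch below is only an illustration of that idea and is not part of the required deliverable. The column names (date, category, sales_org, country, revenue, cost) and the sample rows are hypothetical, since the layout of the GBI Sales spreadsheet is not described here, and gross profit is assumed to be revenue minus cost.

```python
from collections import defaultdict
from datetime import date

# Hypothetical, made-up rows for illustration only. The real GBI Sales
# spreadsheet may use different column names, units and values.
SAMPLE_ROWS = [
    {"date": date(2012, 1, 2), "category": "Road bikes", "sales_org": "Eastern US",
     "country": "US", "revenue": 3000.0, "cost": 1800.0},
    {"date": date(2012, 1, 10), "category": "Accessories", "sales_org": "Western US",
     "country": "US", "revenue": 500.0, "cost": 200.0},
    {"date": date(2012, 2, 15), "category": "Off-road bikes", "sales_org": "Northern Germany",
     "country": "DE", "revenue": 2500.0, "cost": 1500.0},
    {"date": date(2013, 3, 5), "category": "Road bikes", "sales_org": "Southern Germany",
     "country": "DE", "revenue": 4000.0, "cost": 2600.0},
]


def summarise(rows, key_func):
    """Total revenue and gross profit for each group returned by key_func.

    Assumption: gross profit = revenue - cost for every row.
    """
    totals = defaultdict(lambda: {"revenue": 0.0, "gross_profit": 0.0})
    for row in rows:
        group = totals[key_func(row)]
        group["revenue"] += row["revenue"]
        group["gross_profit"] += row["revenue"] - row["cost"]
    return dict(totals)


# Report 1: by week, month and year. Weeks use ISO numbering, so the key is
# (ISO year, ISO week), which can differ from the calendar year around 1 January.
by_week = summarise(SAMPLE_ROWS, lambda r: tuple(r["date"].isocalendar())[:2])
by_month = summarise(SAMPLE_ROWS, lambda r: (r["date"].year, r["date"].month))
by_year = summarise(SAMPLE_ROWS, lambda r: r["date"].year)

# Reports 2 to 4: by product category, sales organisation and country.
by_category = summarise(SAMPLE_ROWS, lambda r: r["category"])
by_sales_org = summarise(SAMPLE_ROWS, lambda r: r["sales_org"])
by_country = summarise(SAMPLE_ROWS, lambda r: r["country"])

# A drill-down level is just a more specific key, e.g. country then sales organisation.
by_country_and_org = summarise(SAMPLE_ROWS, lambda r: (r["country"], r["sales_org"]))

if __name__ == "__main__":
    for year, totals in sorted(by_year.items()):
        print(year, totals)
```

Passing the grouping rule in as a function keeps a single aggregation routine for all four reports, in the same way that one dashboard data source can feed several views.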