- 4500 Words
Assignment 4 relates to the specific course learning objectives 1, 2 and 4 and associated MBA
program learning goals and skills: Global Content, Problem solving, Change, Critical thinking,
and Written Communication at level 3.
1. demonstrate applied knowledge of people, markets, finances, technology and management in
a global context of business intelligence practice (data warehouse design, data mining process,
data visualisation and performance management) and resulting organisational change and how
these apply to implementation of business intelligence in organisation systems and business
processes
2. identify and solve complex organisational problems creatively and practically through the
use of business intelligence and critically reflect on how evidence based decision making and
sustainable business performance management can effectively address real world problems
4. demonstrate the ability to communicate effectively in a clear and concise manner in written
report style for senior management with correct and appropriate acknowledgment of main ideas
presented and discussed.
The key frameworks, concepts and activities covered in modules 2–12 and more specifically
modules 6 to 12 are particularly relevant for this assignment. This assignment consists of three
tasks 1, 2 and 3 and builds on the research and analysis you conducted in Assignment 2. Task 1
is concerned with developing and evaluating a model of key factors impacting on credit risk
ratings for loan applications in determining whether to approve or not approve a loan. Task
2 is concerned with the key opportunities and challenges associated with the implementation
and utilisation of business intelligence systems. Task 3 is concerned with performance
management and provides you with the opportunity to design and build an interactive sales
performance dashboard with drill down capability using Tableau 8.0 Desktop or pivot tables.
Task 1 (Worth 40 marks)
In Task 1 of this Assignment 4 you are required to follow the six step CRISP DM process and
make use of the data mining tool RapidMiner to analyse and report on the creditrisk_train.csv
and creditrisk_score.csv data sets provided for Assignment 4. You should refer to the data
dictionary for creditrisk_train.csv (see Table 1 below). In Task 1 and 2 of Assignment 4 you
are required to consider all of the business understanding, data understanding, data preparation,
modelling, evaluation and deployment phases of the CRISP DM process.
Table 1 Data Dictionary for creditrisk_train.csv
Variable               Description
Row.No                 Unique identifier for each row – integer
Application.ID         Unique identifier for loan application – integer
Credit.Score           Credit score given to the loan application
Late.Payments          History of late payments with existing loans
Months.In.Job          Months in current job
Debt.To.Income.Ratio   The Percentage Of consumer’s gross
Loan.Amount            Loan amount requested
Liquid.Assets          Liquid assets
Num.Credit.Lines       Number of credit lines
Credit.Risk            Credit risk rating (Very Low, Low, Moderate,
a) Research the concepts of credit risk and credit scoring in determining whether a financial
institution should lend at an appropriate level of risk or not lend to a loan applicant. This will
provide you with a business understanding of the dataset you will be analysing in Assignment 4.
Identify which variables (attributes) can be omitted from your credit risk data mining model and
why. Comment on your findings in relation to determining the credit risk of loan applicants.
b) Conduct an exploratory analysis of the creditrisk_train.csv data set. Are there any missing
values or variables with unusual patterns? How consistent are the characteristics of the
creditrisk_train.csv and creditrisk_score.csv datasets? Are there any interesting relationships
between the potential predictor variables and your target variable credit risk? (Hint: identify the
variables that will allow you to split the data set into subgroups.) Comment on which variables in the
data set creditrisk_train.csv might influence differences in credit scores and credit risk ratings and
the possible approval or rejection of loan applications.
c) Run a decision tree analysis using RapidMiner. Consider what variables you will want to include
in this analysis and report on the results. (Hint: Identify what is your target variable and what are
your predictor variables?) Comment on the results of your final model.
d) Run a neural network analysis using RapidMiner. Again consider what variables you will want
to include in this analysis and report on the results. (Hint: Identify what is your target variable and
what are your predictor variables?) Comment on the results of your final model.
e) Based on the results of the decision tree analysis and neural network analysis, what are the
key variables and rules for predicting either good credit risk or bad credit risk? (Hint: with
RapidMiner you will need to validate your models on the creditrisk_train.csv data using a number
of validation processes for the two models you have generated previously using decision trees and
neural networks.) Comment on your two predictive models for credit risk scoring in relation
to a false/positive (confusion) matrix, lift chart and ROC chart. (Note: for the evaluation operator
reports, a lift chart and a ROC chart, you will need to convert the target variable Credit.Risk to a
nominal variable with two values, Good and Bad.) Comment on the results of your final model.
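To make the evaluation measures in sub task e concrete, the following is a minimal Python sketch of what a false/positive matrix, ROC points and lift represent. It is not a substitute for the RapidMiner evaluation operators. The grouping of ratings into Good and Bad (`GOOD_LEVELS`), the example ratings and the predicted scores are all assumptions made for illustration; you must decide and justify your own grouping.

```python
# Assumption: which ratings are treated as "Good"; every other rating becomes "Bad".
GOOD_LEVELS = {"Very Low", "Low"}

def to_binary(rating):
    """Collapse a multi-level credit risk rating into Good or Bad."""
    return "Good" if rating in GOOD_LEVELS else "Bad"

def confusion(actual, predicted, positive="Bad"):
    """Count true/false positives and negatives for the chosen positive class."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["TP" if a == positive else "FP"] += 1
        else:
            counts["FN" if a == positive else "TN"] += 1
    return counts

def roc_points(actual, scores, positive="Bad"):
    """(FPR, TPR) pairs found by lowering the threshold one case at a time.
    Tied scores are not merged, so use distinct scores with this sketch."""
    pos = sum(a == positive for a in actual)
    neg = len(actual) - pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, a in sorted(zip(scores, actual), reverse=True):
        if a == positive:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points

def lift_at(actual, scores, fraction, positive="Bad"):
    """Positive rate in the top-scored fraction divided by the overall positive rate."""
    ranked = [a for _, a in sorted(zip(scores, actual), reverse=True)]
    top = ranked[:max(1, round(len(ranked) * fraction))]
    base_rate = sum(a == positive for a in actual) / len(actual)
    top_rate = sum(a == positive for a in top) / len(top)
    return top_rate / base_rate

# Illustrative values only, not taken from the assignment data.
ratings = ["Very Low", "Low", "Moderate", "Low", "Moderate", "Moderate"]
scores = [0.1, 0.4, 0.8, 0.6, 0.7, 0.3]  # hypothetical model probability of "Bad"
actual = [to_binary(r) for r in ratings]
predicted = ["Bad" if s >= 0.5 else "Good" for s in scores]

matrix = confusion(actual, predicted)
curve = roc_points(actual, scores)
top_third_lift = lift_at(actual, scores, 1 / 3)
```

Treating Bad as the positive class reflects an interest in detecting likely defaulters. Swapping the positive class changes which errors count as false positives, so state your choice when you comment on the charts.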
Overall for Task 1 you need to report the output of each analysis in sub task activities a to e
and briefly comment on the important aspects of each analysis and their relevance to customer
segmentation, behaviours and propensity to default on a loan. (Note: you will find the North
textbook an invaluable reference for the data mining process activities.) (Approx. 1500 words.)
Note that the important outputs from your statistical analyses in RapidMiner should be included as
appendices in your report to support your conclusions regarding each analysis and are
not included in the word count.
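For the exploratory step in sub task b, the two checks asked for (missing values, and how predictors differ across credit risk ratings) can be sketched in a few lines of Python. This is a hedged illustration only; the rows in `SAMPLE` are invented, and the `profile` helper is a hypothetical name, not part of RapidMiner or the course materials. In practice you would read creditrisk_train.csv itself.

```python
import csv
import io
from collections import Counter, defaultdict
from statistics import mean

# Invented rows for illustration; to use the real file, replace io.StringIO(SAMPLE)
# with open("creditrisk_train.csv", newline="").
SAMPLE = """Application.ID,Credit.Score,Late.Payments,Credit.Risk
1001,720,0,Low
1002,,3,Moderate
1003,580,5,Moderate
1004,690,1,Low
"""

def profile(rows, target="Credit.Risk", ignore=("Row.No", "Application.ID")):
    """Count blank cells per column and average each numeric column per target class."""
    missing = Counter()
    values = defaultdict(lambda: defaultdict(list))  # column -> class -> numbers
    for row in rows:
        for col, cell in row.items():
            if cell is None or cell.strip() == "":
                missing[col] += 1
                continue
            if col == target or col in ignore:
                continue  # identifiers and the target itself are not averaged
            try:
                values[col][row[target]].append(float(cell))
            except ValueError:
                pass  # non-numeric column: skipped in this sketch
    means = {
        col: {cls: mean(nums) for cls, nums in by_class.items()}
        for col, by_class in values.items()
    }
    return dict(missing), means

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
missing, means = profile(rows)
```

Identifier columns are excluded from the averages because they carry no predictive meaning, which is also relevant to the question in sub task a about which attributes to omit.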
Task 2 (Worth 15 marks)
For the deployment phase of the CRISP DM process discuss the key opportunities and
challenges including socio-technical change management associated with the implementation
and utilisation of a business intelligence system which supports improved evidence based
decision making in organisations. (1500 words approx.)
Task 3 (Worth 35 marks)
Scenario
Global Bike International (GBI) is a world class bicycle company serving both professional
and amateur cyclists. The company sells bicycles and accessories. In the touring bike
category, GBI’s handcrafted bicycles have won numerous design awards and are sold in over
10 countries. GBI’s signature composite frames are world-renowned for their strength, low
weight and easy maintenance. GBI bikes are consistently ridden in the Tour de France and
other major international road races.
GBI produces two models of their signature road bikes, a deluxe and professional model. The
key difference between the two models is the type of wheels used, aluminium for the deluxe
model and carbon composite for the professional model. GBI’s off-road bikes are also
recognized as incredibly tough and easy to maintain. GBI off-road bikes are the preferred
choice of world champion off road racers and have become synonymous with performance
and strength in one of the most gruelling sports in the world.
GBI produces two types of off-road bike, a men’s and women’s model. The basic difference
between the two models is the smaller size and ergonomic shaping of the women’s frame.
GBI also sells an accessories product line comprised of helmets, t-shirts and other riding
accessories.
GBI partners with only the highest quality suppliers of accessories which will help enhance
riders’ performance and comfort while riding GBI bikes. The Figure below displays the GBI
range of products.
Traditionally GBI was a wholesaler who sold their bikes to retailers who then resold the bikes
to the end consumers. Recently GBI has decided to sell their bikes to the end consumer via the
internet.
Organisational Structure
Rules have been kept simple:
GBI’s headquarters are located in Dallas and the European subsidiary company (GBI Europe)
is based in Heidelberg, Germany. With regard to the GBI sales process, there are two sales
organisations for America (Eastern US and Western US) and two for Germany (Northern
Germany and Southern Germany). All sales organisations have a wholesale distribution
channel responsible for delivering the products to the customers. However only one sales
organisation is required in each country to support internet sales. The diagram below displays
the GBI organisation to support the sales process.
Dashboard
GBI management require a sales dashboard to be created to provide greater insight to their
sales data to understand the trends and sales performance. They want the flexibility to
visualize sales data in a number of different ways. They want to be able to get a quick
overview of the data and then be able to zoom and filter on particular aspects and then get
further details as required.
The specific information they are concerned with is the following four sales performance
reports.
1. Sales Revenue and Sales Gross Profit by Week, Month, and Year
2. Sales Revenue and Sales Gross Profit by Product/Product Category
3. Sales Revenue and Sales Gross Profit by sales organisation
4. Sales Revenue and Sales Gross Profit by country
The CEO of GBI needs an overview of how the company is performing each morning. He
has a very busy schedule and needs the information to be displayed in less than 5 seconds.
The data has been extracted from GBI’s SAP enterprise resource planning system and has
been made available in spreadsheet format.
You can re-organise the spreadsheet as you determine necessary to support the dashboard.
The GBI Sales spreadsheet data set is available on the course study desk.
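The four reports are all the same aggregation (total revenue and total gross profit) grouped by a different dimension. The Python sketch below illustrates that idea outside Tableau. The field names, the records in `SALES`, and the calculation of gross profit as revenue minus cost are assumptions for illustration; the actual GBI spreadsheet columns may differ.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records; field names and values are not taken from the GBI data set.
SALES = [
    {"date": date(2012, 1, 5), "category": "Road", "country": "US",
     "revenue": 3000.0, "cost": 1800.0},
    {"date": date(2012, 1, 20), "category": "Accessories", "country": "DE",
     "revenue": 200.0, "cost": 120.0},
    {"date": date(2012, 2, 3), "category": "Road", "country": "DE",
     "revenue": 2800.0, "cost": 1700.0},
    {"date": date(2012, 2, 10), "category": "Off-road", "country": "US",
     "revenue": 2500.0, "cost": 1500.0},
]

def summarise(records, key):
    """Total revenue and gross profit (assumed: revenue - cost) per group."""
    totals = defaultdict(lambda: {"revenue": 0.0, "gross_profit": 0.0})
    for r in records:
        group = totals[key(r)]
        group["revenue"] += r["revenue"]
        group["gross_profit"] += r["revenue"] - r["cost"]
    return dict(totals)

by_month = summarise(SALES, lambda r: (r["date"].year, r["date"].month))
by_week = summarise(SALES, lambda r: r["date"].isocalendar()[:2])  # (ISO year, ISO week)
by_category = summarise(SALES, lambda r: r["category"])
by_country = summarise(SALES, lambda r: r["country"])

# Drill-down is the same aggregation on a finer key, e.g. country then category.
by_country_category = summarise(SALES, lambda r: (r["country"], r["category"]))
```

Keeping one row per sales line with separate date, product, sales organisation and country columns lets every report, and every drill-down level, be produced from the same table.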
Your Task 3 is to:
(a) create a dashboard to satisfy the GBI management requirements for the four specified sales
performance reports:
1. Sales Revenue and Sales Gross Profit by Week, Month, and Year
2. Sales Revenue and Sales Gross Profit by Product/Product Category
3. Sales Revenue and Sales Gross Profit by sales organisation
4. Sales Revenue and Sales Gross Profit by country
(b) provide a rationale for the graphic design and functionality that is provided in your
dashboard for GBI in terms of how it meets GBI management requirements for the four specified
sales performance reports (1000 words approx.). You will need to submit your Tableau
workbook in .twbx format, which contains your dashboard, as a separate document to your main
report for Assignment 4.