Course Outline:
"Successful Data Mining"
The following subjects will be covered in the context of the
sample data sets.
OVERVIEW OF DATA MINING
- Definition
- Business problems
- Types of analysis: descriptive (clustering, association
detection, sequence detection); predictive (classification,
regression, time series)
- The Two Crows data mining process
- Why building predictive models is a difficult problem
IDENTIFYING THE BUSINESS PROBLEM
- Targeting data mining applications
- Uses of data mining
PREPARING THE DATA FOR MINING
- Building the mining database: architecture, types of data,
collecting the data
- Data quality: data consolidation, missing values, erroneous
values, outliers
- Understanding and transforming the data: visualizations,
statistical profiling
- Selecting data: columns (reducing dimensionality); rows (sampling)
- Transforming the data: data representation (scaling, binning,
encoding, time series)
MINING E-COMMERCE DATA
- Building the e-commerce database
- Personalizing the interaction
- Making recommendations
- What can go wrong
BUILDING THE MODEL
- The train, test, validate cycle
- Validating the model
- Model evaluation
- What can go wrong
MODEL TYPES AND ALGORITHMS
- Classical regression (linear and non-linear), logistic regression
- Decision trees
- Neural nets
- K-nearest neighbor
- MARS
DATA MINING PRODUCT SELECTION
- Types of data mining tools
- Analytic applications
- Market overview