|
|
| TBIL2026 | Data Mining | 2+1+0 | ECTS: 3 |
| Year / Semester | Spring Semester |
| Level of Course | Short Cycle |
| Status | Elective |
| Department | DEPARTMENT of COMPUTER TECHNOLOGIES |
| Prerequisites and co-requisites | None |
| Mode of Delivery | |
| Contact Hours | 14 weeks - 2 hours of lectures and 1 hour of practicals per week |
| Lecturer | Öğr. Gör. Dr Zafer YAVUZ |
| Co-Lecturer | - |
| Language of instruction | Turkish |
| Professional practice (internship) | None |
| The aim of the course: |
| The primary objective of this course is to give students an understanding of how raw data is transformed into valuable information and to develop their competence in approaching business problems (such as customer prediction and market basket analysis) from an analytical perspective. The course focuses on the operational logic and practical application of algorithms rather than on complex mathematical theory, so that students acquire skills in data cleaning, analysis, and the accurate interpretation of results within business processes. |
| Learning Outcomes | CTPO | TOA |
| Upon successful completion of the course, the students will be able to: | | |
| LO - 1: | Define the Knowledge Discovery from Data (KDD) process, the data warehousing concept, and the lifecycle of data mining. | 1 - 6 | 1 |
| LO - 2: | Apply the data cleaning and preprocessing techniques required to prepare raw data for analysis. | 2 - 3 | 1, 6 |
| LO - 3: | Identify the appropriate method for business problems such as customer segmentation (Clustering) and market basket analysis (Association Rules). | 3 | 1 |
| LO - 4: | Interpret the results obtained from data mining analyses and report them using data visualization tools. | 2 - 13 | 1, 6 |
| CTPO: Contribution to programme outcomes, TOA: Type of assessment (1: Written exam, 2: Oral exam, 3: Homework assignment, 4: Laboratory exercise/exam, 5: Seminar / presentation, 6: Term paper), LO: Learning Outcome | | |
| Introduction to Data Mining and the KDD Process: The Data-to-Information pyramid, the Knowledge Discovery in Databases (KDD) process, and the CRISP-DM methodology (The lifecycle of data mining).
Data Warehousing and OLAP Concepts: The difference between Databases and Data Warehouses, operational data vs. historical data, and the logic of Multi-Dimensional Data Analysis (OLAP).
Data Preprocessing and Cleaning: Detection of dirty data, handling missing values, and preparing data for analysis (Normalization).
Association Rules (Market Basket Analysis): The logic of "Those who bought this also bought that," the working principle of the Apriori algorithm, and cross-selling applications.
Clustering and Segmentation: Grouping similar records, customer profiling, and segmentation studies (K-Means logic).
Classification and Prediction: Extracting data-driven rules using Decision Trees and simple classification scenarios.
Data Visualization and Business Intelligence: Visualizing mining results, the Dashboard concept, and interpretation of findings.
|
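The "Those who bought this also bought that" logic listed under Association Rules can be illustrated with simple counting. The sketch below is only an illustration and not course material: the receipts, the item names, and the helper functions `support` and `confidence` are all hypothetical. It computes how often an item set appears (support) and how often the second item appears given the first (confidence), the two measures that Apriori-style analysis builds on.

```python
from collections import Counter
from itertools import combinations

# Hypothetical receipts; each set is one customer's basket.
receipts = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Support of both sides together divided by support of the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


# Count how often each pair of products appears on the same receipt.
pair_counts = Counter(
    pair
    for basket in receipts
    for pair in combinations(sorted(basket), 2)
)

print(pair_counts.most_common(3))
print("support(diapers, beer):", support({"diapers", "beer"}, receipts))
print("confidence(diapers -> beer):", round(confidence({"diapers"}, {"beer"}, receipts), 2))
```

A rule such as "diapers -> beer" is usually considered interesting only when both values exceed thresholds chosen by the analyst; those thresholds are a design choice and are not specified in this syllabus.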
| |
| Course Syllabus |
| Week | Subject | Related Notes / Files |
| Week 1 | Intro: Welcome to the World of Data Mining
Real-life examples (Netflix, Spotify, Google Maps). What is Big Data? What is the purpose of Data Mining? | |
| Week 2 | Understanding the Process: KDD and Data Warehouse
The path from raw data to information (KDD Process). Difference between a database and a data warehouse. What is operational data? | |
| Week 3 | Data Types and Intro to Excel
Attribute and record concepts. Structured vs. unstructured data. Exploring data in Excel, simple sorting and filtering. | |
| Week 4 | Practice Block 1: Fighting Dirty Data
Scenario: "The boss sent a corrupted customer list."
Finding missing values, correcting erroneous entries, removing duplicates (Excel). | |
| Week 5 | Classification 1: Statistical Learning (Naive Bayes)
Thinking based on probability. How does a spam filter work? The logic of "Look at the past, predict the future". | |
| Week 6 | Classification 2: Decision Trees
Algorithms that decide like humans. What are root, node, and leaf? How to read and interpret a tree? | |
| Week 7 | Practice Block 2: Building a Decision Tree
Drawing a decision tree on paper based on a scenario (e.g., Should the loan be approved?) and writing the rules in Excel using "IF" formulas. | |
| Week 8 | Practice Block 3: "How Accurate is Our Prediction?" (Model Testing)
Testing the rules created last week on "Test Data". Comparing the "Prediction" and "Actual" columns in Excel. Calculating a simple accuracy rate. | |
| Week 9 | Mid-term exam | |
| Week 10 | Clustering and Segmentation
Difference from classification. Grouping customers. Logic of the K-Means algorithm (proximity to center). | |
| Week 11 | Practice Block 4: Customer Segmentation
Grouping customers as "VIP", "Standard", and "Risky" based on spending, using simple filters or Pivot Tables in Excel. | |
| Week 12 | Association Rules (Market Basket Analysis)
The "Diapers and Beer" example. Apriori logic. What is cross-selling? | |
| Week 13 | Practice Block 5: Basket Analysis
Finding which products are sold together in a market receipt dataset, using observation and simple counting methods. | |
| Week 14 | Data Visualization and Storytelling
How do we present results to the manager? Pie chart or bar chart? What is a Dashboard? | |
| Week 15 | Final Project Presentations
Students apply a technique learned during the term (Cleaning, Clustering, etc.) to a small dataset and present it. | |
| Week 16 | Final exam | |
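The Week 7 and Week 8 practice blocks (writing decision-tree rules as "IF" formulas, then comparing "Prediction" with "Actual" on test data) can be sketched as follows. This is a minimal illustration in Python rather than Excel, under stated assumptions: the rule in `predict_loan`, its income threshold, and the test records are all hypothetical and do not come from the course materials.

```python
# Hypothetical decision-tree rule for "Should the loan be approved?",
# written the way a nested Excel IF formula would express it.
def predict_loan(income, has_debt):
    if income >= 50000:
        return "Approve"
    if not has_debt:
        return "Approve"
    return "Reject"


# Hypothetical test data: (income, has_debt, actual decision).
test_data = [
    (60000, True, "Approve"),
    (30000, False, "Approve"),
    (25000, True, "Reject"),
    (55000, True, "Reject"),
    (40000, False, "Reject"),
]

# Build the "Prediction" and "Actual" columns.
predictions = [predict_loan(income, has_debt) for income, has_debt, _ in test_data]
actuals = [actual for _, _, actual in test_data]

# Accuracy = number of matching rows / total number of rows.
correct = sum(p == a for p, a in zip(predictions, actuals))
accuracy = correct / len(test_data)

print(f"Accuracy: {correct}/{len(test_data)} = {accuracy:.0%}")
```

With these made-up records the rule matches three of the five actual decisions. The same comparison in a spreadsheet would be a column of equality checks followed by an average.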
| 1 | Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking
Authors: Foster Provost & Tom Fawcett
ISBN-10: 1449361323
ISBN-13: 978-1449361327
Edition: 1st
Publisher: O'Reilly Media
Publication date: September 17, 2013 | | | |
| 1 | Veri Madenciliği Kavram ve Algoritmaları
Author: Doç. Dr. Gökhan Silahtaroğlu
Publisher: Papatya Bilim | | | |
| Method of Assessment |
| Type of assessment | Week No | Date | Duration (hours) | Weight (%) |
| Mid-term exam | 9 | | 1 | 20 |
| Project | Throughout the term | | 20 | 30 |
| End-of-term exam | 16 | | 1 | 50 |
| Student Work Load and its Distribution |
| Type of work | Duration (hours per week) | No of weeks / Number of activities | Hours in total per term |
| Face-to-face instruction | 2 | 14 | 28 |
| Out-of-class study | 1 | 10 | 10 |
| Preparation for mid-term exam | 4 | 1 | 4 |
| Mid-term exam | 1 | 1 | 1 |
| Practicals | 1 | 14 | 14 |
| Project | 3 | 5 | 15 |
| Preparation for end-of-term exam | 4 | 2 | 8 |
| End-of-term exam | 1 | 1 | 1 |
| Total work load | | | 81 |
|
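The workload total can be checked with a short calculation: each row's hours are its duration multiplied by its number of weeks or activities, and the rows sum to the stated total. The sketch below copies the figures from the workload table and uses the 3 ECTS credits given in the course header; the row labels are English renderings and the variable names are illustrative only.

```python
# Rows from the workload table: (type of work, hours each time, number of times).
workload = [
    ("Face-to-face instruction", 2, 14),
    ("Out-of-class study", 1, 10),
    ("Preparation for mid-term exam", 4, 1),
    ("Mid-term exam", 1, 1),
    ("Practicals", 1, 14),
    ("Project", 3, 5),
    ("Preparation for end-of-term exam", 4, 2),
    ("End-of-term exam", 1, 1),
]

# Total hours per term = sum of (hours each time x number of times).
total_hours = sum(hours * count for _, hours, count in workload)

# ECTS value taken from the course header (ECTS: 3).
ects_credits = 3
hours_per_credit = total_hours / ects_credits

print("Total work load:", total_hours)
print("Hours per ECTS credit:", hours_per_credit)
```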