BILL5210 | Knowledge Discovery in Large Data Sets | 3+0+0 | ECTS: 7.5
Year / Semester | Fall Semester
Level of Course | Second Cycle
Status | Elective
Department | DEPARTMENT of COMPUTER ENGINEERING
Prerequisites and co-requisites | None
Mode of Delivery | Face to face
Contact Hours | 14 weeks - 3 hours of lectures per week
Lecturer | Asst. Prof. Dr. Murat AYKUT
Co-Lecturer |
Language of instruction | Turkish
Professional practice (internship) | None
The aim of the course: | The course aims to teach students the principles of knowledge discovery in large data sets and to give them the ability to use the popular methods in this area.
Programme Outcomes | CTPO | TOA
Upon successful completion of the course, the students will be able to:
PO - 1: | Understand the basic concepts of knowledge discovery and data mining. | 1,14 | 1
PO - 2: | Understand how preprocessing methods work. | 11,14 | 1
PO - 3: | Design and implement supervised / unsupervised learning methods, outlier detection methods and association rules. | 12,14 | 1,3
PO - 4: | Design and implement advanced data mining methods. | 11,15 | 1,3
CTPO: Contribution to programme outcomes, TOA: Type of assessment (1: Written exam, 2: Oral exam, 3: Homework assignment, 4: Laboratory exercise/exam, 5: Seminar / presentation, 6: Term paper), PO: Learning outcome
Basic Concepts; Preprocessing Methods; Feature Extraction; Outlier Analysis; Supervised Learning; Statistical Learning Theory; Instance-based Learning; Decision Trees; Clustering; Association Rules; Advances in Data Mining; Advanced Data Mining Methods.
Course Syllabus
Week | Subject | Related Notes / Files
Week 1 | Concepts - Knowledge discovery, data mining, large data sets, data warehouses |
Week 2 | Preprocessing Methods: Data cleaning, missing feature value handling, dimension reduction, discretization methods, feature extraction |
Week 3 | Outlier analysis: Extreme value analysis, probabilistic models, clustering for outlier detection, distance-based outlier detection, information-theoretic models, outlier validity |
Week 4 | Supervised learning - Statistical learning theory, statistical inference, regression estimation, model estimation |
Week 5 | Bayesian inference, analysis of variance, linear discriminant analysis, Support Vector Machines |
Week 6 | Instance-Based Learning - Reducing the number of examples, pruning noisy examples, weighting features, instance-based learning methods |
Week 7 | Decision Trees - C4.5 algorithm, unknown feature values, limitations of decision trees, associative classification method |
Week 8 | Clustering analysis - Similarity criteria, agglomerative hierarchical clustering, discriminative clustering, incremental clustering, graph and probability-based clustering |
Week 9 | Midterm exam |
Week 10 | Association Rules - Apriori algorithm, multidimensional association rules, path extraction, web mining, text mining |
Week 11 | Advances in Data Mining: Graph mining, temporal data mining, spatial data mining, distributed data mining |
Week 12 | Advanced Methods: Multi-label data mining, meta learning, data mining on imbalanced datasets, ensemble methods |
Week 13 | Scalable classification, regression modeling with numerical classes, semi-supervised learning, active learning |
Week 14 | Visualization Methods: Perception and visualization, scientific visualization, angular visualization, visualization using SOM |
Week 15 | Project |
Week 16 | Final exam |
1 | Maimon, O. and Rokach, L., Data Mining and Knowledge Discovery Handbook, Springer, 2010, 1285 pages.
Method of Assessment
Type of assessment | Week No | Date | Duration (hours) | Weight (%)
Mid-term exam | 9 | | 2 | 30
Project | 15 | | 2 | 20
End-of-term exam | 16 | | 3 | 50
Student Work Load and its Distribution
Type of work | Duration (hours pw) | No of weeks / Number of activity | Hours in total per term
Face-to-face instruction | 3 | 14 | 42
Out-of-class study | 4 | 14 | 56
Preparation for mid-term exam | 4 | 3 | 12
Mid-term exam | 2 | 1 | 2
Project | 5 | 14 | 70
Preparation for end-of-term exam | 5 | 3 | 15
End-of-term exam | 3 | 1 | 3
Total work load | | | 200