| BILL5210 | Knowledge Discovery in Large Data Sets | 3+0+0 | ECTS: 7.5 |
| Year / Semester | Fall Semester |
| Level of Course | Second Cycle |
| Status | Elective |
| Department | Department of Computer Engineering |
| Prerequisites and co-requisites | None |
| Mode of Delivery | Face to face |
| Contact Hours | 14 weeks - 3 hours of lectures per week |
| Lecturer | Asst. Prof. Dr. Murat AYKUT |
| Co-Lecturer | |
| Language of instruction | Turkish |
| Professional practice (internship) | None |
| The aim of the course: |
| The course aims to teach students the principles of knowledge discovery in large data sets and to give them the ability to apply the popular methods in this area. |
| Programme Outcomes | CTPO | TOA |
| Upon successful completion of the course, the students will be able to: | | |
| PO-1: | Understand the basic concepts of knowledge discovery and data mining. | 1 - 14 | 1 |
| PO-2: | Explain how preprocessing methods work. | 11 - 14 | 1 |
| PO-3: | Design and implement supervised and unsupervised learning methods, outlier detection methods, and association rules. | 12 - 14 | 1, 3 |
| PO-4: | Design and implement advanced data mining methods. | 11 - 15 | 1, 3 |
| CTPO: Contribution to programme outcomes, TOA: Type of assessment (1: Written exam, 2: Oral exam, 3: Homework assignment, 4: Laboratory exercise/exam, 5: Seminar/presentation, 6: Term paper), PO: Learning outcome |
| Basic Concepts; Preprocessing Methods; Feature Extraction; Outlier Analysis; Supervised Learning; Statistical Learning Theory; Instance-based Learning; Decision Trees; Clustering; Association Rules; Advances in Data Mining; Advanced Data Mining Methods. |
| |
| Course Syllabus |
| Week | Subject | Related Notes / Files |
| Week 1 | Concepts: Knowledge discovery, data mining, big data sets, data warehouses | |
| Week 2 | Preprocessing methods: Data cleaning, missing feature value handling, dimension reduction, discretization methods, feature extraction | |
| Week 3 | Outlier analysis: Extreme value analysis, probabilistic models, clustering for outlier detection, distance-based outlier detection, information-theoretic models, outlier validity | |
| Week 4 | Supervised learning: Statistical learning theory, statistical inference, regression estimation, model estimation | |
| Week 5 | Bayesian inference, analysis of variance, linear discriminant analysis, support vector machines | |
| Week 6 | Instance-based learning: Reducing the number of examples, pruning noisy examples, weighting features, instance-based learning methods | |
| Week 7 | Decision trees: C4.5 algorithm, unknown feature values, limitations of decision trees, associative classification method | |
| Week 8 | Cluster analysis: Similarity criteria, agglomerative hierarchical clustering, discriminative clustering, incremental clustering, graph-based and probability-based clustering | |
| Week 9 | Midterm exam | |
| Week 10 | Association rules: Apriori algorithm, multidimensional association rules, path extraction, web mining, text mining | |
| Week 11 | Advances in data mining: Graph mining, temporal data mining, spatial data mining, distributed data mining | |
| Week 12 | Advanced methods: Multi-label data mining, meta-learning, data mining on imbalanced datasets, ensemble methods | |
| Week 13 | Scalable classification, regression modeling with numerical classes, semi-supervised learning, active learning | |
| Week 14 | Visualization methods: Perception and visualization, scientific visualization, angular visualization, visualization using self-organizing maps (SOM) | |
| Week 15 | Project | |
| Week 16 | Final exam | |
| 1 | Maimon, O. and Rokach, L., Data Mining and Knowledge Discovery Handbook, Springer, 2010, 1285 pages. |
| Method of Assessment |
| Type of assessment | Week No | Date | Duration (hours) | Weight (%) |
| Mid-term exam | 9 | | 2 | 30 |
| Project | 15 | | 2 | 20 |
| End-of-term exam | 16 | | 3 | 50 |
| Student Work Load and its Distribution |
| Type of work | Duration (hours per week) | No. of weeks / Number of activities | Hours in total per term |
| Face-to-face instruction | 3 | 14 | 42 |
| Out-of-class study | 4 | 14 | 56 |
| Preparation for mid-term exam | 4 | 3 | 12 |
| Mid-term exam | 2 | 1 | 2 |
| Project | 5 | 14 | 70 |
| Preparation for end-of-term exam | 5 | 3 | 15 |
| End-of-term exam | 3 | 1 | 3 |
| Total work load | | | 200 |