Course Description
This course provides an accessible introduction to the core principles and techniques of data mining and big data analytics. Students will learn how to explore, preprocess, and analyze data to extract meaningful patterns and insights. The course introduces fundamental concepts such as classification, clustering, and association rule mining using real-world examples. It covers the basics of big data platforms like Hadoop and Spark to demonstrate how data mining scales in modern distributed environments. This course emphasizes intuitive understanding and practical applications over mathematical rigor, making it ideal for students with little or no prior background in data science. (3 credits)
Prerequisite
- None
Student Learning Outcomes (SLOs)
Upon successful completion of the course, the student will be able to:
- Describe the fundamental concepts of data mining and big data and their role in modern analytics.
- Apply data preprocessing techniques such as cleaning, normalization, and transformation.
- Perform basic classification tasks using intuitive methods like decision trees and k-nearest neighbors.
- Apply clustering techniques to group data based on similarity and explore how to interpret the results.
- Generate simple association rules from transactional data and explain their practical uses.
- Explain the architecture and role of big data tools such as Hadoop, MapReduce, and Spark.
- Analyze large datasets using scalable data mining techniques in distributed environments.
- Derive and present insights from real-world data through small projects or case studies.
Course Activities and Grading
| Assignments | Points | Weight |
|---|---|---|
| Discussions (Weeks 1-7) | 70 | 20% |
| Assignments (Weeks 1-7) | 325 | 50% |
| Presentations (Week 8) | 30 | 5% |
| Final Project (Weeks 6-8) | 100 | 25% |
| Total | 525 | 100% |
Required Textbooks
Available through Charter Oak State College's Book Bundle
- Erl, T., Khattak, W., & Buhler, P. (2016). Big data fundamentals: Concepts, drivers & techniques. Pearson. ISBN 978-0134291079.
- Han, J., Pei, J., & Tong, H. (2022). Data mining: Concepts and techniques (4th ed.). Morgan Kaufmann. ISBN 978-0128117606.
Additional Resource
- Leskovec, J., Rajaraman, A., & Ullman, J. D. Mining of massive datasets. Cambridge University Press.
Course Schedule
| Week | SLOs | Readings and Exercises | Assignments |
|---|---|---|---|
| 1 | 1, 2 | Topic: Intro to Data Mining and Big Data | |
| 2 | 2, 3 | Topics: Cleaning and Organizing Data | |
| 3 | 3 | Topic: Classification – Fundamentals | |
| 4 | 3, 4 | Topic: Classification – Advanced Topics | |
| 5 | 4, 5 | Topic: Clustering | |
| 6 | 5 | Topics: Big Data Fundamentals & Ecosystem | |
| 7 | 6, 7 | Topics: Hadoop & Spark Architecture | |
| 8 | 4, 5, 6, 7, 8 | Topics: Big Data Integration & Final Project Presentation | |
COSC Accessibility Statement
Charter Oak State College encourages students with disabilities, including non-visible disabilities such as chronic diseases, learning disabilities, head injury, attention deficit/hyperactivity disorder, or psychiatric disabilities, to discuss appropriate accommodations with the Office of Accessibility Services at OAS@charteroak.edu.
COSC Policies, Course Policies, Academic Support Services and Resources
Students are responsible for knowing all Charter Oak State College (COSC) institutional policies, course-specific policies, procedures, and available academic support services and resources. Please see COSC Policies for COSC institutional policies, and see also specific policies related to this course. See COSC Resources for information regarding available academic support services and resources.
