Jun 15, 2025  
2020-2021 Graduate Catalog [ARCHIVED CATALOG]

DATA 603 - Platforms for Big Data Processing

[3]
This course introduces methods, technologies, and computing platforms for performing data analysis at scale. Topics include the theory and techniques for data acquisition, cleansing, aggregation, management of large heterogeneous data collections, processing, and information and knowledge extraction. Students are introduced to MapReduce, streaming, and external-memory algorithms and their implementations using Hadoop and its ecosystem (HBase, Hive, Pig, and Spark). Students will gain practical experience in analyzing large existing databases.
Course ID: 102550
Components: Lecture
Grading Method: R


