Sep 28, 2024  
2017-2018 Graduate Catalog [ARCHIVED CATALOG]

DATA 603 - Platforms for Big Data Processing

[3]
This course introduces methods, technologies, and computing platforms for performing data analysis at scale. Topics include the theory and techniques for data acquisition, cleansing, aggregation, and processing; the management of large heterogeneous data collections; and information and knowledge extraction. Students are introduced to MapReduce, streaming, and external-memory algorithms and their implementations using Hadoop and its ecosystem (HBase, Hive, Pig, and Spark). Students will gain practical experience in analyzing large existing databases.
Prerequisite: DATA 601
Components: Lecture
Grading Method: R