May 05, 2024  
2022-2023 SGPP Catalog and Handbook [ARCHIVED CATALOG]

DIGA620 Data Engineering (3 cr.)


This course covers the data processing requirements necessary to implement technology-based analytics. It explores the strengths and limitations of various data formats to support better decision-making. The importance of structured and unstructured data formats is covered, as are methods of data extraction, transformation, and loading. Data wrangling methodologies, including cleaning, filtering, standardizing, and categorizing data, are used to construct custom data pipelines that support efficient analysis. Processes to review data for accuracy, consistency, and completeness are covered, as well as techniques to mitigate error and improve data integrity. The course also investigates legal and ethical considerations of data management.

Upon completion of the course, students are expected to be able to do the following:

  1. Perform extract, transform, and load (ETL) processes using structured and unstructured data formats.
  2. Assess data for error and implement techniques to improve data integrity.
  3. Determine appropriate data formats for given situations.
  4. Design and document processes for converting raw data into a product suitable for analysis.
  5. Identify legal and ethical issues related to the processing and dissemination of data.
