Upon completion of the course, students will have learned:
System development infrastructure
✓ Source code management with Git and GitHub
✓ Task management with ClickUp
✓ Documentation with Confluence, a free document system
Data crawling and extraction framework
✓ Extract information from web pages with Scrapy
Data storage system
✓ Choose the right storage architecture based on data characteristics
✓ Use data stores and document stores for various types of data
Massive data processing frameworks
✓ Install and deploy Hadoop and Spark
✓ Program big data processing logic with Hadoop and Spark
Data interface between modules
✓ Process JSON files
✓ Adopt GraphQL as the data interface
Other open-source big data tools
✓ Visualise data results with D3.js
✓ Monitor online data with Prometheus