Goal: Get familiar with the main Big Data and IoT platforms.
Course description: The course introduces the distributed and parallel architectures, operational mechanisms, applied technologies and cloud-based services of various IT platforms whose main aim is to serve Big Data and IoT (Internet of Things) application areas. In the first four topics, the course discusses the evolution and characteristics of Big Data solutions, including Hadoop, Spark, HANA and NoSQL databases (along with some related Platform-as-a-Service offerings), that are widely adopted in typical research and industrial environments. Topics 5 and 6 cover the theoretical and practical background of management and orchestration solutions (Ambari/Cloudbreak/Occopus) for cloud-based Big Data application areas. From Topic 7 onward, the focus shifts to IoT and the related back-ends for processing the ingested data, with further use cases from areas including medicine and agriculture. The theoretical background is extended with Lambda, Kappa and other approaches in Topic 8, and with more practical information from Amazon in Topic 10. By the end of the course, students are expected to have improved their problem-solving and model creation/architecture design skills in large-scale parallel and distributed computing by applying typical Big Data/IoT platform engineering approaches and methods, together with the most advanced Big Data/IoT platforms (from Microsoft, Amazon, Hortonworks, etc.), in a way appropriate for medical and other application areas. A special research seminar on “reference architectures” will be held in the 7th week.