Big Data - get started with Hadoop and Spark - København

The amount of data that many organisations need to process is increasing. This course introduces the technologies used to build systems that scale to handle very large volumes of data.

Get more value out of your data:
Many organisations face growing volumes of data. Extracting value from all this data can be challenging, since traditional systems based on relational databases are not suitable for big data. This course is focused on the Hadoop and Spark ecosystems and will teach the skills needed to build modern big data applications.

The course will put emphasis on giving the participants practical hands-on experience with the tools. The technologies introduced as part of the course will be centered around the Hadoop/Spark open source ecosystem.

The course will be as vendor neutral as possible, but the practical hands-on exercises will be run on Google Cloud Platform. All the principles taught will be transferable to other cloud providers or to on-premises solutions, and differences between providers will be highlighted where relevant.

Your benefit:
- Participants will be introduced to the fast-evolving world of big data technologies.
- Participants will have insight into the differences between traditional database systems and modern distributed systems.
- Participants will gain hands-on experience with storing, processing and serving large amounts of data.
- Participants will understand the main challenges of designing systems for big data.
- Participants will be introduced to the programming style used in big data.

Your company's benefit:
- An employee with a good understanding of when big data technologies are the right choice for solving a specific problem.
- An employee with the ability to get started with setting up big data infrastructure.
- An employee able to carry out analysis of large datasets.

Course agenda:
The two-day course will be instructor led with hands-on exercises. The focus will be on giving the participants the knowledge and the confidence to get started with modern distributed data processing. The technologies that we will work with include Apache Hadoop, Apache Spark and HBase/Bigtable.

The course will be taught in Danish, unless there are non-Danish speakers among the participants, in which case the language will be English.

Day 1:
- Basic Big Data Concepts
- Introduction to Hadoop and Spark
- Getting value from unstructured data
- Storing, loading and processing data
- Notebooks - interactive environments for analysing data
- Distributed processing of data - counting and aggregating
- SparkSQL - from unstructured to structured data

Day 2:
- Large scale machine learning
- Big data architecture
- Data lakes
- Indexing data
- Streaming data

Before the course:
There is no preparation before the course.

During the course:
The course runs from 9 am to 4 pm on both days. IDA provides breakfast, lunch and afternoon cake.

After the course:
After the course, participants will receive a course certificate.

Target group and prerequisites:
Course participants should have prior experience with writing code for data analysis, but no prior knowledge of Hadoop or Spark is required. In this instructor-led course, participants will go through hands-on sessions with planned exercises. The exercises will be in the Python programming language, but the focus will be on the overall concepts, not the particulars of the language, so participants with programming experience from other languages do not need deep knowledge of Python. Hadoop and Spark readily support Scala, Java and Python.

Participants are expected to bring their own laptop to the class; everything else needed for the course is provided.

Instructor:
Andreas Koch has a background in data science and has for the past 6 years been architecting and developing data-intensive applications based on Hadoop and Spark. Currently, Andreas is working as a data science consultant, using his extensive experience with data processing systems to advise organisations on how to best leverage their data.
The training language and the study material will be in English at this course.

Information
  • When

    From: 12. dec. 2019 - 09:00 To: 13. dec. 2019 - 16:00
  • Where

    IDA Conference, Kalvebod Brygge 31-33, 1530 København V

  • Registration Deadline

    10. dec. 2019 - 23:59

  • Organizer

    IDA Learning

  • Available Seats

    12

  • Event Number

    330799