Welcome to Getting Started with Apache Spark for Advanced Analytics, a self-paced guide to the Spark analytics engine using Qubole! Apache Spark is an analytics engine for large-scale data processing, with dedicated libraries for machine learning (ML) and data science.
Course Overview
The Qubole Notebooks interface will be our workbench for this Spark course, letting us work through a wide variety of examples and use cases. Each step dives deeper into the capabilities of Spark, from visualization and data preparation to ML model training and deployment.
What is Apache Spark?
Apache Spark is well suited to big data operations in the cloud, where storage can be separated from compute to keep big data workloads affordable. The technology gives teams quick, efficient access to structured, semi-structured, and unstructured data, along with a variety of interfaces for exploring data, collaborating, and working in a centralized environment. Qubole is the perfect tool for taking your big data prototypes to production, and beyond.
About Qubole Test Drive
Throughout this course, we will demonstrate the capabilities of Spark on Qubole using the Notebooks user interface. Qubole provides immediate access to Spark clusters, without the need for manual configuration. Sign up for your free Test Drive account to get instant access to Data Science and Analytics big data examples today!