Apache Spark Benchmark for Autoscaling: Qubole versus Competition
This blog covers new benchmark tests to better understand the Autoscaling behavior of concurrent Apache Spark applications. We believe that this will help in advancing…
Guest authors: Jerry Xu, Co-founder and CEO, Datatron; Lekhni Randive, Product Manager, Datatron. Qubole author: Jorge Villamariona, Sr. Product Marketing Manager, Qubole. In today’s world,…
The sixth release of Apache Sqoop, 1.4.7, is out! This is one of the most significant updates to the Sqoop platform. We give you…
Data scientists use Notebooks for data exploration, interactive data analytics, machine learning, and collaboration. Once set up, a Notebook provides a convenient way to save,…
Introduction: Presto can access S3 buckets using one of the following options: IAM roles provided in the configuration; access key/secret key provided in the configuration; credentials fetched…
Introduction: Qubole provides powerful automation that optimizes underlying cloud compute management for data lakes. Qubole cluster management continuously optimizes both performance and cost by lowering…
Introducing Qubole Support: Qubole processes over 250 petabytes of data in a month, and the diversity of data we process, cloud platforms we run on,…
Introduction: Enterprises today are becoming more data-driven, as their data fuels the innovation engine they use to build new products, outmaneuver the competition and…
Each month, about an exabyte of data is processed using Qubole’s data platform on Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure,…
This post is a guest publication written by Saba El-Hilo, a Senior Data Engineer at Mapbox. A version of this post first appeared as a…
Introduction: ETL workloads form a major component of big data processing at any data-driven organization – from SMBs to enterprises, and ETL data pipelines at…
Introduction: In an earlier blog post, we presented a secure, multi-tenant, reliable, and scalable service that provides access to logs and history for MRv2 applications.…