Qubole Enhances Spark Performance with Dynamic Filtering, a SQL Join Optimization
SQL join operators are ubiquitous. Users performing any ETL or interactive query like “show me all the people in Bangalore under age 30 who took…
With the increased usage of public cloud storage, intelligent management of frequently accessed data has become more important. For interactive queries, reading the same data…
This post is a guest publication written by Sean Downes, the Senior Data Scientist at Expedia Group. Data Science Implementation How Expedia Group Came to…
What Is Apache Airflow? Apache Airflow is an open-source tool to programmatically author, schedule, and monitor data workflows. With Airflow, users can author workflows as…
Extract, Transform, Load (ETL) workloads are an important use case of Apache Spark for Qubole customers. In particular, the performance of INSERT INTO / OVERWRITE…
Recently AWS announced support for AMD-powered instances. In this post, we compare the AMD and Intel instance types on cost and performance using common big…
As part of our commitment to the security and success of our customers, I’m excited to announce that Qubole has completed its ISO 27001 certification…
AWS recently announced Managed Streaming for Kafka (MSK) at AWS re:Invent 2018. Apache Kafka is one of the most popular open source streaming message queues.…
We are happy to announce the availability of sparklens.qubole.com, a reporting service built on top of Sparklens. This service was built to lower the pain…
When implementing a big data infrastructure in the cloud, companies face a wide range of technical and non-technical challenges. To help our customers to…
Today marks an important day for the whole Qubole community—customers, partners, and employees—as we update our environments with the latest major product release “R54.” In…
Machine Learning As a consumer of goods and services, you experience the results of Machine Learning (ML) whenever the institutions you rely on use ML…