Qubole Test Drive
Try the Qubole Platform today. Get hands-on experience with Spark, Presto, Hive, and more.
Open Source Tools
GitHub repository of Qubole open source project contributions and tools.
Engineer Blog
Learn about new developments, best practices, use cases, and more from Qubole engineers and users.
The Qubole Data Service is built for the cloud, with services available on AWS, Azure, and Oracle Cloud.
No need to manage clusters. Get instant access to Hadoop, Hive, Spark, Presto, and more at the push of a query.
Security for the cloud. Qubole embraces different cloud infrastructures with enterprise compliance attestations (HIPAA, PCI, SOC 2).
From Data Science to Engineering. Quickly visualize unstructured data, build data pipelines, or train and productionize ML algorithms.
Common user interfaces for developing with Hadoop, Spark, Hive, and Presto, providing each data team with self-service access to the data lake.
Integrate with technologies from the entire Big Data ecosystem (Apache Kafka, Ranger, HBase, Arrow, H2O, Superset, and many more).
Built-in scheduler to easily build and manage production data pipelines.
Build complex end-to-end pipelines easily with Airflow and the Qubole Operator.
Qubole offers a full set of REST application programming interfaces (APIs) to manage all platform functions, from infrastructure to user management.
Qubole command APIs let you directly submit Hive, Spark, and Presto commands and retrieve their results.
Metastore caching for quick discoverability of your data lake, with secure encryption at rest.
Shared metadata caching to reduce resource inefficiency and improve performance when multiple users are querying.
Engine-level caching with Rubix, an open-source technology developed by Qubole, for improving the performance of Presto and Spark workloads.
Automation built for the cloud. Qubole focuses on separating storage from compute to enable dynamic scalability.
Big data clusters built with workload-aware auto-scaling, aggressive downscaling, and optimizations to leverage AWS Spot Instances.
Built for petabyte-scale with cloud computing. Save and contain costs as you scale workloads, without manual intervention or tuning.
Qubole Hive Metastore allows you to easily create tables and query structured and unstructured data in seconds.
Run federated queries across multiple data sources (NoSQL databases, Data Warehouses, and more) with Qubole Presto.
Use your favorite interface with Qubole SQL engines, whether it is Analyze Workbench, Notebooks, or your favorite BI tool.
Build. Use your favorite Data Science workbench and tools (RStudio, Jupyter, SageMaker, H2O, and more) to explore and develop new models.
Train. Fast, self-service access to compute allows for rapid model training, making the selection of the right ML model a quick and iterative process.
Deploy. Whether running batch or real-time ML operations, Qubole is built to scale up to petabytes of data and manage production pipelines.
Qubole improves query performance at runtime with Join Ordering and Dynamic Filtering optimizations for Spark and Presto.
Proactively tune Spark workloads with SparkLens or optimize tables and queries with recommendations from Qubole AIR.
Live stats collection on table performance for optimizing production workloads and datasets.
Query any file format (JSON, Avro, Parquet, ORC, etc.) with any engine. Qubole allows self-service access to analyze cloud storage.
Integrate with your Data Warehouses, RDS, or Data Marts to enable read/write access to Qubole engines.
Big data engines (Hadoop, Spark, Presto) built for faster query performance with cloud object stores.
Free access to Qubole for 30 days to build data pipelines, bring machine learning to production, and analyze any data type from any data source.
See what our Open Data Lake Platform can do for you in 35 minutes.