Autonomous Data Platform

Meet the First Autonomous Data Platform

 

Leveraging big data is no longer a luxury. It’s necessary for survival.
The question is how, when the hurdles (complexity, scalability, speed, cost, reliability, expertise) are many.
The answer is the world’s first Autonomous Data Platform, where metadata talks back and gives back.

Make your data team superhuman

Have you been dreaming of use cases limited only by your imagination? You’ve come to the right place. Our Autonomous Data Platform self-manages and self-optimizes by sending Alerts, Insights and Recommendations (AIR) based on Cloud Agents connected to your data team’s specific data policies and preferences.

Alerts, Insights and Recommendations (AIR)

Using a combination of heuristics and machine learning, AIR provides actionable alerts, insights and recommendations to ensure:

  • Workload continuity
  • High performance
  • Low reliance on Cloud resources
  • Greater cost savings

By automating lower-level, repetitive tasks, AIR lets your engineering team be less reactive to problems and more focused on directing better business outcomes. With AIR, Qubole Data Service (QDS) constantly analyzes metadata about infrastructure (cluster, nodes, CPU, memory, disk), platforms (data models and compute engines) and applications (SQL, reporting, ETL, machine learning) so you can better understand performance, usage patterns and Cloud spend.

Cloud Agents

Cloud Agents perform actions the data team determines. These typically include:

  • Executing automated tasks, based on a policy or configuration
  • Bundling specific low-level features
  • Learning based on individual, company and system-wide behavior

Cloud Agents are valuable to a data team because they:

  • Minimize resources consumed
  • Reduce costs
  • Automate repetitive, low-level activities
  • Increase productivity
  • Reduce custom development

QDS offers the following Cloud Agents:

Workload Aware Auto-Scaling Agent

The Auto-scaling Agent augments the basic auto-scaling feature available in the Enterprise Edition with storage-based scaling and aggressive down-scaling.

The Workload Aware Auto-scaling Agent can reduce compute spend by as much as 33% over basic auto-scaling solutions available in the market today.

Workload Aware Auto-scaling offers the following capabilities:

HDFS-based

QDS continuously monitors the cluster’s HDFS storage to ensure it can support current jobs, and it will launch more nodes, if necessary.


EBS-based

When a cluster has sufficient compute resources but requires additional storage, the agent can dynamically add storage using EBS to avoid provisioning a new compute node.
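
The choice between adding a node and adding storage can be sketched roughly as follows. This is a minimal illustration, not QDS code: the `ClusterState` fields, the 80% storage threshold and the action names are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    hdfs_used_fraction: float  # share of HDFS capacity in use, 0.0 to 1.0
    pending_tasks: int         # tasks waiting for a free slot
    free_task_slots: int       # idle compute slots across the cluster


def choose_scaling_action(state: ClusterState, storage_threshold: float = 0.8) -> str:
    """Return 'add_node', 'add_ebs_volume' or 'none' (hypothetical action names)."""
    compute_pressure = state.pending_tasks > state.free_task_slots
    storage_pressure = state.hdfs_used_fraction >= storage_threshold
    if compute_pressure:
        # A new node brings both compute and storage.
        return "add_node"
    if storage_pressure:
        # Compute is sufficient, so grow storage only.
        return "add_ebs_volume"
    return "none"
```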

Aggressive Downscaling

Aggressive downscaling is triggered when you reduce the maximum size of a cluster while it’s running. To save costs, QDS terminates nodes that are closest to completing their tasks and closest to their billing boundary.
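
One way to picture the node selection is a simple ranking. The sketch below is illustrative only: the `Node` fields are hypothetical, and ordering by remaining work first with the billing boundary as a tie-breaker is an assumption, since the exact weighting QDS uses is not described here.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    minutes_of_work_left: float         # estimated time until running tasks finish
    minutes_to_billing_boundary: float  # time until the next billed period starts


def pick_nodes_to_terminate(nodes, new_max_size):
    """Return names of the nodes to remove so the cluster fits new_max_size."""
    excess = max(0, len(nodes) - new_max_size)
    # Prefer nodes that are nearly done, then nodes nearest their billing boundary.
    ranked = sorted(
        nodes,
        key=lambda n: (n.minutes_of_work_left, n.minutes_to_billing_boundary),
    )
    return [n.name for n in ranked[:excess]]
```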

Offloading

Your mappers may sit idle while they wait for reducers to finish their jobs. Offloading conserves compute resources by saving mapper data to HDFS or object storage.

Qubole auto-scaling advantage

The comparison below sets Qubole performance and cost against two fixed-cluster scenarios under typical fluctuating load conditions.

Scenario 1: Ten-node fixed cluster

13% faster than QDS, but 32% more expensive

Qubole Data Service auto-scaling cluster:

Automatically optimizes performance and cost in response to elastic demand

Scenario 2: Five-node fixed cluster

10% cheaper than QDS, but 90% slower

Learn how much Workload Aware Auto-scaling can save you from our benchmarking analysis.

Spot Shopper Agent

The Spot Shopper Agent ‘shops’ for the best combination of price and performance, based on the policy you provide. It does this by shopping across different instance types, by dynamically rebalancing Spot and On-Demand nodes, and by considering different Availability Zones.

The Spot Shopper Agent can reduce compute spend by as much as 50% over solutions that rely exclusively on On-Demand instances.

Spot Shopper offers the following capabilities:

Heterogeneous Clusters

With Heterogeneous Clusters, the slave nodes that make up the cluster can be of different instance types. Heterogeneity in Spot nodes is highly beneficial because Spot prices can change rapidly, and Spot Shopper can make the lowest-cost purchasing decision in real time.

Availability Zone selection

Unless you specify a particular AZ when you configure the cluster, Qubole can automatically select the AZ with the lowest Spot prices for the region and instance type you’ve specified.
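
In spirit, the selection amounts to a lowest-price lookup. The sketch below assumes a hypothetical `spot_prices` mapping of Availability Zone to per-instance-type price; it does not call any AWS or Qubole API, and the function name is invented for the example.

```python
def cheapest_az(spot_prices, instance_type, pinned_az=None):
    """Pick the AZ with the lowest Spot price for instance_type.

    spot_prices maps AZ name -> {instance type -> price per hour}
    (a hypothetical structure for this sketch).
    """
    if pinned_az is not None:
        # An AZ chosen in the cluster configuration always wins.
        return pinned_az
    candidates = {
        az: prices[instance_type]
        for az, prices in spot_prices.items()
        if instance_type in prices
    }
    if not candidates:
        raise ValueError(f"no Spot price known for {instance_type}")
    return min(candidates, key=candidates.get)
```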

Spot Rebalancing

Fluctuations in the market may mean that QDS cannot always obtain as many Spot instances as your cluster specification calls for. When that happens, the Spot Shopper Agent automatically rebalances the cluster once prices drop, swapping out On-Demand nodes for Spot nodes so that you continue to get the lowest prices possible.
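
The size of a rebalancing step can be estimated with a little arithmetic. The sketch below is an assumption-laden illustration: `target_spot_fraction` and `spot_available` are hypothetical inputs standing in for your cluster specification and current market availability.

```python
def plan_rebalance(on_demand_nodes, spot_nodes, target_spot_fraction, spot_available):
    """Return how many On-Demand nodes to swap for Spot nodes."""
    total = on_demand_nodes + spot_nodes
    desired_spot = int(total * target_spot_fraction)
    shortfall = max(0, desired_spot - spot_nodes)
    # Never swap more nodes than the market offers or the cluster holds.
    return min(shortfall, spot_available, on_demand_nodes)
```

For example, a ten-node cluster that should be half Spot but currently runs eight On-Demand and two Spot nodes would swap three nodes if at least three Spot instances are available.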

Placement Policy

The Placement Policy option enables QDS to make a best effort to store one replica of each HDFS block on a stable node. This prevents job failures that could occur if all replicas were lost as a result of AWS reclaiming many Spot instances at once.
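
The idea can be sketched in a few lines. This is not how HDFS or QDS implements placement: the function name, the node lists and the default replication factor of 3 are assumptions chosen to show the best-effort rule.

```python
def place_replicas(stable_nodes, spot_nodes, replication=3):
    """Choose nodes for one block's replicas, best effort one on a stable node."""
    placement = []
    if stable_nodes:
        # Anchor the first replica on a node that will not be reclaimed.
        placement.append(stable_nodes[0])
    # Fill the remaining replicas from Spot nodes first, then other stable nodes.
    remaining = [n for n in spot_nodes + stable_nodes if n not in placement]
    placement.extend(remaining[: replication - len(placement)])
    return placement
```

If no stable node exists, the sketch still places every replica, which mirrors the "best effort" wording above.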

Data Caching Agent

The Data Caching Agent automates the movement of data for performance optimization.

Caching from Object Store

Data Caching automatically determines the right set of data to cache in the cluster so that interactive, ad-hoc queries run faster and don’t need to retrieve data for each query.

Caching of Index Metadata

Data Caching makes optimal use of ORC, Parquet and Avro data formats by minimizing the amount of data that’s read when selecting only specific columns.