Qubole co-founders Ashish Thusoo and Joydeep Sen Sarma were recently awarded the SIGMOD Software Systems Award for developing a seminal software system, Apache Hive, that brought relational-style declarative programming to the Hadoop ecosystem.
A decade ago, while at Facebook, we conceived the idea of Apache Hive (Hive), an SQL-like interface that sits atop Hadoop for querying data. Turning this project into reality required immense contributions from a talented team with a passion for the idea. We would be remiss not to mention the prolific Zheng Shao and the always dependable Namit Jain, as well as Ning Zhang, Prasad Chakka, Raghotham Murthy, Hao Liu, Suresh Anthony, and many others at Facebook who formed the initial team. Without the contributions of these individuals, Hive would never have seen the light of day. We also extend thanks to Alan Gates, Ashutosh Chauhan, Gunther Hagleitner, and many other active mentors for carrying the baton and continuing to innovate in the project.
It’s true that some of the simplest ideas can also be the most powerful ones. The Hive project emerged from a simple idea to make the scalable and fault-tolerant Apache Hadoop environment accessible to SQL users.
We based the project on the observation that SQL was the lingua franca of data and that, if made available on Hadoop, it could give SQL users a more familiar way to access the disruptive power of Hadoop across different use cases. At Facebook, we found a nurturing environment that supported this idea even though it was a contrarian point of view at the time, and that support led to the creation of Apache Hive.
We are amazed at the impact of Apache Hive on so many industries. Hive has brought big data to the mainstream and helped spawn numerous ideas that are now the underpinnings of data lake architectures. It has also created disruptions in the traditional data warehousing and large-scale data processing fields.
We consider ourselves very lucky to have been at the right place at the right time, with the right insight to start the Apache Hive journey and become an integral part of that journey.
The success of this project underscores the importance of open source in our industry. Open-source community ownership and contribution are a significant reason for the success of the project, which has more than 750 contributors today. As a new generation of contributors continues to evolve this technology, we are confident that the importance and relevance of Apache Hive have been solidified in this ever-changing data landscape.
We could not have asked for more when we started the project.
It is indeed an honor to receive the SIGMOD Systems Award for Apache Hive, and we accept it on behalf of the community.