
Hadoop has gained significant momentum in the past few years as a platform for scalable data storage and processing. More and more organizations are adopting Hadoop as an integral part of their data strategy. As the concepts of data lakes and the democratization of data become more popular, more users are gaining access to data that was previously available only to a select few.
In the absence of a robust security model, this approach poses a risk of sensitive data falling into the hands of an unintended audience. In addition, government regulations and compliance requirements such as HIPAA and rules governing PII make it increasingly important for organizations to implement a meaningful security model around the data lake. Security and governance are the topmost concerns for anyone building a data lake today. The problem is all the more complex because multiple components within Hadoop access the data, and there are multiple security mechanisms to choose from.

Apache Sentry addresses most of these problems by providing secure, role-based authorization. Sentry offers fine-grained authorization, a key requirement for any authorization service: privileges can be scoped to servers, databases, tables, views, indexes and collections. Sentry also supports multi-tenant administration, with separate policies for each database/schema that can be maintained by separate administrators. To use Sentry, one needs CDH 4.3.0 or later and a secured HiveServer2 with strong authentication (Kerberos or LDAP).
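To illustrate the multi-tenant model described above, the sketch below shows a Sentry policy file in the format used by Sentry's file-based policy provider. The database name, group name, role name and HDFS path are hypothetical placeholders; the section names (`[databases]`, `[groups]`, `[roles]`) and the privilege string syntax follow Sentry's documented policy-file format.

```ini
# Hypothetical global Sentry policy file (e.g. sentry-provider.ini).

[databases]
# Delegate the policy for the "sales" database to its own file,
# which can be maintained by that database's own administrators
# (Sentry's multi-tenant administration model).
sales = hdfs://namenode:8020/sentry/sales-policy.ini

[groups]
# Map an OS/LDAP group to one or more Sentry roles.
analysts = analyst_role

[roles]
# Fine-grained privilege: read-only access to a single table.
analyst_role = server=server1->db=sales->table=orders->action=select
```

Here the `analyst_role` grants SELECT on just one table, showing how granularity can be narrowed from server down to an individual object.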
In short, Apache Sentry makes it possible to store sensitive data in Hadoop securely, thereby extending Hadoop to more users and helping organizations comply with regulations. The fact that Sentry recently graduated from the Apache Incubator and is now an Apache Top-Level Project says a lot about its ability to provide robust authorization for the Hadoop ecosystem.