The late 2000s saw exponential growth in the number of people owning a computing device with some form of internet connectivity. By 2017, this had resulted in enormous amounts of often heterogeneous and unstructured data being generated on a daily basis. “Big Data” thus became the norm, and traditional Data Warehouses became increasingly unable to accommodate this change.
Today, owning an Enterprise Data Warehouse (EDW) is a hallmark of any medium-to-large 21st-century firm that has successfully matured beyond infancy. Many firms strive to reach a stage where they can own an EDW and take full advantage of it.
Yet, as we continue to embrace the technological advances of today, our views on Data Warehouses also need revisiting: is it really necessary to own an EDW, or is it better to leapfrog to Data Lakes?
The emergence of Big Data is why Data Lakes are so important today. Traditional Data Warehouses are built to handle a limited amount of well-structured data. In a warehouse, large amounts of storage space are often expensive, and unstructured data is very difficult to store.
Data Lakes are built to store enormous amounts of data (structured, unstructured, homogeneous, or heterogeneous) at affordable prices. The question of “Why should I invest in a Data Lake?” is thus straightforward to answer: it’s future-proof.
As we continue the transition towards a data-centric world, it only makes sense to have the right infrastructure to handle large loads of inbound data. To understand whether your firm needs a Data Lake, first take a look at how the firm works. If, ten years down the line, you expect the data you handle to be large and diverse, then it may be best to invest in a Data Lake rather than a Data Warehouse.
In fact, there is a list of indicators that can help effectively identify whether your firm is positioned to require and take full advantage of a Data Lake. Keeping in mind the present and future, consider the following:
If you answered yes to more than one of the above points, then a Data Lake may be more suitable for your firm than a Data Warehouse.
Despite the advantages offered by Data Lakes, it isn’t reasonable to completely rule out EDWs as a data storage option. Regardless of their limitations, EDWs are still very good at what they do. To understand whether your firm needs an EDW more than a Data Lake, consider the following points:
If your requirements match the above points, it may be more suitable for your firm to install an EDW instead of a Data Lake.
Data Lakes offer many benefits. They are highly scalable and flexible, offer a large number of ways to query data, and eliminate data silos. Furthermore, the widespread adoption of Hadoop technology means that there are powerful ways to process and learn from data stored in lakes. Unless your firm is expected to handle only limited amounts of data, it may be best to skip a warehouse and leapfrog to a Data Lake, lest you be left behind.