Big Data, Petabytes, Exabytes, IoT, NoSQL, Hadoop, MongoDB; one could go on and on. It is as if a new term is spawned and added to tech jargon every day. So before we get to the subject, let’s define the terms Big Data and NoSQL, which are of particular interest to us today.
According to TechTarget, Big Data is an evolving term that describes any voluminous (terabytes and above being the most simplistic definition) amount of structured, semi-structured and unstructured data that has the potential to be mined for information. Big Data can be characterized by the volume, variety and velocity of data, also known as the 3Vs. Some people also use a fourth V, veracity, which is about the quality of data.
NoSQL refers to a class of non-relational databases. Many of them treat each item as a document or an artifact with a unique key, rather than as rows and columns the way relational databases do. This provides a much more flexible approach to storing data than a relational database. While relational databases are ideal for storing structured data, NoSQL provides an unstructured or "semi-structured" approach that is well suited to capturing and storing data from sources such as audio and video files, tweets or text. Compared with relational databases, NoSQL databases are generally designed to scale out more easily across many servers.
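To make the contrast concrete, here is a minimal sketch in plain Python. It does not use any real database; the `documents` store, the `put` and `get` helpers, and the key names such as `tweet:1` are all hypothetical and exist only for illustration. It shows rows that must fit a fixed set of columns next to documents that are stored under a unique key and can each carry different fields.

```python
import json

# Relational style: every row must fit the same fixed set of columns.
COLUMNS = ("id", "author", "text")
rows = [
    (1, "alice", "Hello world"),
    (2, "bob", "Big Data is everywhere"),
]

# Document style: each item lives under a unique key and may carry its
# own set of fields (hypothetical store and key names, illustration only).
documents = {}

def put(key, doc):
    """Store a document under a unique key, serialized as JSON."""
    if key in documents:
        raise KeyError(f"duplicate key: {key}")
    documents[key] = json.dumps(doc)

def get(key):
    """Fetch a document by its key and decode it."""
    return json.loads(documents[key])

# Two items with different shapes sit side by side without a schema change.
put("tweet:1", {"author": "alice", "text": "Hello world"})
put("video:7", {"author": "bob", "duration_s": 42, "tags": ["iot", "demo"]})

print(get("video:7")["tags"])
```

Adding a new kind of item to the document store needs no change to existing entries, whereas adding a field to the row layout would mean altering `COLUMNS` and every row. Real NoSQL products differ widely in how they implement this idea.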
NoSQL is a response to the rapid growth in usage of Facebook, WhatsApp, Twitter and Instagram, as well as blogs and other text sources, which is leading to the creation of huge quantities of unstructured and semi-structured data. While adoption rates for NoSQL database technologies and their deployment at the enterprise level are relatively modest at present, they are becoming a crucial part of Big Data strategies.
With the unprecedented growth of data, the need for real-time processing is becoming mission-critical, and conventional relational databases are becoming a bottleneck. This is the void that NoSQL-based applications are stepping in to fill. The Internet of Things (IoT), where millions of connected devices send data every second to central servers, is also increasingly creating the need for a shift to non-relational databases.
Having said that, the death knell for relational databases has not been sounded as yet. While the adoption of NoSQL technologies is growing by leaps and bounds, relational databases have their strengths in terms of transactional data, consistency and application of rules in a batch mode. So, while real time data manipulation can be done using NoSQL technologies, relational databases will continue to be used for longer term analysis. Some experts are also predicting a convergence of both technologies into a hybrid ecosystem.
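The relational strengths mentioned above, transactions and enforced rules, can be sketched with Python's built-in `sqlite3` module. The `accounts` table, the non-negative balance rule, and the `transfer` and `balances` helpers are assumptions made up for this illustration and do not describe any particular enterprise system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint is a rule the database itself enforces.
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts (id, balance) VALUES (?, ?)", [(1, 100), (2, 50)]
)
conn.commit()

def transfer(src, dst, amount):
    """Move funds atomically; return False if a rule is violated."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    """Return the current balances as an {id: balance} mapping."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))

print(transfer(1, 2, 500), balances())  # overdraft is rejected, nothing changes
print(transfer(1, 2, 30), balances())   # valid transfer is applied as a whole
```

In the rejected transfer the credit to the second account has already been executed when the debit fails, yet the rollback undoes it, so the data never ends up half-updated. That all-or-nothing behaviour is the kind of consistency guarantee the paragraph above refers to.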
In the short term, combining the two technologies seems to be a logical choice for enterprises, especially given their focus on data governance and security, along with the ability to scale and support large volumes of transactions.