The evolution of Big Data, now a buzzword of the IT industry, dates back to the 1940s, when an American librarian, noting how quickly information was growing while storage stayed limited, predicted a shortage of shelf space. Since then, many projects have set out to study the flow of information and track its volume. Over time, with the advent of IT, the Internet and globalisation, the volume of data and information generated has increased at an exponential rate.

In today’s world, data is generated everywhere. Whenever a person tweets, writes a post on Facebook or makes a payment with a credit card, data is generated. This data varies in form, from numbers to images, videos or text. This huge amount of data is termed Big Data. It consists of large data sets that satisfy the 4 V’s: Volume, Velocity, Variety and Veracity, meaning the amount of data generated, the rate at which it is generated, the types of data generated and the uncertainty of that data.

But the bigger question is: why do we need to collect data in the first place? The answer is, to increase productivity. Take a simple scenario: a sale at a nearby store during a festival. During the initial period of the sale, it is observed that people appreciate and buy more of a particular kind of product. Once that stock runs out, what would the owner do? Wait for the next stock, and suffer a business loss in the meantime! But if the owner collects customer reviews of the various products, in this case covering “price”, “material” and “value”, the store can predict which types of products customers like most and bring in a bigger stock of those items next time, thus increasing its profit! What the collected data is doing here is improving the functioning of the store.

The above example is just one of the fields that has benefited from data collection.
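
To make the store example concrete, here is a minimal sketch in Python of how collected reviews could point to the products worth restocking. The product names, the 1 to 5 scoring scale and the `rank_products` helper are all assumptions made for illustration; a real store would have its own data and its own way of scoring.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical review records: (product, aspect, score from 1 to 5).
reviews = [
    ("lamp", "price", 5),
    ("lamp", "material", 4),
    ("lamp", "value", 5),
    ("candle", "price", 3),
    ("candle", "material", 2),
    ("candle", "value", 3),
    ("lantern", "price", 4),
    ("lantern", "material", 4),
    ("lantern", "value", 3),
]


def rank_products(records):
    """Return product names sorted by average review score, best first."""
    scores = defaultdict(list)
    for product, _aspect, score in records:
        scores[product].append(score)
    averages = {product: mean(values) for product, values in scores.items()}
    return sorted(averages, key=averages.get, reverse=True)


ranking = rank_products(reviews)
print(ranking)  # products the owner might stock more of come first
```

Averaging every aspect into one score is the simplest possible choice. An owner who cares more about “price” than “material” could weight the aspects differently.
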
In the real world, big data has applications in sectors such as transportation, education, travel, government, health care, telecom and the goods industry.

But is it sufficient to simply collect data? NO! Simply collecting massive amounts of data is a waste of time, effort and storage space. For example, in the above case, if a product development company wants to improve, what would its R&D team do with millions of unstructured customer reviews? This data needs to be sorted, organised and analysed, and its usefulness needs to be determined. So, in this case, the analyst team looks for terms like “I wish”, “I hope” and “they should” in the reviews, and then delivers the customer insight that helps in product development. What we are doing here is extracting relevant data from a large set of unstructured data.

Now, you must be wondering what this term “unstructured data” means! Big data comprises:

1. Structured data
2. Unstructured data
3. Semi-structured data

Structured data is data that follows a definite, repeating pattern, which makes it easier to sort, read and process. Examples include relational databases and flat files organised as records. Unstructured data is data that may or may not have any logical or repeating pattern, like the information present on social media. No pattern is followed when writing posts on Facebook, but when analysed at scale, those posts can reveal trends, important issues in society or simply the behavioural patterns of a person. The third form, semi-structured data, refers to data that does not follow a rigid schema but contains tags or markup elements that separate elements and create hierarchies of records and fields, as in XML documents.

This huge amount of data is generated in many places, such as social media platforms, cell phone GPS signals, digital media and purchase transaction records. These sources can be broadly classified into:

1. Internal sources
2. External sources

Internal sources typically produce structured data that originates within the enterprise and supports the daily business operations of the organization. Examples include product and sales data, OLTP and operational data. External sources, in contrast, mostly produce unstructured data that originates in the environment outside the organization, such as the Internet, government or market research organizations. They help the organization understand external entities such as customers, competitors and the market.

So, the next time you write a review or post a comment, remember that you are generating data, and that the information age evidently runs on big data.
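
As a small illustration of the review-mining step described earlier (searching reviews for terms like “I wish”, “I hope” and “they should”), here is a minimal Python sketch. The sample reviews and the `find_suggestions` helper are made up for this example; a real analyst team would likely use a much richer set of phrases and tools.

```python
import re

# The three phrases named in the article, matched case-insensitively.
WISH_PATTERN = re.compile(r"\b(i wish|i hope|they should)\b", re.IGNORECASE)

# Hypothetical unstructured customer reviews.
reviews = [
    "Great lamp, works as described.",
    "I wish the cord were longer.",
    "Nice colour. They should offer a bigger size.",
    "Arrived on time.",
    "I hope the next version has a dimmer.",
]


def find_suggestions(texts):
    """Return (matched phrase, review) pairs for reviews that contain a wish phrase."""
    results = []
    for text in texts:
        match = WISH_PATTERN.search(text)
        if match:
            results.append((match.group(1).lower(), text))
    return results


suggestions = find_suggestions(reviews)
for phrase, text in suggestions:
    print(f"{phrase!r}: {text}")
```

Only the reviews that carry a suggestion survive the filter, which is exactly the idea of pulling relevant data out of a large unstructured set.
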