Microsoft recently announced its plans for Hadoop, and they include MapReduce, which forms an integral part of Apache Hadoop. Before we dig deep into that, I would like to give you an overview of Hadoop and Big Data.

Hadoop is an elastic, distributed, schema-less data processing platform that is ideal for scenarios where you have a huge volume of data with low per-record value. Typical examples are Twitter and Facebook, which produce a huge volume of data that cannot be fitted into a single schema and that arrives in different formats such as JSON, XML and images. Hadoop is a good fit for parsing and processing this kind of varied data.

Hadoop is more of an exploratory platform that lets you get your hands dirty, because it provides massive scalability that is hard to achieve with relational databases. Hadoop is implemented as a set of interrelated project components. The core components are MapReduce (a programming model popularized by Google), which is used as the job processor, and, most importantly, HDFS (Hadoop Distributed File System), which is the storage layer. HDFS plays a role similar to the one NTFS plays on a single Windows machine, except that it spreads files across many machines.
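To make the MapReduce idea a little more concrete, here is a minimal sketch in plain Python that counts words in a handful of text records. It does not use Hadoop at all; the function names (map_phase, shuffle, reduce_phase, run_job) are made up for illustration, and everything runs in a single process, whereas a real job would spread the map and reduce work across many machines.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as a framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse all the values for one key into a single total."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence over an in-memory list of records."""
    mapped = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(run_job(["big data", "Big Hadoop"]))
```

The point of the split is that each call to map_phase looks at only one record and each call to reduce_phase looks at only one key, so in principle both steps can be run in parallel on different machines.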

A name node server keeps track of the data nodes in the environment. The data itself is stored on the data nodes, and the name node together with its data nodes makes up a cluster. If you are familiar with the term cluster from RDBMS implementations, please note that there is not necessarily any shared storage or other resources between the nodes. A Hadoop cluster is purely logical.
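The relationship between the name node and the data nodes can be pictured with a small toy model in Python. This is only a sketch under stated assumptions: the class and method names (NameNode, register, place_block, locate) are hypothetical, and the round-robin placement is a simplification chosen for readability, not the placement policy that HDFS actually uses. The name node here holds only metadata, that is, which node stores which block, while the data itself would live on the data nodes.

```python
class NameNode:
    """Toy metadata server: remembers which data nodes hold which blocks."""

    def __init__(self, replication=3):
        self.replication = replication   # how many copies of each block to keep
        self.data_nodes = []             # names of registered data nodes
        self.block_locations = {}        # block id -> list of data node names

    def register(self, node_name):
        """Add a data node to the list of nodes this name node tracks."""
        if node_name not in self.data_nodes:
            self.data_nodes.append(node_name)

    def place_block(self, block_id):
        """Pick data nodes for a new block (simple round-robin, an assumption)."""
        if not self.data_nodes:
            raise RuntimeError("no data nodes registered")
        copies = min(self.replication, len(self.data_nodes))
        start = len(self.block_locations) % len(self.data_nodes)
        chosen = [
            self.data_nodes[(start + i) % len(self.data_nodes)]
            for i in range(copies)
        ]
        self.block_locations[block_id] = chosen
        return chosen

    def locate(self, block_id):
        """Return the data nodes that hold a block, or an empty list if unknown."""
        return self.block_locations.get(block_id, [])


if __name__ == "__main__":
    name_node = NameNode(replication=2)
    for name in ("dn1", "dn2", "dn3"):
        name_node.register(name)
    print(name_node.place_block("block-0"))
    print(name_node.locate("block-0"))
```

Notice that nothing is shared between the data nodes in this model; the only thing tying them together is the bookkeeping held by the name node.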

This post gives just a basic insight into Hadoop. In our next post we will look at the 4 Vs (Velocity, Volume, Variety, Variability) that govern big data systems, and also at the basic architecture of big data systems.

Kindly let me know your suggestions and thoughts.

About the Author – The author is a Technical Architect at Congruent, a Microsoft Gold Certified Partner specializing in Software Application Development.

Tags: Hadoop