Big data is the term for the very large volumes of data, both structured and unstructured, that flood a business on a day-to-day basis. But it is not the amount of data that is important; what matters is what organizations do with that data.
Characteristics of Big Data:-
The three defining characteristics of big data are also known as the 3Vs: volume, variety and velocity. Volume refers to the amount of data, variety refers to the types of data, and velocity refers to the speed at which data is generated and processed.
According to the 3Vs model, the challenges of big data management result from the expansion of all three properties, rather than just the volume alone, i.e. the sheer amount of data to be managed.
1)Volume:- Volume is the sheer amount of data an organization handles. The volume by itself is not what makes the data useful; what matters is what the organization does with it. Many factors come into play when considering how to collect, store, retrieve and update the data sets that make up big data.
2)Velocity:- When you are processing this much data, you need high speed to deal with it; high throughput gives you good performance. No one has time to watch an hourglass flip in this day and age of high-performance, always-on technology, so velocity is one of the most important factors in big data.
3)Variety:- Variety refers to the different types of data elements involved, and in particular it distinguishes structured data from unstructured data. Handling this variety well helps an organization make informed decisions. There are many different types of data, and each type requires different kinds of analysis and different tools.
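To illustrate the structured/unstructured distinction, here is a minimal Python sketch; the records, field names and values are made up for the example.

```python
import json

# Hypothetical records: the same kind of "order" event arriving as
# structured JSON and as unstructured free text.
structured = '{"order_id": 1001, "amount": 250.0}'
unstructured = "Customer placed order 1002, said it was urgent."

# Structured data parses directly into named fields...
record = json.loads(structured)
amount = record["amount"]

# ...while unstructured text needs a different tool, e.g. a keyword search.
is_urgent = "urgent" in unstructured.lower()

print(amount, is_urgent)
```

The structured record can be queried by field name, while the free-text record can only be mined with text-processing techniques, which is why each variety of data calls for its own tools.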
Frameworks Used For Big Data:-
1)Hadoop:- Hadoop is a Java-based programming framework that processes large amounts of data across clusters of thousands of nodes. Hadoop was inspired by Google's MapReduce, an application framework in which a job is broken down into many small parts that run in parallel. The Hadoop framework is used by companies such as Yahoo and IBM. Hadoop runs mainly on Linux and Windows, but it can also work on BSD and OS X.
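To make the MapReduce idea concrete, here is a small pure-Python sketch of the model Hadoop implements (this is an illustration of the concept, not Hadoop's actual API): a map step emits key/value pairs, a shuffle groups them by key, and a reduce step aggregates each group.

```python
from collections import defaultdict

def map_phase(document):
    # Map: emit a (word, 1) pair for every word in the input split.
    for word in document.split():
        yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle: group all emitted values by key, as the framework
    # does between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate the values for each key (here, sum the counts).
    return {word: sum(values) for word, values in groups.items()}

docs = ["big data is big", "data is everywhere"]
pairs = [pair for doc in docs for pair in map_phase(doc)]
counts = reduce_phase(shuffle(pairs))
print(counts)  # {'big': 2, 'data': 2, 'is': 2, 'everywhere': 1}
```

In a real Hadoop cluster the map and reduce tasks run on many nodes in parallel and the data lives in a distributed file system; the structure of the computation, however, is exactly this map-shuffle-reduce pipeline.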
2)Apache Spark Framework:- Apache Spark is a fast, general-purpose engine for large-scale data processing. Spark is much faster than Hadoop: it can run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk.
The Apache Spark framework is also very easy to use; applications can be written in Python, Java and Scala. Spark powers a stack of libraries including SQL and DataFrames, MLlib for machine learning, GraphX, and Spark Streaming, and you can mix these libraries seamlessly within the same application.