With the rise of technology has come an increase in the amount of available information. In 2020, each person was projected to generate 1.7 megabytes of data per second. For businesses, consumer data is valuable, but making that data actionable is difficult. This is where big data comes in.
Big data refers to data sets that are so large, and change so rapidly, that traditional methods of storage and analysis cannot handle them effectively. For many businesses, it’s what’s needed to compete at the next level.
Who uses big data?
The range of industries and organizations that collect big data, and need sophisticated analysis to leverage it effectively, is growing all the time. It includes retail tracking, political analysis, weather monitoring, and health analysis, among many other fields.
Companies use data to target users, even during routine tasks like online shopping, doing laundry, and watching television. For example:
- Amazon: Amazon holds a vast amount of information on both its customers and its inventory. Combining that data is why, when you log on to the site, you are more likely to see recommendations for items you need or are interested in than for items you are not.
- General Electric: General Electric collects data from its machinery and turns it into not only consumer insights but also usable data for improving the energy efficiency of its products.
- Netflix: By collecting data on each user’s viewing preferences, Netflix can recommend programming you are more likely to watch based on what you’ve watched in the past. Beyond that, it can use your preferences to predict what other users with similar viewing habits will want to watch, and vice versa.
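The idea behind recommendations like these can be sketched in a few lines. The example below is only an illustration, not how Amazon or Netflix actually work: the user names, titles, and the `recommend` function are all hypothetical, and the scoring rule (weight each unseen title by how many titles its viewer shares with you) is one simple assumption among many possible approaches.

```python
from collections import Counter

# Hypothetical viewing histories; user names and titles are made up.
HISTORIES = {
    "ana": {"Space Docs", "Baking Show", "Crime Drama"},
    "ben": {"Space Docs", "Crime Drama", "Nature Series"},
    "cho": {"Baking Show", "Cooking Duel"},
    "dev": {"Space Docs", "Nature Series"},
}


def recommend(user, histories, top_n=2):
    """Rank unseen titles by how much their viewers' tastes overlap with `user`."""
    seen = histories[user]
    scores = Counter()
    for other, titles in histories.items():
        if other == user:
            continue
        overlap = len(seen & titles)  # shared titles act as a similarity weight
        if overlap == 0:
            continue  # users with nothing in common contribute nothing
        for title in titles - seen:
            scores[title] += overlap
    # Sort by score (highest first), then by title so ties are deterministic
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked[:top_n]]


print(recommend("ana", HISTORIES))  # ['Nature Series', 'Cooking Duel']
```

With only four users this runs instantly. The big data challenge is doing the same kind of comparison across millions of users and titles whose histories change constantly.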
The Seven Vs of Big Data
How does one begin to define big data? One way is with the Seven Vs: seven characteristics that can be used to compare one big data context with another. They are:
- Volume: How many giga-, zetta-, or yottabytes of data are involved?
- Velocity: How quickly is the data generated and made available?
- Variety: Is all of the data in the same format, or is it varied, making it more difficult to combine?
- Veracity: How accurate is the data? Is it filled with inaccuracies and dummy data?
- Variability: Does the data mean the same thing regardless of context and over time?
- Visualization: Can the data be presented visually rather than as a table of numbers? Does doing so help comprehension?
- Value: Is the data adding value or is it just a collection of meaningless statistics?
Some sources list only the first three or four as the key characteristics of big data. What’s important is that these dimensions provide a way to think about staggering amounts of data so that they become more manageable.
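A few of these dimensions can be turned into rough numbers. The sketch below is a minimal illustration under stated assumptions: the sensor readings are invented, the `profile` function is hypothetical, and the measures it uses (serialized size for volume, records per second for velocity, share of complete records for veracity) are simple stand-ins rather than standard definitions.

```python
import json
from datetime import datetime

# Hypothetical sensor readings; None marks a missing value.
RECORDS = [
    {"ts": "2020-01-01T00:00:00", "temp_c": 21.5},
    {"ts": "2020-01-01T00:00:10", "temp_c": None},
    {"ts": "2020-01-01T00:00:20", "temp_c": 22.1},
    {"ts": "2020-01-01T00:00:30", "temp_c": 21.9},
]


def profile(records):
    """Return rough volume, velocity, and veracity figures for a batch."""
    if not records:
        raise ValueError("need at least one record")
    # Volume: size of the batch when serialized as JSON, in bytes
    volume_bytes = len(json.dumps(records).encode("utf-8"))
    # Velocity: a crude rate, records divided by the observed time span
    times = [datetime.fromisoformat(r["ts"]) for r in records]
    span_s = (max(times) - min(times)).total_seconds()
    velocity = len(records) / span_s if span_s > 0 else float("nan")
    # Veracity: share of records with no missing fields
    complete = sum(all(v is not None for v in r.values()) for r in records)
    return {
        "volume_bytes": volume_bytes,
        "records_per_second": velocity,
        "complete_share": complete / len(records),
    }


stats = profile(RECORDS)
print(stats["complete_share"])  # 0.75, since one of four readings is missing
```

Comparing these figures across data sources is one way to see which of them actually behave like big data and which are simply large files.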
What forms can big data take?
While big data can be a large, nebulous concept that is hard to categorize, there are some general types that emerge frequently, such as:
- Structured data: Information organized into rows and columns with defined fields, as in a traditional database.
- Unstructured data: Data with no predefined organization, usually free text.
- Geographic data: Any data related to physical locations, such as roads, buildings, or coordinates.
- Real-time media: Live or stored audio and video streamed in real time.
- Natural language data: Human-generated text.
- Time series: Data points that occur periodically or are attached to particular events (like every time a certain temperature is reached).
- Event data: Similar to an event-based time series, this data is recorded whenever a particular event occurs.
- Network data: The connection of nodes in a network, whether it’s social (e.g. Facebook), informational (e.g. the web), biological (e.g. neurological), or technological (e.g. Internet of Things connections).
- Linked data: Data that is connected via the web, such as hyperlinks.
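The practical difference between structured and unstructured data shows up as soon as you try to query it. The sketch below is illustrative only: the orders, the sentences, and the `extract_total` helper are hypothetical, and the regular expression assumes amounts are written like `$42.50`.

```python
import re

# Structured: fields are named, so a query is a simple lookup.
structured = [
    {"order_id": 1, "city": "Austin", "total": 42.50},
    {"order_id": 2, "city": "Boston", "total": 17.25},
]
austin_total = sum(r["total"] for r in structured if r["city"] == "Austin")

# Unstructured: the same facts buried in free text must be extracted first.
unstructured = [
    "Order 1 shipped to Austin, customer paid $42.50.",
    "Boston order (#2) came to $17.25 after discount.",
]


def extract_total(text):
    """Pull the first dollar amount out of a sentence, or None if absent."""
    match = re.search(r"\$(\d+(?:\.\d+)?)", text)
    return float(match.group(1)) if match else None


text_totals = [extract_total(t) for t in unstructured]

print(austin_total)  # 42.5
print(text_totals)   # [42.5, 17.25]
```

The structured query needs no knowledge of how the data was written down, while the text version depends on a pattern that breaks as soon as someone phrases an amount differently. That fragility is a large part of why unstructured data is harder to analyze at scale.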
Big data plays a growing role in every business. The benefits of working intelligently with it are immense.
FAQs
What is big data?
Big data refers to data sets that are too large and change too rapidly to process using traditional data processing methods.
What are some types of big data?
Big data can be broken down a number of ways, but a helpful categorization is as either structured (organized in rows and columns like a database) or unstructured (virtually any other form).
What are the Seven Vs of big data?
The Seven Vs are used to compare one big data context with another. They are: volume, velocity, variety, veracity, variability, visualization, and value.