No matter the sector in which your organisation operates, all your data can be described generically using five factors or attributes:
Your operational systems are just the tip of the iceberg when it comes to the data your business works with every day. Think about all the emails, sales proposals, supplier and customer contracts, budget spreadsheets, marketing photos and videos, and voice and video call recordings in circulation. Then there’s data being captured on social media platforms, in the activity streams from smartphones, or through a host of connected devices such as wearables and smart home systems. The variety of data, and of the places it comes from, is increasing all the time, and more and more of it is unstructured or semi-structured rather than structured. That means that, alongside traditional SQL databases and file stores, you’ll likely need storage solutions that can handle unstructured data, as well as ways to generate structured metadata so that you can exploit it.
The volume of data being created and stored is rising rapidly. IDC expects the collective sum of the world’s data to grow from 33ZB in 2018 to 175ZB by 2025. (In case you’re wondering, a zettabyte, or ZB, is a trillion gigabytes.) IDC also predicts that in another 10 years each of us will be using data in our daily interactions 20 times as often as we do now, as our homes, workplaces, appliances, vehicles and wearables become data-enabled. Whatever your business size or sector, you’re undoubtedly generating, storing and using more data every year, and you need processing and storage systems that can scale appropriately.
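To put the IDC figures above in perspective, the implied yearly growth rate can be worked out in a few lines of Python. This is only a rough sketch: it assumes smooth compound growth between 2018 and 2025, which the forecast itself does not claim, and the `project` helper and the 10TB starting figure are hypothetical, included purely for illustration.

```python
# Figures quoted from the IDC forecast mentioned in the text.
START_ZB = 33.0    # world data volume in 2018, in zettabytes
END_ZB = 175.0     # forecast volume for 2025, in zettabytes
YEARS = 2025 - 2018

# Assumption: growth compounds at a constant yearly rate.
growth_factor = END_ZB / START_ZB
annual_rate = growth_factor ** (1 / YEARS) - 1


def project(volume: float, rate: float, years: int) -> float:
    """Project a volume forward, assuming constant compound growth."""
    return volume * (1 + rate) ** years


# Hypothetical example: a business storing 10TB today that grows
# at the same rate as the global forecast.
in_five_years_tb = project(10.0, annual_rate, 5)

print(f"Implied annual growth: {annual_rate:.1%}")
print(f"10TB today would be about {in_five_years_tb:.0f}TB in five years")
```

Under that assumption the implied rate comes out at roughly 27% a year, which is a useful sanity check when you are sizing storage that has to last several years.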
Some data, such as a customer’s details or order history, changes very little or not at all after it’s captured. More and more, however, we’re generating data that’s updated constantly, from IoT devices supporting tracking of delivery vehicles to clickstream data from retail websites. The velocity of your data and how quickly you need to use it to deliver insights will affect how often you need to capture and process it.
Veracity refers to the accuracy and quality of the data. Data may be missing, may conflict with data from another source, or may simply be inaccurate. It’s essential to check data for accuracy and validity before using it to create business insights.
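As an illustration of the three problems just described (missing values, inaccurate values and conflicts between sources), here is a minimal Python sketch of a pre-use check. The `CustomerRecord` fields, the `find_issues` function and the specific rules are all hypothetical; real checks would depend on your own data and business rules.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CustomerRecord:
    # Hypothetical record layout, for illustration only.
    customer_id: str
    email: Optional[str]
    order_total: Optional[float]


def find_issues(record: CustomerRecord,
                reference_emails: Dict[str, str]) -> List[str]:
    """Return a list of data-quality problems found in one record."""
    issues = []

    # Missing or obviously malformed data.
    if not record.email:
        issues.append("missing email")
    elif "@" not in record.email:
        issues.append("invalid email")

    # Missing or inaccurate data: an order total should not be negative.
    if record.order_total is None:
        issues.append("missing order_total")
    elif record.order_total < 0:
        issues.append("negative order_total")

    # Conflict with a second source (for example, a CRM export).
    known = reference_emails.get(record.customer_id)
    if known and record.email and known.lower() != record.email.lower():
        issues.append("email conflicts with reference source")

    return issues


record = CustomerRecord("c1", "a@example.com", -5.0)
print(find_issues(record, {"c1": "b@example.com"}))
```

Records that return an empty list pass these particular checks; anything else can be held back for review rather than flowing straight into reports.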
Not all the data you can collect will help you create business insights. The value of data can also change over time. Some data loses its value very quickly; for example, transaction data when you’re trying to detect fraud. At the other extreme, health records can have value to researchers long after an individual patient dies. You need to decide (and regularly review) what you’re going to store, for how long and how quickly you need to access it.
In other words, the variety, volume, velocity, veracity and value of your data will all determine your approach to ingesting data, storing it and transforming it into actionable insights. It’s easy to see why future-proofing your data ingestion solution is so important: you don’t want to invest in a platform that can’t support new data types or devices. Similarly, data volumes can grow unexpectedly, or data may need to be ingested more frequently than you first planned.
Understanding the characteristics of the data our clients will be working with is a key step for the experts in our Data Analytics team in any assignment. To find out more about how we work with organisations like yours to create solutions that deliver timely, actionable business insights, why not download our white paper on the 7 rules for a successful modern data platform? Or come and talk to our Data Analytics team about your specific challenges and the opportunities that are open to you.