What Makes Big Data Security Unique?

This post looks at what makes big data security unique compared to regular data security.

This post was originally published on DataOps Zone

Are you responsible for the security of big data at your organisation?  Have you ever wondered if big data security is different from regular data security? Would you like to know more about some of the challenges of big data security?

If you answered “yes” to any of these questions, read on. Today, we’ll examine what makes big data unique and special. Then, we’ll focus on the high-level architecture and security challenges that come with big data. Finally, we’ll visit an imaginary car factory and have some fun with big data security. We have lots to cover, so let’s get started.

What Makes Big Data Unique?

The term big data refers to large volumes of unstructured and complex data that we can’t process or analyze using traditional methods. What makes big data unique is a set of three characteristics: volume, velocity, and variety, or the three Vs for short.

  • Volume refers to the huge amount of data generated.
  • Velocity means that data is generated, processed, and analyzed in almost real time.
  • Variety describes data complexity and refers to the lack of structure of that data.

There are more Vs to big data, like veracity and volatility, but we’ll leave those for another time and focus on big data architecture. After all, if we want to understand what makes big data security unique and challenging, we should be familiar with the big picture of its architecture.

Big Data Architecture: The Bird's-Eye View

Our big data journey starts where big data is generated: with a large  variety of data sources. Big data comes from social media feeds,  emails, videos, audio data, IoT,  and pretty much anything else connected to the internet. Next, all that  data is sent to some kind of storage service for further processing.  You can call it a data lake, a storage blob, or as I like to call it,  the big data dumping ground.

Next, the data is processed and transformed through multiple steps,  including classification, data enrichment, and cleansing. Following  that, the data goes through business analytics. Now we’re in the realm  of data mining, prediction models, and mathematical algorithms. The  resulting valuable information and insights are sent to fancy dashboards  and interactive graphs that allow people to make intelligent and  data-driven business decisions.

Remember, big data flows through all these stages in near real time  and in large volumes, which means there will be some unique security  challenges for your organisation.

Big Data, Big Security Challenge

First, let’s remind ourselves of the core principles of data security.  We have to preserve data confidentiality, integrity, and availability.  Let’s apply these principles to big data, with all its characteristics  that make it unique. Next, we’ll factor in all the architecture  complexities and we’ll end up with a security challenge unlike anything  we’ve seen before.

Generating and Sending Big Data

Let’s start with how big data is born. We have a wide variety of data sources, such as IoT devices, smart devices, refrigerators, and so on, that generate big data and have varying levels of security capabilities. Some of these devices can’t protect their data.

In addition, big data must be kept secure throughout its lifecycle, which includes all the stages it flows through. In traditional data security, your data sources (databases, file shares, and so on) have more mature security capabilities, and you have better control over how they operate. This is not always the case with big data, and big data vendors should really step up their security game.

Big Data, Big Mess

Remember, big data doesn’t have a defined structure. It can be innocent-looking metadata in log files, or it can be payment card data, health data, or other sensitive personal data.

Hold on, my “compliance and data privacy risk” sense just started tingling! This is a unique challenge.

With traditional data security,  you have a lot of ways to protect data, because you have well-defined  structures to work with. You could remove, anonymize, or tokenize  sensitive data or choose from a wide variety of other security controls.  But with big data, removing, anonymizing, or tokenizing data is a lot  more complicated because you don’t have a data structure to rely on.  This means your security controls have to evolve and adapt to monitor  and protect big data.
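To make the tokenization problem concrete, here is a minimal Python sketch of what scanning free-form text for sensitive values might look like. Everything in it is an assumption for illustration: the two patterns, the `TOKEN_KEY` secret, and the helper names `luhn_valid`, `make_token`, and `tokenize_text` are all hypothetical, and real detection needs far broader coverage than two regular expressions.

```python
import hashlib
import hmac
import re

# Hypothetical secret for illustration; a real key would come from a key manager.
TOKEN_KEY = b"example-key-not-for-production"

# Illustrative patterns only: 13 to 19 digits with optional separators, and emails.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def make_token(kind: str, value: str) -> str:
    """Derive a deterministic, keyed token that does not reveal the value."""
    digest = hmac.new(TOKEN_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"<{kind}:{digest[:12]}>"


def tokenize_text(text: str) -> str:
    """Replace card-like numbers and email addresses in free-form text."""

    def replace_card(match: re.Match) -> str:
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            return make_token("CARD", digits)
        return match.group()  # long number, but not a plausible card number

    text = CARD_RE.sub(replace_card, text)
    return EMAIL_RE.sub(lambda m: make_token("EMAIL", m.group().lower()), text)


if __name__ == "__main__":
    line = "user=alice@example.com paid with 4111 1111 1111 1111, ref 1234567890123"
    print(tokenize_text(line))
```

Notice how much guesswork is involved: without a schema, the code has to infer what is sensitive from the shape of the text, and the Luhn check is there only to reduce false positives on long reference numbers. With structured data, you would simply tokenize a known column.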

Big Data, Complex Processing at Speed

We need to generate, capture, process, and analyze huge data volumes in near real time. This requirement makes big data security unique because data flows through various stages and complex architectures at breakneck speed. This means a lot of technologies are involved, and if just one of these technologies fails, your security can be compromised.
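One way to notice when a stage has altered a record it shouldn’t have is to attach an integrity tag at one stage and verify it at the next. The sketch below uses an HMAC from Python’s standard library. It is only an illustration under stated assumptions: `STAGE_KEY`, `seal`, and `verify` are hypothetical names, the key is hard-coded for readability, and a real pipeline would also need key distribution, rotation, and replay protection.

```python
import hashlib
import hmac
import json

# Hypothetical key shared by two adjacent stages; hard-coded here for illustration only.
STAGE_KEY = b"example-shared-key"


def _canonical(record: dict) -> bytes:
    """Serialize a record deterministically so both stages hash the same bytes."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def seal(record: dict) -> dict:
    """Wrap a record with an HMAC tag before handing it to the next stage."""
    tag = hmac.new(STAGE_KEY, _canonical(record), hashlib.sha256).hexdigest()
    return {"payload": dict(record), "tag": tag}


def verify(envelope: dict) -> bool:
    """Recompute the tag on arrival and compare it in constant time."""
    expected = hmac.new(STAGE_KEY, _canonical(envelope["payload"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, envelope["tag"])


if __name__ == "__main__":
    envelope = seal({"sensor": "fridge-7", "temp_c": 4.5})
    print(verify(envelope))  # the untouched record verifies

    # Simulate a faulty or malicious stage changing the payload in transit.
    tampered = {"payload": {**envelope["payload"], "temp_c": 40.0}, "tag": envelope["tag"]}
    print(verify(tampered))  # the altered record fails verification
```

This addresses integrity only. Confidentiality and availability need their own controls at every stage, which is part of why the number of moving parts matters so much.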

In comparison, traditional data security and processing are more mature, with a wide variety of technical security controls to choose from. These allow us to monitor, detect, and alert on security incidents. Traditional data flows are well defined, the architecture is more manageable, and we know that as long as the architecture is kept simple, data security is easier.

The Big Data Car Factory Assembly Line

I’m going to use a car factory example to illustrate what makes big data security unique compared with regular data security.

Imagine that every week, car parts arrive at our car factory from a limited number of quality suppliers. Our suppliers have mature security standards and they send quality parts all the time. These materials have well-defined dimensions, and our engineers know exactly how to deal with them. Each part has been classified based on its importance and value. When our factory receives the parts, each component is easily identified and kept secure. Our supply chain is stable, and our engineers know how many parts they’ll get each month.

These car components move through our assembly line at a manageable  speed. This allows everyone to monitor each car part closely at each  stage and understand what happens next in the car assembly process. When  things go wrong, our engineers can intervene and stop the production  line at any time. Our engineers are safe, while all car components are  secure and always accounted for. As a result, our cars comply with the  highest manufacturing quality and security standards, and we are leaders  in the car industry.

This car factory assembly line is like traditional data security. The  car parts are the data you need to protect all the way until the car is  ready to roll out from the factory. Now, let’s see how this analogy  translates to big data security. I’ll introduce a few changes from our  factory control room. Are you ready?

New Materials Arrive at the Factory

First, our car factory suddenly gets blasted with huge numbers of car parts and other materials from a lot of new suppliers. Those suppliers don’t always follow car manufacturer security standards. Additionally, the components come in many shapes and forms.

Our engineers can’t decide which parts are more valuable than others  or how to secure them properly. This means the factory will need to  spend a lot of time and money building a new automated system that helps  us identify the car components that arrive at a much faster pace.

Car Factory Assembly Lines on Steroids

Until now, our engineers supervised and kept all the car parts secure  as they moved through the assembly line at a manageable speed.

Now, we are cranking up the speed because our factory has to keep up  with the large number of car parts flowing in. Our engineers can’t  really see what exactly is happening with the components anymore;  everything happens a lot faster and it all looks kind of blurry.

To make things worse, car parts are starting to fall off the assembly  line and it’s hard to keep track of them. This means we have to build a  lot more safety and security around the assembly line. Also, we may  have to put in some extra hours to track the missing car components.

More Complex Assembly Lines

Next, I'm adding more assembly lines and many more steps to our car manufacturing process, which is now a lot more complicated.

I might outsource some steps and assembly lines to a third party because our factory can't cope anymore. And I have to realize that this complexity and outsourcing could result in non-compliance with our car manufacturer security standards. This means we have to be even more diligent and do our research before we outsource.

Because I didn't make a plan before I started making these changes, things are out of control. All of our engineers have moved away from the assembly line to a safe distance and put their safety glasses on.

Suddenly, I hear a firm knock on the control room's door. Our friendly auditor wants to know why car parts are all over the place and why the factory looks like a war zone. We will have to put in long hours to fix this.

Now our car factory illustrates what makes big data security unique and special. First, we received all kinds of non-standard car parts. These components illustrate the ever-changing and unstructured nature of big data. This made it hard for our engineers to classify and secure those components, which is similar to the challenges you'll have with data classification. Next, we cranked up the speed of the assembly line and made the whole production line more complex, which reflects the nature of big data architecture. As a result, our engineers could not always account for the car parts, which is exactly what organisations have to avoid with valuable big data.

The Car Factory Epilogue

Now you understand the uniqueness of big data security through the  car manufacturing example. You also see the consequences of facing  complexity without preparation and a plan.

To help you with this challenge, we focused on the characteristics of big data. Then, we touched on big data architecture and its complexities and security challenges. Make sure you tackle these challenges as early as you can, and keep the safety glasses close, just in case.