Rick Spickelmier, CTO at Birst, considers how compliance regulations are affecting global big data and analytics projects…

Companies often leap into big data and analytics projects without a solid understanding of the compliance requirements for the data they want to collect and process.

This is particularly true of smaller companies, which are less likely to have sufficient knowledge of the issues around personally identifiable information (PII), or the resources needed to manage them.

Even in cases where in-house IT teams do have that knowledge, other employees may not fully understand the requirements and could start to analyse or process information outside of the original remit of the project. Data gathered for HR processes, for example, could be taken and repurposed for supporting automated decisions that directly affect staff, such as hiring, promotion or termination.

Larger companies, with locations around the world, do not escape these problems; in fact, the problems grow in complexity. The larger the data set being analysed, the larger the costs associated with the compliance requirements.

Even in the EU, where there seems to be a single set of regulations, each state has interpreted them in a different way. A data controller needs to understand these locale-specific rules and not just the EU-wide requirements.

Once these particulars are sufficiently understood, there are ways to deal with compliance, assuming that PII is truly needed. In many situations, PII is included but not actually needed for the analysis. Where it is needed, there are steps that can be taken to make PII useful for analysis but still compliant. These include field encryption, field tokenisation, and the anonymisation of data records.
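As a rough illustration of field tokenisation, the Python sketch below drops a PII field the analysis does not need and replaces another with a keyed token, so records can still be joined or counted without exposing the raw value. The field names, the `SECRET_KEY` value, and the helper functions are all hypothetical, and the use of HMAC-SHA256 is one possible choice rather than a prescribed method.

```python
import hashlib
import hmac

# Hypothetical secret; in practice this would come from a key management system.
SECRET_KEY = b"example-key-not-for-production"

def tokenise(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace a PII value with a stable, keyed token (truncated HMAC-SHA256)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]

def prepare_for_analysis(record: dict, tokenise_fields: set, drop_fields: set) -> dict:
    """Drop PII the analysis does not need; tokenise PII needed only as a join key."""
    out = {}
    for name, value in record.items():
        if name in drop_fields:
            continue  # not needed for the analysis, so never leaves the source system
        out[name] = tokenise(str(value)) if name in tokenise_fields else value
    return out

# Hypothetical HR record used only for illustration.
record = {
    "email": "jane@example.com",
    "name": "Jane Doe",
    "department": "Sales",
    "tenure_years": 4,
}
safe = prepare_for_analysis(record, tokenise_fields={"email"}, drop_fields={"name"})
```

Because the same input always yields the same token, analysts can still group by or join on the tokenised field. A keyed token of this kind is generally regarded as pseudonymisation rather than full anonymisation, since anyone holding the key can recompute the mapping.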

Sovereignty complexities

If countries start to put locality-specific regulations in place, you risk running into the kind of situation LinkedIn has experienced in Russia. The company is no longer allowed to operate in the country because it does not store a copy of the personal data it holds on Russian citizens in Russia. Rather than store that data on Russian soil, LinkedIn chose to terminate its operations in the country.

Depending upon the local demands, there are ways to organise your data so that it is compliant and useful for the whole business. For example, you could replicate parts of your data into the specific regions for local storage. Another possibility is to silo your data in the locales and then join these data sources up when carrying out queries. While it can work, handling querying across silos adds complexity to your data processing. Having local data centres for the storage and processing of the data will also increase your costs.
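The siloed approach can be sketched as follows: each region computes an aggregate over its own rows, and only those aggregates are combined centrally. The in-memory `SILOS` dictionary, the field names, and both functions are hypothetical stand-ins for separate regional data stores and whatever query layer sits in front of them.

```python
from collections import Counter

# Hypothetical regional silos; in practice each would be a separate data store
# hosted in its own region.
SILOS = {
    "eu": [{"product": "A", "revenue": 100}, {"product": "B", "revenue": 50}],
    "us": [{"product": "A", "revenue": 70}],
    "asia": [{"product": "B", "revenue": 30}, {"product": "A", "revenue": 20}],
}

def local_revenue_by_product(rows: list) -> Counter:
    """Runs inside a region and returns only aggregates, never row-level data."""
    totals = Counter()
    for row in rows:
        totals[row["product"]] += row["revenue"]
    return totals

def global_revenue_by_product(silos: dict) -> dict:
    """Combines the per-region aggregates into a single result."""
    combined = Counter()
    for rows in silos.values():
        combined.update(local_revenue_by_product(rows))
    return dict(combined)

result = global_revenue_by_product(SILOS)
```

The extra complexity shows up even in this toy version: every query has to be split into a part that runs locally and a part that merges the results, which is straightforward for sums but harder for operations such as distinct counts or row-level joins.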

If the requirements go further and do not allow access by people outside of the region, then additional complexity related to filtering or locale-specific encryption comes into play.

I once ran into a situation where parts of the data to be analysed were going to be encrypted with different keys, and PII tokenised with different mappings, for North America, Europe, and Asia. Some of the analysis results, which did not contain PII, could be seen anywhere, but certain details (i.e. those that did include PII) could only be seen within the region itself.
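A minimal sketch of that kind of arrangement, under stated assumptions, might look like the following. The region codes, the keys in `REGION_KEYS`, the record layout, and the choice to show an out-of-region viewer a token rather than nothing at all are all hypothetical; the original project's design is not described in enough detail to reproduce.

```python
import hashlib
import hmac

# Hypothetical per-region keys; each would be held only inside its own region.
REGION_KEYS = {
    "na": b"example-na-key",
    "eu": b"example-eu-key",
    "asia": b"example-asia-key",
}

def region_token(value: str, region: str) -> str:
    """Tokenise a PII value with the key belonging to the record's region."""
    digest = hmac.new(REGION_KEYS[region], value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]

def view_for(record: dict, viewer_region: str) -> dict:
    """Return the fields a viewer in viewer_region is allowed to see."""
    # Non-PII metrics are visible everywhere.
    visible = {"region": record["region"], **record["metrics"]}
    if viewer_region == record["region"]:
        visible["customer"] = record["customer"]  # PII stays in-region
    else:
        visible["customer_token"] = region_token(record["customer"], record["region"])
    return visible

# Hypothetical record used only for illustration.
order = {"region": "eu", "customer": "Jane Doe", "metrics": {"order_value": 120}}
inside = view_for(order, "eu")
outside = view_for(order, "na")
```

Using a separate key per region means a token produced in one region cannot be matched against tokens from another, which keeps the regional mappings independent at the cost of making cross-region joins on PII impossible.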

With larger data sets there is the possibility of having to comply with several of these regulations at once, especially in the US, where there are both vertical (industry-specific) and regional privacy rules. This can complicate how data is collected and processed.

With the General Data Protection Regulation (GDPR), the EU is trying to bring more consistency to how and why data is managed, so that this fragmentation of standards does not happen.

Repercussions for non-compliance

Under current EU data protection rules, penalties can reach up to €500,000, although very few penalties have been levied, and those that have are far from the maximum.

In contrast, the forthcoming EU GDPR carries significant monetary penalties for failure to comply. Depending on the type of violation, the proposed fines can reach up to €10 million or 2% of global turnover, whichever is higher, for the lower tier, and up to €20 million or 4% of global turnover, whichever is higher, for the upper tier.
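To make the scale concrete, the short Python sketch below computes the maximum possible fine for a given global annual turnover. The regulation sets each cap as the higher of the fixed amount and the percentage of turnover. The function name and the example turnover figures are hypothetical, and actual fines are set case by case below these ceilings.

```python
def max_gdpr_fine(global_turnover_eur: int, upper_tier: bool) -> int:
    """Return the fine ceiling in euros: the higher of a fixed amount
    and a percentage of global annual turnover."""
    fixed, percent = (20_000_000, 4) if upper_tier else (10_000_000, 2)
    # Integer arithmetic avoids floating-point rounding on large amounts.
    return max(fixed, global_turnover_eur * percent // 100)

# Hypothetical companies used only for illustration.
small_company_cap = max_gdpr_fine(100_000_000, upper_tier=True)    # 4% would be EUR 4m
large_company_cap = max_gdpr_fine(2_000_000_000, upper_tier=True)  # 4% is EUR 80m
```

For the smaller company the fixed €20 million ceiling applies, which is a fifth of its annual turnover; for the larger one the percentage takes over and the ceiling rises to €80 million.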

Given the size of the potential fines for non-compliance, failing on GDPR could be an existential event for some companies. As a result, there is an expectation that these regulations will be taken seriously.

One of these penalties will be applied if a business fails to appoint an independent Data Protection Officer (DPO) with sufficient knowledge and resources, so there will need to be both attention and recruitment here.

Another point to consider is that other regions and localities outside of the EU are taking note and using GDPR as a framework for their own regulations. Therefore, if you do not comply, you run the risk not only of heavy fines but also of being locked out of doing business in other parts of the world.