Industry regulations share some commonalities with the design architecture of your analytics platform.

Companies manage masses of data, including highly sensitive personally identifiable information (PII). This must be protected by observing strict processing requirements as outlined in industry regulations.

There are privacy regulations such as the General Data Protection Regulation (GDPR), observed both here in the UK and in the EU (although subtle differences will begin to emerge due to Brexit); electronic data storage regulations such as the Grundsätze zur ordnungsmäßigen Führung und Aufbewahrung von Büchern, Aufzeichnungen und Unterlagen in elektronischer Form sowie zum Datenzugriff (GoBD) in Germany; and transaction standards that apply globally, such as the Payment Card Industry Data Security Standard (PCI DSS).

Each of these standards stipulates how data can be securely processed and accessed, which means they have direct implications for the architecture and implementation of your data analytics platform. First of all, it is worth summarising the demands of each of the regulations outlined above from a data storage and analytics perspective.

GDPR: The user now has much more control over their data under GDPR. Permission must usually be sought to collect data, with the purposes made clear, and once the data has fulfilled this purpose, it should be deleted or archived. Where data must be retained, it must be securely stored and protected. Users also have the right to submit a Data Subject Access Request (DSAR) to obtain the data held on them, and can request that this data be deleted or ask for an explanation as to why this is not possible. A DSAR must be actioned within one month of receipt (often tracked as 30 days), although this can be extended for complex requests. In the event of a personal data breach, the data controller must notify the supervisory authority within 72 hours of becoming aware of it and must also inform the affected individuals where the breach poses a high risk to them.
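The two time limits above lend themselves to simple deadline tracking. The following is a minimal sketch, not a definitive implementation: the function names are hypothetical, and the 30-day DSAR window is a working assumption that should be checked against your regulator's guidance.

```python
from datetime import datetime, timedelta, timezone

# Time limits taken from the summary above. The 30-day DSAR figure is a
# working assumption; adjust it to match your regulator's guidance.
BREACH_NOTIFICATION_WINDOW = timedelta(hours=72)
DSAR_RESPONSE_WINDOW = timedelta(days=30)


def breach_notification_deadline(became_aware_at: datetime) -> datetime:
    """Latest time by which the authority should be notified of a breach."""
    return became_aware_at + BREACH_NOTIFICATION_WINDOW


def dsar_deadline(received_at: datetime) -> datetime:
    """Latest time by which a DSAR should be actioned."""
    return received_at + DSAR_RESPONSE_WINDOW


def is_overdue(deadline: datetime, now: datetime) -> bool:
    """True once the deadline has passed."""
    return now > deadline


if __name__ == "__main__":
    received = datetime(2024, 3, 1, 9, 0, tzinfo=timezone.utc)
    print("DSAR due by:", dsar_deadline(received).isoformat())
    print("Breach report due by:", breach_notification_deadline(received).isoformat())
```

Using timezone-aware timestamps avoids ambiguity when requests and incidents are logged by systems in different regions.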

GoBD: Applicable to all aspects of accounting, GoBD sets out specifications for how financial documents are processed, secured and protected, as well as for the systems used. There is no prescribed format as such, but companies must now provide substantial documentation to demonstrate the digital processes and providers used in order to achieve compliance. It enshrines key concepts of data immutability (i.e. that data should not be changed once created) and data security.
GoBD was updated in 2020 and now includes provisions that allow digital documents to carry the same weight as paper documents, photos to be classed as equivalent to scanned documents and systems in the cloud to be considered data processing systems. It requires a year’s worth of continuous documentation, accountancy data to be kept for ten years and business correspondence to be kept for six years.
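One common way to make changes to stored records detectable is to chain each entry to the hash of the one before it. The sketch below illustrates that idea only; it is not a GoBD-certified mechanism, the class and field names are hypothetical, and payloads are assumed to be JSON-serialisable.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(payload: dict, previous_hash: str) -> str:
    """Hash the payload together with the previous entry's hash."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((previous_hash + body).encode("utf-8")).hexdigest()


class AppendOnlyLedger:
    """Hypothetical in-memory ledger: entries can be added but never edited."""

    def __init__(self) -> None:
        self._entries = []

    def append(self, payload: dict) -> str:
        previous = self._entries[-1]["hash"] if self._entries else GENESIS_HASH
        # Store a copy so later changes to the caller's dict do not leak in.
        stored = json.loads(json.dumps(payload))
        entry_hash = _digest(stored, previous)
        self._entries.append({"payload": stored, "prev": previous, "hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        previous = GENESIS_HASH
        for entry in self._entries:
            if entry["prev"] != previous:
                return False
            if _digest(entry["payload"], previous) != entry["hash"]:
                return False
            previous = entry["hash"]
        return True


if __name__ == "__main__":
    ledger = AppendOnlyLedger()
    ledger.append({"invoice": "INV-001", "amount": 120.0})
    ledger.append({"invoice": "INV-002", "amount": 75.5})
    print("Chain intact:", ledger.verify())
```

The chain does not prevent tampering by itself; it makes tampering evident during verification, which is why it is usually paired with write-once storage and restricted access.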

PCI DSS: Aimed at elevating the security of payment card transactions, the PCI DSS provides a framework that covers the entire process and sets out how companies must handle sensitive authentication data and cardholder data. The former refers to data such as the Card Verification Value (CVV) and Personal Identification Number (PIN) and cannot be stored once a transaction has been authorised, so it must be deleted immediately. The latter, such as the Primary Account Number (PAN), cardholder name and card expiration date, can be securely retained, provided the PAN is rendered unreadable wherever it is stored (for example, through truncation, hashing or encryption).
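The split between data that must be discarded and data that may be retained can be enforced at the point of ingestion. The sketch below is illustrative only: the field names (`pan`, `cvv`, `pin`, `track_data`) are hypothetical, and it assumes the card number is kept only in truncated form showing the last four digits.

```python
# Hypothetical field names for data that must never reach storage.
SENSITIVE_AUTH_FIELDS = {"cvv", "pin", "track_data"}


def truncate_pan(pan: str) -> str:
    """Keep only the last four digits of a card number."""
    digits = "".join(ch for ch in pan if ch.isdigit())
    if not 13 <= len(digits) <= 19:
        raise ValueError("unexpected card number length")
    return "*" * (len(digits) - 4) + digits[-4:]


def prepare_for_storage(record: dict) -> dict:
    """Drop sensitive authentication data and truncate the card number."""
    stored = {k: v for k, v in record.items() if k not in SENSITIVE_AUTH_FIELDS}
    if "pan" in stored:
        stored["pan"] = truncate_pan(stored["pan"])
    return stored


if __name__ == "__main__":
    incoming = {
        "pan": "4111 1111 1111 1111",  # well-known test card number
        "cvv": "123",
        "pin": "0000",
        "name": "A. Example",
        "expiry": "12/30",
    }
    print(prepare_for_storage(incoming))
```

Filtering with a deny-list is the simplest form; an allow-list of permitted fields is stricter, because unknown fields are rejected by default.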

When it comes to data processing, cardholder data must be encrypted in transit and must only be sent to known destinations. Access to data is restricted to specific roles that must have unique access credentials. Any activity pertaining to cardholder data must be logged, and how data travels into the business, how it is stored, and the number of times it is accessed must all be documented.
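The access and logging requirements described here can be sketched as a role check that records every attempt, whether or not it succeeds. This is a minimal in-memory illustration under stated assumptions: the role names, permission strings and the `AccessController` class are all hypothetical, and a production system would write to tamper-resistant log storage.

```python
from datetime import datetime, timezone

# Hypothetical mapping of roles to the permissions they hold.
ROLE_PERMISSIONS = {
    "payments_analyst": {"read_cardholder_data"},
    "support_agent": set(),
}


class AccessController:
    """Checks role-based permissions and logs every access attempt."""

    def __init__(self, user_roles: dict) -> None:
        self._user_roles = dict(user_roles)  # unique user ID -> role
        self.audit_log = []

    def access(self, user_id: str, permission: str, resource: str) -> bool:
        role = self._user_roles.get(user_id)
        allowed = permission in ROLE_PERMISSIONS.get(role, set())
        # Log denied attempts as well as granted ones.
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "user_id": user_id,
            "role": role,
            "permission": permission,
            "resource": resource,
            "allowed": allowed,
        })
        return allowed

    def access_count(self, resource: str) -> int:
        """Number of times a resource was successfully accessed."""
        return sum(
            1 for e in self.audit_log if e["resource"] == resource and e["allowed"]
        )


if __name__ == "__main__":
    controller = AccessController({"u-001": "payments_analyst", "u-002": "support_agent"})
    controller.access("u-001", "read_cardholder_data", "cards_table")
    controller.access("u-002", "read_cardholder_data", "cards_table")
    print(controller.access_count("cards_table"), "granted;", len(controller.audit_log), "logged")
```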

At the time of writing, version 4.0 of the PCI DSS standard had recently been released. It includes wider adoption of multi-factor authentication, new e-commerce and phishing requirements, stronger encryption requirements and more stringent monitoring, logging and detection.

Each of these regulations contains similar principles concerning how sensitive data is stored, processed and used, which should theoretically make it easier to manage this data. Once you recognise these commonalities, it becomes possible to design your data analytics platform around them, making it easier to achieve compliance. Considerations include:

  • Data inventory – data stored on the platform will contain valuable metadata that can be used for categorisation as well as to build a picture of where it originates from. Within the data inventory, you should define the data use, who has ownership of it and the source of the data. If you store PII, you should also seek to anonymise this data.
  • Extraction and Ingestion – any change to or deletion of the source data must be reconciled in the data analytics platform so that the two do not drift apart and inaccuracies are dealt with. However, some data will be considered mutable and some immutable, so it’s important to accommodate both. Mutable data, such as PII that may require changes to the subject’s name or address, is increasingly managed in NoSQL databases, whereas immutable data required to provide traceability will need its own storage, for which blockchain is now sometimes used.
  • Data observability – system logs must be maintained for auditing purposes, and these should be monitored for suspicious activity. Doing this continuously using an observability pipeline can also improve the accuracy and quality of the data.
  • Access controls – sensitive data will need to be truncated, encrypted, hashed or erased at source. Whether data is in transit or at rest, it should be encrypted. Administrators should oversee role-based access, assign a unique ID and access credentials to each user and keep a record of access privileges. They should also log all access attempts.
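The inventory and protection considerations above can be sketched as a small catalogue record plus a keyed hash for PII fields. This is an illustration under stated assumptions: the `InventoryEntry` fields and dataset names are hypothetical, and keyed hashing is pseudonymisation rather than full anonymisation, since anyone holding the key can re-link values.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class InventoryEntry:
    """Hypothetical catalogue record for one dataset on the platform."""
    dataset: str
    source: str
    owner: str
    purpose: str
    contains_pii: bool


def pseudonymise(value: str, key: bytes) -> str:
    """Replace a PII value with a keyed SHA-256 digest (HMAC).

    The same value and key always give the same token, so joins still work,
    but the original value cannot be read back from the token.
    """
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def datasets_needing_protection(inventory: list) -> list:
    """Names of datasets flagged as containing PII."""
    return [entry.dataset for entry in inventory if entry.contains_pii]


if __name__ == "__main__":
    inventory = [
        InventoryEntry("customers", "crm_export", "sales-ops", "billing", True),
        InventoryEntry("page_views", "web_logs", "marketing", "analytics", False),
    ]
    key = b"replace-with-a-managed-secret"  # assumption: loaded from a secrets store
    print(datasets_needing_protection(inventory))
    print(pseudonymise("jane@example.com", key))
```

Keeping the key in a managed secrets store, separate from the analytics platform, limits who can reverse the mapping by recomputing tokens.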

However, data sprawl in the cloud can make it difficult to keep track of sensitive data and to prevent it from inadvertently straying into other databases or even endpoints. For this reason, it’s also wise to carry out a security posture assessment on a regular basis to ensure you have full visibility of your assets in the cloud and the data they contain. Scanning for PII should be system-wide, covering all files and formats and encompassing both structured and unstructured data.
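As a rough illustration of what scanning unstructured text for PII involves, the sketch below looks for email addresses and candidate card numbers, using the Luhn checksum to cut down false positives. The patterns and function names are hypothetical and deliberately simple; a real scanner would cover many more data types and file formats.

```python
import re

# Deliberately simple patterns; real scanners use far broader rule sets.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")  # 13 to 19 digits


def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit counting from the right."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def scan_text(text: str) -> list:
    """Return (kind, match) pairs for suspected PII found in the text."""
    findings = [("email", m.group()) for m in EMAIL_RE.finditer(text)]
    findings += [
        ("card_number", m.group())
        for m in CARD_RE.finditer(text)
        if luhn_valid(m.group())
    ]
    return findings


if __name__ == "__main__":
    sample = "Contact jane@example.com about card 4111 1111 1111 1111."
    for kind, value in scan_text(sample):
        print(kind, value)
```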

We have worked with customers to meet the demands of all three of these standards while helping to achieve efficiencies in their data processing and realise cost savings. To find out how you can safeguard your data and achieve cost-effective compliance, contact us today.