The unrealized potential within data is immense, but it is held back by concerns over data liability and commercial constraints. These concerns slow the pace of data collaboration, as organizations fear the legal and reputational consequences of data breaches, privacy violations and non-compliance with data protection regulations. By enabling data to be linked and aggregated across organizational, regional and sectoral boundaries, untapped value can be unlocked in the form of new insights, faster decision-making, increased accuracy, efficiency and innovation.
In this intricate landscape, where legal, technical, social, and commercial risks intertwine, a convergence of trends drives the need for innovation that unlocks the latent value amid such friction. These intertwined trends are painting a clear picture of the road ahead. Growing regulatory compliance, increased processing of sensitive data in the cloud and the inevitable digital partnerships that leverage this data are the key market drivers dictating the future of data collaboration.
Regulatory compliance
As our experiences become increasingly digital-first, businesses are collecting, storing and processing vast amounts of personal data. Data privacy has consequently been thrust to the forefront of consumer consciousness, with high-profile data breaches, misuse and intrusive data collection practices only amplifying this awareness. In turn, consumer sensitivity around privacy is influencing compliance legislation worldwide.
Legislators are implementing more stringent data protection laws that impose stricter data handling and processing requirements while mandating greater transparency and giving individuals more control over their personal data. The trajectory toward more robust privacy legislation will only continue as society grapples with the implications of privacy in an era dominated by data.
Cloud-based data processing
Over the last decade, businesses have moved much of their data to the cloud, drawn by widely recognized scalability, flexibility, and cost-effectiveness benefits. However, transferring, collecting, and processing sensitive or personal information on the cloud involves ceding some control over its security, particularly within the evolving regulatory landscape. While the cloud provider assumes responsibility for the data center's physical security, the onus for securing the data lies with the owner.
The need to process this data and extract its value presents a window of vulnerability. During this phase, the risks of data breaches or leaks, unauthorized access, and regulatory non-compliance peak. From a security and compliance perspective, data analysis is considered a weakness in the data lifecycle. While data at rest is protected by encryption and data in transit is protected by TLS, data in use is considered the weakest phase due to:
- its complexity,
- lack of mature protection solutions,
- broad attack surface,
- limited visibility,
- transient nature,
- direct user interactions, and
- vulnerability to side-channel attacks.
This weakness increases when that data is used to collaborate with other parties.
Digital partnerships
Having successfully navigated these challenges and transitioned to the cloud while ensuring regulatory compliance through layers of security measures and procedures, organizations must extract value from these datasets by collaborating with other organizations. These data partnerships are a crucial strategy for businesses to flourish in the digital landscape due to their ability to enhance decision-making, unveil fresh opportunities, improve customer service, and cut costs.
However, despite advances in cloud capabilities, obstacles persist in establishing effective partnerships. Beyond technical hurdles like integration, system compatibility, data quality, and standardization, there are collaboration risks due to data privacy and security challenges and concerns around unauthorized use. Navigating the labyrinth of legal and regulatory compliance can be intricate, costly, and time-intensive.
These data-based partnerships can range from simple collaborations between two parties to complex scenarios involving multiple parties. A pivotal factor in the success of these partnerships is delivering a mutually agreed upon positive business outcome while preserving complete control and limiting broader access to proprietary datasets.
A suite of tools is emerging to meet these needs, known as Privacy-Enhancing Technologies (PETs). When embedded into the foundations of a collaboration platform, they enable powerful Privacy-Enhanced Collaborative Computing (PECC) capabilities. PECC enables dynamic, secure and efficient data collaboration built on the foundations of security, privacy and trust. Privacy and PETs are ingrained as fundamental and foundational aspects of its design rather than retrospectively added.
Driven by increased privacy awareness and a fast-evolving regulatory environment, PETs are in high demand. While designed to protect an individual's privacy in the event of a breach or disclosure, PETs also enable organizations to extract valuable insights from data without exposing the raw information. PECC is unlocking new possibilities in situations where the risks of data usage had previously overshadowed the benefits, addressing regulatory compliance requirements, reducing the risk of processing sensitive data in the cloud and catalyzing digital partnerships between otherwise competing organizations.
PETs have been available in some form for decades, but recent strides in computing are enhancing their potential and applicability. While individual PETs can operate independently, doing so poses challenges such as implementation complexity, increased processing costs and interoperability. These factors have hampered their widespread adoption. Their true potential comes to the fore when they are integrated into a broader platform capable of navigating users through this complexity, effectively pairing the appropriate PETs with the suitable use case to enable effective collaboration.
While there is abundant information concerning PETs, their specific value remains ambiguous, particularly concerning the market drivers of regulatory compliance, processing sensitive data on the cloud and digital partnerships. Rather than listing and detailing the features of various PETs, the following section will focus on the challenges organizations face in the given context and demonstrate how PETs and PECC can provide a solution.
Use Case 1: Encrypted data processing
An organization needs to process sensitive data with an external entity. Despite trusting the third party, the organization wants to safeguard against any potential malicious actions and ensure neither the cloud service provider nor the third party involved in data processing can access the raw data.
Solution: Homomorphic encryption permits the use or analysis of encrypted data without decrypting it.
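To make the idea of computing on ciphertexts concrete, the following is a minimal sketch of an additively homomorphic scheme (Paillier) in plain Python. It is illustrative only: the function names, the toy primes and the salary figures are assumptions made for this example, the key size is far too small to be secure, and a real deployment would rely on a vetted cryptographic library rather than hand-written code.

```python
import math
import secrets


def generate_keypair(p, q):
    """Build a toy Paillier key pair from two distinct primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    # With the generator fixed to g = n + 1, mu is the inverse of lam mod n.
    mu = pow(lam, -1, n)
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) under public key n."""
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    # c = (1 + n)^m * r^n mod n^2
    return (pow(1 + n, m, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(n, private_key, c):
    """Recover the plaintext from ciphertext c."""
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)


# Hypothetical example: a third party totals two salaries it cannot read.
n, private_key = generate_keypair(1009, 1013)  # toy primes, NOT secure
c1 = encrypt(n, 52000)
c2 = encrypt(n, 61000)
c_total = add_encrypted(n, c1, c2)       # performed on ciphertexts only
print(decrypt(n, private_key, c_total))  # 113000
```

Only the holder of the private key can read the total; the party performing the addition sees ciphertexts throughout. Paillier supports addition only, whereas fully homomorphic schemes also support multiplication at a considerably higher computational cost.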
Use Case 2: Data processing in situ within a secure location
An organization requires an external entity to process (decrypt and analyze) its sensitive data without exposing or sharing it and without it leaving its physical premises, secure cloud account, or firewall. The data owner must have control over all actions performed on the data. Like the previous scenario, the data must be safeguarded from the cloud service provider, the external entity, and those hosting and processing the data, such as the operating system and administrative operators.
Solution: Trusted Execution Environments (TEEs) permit data to be used or analyzed within a secure, isolated environment.
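A TEE itself is a hardware feature and cannot be reproduced in a few lines of code, but the control it gives the data owner can be sketched. The simulation below, written in plain Python with no real enclave, shows one common pattern: the owner releases the data decryption key only when the environment reports a measurement (a hash of the code it will run) that matches what the owner approved. All names here are hypothetical, and in a real TEE the measurement would arrive inside a hardware-signed attestation report rather than as a bare string.

```python
import hashlib
import hmac

# The analysis code the data owner has reviewed and approved (hypothetical).
APPROVED_CODE = b"def analyse(rows): return sum(rows) / len(rows)"


def measure(code):
    """Stand-in for an enclave measurement: a hash of the loaded code."""
    return hashlib.sha256(code).hexdigest()


class DataOwner:
    """Holds the data key and decides whether an environment may receive it."""

    def __init__(self, data_key, approved_measurement):
        self._data_key = data_key
        self._approved = approved_measurement

    def release_key(self, reported_measurement):
        # Constant-time comparison; the key is released only to approved code.
        if hmac.compare_digest(reported_measurement, self._approved):
            return self._data_key
        return None


owner = DataOwner(b"example-data-key", measure(APPROVED_CODE))

granted = owner.release_key(measure(APPROVED_CODE))
denied = owner.release_key(measure(b"def analyse(rows): return rows"))

print(granted is not None)  # True: approved code receives the key
print(denied is None)       # True: modified code is refused
```

The design point is that the decision to decrypt stays with the data owner: any change to the analysis code changes its measurement, and the key is withheld.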
Use Case 3: Multi-party processing of distributed data
Several organizations wish to analyze and gain insight from each other's sensitive datasets while maintaining the confidentiality of their data from one another. The data owners must safeguard their information from any possible malpractice or ineptitude from any other involved parties.
Solution: Secure Multi-Party Computation (SMPC) permits multiple parties to run analysis on their combined data without revealing the contents of the data to each other.
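One of the simplest building blocks of SMPC is additive secret sharing, sketched below in plain Python. Each party splits its private value into random shares that individually reveal nothing, and only the combined total is ever reconstructed. The function names, the modulus and the example figures are assumptions for illustration; the sketch assumes honest-but-curious parties and omits the networking and the protections against malicious behaviour that a production protocol needs.

```python
import secrets
from typing import List

MODULUS = 2**61 - 1  # all arithmetic is done modulo this public value


def share(secret: int, n_parties: int) -> List[int]:
    """Split a secret into n random shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_inputs: List[int]) -> int:
    """Simulate every party sharing its input and publishing a partial sum."""
    n = len(private_inputs)
    # Row i holds the shares created by party i; column j is sent to party j.
    distributed = [share(x, n) for x in private_inputs]
    # Each party adds up the shares it received and publishes only that sum.
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    return sum(partial_sums) % MODULUS


# Hypothetical example: three organizations total their fraud-case counts.
total = secure_sum([120, 75, 310])
print(total)  # 505
```

No party ever sees another party's input, only uniformly random shares and the published partial sums, yet the final result equals the sum of the private values.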
Use Case 4: Sharing of insights without revealing details about individuals
An organization wants to share analytical insights derived from a dataset containing sensitive information about individuals that must be kept private.
Solution: Differential privacy enables organizations to reveal data or derived information to others without revealing sensitive information about individuals in the dataset.
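As a rough illustration, the sketch below applies the Laplace mechanism, a standard differential privacy technique, to a counting query. The records, the predicate and the function names are hypothetical. The `epsilon` parameter is the privacy budget: smaller values add more noise and give stronger protection. This simple floating-point sampler is not hardened against known precision attacks, so it should be read as a teaching example rather than production code.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Return a count with Laplace noise calibrated to the privacy budget."""
    true_count = sum(1 for record in records if predicate(record))
    # Adding or removing one individual changes a count by at most 1,
    # so the sensitivity is 1 and the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


def is_over_40(record):
    return record["age"] > 40


# Hypothetical dataset of individuals.
records = [{"age": 34}, {"age": 45}, {"age": 29}, {"age": 52}, {"age": 41}]

rng = random.Random()  # unseeded, so each release draws fresh noise
noisy = private_count(records, is_over_40, epsilon=1.0, rng=rng)
print(round(noisy, 2))  # a value near the true count of 3, different each run
```

Because the released figure is randomized, an observer cannot tell with confidence whether any single individual was included in the dataset, while aggregate trends remain usable.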
Use Case 5: Training an algorithm across multiple distributed datasets
An organization seeks to develop a machine learning (ML) model, but the data required for its training is dispersed across multiple datasets, possibly owned by various organizations, which cannot be unified. Models must be crafted and trained at each dataset location, with the locally-trained models being gathered and merged centrally to construct a comprehensive global model. The sensitive data must remain safeguarded from all collaborators, with only the locally trained model being exchanged.
Solution: Federated Learning enables the training of an algorithm for ML across multiple distributed devices or datasets.
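The train-locally, merge-centrally loop described above can be sketched with federated averaging on a tiny linear model. Everything in this example is an assumption chosen for illustration: the two client datasets, the learning rate and the function names. Real systems typically add secure aggregation or differential privacy on top, because model updates can themselves leak information about the training data.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs held privately by a client
Weights = Tuple[float, float]        # (slope, intercept) of the model y = w*x + b


def local_update(weights: Weights, data: Dataset,
                 lr: float = 0.05, epochs: int = 20) -> Weights:
    """Run gradient descent on one client's data, starting from global weights."""
    w, b = weights
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def federated_average(updates: List[Weights], sizes: List[int]) -> Weights:
    """Merge client models, weighting each by its dataset size."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return w, b


def train(clients: List[Dataset], rounds: int = 50) -> Weights:
    """Only model weights travel to the coordinator; raw data never moves."""
    global_weights: Weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        global_weights = federated_average(updates, [len(d) for d in clients])
    return global_weights


# Hypothetical clients whose private data both follow y = 2x + 1.
clients = [
    [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],
    [(3.0, 7.0), (4.0, 9.0)],
]
w, b = train(clients)
print(round(w, 3), round(b, 3))  # approximately 2.0 and 1.0
```

The coordinator recovers the shared relationship without ever receiving a single raw record from either client.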
Use Case 6: Sharing sensitive data for analysis and research
An organization must work with an external entity with access to bespoke capabilities for research, ML model training and analysis purposes using sensitive data. The external party needs access to the unencrypted, non-aggregated, granular, full-fidelity data. The sensitive information about the groups of individuals in the dataset must be protected from the external entity and the general public in the event of a data breach.
Solution: Synthetic Data enables the generation of a version of the data that statistically resembles the real data but does not contain any identifiable or real-world individual data.
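The principle can be shown with a deliberately naive generator: fit simple statistics to each column of the real data, then sample new records from those statistics. The column names, values and function names below are hypothetical. Note the limits of this sketch: it models columns independently, so correlations are lost, and it carries no formal privacy guarantee. Practical synthetic data tools use richer generative models and are often combined with differential privacy.

```python
import random
import statistics
from typing import Dict, List, Tuple

Row = Dict[str, float]


def fit(rows: List[Row]) -> Dict[str, Tuple[float, float]]:
    """Learn the mean and standard deviation of every column."""
    model = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        model[column] = (statistics.mean(values), statistics.stdev(values))
    return model


def sample(model: Dict[str, Tuple[float, float]], n: int,
           rng: random.Random) -> List[Row]:
    """Draw n synthetic rows from independent per-column Gaussians."""
    return [
        {column: rng.gauss(mu, sigma) for column, (mu, sigma) in model.items()}
        for _ in range(n)
    ]


# Hypothetical sensitive dataset.
real_rows = [
    {"age": 34, "income": 48000},
    {"age": 45, "income": 61000},
    {"age": 29, "income": 39000},
    {"age": 52, "income": 75000},
    {"age": 41, "income": 58000},
]

model = fit(real_rows)
synthetic_rows = sample(model, 5000, random.Random(7))

real_mean_age = statistics.mean(row["age"] for row in real_rows)
synthetic_mean_age = statistics.mean(row["age"] for row in synthetic_rows)
print(real_mean_age, round(synthetic_mean_age, 1))  # the two means are close
```

The external party receives granular, record-level data with similar aggregate statistics, yet none of the rows corresponds to a real individual.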
These use cases demonstrate the innovative ways PETs transform how sensitive data can be securely and swiftly employed in collaborative efforts. These elements, though, are rarely used in isolation; an effective collaboration typically demands a combination of PETs to be employed. For instance, Secure Multi-Party Computation and Trusted Execution Environments enable the processing of multiple distributed datasets in-situ, with Differential Privacy ensuring the insights do not disclose sensitive personal information.
Together with numerous others, the above use cases constitute some of the key elements of PECC, which makes PETs ubiquitous and transparent by seamlessly weaving them together and managing the inherent technical complexity, integration challenges, and risk of misuse.
Privacy is reshaping data collaboration
Privacy acts as a catalyst for collaboration, sparking technological innovations that usher in transformative changes. The applicability of this technology is wide-ranging, including adtech (secure planning, enrichment, activation and measurement), healthcare (data sharing, disease propagation), ESG (human trafficking), government (voting, smart cities), finance (anti-money laundering, fraud prevention) and others. Across all of these industries, privacy has the potential not only to fuel and accelerate the AI revolution but also to mitigate some of the risk associated with these models accessing large volumes of rich and valuable data. By enabling secure pooling of resources and controlled access to distributed sensitive data, it can empower smaller players to train large deep learning models.
It would be imprudent to suggest that PETs and PECC replace legal frameworks for data collaboration. As privacy technology matures, it must function within those frameworks and, in practice, work in combination with legally binding and enforceable commitments. However, legal processes must also remain dynamic: as PETs mature and become more prevalent, this transformation should prompt a reassessment of conventional legal obligations for data collaboration. Such a review might streamline current procedures and reduce friction, considering the assurances and technical constraints that PETs impose.
Perceiving privacy as merely a regulatory concern and failing to recognize its transformative impact on data collaboration is a significant oversight. The direction is clear: Privacy-Enhanced Collaborative Computing is spearheading a transition from data sharing to a dynamic, borderless marketplace for data processing and analytics. In this modern landscape, collaborations are essential not only for personal data or commercially beneficial data but in any circumstance where restricted data access confers an advantage or where open access might result in harm.
The privacy revolution and the technological changes it drives should not be overlooked or dismissed as a passing trend but welcomed and embraced for the opportunities they enable. Together, they are the key that will unlock the next generation of data collaboration.