It’s estimated that 2.5 quintillion bytes of data are produced every day across the globe, and this rate is growing exponentially. The collection and analysis of this data is a lucrative business, and because data becomes more useful and insightful the more it is combined, it comes as no surprise that the six most valuable companies in the world are technology companies leveraging customer data.
Against the backdrop of high-profile data breaches and the introduction of the General Data Protection Regulation (GDPR), consumers are increasingly wary of companies collecting, processing and potentially sharing their data. Although they feel more comfortable when companies are transparent about how data is used, consumers currently have only a marginal degree of trust in organisations to protect their data.
Furthermore, tech giants own a disproportionate amount of data and are making it increasingly difficult for other organisations to compete. They’re collecting data at an astonishing rate and using it to deliver relevant and personalized campaigns. This is particularly evident in the advertising space, where Facebook and Google account for sixty-five percent of advertising spending thanks to their breadth of data.
In response, companies are investing in new marketing technology to maximise the commercial value of their data, while dealing with the challenges around business trust, control over data and end-customer privacy. This capability will enable them to make effective use of the data they collect on their customers, and to enhance it with external data sources, for example through trusted strategic partnerships.
Although many traditional data management tools are designed to bring multiple sources into one central place where they can be analyzed together, there is a growing trend towards data decentralization. This type of infrastructure removes many of the data privacy, security and regulatory risks that exist in centralized models, and enables businesses to collaborate over data without losing data ownership and control.
There are two fundamental approaches to storing and managing data - centralized and decentralized. In a centralized approach, all the data is unified in a single location. This is intended to ease data integration, but often relies on ETL (extract, transform, load) processes. In a decentralized approach, there is no central repository. Instead, the datasets remain separate and are connected through a virtualized data layer.
Let’s explore both of those in a bit more detail.
Centralized data
A centralized infrastructure is a database that is located, stored, and maintained in a single location. This location is most often a central computer or database system, for example a server or a mainframe computer. This approach is commonly used for CRM, DMP, data warehouse and data lake technologies. However, the very nature of centralizing data carries two core challenges:
Cleanliness - there are often few restrictions on the data types that can be included, meaning the usefulness of the data can quickly lessen. Data lakes, for example, can quickly become data swamps.
Security - pooling data into a single central location makes it a prime target for hackers, and therefore the risk of data breaches increases.
Additionally, centralization makes it difficult to safely utilize external data, as it requires an organization either to assimilate the data into the existing store, or to share its own data with an outside organization to overlay enrichment data. The commercial value held in customer data is too high to risk handing it to external parties, not to mention the compliance issues in sharing personal data without permission.
Decentralized data
Utilizing a decentralized approach means that data is never held in a single location. Instead, all data remains in its original location and is connected through data virtualization. Processing of the data is then distributed between the different nodes.
This approach has a number of advantages over centralization:
Security - if one node in the network is attacked, it doesn’t compromise the other datasets or enable hackers to gain access to repositories holding large amounts of personal data.
Reliability - all the nodes within the network hold their own data and manage their own processing. The pressure therefore doesn’t land on one party; instead, it is distributed across the network.
Commerciality - decentralization means that the raw data is never shared and access can be revoked at any time. This means that the owner remains in complete control of their data.
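The points above can be illustrated with a minimal Python sketch. It is not a description of any particular product: the DataNode class, the federated_count function and the sample records are all hypothetical names invented for this illustration. Each node answers a query locally and returns only an aggregate, and its owner can revoke a requester's access at any time.

```python
class DataNode:
    """Hypothetical node: raw records stay local, only aggregates are returned."""

    def __init__(self, name, records):
        self.name = name
        self._records = list(records)   # raw data never leaves the node
        self._granted = set()           # requesters currently allowed to query

    def grant(self, requester):
        self._granted.add(requester)

    def revoke(self, requester):
        self._granted.discard(requester)

    def count_where(self, requester, key, value):
        """Run the query locally and return only a count."""
        if requester not in self._granted:
            raise PermissionError(f"{requester} has no access to {self.name}")
        return sum(1 for record in self._records if record.get(key) == value)


def federated_count(nodes, requester, key, value):
    """Stand-in for a virtualized data layer: fan the query out, sum the answers."""
    return sum(node.count_where(requester, key, value) for node in nodes)


# Illustrative data only.
retail = DataNode("retail", [
    {"customer": "a1", "region": "north"},
    {"customer": "b2", "region": "south"},
])
banking = DataNode("banking", [
    {"customer": "a1", "region": "north"},
    {"customer": "c3", "region": "north"},
])

for node in (retail, banking):
    node.grant("analyst")

total = federated_count([retail, banking], "analyst", "region", "north")

# The owner of the banking node withdraws access; later queries that touch
# this node raise PermissionError.
banking.revoke("analyst")
```

In this sketch the requester only ever sees counts, and a compromised or revoked node affects only its own records, which mirrors the security and control arguments made in the list.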
What’s more, over 80 percent of marketers believe data silos prevent them from having a comprehensive view of campaigns and customers across channels. This fragmentation means that it’s not possible to run analysis across the entire business and gain a true picture of customer behaviour. Flying half blind has a dramatic impact on business success and hands the advantage to those who do have a complete picture.
Data silos commonly appear when departments use separate systems to interact with customers, or where a company has diversified into other products. For example, some supermarkets in the UK have moved into loyalty programmes, banking, insurance, and other retail ventures. The customer data for these different divisions is commonly held separately, so it cannot be compared.
Decentralization offers a solution for these organisations to bring together all their data sources without physically moving them all into one pot. It produces the same analytical output as stitching the datasets together, without the data privacy, trust and implementation barriers. Enabling analysis across every dataset owned by an organisation creates a truly integrated and omni-channel customer view.
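To make the "same analytical output" claim concrete, here is a small Python sketch under stated assumptions: the division names and spend figures are invented, and local_summary and combined_mean are hypothetical helpers. Each silo reports only a sum and a count, and the combined average matches the one obtained by stitching all the records together.

```python
from statistics import mean

# Hypothetical per-division spend figures; each list stays in its own system.
silos = {
    "grocery": [20.0, 35.0, 50.0],
    "banking": [120.0, 80.0],
    "insurance": [40.0],
}


def local_summary(values):
    """Computed inside each silo: only a sum and a count leave it."""
    return sum(values), len(values)


def combined_mean(summaries):
    """Combine the per-silo summaries into one overall average."""
    total = sum(s for s, _ in summaries)
    count = sum(n for _, n in summaries)
    return total / count


summaries = [local_summary(values) for values in silos.values()]
decentralized = combined_mean(summaries)

# Centralized equivalent, shown only for comparison: copy everything into one list.
stitched = [x for values in silos.values() for x in values]
centralized = mean(stitched)
```

Both routes give the same average here because sums and counts combine cleanly across silos. Analyses that do not decompose this way, such as medians or matching individual customers across divisions, would need more than this sketch shows.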
One key way that companies can compete with the level of data collected by the dominant tech giants is to collaborate. By providing strategic partners with the insights you hold, and by receiving the insights they hold, you can create not only an omni-channel view, but also a multi-party view.
Historically, the legal, commercial and security risks associated with sharing data outside of an organization have stopped this collaboration from taking place. It is therefore only through decentralized technology that companies can collaborate safely to identify new and lucrative cross-promotional possibilities.
“Any enterprise CEO really ought to be able to ask a question that involves connecting data across the organization, be able to run a company effectively, and especially to be able to respond to unexpected events. Most organizations are missing this ability to connect all the data together.”* Sir Tim Berners-Lee
*Sarkar, P. (2015). Data As A Service. p.25.