Metadata is usually described as “data about the data”, but it is also data about everything else that is relevant to data - such as the actors, actions, definitions, groups and assertions that reference or interact with the data over time.
In the MDMP Project, metadata is treated in the same way as data: it is owned by the party or parties involved in its creation, and access to it is controlled by the same rule sets and permissioning models used to control access to underlying data.
Although the details vary by jurisdiction, data ownership confers a set of rights on you as an individual or organisation as to how your data is stored, accessed and used by third parties. Such rights apply whether the data in question is incidental to the main product or service on offer, or whether it is the product itself (e.g. in the case of publishing businesses).
Agency is the act of exercising those rights, and requires a set of tools that allows the data owner to control the distribution and use of their data. This toolkit extends beyond the law itself, and must include, for example, a mechanism that enables data owners to engage with the way that their data is being managed on their behalf.
This is the core tenet of the “personal information management solutions” (PIMS) that aggregate data, and access to data, as a fiduciary or self-sovereign agent, providing individuals with an interface through which to exercise control over their own data.
MDMP provides a common metadata management layer that unlocks the potential for PIMS to compete as a standalone value proposition, separate from the services which collate, produce and use the data itself.
A data space or shared data layer refers to the data considered in scope for a data sharing network, usually comprised of service providers that both store and use customer data. In most cases, a data space is permissioned (i.e. data is only accessible by those network participants that are authorised to access it), although blockchain technology has introduced several examples of unpermissioned data layers (i.e. all participants can access all the data).
To date, data spaces and shared data layers have been confined to the data fields agreed by participants in the data sharing network as being required for the use case in question. For example, in a payments network, the shared data layer comprises the fields that need to be exchanged to execute a payment. Similarly, participants in an identity scheme may agree to share some attributes about the claimant, as well as confirming their identity.
The MDMP Project enables access to the metadata that is so relevant to data sharing because it is used to communicate the permissioning models that govern data access; to map the complex and evolving relationships between different entities; and to provide an audit log of activity to support risk transfer and liability management.
Permissioning refers to the rule sets that govern access to (and the use of) privately owned data. Permissioning models can range from the simple (e.g. bilateral consent between the customer and an application to use data for the purposes stated) to the more complex (e.g. delegated authorities within a hierarchical or organisational context; tiered or multi-party authorisations; adaptive authentication, based on circumstances; etc.).
Permissioning models restrict access to authorised users, and organisations may use one of several different approaches to define them, including Role-based Access Controls (RBAC), Attribute-based Access Controls (ABAC) and Delegated Access Controls (DAC).
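As an illustration only (the role and attribute names below are assumptions, not part of MDMP), a permissioning check might layer an attribute-based refinement on top of a role-based gate:

```python
# Hypothetical sketch combining RBAC and ABAC in one permission check.
# Role names, actions and attributes are illustrative, not part of MDMP.
ROLE_PERMISSIONS = {
    "account_manager": {"read_statement", "update_contact"},
    "auditor": {"read_statement"},
}

def is_permitted(role, action, subject_attrs, resource_attrs):
    """RBAC gate first, then an ABAC refinement on attributes."""
    if action not in ROLE_PERMISSIONS.get(role, set()):
        return False  # the role does not carry this permission at all
    # ABAC refinement: e.g. only permit access within the subject's region
    return subject_attrs.get("region") == resource_attrs.get("region")
```

Under this sketch an auditor can read a statement in their own region, but cannot update contact details anywhere; delegated access (DAC) would add a further mapping from delegate to delegator on top of the same check.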
MDMP and the network of MSPs provide a basis for managing metadata explicitly, allowing more fine-grained, flexible and powerful permissioning models to be defined.
Around the world, data protection laws have lagged behind advances in technology, but they are now catching up.
New data protection legislation has been enacted in several jurisdictions including the EU (General Data Protection Regulation, 2016), South Africa (Protection of Personal Information Act, 2013) and Singapore (Personal Data Protection Act, 2012).
Other jurisdictions are considering similar changes, including Australia, where the recent Productivity Commission report recommended the implementation of a Data Sharing and Release Act.
These changes are having a profound impact on the way that service providers implement personal privacy rights, record the basis on which they are using data (e.g. customer consent), decide what data to store themselves and enable data portability.
Personal privacy rights sit at the centre of the new legislation. For example, the implementation of Personal Privacy Rights under GDPR gives data subjects the right to: access their data; rectification; erasure (the ‘right to be forgotten’); restriction of processing; data portability; and objection to processing, including automated decision-making.
Under GDPR, a transactional log of activity is also essential, as obtaining and recording a legal basis for data processing (e.g. customer consent) becomes critical: it must be verifiable, auditable and able to be withdrawn.
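A minimal sketch of such a log (illustrative Python only, not a Factern or MDMP API) shows how a recorded legal basis can be kept verifiable, auditable and withdrawable:

```python
import datetime
import uuid

# Illustrative consent log: each entry records a legal basis for
# processing and remains in the log (auditable) after withdrawal.
class ConsentLog:
    def __init__(self):
        self._entries = {}  # consent_id -> entry dict

    def record(self, subject, controller, purpose, legal_basis):
        consent_id = str(uuid.uuid4())
        self._entries[consent_id] = {
            "subject": subject,
            "controller": controller,
            "purpose": purpose,
            "legal_basis": legal_basis,
            "granted_at": datetime.datetime.now(datetime.timezone.utc),
            "withdrawn_at": None,
        }
        return consent_id

    def withdraw(self, consent_id):
        now = datetime.datetime.now(datetime.timezone.utc)
        self._entries[consent_id]["withdrawn_at"] = now

    def is_active(self, consent_id):
        return self._entries[consent_id]["withdrawn_at"] is None
```

The entry is never deleted on withdrawal, so the full history of grants and withdrawals remains available for audit.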
Service providers that act as data controllers under the data protection legislation will require a metadata management solution to facilitate the implementation of the personal privacy rights of their customers.
Several governments have implemented national schemes that give citizens better access to services and entitlements.
Banking and financial services have, in particular, been the subject of market intervention by regulators.
Other industry regulators have recognised the power of permissioned data sharing as a mechanism for creating 'open' markets.
The network operators that deliver the data sharing introduced by regulators require a metadata management solution to govern access to data under each scheme.
The collection, verification and refresh of static data (e.g. for onboarding) is a pain point for both customers and service providers, particularly for those service providers in regulated industries where KYC standards are high.
Although the problems (turnaround times, high manual effort, increased hand-offs, redundancy at different steps, lack of seamless automation, refresh of book, etc.) are often shared, the costs of processing are usually duplicated across service providers.
These issues are not exclusive to any particular industry and are exacerbated by the proliferation of APIs: an aggregated cross-sector network would not only enhance the client experience but mutualise costs across a broad base of service providers.
An open source framework for the management of data spaces, widely accessible and independently governed, is needed to solve for the issues of data collation, verification, refresh and control that affect all service providers.
These issues should be separated from issues associated with the processing of that data (e.g. to support a decision to onboard a customer), which are best supported by a market for micro-services rather than a vertically integrated solution.
Vertically integrated solutions attempt to solve for all steps along the value chain (i.e. from data collation through to decision) but involve a high cost of convergence, as all users must adopt common standards for every aspect, despite different starting points and priorities.
Metadata management represents the thinnest common layer that solves for the issues of data collation, verification, refresh and control.
Consumers have already had a taste of the convenience, speed and connectivity that control over their own data can bring, whether using a price comparison website, paying by PayPal or "logging in" with LinkedIn.
Similarly, the citizens of many countries benefit from national digital identity schemes (such as SingPass in Singapore, Aadhaar in India or ID Card in Estonia) that allow access to government services (and more) and strip out bureaucracy between them.
And both individuals and businesses are happy to use price comparison sites or traditional brokers to share data on their behalf, as they look to secure the best available finance or to upgrade their insurance policies.
Economic value flows quickly to those platforms or networks that make convenience, speed and connectivity useful in a particular context, and the vast amounts of metadata collected can be used to develop ever more valuable services.
Obvious examples include the 'killer apps' that dominate their sectors by giving customers the ability to control how their data is shared across a wide network of service providers or other users, such as those in social media (e.g. Facebook), commerce (e.g. Amazon), transport (e.g. Uber) and web-search (e.g. Google).
This platform-based business model has been successfully copied in business-to-business environments, where players such as Xero (small business accounting software) and Salesforce (enterprise CRM solutions) have made it easy for third party developers to develop and integrate their own services on their platform.
Metadata management is central to delivering the convenience, speed and connectivity that consumers demand.
The MDMP Project is a collaboration between primary service providers (who store, use, manage and validate data either on their own behalf or on behalf of others), infrastructure providers (who play independent roles that foster connectivity and trust) and professional service providers (who foster change and transparency in the system).
The basis for this collaboration is a peer-to-peer network of Metadata Service Providers (MSPs), each of whom executes code that complies with the Metadata Meta Protocol (MDMP) and hosts a ‘shard’ of the Fact Table, either for themselves or on behalf of third party users.
Each ‘shard’ of the Fact Table stores its metadata as a simple table of UUIDs, and is addressable to every other ‘shard’. This means that users can access the entire ledger via any of the MSPs in the peer-to-peer network.
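The exact schema of the Fact Table is not reproduced here, but the ‘table of UUIDs’ idea can be sketched as an append-only structure in which every new fact references a previously created one (the field names below are illustrative assumptions):

```python
import uuid

# Illustrative 'shard': an append-only table keyed by UUID in which
# each new fact may only reference a fact that already exists.
class FactTableShard:
    def __init__(self):
        self._facts = {}  # fact_id -> (parent_id, payload)

    def append(self, parent_id, payload):
        if parent_id is not None and parent_id not in self._facts:
            raise ValueError("a fact may only reference an existing fact")
        fact_id = str(uuid.uuid4())
        self._facts[fact_id] = (parent_id, payload)
        return fact_id
```

In a full network, a UUID that does not resolve locally would be looked up on other shards via the peer-to-peer addressing described above.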
Users are the primary service providers seeking to access or reference data (e.g. about their own customers) that is stored by other primary service providers participating in the Project.
Private data is stored in isolated silos. It is securely locked away in multiple locations, by multiple parties, for multiple purposes. Such fragmentation results in frictional costs for the service providers that store the data. Effort is duplicated whenever the same data is captured twice, and multiple copies of the data held in isolation quickly go stale.
Similarly, the data owner (end user) lacks agency and the means of control: they must always step in to intermediate data sharing (for example, printing off a statement to take from one service provider to another), and have no way to manage their digital profile as a whole.
This fragmentation also creates greater opportunities for bad actors to commit fraud, identity theft and financial crime by exploiting weak links in a system comprised of service providers that act almost entirely independently of one another.
The MDMP Project provides a way, through the explicit management of metadata by a network of infrastructure service providers, for data in one silo to refer to data in another silo, reducing the duplication of effort, increasing the agency of data owners and fostering network effects that can be used to combat bad actors.
The explicit management of metadata has not been a priority for service providers whose applications secure customer data by heavily restricting access to the customers themselves (e.g. via account portals) or to authorised employees.
And yet whenever metadata is ignored, thrown away, or made inaccessible, a massive amount of value is lost. Metadata management represents the thinnest universal layer that solves for the common issues of data collation, verification, refresh and control.
The phenomenal success of platform-based business models has brought the relevance of explicit metadata management – and the hugely valuable network effects it engenders – into sharp focus. Value has quickly flowed to the applications that control access to the most – or the most valuable – data, and the platform-based business model has been copied by many institutions in many industries.
But metadata management must be implemented as a common protocol (like HTTP or TCP/IP), as this avoids undesirable concentration of control and authority into a single entity, whether corporate, government, or regulator. The MDMP Project was therefore set up to foster the development and adoption of a universal protocol for metadata management – the Metadata Meta Protocol.
There are so many potential use cases for the Metadata Meta Protocol that it would be impossible to list them all. Any use case which involves the exchange of data between two or more systems or applications is a potential candidate. Some of the most prominent use cases are listed on the main website here, and our favourites include:
We are excited at the prospect of developers and business innovators coming up with a myriad of use cases based on these and others. Whether the problem is helping to start up a business, win a new contract, manage cash flow, sign up to a supply chain, purchase an asset, deal with different areas of a big organisation (such as the NHS), plan for retirement, apply for a new position, enter countries as an immigrant, reclimb the credit ladder, hire accommodation, change roles, or manage marketing preferences, the solution benefits from the use of explicit metadata management.
The Project does not centralise data, but maintains a distributed data storage model. This means that data can continue to be stored securely in multiple locations but, as noted above, the Project provides a way for data in one silo to refer to data in another silo.
Similarly, metadata is stored on a distributed basis. Each ‘shard’ of the Fact Table is hosted by a separate Metadata Service Provider, and MSPs compete for the business of individual users. The metadata itself therefore cannot be controlled by a single entity.
Authentication of the data owner is left to applications that participate in the MDMP ecosystem, as is the functional scope of the agency they provide and the transmission mechanisms that they use for data sharing. This allows standards to develop that balance the trade-offs involved in each particular use case, and for ‘kite marks’ to emerge that establish the trust needed to broaden usage.
Competition avoids the risk of reliance on a monolithic standard that becomes compromised over time. MDMP and the network of MSPs provides a consistent and persistent layer on top of which solutions that balance risk and reward can adapt and evolve as new security threats are identified.
All users (i.e. primary service providers) are known to MSPs that serve them, and MSPs can choose to form their own trust networks by permissioning (or not) access to their ‘shard’ of the ledger for other MSPs. The MDMP Project facilitates the creation of these trust networks, but does not require them or enforce them.
If one of the primary service providers’ applications is breached, giving a bad actor agency over another entity’s data, the data owner will be alerted either by the application’s own authentication procedures or by another application prompted to act by the changes made.
The website provides an overview of the MDMP ecosystem here. The MDMP Project welcomes interest from:
Customer-facing service providers in all industries (banking, insurance, payments, energy, telecommunications, retail, transport, media, etc.) whose service stack includes one or more of the following:
Independent players who provide the political, legal, operational and technical infrastructure required to build connectivity and trust on behalf of networks, hubs, schemes, platforms and markets.
The team behind the MDMP Project brings together experienced entrepreneurs from technology, financial services and information management. We have decades of experience and a track record of founding, funding, scaling and exiting businesses.
Some of our companies include AgFe (an asset manager with ~£2BN AUM, specialising in illiquid debt), OB10 (cloud-based global e-invoicing network sold to Tungsten Corporation in 2013 for $150 million), Intralinks (the global technology provider of inter-enterprise content management and collaboration solutions), pH Group (developer of the UK’s largest business database, sold to Experian in 2012), Aimetis (a leader in intelligent video technology with 10K+ deployments globally in banking, government, transportation, military and retail), and Datahog (distributed database management information system sold to Farms.com).
We now work together through Factern. Factern’s first incarnation was as the Business Data Initiative, or BDI. During late 2014 and much of 2015, a team drawn from Santander, Experian, Oliver Wyman, KPMG and AgFe collaborated to address the issue that lenders face in getting low cost access to high quality, private information on UK small businesses.
The difficulty of accessing such information – much more than any perceived lack of funding or capital – was contributing to the problems experienced in the UK small business financing market. A technology was needed to put the small businesses in control of their own data, and let them authorise access to potential lenders. The BDI team looked around, couldn’t find such a technology, and so developed a prototype.
Factern Ltd. was formed as a commercial venture in November 2015 by Oliver Wyman and AgFe, with Santander becoming a third shareholder in May 2016. All three institutions are active shareholders, acting as cornerstone users of Factern and enabling access to their own specialist expertise and global networks to support Factern’s outreach programme.
The team has met with well over 150 institutions in financial services and beyond, in the UK, Europe, North America and Asia. With over two years of design, testing and build, both Factern and MDMP have been extensively revised, pressure tested, and benchmarked against pre-existing market offerings.
For all enquiries, please contact us directly using the contact page on our website, or via our social media pages.
Today’s technology giants have been able to drive a massive convergence of standards. Operating on a global scale, the likes of Facebook, Amazon, Uber and Google dominate their sectors. Their business models depend on the collection of vast amounts of data via an initial ‘killer application’. They then use this data to develop a wider range of services, offering greater convenience to the customer, and thereby capturing a bigger set of data.
But platform-based models are distorting market dynamics: it is clearly undesirable for a few institutions to control access to ever more data, regardless of the convenience and efficiency they offer. Even as competitors around the world attempt to copy their success, legislators, regulators and competition authorities are looking to rebalance the market.
As importantly, consumers – increasingly aware of the importance of their privacy rights – are turning away from service providers that are not transparent about the way they use their customers’ data, and are seeking out the tools that give them greater agency over data. Customers and data owners must be able to exercise their ownership, privacy and portability rights consistently across all applications and service providers.
This is what data sharing networks achieve, by creating a permissioned, shared data layer. But networks can be hard, expensive and time consuming to implement, and all of them operate within a specific context.
In contrast, a public protocol supports the development of permissioning models applicable in any group, organisation or situation, and allows for the creation of shared data layers that can be easily extended and connected with each other rather than remaining context specific.
Several public models have been developed as a way of consistently specifying metadata. These include RDF, Microdata and JSON-LD. While these metadata models are relatively prevalent, they are closely tied to the Semantic Web. They power rich and extensible browsing experiences, but are rarely applied across other technologies.
Furthermore, they fall short of what is required:
As a result, there is no public infrastructure for metadata management that can be universally applied. Instead, private platforms dominate the space, exploiting network effects for proprietary gain.
The User Managed Access (UMA) protocol is a “constrained delegation” spec that aims to enable a resource owner to control the authorization of data sharing and other protected-resource access made between online services on the owner’s behalf or with the owner’s authorization by an autonomous requesting party. UMA extends OAuth by splitting applications that store customer data from a singular entity into a central authorisation server and a resource server.
UMA (with its focus on access management) is narrower in scope than the MDMP Project (a metadata management layer), and MDMP benefits from consolidating all the responsibilities with the Metadata Service Providers (i.e. auditing, watches, tags, metadata, optionally data, routing, scopes, templates, authorisation, authentication). This makes MDMP easier for users to adopt, use and deploy, and makes consistency easier to guarantee over time.
The Metadata Meta Protocol has also been architected from the ground up for B2B use, which is more complicated than the B2C cases that UMA was designed for. A great deal of consideration has gone into the columns and structure of the core metadata table, including the very powerful concept of "representing", or acting “on behalf of”, another party.
The MDMP Project also has several capability advantages over UMA. MSPs capture metadata in the Fact Table and therefore provide audit logging. MSP APIs will offer more flexible data routing solutions, and the Fact Table will lead to an integrated data definition via metadata (templates, schemas, fields, etc.) that helps data mutualisation.
UMA doesn’t have watches, tags or the ability to deal with long asynchronous processing. All of these are available via MSP APIs, like that provided already by Factern. The structure of the Fact Table also encourages good practices around data structure (all UUIDs reference a previously created UUID), and therefore offers better building blocks from which to architect a data store.
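The ‘watch’ concept can be illustrated with a small sketch (hypothetical names, not Factern’s actual API): a subscriber registers interest in a node and is notified whenever a new fact references it:

```python
# Illustrative watch mechanism: callbacks registered against a node
# fire whenever a new fact referencing that node is created.
class WatchedStore:
    def __init__(self):
        self._facts = {}    # fact_id -> referenced node id
        self._watches = {}  # node_id -> list of callbacks

    def watch(self, node_id, callback):
        self._watches.setdefault(node_id, []).append(callback)

    def add_fact(self, fact_id, node_id):
        self._facts[fact_id] = node_id
        # in a real deployment, delivery would be asynchronous
        for cb in self._watches.get(node_id, []):
            cb(fact_id)
```

Tags and long-running asynchronous processes follow the same pattern: metadata records against which interested parties can register and be notified.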
Whether driven by regulators or commercial gain, data sharing schemes such as Open Banking have already established the central infrastructure to support collaboration between network participants.
Network operators of all kinds build connectivity and trust through a combination of:
Metadata management is functionally distinct from – but complementary to – these roles. To the extent that metadata management is explicitly defined for each network (e.g. as a protocol), it is narrowly focused on establishing communication standards to describe the data in scope, and how it can be safely accessed, within the context of that specific network.
The Metadata Meta Protocol establishes a common communication standard for describing all the relevant components of a shared data layer, such that they can be combined to express any set of rules governing access to different data resources.
This allows for the creation of shared data layers that can be easily extended and connected with each other rather than remaining specific to the hub, scheme, network, platform or market that developed them. For example, this allows Open Banking to reference data governed under PSD2 or GDPR, and vice versa.
Solutions wired to blockchains suffer from scalability and availability issues. MDMP is a technology-agnostic protocol, and can be implemented on top of either a blockchain or more traditional databases.
Even if an implementation of MDMP is not backed by a blockchain, hashes of the data can intermittently be recorded in a global public blockchain. This leads to a tamper-detection test: does the hash of the first N rows match the hash that was recorded in the blockchain?
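The tamper-detection test can be sketched as follows (the row encoding and function names are assumptions for illustration):

```python
import hashlib

def rows_hash(rows):
    # Fold an ordered sequence of row encodings into a single digest.
    h = hashlib.sha256()
    for row in rows:
        h.update(row.encode("utf-8"))
    return h.hexdigest()

def checkpoint_matches(rows, n, recorded_hash):
    """True if the first n rows still match the recorded checkpoint."""
    return rows_hash(rows[:n]) == recorded_hash
```

Appending new rows after the checkpoint does not disturb the test, but if any of the first n rows is later altered, the recomputed digest no longer matches the hash recorded in the public blockchain.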
The key difference between the MDMP Project and the use cases that draw on it is that the Project takes a metadata first approach. The MDMP metadata fabric unlocks the network effect latent in data, and is therefore a core piece of infrastructure on top of which solutions - such as identity proofing - can be built. However, the network of MSPs that implement MDMP do not provide the context specific applications or user interfaces for such solutions.