On-Chain Data Containers | Ethereum Improvement Proposals

Abstract

This ERC defines a series of interfaces for the abstraction of storage of on-chain data by implementing the logic functions that govern such data on independent smart contracts. “On-chain Data Containers” (ODCs) refer to the separation and indexing of data storage away from data management. We propose that on-chain data can be abstracted and stored in smart contracts called “Data Objects” (DO), which answer to external data indexing mechanisms named “Data Points” (DP). This data can be accessed and modified by implementing (one or many) separate smart contracts identified as “Data Managers” (DM). We introduce two mechanisms for access management: first, through a “Data Index” (DI) implementation, the “Data Managers” (DM) can be gated from accessing “Data Objects” (DO); second, a “Data Point Registry” (DPR) implementation manages the issuance of “Data Points” (DP). Lastly, we introduce the concept of data portability (horizontal data mobility) between implementations of “Data Index” (DI), enabling massive updates to the logic without affecting the underlying data storage.

Motivation

As the Ethereum ecosystem grows, so does the demand for on-chain functionalities. The market encourages a desire for broader adoption through more complex systems and there is a constant need for improved efficiency. We have seen times when an explosion of new standard token proposals was solely driven by market hype. While ultimately each standard serves its purpose, most of them require more flexibility to manage interoperability with other standards. A standard adapter mechanism is needed to enhance interoperability by driving the interactions between assets issued under different ERCs.

Without such mechanisms, most projects have implemented bespoke solutions for interoperability. This is an inefficient approach and leads to a fragmented ecosystem. We recognize there is no “one size fits all” solution to solve the standardization and interoperability challenges. Most assets - Fungible, Non-Fungible, Digital Twins, Real-world Assets, DePin, etc - have multiple mechanisms for representing them as on-chain tokens through the use of different standard interfaces and the diversity of standards spurs innovation.

However, for these assets to be exchanged, traded, or otherwise interacted with, protocols must implement compatibility with the relevant interfaces to access and modify on-chain data. This is especially challenging when considering the previously mentioned bespoke solutions for interoperability. Additionally, the immutability of smart contracts complicates the ability of already deployed protocols to adapt to new tokenization standards, which is critical for future-proofing implementations. A collaborative effort must be made to enable interaction between assets tokenized under different standards. The current ERC provides the tools for developing such on-chain adapters.

We aim to abstract the on-chain data handling from the logical implementation, exposing the underlying data independently of the ERC interface. We propose a series of interfaces for storing and accessing data on-chain in contracts called “Data Objects” (DO), grouping the underlying assets as generic “Data Points” (DP) that may be associated with multiple interoperable and even concurrent “Data Manager” (DM) contracts. This proposal is designed to work by coexisting with previous and future token standards, providing a flexible, efficient, and coherent mechanism to manage asset interoperability.

Data Abstraction: We propose a standardized interface for enabling developers to separate the data storage code from the underlying token utility logic, reducing the need for supporting and implementing multiple inherited -and often clashing- interfaces to achieve asset compatibility. The data (and therefore the assets) can be stored independently of the logic that governs such data.
Standard Neutrality: A neutral approach must enable the underlying data of any tokenized asset to transition seamlessly between different token standards. This will significantly improve interoperability among other standards, reducing fragmentation in the landscape. Our proposal aims to separate the storage of data representing an underlying asset from the standard interface used for representing the token.
Consistent Interface: A uniform interface of primitive functions abstracts the data storage from the use case, irrespective of the underlying token’s standard or the interface exposing such data. Data and metadata can be stored on-chain, and exposed through the same primitives.
Data Portability: We provide a mechanism for the Horizontal Mobility of data between implementations of this standard, incentivizing the implementation of interoperable solutions and standard adapters.

Specification

Terms

Data Point: A uniquely identifiable reference to an on-chain data structure stored within one or many Data Objects and managed by one or many Data Managers. Data Points are issued by a Data Point Registry.

Data Object: A Smart Contract implementing the low-level storage management of information indexed through Data Points.

Data Manager: One or many Smart Contracts implementing the high-level logic and end-user interfaces for managing Data Objects.

Data Point Registry: One or many Smart Contracts used for managing the issuance of Data Points. Additionally, a Data Point Registry defines a space of compatible or interoperable Data Points.

Data Index: One or many Smart Contracts used for managing the access of Data Managers to Data Objects.

The keywords “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “NOT RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119 and RFC 8174.

Data Point Structure

Data Point MUST be bytes32 storage units.
Data Point SHOULD NOT be used for storing asset data.
Data Point SHOULD be used for indexing data.
Data Point SHOULD use a 4 bytes prefix for storing information relevant to the compatibility with other Data Points.
Data Point MUST use 4 bytes for storing the Chain ID
Data Point SHOULD use the last 20 bytes for identifying which Registry allocated them.
The RECOMMENDED internal structure of the Data Point is as follows:

/**
 * RECOMMENDED internal DataPoint structure on the Reference Implementation:
 * 0xPPPPVVRRIIIIIIIIHHHHHHHHAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
 * - Prefix (bytes4)
 * -- PPPP - Type prefix (i.e. 0x4450 - ASCII representation of letters "DP")
 * -- VV   - Version of DataPoint specification (i.e. 0x00 for the reference implementation)
 * -- RR   - Reserved
 * - Registry-local identifier
 * -- IIIIIIII - 32 bit implementation-specific id of the DataPoint
 * - Chain ID (bytes4)
 * -- HHHHHHHH - 32 bit of chain identifier
 * - REGISTRY Address (bytes20)
 * -- AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA - Address of Registry which allocated the DataPoint
**/

Data Points are a low-level structure abstracting and indexing information. Data Points act as pointers to information stored within data structures in Data Objects. Data Points are allocated by a Data Point Registry. The Data Point SHOULD store which Data Point Registry initialized it within its internal structure. Each Data Point SHOULD have a unique identifier provided by the Data Point Registry when instantiated.

Data Object Interface

Data Object SHOULD be used for storing asset data.
Data Object SHOULD use Data Points for indexing the structure storing the data.
Data Object SHOULD implement the logic directly related to handling the data structure.
Data Object SHOULD implement the logic for transferring management of its Data Points to a different Data Index implementation.
Data Object MUST use the IDataObject interface:

interface IDataObject {
    /**
     * @notice Reads stored data
     * @param dp Identifier of the DataPoint
     * @param operation Read operation to run on the data
     * @param data Operation-specific data
     * @return Operation-specific data
     */
    function read(bytes32 dp, bytes4 operation, bytes calldata data) external view returns(bytes memory);

    /**
     * @notice Store data
     * @param dp Identifier of the DataPoint
     * @param operation Write operation to execute on the data
     * @param data Operation-specific data
     * @return Operation-specific data (can be empty)
     */
    function write(bytes32 dp, bytes4 operation, bytes calldata data) external returns(bytes memory);

    /**
     * @notice Sets DataIndex Implementation
     * @param dp Identifier of the DataPoint
     * @param newImpl address of the new DataIndex implementation
     */
    function setDIImplementation(bytes32 dp, address newImpl) external;
}

Data Objects are entrusted with the storage and management of data. Data Objects SHOULD implement the logic for managing the storage of on-chain data. Data Object internal data structure SHOULD use Data Points for indexing information.

Data Objects CAN receive read(), write(), or any other custom requests from a Data Manager requesting access to a storage structure indexed by a Data Point.

As such, Data Objects respond to a gating mechanism given by a single Data Index. The function setDIImplementation() SHOULD enable the delegation of the management function to an IDataIndex implementation.

Data Manager Contract

Data Manager SHOULD use IDataIndex.read() or IDataObject.read() to read data from Data Objects
Data Manager MUST use IDataIndex.write() to write data to Data Objects
Data Manager MAY share Data Point with other Data Managers
Data Manager MAY use multiple Data Points
Data Manager SHOULD implement the logic for requesting Data Points from a Data Point Registry.

Data Managers are independent smart contracts that implement the business logic or “high-level” data management. They can read() from a Data Object address and write() through a Data Index implementation managing the delegated storage of the Data Points.

Data Point Registry Interface

Data Point Registry SHOULD define functions to manage the creation, transfer, and access control of Data Points
Data Point Registry SHOULD manage Data Point access management for Data Managers
Data Point Registry MUST use the IDataPointRegistry interface:

interface IDataPointRegistry {

  /**
     * @notice Verifies if an address has an Admin role for a DataPoint
     * @param dp DataPoint
     * @param account Account to verify
     */
    function isAdmin(bytes32 dp, address account) external view returns (bool);

    /**
     * @notice Allocates a DataPoint to an owner
     * @param owner Owner of the new DataPoint
     * @dev Owner should be granted Admin role during allocation
     */
    function allocate(address owner) external payable returns (DataPoint);

    /**
     * @notice Transfers a DataPoint to an owner
     * @param dp Data Point to be transferred
     * @param owner Owner of the new DataPoint
     */
    function transferOwnership(bytes32 dp, address newOwner) external;

    /**
     * @notice Grant permission to grant/revoke other roles on the DataPoint inside a Data Index Implementation
     * This is useful if DataManagers are deployed during lifecycle of the application.
     * @param dp DataPoint
     * @param account New admin
     * @return If the role was granted (otherwise account already had the role)
     */
    function grantAdminRole(bytes32 dp, address account) external returns (bool);

    /**
     * @notice Revoke permission to grant/revoke other roles on the DataPoint inside a Data Index Implementation
     * @param dp DataPoint
     * @param account Old admin
     * @dev If an owner revokes Admin role from himself, he can add it again
     * @return If the role was revoked (otherwise account didn't have the role)
     */
    function revokeAdminRole(bytes32 dp, address account) external returns (bool);
}

The Data Point Registry is a smart contract entrusted with Data Point access control. Data Managers may request the allocation of Data Points to the Data Point Registry.

Data Index Interface

DataIndex SHOULD manage the access of Data Managers to Data Objects.
DataIndex SHOULD manage internal IDs for each user.
DataIndex MUST use the IDataIndex interface:

interface IDataIndex {
    /**
     * @notice Verifies if DataManager is allowed to write specific DataPoint on specific DataObject
     * @param dp Identifier of the DataPoint
     * @param dm Address of DataManager
     * @return if write access is allowed
     */
    function isApprovedDataManager(bytes32 dp, address dm) external view returns(bool);

    /**
     * @notice Defines if DataManager is allowed to write specific DataPoint
     * @param dp Identifier of the DataPoint
     * @param dm Address of DataManager
     * @param approved if DataManager should be approved for the DataPoint
     * @dev Function should be restricted to DataPoint maintainer only
     */
    function allowDataManager(bytes32 dp, address dm, bool approved) external;

    /**
     * @notice Reads stored data
     * @param dobj Identifier of DataObject
     * @param dp Identifier of the datapoint
     * @param operation Read operation to execute on the data
     * @param data Operation-specific data
     * @return Operation-specific data
     */
    function read(address dobj, bytes32 dp, bytes4 operation, bytes calldata data) external view returns(bytes memory);

    /**
     * @notice Store data
     * @param dobj Identifier of DataObject
     * @param dp Identifier of the datapoint
     * @param operation Read operation to execute on the data
     * @param data Operation-specific data
     * @return Operation-specific data (can be empty)
     * @dev Function should be restricted to allowed DMs only
     */
    function write(address dobj, bytes32 dp, bytes4 operation, bytes calldata data) external returns(bytes memory);
}

The Data Index is a smart contract entrusted with access control. It is a gating mechanism for Data Managers to access Data Objects. If a Data Manager intends to access a Data Point (either by read(), write(), or any other method), the Data Index SHOULD be used for validating access to the data.

The mechanism for ID management determines a space of compatibility between implementations.

Rationale

The decision to encode Data Points as bytes32 data pointers is primarily driven by flexibility and future-proofing. The use bytes32 allows for a wide range of data encodings. This provides the developer with many options to accommodate diverse use cases. Furthermore, as Ethereum and its standards continue to evolve, encoding as bytes32 ensures that the Standard Adapters built with the current ERC can reference future data types or structures without requiring significant changes to the adapter itself. The Data Point encoding should have a prefix so that the Data Object can efficiently identify compatibility issues when accessing the data storage. Additionally, the prefix should be used to find the Data Point Registry and verify admin access to the Data Point. The use of a suffix for identifying the Data Point Registry is also required, for the Data Object to quickly discard badly formed transactions that aim to use a Data Point from an unmatching Data Point Registry.

Data Manager implementations decide which Data Points they will be using. Their allocation is managed through a Data Point Registry, and the write() access to the Data Point is managed by passing through the Data Index implementation.

Data Objects are independent separate Smart Contracts that implement the same read/write interface for communicating with Data Managers. This is a decision mainly driven by the scalability of the system. Offering a simple interface for this layered structure enables different applications to have their addresses for storage of data independently from the asset’s interface. It is up to each implementation to manage access to their Data Point storage space. This enables a wide array of complex, dynamic, and interactive use cases to be implemented with multiple ERCs as well as other smart contracts, including the embedding of cross-chain and rollup logic for asset management.

Data Objects offer flexibility in storing mutable on-chain data that can be modified as per the requirements of each specific use case. This enables the Data Managers to hold mutable states in delegated storage and reflect changes over time, providing a dynamic layer to the otherwise static nature of storage through most other standardized interfaces.

Data Managers can decide to migrate the complete storage management of a Data Object from one Data Index implementation to another. As the Data Index implements user ID management, this mechanism enables a Data Manager to upgrade its internal access control mechanisms without affecting the underlying storage of value, as it is located elsewere (in the Data Object). We call this mechanism “Horizontal Data Mobility/Portability”.

Backwards Compatibility

This ERC is intended to augment the functionality of existing token standards without introducing breaking changes. As such, it does not present any backward compatibility issues. Already deployed tokens under other ERCs can be wrapped and indexed as Data Points and managed by Data Objects, and later exposed through any implementation of Data Managers. All interoperability integrations will require a compatibility analysis, depending on the use case. However, the interfaces defined in this ERC define a framework for adapting one standard to another through storage abstraction.

Reference Implementation

We present an educational example implementation of a Standard Adapter showcasing two types of tokens (Fungible and Semi-Fungible) with shared data storage. The end user can interact with this implementation through one of two types of interfaces: The semi-fungible one (represented through a single Data Manager), and the fungible interface (represented by many Data Managers). The abstraction of the storage from the logic is achieved through the use of a single Fungible Fractions Data Object. A factory is used for deploying the Fungible token interfaces that share storage with each semi-fungible collection. Note that if a transfer() is called by either interface (Fungible or Semi-Fungible), both interfaces are emitting an event.

This example has not been audited and should not be used in production environments.

See contracts

Security Considerations

The access control is separated into three layers:

Layer 1: The Data Point Registry allocates for Data Managers and manages ownership (admin/write rights) of Data Points.
Layer 2: The Data Index smart contract implements Access Control by managing Approvals of Data Managers to Data Points. It uses the Data Point Registry to verify who can grant/revoke this access.
Layer 3: The DataObject manages trust relationship between the DataPoint and a DataIndex implementation and allows a trusted DataIndex to execute write operations.

A common task for a Data Object is to store user-related data, while the Data Manager implements the logic for managing such data. Both Data Objects and Data Managers often require the management of user IDs. A Data Index can offer logic for user management (i.e. IDs based on address) that are independent of any particular implementation, but care must be taken in selecting these identifiers. If chosen improperly, they could hinder a Data Manager’s ability to migrate between different Data Indexes.

No further security considerations are derived specifically from this ERC.