Scalable MRV data storage and transformation provenance capabilities

### Problem description
At a very high level, Guardian policy execution boils down to the following workflow:

1. get some data (from sensors or humans), publish as a VC (in IPFS)
2. do some transformations
3. record the result in a VC doc, publish (in IPFS)
4. get some more data
5. combine with previous and do some more transformations
6. record the result in a VC doc, publish (in IPFS)
7. repeat the cycle 1-6 numerous times
8. create a token (in Hedera)
9. repeat the entire cycle 1-8 until END
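
To make the shape of this workflow concrete, here is a minimal sketch of steps 1-8 as a chain of content-addressed records. It is an illustration only: the in-memory `store` dictionary stands in for IPFS, a SHA-256 hex digest stands in for a real CID, and the record fields (`kind`, `payload`, `parents`), the helper names, and the sample values are all hypothetical, not Guardian's actual VC schema or API.

```python
import hashlib
import json


def content_id(doc: dict) -> str:
    """Hex SHA-256 of a canonical JSON encoding (a stand-in for an IPFS CID)."""
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def record_step(store: dict, kind: str, payload: dict, parents: list) -> str:
    """Publish a record that links to the records it was derived from."""
    doc = {"kind": kind, "payload": payload, "parents": sorted(parents)}
    cid = content_id(doc)
    store[cid] = doc
    return cid


def lineage(store: dict, cid: str) -> list:
    """Return cid plus every ancestor id reachable through 'parents' links."""
    seen = []
    stack = [cid]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.append(current)
        stack.extend(store[current]["parents"])
    return seen


# Made-up sample values, for illustration only.
store = {}
raw_1 = record_step(store, "measurement", {"reading": 120}, [])             # step 1
step_1 = record_step(store, "transformation", {"total": 120}, [raw_1])      # steps 2-3
raw_2 = record_step(store, "measurement", {"reading": 80}, [])              # step 4
step_2 = record_step(store, "transformation", {"total": 200}, [step_1, raw_2])  # steps 5-6
token = record_step(store, "token", {"amount": 1}, [step_2])                # step 8

print(len(lineage(store, token)), "records back the token")
```

Every transformation adds at least one published document, and tracing a token back to its raw data means walking the whole chain, which is why the number of documents grows quickly with the number of transformation steps.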

The underlying technologies that Guardian uses for storage are IPFS and Hedera Topics.

IPFS works very well for documents but is not very efficient for data, in particular data which undergoes many transformations, each of which needs to be verifiably performed and recorded.

Hedera Topics have content size limitations and do not have an efficient addressing system.

For many real-world use-cases the required volume and complexity of calculations (and thus transformations) on the original MRV data is such that full automation of such workflows using existing Guardian technology will likely be very challenging if not impossible.

### Requirements
Identify and integrate with a distributed storage technology that allows Guardian to work directly with data at scale (similarly to how it would work with a relational database) while maintaining a full record of data provenance and guaranteed verifiability of policy adherence for the data processing and transformations.
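
One way to read "verifiability" here is that a third party can re-run a recorded transformation on the recorded inputs and check that the output matches. The sketch below illustrates that idea under stated assumptions: the `TRANSFORMS` registry, the `sum_readings` transformation, the record fields, and the sample rows are all hypothetical, and nothing in it reflects an existing Guardian interface or any of the storage technologies under consideration.

```python
import hashlib
import json


def digest(value) -> str:
    """Hex SHA-256 of a canonical JSON encoding of value."""
    canonical = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def sum_readings(rows: list) -> dict:
    """Hypothetical transformation: total of the 'reading' field."""
    return {"total": sum(row["reading"] for row in rows)}


# Hypothetical registry mapping a transformation name to its implementation.
TRANSFORMS = {"sum_readings": sum_readings}


def run_transform(name: str, rows: list):
    """Apply a named transformation and return (output, provenance record)."""
    output = TRANSFORMS[name](rows)
    record = {
        "transform": name,
        "input_digest": digest(rows),
        "output_digest": digest(output),
    }
    return output, record


def verify(record: dict, rows: list, output: dict) -> bool:
    """Check the digests, then re-execute and compare with the recorded output."""
    if digest(rows) != record["input_digest"]:
        return False
    if digest(output) != record["output_digest"]:
        return False
    recomputed = TRANSFORMS[record["transform"]](rows)
    return digest(recomputed) == record["output_digest"]


# Made-up sample rows, for illustration only.
rows = [{"reading": 120}, {"reading": 80}]
output, record = run_transform("sum_readings", rows)
print(verify(record, rows, output))
```

The record stores only digests, so the data itself can live in whatever storage layer is chosen; a verifier needs access to the rows and the transformation code but does not need to trust whoever ran the transformation.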

Some relevant links: 

- [LFEdge Alvarium](https://www.lfedge.org/projects/alvarium/) - building the concept of a Data Confidence Fabric (DCF) to facilitate measurable trust and confidence in data and applications spanning heterogeneous systems.
- [Content Addressable Transformers](https://medium.com/block-science/the-cats-out-of-the-bag-introducing-content-addressable-transformers-7483e61e3844) - a unified software framework that enables data and process verification and provenance as chains of evidence for retrieval and re-execution via content-addressing the means of processing (input, process, output, infrastructure-as-code).
- [ComposeDB](https://composedb.js.org) - a decentralized, composable graph database.
- [Tableland](https://tableland.xyz) - an open source, permissionless cloud database for reading and writing tamperproof data from apps, data pipelines, or EVM smart contracts.

### Definition of done

- Efficient data storage technology is integrated into Guardian
- Documentation is updated accordingly
- At least one example of complex or mass-scale data transformations relying on the new data storage technology is introduced into one of the sample policies

### Acceptance criteria

- Guardian is able to handle large volumes of data and complex transformation sequences/logic at the level of statistical analysis tools


Scalable MRV data storage and transformation provenance capabilities #2907
