The Filecoin Virtual Machine unlocks boundless possibilities for innovation on the Filecoin network. Here are many ideas for systems, apps, and building blocks we’d like to see built. We are counting on you to capture these opportunities, and turn these ideas into reality!
- Undercollateralized Lending
- Perpetual Storage
- Programmable Storage Markets
- Storage Automation (Replication and Repair)
- Liquid Staking
- Reputation and QoS systems
- Storage Onramps [through DataDAOs]
- Pay-Per-View [through DataDAOs]
- Games [through DataDAOs]
- Social [through DataDAOs]
- Decentralized Science [through DataDAOs]
- Decentralized Compute
- Trustless FIL+ Notaries
- KYC and claims attestation
- Decentralized Data Aggregator
- Insurance for Storage Providers
- Access Control
- User-Friendly Names
- Blockchain Nuts & Bolts
- DEXes & Exchanges
- Overcollateralized Lending
- Token Bridges
- General purpose cross-chain bridges
- Price Oracles
- Retrievability Oracles
Version: v0.7 Date: 2023-01-26
Undercollateralized Lending

Storage providers (SPs) have to post collateral (in FIL) to onboard storage capacity to the network and to accept storage deals. While important for security, the need to pledge collateral creates friction and an immediate barrier that limits SP participation. Furthermore, the mining process requires large capital investments in hardware.
The Filecoin network has a large base of long-term token holders who would like to see the network grow and are willing to lend their FIL to reputable, growth-oriented SPs. In a pre-FVM world, a number of lending partners have stepped up (Darma, Anchorage, Coinlist) to facilitate this flow of capital. However, these partners are not able to service all SPs in the market, nor are they able to work with all token holders (or protocols) that might want access to an inflation-indexed form of Filecoin.
We see lending as a core yield lego for the FVM ecosystem. This means we can see it being used for yield aggregation, perpetual storage, liquid staking, and a lot more. Getting lending right is key to kickstarting the FVM ecosystem, especially in the current yield-starved macro environment.
Storage providers can borrow collateral from lenders and the smart contract will lock the future income (block rewards) until the storage providers have repaid their loan.
Underwriting is important here, since borrowers can default on their loans.
Initial thoughts for what this could look like:
- Permissioned lending pools that have a predefined list of lenders and borrowers (totally permissioned, underwriting done off-chain)
- One-sided permissionless lending pools (akin to Maple on Ethereum / Solana) that have a predefined list of borrowers, but anybody can lend
- Semi-permissioned lending pools where anybody can lend, but borrowers can automatically join as long as they have a list of defined criteria. For example:
- Preferential lending deals to Silver / Gold tiered SPs
- Lending only to SPs on this registry with a reputation score above 95
- Lending network for SPs to deposit collateral to retrievability oracles
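To make the semi-permissioned variant concrete, here is a minimal Python sketch of the admission check and the block-reward lockup described above. The tier names, the reputation threshold, and the `apply_block_reward` split are illustrative assumptions, not part of any existing protocol:

```python
from dataclasses import dataclass

@dataclass
class BorrowerProfile:
    tier: str            # e.g. "Silver" / "Gold" from an SP tiering program
    reputation: int      # score from an L2 reputation system

def is_eligible(profile: BorrowerProfile, min_reputation: int = 95) -> bool:
    """Admission check for a semi-permissioned pool: anyone meeting the
    predefined criteria can borrow without manual approval."""
    return profile.tier in ("Silver", "Gold") and profile.reputation >= min_reputation

def apply_block_reward(outstanding: float, reward: float) -> tuple[float, float]:
    """Route one epoch's block reward toward loan repayment.

    Returns (new_outstanding, amount_released_to_sp): rewards stay locked
    until the loan is repaid; any surplus flows back to the SP.
    """
    repayment = min(outstanding, reward)
    return outstanding - repayment, reward - repayment
```

The key property is that an SP's rewards flow to lenders first; only once the outstanding balance reaches zero does income return to the SP.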
Perpetual Storage

We need to enable perpetual storage of data with a certain number of redundancies. This is important because it is a core use case that comes up again and again from our partners and builders.
Perpetual storage (or in the special case, permanent storage) allows clients to automate renewal of their deals in perpetuity. In many cases, clients want to be able to simply specify terms for how data should be stored (e.g. ensure there are always at least 10 copies of my data on the network) — without having to run infrastructure to manage repair or renewal of deals.
Note that because of Filecoin’s proofs, we can create contracts that operate with substantially higher capital efficiency without sacrificing the security of a dataset.
This tweet thread shares a mental model for how one might create such a contract, along with a strategy for calculating the funds required to indefinitely fund storage via DeFi.
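As a back-of-the-envelope model of that strategy, the endowment must generate enough yield to cover the annual storage spend. A minimal sketch, where the price per TiB-year and the APY are hypothetical inputs rather than real market figures:

```python
def required_endowment(tib: float, replicas: int,
                       price_per_tib_year: float, apy: float) -> float:
    """FIL endowment whose yield perpetually covers storage cost.

    The principal's yield must at least equal the annual spend:
    endowment * apy >= tib * replicas * price_per_tib_year.
    """
    annual_cost = tib * replicas * price_per_tib_year
    return annual_cost / apy
```

For example, storing 10 TiB with 10 replicas at a hypothetical 0.2 FIL per TiB-year against a 5% yield needs roughly a 400 FIL endowment. In practice a safety margin would be needed to absorb yield and price volatility.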
Programmable Storage Markets
If you regard the Filecoin storage network as a massive decentralized data warehouse whose state is being constantly proven to the public, you can think of the FVM as a programmable controller for it.
In a traditional cloud-based data center, the strategies and policies defining how data is inserted, placed, distributed, replicated, repaired, etc. are predetermined by the vendor, and users can only configure them in proprietary ways.
With the FVM, devs can envision and create data center logic to orchestrate, aggregate and broker storage capacity and data sitting all over the world in novel ways, giving rise to new storage primitives with their associated economies of scale.
Some ideas include:
- Storage bounties, where storage providers compete to win deals, bringing the price down for clients.
- Full-sector auctions, where clients get a discount for purchasing and occupying entire sectors.
- Volume discounts, where the price is further reduced for purchasing multiple sectors at once.
- Sector rebates, where the provider refunds the client (who could be a Data DAO!) on a trigger condition, e.g. when they purchase N sectors over a specific period of time.
These can compose like legos with one another to offer richer storage recipes. There is room for many automation solutions, variations and flavors to compete in the market. By standardizing open interfaces for abstract concepts, other solutions like Storage Automation tools and DataDAOs can integrate them without lock-in. In the future, derivative markets may appear and interoperability solutions could enable seamless switching between providers.
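To illustrate how these pricing legos might compose, here is a small sketch; the discount and rebate parameters are made up for illustration and do not reflect real market pricing:

```python
def sector_price(base: float, sectors: int,
                 volume_discount: float = 0.02,
                 max_discount: float = 0.25) -> float:
    """Per-sector price with a capped volume discount: each additional
    sector in the order shaves a little off, up to a ceiling."""
    discount = min(max_discount, volume_discount * (sectors - 1))
    return base * (1 - discount)

def rebate(total_paid: float, sectors: int,
           threshold: int = 100, rate: float = 0.05) -> float:
    """Refund owed to the client (who could be a DataDAO) once they
    cross a sector threshold over the rebate period."""
    return total_paid * rate if sectors >= threshold else 0.0
```

Because each lego is a pure pricing rule, a marketplace contract could apply them in sequence (discount first, rebate on settlement) or let providers advertise their own combinations.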
Storage Automation (Replication and Repair)
Filecoin can benefit tremendously from a “set and forget” dapp that allows users to upload a file on the Filecoin network and be assured that the file has at least X SPs storing the file at any given point in time.
Clients want their data to be replicated across the network to maximize the chances it will survive in the event of storage provider failures. To achieve that today, clients have to execute N deals with storage providers, transferring the data N times. This is a cumbersome and resource-intensive task for a client to perform in order to obtain redundancy.
Replication workers solve this problem by acting as mediators for a small fee, saving the client the time and overhead of negotiating multiple deals. Instead, the replication worker automatically copies the Filecoin deal N times across the network in accordance with a user-defined policy based on the number of replicas, region selection, latency, price, etc. (potentially using L2 reputation systems to decide where to place the data!)
If a deal expires or an SP goes offline, a new deal for the file can get created and negotiated to maintain the X invariant. (While it is possible for all these SPs to go offline at the same time, this probability approaches 0 for large enough X.)
A smart contract on FVM can coordinate this invariant trustlessly and without human coordination.
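A sketch of the repair logic such a contract might run each epoch, assuming it can read the set of SPs still proving the file and a candidate list (e.g. supplied by a reputation system):

```python
def repair_plan(active_deals: set[str], candidate_sps: list[str],
                target_replicas: int) -> list[str]:
    """Choose new SPs to bring a file back up to the target replica count.

    `active_deals` holds SPs currently proving the file; expired or
    faulted deals are assumed to have already been removed by the caller.
    Returns the SPs to negotiate fresh deals with (empty if the
    invariant already holds).
    """
    needed = target_replicas - len(active_deals)
    fresh = [sp for sp in candidate_sps if sp not in active_deals]
    return fresh[:max(0, needed)]
```

Run on a schedule, this keeps the "at least X replicas" invariant without human coordination; candidate ordering is where placement policy (region, latency, price) would plug in.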
Initial thoughts for what this could look like:
- Issue your own tokens on FVM and decentralize a version of Slingshot (see more here); the Data Preparer can be issued tokens for each CAR file distributed to SPs; SPs can get rewarded with tokens based on how many full replicas of datasets are being onboarded to the network.
- Note: This is also possible to do without issuing your own tokens and issuing FIL from a DataCap allocation for FIL+ datasets.
- Create a vault to support perpetual storage with a certain amount of FIL. Get a certain yield on this endowment through lending FIL, liquid staking, or some other approach that generates yield. As long as the vault has some FIL, the smart contract can negotiate new deals trustlessly with new SPs as old deals expire or SPs go offline.
- See JV’s tweet thread here for some thoughts on the second approach
Liquid Staking

Other chains (such as Ethereum mainnet) enable liquid staking of the block rewards earned by validators on a PoS chain. Filecoin can enable this functionality by distributing the block rewards given to a winning SP each epoch. Teams such as Filmine are already hard at work building solutions here.
Initial thoughts for what this could look like:
- Build a wrapper around a lending dapp. Here is one approach for how the wrapper could look:
- Mint tFIL based on FIL deposited in the protocol
- Generate a yield on FIL by lending it to SPs on a lending platform
- Allow trading tFIL for FIL; this should maintain a 1:1 correspondence if the yield generated matches expectations
- A few directional ideas that might be useful for inspiration:
- Imagine porting ideas from Lido / RocketPool to the Filecoin ecosystem.
- Note that you’ll need to make decisions about rebasing versus value accrual; it might be useful to think about where you imagine the tokenized asset being used (e.g. as collateral in other protocols).
- Eigenlayer might have interesting elements to draw upon here
- If aiming for 1:1 pegs, you may want to think about where you imagine the peg to be maintained (a Curve/Convex system on another chain? Versions of those deployed on FIL?)
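The wrapper's mint/redeem accounting could follow the value-accrual model, where FIL per tFIL share rises as lending yield comes in, rather than rebasing balances. A minimal sketch with no real protocol semantics implied:

```python
class StakedFIL:
    """Value-accruing wrapper: tFIL shares claim a growing FIL pool."""

    def __init__(self):
        self.total_fil = 0.0
        self.total_shares = 0.0

    def deposit(self, fil: float) -> float:
        """Mint tFIL shares at the current FIL-per-share rate."""
        shares = fil if self.total_shares == 0 else \
            fil * self.total_shares / self.total_fil
        self.total_fil += fil
        self.total_shares += shares
        return shares

    def accrue_yield(self, fil: float) -> None:
        """Lending income raises FIL per share instead of rebasing balances."""
        self.total_fil += fil

    def redeem(self, shares: float) -> float:
        """Burn shares for the underlying FIL at the current rate."""
        fil = shares * self.total_fil / self.total_shares
        self.total_fil -= fil
        self.total_shares -= shares
        return fil
```

A depositor's share count never changes; their claim grows as the pool does, which tends to be easier to integrate as collateral elsewhere than a rebasing token.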
Reputation and QoS systems
With close to 4,000 active storage providers (SPs) servicing the Filecoin network, it isn’t easy for clients to choose whom to store data with. Different clients may value different properties, and priorities vary across data sets from the same client (e.g. price, latency, bandwidth, availability, etc.). Storage providers could congregate and advertise their Quality of Service (QoS) on a portal, but the public would have to trust them, which defeats the purpose.
Reputation systems are L2 networks that patrol the Filecoin network, assessing the Quality of Service of storage providers. Their activity may be funded by crowdfunding, DataDAOs, or a dedicated cryptoeconomy. They perform storage deals on the open market, and capture their observations in a provable log from which reputation scores and metrics are computed and published, in a traceable and auditable form. Pando is a potential building block.
Reputation systems would offer their services via APIs, either on-chain, off-chain, or both, potentially with complex querying capabilities, allowing storage apps to programmatically explore and filter the universe of miners by the exact characteristics they care about, for every storage deal.
One idea to monetize here might be to offer credentialing as a service (i.e. storage providers pay the credentialing service to issue a verified credential on-chain). The service could embed in the metadata of the verified credential all the relevant info used to generate the attestation (making it easily verifiable for any client), allowing the verified credential to be used standalone as shorthand in other protocols (e.g. who might be eligible to participate in an auction to store data).
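As a toy example of turning a probe-deal log into a published score: the weighting and latency cutoff below are arbitrary assumptions, and a real reputation system would publish and justify its methodology:

```python
def reputation_score(observations: list[dict]) -> float:
    """Score an SP from a log of probe deals, on a 0-100 scale.

    Each observation records whether the retrieval succeeded ("ok")
    and its latency in milliseconds; the 70/30 weighting and the
    1-second latency cutoff are illustrative, not normative.
    """
    if not observations:
        return 0.0
    n = len(observations)
    success = sum(o["ok"] for o in observations) / n
    fast = sum(o["ok"] and o["latency_ms"] < 1000 for o in observations) / n
    return round(100 * (0.7 * success + 0.3 * fast), 1)
```

Because the score derives from a provable log, anyone can recompute it, which is what makes the published metrics traceable and auditable.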
DataDAOs

FVM enables a new kind of data-centric DAO that has heretofore been impossible. DataDAOs are DAOs whose mission revolves around the preservation, curation, augmentation, and promotion of datasets considered valuable by their stakeholders.
Examples include datasets valuable to humanity at large, like large genomic or research databases, historical indexed blockchain and transaction data, rollup data, Wikipedia, NFT collections, metaverse assets, and more. But also datasets valued by narrower collectives, like football video recaps, statistics published by governments and city halls, or industry-specific machine learning datasets.
Because stake in DataDAOs can be tokenized, and the data stored by the DAO (as well as its status) can be cryptographically proven, the value and utility of data can be objectively expressed and transacted within markets. Tokens can then be used to pay for services to be performed on or with the data.
For example, interested parties could earn tokens in a DAO-driven decentralized compute fabric by analyzing datasets, performing scientific and ML modelling, calculating statistics, and more, all of which are actions that augment the value, worth, and utility of the original data. SPs can get rewarded (akin to DataCap) for replicating datasets, and CDNs can get rewarded for distributing data.
Stakeholders could in turn spend tokens to incentivize even more production of raw data, e.g. sensor data collection, generative art, and even human tasks (transcription, translation...), resulting in a circular data-centric economy.
All in all, DataDAOs can be used to create a new layer of incentives for datasets, either in conjunction or in lieu of FIL+ rewards.
Initial thoughts for what this could look like:
- Create tooling to make dataDAOs possible (build the meta layer!)
- Make an open zeppelin style contract template to make DAO creation super easy for data archivists
- Incentivize storage for the metaverse through FVM. See more here.
- Create a token-gated paywall for copyrighted datasets
- Create a DataDAO for rare / dying languages
- Create a DataDAO as the database of a decentralized Spotify
- The paywall can be handled by a token-gating mechanism that rewards artists while also giving the artists distribution
- See more in the subsequent sections
Storage Onramps [through DataDAOs]
Filecoin is a general protocol and takes a stance of not being opinionated nor specialized in any particular vertical, but rather open to all. While powerful, this approach may create friction where users expect a more integrated or crafted experience.
Just like NFT.storage had specialized marketing and a new product for storing NFT metadata on Filecoin and IPFS, we need to create dapps that serve as onramps for new kinds of verticals. FVM can be leveraged to build new incentive schemes for uploading specific kinds of data, and these onramps can be organized as DataDAOs.
What are some examples of vertically specific storage onramps we’d like to see?
Video
- Need integration with Saturn to do subsecond video retrieval
- Token gating / content policy constraints for certain copyrighted content
- Can incorporate video processing timelines with Livepeer, which is on Filecoin
- Query metadata and find subsets of data that needs to be retrieved
- Decentralized compute to run computations over this dataset
Package registries
- Clear CLI tooling for uploading npm packages, Docker containers, git repositories, and apt packages
- Massive optimizations possible because the structure in this data is well defined and repeated
Stock photo and video archives
- Create a decentralized version of Shutterstock to incentivize creators to upload photo and video based on client needs
- Clear, simple pricing based on token-gated paywalls for watermark-free download
Pay-Per-View [through DataDAOs]
Create token protected pages, as well as token gated media, in order to see certain movies, music, and content. This can happen per view, impression, or download. Access can also be sold to a third party at a later date. Royalties can be modeled in this manner.
This can happen for all kinds of DataDAOs and create new incentives for creating content that bypass the legacy middlemen of web2 content (record labels, copyright law and media agents).
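The settlement for a single view could be as simple as a royalty split, sketched here with basis points as the share unit; the split and the recipients are purely illustrative:

```python
def settle_view(price: float, royalty_bps: int,
                artist: str, platform: str) -> dict[str, float]:
    """Split one pay-per-view payment between artist and platform.

    `royalty_bps` is the artist's share in basis points (1/100 of a
    percent), so 2_500 bps = 25%. The same split applies whether the
    trigger is a view, an impression, or a download.
    """
    artist_cut = price * royalty_bps / 10_000
    return {artist: artist_cut, platform: price - artist_cut}
```

Running this on-chain per access event is what lets royalties accrue directly to creators without the legacy middlemen.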
Games [through DataDAOs]
FVM can align incentives between game developers and players. Really high quality games in web3 can be composable and democratic. Communities can decide how virtual worlds evolve, rather than a centralized group of game developers. People that are early participants in the community can vote on characters, scenes and forks.
VR games in particular have a great need for FVM due to the size of the assets involved and the modifications these assets require. Open virtual worlds (like Decentraland) have shown some promise, but Minecraft-style addictive gameplay is what we really want to emulate on FVM.
Social [through DataDAOs]
Decentralized social has not quite worked out yet, but we expect something in this space to rise to prominence in the future. FVM is in a unique position to provide the assets, backing, and media (photos and videos) that drive unique social experiences.
Decentralized Science [through DataDAOs]
DataDAOs can be used to store open-access papers and journals. They can store (as proofs) the results of reproducible experiments and provide incentives for scientists to upload new papers over time. They can raise funds for research that is unorthodox or comes from unconventional sources.
Can the peer review process be made more fair (perhaps through quadratic voting)?
How can scientific code be made reproducible? Often these are written as one-off scripts and can be hard to recreate, even for famous publications.
We also need to enable massive-scale data science on the data collected in these papers. Metadata analysis becomes much easier because the data collected in a DeSci DAO can be analyzed and computed over in a straightforward manner.
Decentralized Compute

Filecoin and IPFS distribute content-addressed datasets across storage providers around the world to increase data redundancy, availability, and resiliency.
Such globally distributed data brings significant cost, availability, and reliability advantages, but can also mean that parts of a single dataset may end up stored geographically far away from one another. Executing computation jobs or data pipelines on globally distributed data is a difficult problem. On the other hand, having to regroup the data in a central location just to apply computation on it would defeat the purpose.
Pushing compute jobs to the edges and coordinating their execution is a brand new possibility with FVM actors. FVM actors can control and broker computational resources, incentivize compute execution, distribute workloads across available storage providers, and prove the validity of the computation's result in order to claim rewards. See projects like Bacalhau for a framework that builds on IPFS but has yet to be integrated with FVM.
Storage providers could enroll in compute networks through an FVM actor. Compute clients would post jobs to the actor. A mechanism would assign jobs to providers, and once executed, the provider would post a proof to claim rewards.
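That enroll/post/claim flow can be sketched as follows; the round-robin assignment stands in for whatever real matching mechanism a network would use, and proof verification is reduced to a boolean:

```python
class ComputeBoard:
    """Minimal job board: providers enroll, clients post jobs with
    rewards, and the assigned provider claims by presenting a proof."""

    def __init__(self):
        self.providers: list[str] = []
        self.jobs: dict[int, dict] = {}
        self._next = 0

    def enroll(self, sp: str) -> None:
        self.providers.append(sp)

    def post_job(self, reward: float) -> int:
        job_id = self._next
        # Round-robin assignment is a placeholder for a real mechanism
        # (auction, locality-aware matching, etc.).
        assignee = self.providers[job_id % len(self.providers)]
        self.jobs[job_id] = {"reward": reward, "sp": assignee, "paid": False}
        self._next += 1
        return job_id

    def claim(self, job_id: int, sp: str, proof_ok: bool) -> float:
        """Pay out only to the assigned SP, only once, only with a valid proof."""
        job = self.jobs[job_id]
        if sp != job["sp"] or not proof_ok or job["paid"]:
            return 0.0
        job["paid"] = True
        return job["reward"]
```

The interesting open problems live behind `proof_ok`: verifiable computation (e.g. Lurk) or probabilistic re-execution checks like those in Bacalhau.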
Many parts of the current computing stack can be decentralized using FVM: virtual machines, compilation, schedulers and monitors.
Different primitives are needed here. Amongst others:
- Mechanisms to intelligently distribute, route and deliver compute jobs and their results.
- Mechanisms to catalog, oversee, and monitor resources available globally.
- Mechanisms to incentivize storage providers and retrieval nodes to participate in decentralized compute fabrics.
- Mechanisms to prove the correctness of results (e.g. Lurk, or non-deterministic checks, like those available in Bacalhau) and penalize offenders.
- Mechanisms to optimize execution plans, and react to unmet expectations from compute providers.
Trustless FIL+ Notaries
Today, DataCap is allocated by notaries. Notaries help add a layer of social trust to verify that clients are authentic (and prevent Sybil attacks by malicious actors). However, with smart contracts we can design permissionless notaries that make a Sybil attack economically irrational.
A trustless notary might look like an on-chain auction, where all participants (clients, storage providers) are required to lock some collateral to participate. By running the auction via a smart contract, everyone can verify that the winning bidder(s) emerged from a transparent process. Economic collateral (both from clients and storage providers) can be used to create a slashing mechanism that disincentivizes malicious actors.
For simply running the auction, the notary maintainer might collect a portion of fees when deals clear, a fee on locked collateral (e.g. a slice of the yield if staked FIL is used as collateral), or some combination of both.
Note: Trustless notaries (if designed correctly) have the distinct advantage of being permissionless, so they can support any number of use cases that might not want humans in the loop (e.g. ETL pipelines that want to automatically store derivative datasets).
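A stripped-down sketch of such an auction, with collateral gating participation and a slash hook for misbehavior; bid semantics, collateral sizing, and what counts as slashable are all left open:

```python
class DataCapAuction:
    """Collateral-backed auction: bids are only valid with a stake locked."""

    def __init__(self, min_collateral: float):
        self.min_collateral = min_collateral
        self.stakes: dict[str, float] = {}
        self.bids: dict[str, float] = {}

    def join(self, who: str, collateral: float) -> bool:
        """Lock collateral to become an eligible bidder."""
        if collateral < self.min_collateral:
            return False
        self.stakes[who] = collateral
        return True

    def bid(self, who: str, amount: float) -> bool:
        """Only participants with locked collateral may bid."""
        if who not in self.stakes:
            return False
        self.bids[who] = amount
        return True

    def winner(self) -> str:
        """Highest bid wins; anyone can recompute this from chain state."""
        return max(self.bids, key=self.bids.get)

    def slash(self, who: str) -> float:
        """Forfeit a misbehaving participant's stake; returns the amount."""
        return self.stakes.pop(who, 0.0)
```

Because the whole process is on-chain state transitions, the transparency claim above reduces to anyone being able to replay `winner()` themselves.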
KYC and claims attestation
In order to prevent Sybil attacks and miners gaming the protocol, different Filecoin programs (such as Evergreen) would benefit from KYC systems built on-chain. SPs could use these to provably attest to the claims they make about themselves. Can this happen in a way that is incentivized, decentralized, and made as autonomous as possible? Zero-knowledge proofs may be useful for organizations to obtain proof of identity without revealing an SP's identity.
Initial thoughts for what this could look like:
- Build / Port Notebook over to FVM for SPs
Decentralized Data Aggregator
Build a decentralized data aggregator: a tool that trustlessly aggregates files off-chain, with a set of miners incentivized to store them. Once 32 GiB of data has been aggregated, a smart contract on FVM would negotiate a deal with an SP to store the aggregated data.
This would make it possible to build a new storage onramp (such as video.storage) without having to build your own centralized servers to store and aggregate data. The founder of the new onramp would have to only focus on the key parts of making a new onramp work (marketing, new compression, tooling and business development) without having to invest in aggregator infrastructure.
Furthermore, this would provide existing onramps like Estuary, NFT.storage and web3.storage the ability to aggregate user submissions in a trustless, decentralized manner.
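The core buffering logic is simple to sketch: accumulate submissions until a sector's worth of data is ready, then hand the batch off to on-chain deal negotiation. The 32 GiB constant mirrors the sector size mentioned above; everything else is illustrative:

```python
from typing import Optional

SECTOR_BYTES = 32 * 2**30  # 32 GiB of payload per sector (simplified)

class Aggregator:
    """Buffer client files; emit a deal batch once a sector fills up."""

    def __init__(self):
        self.pending: list[tuple[str, int]] = []  # (cid, size in bytes)
        self.size = 0

    def submit(self, cid: str, nbytes: int) -> Optional[list[tuple[str, int]]]:
        """Queue a file; return the full batch when the sector threshold
        is crossed (to be handed to a storage deal negotiation), else None."""
        self.pending.append((cid, nbytes))
        self.size += nbytes
        if self.size >= SECTOR_BYTES:
            batch, self.pending, self.size = self.pending, [], 0
            return batch
        return None
```

A real aggregator would also build the CAR file and the inclusion proofs that let each client verify their data landed in the negotiated sector.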
Insurance for Storage Providers
SPs with an established track record could take out insurance policies that protect them against active faults and ensure a continued revenue stream in the event of failure. Certain characteristics (such as payment history, length of operation, availability, etc.) can be used to craft insurance policies, just as they can be used to underwrite loans to SPs.
In exchange for recurrent insurance premium payments, this protocol would check if the SP is in good standing with the Filecoin network. If it isn’t in good standing, the SP could issue a claim that the protocol would review and issue a payout if all coverage criteria were met.
Storage insurance brokers or marketplaces could emerge to offer single points of subscription and management for clients and providers.
In some ways, this is similar to Nexus Mutual for SPs on FVM. While Nexus guards against contract failure, this protocol would help SPs recover from an active fault or termination.
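A toy model of the premium/claim flow; how a fault is verified on-chain and how policies are actually priced are the hard, open parts this sketch deliberately ignores:

```python
class SPInsurance:
    """Collect premiums; pay out when a covered fault is verified on-chain."""

    def __init__(self, payout: float):
        self.payout = payout
        self.covered: set[str] = set()

    def pay_premium(self, sp: str) -> None:
        """A premium payment activates coverage for the SP."""
        self.covered.add(sp)

    def claim(self, sp: str, fault_verified: bool) -> float:
        """Pay out only if the SP is covered and the fault (e.g. a
        termination or missed proofs) can be verified from chain state.
        Coverage lapses after a payout: one claim per policy period."""
        if sp in self.covered and fault_verified:
            self.covered.discard(sp)
            return self.payout
        return 0.0
```

Unlike general contract-cover protocols, the claim condition here can be checked directly against Filecoin's own fault records, which is what makes SP insurance tractable on FVM.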
Access Control

With FVM, dapps can define their own access control rules for the data they are onboarding and storing. Smart contracts can programmatically govern who can access a dataset without the contract itself having access to the data, enabling paywalled media content (such as music and movies), public decryption (a timelock after which the data becomes visible to the world), and many other interesting applications.
User-Friendly Names

The ability to assign user-friendly names to actors and identities on-chain is a basic feature of modern blockchain ecosystems. Humans just aren’t great at memorizing hashes or binary data. Some chains offer native and integrated naming systems (e.g. NEAR), while others rely on user-land services (e.g. ENS), keeping the protocol agnostic of them. The Filecoin community has not discussed native solutions, so there are greenfield opportunities to innovate and implement solid naming services in user-land, backed by their own rules and cryptoeconomies.
Desirable features include the ability to assign, reserve, and dispute names for actors and accounts; resolve address from names; reverse lookups; the ability to assign names to Code CID; access control through naming; composability with other solutions; history tracking; and more.
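A first-come, first-served registry covering three of those features (assign, resolve, reverse lookup) might start as simply as this sketch; reservations, disputes, and the cryptoeconomics are left out:

```python
from typing import Optional

class NameRegistry:
    """First-come name registry with forward and reverse resolution."""

    def __init__(self):
        self.names: dict[str, str] = {}     # name -> actor address
        self.reverse: dict[str, str] = {}   # actor address -> name

    def register(self, name: str, actor: str) -> bool:
        """Assign a name if free; dispute resolution is handled elsewhere."""
        if name in self.names:
            return False
        self.names[name] = actor
        self.reverse[actor] = name
        return True

    def resolve(self, name: str) -> Optional[str]:
        """Forward lookup: name to actor address."""
        return self.names.get(name)

    def lookup(self, actor: str) -> Optional[str]:
        """Reverse lookup: actor address to name."""
        return self.reverse.get(actor)
```

First-come-first-served is the simplest allocation rule; auctions or deposit-based reservations would be natural upgrades to deter squatting.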
Blockchain Nuts & Bolts
Some building blocks are required for every blockchain to succeed. They ensure there is enough liquidity on-chain and (eventually) enough movement of funds from other chains. While these don’t leverage fundamental programmable storage primitives, they are nonetheless important for the functioning and composability of the ecosystem.
DEXes & Exchanges
Users on FVM need to be able to exchange FIL for other tokens issued on-chain. This may be a DEX (as simple as a fork of Uniswap or Sushi on the EVM), or involve building a decentralized order book, similar to Serum on Solana.
Overcollateralized Lending

Users on FVM need to be able to deposit a certain amount of FIL into a protocol and borrow another token against their collateral. If the price of FIL relative to this token falls, the loan can become undercollateralized and eventually be liquidated. This could be an AAVE or Compound fork.
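The liquidation condition in such a protocol typically reduces to a health factor. A sketch with a hypothetical 75% liquidation threshold:

```python
def health_factor(collateral_fil: float, fil_price: float,
                  debt_value: float, liq_threshold: float = 0.75) -> float:
    """Loan health: the risk-adjusted collateral value divided by the
    debt. Below 1.0 the position is eligible for liquidation. The 75%
    threshold is an illustrative parameter, not a recommendation."""
    return collateral_fil * fil_price * liq_threshold / debt_value

def can_liquidate(collateral_fil: float, fil_price: float,
                  debt_value: float) -> bool:
    return health_factor(collateral_fil, fil_price, debt_value) < 1.0
```

Note the dependency this creates: the `fil_price` input is exactly what the price oracles discussed later in this document would supply.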
Token Bridges

While not something immediately on the roadmap, bridges are needed from EVM chains, Move chains, and Cosmos chains in order to bring wrapped tokens from other ecosystems into the fold. With the current launch we are more focused internally, since Filecoin's value proposition is unique enough that it does not need to bootstrap TVL from other chains. In the long run, however, we expect FVM to be part of a broader family of blockchains.
Brownie points if you can build bridges that reward users for moving wrapped tokens into FVM (perhaps through a bridge token or another reward scheme).
General purpose cross-chain bridges
Storage is a universal need. Smart contracts sitting on other chains benefit from accessing the storage capabilities of the Filecoin network. Similarly, Filecoin actors should be able to interact with code on other chains, or generate proofs of Filecoin state or events that can be understood by agents sitting on other chains.
Building a suite of FVM actors that can process cryptographic primitives and data structures of other chains through IPLD (Interplanetary Linked Data) enables cross-chain web3 storage utilization. For example, NFT registries could prohibit transactions unless it can be proven that the underlying asset is stored on Filecoin, or rollup contracts could verify that the aggregator has stored transaction data in Filecoin.
Price Oracles

In order to know the price of a token or dataset on Filecoin, we need teams to operate and run oracles. These will be necessary for overcollateralized lending, bridges, insurance, and some order-book exchanges, depending on implementation. While not a pressing need at launch, price oracles will be necessary in the long run as more P1 / P2 use cases come to the fore.
Retrievability Oracles

Retrievability oracles are consortiums that allow a storage provider to commit to a maximum retrieval price for a client. The basic mechanism is as follows:
- When striking a deal with a client, a storage provider can commit to retrieval terms.
- In doing so, the storage provider locks collateral with the retrievability oracle.
- In normal operation, the client and the storage provider continue to store and retrieve data as normal.
- In the event the storage provider refuses to serve data (against whatever terms previously agreed), the client can appeal to the retrievability oracle who can request the data from the storage provider.
- If the storage provider serves the data to the oracle, the data is forwarded to the client.
- If the storage provider doesn’t serve the data, the storage provider is slashed.
For running the retrieval oracles, the consortium may collect fees (either from storage providers for using the service, fees for accepting different forms of collateral, yield from staked collateral, or perhaps upon retrieval of data on behalf of the client).
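The escrow-and-appeal mechanism above can be sketched in a few lines; how the oracle actually verifies that data was or wasn't served is the hard part and is reduced here to a boolean:

```python
class RetrievabilityOracle:
    """Escrow SP collateral; slash if an appealed retrieval isn't served."""

    def __init__(self):
        self.collateral: dict[str, float] = {}

    def commit(self, sp: str, amount: float) -> None:
        """SP locks collateral when committing to retrieval terms."""
        self.collateral[sp] = self.collateral.get(sp, 0.0) + amount

    def appeal(self, sp: str, served_to_oracle: bool) -> tuple[str, float]:
        """Client appeal: if the SP serves the data to the oracle it is
        forwarded to the client; otherwise the SP's stake is slashed.
        Returns (outcome, slashed_amount)."""
        if served_to_oracle:
            return ("forwarded", 0.0)
        slashed = self.collateral.pop(sp, 0.0)
        return ("slashed", slashed)
```

In normal operation the oracle is never invoked; the locked collateral simply makes refusing to serve data strictly worse for the SP than serving it.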