Deserialize subgraph manifest YAML
ROH: 2023-01-13
Introduction
The subgraph manifest, subgraph.yaml, defines the smart contracts your subgraph indexes, which events from those contracts to pay attention to, and how to map event data to entities that Graph Node stores and makes available to query.
In this lesson we’re building a program to deserialize a subgraph manifest stored on IPFS. To start, we’ll use the graph-network-mainnet subgraph, but later on we’ll test with the Everest subgraph.
- graph-network-mainnet subgraph IPFS URL: https://ipfs.io/ipfs/QmbW34MGRyp7LWkpDyKXDLWsKrN8iqZrNAMjGTYHN2zHa1
- Everest subgraph IPFS URL: https://ipfs.io/ipfs/QmVsp1bC9rS3rf861cXgyvsqkpdsTXKSnS4729boXZvZyH
Deserialization: The process whereby a lower-level format (e.g. that has been transferred over a network, or stored in a data store) is translated into a readable object or other data structure.
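To make the definition concrete, here is a minimal sketch that turns a few flat key: value lines into a Rust struct by hand, using only the standard library. The Header struct and the parse_header function are hypothetical names invented for this illustration, and the input values are made up. Real YAML is far richer than this, which is why the rest of the lesson relies on serde_yaml instead.

```rust
/// Hypothetical struct holding two top-level manifest fields.
#[derive(Debug, PartialEq)]
struct Header {
    spec_version: String,
    description: Option<String>,
}

/// Naive "deserializer": understands flat `key: value` lines only.
/// This is an illustration of the concept, not a YAML parser.
fn parse_header(text: &str) -> Option<Header> {
    let mut spec_version = None;
    let mut description = None;
    for line in text.lines() {
        // Split each line at the first colon into a key and a value.
        if let Some((key, value)) = line.split_once(':') {
            let value = value.trim().to_string();
            match key.trim() {
                "specVersion" => spec_version = Some(value),
                "description" => description = Some(value),
                _ => {} // ignore fields this sketch does not model
            }
        }
    }
    // `specVersion` is required here; `description` stays optional.
    Some(Header { spec_version: spec_version?, description })
}

fn main() {
    // Made-up input for illustration.
    let raw = "specVersion: 0.0.2\ndescription: Example subgraph";
    println!("{:?}", parse_header(raw));
}
```

The low-level format (text) goes in and a typed value comes out; a missing required field yields None instead of a partially filled struct.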
Manifest fields, subfields, and their types (in no particular order):
- specVersion: A Semver version indicating which version of this API is being used. Type: String
- repository: An optional link to where the subgraph lives. Type: String
- description: An optional description of the subgraph’s purpose. Type: String
- schema: The GraphQL schema of this subgraph. Type: Schema
- dataSources: Each data source spec defines the data that will be ingested as well as the transformation logic to derive the state of the subgraph’s entities based on the source data.
schema fields and their types:
- file: The path of the GraphQL IDL file, either local or on IPFS. Type: Path
dataSource fields and their types:
- kind: The type of data source. Possible values: ethereum/contract. Type: String
- name: The name of the source data. Will be used to generate APIs in the mapping and also for self-documentation purposes. Type: String
- network: For blockchains, this describes which network the subgraph targets. For Ethereum, this can be any of “mainnet”, “rinkeby”, “kovan”, “ropsten”, “goerli”, “poa-core”, “poa-sokol”, “xdai”, “matic”, “mumbai”, “fantom”, “bsc” or “clover”. Developers could look for an up to date list in the graph-cli code. Type: String
- source: The source data on a blockchain such as Ethereum.
- mapping: The transformation logic applied to the data prior to being indexed. Type: Mapping
For the full subgraph manifest specification visit this link.
Along with previously covered Rust concepts (see other guides), here’s a quick overview of the topics we’ll encounter in this lesson:
- define custom structs that represent the generic properties of a subgraph manifest
- leverage the serde_yaml crate and the serde crate’s derive macro to deserialize the subgraph manifest YAML into custom structs
- validate a program with a basic test
Let’s get started!
Code
From your terminal/command line, create a new cargo project and open it with VSCode:
- If you don’t already have Rust and cargo installed, here’s the official installation guide to help you get up and running.
- This tutorial assumes you are using the Visual Studio Code editor (VSCode).
cargo new parse_subgraph_manifest
cd parse_subgraph_manifest
code .
With VSCode now open, update Cargo.toml with the following dependencies (add this below [dependencies]) then save your changes
reqwest = { version = "0.11.13", features = ["json"] }
tokio = { version = "1.23.0", features = ["full"] }
serde = { version = "1.0.152", features = ["derive"]}
serde_yaml = "0.9.16"
- reqwest “provides a convenient, higher-level HTTP Client”
- tokio is an “event-driven, non-blocking I/O platform for writing asynchronous applications with the Rust programming language”
- serde is a “framework for serializing and deserializing Rust data structures efficiently and generically”
- serde_yaml is a library for using the Serde serialization framework with data in YAML file format.
Create a new file called src/utils.rs and add these use statements:
use std::collections::HashMap;
use std::string::String;
use serde::Deserialize;
- std::collections::HashMap - A hash map implemented with quadratic probing and SIMD lookup.
- std::string::String - A UTF-8–encoded, growable string.
- serde::Deserialize - A data structure that can be deserialized from any data format supported by Serde.
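Later in this lesson, the file fields are stored as HashMap<String, String>. A small sketch of why that shape is convenient: it assumes (this is an assumption worth checking against the raw YAML) that a manifest served from IPFS writes a file reference as a one-entry mapping whose key is "/" and whose value is an /ipfs/... path. The gateway_url helper and the QmExampleHash placeholder are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical helper: turns a link map such as {"/": "/ipfs/<hash>"}
/// into a gateway URL. Returns None when the "/" key is missing.
fn gateway_url(link: &HashMap<String, String>) -> Option<String> {
    link.get("/").map(|path| format!("https://ipfs.io{}", path))
}

fn main() {
    let mut link = HashMap::new();
    // Placeholder value; a real manifest carries an actual content hash here.
    link.insert("/".to_string(), "/ipfs/QmExampleHash".to_string());
    println!("{:?}", gateway_url(&link));
}
```

Looking the key up with get returns an Option, so a manifest that lacks the expected key is handled without panicking.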
Next, add some struct definitions to the same file, then save:
#[allow(non_snake_case)]
#[derive(Debug, Deserialize)]
pub struct SubgraphManifest {
    pub dataSources: Vec<DataSource>,
    pub description: Option<String>,
    pub repository: Option<String>,
    pub specVersion: String,
    pub schema: SchemaAddress,
}

#[allow(non_snake_case)]
#[derive(Debug, Deserialize)]
pub struct SchemaAddress {
    pub file: HashMap<String, String>,
}

#[allow(non_snake_case)]
#[derive(Debug, Deserialize)]
pub struct DataSource {
    pub kind: String,
    pub mapping: Mapping,
    pub name: String,
    pub network: String,
    pub source: Source,
}

#[allow(non_snake_case)]
#[derive(Debug, Deserialize)]
pub struct Mapping {
    pub abis: serde_yaml::Sequence,
    pub apiVersion: String,
    pub entities: serde_yaml::Sequence,
    pub eventHandlers: serde_yaml::Sequence,
    pub file: HashMap<String, String>,
    pub kind: String,
    pub language: String,
}

#[allow(non_snake_case)]
#[derive(Debug, Deserialize)]
pub struct Source {
    pub abi: String,
    pub address: String,
    pub startBlock: u32,
}
Structs
- SubgraphManifest - maps to a subgraph manifest and some of its fields
- SchemaAddress - maps to a manifest’s schema address on IPFS
- DataSource - maps to a single entry in a manifest’s dataSources
- Mapping - maps to the mapping field of a dataSource entry
- Source - maps to the source field of a dataSource entry
See Subgraph Manifest docs for full specification details.
Navigate to src/main.rs and add the following use and mod statements:
use std::error::Error;
mod utils;
use crate::utils::SubgraphManifest;
- use std::error::Error - a trait representing the basic expectations for error values, i.e., values of type E in Result<T, E>.
- mod utils - will look for a file named utils.rs and will insert its contents inside a module named utils under this scope
- use crate::utils::SubgraphManifest - will bind the full crate::utils::SubgraphManifest path to SubgraphManifest for easier access
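The effect of mod and use can be tried in a single file. In this sketch an inline module stands in for the separate utils.rs file, and the struct inside it is a hypothetical, one-field stand-in for the real SubgraphManifest. At the crate root, use utils::SubgraphManifest resolves to the same item as use crate::utils::SubgraphManifest.

```rust
// An inline module behaves like the contents of a separate `utils.rs` file.
mod utils {
    /// Hypothetical stand-in for the struct defined in this lesson.
    #[derive(Debug)]
    pub struct SubgraphManifest {
        pub description: Option<String>,
    }
}

// Bind the longer path to the short name `SubgraphManifest`.
use utils::SubgraphManifest;

fn main() {
    // Without the `use` line this would read `utils::SubgraphManifest { .. }`.
    let manifest = SubgraphManifest { description: None };
    println!("{:?}", manifest);
}
```

Note that both the struct and its field are marked pub; without that, code outside the module could not name the type or set the field.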
Now add a main function with the following content:
#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let manifest_response = reqwest::get("https://ipfs.io/ipfs/QmbW34MGRyp7LWkpDyKXDLWsKrN8iqZrNAMjGTYHN2zHa1")
        .await?
        .text()
        .await?;
    let manifest_data: SubgraphManifest = serde_yaml::from_str(&manifest_response).unwrap();
    println!("{:?}", manifest_data);
    Ok(())
}
Some notes:
- our main function is async, powered by tokio
  - it doesn’t return a value, so we use the unit type in our result
    - also note the unit type in Ok(())
  - we box errors from our result with Box<dyn Error>
- we perform a GET request to IPFS, then store the response text in the manifest_response variable
- we leverage serde_yaml to deserialize a reference to manifest_response into a variable manifest_data of type SubgraphManifest
- finally we print out manifest_data to our terminal
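The combination of the unit type, the ? operator, and Box<dyn Error> can be seen in isolation without any network access. This is a minimal standard-library sketch; parse_start_block and run are hypothetical helpers, and the block number is a made-up input.

```rust
use std::error::Error;

/// Hypothetical helper: parses a block number from text.
/// `?` converts the `ParseIntError` into a `Box<dyn Error>` automatically.
fn parse_start_block(text: &str) -> Result<u32, Box<dyn Error>> {
    let block = text.trim().parse::<u32>()?;
    Ok(block)
}

/// Returns the unit type on success, like `main` in this lesson.
fn run() -> Result<(), Box<dyn Error>> {
    let block = parse_start_block("12345")?;
    println!("start block: {}", block);
    Ok(())
}

fn main() {
    // Report the error instead of panicking.
    if let Err(e) = run() {
        eprintln!("error: {}", e);
    }
}
```

Because any type implementing the Error trait can be boxed this way, the same return type works for errors coming from reqwest and from serde_yaml, so ? could be used in place of unwrap if a panic on malformed YAML is not wanted.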
Save your changes then run the program from the integrated terminal in VSCode
cargo run
To wrap things up, let’s add a test below the main function.
Check out Chapter 11 of The Rust Programming Language book for a more thorough discussion of tests in Rust. We’re leveraging tokio again to help with our async testing.
#[tokio::test]
async fn deserialize_everest_subgraph_manifest_repo() -> Result<(), Box<dyn Error>> {
    let manifest_response = reqwest::get("https://ipfs.io/ipfs/QmVsp1bC9rS3rf861cXgyvsqkpdsTXKSnS4729boXZvZyH")
        .await?
        .text()
        .await?;
    let manifest_data: SubgraphManifest = serde_yaml::from_str(&manifest_response).unwrap();
    let subgraph_manifest_repository = "https://github.com/graphprotocol/everest";
    // repository is an Option<String>, so compare it as Option<&str>
    assert_eq!(manifest_data.repository.as_deref(), Some(subgraph_manifest_repository));
    Ok(())
}
Instead of printing results to our terminal, we use the assert_eq! macro to compare the deserialized manifest repository URL with a hard-coded value we provide. Additionally, we are testing against the Everest subgraph in this function.
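Because repository is declared as Option<String>, it cannot be compared directly with a string literal; one side has to be converted first. A minimal sketch of one way to do that, using only the standard library (repository_matches is a hypothetical helper, not part of the lesson’s code):

```rust
/// Hypothetical helper: checks an optional repository URL against an expected one.
fn repository_matches(actual: &Option<String>, expected: &str) -> bool {
    // `as_deref` turns `&Option<String>` into `Option<&str>` without cloning.
    actual.as_deref() == Some(expected)
}

fn main() {
    let repo = Some("https://github.com/graphprotocol/everest".to_string());
    assert!(repository_matches(&repo, "https://github.com/graphprotocol/everest"));
    // A manifest without a `repository` field deserializes to `None`.
    assert!(!repository_matches(&None, "https://github.com/graphprotocol/everest"));
}
```

Comparing as Option<&str> also makes the test fail cleanly, rather than panic, when the field is absent.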
Go ahead and run your test.
cargo test