By design, an RPC node running with the default --limit-ledger-size
will store roughly two epochs' worth of data, so Solana relies on Google Cloud Bigtable for long-term storage.
The public endpoint that Solana provides, https://api.mainnet-beta.solana.com, has its own Bigtable instance configured to serve requests going back to the Genesis Block.
This guide is meant to allow anyone to run their own Bigtable instance for long-term storage on the Solana blockchain.
A Warehouse node is responsible for feeding Bigtable with ledger data, so setting one up is the first thing that needs to be done in order for you to have your own Solana Bigtable instance.
Structurally, a Warehouse node is similar to an RPC node that doesn't serve RPC calls, but instead uploads ledger data to Bigtable.
Keeping your ledger history consistent is very important on a Warehouse node, since any gap in your local ledger will translate to a gap in your Bigtable instance, although these gaps can potentially be patched up by using `solana-ledger-tool`.
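To make the consistency requirement concrete, gap detection over a slot list is simple bookkeeping. The sketch below uses a made-up slot list; in practice you would derive the list of slots from your own ledger tooling (e.g., `solana-ledger-tool`):

```shell
# Hedged sketch: given a sorted list of slot numbers (one per line on stdin),
# print every missing range. The numbers below are illustrative only.
detect_gaps() {
  awk 'NR > 1 && $1 != prev + 1 { printf "gap: %d-%d\n", prev + 1, $1 - 1 }
       { prev = $1 }'
}

printf '%s\n' 100 101 104 105 108 | detect_gaps
# prints:
# gap: 102-103
# gap: 106-107
```

Any range this reports for your local ledger would become a hole in Bigtable after upload, so it is worth checking before and after restarts.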
Here you'll find all the necessary scripts to run your own Warehouse node.

What the different scripts do:

- `warehouse.sh` → Startup script for the Warehouse node.
- `warehouse-upload-to-storage-bucket.sh` → Script to upload the hourly snapshots to Google Cloud Storage every epoch.
- `service-env.sh` → Source file for `warehouse.sh`.
- `service-env-warehouse.sh` → Source file for `warehouse-upload-to-storage-bucket.sh`.
- `warehouse-basic.sh` → Simplified command to start the warehouse node. Run this instead of `warehouse.sh`.

IMPORTANT NOTE: If all you want is to write to Bigtable, you only need to use the `warehouse-basic.sh` script as a template. All of the scripts above are meant not only to write to Bigtable but also to create hourly snapshots and ledger backups every epoch and upload them to Google Cloud Storage.
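For orientation, a minimal warehouse-style startup command might look like the sketch below. The flag set is an assumption based on contemporary `solana-validator` releases, not the contents of `warehouse-basic.sh`; treat the repo script as authoritative and check `solana-validator --help` on your version.

```shell
# SKETCH ONLY: assumed flags, not the shipped warehouse-basic.sh.
# The credentials path is an example; it must point at a key with Bigtable write access.
export GOOGLE_APPLICATION_CREDENTIALS=~/bigtable-key.json

solana-validator \
  --identity ~/validator-keypair.json \
  --entrypoint entrypoint.mainnet-beta.solana.com:8001 \
  --ledger ~/ledger \
  --no-voting \
  --limit-ledger-size \
  --enable-rpc-transaction-history \
  --enable-bigtable-ledger-upload \
  --log ~/warehouse.log
```

The two key pieces are `--enable-rpc-transaction-history`, so the node indexes full transaction data, and `--enable-bigtable-ledger-upload`, which turns on the background upload to Bigtable.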
Before you begin:

- Create a Google Cloud service account with the `Bigtable User` role.
- Download its JSON key file, e.g., `play-gcp-329606-cccf2690b876.json`. Point the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to the file's path.

To start the validator:

- Replace all placeholders (e.g., `<path_to_your_ledger>`) inside the files below. Hint: CTRL-F for "<" to find them all quickly.
  - `warehouse.sh`
  - `service-env.sh`
  - `service-env-warehouse.sh`
- Leave `ledger_dir` and `ledger_snapshots_dir` blank. This will tell the node to fetch genesis & the latest snapshot from the cluster.
- `chmod +x` the following files:
  - `warehouse.sh`
  - `metrics-write-dashboard.sh`
- Set `EXPECTED_SHRED_VERSION` in `service-env.sh` to the appropriate version.
- Run `./warehouse.sh`
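A placeholder that slips through is an easy way to get a confusing startup failure. A small pre-flight check can be sketched as follows (the helper function is not part of the repo's scripts):

```shell
# Hedged helper: fail if any "<...>" placeholder is still present in the
# given files, printing file name and line number for each one found.
check_placeholders() {
  if grep -Hn '<[A-Za-z_][A-Za-z_.]*>' "$@"; then
    echo "error: unreplaced placeholders found above" >&2
    return 1
  fi
  return 0
}
```

Usage: run `check_placeholders warehouse.sh service-env.sh service-env-warehouse.sh` before the first start.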
To upload to Bigtable:

- Replace `<...>` in `warehouse-upload-to-storage-bucket.sh`.
- `chmod +x warehouse-upload-to-storage-bucket.sh`
- Run `./warehouse-upload-to-storage-bucket.sh`
To run as a continuous process under `systemctl`:

- Adjust the user in the `.service` files (currently set to `sol`).
- Replace `<...>` in both `.service` files.
- `cp` both files into `/etc/systemd/system`
- `sudo systemctl enable --now warehouse-upload-to-storage-bucket && sudo systemctl enable --now warehouse`
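The repo's `.service` files are the source of truth; for orientation, a warehouse unit typically looks something like the sketch below (the user, paths, and restart policy here are assumptions, not the shipped file):

```ini
# SKETCH: illustrative unit file, not the one shipped with the scripts.
[Unit]
Description=Solana Warehouse node
After=network.target

[Service]
User=sol
ExecStart=/home/sol/warehouse.sh
Restart=always
RestartSec=1
LimitNOFILE=1000000

[Install]
WantedBy=multi-user.target
```

`Restart=always` matters here: a warehouse node that stays down for long will accumulate exactly the ledger gaps described above.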
In order to import Solana's Bigtable instance, you'll first need to set up your own Bigtable instance:

- Enable the `BigTable API` if you have not done so already, then click on `Create Instance` inside the Console.
- Name the `Instance` and select the storage type (HDD or SSD). Set the instance id and name to `solana-ledger`.
- Set the number of `Nodes` for the cluster; each HDD node provides 16 TB of storage (as of 09/12/21, at least 4 HDD nodes are required).
- Create the following tables:

| Table ID | Column Family Name |
|---|---|
| blocks | x |
| entries | x |
| tx | x |
| tx-by-addr | x |
NOTE: the `entries` table is new and will be populated as of Solana CLI tools v1.18.0.

The `Table ID` and `Column Family Name` must match exactly inside your Bigtable instance, or the Dataflow job will fail.

Alternatively, you can create the tables by running the following commands through the CLI. First create the `.cbtrc` file with the credentials of the project and Bigtable instance in which we want to do the read and write operations:

```
echo project = [PROJECT ID] > ~/.cbtrc
echo instance = [BIGTABLE INSTANCE ID] >> ~/.cbtrc
cat ~/.cbtrc
cbt createtable [TABLE NAME] "families=[COLUMN FAMILY1]"
```
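The four tables from the table above can be created in one pass. The sketch below is a dry run that only prints the `cbt` commands so you can inspect them first; pipe its output to `sh` to execute for real (requires the `cbt` CLI and the `~/.cbtrc` set up above):

```shell
# Dry-run sketch: emit one `cbt createtable` command per table, all with the
# single column family "x" as required by the table above.
make_tables() {
  for table in blocks entries tx tx-by-addr; do
    printf 'cbt createtable %s "families=x"\n' "$table"
  done
}

make_tables
# prints:
# cbt createtable blocks "families=x"
# cbt createtable entries "families=x"
# cbt createtable tx "families=x"
# cbt createtable tx-by-addr "families=x"
```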
Once your Warehouse node has stored ledger data for 1 epoch successfully and you have set up your Bigtable instance as explained above, you are ready to import Solana's Bigtable to yours. The import process is done through a Dataflow template that allows importing Cloud Storage SequenceFile to Bigtable:
- Create a `Service Account`.
- Add the `Service Account Admin` role to it.
- Enable the `Dataflow API` in the project.
- Create the Dataflow job from the `SequenceFile Files on Cloud Storage to Cloud BigTable` template.
- Fill in the `Required parameters` (we will share the Cloud Storage path with you).

NOTE: Before creating the Dataflow job, you'll need to send the email address of the Service Account you created (i.e., [email protected]) to [email protected].
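The Console steps above can also be done from the command line. A hedged sketch using standard `gcloud` commands follows; the project id and service-account name are placeholders of my choosing and must be changed:

```shell
# SKETCH: "my-project" and "dataflow-import" are assumed names; adjust before running.
gcloud services enable dataflow.googleapis.com --project my-project
gcloud iam service-accounts create dataflow-import --project my-project
gcloud projects add-iam-policy-binding my-project \
  --member="serviceAccount:dataflow-import@my-project.iam.gserviceaccount.com" \
  --role="roles/iam.serviceAccountAdmin"
```

The `--member` email printed by the second command is the address you need to send off before creating the Dataflow job.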
Sometimes blocks may be missing from your Bigtable instance. This will be apparent on the Explorer, where the parent-slot and child-slot links won't form cycles. For example, before 59437028 was restored, 59437027 incorrectly listed 59437029 as its child.
The missing blocks can be restored from GCS. Ledger archives are available in the following buckets:

- gs://mainnet-beta-ledger-us-ny5
- gs://mainnet-beta-ledger-europe-fr2
- gs://mainnet-beta-ledger-asia-sg1

Find the bucket folder with the largest slot number that is smaller than the missing block. For example, block 59437028 is in 59183944.
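The "largest slot number smaller than the missing block" selection can be scripted. In the sketch below the folder list is passed in by hand; in practice you might produce it by listing the bucket (e.g., with `gsutil ls`):

```shell
# Hedged helper: pick the largest archive-folder slot that is strictly smaller
# than the missing slot. The example numbers mirror the ones in this guide.
pick_snapshot_slot() {
  target=$1; shift
  printf '%s\n' "$@" |
    awk -v t="$target" '$1 < t && $1 > best { best = $1 } END { print best }'
}

pick_snapshot_slot 59437028 58000000 59183944 59500000
# prints: 59183944
```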
Download `rocksdb.tar.bz2`:

```
~/missingBlocks/59183944$ wget https://storage.googleapis.com/mainnet-beta-ledger-us-ny5/59183944/rocksdb.tar.bz2
```

Also note the version number in `version.txt`:

```
$ curl https://storage.googleapis.com/mainnet-beta-ledger-us-ny5/59183944/version.txt
solana-ledger-tool 1.4.21 (src:50ebc3f4; feat:2221549166)
```

Extract the archive:

```
~/missingBlocks/59183944$ tar -I lbzip2 -xf rocksdb.tar.bz2
```

(If the archive is `rocksdb.tar.zst`, use `tar --use-compress-program=unzstd -xvf rocksdb.tar.zst` instead.)
```
~/solana$ git checkout 50ebc3f4
~/solana$ cd ledger-tool && ../cargo build --release
```

(You can also check out `v1.4.21`.)
```
~/missingBlocks/59183944$ ~/solana/target/release/solana-ledger-tool slot 59437028 -l . | head -n 2
~/missingBlocks/59183944$ GOOGLE_APPLICATION_CREDENTIALS=<json credentials file with write permission> ~/solana/target/release/solana-ledger-tool bigtable upload 59437028 59437028 -l .
```

`-l` should specify a directory that contains the `rocksdb` directory. If you get a `SlotNotRooted` error, first run the `repair-roots` command:

```
~/missingBlocks/59183944$ ~/github/solana/target/release/solana-ledger-tool repair-roots --before 59437029 --until 59437027 -l .
```
If that fails with:

```
error: Found argument 'repair-roots' which wasn't expected, or isn't valid in this context
```

then the ledger-tool version pre-dates the `repair-roots` command. Add it to your local code by cherry-picking `ddfbae2` or manually applying the changes from PR #17045.