Set up a Stellar archiver node the correct way

dvarbenov

Hi, I'm trying to set up a Stellar archiver node. My goal is to be able to query any historical data (e.g. ledgers, transactions, balances) at any point, and it looks like the way to ensure this is to run an archiver node. I've read the entire docs and started an instance:

- Kubernetes deployment starting a pod with 16 GB memory and a 4 CPU limit
- Ubuntu 20.04.5 LTS as the OS inside
- used the quickstart repo as a starting point and made some adjustments to it to (hopefully) fit my case

I'm cloning the quickstart repo in my Dockerfile, using the stellar-core.cfg from here, and I changed some options as follows:

    HTTP_PORT=11626
    PUBLIC_HTTP_PORT=true
    LOG_FILE_PATH="/opt/stellar/log.log"
    MANUAL_CLOSE=false
    DATABASE="postgresql://dbname=core host=localhost user=stellar password=__PGPASS__"
    NETWORK_PASSPHRASE="Public Global Stellar Network ; September 2015"
    KNOWN_CURSORS=["HORIZON"]
    CATCHUP_COMPLETE=true
    AUTOMATIC_MAINTENANCE_PERIOD=0
    AUTOMATIC_MAINTENANCE_COUNT=0

As I understand it, CATCHUP_COMPLETE=true is the option I need to add to get all the data from ledger 1.

I've also added a horizon.env file:

    #!/bin/bash
    export DATABASE_URL="postgres://stellar:AjCPSeTuabBh3RAW@localhost/horizon"
    export STELLAR_CORE_DATABASE_URL="postgres://stellar:AjCPSeTuabBh3RAW@localhost/core"
    export STELLAR_CORE_URL="http://localhost:11626"
    export LOG_LEVEL="info"
    export ENABLE_CAPTIVE_CORE_INGESTION=true
    export INGEST="true"
    export PER_HOUR_RATE_LIMIT="72000"
    export NETWORK_PASSPHRASE="Public Global Stellar Network ; September 2015"
    export DISABLE_ASSET_STATS="true"
    export HISTORY_ARCHIVE_URLS="https://history.stellar.org/prd/core-live/core_live_001"
    export ADMIN_PORT=6060
    export PORT=8001
    export APPLY_MIGRATIONS=true
    export STELLAR_CORE_BINARY_PATH=$(which stellar-core)
    export PARALLEL_JOB_SIZE=10000 # 100000
    export RETRIES=100
    export RETRY_BACKOFF_SECONDS=10
    export CAPTIVE_CORE_STORAGE_PATH="/opt/stellar/data"
    export CAPTIVE_CORE_USE_DB=true
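One thing that is easy to miss in a file like this: `$(which stellar-core)` expands to an empty string when the binary is not on the PATH, so STELLAR_CORE_BINARY_PATH can silently end up empty. Below is a minimal sketch of a check that could be run after sourcing the file. `missing_vars` is a hypothetical helper written for illustration; it is not part of Horizon or the quickstart repo.

```shell
# Hypothetical helper: print the name of every given variable that is
# unset or empty in the current shell, one name per line.
missing_vars() {
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}

# Example usage (after `source horizon.env`):
#   missing_vars DATABASE_URL STELLAR_CORE_BINARY_PATH HISTORY_ARCHIVE_URLS
```

If the function prints nothing, all the listed variables have a value. It only checks that they are non-empty, not that the values are valid.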

I'm building an image and starting a pod using the start script as the pod's start command:

    ./start --pubnet

I can see on localhost:11626/info that "state" : "Catching up", and the status shows that the number of checkpoints applied is increasing, as is the percentage of the job done. Then I source my custom horizon.env and, as suggested in the docs, run:

    stellar-horizon db reingest range --parallel-workers=2 1 16999999
    stellar-horizon db reingest range --parallel-workers=2 17000000 <latest_ledger>

It seems to be catching up as well, although there are some error messages about archive files not being found and about doing maintenance.
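For illustration, the two reingest invocations above could be generalized to smaller ranges so that a failed range can be retried on its own. This is only a sketch: `reingest_commands` is a hypothetical helper, and the chunk size is an arbitrary assumption, not a value recommended by the docs. It prints the commands rather than running them.

```shell
# Hypothetical helper: print one `stellar-horizon db reingest range` command
# per chunk of ledgers.
# Arguments: first ledger, last ledger, chunk size, number of parallel workers.
reingest_commands() {
  start=$1
  end=$2
  chunk=$3
  workers=$4
  while [ "$start" -le "$end" ]; do
    stop=$((start + chunk - 1))
    # Clamp the final chunk to the requested last ledger.
    if [ "$stop" -gt "$end" ]; then
      stop=$end
    fi
    echo "stellar-horizon db reingest range --parallel-workers=$workers $start $stop"
    start=$((stop + 1))
  done
}

# Example usage: review the generated commands before executing any of them.
#   reingest_commands 1 16999999 1000000 2
```

The ranges are inclusive on both ends and do not overlap, matching the 1 to 16999999 and 17000000 onward split used above.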

My questions are:

- Do I need the stellar-horizon db reingest range part for my case at all?
- Is getting stellar-core into a synced state with CATCHUP_COMPLETE=true enough to access any historical data? (I can see my psql DB is growing, but it seems the buckets folder does not store all the data.)
- Is it possible to speed up stellar-core syncing in some way?

Thanks in advance!