Datasources

Datasources are DipDup connectors to various APIs. They are defined in the config and can be accessed in handlers and hooks via the ctx.datasources mapping. There are also ctx.get_<kind>_datasource methods that return a typed datasource instance directly.

Index datasources, i.e. those that can be attached to a specific index, are prefixed with the blockchain name, e.g. tezos.tzkt or evm.subsquid.

kind	blockchain	description
evm.subsquid	⟠ EVM-compatible	Subsquid Network API
evm.node	⟠ EVM-compatible	Ethereum node
evm.etherscan	⟠ EVM-compatible	Provides ABIs for EVM contracts
evm.blockvision	⟠ EVM-compatible	Provides ABIs for EVM contracts
evm.sourcify	⟠ EVM-compatible	Provides ABIs for EVM contracts
starknet.subsquid	🐺 Starknet	Subsquid Network API
starknet.node	🐺 Starknet	Starknet node
substrate.node	🔮 Substrate	Substrate node
substrate.subscan	🔮 Substrate	Provides pallet metadata for Substrate networks
substrate.subsquid	🔮 Substrate	Subsquid Network API
tezos.tzkt	ꜩ Tezos	TzKT API
tzip_metadata	ꜩ Tezos	TZIP-16 metadata
coinbase	any	Coinbase price feed
ipfs	any	IPFS gateway
http	any	Generic HTTP API

Connection settings

All datasources share the same code under the hood to communicate with the underlying APIs via HTTP. Each datasource config has an optional http section for connection settings. Use it to set timeouts, retry policies, and other parameters.

Each datasource kind has its own defaults. Usually, there's no reason to alter these settings unless you use self-hosted instances. The example below shows the default values:

dipdup.yaml

datasources:
  datasource:
    http:
      retry_count: 10
      retry_sleep: 1.0
      retry_multiplier: 2.0
      ratelimit_rate: 0
      ratelimit_period: 0
      ratelimit_sleep: 0.0
      connection_limit: 100
      connection_timeout: 60
      request_timeout: 60
      batch_size: 10000
      polling_interval: 1.0
      replay_path: null
      alias: null

batch_size limits the number of items fetched in a single paginated request (for some APIs). replay_path is used internally in tests to save request responses to files. Finally, the alias field changes the datasource name shown in logs and metrics. Other fields should be self-explanatory.
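To make the three retry settings more concrete, here is a minimal sketch of the delay schedule they would produce. It assumes the wait starts at retry_sleep seconds and is multiplied by retry_multiplier after every failed attempt (plain exponential backoff). The helper retry_delays is hypothetical and is not part of DipDup, whose exact retry logic may differ.

```python
def retry_delays(
    retry_count: int = 10,
    retry_sleep: float = 1.0,
    retry_multiplier: float = 2.0,
) -> list[float]:
    """Return the assumed wait (in seconds) before each retry attempt.

    Hypothetical helper: models exponential backoff, where the first wait is
    `retry_sleep` and each following wait is `retry_multiplier` times longer.
    """
    delays: list[float] = []
    delay = retry_sleep
    for _ in range(retry_count):
        delays.append(delay)
        delay *= retry_multiplier
    return delays


if __name__ == "__main__":
    schedule = retry_delays()
    print(schedule)       # per-attempt waits under the default settings
    print(sum(schedule))  # total time spent sleeping if every retry fails
```

Under this assumption the waits grow quickly, so lowering retry_count or retry_multiplier is the simplest way to make a failing datasource give up sooner.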

Ratelimiting

Ratelimiting is implemented using the "leaky bucket" algorithm. Each request consumes a number of "drops" (1 by default; set it per request with the weight argument), and the bucket is refilled at a constant rate. If the bucket is empty, the request is delayed until it's refilled.

response = await datasource.request(
    method='get',
    url='expensive_endpoint',
    weight=10,
)
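The weighting idea can be illustrated with a small, self-contained model of a leaky bucket. This is only a sketch, not DipDup's implementation: the LeakyBucket class and its delay_for method are hypothetical, and it assumes that ratelimit_rate drops become available every ratelimit_period seconds. Time is passed in explicitly so the behaviour is deterministic.

```python
class LeakyBucket:
    """Toy limiter: at most `rate` drops may be consumed per `period` seconds."""

    def __init__(self, rate: int, period: float) -> None:
        self.rate = rate
        self.period = period
        self.level = 0.0       # drops currently held in the bucket
        self.updated_at = 0.0  # timestamp of the last update

    def _leak(self, now: float) -> None:
        # Drain the bucket at a constant rate since the last update.
        elapsed = now - self.updated_at
        self.level = max(0.0, self.level - elapsed * self.rate / self.period)
        self.updated_at = now

    def delay_for(self, now: float, weight: int = 1) -> float:
        """Register a request of `weight` drops; return how long it must wait."""
        self._leak(now)
        overflow = self.level + weight - self.rate
        delay = max(0.0, overflow * self.period / self.rate)
        self.level += weight
        return delay


if __name__ == "__main__":
    bucket = LeakyBucket(rate=10, period=1.0)
    print(bucket.delay_for(now=0.0, weight=10))  # heavy request uses the whole budget
    print(bucket.delay_for(now=0.0, weight=1))   # next request has to wait
    print(bucket.delay_for(now=2.0, weight=1))   # bucket has drained by now
```

In this model a request with weight=10 uses up the same budget as ten ordinary requests, which is why expensive endpoints are a natural place to raise the weight.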
