Datasources

Datasources are DipDup connectors to various APIs. They are defined in the config and can be accessed in handlers and hooks via the ctx.datasources mapping. There are also ctx.get_<kind>_datasource methods that return a typed datasource instance directly.
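As a rough illustration of the two access paths, here is a minimal sketch. The datasource name my_api, the url value, and the helper fetch_status are hypothetical; get_http_datasource is assumed to follow the ctx.get_<kind>_datasource pattern for the http kind. The context is typed as Any so the snippet does not depend on a particular DipDup version.

```python
from typing import Any


async def fetch_status(ctx: Any) -> Any:
    """Hypothetical helper; call it from a handler or hook with its `ctx` argument."""
    # Untyped lookup through the mapping; 'my_api' is an assumed name from the config.
    untyped = ctx.datasources['my_api']

    # Typed accessor (assumed name, following the ctx.get_<kind>_datasource pattern).
    typed = ctx.get_http_datasource('my_api')

    # Either object can be used to perform the request.
    return await typed.request(method='get', url='status')
```

The typed accessor is mostly useful for editor completion and type checking; the mapping is convenient when iterating over all configured datasources.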

Index datasources, i.e. those that can be attached to a specific index, are prefixed with the blockchain name, e.g. tezos.tzkt or evm.subsquid.

| kind | blockchain | description |
| --- | --- | --- |
| evm.subsquid | ⟠ EVM-compatible | Subsquid Network API |
| evm.node | ⟠ EVM-compatible | Ethereum node |
| abi.etherscan | ⟠ EVM-compatible | Provides ABIs for EVM contracts |
| starknet.subsquid | 🐺 Starknet | Subsquid Network API |
| starknet.node | 🐺 Starknet | Starknet node |
| tezos.tzkt | ꜩ Tezos | TzKT API |
| tzip_metadata | ꜩ Tezos | TZIP-16 metadata |
| coinbase | any | Coinbase price feed |
| ipfs | any | IPFS gateway |
| http | any | Generic HTTP API |

Connection settings

All datasources share the same code under the hood to communicate with the underlying APIs via HTTP. Their configs have an optional http section for connection settings such as timeouts, retry policies, and rate limits.

Each datasource kind has its own defaults. Usually, there's no reason to alter these settings unless you use self-hosted instances. The example below shows the default values:

dipdup.yaml
datasources:
  datasource:
    http:
      retry_count: 10
      retry_sleep: 1.0
      retry_multiplier: 2.0
      ratelimit_rate: 0
      ratelimit_period: 0
      ratelimit_sleep: 0.0
      connection_limit: 100
      connection_timeout: 60
      request_timeout: 60
      batch_size: 10000
      polling_interval: 1.0
      replay_path: null
      alias: null

batch_size limits the number of items fetched in a single paginated request (for some APIs). replay_path is used internally in tests to save request responses to files. Finally, the alias field changes the datasource name shown in logs and metrics. The other fields should be self-explanatory.
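To get a feel for how the three retry settings interact, the sketch below computes a possible backoff schedule. It assumes the first delay equals retry_sleep and that each subsequent delay is multiplied by retry_multiplier; the function name retry_delays is hypothetical and this is not DipDup's actual code.

```python
def retry_delays(
    retry_count: int = 10,
    retry_sleep: float = 1.0,
    retry_multiplier: float = 2.0,
) -> list[float]:
    """Return the assumed sleep (in seconds) before each retry attempt."""
    # Assumption: exponential backoff starting at retry_sleep.
    return [retry_sleep * retry_multiplier**attempt for attempt in range(retry_count)]


delays = retry_delays()
# Total time spent sleeping if every attempt fails, under the assumption above.
total_wait = sum(delays)
```

Under these assumptions the waits grow quickly with the default values, so for a self-hosted instance that recovers fast it may make sense to lower retry_multiplier or retry_count.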

Ratelimiting

Ratelimiting is implemented using the "leaky bucket" algorithm. Each request consumes a number of "drops" (1 by default), which can be set per request via the weight argument, and the bucket is refilled at a constant rate. If the bucket is empty, the request is delayed until it's refilled.

response = await datasource.request(
    method='get',
    url='expensive_endpoint',
    weight=10,
)
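The behaviour described above can be modelled with a small simulation. This is a sketch that follows the description in this section, not DipDup's internals: the Bucket class is hypothetical, it uses a simulated clock instead of real sleeping, and it assumes that ratelimit_rate drops become available every ratelimit_period seconds.

```python
class Bucket:
    """Simulated rate-limit bucket: requests consume drops, time refills them."""

    def __init__(self, rate: int, period: float) -> None:
        self.capacity = float(rate)            # maximum number of drops
        self.refill_per_second = rate / period  # constant refill rate
        self.available = float(rate)            # bucket starts full
        self.clock = 0.0                        # simulated time, seconds

    def advance(self, seconds: float) -> None:
        """Move the simulated clock forward and refill the bucket."""
        self.clock += seconds
        refilled = self.available + seconds * self.refill_per_second
        self.available = min(self.capacity, refilled)

    def acquire(self, weight: int = 1) -> float:
        """Consume `weight` drops and return how long the request was delayed."""
        if weight > self.capacity:
            raise ValueError('weight exceeds bucket capacity')
        deficit = weight - self.available
        delay = max(0.0, deficit / self.refill_per_second)
        self.advance(delay)  # wait until enough drops are available
        self.available = max(0.0, self.available - weight)
        return delay


bucket = Bucket(rate=10, period=1.0)
delays = [
    bucket.acquire(weight=10),  # expensive request drains the full bucket
    bucket.acquire(),           # must wait for one drop to refill
    bucket.acquire(weight=5),   # must wait for five drops to refill
]
```

In this model a heavier weight does not slow down the request that carries it when the bucket is full; it delays the requests that follow.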