Core concepts
Before proceeding to development, let's take a look at basic DipDup concepts. The main terms are highlighted in italics and will be discussed in detail in the following sections.
Big picture
DipDup is a Python SDK for building custom backends for decentralized applications, or, indexers. DipDup indexers are off-chain services that aggregate blockchain data from various sources and store it in a database.
Each indexer consists of a YAML config file and a Python package with models, handlers, and other code. The configuration file describes what contracts to index, what data to extract from them, and where to store the result. Powerful configuration features like templates, environment variables substitution, and merging multiple files allow making your indexer completely declarative. If you're coming from The Graph, the syntax is somewhat similar to Subgraph manifests.
An index is a set of contracts and rules for processing them as a single entity. Your config can contain more than one index, but they are processed in parallel and cannot share data as execution order is not guaranteed.
The Python package contains ORM models, callbacks, typeclasses, scripts and queries. Models describe the domain-specific data structures you want to store in the database. Callbacks implement the business logic, i.e., how to convert blockchain data to your models. Other files in the package are optional and can be used to extend DipDup functionality.
As a result, you get a service responsible for filling the database with indexed data. Then you can use it to build a custom API backend or integrate with existing ones. DipDup provides Hasura GraphQL Engine integration to expose indexed data via REST and GraphQL with zero configuration, but you can use other API engines like PostgREST or develop one in-house.
Storage layer
DipDup uses PostgreSQL or SQLite as a database backend. All the data is stored in a single database schema created on the first run. Make sure it's used by DipDup exclusively since changes in index configuration or models DipDup trigger reindexing, dropping the whole database schema and starting indexing from scratch. You can, however, mark specific tables as immune to preserve them or configure actions to be performed on each reindexing reason.
DipDup does not support database schema migrations, as they introduce complexity and can mess with data consistency. Any change in models or index definitions will trigger reindexing. DipDup stores hashes of the SQL schema and config file, and checks them each time you run indexing.
DipDup applies all updates atomically block by block, ensuring data integrity. If indexing is interrupted, the next time DipDup starts, it will check the database state and continue from the last block processed. The DipDup state is stored in the database per index and can be used by API consumers to determine the current indexer head.
Handling chain reorgs
Reorg messages signal chain reorganizations, which means some blocks, including all operations, are rolled back in favor of another with higher fitness. It's crucial to handle these messages correctly to avoid accumulating duplicate or invalid data. DipDup processes chain reorgs by restoring a previous database state, but you can implement your rollback logic by editing the on_index_rollback
system hook.