Hooks
Hooks are user-defined callbacks not linked to any index. There are two types of hooks:
- System hooks are called on system-wide events like process restart.
- User hooks are called either with the ctx.fire_hook method or by the job scheduler.
System hooks
Every DipDup project has multiple system hooks; they fire on system-wide events and, like regular hooks, are not linked to any index. Names of those hooks are reserved; you can't use them in config. System hooks are not atomic and can't be fired manually or with a job scheduler.
You can also put SQL scripts in the corresponding sql/on_* directories to execute them, as with regular hooks.
on_restart
This hook executes right before indexing starts. It allows configuring DipDup at runtime based on data from external sources. Datasources are already initialized when it runs and are available at ctx.datasources. You can, for example, configure logging here or add contracts and indexes at runtime instead of declaring them in the static config.
SQL scripts in the sql/on_restart directory may contain CREATE OR REPLACE VIEW or similar non-destructive operations.
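As a rough illustration of adding a contract and an index at runtime, the sketch below shows what an on_restart callback might look like. It assumes that ctx.add_contract and ctx.add_index accept the keyword arguments shown; check them against the DipDup version you use. The contract name, address, typename, and template name are hypothetical, and ctx is typed loosely so the snippet does not need DipDup installed.

```python
import asyncio
from typing import Any, Dict, List, Tuple


async def on_restart(ctx: Any) -> None:
    # In a real project `ctx` would be a dipdup.context.HookContext.
    # All names and the address below are hypothetical placeholders.
    await ctx.add_contract(
        name='my_token',
        address='KT1_hypothetical_address',
        typename='token',
    )
    # Assumes a template called 'token_transfers' exists in the config.
    await ctx.add_index(
        name='my_token_transfers',
        template='token_transfers',
        values={'contract': 'my_token'},
    )


class RecordingContext:
    """Stand-in context that only records calls; not part of DipDup."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Dict[str, Any]]] = []

    async def add_contract(self, **kwargs: Any) -> None:
        self.calls.append(('add_contract', kwargs))

    async def add_index(self, **kwargs: Any) -> None:
        self.calls.append(('add_index', kwargs))
```

Running `asyncio.run(on_restart(RecordingContext()))` exercises the callback without a database or datasources, which can be handy for a quick local check.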
on_reindex
This hook fires after the database is re-initialized following a reindex (wipe) and before indexing starts.
It is helpful for modifying the schema with arbitrary SQL scripts before indexing. For example, you can change the database schema in ways that are not supported by the DipDup ORM, e.g., to create a composite primary key.
on_synchronized
This hook fires when every active index reaches a realtime state. Here you can clear internal caches or do other cleanups.
on_index_rollback
Fires when one of the index datasources receives a chain reorg message.
Since version 6.0 this hook performs a database-level rollback by default. If you want to process rollbacks manually, remove the ctx.rollback call and implement custom logic in this callback.
User hooks
Let's assume we want to calculate some statistics on-demand to avoid blocking an indexer with heavy computations. Add the following lines to the DipDup config:
hooks:
  calculate_stats:
    callback: calculate_stats
    atomic: False
    args:
      major: bool
      depth: int
Values of the args mapping are used as type hints in the signature of the generated callback. The following callback stub will be created on init:
from dipdup.context import HookContext


async def calculate_stats(
    ctx: HookContext,
    major: bool,
    depth: int,
) -> None:
    await ctx.execute_sql_script('calculate_stats')
By default, hooks execute SQL scripts from the corresponding subdirectory of sql/. Remove or comment out the ctx.execute_sql_script call in the stub to prevent it.
To trigger the hook, call the ctx.fire_hook method from any callback:
await ctx.fire_hook('calculate_stats', major=True, depth=10)
Atomicity and blocking
The atomic option defines whether the hook callback is wrapped in a single SQL transaction. If this option is set to true, the main indexing loop is blocked until hook execution completes. Some statements, like REFRESH MATERIALIZED VIEW, do not need to be wrapped in a transaction, so setting atomic to false for hooks that run them can decrease the time needed to perform initial indexing.
Note that calling an atomic hook from a handler will block the main loop forever. To avoid this, use the wait=False option:
async def handler(ctx: HandlerContext, ...) -> None:
    await ctx.fire_hook('atomic_hook', wait=False)
This hook will be executed when the current transaction is committed.
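The ordering described above can be pictured with a toy model. This is not DipDup's implementation: ToyContext, its commit method, and the log are hypothetical stand-ins that only mimic the behavior stated here, namely that a hook fired with wait=False runs after the current transaction is committed.

```python
import asyncio
from typing import Awaitable, Callable, List


class ToyContext:
    """Hypothetical stand-in for a handler context; not DipDup code."""

    def __init__(self) -> None:
        self.log: List[str] = []
        self._pending: List[Callable[[], Awaitable[None]]] = []

    async def fire_hook(self, name: str, wait: bool = True) -> None:
        async def run() -> None:
            self.log.append(f'hook {name}')

        if wait:
            # Run the hook immediately, inside the caller's flow.
            await run()
        else:
            # Defer the hook until the surrounding transaction commits.
            self._pending.append(run)

    async def commit(self) -> None:
        self.log.append('commit')
        pending, self._pending = self._pending, []
        for run in pending:
            await run()


async def handler(ctx: ToyContext) -> None:
    ctx.log.append('handler start')
    await ctx.fire_hook('atomic_hook', wait=False)
    ctx.log.append('handler end')


async def main() -> List[str]:
    ctx = ToyContext()
    await handler(ctx)
    await ctx.commit()
    return ctx.log
```

In this model the handler returns before the hook body runs, so the handler never waits on work that itself needs the handler's transaction to finish.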
Arguments typechecking
DipDup will ensure that arguments passed to the hooks have the correct types when possible. A CallbackTypeError exception will be raised otherwise. Values of the args mapping in a hook config should be either built-in types or the __qualname__ of an external type, like decimal.Decimal. Generic types are not supported: hints like Optional[int] = None will be correctly parsed during codegen but ignored on type checking.
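To make the idea concrete, here is a minimal sketch of how type names from an args mapping could be resolved and checked. It is an illustration under stated assumptions, not DipDup's actual code: resolve_type and check_args are hypothetical helpers, the check is a plain isinstance test, and the built-in TypeError stands in for CallbackTypeError.

```python
import importlib
from typing import Any, Dict


def resolve_type(name: str) -> type:
    """Resolve 'int' or 'decimal.Decimal' to the class it names."""
    module_name, _, attr = name.rpartition('.')
    # Names without a dot are looked up among the built-in types.
    module = importlib.import_module(module_name or 'builtins')
    return getattr(module, attr)


def check_args(spec: Dict[str, str], kwargs: Dict[str, Any]) -> None:
    """Raise TypeError (stand-in for CallbackTypeError) on a mismatch."""
    for arg, type_name in spec.items():
        expected = resolve_type(type_name)
        if not isinstance(kwargs.get(arg), expected):
            raise TypeError(
                f'{arg} must be {type_name}, '
                f'got {type(kwargs.get(arg)).__name__}'
            )


spec = {'major': 'bool', 'depth': 'int'}
check_args(spec, {'major': True, 'depth': 10})  # passes silently
```

A simple name lookup like this cannot handle a generic hint such as Optional[int], which is one way to see why such hints would be skipped during checking.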
Job scheduler
Jobs schedule specific hooks to run by crontab or at a fixed interval. They are defined in the jobs config section.
jobs:
  midnight_stats:
    hook: calculate_stats
    crontab: "0 0 * * *"
    args:
      major: True
  minute_stats:
    hook: calculate_stats
    interval: 60
    args:
      major: False
You can also run a hook as a daemon, implying that it will never finish execution. This is useful for hooks that perform some kind of background processing, like sending notifications or updating caches.
jobs:
  daemon_stats:
    hook: calculate_stats
    daemon: True
If you're unfamiliar with crontab syntax, the online service crontab.guru will help you build the desired expression.
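To show what the two trigger kinds in the midnight_stats and minute_stats jobs mean in practice, the sketch below computes the next run time for each. It assumes that interval is given in seconds and uses naive datetimes without time zones. Both helper functions are hypothetical and handle only these two cases; they are not a crontab parser.

```python
from datetime import datetime, time, timedelta


def next_interval_run(last_run: datetime, interval_seconds: int) -> datetime:
    """Next run of an interval job, assuming the interval is in seconds."""
    return last_run + timedelta(seconds=interval_seconds)


def next_midnight(now: datetime) -> datetime:
    """Next firing of the crontab '0 0 * * *' strictly after `now`."""
    tomorrow = (now + timedelta(days=1)).date()
    return datetime.combine(tomorrow, time(0, 0))


now = datetime(2024, 1, 1, 23, 59)
print(next_midnight(now))          # 2024-01-02 00:00:00
print(next_interval_run(now, 60))  # 2024-01-02 00:00:00
```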
apscheduler configuration
DipDup utilizes the apscheduler library to run hooks according to the schedules in the jobs config section. In the following example, apscheduler will allow up to three instances of the same job to run concurrently, so a new run can start when the trigger fires even if previous runs are still in progress; with coalesce enabled, several pending runs of a job are rolled into one:
advanced:
  scheduler:
    apscheduler.job_defaults.coalesce: True
    apscheduler.job_defaults.max_instances: 3
See the apscheduler docs for details.
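The effect of max_instances can be pictured with a small standard-library simulation. This is a toy model and does not use apscheduler: ToyJob and simulate are hypothetical names, and the model only mirrors the rule that a trigger is skipped once the limit of concurrently running instances is reached.

```python
import asyncio
from typing import List, Optional


class ToyJob:
    """Toy model of a job with a cap on concurrent runs; not apscheduler."""

    def __init__(self, max_instances: int) -> None:
        self.max_instances = max_instances
        self.running = 0
        self.started = 0
        self.skipped = 0

    async def _run(self, release: asyncio.Event) -> None:
        try:
            # Simulate a long run: wait until the simulation releases it.
            await release.wait()
        finally:
            self.running -= 1

    def trigger(self, release: asyncio.Event) -> Optional[asyncio.Task]:
        if self.running >= self.max_instances:
            # Limit reached: this trigger does not start a new run.
            self.skipped += 1
            return None
        self.running += 1
        self.started += 1
        return asyncio.create_task(self._run(release))


async def simulate(triggers: int, max_instances: int) -> ToyJob:
    job = ToyJob(max_instances)
    release = asyncio.Event()
    tasks: List[asyncio.Task] = []
    for _ in range(triggers):
        task = job.trigger(release)
        if task is not None:
            tasks.append(task)
        await asyncio.sleep(0)  # let started runs begin waiting
    release.set()  # allow all in-progress runs to finish
    await asyncio.gather(*tasks)
    return job
```

With five triggers fired while earlier runs are still in progress and max_instances set to 3, this model starts three runs and skips two.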
Note that you can't use executors from the apscheduler.executors.pool module; a ConfigurationError exception will be raised.