Fri, Jun 18 2021

DipDup v1.0 introduces dynamic indexes and other improvements

A major update of the selective indexing framework by Baking Bad brings new features and improves stability

Lev

DipDup is a framework for building selective indexers for Tezos dapps. It reduces boilerplate code and lets developers focus on what really matters: the business logic. It works on top of the TzKT API, which provides normalized, human-readable blockchain data via REST and WebSocket endpoints.
This article will guide you through the recent DipDup changes with code snippets and demo samples.

NOTE

You can spin up any of the demo projects in a minute:

git clone https://github.com/dipdup-net/dipdup-py
cd dipdup-py
poetry install
poetry run dipdup -c src/demo_<name>/dipdup.yml run

Then run sqlite3 <name>.sqlite (or use any other tool) to explore the created database.

# Breaking changes

# Several internal classes renamed

Some classes have changed their names in the v1.0.0 release:

dipdup.models.TransactionContext -> dipdup.models.Transaction
dipdup.models.OriginationContext -> dipdup.models.Origination
dipdup.models.BigMapContext -> dipdup.models.BigMapDiff
dipdup.models.HandlerContext -> dipdup.context.HandlerContext
dipdup.models.OperationHandlerContext -> dipdup.context.HandlerContext
dipdup.models.BigMapHandlerContext -> dipdup.context.HandlerContext

This change draws a clear distinction between HandlerContext and the operation and big map diff data classes.
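To make the mapping above concrete, here is a minimal sketch of rewriting a single-line `from dipdup.models import ...` statement by hand. The `rewrite_import` helper and the `RENAMES` table are hypothetical names introduced for illustration; this is not how `dipdup migrate` is implemented, and it only touches the import line, not usages in the handler body.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Old class name -> (new module, new class name), taken from the list above.
RENAMES: Dict[str, Tuple[str, str]] = {
    'TransactionContext': ('dipdup.models', 'Transaction'),
    'OriginationContext': ('dipdup.models', 'Origination'),
    'BigMapContext': ('dipdup.models', 'BigMapDiff'),
    'HandlerContext': ('dipdup.context', 'HandlerContext'),
    'OperationHandlerContext': ('dipdup.context', 'HandlerContext'),
    'BigMapHandlerContext': ('dipdup.context', 'HandlerContext'),
}

OLD_PREFIX = 'from dipdup.models import '


def rewrite_import(line: str) -> List[str]:
    """Hypothetical helper: rewrite one single-line import from dipdup.models."""
    stripped = line.strip()
    if not stripped.startswith(OLD_PREFIX):
        return [line]  # not an import we know how to rewrite
    by_module: Dict[str, List[str]] = defaultdict(list)
    for name in stripped[len(OLD_PREFIX):].split(','):
        name = name.strip()
        # Names that were not renamed stay in dipdup.models unchanged.
        module, new_name = RENAMES.get(name, ('dipdup.models', name))
        if new_name not in by_module[module]:
            by_module[module].append(new_name)
    return [
        f"from {module} import {', '.join(names)}"
        for module, names in sorted(by_module.items())
    ]


new_lines = rewrite_import(
    'from dipdup.models import OperationHandlerContext, TransactionContext'
)
print('\n'.join(new_lines))
```

The three old context classes collapse into a single `HandlerContext` import, which is why the helper deduplicates names per module.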

ctx: HandlerContext is the first argument of each DipDup handler (including the default ones) and contains all the additional data and helpers you may need. The context is covered in more detail later in this article.

MIGRATION

Run dipdup migrate to automatically migrate your existing code.
Check the resulting diff to ensure nothing has broken.

# `big_map` indexes process diffs one by one

The big_map index lets you speed up syncing when all you need is the updates of one or several big maps. Before v1.0.0, a single big map handler could accept multiple lists of updates from different big maps.

config (before)

tezos_domains_big_map:
  kind: big_map
  datasource: <datasource>
  handlers:
    - callback: on_update
      pattern:
        - contract: <name_registry>
          path: store.records
        - contract: <name_registry>
          path: store.expiry_map

Diffs were grouped by pattern and passed to the handler once per block:

handler (before)

async def on_update(
    ctx: HandlerContext,
    store_records: List[BigMapContext[StoreRecordsKey, StoreRecordsValue]],
    store_expiry_map: List[BigMapContext[StoreExpiryMapKey, StoreExpiryMapValue]],
) -> None:
    ...

Now every big map has a separate handler:

config (after)

tezos_domains_big_map:
  kind: big_map
  datasource: <datasource>
  handlers:
    - callback: on_update_records
      contract: <name_registry>
      path: store.records
    - callback: on_update_expiry_map
      contract: <name_registry>
      path: store.expiry_map

handler 1 (after)

async def on_update_expiry_map(
    ctx: HandlerContext,
    store_expiry_map: BigMapDiff[StoreExpiryMapKey, StoreExpiryMapValue],
) -> None:
    ...

handler 2 (after)

async def on_update_records(
    ctx: HandlerContext,
    store_records: BigMapDiff[StoreRecordsKey, StoreRecordsValue],
) -> None:
    ...

As a result, every big_map index handler now takes exactly two arguments: the context and a single diff.
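The one-diff-per-call behaviour can be pictured with a small dispatcher. This is only a sketch under stated assumptions: `Diff`, `dispatch`, and the plain `dict` used as a context are stand-ins invented for illustration, not DipDup's real classes or its internal routing code.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, List


@dataclass
class Diff:
    """Stand-in for a big map diff; not the real dipdup.models.BigMapDiff."""
    path: str
    key: str
    value: str


Handler = Callable[[dict, Diff], Awaitable[None]]


async def dispatch(ctx: dict, diffs: List[Diff], handlers: Dict[str, Handler]) -> None:
    """Call the handler registered for each diff's path, one diff at a time."""
    for diff in diffs:
        handler = handlers.get(diff.path)
        if handler is not None:
            await handler(ctx, diff)


calls: List[str] = []


async def on_update_records(ctx: dict, store_records: Diff) -> None:
    calls.append(f'records:{store_records.key}')


async def on_update_expiry_map(ctx: dict, store_expiry_map: Diff) -> None:
    calls.append(f'expiry:{store_expiry_map.key}')


# Diffs of one block, in the order they occurred.
block_diffs = [
    Diff('store.records', 'alice.tez', 'owner1'),
    Diff('store.expiry_map', 'alice.tez', '2022-01-01'),
    Diff('store.records', 'bob.tez', 'owner2'),
]

asyncio.run(dispatch({}, block_diffs, {
    'store.records': on_update_records,
    'store.expiry_map': on_update_expiry_map,
}))
print(calls)
```

Each handler sees a single diff per call, so no handler needs to unpack a list.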

MIGRATION

In order to migrate an existing project:

Update the DipDup config
Run dipdup init

Rename existing handlers in advance if you want to reuse their names.

Keep in mind that all indexes are still atomic by block. That means if an error occurs during the execution of a handler, all the related database changes will be reverted.
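The block-level atomicity described above can be illustrated with a plain database transaction. The sketch below uses the standard library `sqlite3` module rather than DipDup's own database layer, and `process_block`, `save_trade`, and `failing_handler` are hypothetical names; it shows the idea, not DipDup's implementation.

```python
import sqlite3
from typing import Callable, List

BlockHandler = Callable[[sqlite3.Connection, int], None]


def process_block(conn: sqlite3.Connection, level: int, handlers: List[BlockHandler]) -> bool:
    """Run all handlers of one block inside a single transaction."""
    try:
        with conn:  # commits if the body succeeds, rolls back if it raises
            for handler in handlers:
                handler(conn, level)
    except Exception:
        return False
    return True


def save_trade(conn: sqlite3.Connection, level: int) -> None:
    conn.execute('INSERT INTO trade (level, amount) VALUES (?, ?)', (level, 100))


def failing_handler(conn: sqlite3.Connection, level: int) -> None:
    raise ValueError('handler failed')


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE trade (level INTEGER, amount INTEGER)')
conn.commit()

# Block 1: the second handler fails, so the first handler's insert is reverted.
block_1_ok = process_block(conn, 1, [save_trade, failing_handler])
# Block 2: every handler succeeds, so the insert is committed.
block_2_ok = process_block(conn, 2, [save_trade])

rows = conn.execute('SELECT level FROM trade').fetchall()
print(block_1_ok, block_2_ok, rows)
```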

# New operation matching options

There are now three filters in the origination pattern:

originated_contract: Matches a specific contract origination
source: Matches all contracts originated by the specified account (for example, all Quipuswap DEX contracts are created using the launchExchange entrypoint of the factory contract)
similar_to: Matches originated contracts that have the same parameter and storage types as the reference contract; add strict: True to narrow the match down to contracts with the same code.
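The three filters can be read as independent checks that must all pass. Below is a minimal sketch of that logic, assuming each contract comes with a hash of its parameter and storage types and a hash of its whole code. `Contract`, `OriginationOp`, and `OriginationPattern` are illustrative stand-ins, not DipDup classes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Contract:
    address: str
    type_hash: int  # assumed: hash of parameter and storage types
    code_hash: int  # assumed: hash of the whole contract code


@dataclass
class OriginationOp:
    source: str
    contract: Contract


@dataclass
class OriginationPattern:
    originated_contract: Optional[str] = None
    source: Optional[str] = None
    similar_to: Optional[Contract] = None
    strict: bool = False

    def matches(self, op: OriginationOp) -> bool:
        """Every filter that is set must accept the origination."""
        if self.originated_contract is not None and op.contract.address != self.originated_contract:
            return False
        if self.source is not None and op.source != self.source:
            return False
        if self.similar_to is not None:
            if self.strict:
                # strict: compare the whole code, not just the types
                if op.contract.code_hash != self.similar_to.code_hash:
                    return False
            elif op.contract.type_hash != self.similar_to.type_hash:
                return False
        return True


reference = Contract('KT1Reference', type_hash=1, code_hash=10)
same_types = OriginationOp('tz1Alice', Contract('KT1New', type_hash=1, code_hash=11))

loose = OriginationPattern(similar_to=reference).matches(same_types)
strict = OriginationPattern(similar_to=reference, strict=True).matches(same_types)
by_source = OriginationPattern(source='tz1Bob').matches(same_types)
print(loose, strict, by_source)
```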

Include the following lines in your index config to handle originations:

indexes:
  my_index:
    types:
      - transaction
      - origination

An operation pattern can now omit the entrypoint to match regular transfers. For example, the tokenToTezPayment entrypoint of the Quipuswap DEX contract emits an internal transaction that has no parameter:

tx without params

- callback: on_fa2_token_to_tez
  pattern:
    - type: transaction
      destination: <dex_contract>
      entrypoint: tokenToTezPayment
    - type: transaction
      destination: <token_contract>
      entrypoint: transfer
    - type: transaction
      source: <dex_contract>

A tokenToTezPayment call always generates a transfer to the operation initiator. The withdrawProfit entrypoint, however, may produce no internal transfer if the initiator has zero baking rewards. The optional flag comes in handy in such cases:

optional item

- callback: on_fa2_withdraw_profit
  pattern:
    - type: transaction
      destination: <dex_contract>
      entrypoint: withdrawProfit
    - type: transaction
      source: <dex_contract>
      optional: True
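One way to picture how an optional item behaves is a sequential matcher that skips the item when the next operation does not fit. This is a simplified sketch, not DipDup's matching algorithm; `Op`, `PatternItem`, and `match_group` are hypothetical names, and an unset `entrypoint` in the pattern simply means "do not check it".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Op:
    source: str
    destination: str
    entrypoint: Optional[str] = None  # None models a transfer without a parameter


@dataclass
class PatternItem:
    destination: Optional[str] = None
    source: Optional[str] = None
    entrypoint: Optional[str] = None
    optional: bool = False

    def matches(self, op: Op) -> bool:
        if self.destination is not None and op.destination != self.destination:
            return False
        if self.source is not None and op.source != self.source:
            return False
        if self.entrypoint is not None and op.entrypoint != self.entrypoint:
            return False
        return True


def match_group(pattern: List[PatternItem], operations: List[Op]) -> Optional[List[Optional[Op]]]:
    """Return one entry per pattern item (None for a skipped optional item),
    or None if a required item has no matching operation."""
    matched: List[Optional[Op]] = []
    pos = 0
    for item in pattern:
        if pos < len(operations) and item.matches(operations[pos]):
            matched.append(operations[pos])
            pos += 1
        elif item.optional:
            matched.append(None)
        else:
            return None
    return matched


pattern = [
    PatternItem(destination='dex', entrypoint='withdrawProfit'),
    PatternItem(source='dex', optional=True),
]
with_transfer = [Op('user', 'dex', 'withdrawProfit'), Op('dex', 'user')]
without_transfer = [Op('user', 'dex', 'withdrawProfit')]

print(match_group(pattern, with_transfer))
print(match_group(pattern, without_transfer))
```

In this sketch the handler would receive `None` in place of the missing transfer, so it has to handle both cases.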

# Dynamic configuration

What's the point of having a powerful origination matching algorithm without the ability to spawn indexes at runtime? Here's how to achieve that with v1.0.0:

Prepare a template for the indexes you plan to spawn dynamically
Add a new operation index with the desired origination pattern to your configuration

stateless index

factory:
  kind: operation
  datasource: tzkt
  types:
    - origination
  handlers:
    - callback: on_factory_origination
      pattern:
        - type: origination
          similar_to: registry
  stateless: True

The stateless flag indicates that this index performs no database operations and acts as a factory for spawning other indexes at runtime.

Run dipdup init to generate handlers and typeclasses.
Call add_contract and add_index helpers from inside the generated handler to spawn new indexes.

factory handler

ctx.add_contract(
    name=originated_contract,
    address=originated_contract,
    typename='some_type',
)
ctx.add_index(
    name=index_name,
    template='some_template',
    values=dict(contract=originated_contract),
)

Dynamic indexes are handled in exactly the same way as ordinary ones: they are first synced using REST requests (in case there are invocations right after the deployment) and then switched to WebSocket updates.
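To show what spawning an index from a template amounts to, here is a self-contained sketch with a stand-in context. `FakeContext`, `render`, the template contents, and the `on_propose`/`propose` names are all assumptions made for illustration. The `<contract>` placeholder style follows the config snippets in this article, but the real `HandlerContext` and template engine may work differently, and real handlers are `async`.

```python
import re
from typing import Any, Dict

PLACEHOLDER = re.compile(r'<(\w+)>')


def render(template: Any, values: Dict[str, str]) -> Any:
    """Recursively replace <name> placeholders with the given values."""
    if isinstance(template, str):
        return PLACEHOLDER.sub(lambda m: values[m.group(1)], template)
    if isinstance(template, list):
        return [render(item, values) for item in template]
    if isinstance(template, dict):
        return {key: render(item, values) for key, item in template.items()}
    return template


class FakeContext:
    """Stand-in that only records what a factory handler asks for."""

    def __init__(self, templates: Dict[str, Any]) -> None:
        self.templates = templates
        self.contracts: Dict[str, Dict[str, str]] = {}
        self.indexes: Dict[str, Any] = {}

    def add_contract(self, name: str, address: str, typename: str) -> None:
        self.contracts[name] = {'address': address, 'typename': typename}

    def add_index(self, name: str, template: str, values: Dict[str, str]) -> None:
        if name in self.indexes:
            raise ValueError(f'index {name} already exists')
        self.indexes[name] = render(self.templates[template], values)


def on_factory_origination(ctx: FakeContext, originated_contract: str) -> None:
    index_name = f'registry_dao_{originated_contract}'
    ctx.add_contract(name=originated_contract, address=originated_contract, typename='some_type')
    ctx.add_index(name=index_name, template='some_template', values=dict(contract=originated_contract))


ctx = FakeContext({
    'some_template': {
        'kind': 'operation',
        'datasource': 'tzkt',
        'contracts': ['<contract>'],
        'handlers': [{
            'callback': 'on_propose',
            'pattern': [{'destination': '<contract>', 'entrypoint': 'propose'}],
        }],
    },
})
on_factory_origination(ctx, 'KT1Example')
print(ctx.indexes)
```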

Another way to configure DipDup at runtime is the on_configure handler. DipDup executes this handler before indexing starts and gives you full control over the configuration.

# Other improvements

# Better logging

We have improved the readability of DipDup logs and made filtering much easier. Logging is set up with Python logging.config configuration files in YAML format (see the built-in configs).
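As a rough illustration of what such a configuration contains, the sketch below expresses one as a Python dict passed to `logging.config.dictConfig`, on the assumption that the YAML files follow the same dictionary schema. The formatter string is a guess that reproduces the column layout of DipDup's console output; the `brief` and `console` names are arbitrary.

```python
import logging
import logging.config

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        # Level padded to 8 characters, logger name padded to 20.
        'brief': {'format': '%(levelname)-8s %(name)-20s %(message)s'},
    },
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
            'formatter': 'brief',
            'stream': 'ext://sys.stdout',
        },
    },
    'loggers': {
        # Child loggers such as dipdup.index propagate to this one.
        'dipdup': {'level': 'INFO', 'handlers': ['console'], 'propagate': False},
    },
}

logging.config.dictConfig(LOGGING)
logging.getLogger('dipdup.index').info('Processing 3 operations of level 1518979')
```

Because the logger name is part of every line, filtering by handler or module becomes a matter of grepping for one column.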

A preconfigured logger is now available in every handler at ctx.logger.

ctx.logger.info('Hello world!')

INFO     dipdup.index         Processing 3 operations of level 1518979
INFO     dipdup.index         oo6E3if16UB835y9m85HVbowcCX6LdUEvSqCu6MsmLWuJa89JjH: `on_fa2_token_to_tez` handler matched!
INFO     on_fa2_token_to_tez  oo6E3if16UB835y9m85HVbowcCX6LdUEvSqCu6MsmLWuJa89JjH: Hello world!

By default, messages from ctx.logger are prefixed with the operation group hash. You can always change the format so that it better suits your needs:

ctx.logger.fmt = field_to_grep_by + ': {}'
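A logger with a settable `fmt` attribute can be approximated with the standard `logging.LoggerAdapter`. The sketch below is an assumption about how such a logger might work, not DipDup's actual class. `PrefixLogger`, `ListHandler`, and the shortened hash `ooHASH` are made up for the example.

```python
import logging
from typing import Any, List, MutableMapping, Tuple


class PrefixLogger(logging.LoggerAdapter):
    """Hypothetical adapter: wraps every message with a str.format template."""

    def __init__(self, logger: logging.Logger, fmt: str = '{}') -> None:
        super().__init__(logger, {})
        self.fmt = fmt

    def process(self, msg: Any, kwargs: MutableMapping[str, Any]) -> Tuple[Any, MutableMapping[str, Any]]:
        return self.fmt.format(msg), kwargs


class ListHandler(logging.Handler):
    """Collects formatted messages in memory so the result can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


base = logging.getLogger('on_fa2_token_to_tez')
base.setLevel(logging.INFO)
base.propagate = False
collector = ListHandler()
base.addHandler(collector)

logger = PrefixLogger(base, fmt='ooHASH: {}')
logger.info('Hello world!')

# Switch the prefix to any other field that is convenient to grep by.
logger.fmt = 'level=1518979: {}'
logger.info('Hello again!')
print(collector.messages)
```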

If console logging is not enough, there are many custom handlers available: Telegram, Logstash, and others. Give them a try.

# Executing arbitrary SQL commands

When using PostgreSQL as the database backend, you can run SQL scripts during initialization. Create a directory named sql in your project root and place any number of files with the .sql extension in it.

INFO     dipdup.dipdup        Initializing database
INFO     dipdup.dipdup        Applying raw SQL from `00-trade_summary_fn.sql`
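The idea of applying raw SQL files at startup can be sketched as follows. The example uses the standard library `sqlite3` as a stand-in for PostgreSQL so that it runs anywhere, and it assumes the files are applied in file name order, which the numeric prefix in the log above suggests but does not confirm. `apply_sql_scripts` and the file names are hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path
from typing import List


def apply_sql_scripts(conn: sqlite3.Connection, sql_dir: Path) -> List[str]:
    """Execute every *.sql file in the directory, sorted by file name."""
    applied = []
    for path in sorted(sql_dir.glob('*.sql')):
        conn.executescript(path.read_text())
        applied.append(path.name)
    return applied


with tempfile.TemporaryDirectory() as tmp:
    sql_dir = Path(tmp) / 'sql'
    sql_dir.mkdir()
    # Written out of order on purpose; the numeric prefix decides the order.
    (sql_dir / '01-seed.sql').write_text("INSERT INTO trade_summary VALUES ('XTZ', 0);")
    (sql_dir / '00-create.sql').write_text('CREATE TABLE trade_summary (symbol TEXT, volume INTEGER);')
    (sql_dir / 'notes.txt').write_text('files without the .sql extension are ignored')

    conn = sqlite3.connect(':memory:')
    applied = apply_sql_scripts(conn, sql_dir)
    rows = conn.execute('SELECT symbol, volume FROM trade_summary').fetchall()

print(applied, rows)
```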

# New CLI options

dipdup run now has several additional options:

--reindex: drop the database and start indexing from scratch
--oneshot: synchronize indexes via REST and exit without establishing a real-time connection. Useful for debugging with the first_block and last_block fields set in the configuration file

# What's next?

This is just the beginning. Here's what's on our roadmap for future releases:

Performance optimizations for multiple indexes in a single application.
Better rollback handling. For now, a reorganization event received by DipDup leads to a full reindexing. Soon you'll be able to implement "backward handlers" to process the rolled-back block in reverse order.
Integration with mempool and metadata plugins written in Go.
Hasura 2.0 integration.

DipDup is a free open-source project driven by the needs of fellow Tezos developers like you. Let us know what you think about the recent changes and our future plans! Come join the Baking Bad Telegram group, the #baking-bad channel in the tezos-dev Slack, and our Discord server.
