event_reader.timestamp_cache

Documentation for eth_defi.event_reader.timestamp_cache Python module.

DuckDB-based cache for block number -> timestamp mapping.

By default, we manage a database file at ~/.tradingstrategy/block-timestamps.duckdb that holds the chain -> block -> timestamp mapping. Fetching block numbers and timestamps is a common, expensive operation when scanning historical events.

Functions

load_timestamp_cache(chain_id[, cache_folder])

Load the block->timestamp cache for a given chain ID.

Classes

BlockTimestampDatabase

Mapping of chain ID -> block number -> timestamp using DuckDB.

BlockTimestampSlicer

Read timestamps from DuckDB in slices iteratively.

class BlockTimestampDatabase

Bases: object

Mapping of chain ID -> block number -> timestamp using DuckDB.

  • Internal storage: DuckDB on-disk database (or in-memory).

  • Efficient selective loading and upserting

  • One-second precision to save disk space and improve speed

For usage see eth_defi.event_reader.multicall_timestamp.fetch_block_timestamps_multiprocess_auto_backend

Initialize the database connection.

Parameters

path – Path to the DuckDB file. Use ‘:memory:’ for transient storage.

__init__(chain_id, path)

Initialize the database connection.

Parameters
  • path (pathlib.Path) – Path to the DuckDB file. Use ‘:memory:’ for transient storage.

  • chain_id (int) –

close()

Release DuckDB resources.

static create(chain_id, path)

Create an in-memory instance.

Parameters
  • chain_id (int) –

  • path (pathlib.Path) –

Return type

eth_defi.event_reader.timestamp_cache.BlockTimestampDatabase

find_gaps()

Find all gaps in the block timestamp database.

Uses LEAD window function for efficient gap boundary detection without materialising the full expected block range.

Returns

List of (gap_start, gap_end, gap_size) tuples. gap_start is the last present block before the gap, gap_end is the first present block after the gap, gap_size is the number of missing blocks.

Return type

list[tuple[int, int, int]]
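
The LEAD-based gap detection described above can be sketched as follows. This is an illustration only, not the library's implementation: it uses sqlite3 from the Python standard library as a stand-in for DuckDB, and the table name block_timestamps, the column names block_number and ts, and the helper find_gaps_sketch are all hypothetical.

```python
import sqlite3


def find_gaps_sketch(conn: sqlite3.Connection) -> list[tuple[int, int, int]]:
    """Return (gap_start, gap_end, gap_size) for each run of missing blocks."""
    rows = conn.execute(
        """
        SELECT block_number AS gap_start,
               next_block AS gap_end,
               next_block - block_number - 1 AS gap_size
        FROM (
            -- Pair every stored block with the next stored block
            SELECT block_number,
                   LEAD(block_number) OVER (ORDER BY block_number) AS next_block
            FROM block_timestamps
        )
        -- The last row has next_block = NULL and is filtered out here
        WHERE next_block - block_number > 1
        ORDER BY gap_start
        """
    ).fetchall()
    return [tuple(row) for row in rows]


# Hypothetical schema and sample data: blocks 103-104 and 107-109 are missing
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE block_timestamps (block_number INTEGER PRIMARY KEY, ts INTEGER)")
present = [100, 101, 102, 105, 106, 110]
conn.executemany(
    "INSERT INTO block_timestamps VALUES (?, ?)",
    [(b, 1_700_000_000 + b) for b in present],
)
gaps = find_gaps_sketch(conn)
print(gaps)
```

Because LEAD only compares each stored row with its successor, the query never has to generate the full expected block range, which is the efficiency point made in the description.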

static get_database_file_chain(chain_id, path=PosixPath('/home/runner/.tradingstrategy/block-timestamp'))

Get the default database file path for a given chain ID.

Parameters

chain_id (int) –

Return type

pathlib.Path

get_first_and_last_block()

Get the first and last block numbers we have for a given chain ID.

Returns

(0, 0) if no data

Return type

tuple[int, int]
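
One way to obtain the "(0, 0) if no data" behaviour is to wrap MIN and MAX in COALESCE. The sketch below is an assumption about how such a query could look, not the library's code: it uses the standard-library sqlite3 module instead of DuckDB, and the table block_timestamps and the helper first_and_last_block_sketch are hypothetical names.

```python
import sqlite3


def first_and_last_block_sketch(conn: sqlite3.Connection) -> tuple[int, int]:
    """Return (first_block, last_block), or (0, 0) when the table is empty."""
    row = conn.execute(
        # MIN/MAX over an empty table yield NULL; COALESCE maps that to 0
        "SELECT COALESCE(MIN(block_number), 0), COALESCE(MAX(block_number), 0) "
        "FROM block_timestamps"
    ).fetchone()
    return (row[0], row[1])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE block_timestamps (block_number INTEGER PRIMARY KEY, ts INTEGER)")
empty_result = first_and_last_block_sketch(conn)

conn.executemany("INSERT INTO block_timestamps VALUES (?, ?)", [(5, 1000), (9, 1048)])
filled_result = first_and_last_block_sketch(conn)
```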

get_first_block()

Get the first block number we have for a given chain ID.

Returns

0 if no data

Return type

int

get_last_block()

Get the last block number we have for a given chain ID.

Returns

0 if no data

Return type

int

import_chain_data(chain_id, data)

Import data from raw dictionary format to the database.

  • Uses an upsert strategy (ON CONFLICT REPLACE) to ensure latest data is kept.

Parameters
  • chain_id (int) – Chain ID for the data being imported.

  • data (Union[dict[int, datetime.datetime], pandas.Series]) –

    Mapping of block number (int) to timestamp (datetime).

    For maximum speed, pass a pd.Series mapping block number -> unix timestamp.
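
The upsert strategy and the one-second precision can be illustrated with a small sketch. It is not the library's implementation: sqlite3 from the standard library stands in for DuckDB, the table block_timestamps and the helper import_chain_data_sketch are hypothetical, and the input datetimes are assumed to be naive UTC values.

```python
import datetime
import sqlite3


def import_chain_data_sketch(conn: sqlite3.Connection, data: dict[int, datetime.datetime]) -> None:
    """Upsert block number -> timestamp pairs, stored as whole unix seconds."""
    rows = [
        # Assumption: naive datetimes are UTC; sub-second precision is dropped
        (block, int(ts.replace(tzinfo=datetime.timezone.utc).timestamp()))
        for block, ts in data.items()
    ]
    conn.executemany(
        """
        INSERT INTO block_timestamps (block_number, ts) VALUES (?, ?)
        ON CONFLICT (block_number) DO UPDATE SET ts = excluded.ts
        """,
        rows,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE block_timestamps (block_number INTEGER PRIMARY KEY, ts INTEGER NOT NULL)")
import_chain_data_sketch(
    conn,
    {
        1: datetime.datetime(2024, 1, 1, 0, 0, 0),
        2: datetime.datetime(2024, 1, 1, 0, 0, 12),
    },
)
# Re-importing block 2 replaces the earlier value instead of raising an error
import_chain_data_sketch(conn, {2: datetime.datetime(2024, 1, 1, 0, 0, 13)})
stored = conn.execute("SELECT block_number, ts FROM block_timestamps ORDER BY block_number").fetchall()
```

The primary key on the block number is what makes the conflict clause fire, so the latest imported value always wins.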

is_closed()

Check if the database connection is closed.

Return type

bool

static load(chain_id, path)

Load the database from disk.

Parameters
  • chain_id (int) –

  • path (pathlib.Path) –

Return type

eth_defi.event_reader.timestamp_cache.BlockTimestampDatabase

query(start_block, end_block)

Get timestamps for a single chain in an inclusive block range.

Returns a Pandas Series to maintain compatibility with the original API.

Parameters
  • chain_id – EVM chain id

  • start_block (int) – Inclusive start block

  • end_block (int) – Inclusive end block

Returns

Pandas Series mapping block number (int) -> block timestamp (pd.Timestamp)

Return type

pandas.Series
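
The inclusive range semantics can be shown with a short sketch. To stay within the standard library it returns a plain dict rather than a Pandas Series and uses sqlite3 instead of DuckDB; the table block_timestamps and the helper query_sketch are hypothetical names, not part of the module.

```python
import sqlite3


def query_sketch(conn: sqlite3.Connection, start_block: int, end_block: int) -> dict[int, int]:
    """Return {block_number: unix_seconds} for start_block..end_block, both ends included."""
    rows = conn.execute(
        "SELECT block_number, ts FROM block_timestamps "
        "WHERE block_number BETWEEN ? AND ? ORDER BY block_number",
        (start_block, end_block),
    ).fetchall()
    return dict(rows)


# Hypothetical data: blocks 1..10, twelve seconds apart
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE block_timestamps (block_number INTEGER PRIMARY KEY, ts INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO block_timestamps VALUES (?, ?)",
    [(b, 1_700_000_000 + 12 * b) for b in range(1, 11)],
)
result = query_sketch(conn, 3, 5)
```

Both the start and end block appear in the result, so a caller scanning consecutive ranges should start the next range at end_block + 1 to avoid reading a block twice.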

save()

Force a checkpoint.

Note: DuckDB usually auto-commits. If moving from :memory: to disk, we need to copy.

to_series()

Get timestamps for a single chain.

Returns a Pandas Series to maintain compatibility with the original API.

Returns

Pandas Series mapping block number (int) -> block timestamp (pd.Timestamp)

Return type

Optional[pandas.Series]

transform_time_values(series)

Post-process our raw values from the database to actual time format.

Parameters

series (pandas.Series) – Pandas Series with datetime values

Returns

Pandas Series with integer unix timestamps (seconds)

Return type

pandas.Series

class BlockTimestampSlicer

Bases: object

Read timestamps from DuckDB in slices iteratively.

  • Maintain a memory buffer of block numbers

  • Avoid reading all 20 GB of Arbitrum timestamp data into memory at once

__init__(timestamp_db, slice_size=1000000)

Parameters
  • timestamp_db –

  • slice_size (int) –

close()

Release the associated cache db.

get(block_number)

Get timestamp for a given block number, or None if not found.

If the exact block is missing (gap in HyperSync data), returns the timestamp of the nearest available block in the current slice.

Parameters

block_number (int) –

Return type

Optional[datetime.datetime]
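
The slice buffering and the nearest-block fallback described above can be sketched in pure Python. This is a simplified illustration under stated assumptions, not the class's actual code: SliceBufferSketch and the fetch_range callback are hypothetical, slices are assumed to be aligned to multiples of slice_size, and timestamps are plain unix seconds rather than datetime objects.

```python
import bisect
from typing import Callable, Optional


class SliceBufferSketch:
    """Serve lookups from one in-memory slice of sorted (block, timestamp) pairs."""

    def __init__(
        self,
        fetch_range: Callable[[int, int], list[tuple[int, int]]],
        slice_size: int = 1_000_000,
    ):
        self.fetch_range = fetch_range  # hypothetical: inclusive range -> sorted pairs
        self.slice_size = slice_size
        self.slice_start = 0
        self.slice_end = -1  # empty buffer until the first lookup
        self.blocks: list[int] = []
        self.timestamps: list[int] = []

    def get(self, block_number: int) -> Optional[int]:
        if not (self.slice_start <= block_number <= self.slice_end):
            # Replace the buffer with the aligned slice containing the block
            self.slice_start = (block_number // self.slice_size) * self.slice_size
            self.slice_end = self.slice_start + self.slice_size - 1
            pairs = self.fetch_range(self.slice_start, self.slice_end)
            self.blocks = [b for b, _ in pairs]
            self.timestamps = [t for _, t in pairs]
        if not self.blocks:
            return None
        i = bisect.bisect_left(self.blocks, block_number)
        if i < len(self.blocks) and self.blocks[i] == block_number:
            return self.timestamps[i]
        # Exact block missing: use the nearest block present in this slice
        candidates = [j for j in (i - 1, i) if 0 <= j < len(self.blocks)]
        nearest = min(candidates, key=lambda j: abs(self.blocks[j] - block_number))
        return self.timestamps[nearest]


# Hypothetical backing store with gaps at blocks 3-4 and 6-11
backing = {0: 1000, 1: 1012, 2: 1024, 5: 1060, 12: 1144}
loads: list[tuple[int, int]] = []


def fetch_range(start: int, end: int) -> list[tuple[int, int]]:
    loads.append((start, end))
    return sorted((b, t) for b, t in backing.items() if start <= b <= end)


slicer = SliceBufferSketch(fetch_range, slice_size=10)
exact = slicer.get(2)
nearest = slicer.get(4)
other_slice = slicer.get(12)
```

Only one slice is held in memory at a time, so lookups that move forward through the chain trigger one load per slice rather than one load per block.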

get_last_block()

Get the maximum block number in the database.

Return type

int

load_timestamp_cache(chain_id, cache_folder=PosixPath('/home/runner/.tradingstrategy/block-timestamp'))

Load the block->timestamp cache for a given chain ID.

Parameters
  • chain_id (int) –

  • cache_folder (pathlib.Path) –

Return type

eth_defi.event_reader.timestamp_cache.BlockTimestampDatabase