hyperliquid.vault_data_export
Documentation for eth_defi.hyperliquid.vault_data_export Python module.
Export Hyperliquid vault data into the ERC-4626 pipeline format.
This module bridges the Hyperliquid-specific DuckDB data into the formats consumed by the existing ERC-4626 vault metrics pipeline:
- Synthetic VaultRow entries for the VaultDatabase pickle
- Raw price DataFrames matching the uncleaned Parquet schema, so that Hypercore data goes through the same cleaning pipeline as EVM vaults
- Merge functions to append Hyperliquid data into existing files
Example:
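from pathlib import Path
from eth_defi.hyperliquid.daily_metrics import HyperliquidDailyMetricsDatabase
from eth_defi.hyperliquid.vault_data_export import merge_into_vault_database, merge_into_uncleaned_parquet
# Placeholder paths for illustration; substitute your own pipeline files.
vault_db_path = Path("vault-db.pickle")
uncleaned_parquet_path = Path("vault-prices-1h.parquet")
db = HyperliquidDailyMetricsDatabase(Path("daily-metrics.duckdb"))
try:
    merge_into_vault_database(db, vault_db_path)  # upsert Hyperliquid VaultRow entries
    merge_into_uncleaned_parquet(db, uncleaned_parquet_path)  # replace prior Hypercore rows
finally:
    db.close()  # release the DuckDB handle even if a merge fails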
Functions
build_raw_prices_dataframe() | Build a raw prices DataFrame from the Hyperliquid DuckDB.
build_raw_prices_dataframe_hf() | Build a raw prices DataFrame from the HF DuckDB.
create_hyperliquid_vault_row() | Create a synthetic VaultRow for a Hyperliquid native vault.
merge_hypercore_prices_to_parquet() | Merge Hypercore price data from one or both DuckDB databases into the Parquet.
merge_into_uncleaned_parquet() | Merge Hyperliquid daily prices into the uncleaned Parquet file.
merge_into_vault_database() | Merge Hyperliquid vault metadata into an existing VaultDatabase pickle.
open_and_merge_hypercore_prices() | Open whichever Hyperliquid databases exist and merge into the parquet.
- build_raw_prices_dataframe(db)
Build a raw prices DataFrame from the Hyperliquid DuckDB.
Produces rows matching the schema of the EVM vault scanner (export()), so Hypercore data can go through the same cleaning pipeline (process_raw_vault_scan_data()) as ERC-4626 vaults.
The output has timestamp as a column (not index), matching the raw uncleaned Parquet format.
Includes per-row deposit_closed_reason (str or None) and deposits_open (str "true"/"false" or None) columns derived from the forward-filled is_closed, allow_deposits, and leader_fraction state columns in the DuckDB.
Also exposes Hyperliquid’s raw cumulative account PnL as account_pnl so downstream consumers can compare the website-style account PnL against the cleaned share-price based return series. follower_count and cumulative_volume are exported as scalar historical fields when available.
- Parameters
db (eth_defi.hyperliquid.daily_metrics.HyperliquidDailyMetricsDatabase) – The Hyperliquid daily metrics database.
- Returns
DataFrame with columns matching the uncleaned Parquet schema.
- Return type
- build_raw_prices_dataframe_hf(db)
Build a raw prices DataFrame from the HF DuckDB.
Exports raw API timestamps without resampling, so the spacing is irregular and reflects the vaultDetails API’s per-period resolution (~20 min for the last 24h, coarsening to ~3h / ~10.5h / ~weekly for older data; ~20 min is the finest the API ever serves).
The downstream cleaning pipeline computes returns_1h via pct_change() on consecutive rows. This already works for irregular timestamps (the daily pipeline has always produced ~24h returns labelled returns_1h for Hypercore). The downstream forward_fill_vault() resamples to 1h when needed.
- Parameters
db (eth_defi.hyperliquid.high_freq_metrics.HyperliquidHighFreqMetricsDatabase) – The HF metrics database.
- Returns
DataFrame matching the uncleaned Parquet schema with raw timestamps.
- Return type
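The behaviour described above for irregular timestamps can be illustrated with plain pandas. This is a minimal sketch with made-up prices, not the library's cleaning pipeline: the share_price column name and the sample values are assumptions, and the resampling step only approximates what forward_fill_vault() is described as doing.

```python
import pandas as pd

# Hypothetical share prices at irregular raw timestamps (values are made up).
raw = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            [
                "2025-01-01 00:00",
                "2025-01-01 03:00",
                "2025-01-01 03:20",
                "2025-01-01 03:40",
            ]
        ),
        "share_price": [1.00, 1.10, 1.21, 1.21],
    }
)

# Return between consecutive rows, regardless of how far apart they are.
# The column keeps the returns_1h label even though the spacing is irregular.
raw["returns_1h"] = raw["share_price"].pct_change()

# Resample onto a regular 1h grid, carrying the last known price forward.
hourly = raw.set_index("timestamp")["share_price"].resample("1h").last().ffill()
```

The first return is NaN because there is no previous row, and the hourly series fills the 01:00 and 02:00 slots with the 00:00 price.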
- create_hyperliquid_vault_row(vault_address, name, description, tvl, create_time, follower_count=None, is_closed=False, allow_deposits=True, relationship_type='normal', leader_fraction=None, manual_review_status=None)
Create a synthetic VaultRow for a Hyperliquid native vault.
Builds a VaultRow that matches what calculate_vault_record() expects, using the Hypercore synthetic chain ID.
User-created vaults (relationship_type="normal") use the fixed platform performance fee HYPERLIQUID_VAULT_PERFORMANCE_FEE. Protocol vaults (HLP and its children, with relationship_type="parent" or "child") have zero fees.
- Parameters
vault_address (eth_typing.evm.HexAddress) – Vault hex address (will be lowercased).
name (str) – Vault display name.
tvl (float) – Current TVL in USD.
create_time (Optional[datetime.datetime]) – Vault creation timestamp.
follower_count (Optional[int]) – Number of vault depositors.
is_closed (bool) – Whether the vault is closed for new deposits.
allow_deposits (bool) – Whether the vault allows deposits. A vault can have is_closed=False but allow_deposits=False.
relationship_type (str) – Vault relationship type from the API: "normal" for user-created vaults, "parent" for HLP, "child" for HLP sub-vaults.
leader_fraction (Optional[float]) – Leader’s fraction of total vault capital (e.g. 0.10 = 10%). Used for _get_deposit_closed_reason() to warn when close to the Hyperliquid 5% minimum.
manual_review_status (Optional[eth_defi.hyperliquid.vault_review_sync.ReviewStatus]) – Manual review decision for this vault captured from the Hyperliquid review Google Sheet. Stored on the row so downstream exports (calculate_vault_record → JSON) can surface the decision without re-reading the sheet on every invocation.
- Returns
Tuple of (VaultSpec, VaultRow).
- Return type
tuple[eth_defi.vault.base.VaultSpec, eth_defi.vault.vaultdb.VaultRow]
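The fee rule and address handling described for this function can be sketched as follows. This is illustrative only: EXAMPLE_PERFORMANCE_FEE is a placeholder rather than the actual value of HYPERLIQUID_VAULT_PERFORMANCE_FEE, and example_fee_for() and normalise_address() are hypothetical helpers that do not exist in the module.

```python
# Placeholder fee for illustration only. The library's real value lives in
# HYPERLIQUID_VAULT_PERFORMANCE_FEE and is not reproduced here.
EXAMPLE_PERFORMANCE_FEE = 0.10


def example_fee_for(relationship_type: str) -> float:
    """Return the performance fee implied by the vault relationship type."""
    if relationship_type == "normal":
        # User-created vaults pay the fixed platform performance fee.
        return EXAMPLE_PERFORMANCE_FEE
    if relationship_type in ("parent", "child"):
        # Protocol vaults (HLP and its sub-vaults) have zero fees.
        return 0.0
    raise ValueError(f"Unknown relationship type: {relationship_type!r}")


def normalise_address(vault_address: str) -> str:
    """Lowercase the hex address, as the vault_address parameter describes."""
    return vault_address.lower()
```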
- merge_hypercore_prices_to_parquet(parquet_path, daily_db=None, hf_db=None)
Merge Hypercore price data from one or both DuckDB databases into the Parquet.
Reads data from whichever databases are provided, combines them (deduplicating on (address, timestamp)), removes old chain-9999 rows from the Parquet, and writes the combined result.
This is safe for mode switches: if only the HF database is provided but the daily database also exists, pass both to preserve all historical data. When both databases contain a row for the same vault at the same timestamp, the HF row wins (more recent data).
Daily rows have midnight timestamps (from pd.to_datetime(date)), while HF rows have raw API timestamps, so they rarely collide.
- Parameters
parquet_path (pathlib.Path) – Path to the uncleaned Parquet file.
daily_db (Optional[eth_defi.hyperliquid.daily_metrics.HyperliquidDailyMetricsDatabase]) – Daily metrics database (optional).
hf_db (Optional[eth_defi.hyperliquid.high_freq_metrics.HyperliquidHighFreqMetricsDatabase]) – High-frequency metrics database (optional).
- Returns
The combined DataFrame (EVM + Hypercore rows).
- Return type
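The deduplication rule described above, where the HF row wins on an (address, timestamp) collision, can be expressed with pandas drop_duplicates(). This is a sketch under assumptions: the frames hold made-up values, the share_price column name is assumed, and the real function may combine the data differently.

```python
import pandas as pd

# Hypothetical daily rows with midnight timestamps.
daily = pd.DataFrame(
    {
        "chain": [9999, 9999],
        "address": ["0xaaa", "0xaaa"],
        "timestamp": pd.to_datetime(["2025-01-01 00:00", "2025-01-02 00:00"]),
        "share_price": [1.00, 1.05],
    }
)

# Hypothetical HF rows; the first one collides with a daily row.
hf = pd.DataFrame(
    {
        "chain": [9999, 9999],
        "address": ["0xaaa", "0xaaa"],
        "timestamp": pd.to_datetime(["2025-01-02 00:00", "2025-01-02 00:20"]),
        "share_price": [1.06, 1.07],
    }
)

# Concatenate with HF rows last so keep="last" lets them win on a collision.
combined = (
    pd.concat([daily, hf], ignore_index=True)
    .drop_duplicates(subset=["address", "timestamp"], keep="last")
    .sort_values(["address", "timestamp"])
    .reset_index(drop=True)
)
```

Ordering the concatenation so that the preferred source comes last keeps the tie-break explicit and avoids a separate priority column.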
- merge_into_uncleaned_parquet(db, parquet_path)
Merge Hyperliquid daily prices into the uncleaned Parquet file.
Writes Hypercore raw data in the same format as the EVM vault scanner, so the standard cleaning pipeline (process_raw_vault_scan_data()) can process all vaults together.
Reads the existing Parquet, removes any prior Hypercore rows (chain == 9999), appends fresh Hyperliquid daily price rows, and writes back. Idempotent: running twice produces the same result.
If the Parquet file does not exist, creates a new one.
- Parameters
db (eth_defi.hyperliquid.daily_metrics.HyperliquidDailyMetricsDatabase) – The Hyperliquid daily metrics database.
parquet_path (pathlib.Path) – Path to the uncleaned Parquet file (typically vault-prices-1h.parquet).
- Returns
The combined DataFrame.
- Return type
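The idempotency claim follows from the remove-then-append pattern: prior Hypercore rows are dropped before the fresh ones are added, so a second run replaces rather than duplicates them. A minimal in-memory sketch, assuming made-up rows and a hypothetical replace_hypercore_rows() helper, with Parquet reading and writing left out:

```python
import pandas as pd

HYPERCORE_CHAIN_ID = 9999  # chain value used for Hypercore rows in the text above


def replace_hypercore_rows(existing: pd.DataFrame, fresh: pd.DataFrame) -> pd.DataFrame:
    """Drop prior Hypercore rows from ``existing`` and append ``fresh``."""
    evm_only = existing[existing["chain"] != HYPERCORE_CHAIN_ID]
    return pd.concat([evm_only, fresh], ignore_index=True)


# One EVM row plus one stale Hypercore row from an earlier run.
existing = pd.DataFrame(
    {"chain": [1, 9999], "address": ["0xevm", "0xold"], "share_price": [2.0, 1.0]}
)
# Fresh Hypercore rows from the current run.
fresh = pd.DataFrame(
    {"chain": [9999, 9999], "address": ["0xaaa", "0xbbb"], "share_price": [1.1, 1.2]}
)

once = replace_hypercore_rows(existing, fresh)
twice = replace_hypercore_rows(once, fresh)  # second run yields the same frame
```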
- merge_into_vault_database(db, vault_db_path, review_statuses=None)
Merge Hyperliquid vault metadata into an existing VaultDatabase pickle.
Reads the existing pickle, upserts Hyperliquid VaultRow entries (keyed by VaultSpec), and writes back. Idempotent: running twice produces the same result.
If the pickle file does not exist, creates a new VaultDatabase.
The review_statuses argument is how the Hyperliquid review Google Sheet feeds human-entered OK/Avoid decisions into the pickle so downstream consumers (calculate_vault_record → JSON export) can surface them without re-reading the sheet on every invocation.
Behaviour:
- review_statuses is None (sheet unreachable, credentials missing, or the caller explicitly opted out): the existing _manual_review_status value is carried forward from the previous pickle entry for each vault. This is the “persist if Google Sheets is down” contract: the last known manual review survives an outage.
- review_statuses is a mapping: the mapped value (including an explicit None for “no review”) is written for every address present in the mapping. Addresses absent from the mapping fall back to the carry-forward path above.
- Parameters
db (eth_defi.hyperliquid.daily_metrics.HyperliquidDailyMetricsDatabase) – The Hyperliquid daily metrics database.
vault_db_path (pathlib.Path) – Path to the VaultDatabase pickle file.
review_statuses (Optional[collections.abc.Mapping[eth_typing.evm.HexAddress, Optional[eth_defi.hyperliquid.vault_review_sync.ReviewStatus]]]) – Optional mapping from lowercased vault address to the latest manual review decision read from the Google Sheet.
- Returns
The updated VaultDatabase.
- Return type
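The carry-forward rules for review_statuses can be summarised in a small pure-Python sketch. The resolve_review_status() helper is hypothetical, and plain strings stand in for ReviewStatus values. A sentinel distinguishes "address absent from the mapping" from "address mapped to an explicit None".

```python
from typing import Mapping, Optional

_MISSING = object()  # sentinel: the address is not present in the mapping


def resolve_review_status(
    address: str,
    previous_status: Optional[str],
    review_statuses: Optional[Mapping[str, Optional[str]]],
) -> Optional[str]:
    """Pick the review status to store for one vault."""
    if review_statuses is None:
        # Sheet unreachable or caller opted out: keep the last known value.
        return previous_status
    status = review_statuses.get(address, _MISSING)
    if status is _MISSING:
        # Address absent from the mapping: same carry-forward path.
        return previous_status
    # Address present: write the mapped value, even an explicit None.
    return status
```

For example, resolve_review_status("0xaaa", "OK", {"0xaaa": None}) returns None, while resolve_review_status("0xaaa", "OK", None) returns "OK".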
- open_and_merge_hypercore_prices(parquet_path, daily_db_path=None, hf_db_path=None)
Open whichever Hyperliquid databases exist and merge into the parquet.
Convenience wrapper around merge_hypercore_prices_to_parquet() that handles opening and closing both databases. Used by standalone scripts and post-processing to avoid duplicating the open/close pattern.
- Parameters
parquet_path (pathlib.Path) – Path to the uncleaned Parquet file.
daily_db_path (Optional[pathlib.Path]) – Path to the daily DuckDB (None uses the default; skipped if not on disk).
hf_db_path (Optional[pathlib.Path]) – Path to the HF DuckDB (None uses the default; skipped if not on disk).
- Returns
The combined DataFrame (EVM + Hypercore rows).
- Return type
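The "open whichever databases exist, then always close them" pattern can be sketched with contextlib.ExitStack. FakeDatabase and open_if_exists() are hypothetical stand-ins for illustration, not the library's classes, and the HF file name is an assumption.

```python
import tempfile
from contextlib import ExitStack
from pathlib import Path
from typing import Optional


class FakeDatabase:
    """Hypothetical stand-in for a DuckDB wrapper with a close() method."""

    def __init__(self, path: Path):
        self.path = path
        self.closed = False

    def close(self) -> None:
        self.closed = True


def open_if_exists(stack: ExitStack, path: Path) -> Optional[FakeDatabase]:
    """Open the database only if its file is on disk; register close()."""
    if not path.exists():
        return None
    db = FakeDatabase(path)
    stack.callback(db.close)  # runs when the ExitStack exits, even on error
    return db


with tempfile.TemporaryDirectory() as tmp:
    daily_path = Path(tmp) / "daily-metrics.duckdb"
    hf_path = Path(tmp) / "hf-metrics.duckdb"  # assumed name, never created here
    daily_path.touch()

    with ExitStack() as stack:
        daily_db = open_if_exists(stack, daily_path)
        hf_db = open_if_exists(stack, hf_path)
        # A real caller would pass both handles to the merge function here.
```

After the block exits, daily_db has been closed and hf_db is None because its file was never created.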