verify_parquet_file
Documentation for eth_defi.vault.base.verify_parquet_file function.
- verify_parquet_file(path, expected_rows=None, expected_schema=None, required_columns=None)
Read back a parquet file after writing and verify its integrity.
Performs a metadata read-back (not a full table load) to check:
The file can be opened and its metadata read without errors
Row count matches expected_rows if provided
All columns in expected_schema are present with correct types (extra columns are permitted, e.g. native protocol columns)
All required_columns are present
Uses pq.read_metadata() and pq.read_schema() instead of pq.read_table() to avoid loading the full dataset into memory.
This function should be called on a temp file before the atomic replace so that the previous good file is preserved when verification fails.
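The write, verify, then replace order recommended here can be sketched with the standard library alone. The helper write_verify_replace and its write_fn and verify_fn arguments are hypothetical names used for illustration; the only assumption about the verifier is that it raises on failure, as verify_parquet_file is documented to do.

```python
import os
import tempfile
from pathlib import Path


def write_verify_replace(target, write_fn, verify_fn):
    """Write to a temp file, verify it, and only then swap it into place.

    write_fn(path) and verify_fn(path) are caller-supplied (hypothetical) callables.
    verify_fn must raise if the freshly written file is not acceptable.
    """
    target = Path(target)
    # Keep the temp file in the target's directory so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    os.close(fd)
    tmp_path = Path(tmp_name)
    try:
        write_fn(tmp_path)
        verify_fn(tmp_path)           # raises before the old file is touched
        os.replace(tmp_path, target)  # atomic swap on the same filesystem
    finally:
        # No-op after a successful replace; removes the bad temp file otherwise.
        tmp_path.unlink(missing_ok=True)


def _reject(path):
    # Stand-in for a failing verification step.
    raise ValueError("verification failed")


with tempfile.TemporaryDirectory() as tmp_dir:
    target_path = Path(tmp_dir) / "data.bin"
    target_path.write_text("good")

    # A failed verification leaves the previous good file in place.
    try:
        write_verify_replace(target_path, lambda p: p.write_text("bad"), _reject)
    except ValueError:
        pass
    content_after_failure = target_path.read_text()

    # A passing verification swaps in the new content.
    write_verify_replace(target_path, lambda p: p.write_text("new"), lambda p: None)
    content_after_success = target_path.read_text()
    leftover_files = sorted(f.name for f in Path(tmp_dir).iterdir())
```

In real use, verify_fn would presumably wrap a call such as verify_parquet_file(tmp_path, expected_rows=...), and write_fn would write the parquet data.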
- Parameters
path (Union[pathlib.Path, str]) – Path to the parquet file to verify.
expected_rows (Optional[int]) – If set, assert the file contains exactly this many rows.
expected_schema (pyarrow.Schema | None) – If set, verify that all columns in this schema are present with the correct types. Extra columns are permitted.
required_columns (Optional[list[str]]) – If set, verify these column names are present.
- Returns
Verification result with metadata about the file.
- Raises
ParquetVerificationError – If any verification check fails or the file cannot be read.
- Return type