Filesystem Warehouse

Bases: IcebergWarehouse, FrozenModel

Iceberg Filesystem Warehouse.

Added in 0.15.0

Note

This warehouse uses FileDFConnection classes to access data at the warehouse location. It relies on Spark's filesystem configuration and behavior.

Parameters:

  • connection (SparkFileDFConnection) –

    File connection for data storage

  • path (str) –

    Warehouse path

Examples:

# Local filesystem warehouse (`spark` is an existing SparkSession)
from onetl.connection import Iceberg, SparkLocalFS

local_fs_connection = SparkLocalFS(spark=spark)

warehouse = Iceberg.FilesystemWarehouse(
    connection=local_fs_connection,
    path="/warehouse/path",
)

# HDFS warehouse
from onetl.connection import Iceberg, SparkHDFS

hdfs_connection = SparkHDFS(
    host="namenode",
    cluster="my-cluster",
    spark=spark,
)

warehouse = Iceberg.FilesystemWarehouse(
    connection=hdfs_connection,
    path="/warehouse/path",
)

# S3 warehouse
from onetl.connection import Iceberg, SparkS3

s3_connection = SparkS3(
    host="s3.domain.com",
    protocol="http",
    bucket="my-bucket",
    access_key="access_key",
    secret_key="secret_key",
    path_style_access=True,
    region="us-east-1",
    spark=spark,
)

warehouse = Iceberg.FilesystemWarehouse(
    connection=s3_connection,
    path="/warehouse/path",
)

get_config()

Return a flat dict with the warehouse configuration.
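
A minimal sketch of what "flat" means in practice: every key maps directly to a string value, with no nested dicts, so the result can be prefixed and passed along as Spark catalog options. The helper `to_spark_catalog_options`, the catalog name `my_catalog`, and the sample keys and values are hypothetical and used only for illustration. The keys `get_config()` actually returns are not listed here, and how onETL consumes them internally is not shown. The `spark.sql.catalog.<name>.` prefix follows the usual Iceberg convention for Spark catalog properties.

```python
from typing import Dict


def to_spark_catalog_options(
    catalog_name: str,
    warehouse_config: Dict[str, str],
) -> Dict[str, str]:
    """Prefix each key of a flat config dict with the Spark catalog namespace."""
    prefix = f"spark.sql.catalog.{catalog_name}."
    return {prefix + key: value for key, value in warehouse_config.items()}


# Hypothetical stand-in for the dict returned by warehouse.get_config();
# the real keys and values may differ.
sample_config = {
    "type": "hadoop",
    "warehouse": "file:///warehouse/path",
}

options = to_spark_catalog_options("my_catalog", sample_config)
for key, value in sorted(options.items()):
    print(f"{key}={value}")
```

Because the dict is flat, no recursive flattening step is needed before the keys are prefixed.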