Spark LocalFS

Bases: SparkFileDFConnection

Spark connection to local filesystem. Supports hooks.

Based on Spark Generic File Data Source.

Warning

To use the SparkLocalFS connector, you should have PySpark installed (or injected into sys.path) BEFORE creating the connector instance.

See the Spark installation instructions for more details.
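
The PySpark requirement can be checked before the connector is constructed. This is a minimal sketch using only the standard library; the helper name `pyspark_available` is hypothetical and not part of onETL.

```python
import importlib.util


def pyspark_available() -> bool:
    """Return True if the `pyspark` package can be found on sys.path."""
    # find_spec locates the package without importing it
    return importlib.util.find_spec("pyspark") is not None


if not pyspark_available():
    print("PySpark not found: install it or add it to sys.path first")
```

Running such a check first gives a clear message up front instead of an import error at connector creation time.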

Warning

Currently supports only Spark sessions created with option spark.master: local.
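
A master URL can be pre-checked with a small helper like the one sketched here. The helper `is_local_master` is hypothetical, and the set of accepted forms (`local`, `local[N]`, `local[*]`, `local[N,F]`) is an assumption based on Spark's local master URL syntax; the connector's own validation may be stricter or looser.

```python
import re

# Assumption: these forms count as "local": local, local[N], local[*], local[N,F]
_LOCAL_MASTER = re.compile(r"local(\[[^\]]+\])?")


def is_local_master(master: str) -> bool:
    """Return True if the given spark.master value looks like a local master URL."""
    # fullmatch rejects values such as "local-cluster[...]" or "spark://host:7077"
    return _LOCAL_MASTER.fullmatch(master) is not None
```

With a live session, the value to check can be read with `spark.conf.get("spark.master")`.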

Note

Supports only reading files as Spark DataFrame and writing DataFrame to files.

Does NOT support file operations, like create, delete, rename, etc.

Added in 0.9.0

Parameters:

  • spark (SparkSession) –

    Spark session

Examples:

from onetl.connection import SparkLocalFS
from pyspark.sql import SparkSession

# create Spark session
spark = SparkSession.builder.master("local").appName("spark-app-name").getOrCreate()

# create connection
local_fs = SparkLocalFS(spark=spark).check()