S3 connection
Bases: FileConnection
Based on minio-py client.
Warning
Since onETL v0.7.0, the S3 connector requires installing the package with one of these extras:
pip install "onetl[s3]"
# or
pip install "onetl[files]"
Added in 0.5.1
Parameters:
- host (str) – Host of S3 source. For example: s3.domain.com
- port (int) – Port of S3 source
- bucket (str) – Bucket name in the S3 file source
- access_key (str) – Access key (aka user ID) of an account in the S3 service
- secret_key (str) – Secret key (aka password) of an account in the S3 service
- protocol (str, default: https) – Connection protocol. Allowed values: https or http.
  Changed in 0.6.0: Renamed secure: bool to protocol: Literal["https", "http"]
- region (str) – Region name of the bucket in the S3 service. Optional for some S3 implementations (MinIO, Ozone), but may be mandatory for others.
- session_token (str) – Session token generated by the S3 STS service, if used.
Examples:
Create and check S3 connection:
from onetl.connection import S3
s3 = S3(
host="s3.domain.com",
protocol="http",
bucket="my-bucket",
access_key="ACCESS_KEY",
secret_key="SECRET_KEY",
region="us-east-1",
).check()
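The constructor parameters listed above can also be collected from the environment instead of being hard-coded. The following is a minimal sketch, not part of onETL: the helper name `s3_kwargs_from_env` and the `S3_*` variable names are hypothetical and chosen for illustration only.

```python
import os


def s3_kwargs_from_env(env=None):
    """Collect S3 connection arguments from environment variables.

    The S3_* variable names are hypothetical, chosen for this sketch only.
    """
    env = os.environ if env is None else env

    # The documented default protocol is "https"; only two values are allowed.
    protocol = env.get("S3_PROTOCOL", "https")
    if protocol not in ("https", "http"):
        raise ValueError(f"protocol must be 'https' or 'http', got {protocol!r}")

    kwargs = {
        "host": env["S3_HOST"],
        "bucket": env["S3_BUCKET"],
        "access_key": env["S3_ACCESS_KEY"],
        "secret_key": env["S3_SECRET_KEY"],
        "protocol": protocol,
    }

    # Optional parameters are passed only when they are set.
    if env.get("S3_PORT"):
        kwargs["port"] = int(env["S3_PORT"])
    for name in ("region", "session_token"):
        value = env.get(f"S3_{name.upper()}")
        if value:
            kwargs[name] = value
    return kwargs
```

With onETL installed, the result could then be passed on as `S3(**s3_kwargs_from_env()).check()`.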
path_exists(path)
Check if the specified path exists on the remote filesystem.
Added in 0.8.0
Parameters:
- path (str | PathLike) – Path to check
Returns:
- bool – True if path exists, False otherwise.
Examples:
>>> connection.path_exists("/path/to/file.csv")
True
>>> connection.path_exists("/path/to/dir")
True
>>> connection.path_exists("/path/to/missing")
False
resolve_dir(path)
Returns the directory at the specified path, with stats.
Added in 0.8.0
Parameters:
- path (str | PathLike) – Path to resolve
Returns:
- Directory path with stats
Raises:
- DirectoryNotFoundError – Path does not exist
- NotADirectoryError – Path is not a directory
Examples:
>>> dir_path = connection.resolve_dir("/path/to/dir")
>>> import os
>>> os.fspath(dir_path)
'/path/to/dir'
>>> dir_path.stat().st_uid # owner id
12345
resolve_file(path)
Returns the file at the specified path, with stats.
Added in 0.8.0
Parameters:
- path (str | PathLike) – Path to resolve
Returns:
- File path with stats
Raises:
- FileNotFoundError – Path does not exist
- NotAFileError – Path is not a file
Examples:
>>> file_path = connection.resolve_file("/path/to/dir/file.csv")
>>> import os
>>> os.fspath(file_path)
'/path/to/dir/file.csv'
>>> file_path.stat().st_uid # owner id
12345
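The `path_exists` and `resolve_dir` methods can be combined to tell files, directories and missing paths apart. The sketch below is illustrative only: `classify_path` and the in-memory `FakeConnection` are hypothetical names, not part of onETL, and it assumes that the `NotADirectoryError` listed under Raises is the Python builtin exception.

```python
import posixpath


class FakeConnection:
    """Hypothetical in-memory stand-in for a file connection (not part of onETL)."""

    def __init__(self, files):
        self._files = set(files)
        # Every parent of a known file counts as a directory.
        self._dirs = set()
        for file in self._files:
            parent = posixpath.dirname(file)
            while parent not in self._dirs:
                self._dirs.add(parent)
                parent = posixpath.dirname(parent)

    def path_exists(self, path):
        return path in self._files or path in self._dirs

    def resolve_dir(self, path):
        if path in self._files:
            raise NotADirectoryError(f"'{path}' is not a directory")
        if path not in self._dirs:
            raise FileNotFoundError(f"'{path}' does not exist")
        return path


def classify_path(connection, path):
    """Return "missing", "file" or "directory" for a remote path."""
    if not connection.path_exists(path):
        return "missing"
    try:
        connection.resolve_dir(path)
    except NotADirectoryError:
        # Assumption: the connection raises the builtin NotADirectoryError.
        return "file"
    return "directory"


connection = FakeConnection(["/path/to/dir/file.csv"])
print(classify_path(connection, "/path/to/dir"))
print(classify_path(connection, "/path/to/dir/file.csv"))
print(classify_path(connection, "/path/to/missing"))
```

Because `classify_path` relies only on the two documented methods, a real connection object could be passed in place of the stand-in.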
create_dir(path)
remove_dir(path, *, recursive=False)
Remove a directory or directory tree.
If the directory does not exist, no exception is raised.
Added in 0.8.0
Parameters:
- path (str | PathLike) – Directory path to remove
- recursive (bool, default: False) – If True, remove the directory tree recursively (including files and subdirectories). If False, remove only the directory itself; the directory must be empty.
Returns:
- bool – True if the directory was removed, False if the directory did not exist in the first place.
Raises:
- NotADirectoryError – Path is not a directory
- DirectoryNotEmptyError – Directory is not empty, and recursive is False
Examples:
>>> connection.remove_dir("/path/to/dir")
Traceback (most recent call last):
...
onetl.exception.DirectoryNotEmptyError: Cannot delete non-empty directory '/path/to/dir'
>>> connection.remove_dir("/path/to/dir", recursive=True)
True
>>> connection.path_exists("/path/to/dir")
False
>>> connection.path_exists("/path/to/dir/file.csv")
False
>>> connection.remove_dir("/path/to/dir") # already deleted, no error
False