0.8.0 (2023-05-31)
Breaking Changes
- Rename methods of FileConnection classes:
  - get_directory → resolve_dir
  - get_file → resolve_file
  - listdir → list_dir
  - mkdir → create_dir
  - rmdir → remove_dir
The new names are more consistent.
These methods were undocumented in previous versions, but existing code may still call them, so this is a breaking change. (#36)
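For code that still calls the old names, the rename table above can be expressed as a small compatibility helper. This is only a sketch and not part of onETL: `RENAMED_METHODS`, `call_renamed` and the `FakeConnection` stand-in are hypothetical names introduced for illustration.

```python
import warnings

# Old -> new method names, copied from the list above.
RENAMED_METHODS = {
    "get_directory": "resolve_dir",
    "get_file": "resolve_file",
    "listdir": "list_dir",
    "mkdir": "create_dir",
    "rmdir": "remove_dir",
}


def call_renamed(connection, name, *args, **kwargs):
    """Call ``name`` on ``connection``, translating pre-0.8.0 names (hypothetical helper)."""
    new_name = RENAMED_METHODS.get(name, name)
    if new_name != name:
        warnings.warn(f"{name} was renamed to {new_name}", DeprecationWarning, stacklevel=2)
    return getattr(connection, new_name)(*args, **kwargs)


class FakeConnection:
    """Stand-in object exposing only a new method name (not an onETL class)."""

    def list_dir(self, path):
        return [f"{path}/a.txt"]


# The old name "listdir" is routed to the new "list_dir" method.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    listing = call_renamed(FakeConnection(), "listdir", "/data")
```

A helper like this is best treated as a temporary bridge; updating call sites to the new names directly is simpler in the long run.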
- Deprecate onetl.core.FileFilter class, replace it with new classes:
  - onetl.file.filter.Glob
  - onetl.file.filter.Regexp
  - onetl.file.filter.ExcludeDir
The old class will be removed in v1.0.0. (#43)
- Deprecate onetl.core.FileLimit class, replace it with new class onetl.file.limit.MaxFilesCount.
The old class will be removed in v1.0.0. (#44)
- Change behavior of BaseFileLimit.reset method.
This method should now return self instead of None.
The return value may be the same limit object or a copy; this is an implementation detail, so callers should use the returned object. (#44)
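The new contract of reset can be illustrated with a toy limit class. This is a sketch under stated assumptions: `CountLimit` and its `stops_at` method are hypothetical and are not the onETL implementation; only the "reset returns a limit object instead of None" behavior comes from the note above.

```python
class CountLimit:
    """Toy limit that stops after ``limit`` files (hypothetical, not an onETL class)."""

    def __init__(self, limit):
        self.limit = limit
        self._seen = 0

    def reset(self):
        # New contract: return a limit object instead of None.
        self._seen = 0
        return self

    def stops_at(self, path):
        # Count the file and report whether the limit is now exceeded.
        self._seen += 1
        return self._seen > self.limit


limit = CountLimit(2)
limit.stops_at("a.txt")

# Use the return value rather than the original reference:
# an implementation is free to return a copy.
fresh = limit.reset()
results = [fresh.stops_at(name) for name in ("a.txt", "b.txt", "c.txt")]
```

Here the third file exceeds the limit of two, so `results` ends with `True`.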
- Replaced FileDownloader.filter and .limit with new options .filters and .limits:
# old options (deprecated)
FileDownloader(
    ...,
    filter=FileFilter(glob="*.txt", exclude_dir="/path"),
    limit=FileLimit(count_limit=10),
)

# new options
FileDownloader(
    ...,
    filters=[Glob("*.txt"), ExcludeDir("/path")],
    limits=[MaxFilesCount(10)],
)
This allows developers to implement their own filter and limit classes and combine them with the existing ones.
The old behavior is still supported, but it will be removed in v1.0.0. (#45)
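How a custom filter combines with existing ones can be sketched without onETL itself. The example assumes that each filter exposes a `match(path)` method returning a bool and that a file is kept only if every filter accepts it. The names `GlobFilter`, `MinDepth` and `matches_all` are hypothetical stand-ins, not the library's classes.

```python
from pathlib import PurePosixPath


class GlobFilter:
    """Toy counterpart of a glob filter (hypothetical)."""

    def __init__(self, pattern):
        self.pattern = pattern

    def match(self, path):
        return PurePosixPath(path).match(self.pattern)


class MinDepth:
    """Custom filter: keep only files nested at least ``depth`` directories deep."""

    def __init__(self, depth):
        self.depth = depth

    def match(self, path):
        # parts of an absolute path include the root "/" and the file name,
        # so subtract both to get the number of parent directories.
        return len(PurePosixPath(path).parts) - 2 >= self.depth


def matches_all(path, filters):
    """A path is kept only if every filter in the list accepts it."""
    return all(item.match(path) for item in filters)


filters = [GlobFilter("*.txt"), MinDepth(2)]
paths = ["/a.txt", "/x/y/b.txt", "/x/y/c.csv"]
selected = [path for path in paths if matches_all(path, filters)]
```

Because the filters are plain objects in a list, adding or removing a rule does not require touching the other filters.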
- Removed default value for FileDownloader.limits; users should now pass the limits list explicitly. (#45)
- Move classes from module onetl.core:
from onetl.core import DBReader
from onetl.core import DBWriter
from onetl.core import FileDownloader
from onetl.core import FileUploader
to new modules onetl.db and onetl.file:
from onetl.db import DBReader
from onetl.db import DBWriter
from onetl.file import FileDownloader
from onetl.file import FileUploader
Imports from the old module onetl.core still work, but they are marked as deprecated. The module will be removed in v1.0.0. (#46)
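For a codebase with many such imports, the move can be applied mechanically. The following is a rough sketch of a text-based rewrite, not an official migration tool: `NEW_MODULES` and `rewrite_imports` are hypothetical names, and only single-name `from onetl.core import X` lines are handled.

```python
import re

# Class name -> new module, taken from the import lists above.
NEW_MODULES = {
    "DBReader": "onetl.db",
    "DBWriter": "onetl.db",
    "FileDownloader": "onetl.file",
    "FileUploader": "onetl.file",
}

# Matches a whole line of the form "from onetl.core import Name".
_IMPORT = re.compile(r"^from onetl\.core import (\w+)$", re.MULTILINE)


def rewrite_imports(source):
    """Rewrite known imports to their new modules; leave all other lines untouched."""

    def replace(match):
        name = match.group(1)
        module = NEW_MODULES.get(name)
        return f"from {module} import {name}" if module else match.group(0)

    return _IMPORT.sub(replace, source)


old = "from onetl.core import DBReader\nfrom onetl.core import FileFilter\n"
new = rewrite_imports(old)
```

Names that are not in the table, such as FileFilter here, are left as they are because they need a manual replacement rather than a new import path.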
Features
- Add rename_dir method.
The method was added to the following connections:
  - FTP
  - FTPS
  - HDFS
  - SFTP
  - WebDAV
It renames or moves a directory, with all its content, to a new path.
S3 does not have directories, so there is no such method in that class. (#40)
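The semantics described above (move a directory with all its content) can be illustrated on the local file system. This is an analogy only, under stated assumptions: the `rename_dir` function and the `DirectoryExistsError` class below are local stand-ins defined for this sketch, and refusing to overwrite an existing target is an assumption rather than documented behavior of the connection method.

```python
import os
import tempfile
from pathlib import Path


class DirectoryExistsError(Exception):
    """Local stand-in exception (hypothetical definition for this sketch)."""


def rename_dir(source, target):
    """Move ``source`` and everything inside it to ``target`` on the local disk."""
    source, target = Path(source), Path(target)
    if target.exists():
        # Assumption: an existing target is an error rather than being overwritten.
        raise DirectoryExistsError(f"{target} already exists")
    os.rename(source, target)
    return target


with tempfile.TemporaryDirectory() as root:
    src = Path(root, "old")
    src.mkdir()
    (src / "data.txt").write_text("abc")

    moved = rename_dir(src, Path(root, "new"))
    moved_files = sorted(item.name for item in moved.iterdir())
    source_gone = not src.exists()
```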
- Add onetl.file.FileMover class.
It moves files between directories of a remote file system.
Its signature is almost the same as that of FileDownloader, but without HWM support. (#42)
Improvements
- Document all public methods in FileConnection classes:
  - download_file
  - resolve_dir
  - resolve_file
  - get_stat
  - is_dir
  - is_file
  - list_dir
  - create_dir
  - path_exists
  - remove_file
  - rename_file
  - remove_dir
  - upload_file
  - walk (#39)
- Update documentation of the check method of all connections: add a usage example and document the result type. (#39)
- Add new exception type FileSizeMismatchError.
Methods connection.download_file and connection.upload_file now raise this exception instead of RuntimeError
if the target file has a different size than the source after download/upload. (#39)
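The size check behind this exception can be sketched with a local copy. The example is illustrative only: `FileSizeMismatchError` is redefined here as a local stand-in, and `copy_with_size_check` is a hypothetical helper, not the code used by download_file or upload_file.

```python
import shutil
import tempfile
from pathlib import Path


class FileSizeMismatchError(Exception):
    """Local stand-in exception (hypothetical definition for this sketch)."""


def copy_with_size_check(source, target):
    """Copy a file, then verify that the copy has the same size as the source."""
    source, target = Path(source), Path(target)
    shutil.copyfile(source, target)

    expected = source.stat().st_size
    actual = target.stat().st_size
    if expected != actual:
        raise FileSizeMismatchError(
            f"{target} has size {actual} bytes, expected {expected} bytes"
        )
    return actual


with tempfile.TemporaryDirectory() as root:
    src = Path(root, "in.bin")
    src.write_bytes(b"12345")
    copied_size = copy_with_size_check(src, Path(root, "out.bin"))
```

A dedicated exception type lets callers handle truncated transfers separately from other runtime failures.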
- Add new exception type DirectoryExistsError, which is raised if the target directory already exists. (#40)
- Improved FileDownloader / FileUploader exception logging.
If DEBUG logging is enabled, print the exception with its stacktrace instead of
printing only the exception message. (#42)
- Updated documentation of FileUploader.
  - The class does not support read strategies; added a note to the documentation.
  - Added examples of using the run method with an explicit list of files, for both absolute and relative paths.
  - Fixed outdated imports and class names in examples. (#42)
- Updated documentation of DownloadResult class: fixed outdated imports and class names. (#42)
- Improved file filters documentation section.
Document interface class onetl.base.BaseFileFilter and function match_all_filters. (#43)
- Improved file limits documentation section.
Document interface class onetl.base.BaseFileLimit and functions limits_stop_at / limits_reached / reset_limits. (#44)
- Added changelog.
The changelog is generated from separate news files using towncrier. (#47)
Misc
- Improved CI workflow for tests.
  - If a developer has not changed the source code of a specific connector or its dependencies, run tests only against the maximum supported versions of Spark, Python, Java and db/file server.
  - If a developer has made changes in a specific connector, in core classes, or in dependencies, run tests against both the minimal and maximum versions.
  - Once a week, run all tests against both the minimal and latest versions to detect breaking changes in dependencies.
  - The minimal tested Spark version is now 2.3.1 instead of 2.4.8. (#32)