
File Filter (legacy)

Bases: BaseFileFilter, FrozenModel

Filter files or directories by their path.

Deprecated since 0.8.0

Use Glob, Regexp or ExcludeDir instead.

Parameters:

  • glob (str | None, default: None ) –

    Glob pattern (e.g. *.csv) that a file path must match. It is applied to files only, not to directories.

    Warning

    Mutually exclusive with regexp

  • regexp (str | Pattern | None, default: None ) –

    Regular expression (e.g. \d+\.csv) that a file path must match. It is applied to files only, not to directories.

    If the input is a string, the regular expression will be compiled using the re.IGNORECASE and re.DOTALL flags.

    Warning

    Mutually exclusive with glob

  • exclude_dirs (list[PathLike | str], default: [] ) –

    List of directories to exclude. A file or directory does not match if any of these directories is part of its path.
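The effect of the flags mentioned for regexp can be checked with the standard library alone. The sketch below is not onETL code: compile_filter_regexp is a hypothetical helper, and passing an already compiled pattern through unchanged is an assumption drawn from the wording of the parameter description.

```python
import re


def compile_filter_regexp(regexp):
    """Hypothetical helper: compile strings with IGNORECASE and DOTALL."""
    if isinstance(regexp, str):
        return re.compile(regexp, re.IGNORECASE | re.DOTALL)
    # Assumption: a precompiled pattern keeps the flags chosen by the caller.
    return regexp


from_string = compile_filter_regexp(r"\d+\.csv")
precompiled = compile_filter_regexp(re.compile(r"\d+\.csv"))

# The string form ignores case, the precompiled pattern does not.
print(from_string.search("/data/2023.CSV") is not None)  # True
print(precompiled.search("/data/2023.CSV") is not None)  # False
```

If case-sensitive matching is needed, passing a pattern compiled with your own flags would be the way to get it under this assumption.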

Examples:

Create exclude_dir filter:

from onetl.core import FileFilter

file_filter = FileFilter(exclude_dirs=["/export/news_parse/exclude_dir"])

Create glob filter:

from onetl.core import FileFilter

file_filter = FileFilter(glob="*.csv")

Create regexp filter:

from onetl.core import FileFilter

file_filter = FileFilter(regexp=r"\d+\.csv")

# or

import re

file_filter = FileFilter(regexp=re.compile(r"\d+\.csv"))

Not allowed:

from onetl.core import FileFilter

FileFilter()  # will raise ValueError, at least one argument should be passed
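
The argument rules described above (glob and regexp are mutually exclusive, and at least one argument is required) can be sketched as a standalone check. check_filter_args is a hypothetical helper, not part of onETL, and raising ValueError for the mutually exclusive case is an assumption; the source only confirms ValueError for the empty call.

```python
def check_filter_args(glob=None, regexp=None, exclude_dirs=None):
    """Hypothetical helper mirroring the documented argument rules."""
    if glob is not None and regexp is not None:
        # Assumption: the exception type for this case is not documented.
        raise ValueError("glob and regexp are mutually exclusive")
    if glob is None and regexp is None and not exclude_dirs:
        raise ValueError("at least one argument should be passed")


# Accepted combinations pass silently.
check_filter_args(glob="*.csv")
check_filter_args(exclude_dirs=["/export/news_parse/exclude_dir"])
check_filter_args(regexp=r"\d+\.csv", exclude_dirs=["/export/news_parse/exclude_dir"])

# Rejected combinations raise ValueError.
for kwargs in ({}, {"glob": "*.csv", "regexp": r"\d+\.csv"}):
    try:
        check_filter_args(**kwargs)
    except ValueError as exc:
        print(f"rejected {kwargs}: {exc}")
```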

match(path)

False means the path does not match the filter, so the file or directory is not selected.
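
To make the matching rules concrete, here is a minimal standard-library sketch of how such a check could behave. It is not the onETL implementation: sketch_match and its is_file argument are hypothetical, and the evaluation order (excluded directories first, then glob or regexp for files only) is an assumption based on the parameter descriptions.

```python
import re
from pathlib import PurePosixPath


def sketch_match(path, is_file, glob=None, regexp=None, exclude_dirs=()):
    """Hypothetical stand-in for match(path); returns True if the path is selected."""
    path = PurePosixPath(path)

    # A path equal to or located inside an excluded directory never matches.
    for excluded in exclude_dirs:
        excluded = PurePosixPath(excluded)
        if excluded == path or excluded in path.parents:
            return False

    # glob and regexp are applied to files only; directories pass through.
    if is_file and glob is not None:
        return path.match(glob)
    if is_file and regexp is not None:
        if isinstance(regexp, str):
            regexp = re.compile(regexp, re.IGNORECASE | re.DOTALL)
        return regexp.search(str(path)) is not None
    return True


excluded = ["/export/news_parse/exclude_dir"]
print(sketch_match("/export/news_parse/file.csv", True, glob="*.csv"))  # True
print(sketch_match("/export/news_parse/file.txt", True, glob="*.csv"))  # False
print(sketch_match("/export/news_parse/exclude_dir/file.csv", True,
                   glob="*.csv", exclude_dirs=excluded))                # False
print(sketch_match("/export/news_parse", False, glob="*.csv"))          # True
```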