Clickhouse connection

Bases: JDBCConnection

Clickhouse JDBC connection. Supports hooks.

Based on Maven package com.clickhouse:clickhouse-jdbc:0.7.2 (official Clickhouse JDBC driver).

See also

Before using this connector, please take into account the Clickhouse prerequisites.

Added in 0.1.0

Parameters:

Examples:

Create and check Clickhouse connection:

from onetl.connection import Clickhouse
from pyspark.sql import SparkSession

# Create Spark session with Clickhouse driver loaded
maven_packages = Clickhouse.get_packages()
spark = (
    SparkSession.builder.appName("spark-app-name")
    .config("spark.jars.packages", ",".join(maven_packages))
    .getOrCreate()
)

# Create connection
clickhouse = Clickhouse(
    host="database.host.or.ip",
    user="user",
    password="*****",
    extra={"continueBatchOnError": "false"},
    spark=spark,
).check()
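When the same Spark session also needs drivers for other sources, the package lists have to be combined into one comma-separated `spark.jars.packages` value. A minimal sketch of that step, assuming a hypothetical helper `merge_packages` (not part of onETL); the hardcoded lists stand in for the result of `Clickhouse.get_packages()`, whose exact contents may differ, and the PostgreSQL coordinate is only an illustrative second driver:

```python
from typing import Iterable, List


def merge_packages(*package_lists: Iterable[str]) -> str:
    """Join several Maven coordinate lists into one comma-separated string.

    Hypothetical helper: duplicates are dropped, first-seen order is kept.
    """
    seen = set()
    merged: List[str] = []
    for packages in package_lists:
        for coordinate in packages:
            if coordinate not in seen:
                seen.add(coordinate)
                merged.append(coordinate)
    return ",".join(merged)


# Placeholder for Clickhouse.get_packages(); the real list may contain more entries.
clickhouse_packages = ["com.clickhouse:clickhouse-jdbc:0.7.2"]
# Illustrative packages required by some other part of the job.
other_packages = [
    "org.postgresql:postgresql:42.7.4",
    "com.clickhouse:clickhouse-jdbc:0.7.2",
]

jars_packages = merge_packages(clickhouse_packages, other_packages)
```

The resulting string can be passed to `.config("spark.jars.packages", jars_packages)` in place of the `",".join(maven_packages)` expression used above.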

get_packages(package_version=None, apache_http_client_version=None) classmethod

Get package names to be downloaded by Spark. Supports hooks.

Allows specifying custom JDBC and Apache HTTP Client versions.

Added in 0.9.0

Parameters:

  • package_version (str, default: None ) –

    ClickHouse JDBC client package version. Defaults to 0.7.2.

    Versions 0.8.0-0.9.2 are not supported, see issue #2625.

    Added in 0.11.0

  • apache_http_client_version (str, default: None ) –

    Apache HTTP Client package version. Defaults to 5.4.2.

    Used only if package_version is in range 0.5.0-0.7.0.

    Added in 0.11.0

Examples:

from onetl.connection import Clickhouse

Clickhouse.get_packages()
Clickhouse.get_packages(package_version="0.7.2")
Clickhouse.get_packages(package_version="0.6.0", apache_http_client_version="5.4.2")
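The version rules in the parameter descriptions above can be checked before a version is passed to `get_packages`. The following is a rough sketch, not onETL's own validation logic: the helper names are hypothetical, both ranges are assumed to be inclusive at each end, and version strings are assumed to be plain dotted integers such as `0.7.2`.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string like "0.7.2" into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def is_unsupported(package_version: str) -> bool:
    """True for driver versions 0.8.0-0.9.2 (assumed inclusive)."""
    return (0, 8, 0) <= parse_version(package_version) <= (0, 9, 2)


def uses_apache_http_client(package_version: str) -> bool:
    """True for driver versions 0.5.0-0.7.0 (assumed inclusive),
    the only range where apache_http_client_version has an effect."""
    return (0, 5, 0) <= parse_version(package_version) <= (0, 7, 0)


for candidate in ("0.6.0", "0.7.2", "0.8.0"):
    print(
        candidate,
        "unsupported" if is_unsupported(candidate) else "ok",
        "apache client used" if uses_apache_http_client(candidate) else "apache client ignored",
    )
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as `"0.10.0" < "0.9.2"`.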