Prerequisites

Version Compatibility

- SQL Server versions:
  - Officially declared: 2016 - 2025
  - Actually tested: 2017, 2025
- Spark versions: 3.2.x - 4.1.x
- Java versions: 8 - 22

See the official documentation and the official compatibility matrix.
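As a rough illustration, the ranges listed above can be checked programmatically before starting a job. This is a minimal sketch: the helper names `spark_version_supported` and `java_version_supported` are hypothetical and not part of any library, and the bounds are copied from the list above.

```python
# Supported ranges copied from the compatibility list above.
# Both helper functions are hypothetical illustrations, not a library API.
SPARK_MIN, SPARK_MAX = (3, 2), (4, 1)
JAVA_MIN, JAVA_MAX = 8, 22


def spark_version_supported(version: str) -> bool:
    """Return True if a Spark version string like '3.5.1' is within 3.2.x - 4.1.x."""
    # Only the major and minor components matter for the range check.
    major, minor = (int(part) for part in version.split(".")[:2])
    return SPARK_MIN <= (major, minor) <= SPARK_MAX


def java_version_supported(major: int) -> bool:
    """Return True if a Java major version (for example 17) is within 8 - 22."""
    return JAVA_MIN <= major <= JAVA_MAX
```

When PySpark is installed, its version string is available as `pyspark.__version__` and could be passed to `spark_version_supported`.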

Installing PySpark

To use the MSSQL connector, PySpark must be installed (or injected into sys.path) BEFORE creating the connector instance.

See the installation instructions for more details.
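One way to fail early when PySpark is missing is to check for it before any connector is created. The sketch below uses only the standard library; `pyspark_available` and `require_pyspark` are hypothetical helper names introduced for illustration.

```python
import importlib.util


def pyspark_available() -> bool:
    """Return True if the 'pyspark' package can be located on sys.path."""
    return importlib.util.find_spec("pyspark") is not None


def require_pyspark() -> None:
    """Raise before any connector is created if PySpark cannot be imported."""
    if not pyspark_available():
        raise ImportError(
            "PySpark is not importable; install it or add it to sys.path first"
        )
```

Calling `require_pyspark()` at the top of a script gives a clear error message instead of a failure later, when the connector instance is being created.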

Connecting to MSSQL

Connection port

Connections usually use port 1433, but the port may differ between MSSQL instances. Ask your MSSQL administrator for the correct value.

For named MSSQL instances (instanceName option), the port number is optional and can be omitted.
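To make the two cases concrete, the sketch below assembles a SQL Server JDBC URL with either an explicit port or an instance name. It assumes the Microsoft JDBC driver's `jdbc:sqlserver://host[:port];property=value` URL format; `build_mssql_jdbc_url` is a hypothetical helper, the host and database names are placeholders, and the connector itself may build its URL differently.

```python
from typing import Optional


def build_mssql_jdbc_url(
    host: str,
    port: Optional[int] = None,
    instance_name: Optional[str] = None,
    database: Optional[str] = None,
) -> str:
    """Build a SQL Server JDBC URL; the port is left out when it is not given."""
    url = f"jdbc:sqlserver://{host}"
    if port is not None:
        url += f":{port}"
    # Connection properties are appended as ;name=value pairs.
    if instance_name is not None:
        url += f";instanceName={instance_name}"
    if database is not None:
        url += f";databaseName={database}"
    return url


# Default instance reached through an explicit port.
print(build_mssql_jdbc_url("mssql.example.com", port=1433, database="mydb"))
# Named instance, with the port omitted.
print(build_mssql_jdbc_url("mssql.example.com", instance_name="myinstance", database="mydb"))
```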

Connection host

You can connect to MSSQL using either the DNS name of the host or its IP address.

If you're using an MSSQL cluster, it is currently possible to connect to only one specific node. Load balancing across multiple nodes and automatic failover to a new master/replica are not supported.
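Because the connection targets a single host and port, it can help to confirm that the chosen node is reachable before starting a Spark job. The following is a minimal sketch using only the standard library; `can_reach` is a hypothetical helper, and a successful TCP connection does not prove that credentials or grants are correct.

```python
import socket


def can_reach(host: str, port: int = 1433, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # Works with both DNS names and IP addresses.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```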

Required grants

Ask your MSSQL cluster administrator to set the following grants for the user that will be used to create the connection:

Read + Write (schema is owned by user):

-- allow creating tables for user
GRANT CREATE TABLE TO username;

-- allow read & write access to specific table
GRANT SELECT, INSERT ON username.mytable TO username;

-- only if if_exists="replace_entire_table" is used:
-- allow dropping/truncating the specific table
GRANT ALTER ON username.mytable TO username;

Read + Write (schema is not owned by user):

-- allow creating tables for user
GRANT CREATE TABLE TO username;

-- allow managing tables in specific schema, and inserting data to tables
GRANT ALTER, SELECT, INSERT ON SCHEMA::someschema TO username;

Read only:

-- allow read access to specific table
GRANT SELECT ON someschema.mytable TO username;

More details can be found in official documentation: