Starburst Salesforce connector#

The Starburst Salesforce connector allows Starburst Enterprise platform (SEP) to query Salesforce data available in your existing Salesforce account.

Requirements#

To connect to Salesforce, you need:

  • Network access from the coordinator and workers to the Salesforce API endpoints.

  • Security token for your Salesforce account, or configured range of IP addresses for logins.

  • A valid Starburst Enterprise license.

Configuration#

Create the mysalesforce catalog with a catalog properties file in etc/catalog named mysalesforce.properties:

connector.name=salesforce
salesforce.user=mysalesforceaccount@example.com
salesforce.password=secret
salesforce.security-token=<token-from-Salesforce>
salesforce.enable-sandbox=true

Specify the connector.name property as salesforce. Configure the connector using your Salesforce account username, password, and the security token for your account. See Security for more information on authenticating with Salesforce.

You can optionally enable the Salesforce Sandbox as shown in the last line of the preceding example. The default value is false.

The connector can only access Salesforce as the configured user in the catalog. If you need to access Salesforce with a different user name, configure a separate catalog.

General configuration properties#

The following table describes general catalog configuration properties for the connector:

Property name

Description

Default value

case-insensitive-name-matching

Support case insensitive schema and table names.

false

case-insensitive-name-matching.cache-ttl

1m

case-insensitive-name-matching.config-file

Path to a name mapping configuration file in JSON format that allows Trino to disambiguate between schemas and tables with similar names in different cases.

null

case-insensitive-name-matching.refresh-period

Frequency with which Trino checks the name matching configuration file for changes.

0 (refresh disabled)

metadata.cache-ttl

Duration for which metadata, including table and column statistics, is cached.

0 (caching disabled)

metadata.cache-missing

Cache the fact that metadata, including table and column statistics, is not available

false

metadata.cache-maximum-size

Maximum number of objects stored in the metadata cache

10000

write.batch-size

Maximum number of statements in a batched execution. Do not change this setting from the default. Non-default values may negatively impact performance.

1000

Warning

There is a known issue causing Salesforce catalogs to occasionally fail to list tables or views within their schemas. As a workaround, you may disable the catalog’s metadata cache by adding the following catalog configuration property:

salesforce.extra-jdbc-properties=Other="cachemetadatatable=false;cachemetadatatablecolumns=false;cachemetadataschema=false;cachametadataschemas=false;";

Type mapping#

Because SEP and Salesforce each support types that the other does not, the connector modifies some types when reading data.

Salesforce to SEP type mapping#

Salesforce type

SEP type

Auto Number

VARCHAR

Lookup Relationship

VARCHAR

Master-Detail Relationship

VARCHAR

External Lookup Relationship

VARCHAR

Checkbox

VARCHAR

Currency

DOUBLE

Date

DATE

Date/Time

TIMESTAMP

Email

VARCHAR

Geolocation

VARCHAR

Number

DOUBLE

Percent

DOUBLE

Phone per catalog

VARCHAR

Picklist

VARCHAR

Picklist (Multi-Select)

VARCHAR

Text

VARCHAR

Text Area

VARCHAR

Time

TIME

URL

VARCHAR

No other type is supported.

SQL support#

The connector supports globally available and read operation statements to access data and metadata in Salesforce.

Performance#

System information#

The Salesforce connector uses the Salesforce API to access data and metadata. Salesforce limits the number of API calls for organizations. If the limits are reached, query processing fails.

The connector provides API limit and current usage numbers in the limits table in the system schema. You can use this information for monitoring your usage to ensure a limit is not exceeded. The following query returns the data from your mysalesforce catalog:

SELECT * FROM mysalesforce.system.limits;

Dynamic filtering#

Dynamic filtering is enabled by default. It causes the connector to wait for dynamic filtering to complete before starting a JDBC query.

You can disable dynamic filtering by setting the dynamic-filtering.enabled property in your catalog configuration file to false.

Wait timeout#

By default, table scans on the connector are delayed up to 20 seconds until dynamic filters are collected from the build side of joins. Using a large timeout can potentially result in more detailed dynamic filters. However, it can also increase latency for some queries.

You can configure the dynamic-filtering.wait-timeout property in your catalog properties file:

dynamic-filtering.wait-timeout=1m

You can use the dynamic_filtering_wait_timeout catalog session property in a specific session:

SET SESSION catalogname.dynamic_filtering_wait_timeout = 1s;

Compaction#

The maximum size of dynamic filter predicate, that is pushed down to the connector during table scan for a column, is configured using the domain-compaction-threshold property in the catalog properties file:

domain-compaction-threshold=100

You can use the domain_compaction_threshold catalog session property:

SET SESSION domain_compaction_threshold = 10;

By default, domain-compaction-threshold is set to 32. When the dynamic predicate for a column exceeds this threshold, it is compacted into a single range predicate.

For example, if the dynamic filter collected for a date column dt on the fact table selects more than 32 days, the filtering condition is simplified from dt IN ('2020-01-10', '2020-01-12',..., '2020-05-30') to dt BETWEEN '2020-01-10' AND '2020-05-30'. Using a large threshold can result in increased table scan overhead due to a large IN list getting pushed down to the data source.

Metrics#

Metrics about dynamic filtering are reported in a JMX table for each catalog:

jmx.current."com.starburstdata.presto.plugin.jdbc.dynamicfiltering:name=catalogname,type=dynamicfilteringstats"

Metrics include information about the total number of dynamic filters, the number of completed dynamic filters, the number of available dynamic filters and the time spent waiting for dynamic filters.

Table scan redirection#

The connector supports table scan redirection to improve performance and reduce load on the data source.

This is particularly recommended for the Salesforce connector as Salesforce has API limits, which can cause queries to fail.

Security#

The connector includes a number of security-related features, detailed in the following sections.

Password authentication#

The connector supports authenticating via a user name. Use the following properties in the catalog properties file to configure it.

salesforce.authentication.type=PASSWORD
salesforce.user=mysalesforceaccount@example.com
salesforce.password=secret
salesforce.security-token=<token-from-Salesforce>
salesforce.enable-sandbox=true

The default value of salesforce.authentication.type is PASSWORD.

The security token is generated by Salesforce and emailed to you. To generate an access token, at the top navigation bar in Salesforce, go to <your name> > My Settings > Personal > Reset My Security Token. Note that whenever the user’s password changes, including regularly scheduled password expiration, the security token also changes and requires the connector to be reconfigured using the new password and token.

You can use a configured range of IP addresses approved for login as an alternative to the security token. In this case, do not specify salesforce.security-token in the catalog file. You must correctly configure the IP ranges for the coordinator and all worker nodes in Salesforce’s Security Controls > Network Access page for your organization.

OAuth JWT authentication#

The connector supports authenticating using OAuth 2.0. This method requires setup in Salesforce, but unlike password authentication it does not expire until access is explicitly revoked by Salesforce. Use the following properties in the catalog properties file to configure this method.

salesforce.authentication.type=OAUTH_JWT
salesforce.oauth.pkcs12-path=/path/to/salesforce-ca.p12
salesforce.oauth.pkcs12-password=pkcs12-password
salesforce.oauth.jwt-issuer=<salesforce connected app consumer key>
salesforce.oauth.jwt-subject=<salesforce user name or email address>

The trust is set up on the Salesforce side using a certificate. At a high level, you create a PKCS12 archive and certificate, then create an OAuth-enabled Connected App in Salesforce that uses that certificate. You then give permission to the user to use that Connected App and configure the Salesforce connector to use the PKCS12 archive and user name. This sets up a trust that never expires until it is explicitly revoked by a user in the Salesforce UI.

An example of setting up this configuration is as follows:

  1. Create a secure password and export it as the PKCS12_PASS environment variable. Keep this password, as it is necessary to configure the PKCS12 arvhive in the catalog configuration.

export PKCS12_PASS=averysecurepassword
  1. Create a PKCS12 archive named salesforce-ca.p12.

keytool -genkeypair -v \
  -alias salesforce-ca \
  -dname "CN=Starburst Salesforce Connector" \
  -storetype PKCS12 \
  -keystore salesforce-ca.p12 \
  -keypass:env PKCS12_PASS \
  -storepass:env PKCS12_PASS \
  -keyalg RSA \
  -keysize 4096 \
  -ext KeyUsage:critical="keyCertSign" \
  -ext BasicConstraints:critical="ca:true" \
  -validity 9999
  1. Export the certificate to salesforce-ca.crt.

keytool -export -v \
  -alias salesforce-ca \
  -file salesforce-ca.crt \
  -keypass:env PKCS12_PASS \
  -storepass:env PKCS12_PASS \
  -keystore salesforce-ca.p12 \
  -rfc
  1. Log in to Salesforce using a username and password that has authorization to create a Connected App. Select the gear icon in the upper right and select Setup, then Apps -> App Manager, then New Connected App in the upper right. Enter a name for the Connected App, such as SEP Salesforce Connector and provide a contact email address. A Callback URL is required but unused by this method; enter a valid URL, such as http://localhost.

  2. Select Enable OAuth Settings and Use digital signatures, uploading salesforce-ca.crt to the site. For Selected OAuth Scopes use Access content resources (content) at a minimum. Copy and keep the resulting Consumer Key, which is needed to configure the connector.

  3. Create a new Permission Set. Navigate to Users -> Permissions Sets and select New. Enter a label, such as SEP Salesforce Connector Permission Set and select Save. Select Assigned Connected Apps then Edit. Add the connected app name created in step 4.

  4. Add the Permission Set to the User. Select Users and locate the username you want to use to authenticate with Salesforce. When found, click to select that username. Select Permission Set Assignments then Edit Assignments. Add the permission set to the username and save.

This concludes the Salesforce part of the setup procedure.

  1. Configure the Salesforce catalog to enable OAuth 2.0 JWT. You need the PKCS12 archive’s password, the Consumer Key from Salesforce, and the user’s email address login.

salesforce.authentication.type=OAUTH_JWT
salesforce.oauth.pkcs12-path=/path/to/salesforce-ca.p12
salesforce.oauth.pkcs12-password=pkcs12-password
salesforce.oauth.jwt-issuer=<Salesforce connected app consumer key>
salesforce.oauth.jwt-subject=<Salesforce user name or email address>