Snowflake Integration

Snowflake Integration Feature Description

With the Snowflake integration feature, you can automatically upload the results of your Web Scraper tasks to your Snowflake account for data backup, sharing, or further processing and analysis.

Integration Configuration:

  1. Integration Name: Choose a name for the integration so that it is easy to manage and identify later. We recommend naming it after its purpose or scraping target, e.g., “Product Reviews Upload to Snowflake”.

  2. Event Type Settings: Choose one of the following two ways to trigger data sending, depending on your needs:

  • Specify Task IDs: Sends the results of specific, known scraping tasks to Snowflake. Ideal for handling the results of multiple task IDs from one scraper tool. Separate multiple task IDs with half-width (ASCII) commas. Up to 10 task IDs are supported.

  • Follow Task: Automatically uploads all subsequent results generated by the scraper tool to Snowflake. Configure it once and it stays in effect until it is manually disabled or deleted. Best suited to continuous or periodic scraping tasks that need automated data archiving.
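The task ID rules above (comma-separated, at most 10 IDs) can be checked before you paste the list into the form. The following is a minimal sketch; the function name `parse_task_ids` is hypothetical and not part of the product.

```python
MAX_TASK_IDS = 10  # limit stated in the "Specify Task IDs" option


def parse_task_ids(raw: str) -> list[str]:
    """Split a comma-separated task ID string and enforce the 10-ID limit."""
    # Trim whitespace around each ID and drop empty entries (e.g. a trailing comma).
    ids = [part.strip() for part in raw.split(",") if part.strip()]
    if not ids:
        raise ValueError("at least one task ID is required")
    if len(ids) > MAX_TASK_IDS:
        raise ValueError(
            f"at most {MAX_TASK_IDS} task IDs are supported, got {len(ids)}"
        )
    return ids
```

Note that a full-width comma (，) is not treated as a separator here, so a list typed with one is not split and would be treated as a single ID.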

  3. Snowflake Parameter Configuration: Fill in the following information to complete the data upload setup:

account_identifier, Account Identifier (Required)

Typically in the format <account_name>.<region_id> or <org_name>-<account_name>, used to uniquely identify a Snowflake instance.
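As a rough illustration of the two identifier shapes, the sketch below classifies an identifier and builds the standard account URL from it. The function names are hypothetical, the check is a simple heuristic rather than an official validation rule, and the URL pattern assumes a standard (non-PrivateLink) Snowflake account.

```python
def describe_account_identifier(identifier: str) -> str:
    """Guess which of the two documented formats an identifier uses."""
    identifier = identifier.strip()
    if not identifier or " " in identifier:
        raise ValueError("account identifier must be non-empty and contain no spaces")
    # Check for "." first: region IDs such as "us-east-1" also contain hyphens.
    if "." in identifier:
        return "account.region"
    if "-" in identifier:
        return "org-account"
    raise ValueError("expected <account_name>.<region_id> or <org_name>-<account_name>")


def account_url(identifier: str) -> str:
    """Build the usual Snowflake URL for a given account identifier."""
    return f"https://{identifier.strip()}.snowflakecomputing.com"
```

For example, `describe_account_identifier("xy12345.us-east-1")` returns `"account.region"`, and `account_url("myorg-myaccount")` returns `"https://myorg-myaccount.snowflakecomputing.com"`.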

database, Database (Required)

The name of the target database, serving as the logical container for data storage and queries.

role, Role (Required)

The name of the user's access role in Snowflake, used to define the scope of permissions.

user, User (Required)

The username used to log in to Snowflake.

pwd, Password (Required)

The user’s password, used for authentication.

schema, Schema (Required)

A structured namespace within the database used to organize tables, views, and other objects.

stage, Stage (Required)

The name of the internal stage, which Snowflake uses to temporarily store files.
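To check that files have arrived in the stage, you can run a LIST statement against it in Snowflake. The sketch below only builds that statement as a string; the helper names are hypothetical, and it assumes the database, schema, and stage names are plain unquoted identifiers. How the integration names or organizes the uploaded files is not described here, so treat this as a starting point.

```python
import re

# Unquoted Snowflake identifiers start with a letter or underscore, followed by
# letters, digits, underscores, or dollar signs.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def stage_reference(database: str, schema: str, stage: str) -> str:
    """Return a fully qualified stage reference such as @MY_DB.PUBLIC.MY_STAGE."""
    for part in (database, schema, stage):
        if not _IDENTIFIER.fullmatch(part):
            raise ValueError(f"not a plain unquoted identifier: {part!r}")
    return f"@{database}.{schema}.{stage}"


def list_stage_sql(database: str, schema: str, stage: str) -> str:
    """Build a LIST statement that shows the files currently in the stage."""
    return f"LIST {stage_reference(database, schema, stage)};"
```

Rejecting anything that is not a plain identifier keeps the generated SQL safe to paste into a worksheet without quoting.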

warehouse, Warehouse (Required)

The name of the virtual warehouse, which provides the compute resources used to execute SQL queries and data loading tasks.

file_type, File Type (Required)

Amazon product results can be sent in either JSON or CSV format. YouTube product results can only be sent in file format. Parameter values: JSON, CSV, Download Link.
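The parameters above can be collected and sanity-checked before you submit the form. This is a minimal sketch, not the product's own validation: the class name `SnowflakeIntegrationConfig` is hypothetical, the field names mirror the parameter keys listed above, and the allowed file types are taken from the "Parameter values" list.

```python
from dataclasses import dataclass, fields

# Values listed for file_type in this document.
ALLOWED_FILE_TYPES = {"JSON", "CSV", "Download Link"}


@dataclass
class SnowflakeIntegrationConfig:
    account_identifier: str
    database: str
    role: str
    user: str
    pwd: str
    schema: str
    stage: str
    warehouse: str
    file_type: str

    def validate(self) -> None:
        """Raise ValueError if a required field is blank or file_type is unknown."""
        missing = [
            f.name for f in fields(self) if not str(getattr(self, f.name)).strip()
        ]
        if missing:
            raise ValueError("missing required fields: " + ", ".join(missing))
        if self.file_type not in ALLOWED_FILE_TYPES:
            raise ValueError(f"unsupported file_type: {self.file_type!r}")


# Example with placeholder values only; replace them with your own settings.
config = SnowflakeIntegrationConfig(
    account_identifier="myorg-myaccount",
    database="SCRAPER_DB",
    role="LOADER_ROLE",
    user="loader_user",
    pwd="example-password",
    schema="PUBLIC",
    stage="SCRAPER_STAGE",
    warehouse="LOAD_WH",
    file_type="JSON",
)
config.validate()
```

All nine fields are treated as required because each is marked (Required) above.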

If you need more assistance, please contact us via email at [email protected].
