Amazon S3 Integration

Feature Description

With the Amazon S3 integration feature, you can automatically upload the results of Web Scraper tasks to a specified S3 bucket, which simplifies data backup, sharing, and further analysis.

Integration Setup:

  1. Integration Name

  Choose a name for the integration task so you can identify and manage it later. We recommend naming it after its purpose or scraping target, such as “Product Reviews Upload to S3.”

  2. Event Type Settings

  Choose one of the following methods to trigger data transmission:

  Specify Task ID
  - Suitable for sending the results of specific, known scraping tasks to S3.
  - Ideal for handling results from multiple task IDs within one scraper.
  - Separate multiple task IDs with commas.
  - Up to 10 task IDs are supported.

  Follow Task
  - Automatically uploads all subsequent results from the scraper to S3.
  - Once configured, it stays in effect until it is manually disabled or deleted.
  - Better suited to continuous or periodic scraping tasks that need automated data archiving.
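The Specify Task ID rules (comma-separated IDs, at most 10) can be checked before you submit the form. The following is a minimal sketch in Python; the function name `parse_task_ids` and the sample IDs are hypothetical and are not part of the product.

```python
MAX_TASK_IDS = 10  # limit stated in the Specify Task ID description


def parse_task_ids(raw: str) -> list[str]:
    """Split a comma-separated task ID string and enforce the 10-ID limit.

    Hypothetical helper for illustration only; the service performs its own
    validation when the integration is saved.
    """
    # Drop surrounding whitespace and ignore empty entries such as "a,,b".
    ids = [part.strip() for part in raw.split(",") if part.strip()]
    if not ids:
        raise ValueError("at least one task ID is required")
    if len(ids) > MAX_TASK_IDS:
        raise ValueError(
            f"at most {MAX_TASK_IDS} task IDs are supported, got {len(ids)}"
        )
    return ids


# Example with made-up task IDs.
task_ids = parse_task_ids("task_001, task_002,task_003")
```

Running the check locally catches an over-long list before the integration is saved.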

  3. Amazon S3 Parameter Configuration

  Configure the following parameters to complete the data upload setup:

bucketName, Target Bucket Name (Required)

The name of the target Amazon S3 bucket.

targetPath, Target Path (Optional)

The destination location in Amazon S3.

fileName, File Name (Optional)

The name of the object in the bucket. Defaults to your task ID.

awsAccessKey, AWS Access Key (Required)

The AWS access key ID used to authorize uploads. It functions like a username. You can obtain it from the AWS Console -> IAM -> Users -> Create User/Select Existing User -> Security Credentials -> Access Keys.

awsSecretKey, AWS Secret Key (Required)

The AWS secret access key used to authorize uploads. It functions like a password. You can obtain it from the AWS Console -> IAM -> Users -> Create User/Select Existing User -> Security Credentials -> Access Keys -> Create Access Key. The secret access key is displayed only once, right after the access key is created, so store it securely.

fileFormat, File Format (Required)

Results for Amazon products can be sent in either JSON or CSV format. Results for YouTube products can only be sent using file format.

Parameter values: JSON, CSV, Download Link
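The parameter list above can be summarized as a small pre-flight check. This is a sketch under assumptions: the dictionary layout, the helper name `validate_s3_config`, and the exact spelling of the fileFormat values are illustrative, and the placeholder credentials are not real.

```python
# Fields marked (Required) in the parameter list.
REQUIRED_FIELDS = ("bucketName", "awsAccessKey", "awsSecretKey", "fileFormat")

# Values listed under "Parameter values"; the exact strings the service
# accepts are an assumption.
ALLOWED_FORMATS = ("JSON", "CSV", "Download Link")


def validate_s3_config(config: dict) -> list[str]:
    """Return a list of problems found in the configuration (empty if none)."""
    problems = []
    for field in REQUIRED_FIELDS:
        # Treat missing, None, and whitespace-only values as absent.
        if not str(config.get(field) or "").strip():
            problems.append(f"missing required field: {field}")
    fmt = config.get("fileFormat")
    if fmt and fmt not in ALLOWED_FORMATS:
        problems.append(f"unsupported fileFormat: {fmt!r}")
    return problems


# Example configuration with placeholder values.
example_config = {
    "bucketName": "my-results-bucket",
    "targetPath": "path/to",      # optional
    "fileName": "123",            # optional; defaults to the task ID
    "awsAccessKey": "<your-access-key-id>",
    "awsSecretKey": "<your-secret-access-key>",
    "fileFormat": "JSON",
}
problems = validate_s3_config(example_config)
```

In practice, avoid hard-coding the two AWS keys; reading them from environment variables or a secrets manager keeps them out of source control.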

Viewing Transferred Files:

If your integration task shows a “Success” status, you can view the results in your Amazon S3 account, or access them directly via a link of the form: https://s3.us-east-2.amazonaws.com/downloaddirectory/your-target-path/filename

For example, if your target path is path/to, the file name is 123, and the file format is json, the access link will be: https://s3.us-east-2.amazonaws.com/downloaddirectory/path/to/123.json
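The link pattern above can be assembled programmatically. The sketch below reproduces the documented example; the function name `build_access_url` is hypothetical, and the handling of an empty target path (the path segment is simply omitted) is an assumption that the documentation does not confirm.

```python
# Base of the access link as shown in the example above.
BASE_URL = "https://s3.us-east-2.amazonaws.com/downloaddirectory"


def build_access_url(target_path: str, file_name: str, file_format: str) -> str:
    """Join target path, file name, and format into an access link."""
    # Remove leading/trailing slashes so segments join with exactly one "/".
    path = target_path.strip("/")
    # The example uses a lowercase extension (123.json).
    object_name = f"{file_name}.{file_format.lower()}"
    segments = [BASE_URL]
    if path:  # assumption: an empty target path adds no segment
        segments.append(path)
    segments.append(object_name)
    return "/".join(segments)


url = build_access_url("path/to", "123", "json")
# url == "https://s3.us-east-2.amazonaws.com/downloaddirectory/path/to/123.json"
```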

If you need further assistance, please contact us via email at [email protected].
