Amazon S3 Integration
Feature Description: With the Amazon S3 integration feature, you can automatically upload the results of Web Scraper tasks to a specified S3 bucket, which makes data backup, sharing, and further analysis easier.
Integration Setup:
Integration Name: Choose a name for the integration task so that you can identify and manage it later. We recommend naming it after its purpose or scraping target, such as “Product Reviews Upload to S3.”
Event Type Settings: Choose one of the following methods to trigger data transmission:
Specify Task ID: Sends the results of specific, known scraping tasks to S3. Ideal for handling results from multiple task IDs within one scraper. Separate multiple task IDs with commas. Up to 10 task IDs are supported.
Follow Task: Automatically uploads all subsequent results from the scraper to S3. Once configured, it stays in effect until it is manually disabled or deleted. Best suited to continuous or periodic scraping tasks that need automated data archiving.
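The Specify Task ID rules above (comma-separated, at most 10 IDs) can be checked before you paste the value into the form. The following is a minimal Python sketch, not part of the product: the function name `parse_task_ids` and the sample IDs are hypothetical, and trimming whitespace around each ID is an assumption about what the form tolerates.

```python
MAX_TASK_IDS = 10  # limit stated in the Specify Task ID description


def parse_task_ids(raw: str) -> list[str]:
    """Split a comma-separated task ID string and enforce the 10-ID limit."""
    # Trim whitespace and drop empty entries (for example, a trailing comma).
    # Assumption: the integration form ignores whitespace around IDs.
    ids = [part.strip() for part in raw.split(",") if part.strip()]
    if not ids:
        raise ValueError("at least one task ID is required")
    if len(ids) > MAX_TASK_IDS:
        raise ValueError(
            f"got {len(ids)} task IDs; at most {MAX_TASK_IDS} are supported"
        )
    return ids


# Hypothetical task IDs, for illustration only.
print(",".join(parse_task_ids("1001, 1002,1003")))  # prints "1001,1002,1003"
```

If you have more than 10 task IDs for one scraper, Follow Task may be the simpler choice, since it uploads all subsequent results without listing IDs.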
Amazon S3 Parameter Configuration: Configure the following parameters to complete the data upload setup:
awsAccessKey, AWS Access Key (Required)
The AWS access key ID used to authorize uploads. You can obtain it from the AWS Console -> IAM -> Users -> Create User/Select Existing User -> Security Credentials -> Access Keys. It functions like a username.
awsSecretKey, AWS Secret Key (Required)
The AWS secret access key used to authorize uploads. You can obtain it from the AWS Console -> IAM -> Users -> Create User/Select Existing User -> Security Credentials -> Access Keys -> Create Access Key. The secret access key is displayed only once, right after the access key is created, so save it at that point. It functions like a password.
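Because both parameters are required and the secret key cannot be viewed again later, it can help to keep the pair in environment variables and confirm that both are present before filling in the form. The sketch below is illustrative only: `load_s3_params` is a hypothetical helper, `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are the variable names conventionally used by AWS tooling, and nothing here calls the integration itself.

```python
import os


def load_s3_params(env=os.environ) -> dict[str, str]:
    """Collect the two required parameters and fail if either is missing."""
    params = {
        "awsAccessKey": env.get("AWS_ACCESS_KEY_ID", ""),
        "awsSecretKey": env.get("AWS_SECRET_ACCESS_KEY", ""),
    }
    # Both fields are marked (Required) in the parameter list above.
    missing = [name for name, value in params.items() if not value]
    if missing:
        raise ValueError(f"missing required parameter(s): {', '.join(missing)}")
    return params


# Placeholder values for illustration; never hard-code real credentials.
example_env = {
    "AWS_ACCESS_KEY_ID": "EXAMPLE_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY": "EXAMPLE_SECRET_ACCESS_KEY",
}
print(sorted(load_s3_params(example_env)))  # prints ['awsAccessKey', 'awsSecretKey']
```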
Viewing Transferred Files: If your integration task shows a “Success” status, you can view the results in your Amazon S3 account, or access them directly via a link of the form: https://s3.us-east-2.amazonaws.com/downloaddirectory/your-target-path/filename
For example, if your target path is path/to, the file name is 123, and the file format is json, then the access link will be:
https://s3.us-east-2.amazonaws.com/downloaddirectory/path/to/123.json
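The link pattern above can be assembled programmatically. This is a minimal Python sketch that reuses the base URL from the example; the function name `build_access_link` is hypothetical, and percent-encoding the path with `urllib.parse.quote` is an assumption for paths that contain spaces or other special characters. Substitute the region and bucket shown in your own S3 account if they differ.

```python
from urllib.parse import quote

# Base URL taken from the example link above.
BASE_URL = "https://s3.us-east-2.amazonaws.com/downloaddirectory"


def build_access_link(target_path: str, file_name: str, file_format: str) -> str:
    """Join target path, file name, and file format into an access link."""
    path = target_path.strip("/")  # tolerate leading or trailing slashes
    file_part = f"{file_name}.{file_format}"
    key = f"{path}/{file_part}" if path else file_part
    # quote() keeps "/" intact by default and escapes other unsafe characters.
    return f"{BASE_URL}/{quote(key)}"


print(build_access_link("path/to", "123", "json"))
# prints https://s3.us-east-2.amazonaws.com/downloaddirectory/path/to/123.json
```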
If you need further assistance, please contact us via email at [email protected].