
# Amazon S3 Integration

The Amazon S3 integration automatically uploads the results of Web Scraper tasks to an S3 bucket that you specify, which simplifies data backup, sharing, and downstream processing and analysis.

**Integration Configuration:**

1. Integration Function Name\
   Give the integration a custom name so that it is easy to identify and manage later. A name based on its purpose or scraping target works well, such as "Upload Product Review Results to S3".
2. Event Type Setting\
   Choose one of the following two event types to control which results are sent to S3:

  <mark style="background-color:blue;">Specify Task ID</mark>\
  Sends the results of specific, known scraping tasks to S3.\
  Useful when you need the results of several task IDs within one scraper.\
  Separate multiple task IDs with commas.\
  Up to 10 task IDs are supported.

  <mark style="background-color:blue;">Follow Task</mark>\
  Automatically uploads all subsequent results from the scraper to S3.\
  Once configured, it stays in effect until it is manually disabled or deleted.\
  Best suited to continuous or periodic scraping tasks that need automated data archiving.

3. Amazon S3 Parameter Configuration\
   Configure the following information to complete the data upload setup:<br>

<details>

<summary><code>bucketName</code>, <strong>Target Bucket Name (Required)</strong></summary>

The name of the target Amazon S3 bucket.

</details>

<details>

<summary><code>targetPath</code>, <strong>Target Path (Optional)</strong></summary>

The destination path (folder prefix) inside the bucket where the file is stored, for example `path/to`.

</details>

<details>

<summary><code>fileName</code>, <strong>File Name (Optional)</strong></summary>

The name of the object in the bucket. Defaults to your task ID.

</details>

{% tabs %}
{% tab title="Access Key Credentials" %}
`awsAccessKey`, **AWS Access Key (Required)**\
The AWS access key ID used to authorize uploads. It functions like a username. You can obtain it in the AWS Console under IAM -> Users -> Create User/Select Existing User -> Security Credentials -> Access Keys.

`awsSecretKey`, **AWS Secret Key (Required)**\
The AWS secret access key used to authorize uploads. It functions like a password. You can obtain it in the AWS Console under IAM -> Users -> Create User/Select Existing User -> Security Credentials -> Access Keys -> Create Access Key. The secret access key is displayed only once, right after the access key is created, so store it securely.
{% endtab %}

{% tab title="Role Token Credentials" %}
`roleArn`, **Role (Required)**\
The Amazon Resource Name (ARN) of the IAM role that is assumed to upload files to your bucket, in the form `arn:aws:iam::<account-id>:role/<role-name>`. AWS uses it for role-based authorization and identity switching.

`externalId`, **External ID (Required)**\
An identifier that strengthens the security of cross-account access, mainly for scenarios where a third-party service accesses resources in your account. It is used together with `roleArn`.
{% endtab %}
{% endtabs %}
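
For role-based access, AWS roles that use an external ID are normally protected by a trust policy with an `sts:ExternalId` condition. The sketch below builds such a policy document in the standard AWS shape. The principal ARN and external ID shown are placeholders; this page does not state which AWS account assumes the role, so treat that value as an assumption to confirm with support.

```python
import json


def build_trust_policy(trusted_principal_arn: str, external_id: str) -> dict:
    """Build an IAM trust policy that allows sts:AssumeRole only when the
    caller supplies the expected external ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Placeholder principal: the account allowed to assume the role.
                "Principal": {"AWS": trusted_principal_arn},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


# Hypothetical values, for illustration only.
policy = build_trust_policy(
    "arn:aws:iam::111122223333:root", "example-external-id"
)
print(json.dumps(policy, indent=2))
```

The role itself would also need a permissions policy that allows writing objects to the target bucket.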

<details>

<summary><code>fileFormat</code>, <strong>File Format (Required)</strong></summary>

Results for Amazon products can be sent in either JSON or CSV format. Results for YouTube products can only be sent in file format.\
Parameter values: `JSON`, `CSV`, `Download Link`

</details>
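
The rules above can be checked before you save an integration. The following is a minimal sketch, assuming the settings are held in a plain Python dictionary keyed by the parameter names on this page; the dictionary layout and both helper functions are hypothetical and are not part of any Thordata API.

```python
from typing import Dict, List

ALLOWED_FORMATS = {"JSON", "CSV", "Download Link"}
MAX_TASK_IDS = 10  # "Specify Task ID" supports up to 10 IDs


def parse_task_ids(raw: str) -> List[str]:
    """Split a comma-separated task ID string and enforce the 10-ID limit."""
    ids = [part.strip() for part in raw.split(",") if part.strip()]
    if not ids:
        raise ValueError("at least one task ID is required")
    if len(ids) > MAX_TASK_IDS:
        raise ValueError(f"at most {MAX_TASK_IDS} task IDs are supported")
    return ids


def validate_s3_settings(settings: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means nothing is missing."""
    problems = []
    for key in ("bucketName", "fileFormat"):
        if not settings.get(key):
            problems.append(f"missing required field: {key}")
    fmt = settings.get("fileFormat")
    if fmt and fmt not in ALLOWED_FORMATS:
        problems.append(f"unsupported fileFormat: {fmt}")
    # Either credential tab must be filled in completely.
    has_keys = all(settings.get(k) for k in ("awsAccessKey", "awsSecretKey"))
    has_role = all(settings.get(k) for k in ("roleArn", "externalId"))
    if not (has_keys or has_role):
        problems.append(
            "provide awsAccessKey and awsSecretKey, or roleArn and externalId"
        )
    return problems
```

For example, `validate_s3_settings({"bucketName": "my-bucket", "fileFormat": "XML"})` reports the unsupported format and the missing credentials, while `targetPath` and `fileName` may be omitted because they are optional.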

**Viewing Transferred Files:**\
If your integration task shows a “Success” status, you can view the results in your Amazon S3 account.\
You can also access them directly via a link of the following form:\
<https://s3.us-east-2.amazonaws.com/downloaddirectory/your-target-path/filename>

For example, if the target path is `path/to`, the file name is `123`, and the file format is `json`,\
the access link is:\
<https://s3.us-east-2.amazonaws.com/downloaddirectory/path/to/123.json>
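
The link pattern above can be reproduced programmatically. This is a small sketch, assuming the base URL from the example on this page and a file format of `JSON` or `CSV` that maps to a lowercase extension; `build_result_url` is a hypothetical helper name, and a different bucket or region would need a different base.

```python
from urllib.parse import quote

# Base copied from the example link on this page (assumption: it applies
# unchanged to your own uploads).
BASE_URL = "https://s3.us-east-2.amazonaws.com/downloaddirectory"


def build_result_url(target_path: str, file_name: str, file_format: str) -> str:
    """Join target path, file name, and lowercase extension into a link."""
    # Drop empty segments so leading or trailing slashes do not matter.
    parts = [p for p in target_path.strip("/").split("/") if p]
    parts.append(f"{file_name}.{file_format.lower()}")
    # Percent-encode each segment separately to keep the slashes intact.
    return BASE_URL + "/" + "/".join(quote(p) for p in parts)


print(build_result_url("path/to", "123", "JSON"))
# → https://s3.us-east-2.amazonaws.com/downloaddirectory/path/to/123.json
```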

If you need further assistance, please contact us via email at <support@thordata.com>.

