# GitHub Scraping Parameters

Web Scraper API GitHub scraping parameters

Configure GitHub scraping parameters for Thordata's Web Scraper API, including the repository URL, search URL, code URL, and other parameters.

**Unique identifier:**

<details>

<summary><code>token</code>, <strong>Access token (required)</strong></summary>

This parameter is the API access token, which authorizes the scraping request.

**Request example:**

`Authorization: Bearer ********************`

```sh
curl -X POST "https://scraperapi.thordata.com/builder" \
  -H "Authorization: Bearer ********************" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "spider_name=github.com" \
  -d "spider_id=github_repository_by-url" \
  -d "spider_parameters=[{\"url\": \"https://github.com/TheAlgorithms/Python/blob/master/divide_and_conquer/power.py\"},{\"url\": \"https://github.com/AkarshSatija/msSync/blob/master/index.js\"}]" \
  -d "spider_errors=true" \
  -d "file_name={{TasksID}}"
```

</details>
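
As a minimal sketch of how the token might be supplied from code instead of being typed into a command line, the Python snippet below reads it from an environment variable and builds the two headers used in the curl example. The variable name `THORDATA_TOKEN` and the helper `auth_headers` are assumptions made for illustration; they are not part of the API.

```python
import os


def auth_headers(token: str) -> dict:
    """Return the two headers used by the curl example."""
    if not token:
        raise ValueError("an access token is required")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/x-www-form-urlencoded",
    }


# THORDATA_TOKEN is a hypothetical variable name chosen for this sketch;
# "Token-ID" is the placeholder used elsewhere on this page.
token = os.environ.get("THORDATA_TOKEN", "Token-ID")
headers = auth_headers(token)
```

Keeping the token out of the command line avoids leaving it in shell history.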

**Product - Scrape repository information**

1\. GitHub - Scrape repository information by repository URL

<details>

<summary><code>spider_id</code>, <strong>Scraping tool (required)</strong></summary>

Defines the scraping tool to use.

**Request example:**

`spider_id=github_repository_by-repo-url`

```sh
curl -X POST "https://scraperapi.thordata.com/builder" \
  -H "Authorization: Bearer Token-ID" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "spider_name=github.com" \
  -d "spider_id=github_repository_by-repo-url" \
  -d "spider_parameters=[{\"repo_url\": \"https://github.com/TheAlgorithms/Python\"}]" \
  -d "spider_errors=true" \
  -d "file_name={{TasksID}}"
```

</details>

<details>

<summary><code>repo_url</code>, <strong>Repository URL (required)</strong></summary>

This parameter specifies the URL of the repository to scrape.

**Request example:**

`"repo_url": "https://github.com/TheAlgorithms/Python"`

```sh
curl -X POST "https://scraperapi.thordata.com/builder" \
  -H "Authorization: Bearer Token-ID" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "spider_name=github.com" \
  -d "spider_id=github_repository_by-repo-url" \
  -d "spider_parameters=[{\"repo_url\": \"https://github.com/TheAlgorithms/Python\"}]" \
  -d "spider_errors=true" \
  -d "file_name={{TasksID}}"
```

</details>
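
The request body in the examples above is a URL-encoded form whose `spider_parameters` field holds a JSON array. The following sketch shows one way that body could be built with Python's standard library. The helper name `build_repo_body` is hypothetical; the field names and values are taken from the curl example. The snippet only builds the body and does not send it.

```python
import json
from urllib.parse import urlencode


def build_repo_body(repo_urls):
    """Encode the form fields from the curl example for a list of repo URLs."""
    spider_parameters = [{"repo_url": url} for url in repo_urls]
    fields = {
        "spider_name": "github.com",
        "spider_id": "github_repository_by-repo-url",
        # spider_parameters is a JSON array serialized to a string.
        "spider_parameters": json.dumps(spider_parameters),
        "spider_errors": "true",
        "file_name": "{{TasksID}}",
    }
    # urlencode percent-encodes quotes, braces, and slashes in the values.
    return urlencode(fields)


body = build_repo_body(["https://github.com/TheAlgorithms/Python"])
```

Serializing with `json.dumps` avoids the hand-written `\"` escapes that the shell version needs.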

2\. GitHub - Scrape repository information by search URL

<details>

<summary><code>spider_id</code>, <strong>Scraping tool (required)</strong></summary>

Defines the scraping tool to use.

**Request example:**

`spider_id=github_repository_by-search-url`

```sh
curl -X POST "https://scraperapi.thordata.com/builder" \
  -H "Authorization: Bearer Token-ID" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "spider_name=github.com" \
  -d "spider_id=github_repository_by-search-url" \
  -d "spider_parameters=[{\"search_url\": \"https://github.com/search?q=ML%26type=repositories\",\"page_turning\": \"\",\"max_num\": \"1\"}]" \
  -d "spider_errors=true" \
  -d "file_name={{TasksID}}"
```

</details>

<details>

<summary><code>search_url</code>, <strong>Search URL (required)</strong></summary>

This parameter specifies the search URL to scrape.

**Request example:**

`"search_url": "https://github.com/search?q=ML%26type=repositories"`

```sh
curl -X POST "https://scraperapi.thordata.com/builder" \
  -H "Authorization: Bearer Token-ID" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "spider_name=github.com" \
  -d "spider_id=github_repository_by-search-url" \
  -d "spider_parameters=[{\"search_url\": \"https://github.com/search?q=ML%26type=repositories\",\"page_turning\": \"\",\"max_num\": \"1\"}]" \
  -d "spider_errors=true" \
  -d "file_name={{TasksID}}"
```

</details>
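
The `%26` in the example search URL is the percent-encoded form of `&`. A plausible reading, which is an assumption and not stated on this page, is that the `&` is encoded so that a form parser does not treat `type=repositories` as a separate form field. When the body is built with a form encoder, this encoding happens automatically, as the sketch below shows with Python's standard library. The helper name `encode_search_field` is hypothetical.

```python
import json
from urllib.parse import parse_qs, urlencode


def encode_search_field(search_url):
    """Form-encode a spider_parameters field holding one search URL."""
    params = [{"search_url": search_url}]
    return urlencode({"spider_parameters": json.dumps(params)})


# The search URL as it would appear in a browser, with a literal "&".
raw_url = "https://github.com/search?q=ML&type=repositories"
encoded = encode_search_field(raw_url)

# The literal "&" does not survive form encoding; it becomes "%26".
assert "&" not in encoded
assert "ML%26type" in encoded

# Decoding the form field restores the original URL.
decoded = json.loads(parse_qs(encoded)["spider_parameters"][0])
assert decoded[0]["search_url"] == raw_url
```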

<details>

<summary><code>page_turning</code>, <strong>Page turning (optional)</strong></summary>

This parameter limits the number of result pages to scrape. Enter the number of pages.

**Request example:**

`"page_turning": "1"`

```sh
curl -X POST "https://scraperapi.thordata.com/builder" \
  -H "Authorization: Bearer Token-ID" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "spider_name=github.com" \
  -d "spider_id=github_repository_by-search-url" \
  -d "spider_parameters=[{\"search_url\": \"https://github.com/search?q=ML%26type=repositories\",\"page_turning\": \"1\",\"max_num\": \"1\"}]" \
  -d "spider_errors=true" \
  -d "file_name={{TasksID}}"
```

</details>

<details>

<summary><code>max_num</code>, <strong>Maximum number (optional)</strong></summary>

This parameter specifies the maximum number of repositories to scrape.

**Request example:**

`"max_num": "1"`

```sh
curl -X POST "https://scraperapi.thordata.com/builder" \
  -H "Authorization: Bearer Token-ID" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "spider_name=github.com" \
  -d "spider_id=github_repository_by-search-url" \
  -d "spider_parameters=[{\"search_url\": \"https://github.com/search?q=ML%26type=repositories\",\"page_turning\": \"1\",\"max_num\": \"1\"}]" \
  -d "spider_errors=true" \
  -d "file_name={{TasksID}}"
```

</details>
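
In the examples above, `page_turning` and `max_num` are passed as strings, and an unset `page_turning` is sent as an empty string. The sketch below mirrors that convention in Python. The helper `search_parameters` and its integer arguments are assumptions for illustration; whether the API also accepts omitted keys, numeric values, or an empty `max_num` is not stated on this page.

```python
import json


def search_parameters(search_url, page_turning=None, max_num=None):
    """Build one spider_parameters entry for the search-URL spider."""
    return {
        "search_url": search_url,
        # An unset page_turning is sent as "" as in the spider_id example.
        "page_turning": "" if page_turning is None else str(page_turning),
        # Treating an unset max_num the same way is an assumption.
        "max_num": "" if max_num is None else str(max_num),
    }


entry = search_parameters(
    # URL copied verbatim from the example above.
    "https://github.com/search?q=ML%26type=repositories",
    page_turning=1,
    max_num=1,
)
spider_parameters = json.dumps([entry])
```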

3\. GitHub - Scrape repository information by URL

<details>

<summary><code>spider_id</code>, <strong>Scraping tool (required)</strong></summary>

Defines the scraping tool to use.

**Request example:**

`spider_id=github_repository_by-url`

```sh
curl -X POST "https://scraperapi.thordata.com/builder" \
  -H "Authorization: Bearer Token-ID" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "spider_name=github.com" \
  -d "spider_id=github_repository_by-url" \
  -d "spider_parameters=[{\"url\": \"https://github.com/TheAlgorithms/Python/blob/master/divide_and_conquer/power.py\"},{\"url\": \"https://github.com/AkarshSatija/msSync/blob/master/index.js\"}]" \
  -d "spider_errors=true" \
  -d "file_name={{TasksID}}"
```

</details>

<details>

<summary><code>url</code>, <strong>Code URL (required)</strong></summary>

This parameter specifies the code URL to scrape.

**Request example:**

`"url": "https://github.com/TheAlgorithms/Python/blob/master/divide_and_conquer/power.py"`

```sh
curl -X POST "https://scraperapi.thordata.com/builder" \
  -H "Authorization: Bearer Token-ID" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "spider_name=github.com" \
  -d "spider_id=github_repository_by-url" \
  -d "spider_parameters=[{\"url\": \"https://github.com/TheAlgorithms/Python/blob/master/divide_and_conquer/power.py\"}]" \
  -d "spider_errors=true" \
  -d "file_name={{TasksID}}"
```

</details>
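
The `spider_id` example for this spider passes two objects in `spider_parameters`, one per code URL. As a rough sketch, such a request could be assembled in Python with the standard library as follows. The helper `build_code_request` is hypothetical, `Token-ID` is a placeholder, and the snippet only constructs the request without sending it.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "https://scraperapi.thordata.com/builder"


def build_code_request(token, code_urls):
    """Assemble (but do not send) a POST request for the by-url spider."""
    fields = {
        "spider_name": "github.com",
        "spider_id": "github_repository_by-url",
        # One {"url": ...} object per code URL, as in the curl example.
        "spider_parameters": json.dumps([{"url": u} for u in code_urls]),
        "spider_errors": "true",
        "file_name": "{{TasksID}}",
    }
    return Request(
        ENDPOINT,
        data=urlencode(fields).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


request = build_code_request(
    "Token-ID",
    [
        "https://github.com/TheAlgorithms/Python/blob/master/divide_and_conquer/power.py",
        "https://github.com/AkarshSatija/msSync/blob/master/index.js",
    ],
)
# urllib.request.urlopen(request) would send it; that step is left out here.
```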

If you need further assistance, please contact us by email at <support@thordata.com>.

