# GitHub Scraping Parameters

Configure GitHub scraping parameters for Thordata's Web Scraper API, including the repository URL, search URL, code URL, and other parameters.

**Unique identifier:**

<details>

<summary><code>token</code>, <strong>Access token (required)</strong></summary>

This parameter is the API access token, which authorizes your scraping requests.

**Request example:**

`Authorization: Bearer ********************`

```sh
curl -X POST "https://scraperapi.thordata.com/builder" \
  -H "Authorization: Bearer ********************" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "spider_name=github.com" \
  -d "spider_id=github_repository_by-url" \
  -d "spider_parameters=[{\"url\": \"https://github.com/TheAlgorithms/Python/blob/master/divide_and_conquer/power.py\"},{\"url\": \"https://github.com/AkarshSatija/msSync/blob/master/index.js\"}]" \
  -d "spider_errors=true" \
  -d "file_name={{TasksID}}"
```

</details>
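
As a minimal sketch of how the token is carried in a request, the Python snippet below builds the same two headers used by the curl example. The helper name `build_headers` and the environment variable `THORDATA_TOKEN` are hypothetical and chosen only for illustration; the header names and the `Bearer` scheme come from the example above.

```python
import os

# Hypothetical environment variable name; this documentation does not define one.
TOKEN_ENV_VAR = "THORDATA_TOKEN"


def build_headers(token: str) -> dict:
    """Return the two headers used by the curl examples in this document."""
    if not token:
        raise ValueError("an API token is required")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/x-www-form-urlencoded",
    }


# Fall back to the placeholder "Token-ID" so the sketch runs without a real token.
headers = build_headers(os.environ.get(TOKEN_ENV_VAR, "Token-ID"))
print(headers["Authorization"])
```

Reading the token from the environment keeps it out of scripts and shell history, which is a common practice rather than a requirement of the API.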

**Product - Scrape repository information**

1\. GitHub - Scrape repository information by repository URL

<details>

<summary><code>spider_id</code>, <strong>Scraping tool (required)</strong></summary>

Defines which scraping tool to use.

**Request example:**

`spider_id=github_repository_by-repo-url`

```sh
curl -X POST "https://scraperapi.thordata.com/builder" \
  -H "Authorization: Bearer Token-ID" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "spider_name=github.com" \
  -d "spider_id=github_repository_by-repo-url" \
  -d "spider_parameters=[{\"repo_url\": \"https://github.com/TheAlgorithms/Python\"}]" \
  -d "spider_errors=true" \
  -d "file_name={{TasksID}}"
```

</details>

<details>

<summary><code>repo_url</code>, <strong>Repository URL (required)</strong></summary>

This parameter specifies the URL of the repository to scrape.

**Request example:**

`"repo_url": "https://github.com/TheAlgorithms/Python"`

```sh
curl -X POST "https://scraperapi.thordata.com/builder" \
  -H "Authorization: Bearer Token-ID" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "spider_name=github.com" \
  -d "spider_id=github_repository_by-repo-url" \
  -d "spider_parameters=[{\"repo_url\": \"https://github.com/TheAlgorithms/Python\"}]" \
  -d "spider_errors=true" \
  -d "file_name={{TasksID}}"
```

</details>
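
The two parameters above can be assembled programmatically instead of escaping quotes by hand. The sketch below builds the form body for `github_repository_by-repo-url` with Python's standard library. The function name `build_repo_body` is hypothetical, and the sketch assumes the server applies standard form decoding to the body; the field names and values are taken from the curl example.

```python
import json
from urllib.parse import parse_qs, urlencode


def build_repo_body(repo_url: str) -> str:
    """Encode the form fields from the curl example for a single repository."""
    fields = {
        "spider_name": "github.com",
        "spider_id": "github_repository_by-repo-url",
        # spider_parameters is a JSON array of objects, sent as one form field.
        "spider_parameters": json.dumps([{"repo_url": repo_url}]),
        "spider_errors": "true",
        "file_name": "{{TasksID}}",
    }
    return urlencode(fields)


body = build_repo_body("https://github.com/TheAlgorithms/Python")

# Decode the body again to check what a form parser would see.
decoded = parse_qs(body)
print(decoded["spider_id"][0])
print(json.loads(decoded["spider_parameters"][0]))
```

Using `json.dumps` for `spider_parameters` avoids the manual `\"` escaping that the shell command needs.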

2\. GitHub - Scrape repository information by search URL

<details>

<summary><code>spider_id</code>, <strong>Scraping tool (required)</strong></summary>

Defines which scraping tool to use.

**Request example:**

`spider_id=github_repository_by-search-url`

```sh
curl -X POST "https://scraperapi.thordata.com/builder" \
  -H "Authorization: Bearer Token-ID" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "spider_name=github.com" \
  -d "spider_id=github_repository_by-search-url" \
  -d "spider_parameters=[{\"search_url\": \"https://github.com/search?q=ML%26type=repositories\",\"page_turning\": \"\",\"max_num\": \"1\"}]" \
  -d "spider_errors=true" \
  -d "file_name={{TasksID}}"
```

</details>

<details>

<summary><code>search_url</code>, <strong>Search URL (required)</strong></summary>

This parameter specifies the search URL to scrape.

**Request example:**

`"search_url": "https://github.com/search?q=ML%26type=repositories"`

```sh
curl -X POST "https://scraperapi.thordata.com/builder" \
  -H "Authorization: Bearer Token-ID" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "spider_name=github.com" \
  -d "spider_id=github_repository_by-search-url" \
  -d "spider_parameters=[{\"search_url\": \"https://github.com/search?q=ML%26type=repositories\",\"page_turning\": \"\",\"max_num\": \"1\"}]" \
  -d "spider_errors=true" \
  -d "file_name={{TasksID}}"
```

</details>

<details>

<summary><code>page_turning</code>, <strong>Page turning (optional)</strong></summary>

This parameter limits how many result pages are scraped. Enter the number of pages.

**Request example:**

`"page_turning": "1"`

```sh
curl -X POST "https://scraperapi.thordata.com/builder" \
  -H "Authorization: Bearer Token-ID" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "spider_name=github.com" \
  -d "spider_id=github_repository_by-search-url" \
  -d "spider_parameters=[{\"search_url\": \"https://github.com/search?q=ML%26type=repositories\",\"page_turning\": \"1\",\"max_num\": \"1\"}]" \
  -d "spider_errors=true" \
  -d "file_name={{TasksID}}"
```

</details>

<details>

<summary><code>max_num</code>, <strong>Maximum number (optional)</strong></summary>

This parameter sets the maximum number of repositories to scrape.

**Request example:**

`"max_num": "1"`

```sh
curl -X POST "https://scraperapi.thordata.com/builder" \
  -H "Authorization: Bearer Token-ID" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "spider_name=github.com" \
  -d "spider_id=github_repository_by-search-url" \
  -d "spider_parameters=[{\"search_url\": \"https://github.com/search?q=ML%26type=repositories\",\"page_turning\": \"1\",\"max_num\": \"1\"}]" \
  -d "spider_errors=true" \
  -d "file_name={{TasksID}}"
```

</details>
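
The search examples above write `%26` in place of `&` inside `search_url`, presumably because a raw `&` would split the form field in a body passed with `-d`. The sketch below shows one way to let the standard library handle that encoding. The function name `build_search_body` is hypothetical. Sending an empty string for an unset optional value follows the `"page_turning": ""` example; doing the same for `max_num` is an assumption, as is standard form decoding on the server side.

```python
import json
from urllib.parse import parse_qs, urlencode


def build_search_body(search_url: str, page_turning="", max_num="") -> str:
    """Form body for github_repository_by-search-url; values are sent as strings."""
    params = {
        "search_url": search_url,
        "page_turning": str(page_turning),
        "max_num": str(max_num),
    }
    return urlencode({
        "spider_name": "github.com",
        "spider_id": "github_repository_by-search-url",
        "spider_parameters": json.dumps([params]),
        "spider_errors": "true",
        "file_name": "{{TasksID}}",
    })


# Pass the search URL in its natural form, with a plain "&".
body = build_search_body(
    "https://github.com/search?q=ML&type=repositories", page_turning=1, max_num=1
)

# The "&" inside search_url is percent-encoded, so it cannot split the form field.
print("%26type" in body)

# After form decoding, the JSON contains the original URL again.
print(json.loads(parse_qs(body)["spider_parameters"][0]))
```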

3\. GitHub - Scrape repository information by URL

<details>

<summary><code>spider_id</code>, <strong>Scraping tool (required)</strong></summary>

Defines which scraping tool to use.

**Request example:**

`spider_id=github_repository_by-url`

```sh
curl -X POST "https://scraperapi.thordata.com/builder" \
  -H "Authorization: Bearer Token-ID" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "spider_name=github.com" \
  -d "spider_id=github_repository_by-url" \
  -d "spider_parameters=[{\"url\": \"https://github.com/TheAlgorithms/Python/blob/master/divide_and_conquer/power.py\"},{\"url\": \"https://github.com/AkarshSatija/msSync/blob/master/index.js\"}]" \
  -d "spider_errors=true" \
  -d "file_name={{TasksID}}"
```

</details>

<details>

<summary><code>url</code>, <strong>Code URL (required)</strong></summary>

This parameter specifies the URL of the code file to scrape.

**Request example:**

`"url": "https://github.com/TheAlgorithms/Python/blob/master/divide_and_conquer/power.py"`

```sh
curl -X POST "https://scraperapi.thordata.com/builder" \
  -H "Authorization: Bearer Token-ID" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "spider_name=github.com" \
  -d "spider_id=github_repository_by-url" \
  -d "spider_parameters=[{\"url\": \"https://github.com/TheAlgorithms/Python/blob/master/divide_and_conquer/power.py\"}]" \
  -d "spider_errors=true" \
  -d "file_name={{TasksID}}"
```

</details>
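
The `github_repository_by-url` example above passes several `url` objects in one `spider_parameters` array. As a rough sketch, the snippet below prepares an equivalent POST request in Python without sending it. The function name `build_code_request` is hypothetical, `Token-ID` is the same placeholder used in the curl examples, and the sketch assumes the server applies standard form decoding.

```python
import json
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

API_URL = "https://scraperapi.thordata.com/builder"


def build_code_request(code_urls, token: str) -> Request:
    """Prepare (but do not send) a POST for github_repository_by-url."""
    body = urlencode({
        "spider_name": "github.com",
        "spider_id": "github_repository_by-url",
        # One {"url": ...} object per code file, as in the curl example.
        "spider_parameters": json.dumps([{"url": u} for u in code_urls]),
        "spider_errors": "true",
        "file_name": "{{TasksID}}",
    }).encode("ascii")
    return Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


req = build_code_request(
    [
        "https://github.com/TheAlgorithms/Python/blob/master/divide_and_conquer/power.py",
        "https://github.com/AkarshSatija/msSync/blob/master/index.js",
    ],
    token="Token-ID",  # placeholder, as in the curl examples
)
print(req.get_method(), req.full_url)

# Count the URL objects a form parser would recover from the body.
sent = json.loads(parse_qs(req.data.decode("ascii"))["spider_parameters"][0])
print(len(sent))
```

To submit the request, pass `req` to `urllib.request.urlopen` with a real token; that call is left out so the sketch runs offline.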

If you need further assistance, please contact us by email at <support@thordata.com>.
