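Reddit Scraping Parameters
Web Scraper API - Reddit Scraping Parameters
Configure Reddit scraping parameters with Thordata's Web Scraper API. The parameters covered are: URL, keyword, date, maximum number of posts, sort order, subreddit URL, time sort, days-back limit, load replies, and reply limit.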
Unique identifier:
token: access token (required)
This parameter is the API access token. It authenticates the scraping request.
Request example:
Authorization: Bearer ********************
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer ********************" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-url" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/\"},{\"url\": \"https://reddit.com/r/datascience/comments/1cmnf0m/technical_interview_python_sql_problem_but_not/\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
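The curl request above can also be assembled in a script. The following is a minimal Python sketch, not an official client: the endpoint and form field names are copied from the curl example, while `build_request` and `YOUR_TOKEN` are hypothetical names introduced for illustration. It only builds the headers and the form-encoded body and does not send anything.

```python
import json
from urllib.parse import parse_qs, urlencode

# Endpoint taken from the curl example in this document.
BUILDER_URL = "https://scraperapi.thordata.com/builder"


def build_request(token, spider_id, spider_parameters, file_name="{{TasksID}}"):
    """Return (headers, body) for a builder request without sending it."""
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    fields = {
        "spider_name": "reddit.com",
        "spider_id": spider_id,
        # spider_parameters is a JSON array serialized into a single form field.
        "spider_parameters": json.dumps(spider_parameters),
        "spider_errors": "true",
        # "{{TasksID}}" is the placeholder used in the curl example.
        "file_name": file_name,
    }
    return headers, urlencode(fields)


headers, body = build_request(
    "YOUR_TOKEN",  # placeholder, not a real token
    "reddit_posts_by-url",
    [{"url": "https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/"}],
)
print(headers["Authorization"])
# Decode the body again to confirm the fields survive form encoding.
print(parse_qs(body)["spider_id"])
```

The resulting `headers` and `body` could be passed to any HTTP client as a POST to `BUILDER_URL`. Keeping the build step separate from the send step makes the payload easy to inspect before a request is made.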
I. Product: scraping Reddit post information
1. Reddit - scrape post information by URL
spider_id: scraper tool (required)
Defines which scraper tool to use.
Request example:
spider_id=reddit_posts_by-url
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-url" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/\"},{\"url\": \"https://reddit.com/r/datascience/comments/1cmnf0m/technical_interview_python_sql_problem_but_not/\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
url: URL (required)
Specifies the URL of the Reddit post to scrape.
Request example:
"url": "https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-url" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/\"},{\"url\": \"https://reddit.com/r/datascience/comments/1cmnf0m/technical_interview_python_sql_problem_but_not/\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
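The example above passes several posts in one request by listing one `{"url": ...}` object per post. A small Python sketch of building that list follows. The helper names are hypothetical, and the URL check is only a local heuristic based on the two example URLs (a reddit.com host and a `/comments/` path segment), not a rule stated by the API.

```python
import json
from urllib.parse import urlparse


def is_reddit_post_url(url):
    """Heuristic check: http(s) scheme, reddit.com host, /comments/ in the path."""
    parts = urlparse(url)
    host = parts.netloc.lower()
    on_reddit = host == "reddit.com" or host.endswith(".reddit.com")
    return parts.scheme in ("http", "https") and on_reddit and "/comments/" in parts.path


def post_url_parameters(urls):
    """Build the spider_parameters JSON string: one {"url": ...} object per post."""
    bad = [u for u in urls if not is_reddit_post_url(u)]
    if bad:
        raise ValueError(f"not Reddit post URLs: {bad}")
    return json.dumps([{"url": u} for u in urls])


print(post_url_parameters([
    "https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/",
    "https://reddit.com/r/datascience/comments/1cmnf0m/technical_interview_python_sql_problem_but_not/",
]))
```

Rejecting obviously wrong URLs before submission avoids spending a task on input the `reddit_posts_by-url` scraper is unlikely to handle, such as a bare subreddit URL.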
2. Reddit - scrape post information by keyword
spider_id: scraper tool (required)
Defines which scraper tool to use.
Request example:
spider_id=reddit_posts_by-keywords
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-keywords" ^
-d "spider_parameters=[{\"keyword\": \"datascience\",\"date\": \"All time\",\"num_of_posts\": \"10\",\"sort_by\": \"Hot\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
keyword: keyword (required)
Specifies the search keyword used to find Reddit posts to scrape.
Request example:
"keyword": "datascience"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-keywords" ^
-d "spider_parameters=[{\"keyword\": \"datascience\",\"date\": \"All time\",\"num_of_posts\": \"10\",\"sort_by\": \"Hot\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
date: date (optional)
Restricts the scraped posts to a time range. Allowed values: All time, Past year, Past month, Past week, Today, Past hour.
Request example:
"date": "All time"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-keywords" ^
-d "spider_parameters=[{\"keyword\": \"datascience\",\"date\": \"All time\",\"num_of_posts\": \"10\",\"sort_by\": \"Hot\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
num_of_posts: maximum number of posts (optional)
Specifies the maximum number of posts to scrape.
Request example:
"num_of_posts": "10"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-keywords" ^
-d "spider_parameters=[{\"keyword\": \"datascience\",\"date\": \"All time\",\"num_of_posts\": \"10\",\"sort_by\": \"Hot\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
sort_by: sort order (optional)
Specifies how the scraped posts are sorted. Allowed values: Relevance, Hot, Top, New, Comment count.
Request example:
"sort_by": "Hot"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-keywords" ^
-d "spider_parameters=[{\"keyword\": \"datascience\",\"date\": \"All time\",\"num_of_posts\": \"10\",\"sort_by\": \"Hot\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
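The keyword parameters above accept a fixed set of values, so checking them locally can catch typos before a task is submitted. The sketch below is illustrative only. The allowed values are the ones listed in this document; the function name is hypothetical, and leaving out unset optional fields is an assumption about how the API treats optional parameters.

```python
# Allowed values as listed in this document for the keyword scraper.
DATE_VALUES = ("All time", "Past year", "Past month", "Past week", "Today", "Past hour")
KEYWORD_SORT_VALUES = ("Relevance", "Hot", "Top", "New", "Comment count")


def keyword_parameters(keyword, date=None, num_of_posts=None, sort_by=None):
    """Build one spider_parameters entry for reddit_posts_by-keywords."""
    if not keyword:
        raise ValueError("keyword is required")
    if date is not None and date not in DATE_VALUES:
        raise ValueError(f"date must be one of {DATE_VALUES}")
    if sort_by is not None and sort_by not in KEYWORD_SORT_VALUES:
        raise ValueError(f"sort_by must be one of {KEYWORD_SORT_VALUES}")

    item = {"keyword": keyword}
    if date is not None:
        item["date"] = date
    if num_of_posts is not None:
        if int(num_of_posts) < 1:
            raise ValueError("num_of_posts must be a positive integer")
        # The request example sends this number as a string ("10").
        item["num_of_posts"] = str(int(num_of_posts))
    if sort_by is not None:
        item["sort_by"] = sort_by
    return item


print(keyword_parameters("datascience", date="All time", num_of_posts=10, sort_by="Hot"))
```

The returned dictionary would be wrapped in a list and serialized as JSON to form the `spider_parameters` field.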
3. Reddit - scrape post information by subreddit URL
spider_id: scraper tool (required)
Defines which scraper tool to use.
Request example:
spider_id=reddit_posts_by-subredditurl
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-subredditurl" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042\",\"sort_by\": \"Hot\",\"num_of_posts\": \"10\",\"sort_by_time\": \"All Time\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
url: subreddit URL (required)
Specifies the URL of the subreddit whose posts are scraped.
Request example:
"url": "https://www.reddit.com/r/battlefield2042"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-subredditurl" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042\",\"sort_by\": \"Hot\",\"num_of_posts\": \"10\",\"sort_by_time\": \"All Time\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
sort_by: sort order (optional)
Specifies how the scraped posts are sorted. Allowed values: Hot, Top, New, Rising.
Request example:
"sort_by": "Hot"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-subredditurl" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042\",\"sort_by\": \"Hot\",\"num_of_posts\": \"10\",\"sort_by_time\": \"All Time\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
num_of_posts: maximum number of posts (optional)
Specifies the maximum number of posts to scrape.
Request example:
"num_of_posts": "10"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-subredditurl" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042\",\"sort_by\": \"Hot\",\"num_of_posts\": \"10\",\"sort_by_time\": \"All Time\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
sort_by_time: time sort (optional)
Specifies the time window used when sorting the scraped posts. Allowed values: Now, Today, This Week, This Month, This Year, All Time.
Request example:
"sort_by_time": "All Time"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-subredditurl" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042\",\"sort_by\": \"Hot\",\"num_of_posts\": \"10\",\"sort_by_time\": \"All Time\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
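The subreddit scraper spells its values differently from the keyword scraper: here the time value is written "All Time", while the keyword scraper's `date` uses "All time". This document does not say whether the service is case-sensitive, so the sketch below takes the cautious route and maps user input onto the documented spelling. The helper names are hypothetical, and the defaults simply mirror the request example rather than any documented API defaults.

```python
# Allowed values as listed in this document for the subreddit scraper.
SUBREDDIT_SORT_VALUES = ("Hot", "Top", "New", "Rising")
SORT_BY_TIME_VALUES = ("Now", "Today", "This Week", "This Month", "This Year", "All Time")


def canonical(value, allowed):
    """Return the documented spelling of value, matching case-insensitively."""
    wanted = value.strip().casefold()
    for candidate in allowed:
        if candidate.casefold() == wanted:
            return candidate
    raise ValueError(f"{value!r} is not one of {allowed}")


def subreddit_parameters(url, sort_by="Hot", num_of_posts=10, sort_by_time="All Time"):
    """Build one spider_parameters entry for reddit_posts_by-subredditurl."""
    return {
        "url": url,
        "sort_by": canonical(sort_by, SUBREDDIT_SORT_VALUES),
        # The request example sends this number as a string ("10").
        "num_of_posts": str(int(num_of_posts)),
        "sort_by_time": canonical(sort_by_time, SORT_BY_TIME_VALUES),
    }


print(subreddit_parameters("https://www.reddit.com/r/battlefield2042",
                           sort_by="hot", sort_by_time="all time"))
```

Normalizing to the documented spelling means the request always carries a value that appears verbatim in the list above, whatever casing the caller typed.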
II. Product: scraping Reddit post comment information
1. Reddit - scrape post comment information by URL
spider_id: scraper tool (required)
Defines which scraper tool to use.
Request example:
spider_id=reddit_comment_by-url
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_comment_by-url" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/datascience/comments/1cmnf0m/comment/l32204i/?utm_source=share%26utm_medium=web3x%26utm_name=web3xcss%26utm_term=1%26utm_content=share_button\",\"days_back\": \"10\",\"load_all_replies\": \"true\",\"comment_limit\": \"5\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
url: URL (required)
Specifies the URL of the Reddit comment or post to scrape.
Request example:
"url": "https://www.reddit.com/r/datascience/comments/1cmnf0m/comment/l32204i/?utm_source=share%26utm_medium=web3x%26utm_name=web3xcss%26utm_term=1%26utm_content=share_button"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_comment_by-url" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/datascience/comments/1cmnf0m/comment/l32204i/?utm_source=share%26utm_medium=web3x%26utm_name=web3xcss%26utm_term=1%26utm_content=share_button\",\"days_back\": \"10\",\"load_all_replies\": \"true\",\"comment_limit\": \"5\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
days_back: days-back limit (optional)
Scrapes all comments posted within the number of days you specify.
Request example:
"days_back": "10"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_comment_by-url" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/datascience/comments/1cmnf0m/comment/l32204i/?utm_source=share%26utm_medium=web3x%26utm_name=web3xcss%26utm_term=1%26utm_content=share_button\",\"days_back\": \"10\",\"load_all_replies\": \"true\",\"comment_limit\": \"5\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
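load_all_replies: load replies (optional)
Specifies whether to scrape the replies to comments. When set to true, records for all comments and all of their replies are returned.
Allowed values: true, false.
Request example: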
"load_all_replies": "true"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_comment_by-url" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/datascience/comments/1cmnf0m/comment/l32204i/?utm_source=share%26utm_medium=web3x%26utm_name=web3xcss%26utm_term=1%26utm_content=share_button\",\"days_back\": \"10\",\"load_all_replies\": \"true\",\"comment_limit\": \"5\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
comment_limit: reply limit (optional)
Limits the number of comments returned.
Request example:
"comment_limit": "5"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_comment_by-url" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/datascience/comments/1cmnf0m/comment/l32204i/?utm_source=share%26utm_medium=web3x%26utm_name=web3xcss%26utm_term=1%26utm_content=share_button\",\"days_back\": \"10\",\"load_all_replies\": \"true\",\"comment_limit\": \"5\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
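Two details of the comment request are easy to get wrong when moving from curl to a script. First, the example sends numbers and booleans as strings ("10", "true", "5"). Second, the curl example writes `&` inside the URL as `%26`, presumably because curl's `-d` sends data as given without URL-encoding it. A form encoder such as Python's `urlencode` does that escaping itself, so the URL should be passed with plain `&`. The sketch below illustrates both points. `comment_parameters` is a hypothetical helper, the query string is shortened for readability, and it is an assumption that the service decodes the form field before parsing the JSON.

```python
import json
from urllib.parse import parse_qs, urlencode


def comment_parameters(url, days_back=None, load_all_replies=None, comment_limit=None):
    """Build one spider_parameters entry for reddit_comment_by-url."""
    item = {"url": url}
    if days_back is not None:
        item["days_back"] = str(int(days_back))
    if load_all_replies is not None:
        # The documented values are the lowercase strings "true" and "false".
        item["load_all_replies"] = "true" if load_all_replies else "false"
    if comment_limit is not None:
        item["comment_limit"] = str(int(comment_limit))
    return item


# Plain "&" here: urlencode escapes it to %26 when building the form field.
url = ("https://www.reddit.com/r/datascience/comments/1cmnf0m/comment/l32204i/"
       "?utm_source=share&utm_medium=web3x")
params = [comment_parameters(url, days_back=10, load_all_replies=True, comment_limit=5)]
field = urlencode({"spider_parameters": json.dumps(params)})

print(field)
# Decoding the field gives back the original URL with its "&" intact.
print(json.loads(parse_qs(field)["spider_parameters"][0])[0]["url"])
```

Passing a URL that already contains `%26` through `urlencode` would escape the percent sign a second time, so pick one encoding step and apply it exactly once.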
If you need further help, please contact [email protected] by email.