Web Scraper API: Reddit Scraping Parameters
Use Thordata’s Web Scraper API to configure Reddit scraping parameters, including URL, keyword, date, maximum number of posts, sorting method, subreddit URL, time sorting, post age limit, load replies, and reply count limit.
token (access token, required)
This parameter is the API access token; it verifies that the scraping request is authorized.
Request examples:
Authorization: Bearer ********************
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer ********************" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-url" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/\"},{\"url\": \"https://reddit.com/r/datascience/comments/1cmnf0m/technical_interview_python_sql_problem_but_not/\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
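The same request can be issued from Python using only the standard library. A minimal sketch: the `build_payload` helper name and the `file_name` value are illustrative, while the endpoint, form fields, and Authorization header mirror the curl example above.

```python
import json
from urllib import parse, request

API_URL = "https://scraperapi.thordata.com/builder"
TOKEN = "********************"  # replace with your access token

def build_payload(spider_id, spider_parameters, file_name):
    """Assemble the form fields the builder endpoint expects.
    spider_parameters is a list of dicts and is sent JSON-encoded."""
    return {
        "spider_name": "reddit.com",
        "spider_id": spider_id,
        "spider_parameters": json.dumps(spider_parameters),
        "spider_errors": "true",
        "file_name": file_name,
    }

payload = build_payload(
    "reddit_posts_by-url",
    [{"url": "https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/"}],
    "my-task-id",
)

# Sending the request is a network call, so it is left as a sketch:
# req = request.Request(
#     API_URL,
#     data=parse.urlencode(payload).encode(),
#     headers={"Authorization": f"Bearer {TOKEN}",
#              "Content-Type": "application/x-www-form-urlencoded"},
# )
# with request.urlopen(req) as resp:
#     print(resp.read())
```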
Product - Scraping Reddit Post Information:
Reddit - Scrape posts by URL
spider_id (Scraper tool, required)
Defines which scraper tool to use.
Request examples:
spider_id=reddit_posts_by-url
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-url" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/\"},{\"url\": \"https://reddit.com/r/datascience/comments/1cmnf0m/technical_interview_python_sql_problem_but_not/\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
url (URL, required)
Specifies the Reddit post URL to scrape.
Request examples:
"url": "https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-url" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/\"},{\"url\": \"https://reddit.com/r/datascience/comments/1cmnf0m/technical_interview_python_sql_problem_but_not/\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
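As the curl example shows, the `spider_parameters` array can carry one object per post URL, so several posts can be scraped in a single task. A small Python sketch of that batching step (the `post_urls` list is illustrative):

```python
import json

# Each dict in the array is one scraping target; multiple post URLs
# are batched into a single builder request.
post_urls = [
    "https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/",
    "https://reddit.com/r/datascience/comments/1cmnf0m/technical_interview_python_sql_problem_but_not/",
]
spider_parameters = json.dumps([{"url": u} for u in post_urls])
```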
Reddit - Scrape posts by keyword
spider_id (Scraper tool, required)
Defines which scraper tool to use.
Request examples:
spider_id=reddit_posts_by-keywords
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-keywords" ^
-d "spider_parameters=[{\"keyword\": \"datascience\",\"date\": \"All time\",\"num_of_posts\": \"10\",\"sort_by\": \"Hot\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
keyword (Keyword, required)
Specifies the search keyword for Reddit posts.
Request examples:
"keyword": "datascience"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-keywords" ^
-d "spider_parameters=[{\"keyword\": \"datascience\",\"date\": \"All time\",\"num_of_posts\": \"10\",\"sort_by\": \"Hot\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
date (Date, optional)
Specifies the time range for scraping posts. Values: All time, Past year, Past month, Past week, Today, Past hour.
Request examples:
"date": "All time"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-keywords" ^
-d "spider_parameters=[{\"keyword\": \"datascience\",\"date\": \"All time\",\"num_of_posts\": \"10\",\"sort_by\": \"Hot\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
num_of_posts (Maximum number of posts, optional)
Specifies the maximum number of posts to scrape.
Request examples:
"num_of_posts": "10"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-keywords" ^
-d "spider_parameters=[{\"keyword\": \"datascience\",\"date\": \"All time\",\"num_of_posts\": \"10\",\"sort_by\": \"Hot\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
sort_by (Sorting method, optional)
Specifies the sorting method for scraped posts. Values: Relevance, Hot, Top, New, Comment count.
Request examples:
"sort_by": "Hot"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-keywords" ^
-d "spider_parameters=[{\"keyword\": \"datascience\",\"date\": \"All time\",\"num_of_posts\": \"10\",\"sort_by\": \"Hot\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
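The keyword, date, num_of_posts, and sort_by fields above combine into a single `spider_parameters` entry. A Python sketch that builds such an entry and rejects values the documentation does not list (the `keyword_search` helper is illustrative, not part of the API):

```python
import json

# Allowed values taken from the date and sort_by descriptions above.
DATES = {"All time", "Past year", "Past month", "Past week", "Today", "Past hour"}
SORTS = {"Relevance", "Hot", "Top", "New", "Comment count"}

def keyword_search(keyword, date="All time", num_of_posts=10, sort_by="Hot"):
    """Build one spider_parameters entry for reddit_posts_by-keywords.
    Note the curl examples pass num_of_posts as a string."""
    if date not in DATES:
        raise ValueError(f"unsupported date filter: {date}")
    if sort_by not in SORTS:
        raise ValueError(f"unsupported sort: {sort_by}")
    return {"keyword": keyword, "date": date,
            "num_of_posts": str(num_of_posts), "sort_by": sort_by}

spider_parameters = json.dumps([keyword_search("datascience")])
```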
Reddit - Scrape posts by subreddit URL
spider_id (Scraper tool, required)
Defines which scraper tool to use.
Request examples:
spider_id=reddit_posts_by-subredditurl
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-subredditurl" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042\",\"sort_by\": \"Hot\",\"num_of_posts\": \"10\",\"sort_by_time\": \"All Time\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
url (Subreddit URL, required)
Specifies the subreddit URL to scrape Reddit posts from.
Request examples:
"url": "https://www.reddit.com/r/battlefield2042"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-subredditurl" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042\",\"sort_by\": \"Hot\",\"num_of_posts\": \"10\",\"sort_by_time\": \"All Time\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
sort_by (Sorting method, optional)
Specifies the sorting method for scraped posts. Values: Hot, Top, New, Rising.
Request examples:
"sort_by": "Hot"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-subredditurl" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042\",\"sort_by\": \"Hot\",\"num_of_posts\": \"10\",\"sort_by_time\": \"All Time\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
num_of_posts (Maximum number of posts, optional)
Specifies the maximum number of posts to scrape.
Request examples:
"num_of_posts": "10"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-subredditurl" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042\",\"sort_by\": \"Hot\",\"num_of_posts\": \"10\",\"sort_by_time\": \"All Time\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
sort_by_time (Time sorting, optional)
Specifies the time-based sorting method. Values: Now, Today, This Week, This Month, This Year, All Time.
Request examples:
"sort_by_time": "All Time"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-subredditurl" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042\",\"sort_by\": \"Hot\",\"num_of_posts\": \"10\",\"sort_by_time\": \"All Time\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
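The subreddit parameters above can be assembled the same way. A Python sketch with the documented value sets enforced (the `subreddit_task` helper is illustrative):

```python
import json

# Allowed values taken from the sort_by and sort_by_time descriptions above.
SORT_BY = {"Hot", "Top", "New", "Rising"}
SORT_BY_TIME = {"Now", "Today", "This Week", "This Month", "This Year", "All Time"}

def subreddit_task(url, sort_by="Hot", num_of_posts=10, sort_by_time="All Time"):
    """Build one spider_parameters entry for reddit_posts_by-subredditurl."""
    if sort_by not in SORT_BY:
        raise ValueError(f"unsupported sort_by: {sort_by}")
    if sort_by_time not in SORT_BY_TIME:
        raise ValueError(f"unsupported sort_by_time: {sort_by_time}")
    return {"url": url, "sort_by": sort_by,
            "num_of_posts": str(num_of_posts), "sort_by_time": sort_by_time}

spider_parameters = json.dumps(
    [subreddit_task("https://www.reddit.com/r/battlefield2042")])
```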
Product - Scraping Reddit Comment Information:
Reddit - Scrape comments by URL
spider_id (Scraper tool, required)
Defines which scraper tool to use.
Request examples:
spider_id=reddit_comment_by-url
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_comment_by-url" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/datascience/comments/1cmnf0m/comment/l32204i/?utm_source=share%26utm_medium=web3x%26utm_name=web3xcss%26utm_term=1%26utm_content=share_button\",\"days_back\": \"10\",\"load_all_replies\": \"true\",\"comment_limit\": \"5\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
url (URL, required)
Specifies the Reddit post or comment URL to scrape.
Request examples:
"url": "https://www.reddit.com/r/datascience/comments/1cmnf0m/comment/l32204i/?utm_source=share%26utm_medium=web3x%26utm_name=web3xcss%26utm_term=1%26utm_content=share_button"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_comment_by-url" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/datascience/comments/1cmnf0m/comment/l32204i/?utm_source=share%26utm_medium=web3x%26utm_name=web3xcss%26utm_term=1%26utm_content=share_button\",\"days_back\": \"10\",\"load_all_replies\": \"true\",\"comment_limit\": \"5\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
days_back (Post age limit, optional)
Limits results to comments posted within the given number of days.
Request examples:
"days_back": "10"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_comment_by-url" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/datascience/comments/1cmnf0m/comment/l32204i/?utm_source=share%26utm_medium=web3x%26utm_name=web3xcss%26utm_term=1%26utm_content=share_button\",\"days_back\": \"10\",\"load_all_replies\": \"true\",\"comment_limit\": \"5\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
load_all_replies (Load replies, optional)
Specifies whether to scrape replies to comments. When set to true, all comments and all of their replies are retrieved. Values: true, false.
Request examples:
"load_all_replies": "true"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_comment_by-url" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/datascience/comments/1cmnf0m/comment/l32204i/?utm_source=share%26utm_medium=web3x%26utm_name=web3xcss%26utm_term=1%26utm_content=share_button\",\"days_back\": \"10\",\"load_all_replies\": \"true\",\"comment_limit\": \"5\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
comment_limit (Reply count limit, optional)
Specifies the maximum number of comments to return.
Request examples:
"comment_limit": "5"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_comment_by-url" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/datascience/comments/1cmnf0m/comment/l32204i/?utm_source=share%26utm_medium=web3x%26utm_name=web3xcss%26utm_term=1%26utm_content=share_button\",\"days_back\": \"10\",\"load_all_replies\": \"true\",\"comment_limit\": \"5\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
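The comment parameters above combine the same way. A Python sketch (the `comment_task` helper is illustrative); note that the curl examples pass every value, including booleans and numbers, as strings:

```python
import json

def comment_task(url, days_back=None, load_all_replies=False, comment_limit=None):
    """Build one spider_parameters entry for reddit_comment_by-url.
    Optional fields are omitted when not set."""
    entry = {"url": url}
    if days_back is not None:
        entry["days_back"] = str(days_back)
    entry["load_all_replies"] = "true" if load_all_replies else "false"
    if comment_limit is not None:
        entry["comment_limit"] = str(comment_limit)
    return entry

spider_parameters = json.dumps([comment_task(
    "https://www.reddit.com/r/datascience/comments/1cmnf0m/comment/l32204i/",
    days_back=10, load_all_replies=True, comment_limit=5)])
```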
If you need further assistance, please contact us via email at [email protected].