Reddit Scraping Parameters
Use Thordata’s Web Scraper API to configure Reddit scraping parameters, including URL, keyword, date, maximum number of posts, sorting method, subreddit URL, time sorting, post age limit, load replies, and reply count limit.
Unique identifier:
token, Access token (required)
The API access token, sent as a Bearer token in the Authorization header, authenticates each scraping request.
Request examples:
Authorization: Bearer ********************
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer ********************" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-url" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/\"},{\"url\": \"https://reddit.com/r/datascience/comments/1cmnf0m/technical_interview_python_sql_problem_but_not/\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
Product - Scraping Reddit Post Information:
Reddit - Scrape posts by URL
spider_id, Scraper tool (required)
Defines which scraper tool to use.
Request examples:
spider_id=reddit_posts_by-url
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-url" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/\"},{\"url\": \"https://reddit.com/r/datascience/comments/1cmnf0m/technical_interview_python_sql_problem_but_not/\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
url, URL (required)
Specifies the Reddit post URL to scrape.
Request examples:
"url": "https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/"
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-url" ^
-d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/\"},{\"url\": \"https://reddit.com/r/datascience/comments/1cmnf0m/technical_interview_python_sql_problem_but_not/\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
Reddit - Scrape posts by keyword
spider_id, Scraper tool (required)
Defines which scraper tool to use.
Request examples:
spider_id=reddit_posts_by-keywords
curl -X POST "https://scraperapi.thordata.com/builder" ^
-H "Authorization: Bearer Token-ID" ^
-H "Content-Type: application/x-www-form-urlencoded" ^
-d "spider_name=reddit.com" ^
-d "spider_id=reddit_posts_by-keywords" ^
-d "spider_parameters=[{\"keyword\": \"datascience\",\"date\": \"All time\",\"num_of_posts\": \"10\",\"sort_by\": \"Hot\"}]" ^
-d "spider_errors=true" ^
-d "file_name={{TasksID}}"
keyword, Keyword (required)
Specifies the search keyword for Reddit posts.
Request examples:
"keyword": "datascience"
date, Date (optional)
Specifies the time range of posts to scrape. Values include: All time, Past year, Past month, Past week, Today, Past hour.
Request examples:
"date": "All time"
num_of_posts, Maximum number of posts (optional)
Specifies the maximum number of posts to scrape.
Request examples:
"num_of_posts": "10"
sort_by, Sorting method (optional)
Specifies the sorting method for scraped posts. Values include: Relevance, Hot, Top, New, Comment count.
Request examples:
"sort_by": "Hot"
Reddit - Scrape posts by subreddit URL
spider_id, Scraper tool (required)
Defines which scraper tool to use.
Request examples:
spider_id=reddit_posts_by-subredditurl
url, Subreddit URL (required)
Specifies the subreddit URL to scrape Reddit posts from.
Request examples:
"url": "https://www.reddit.com/r/battlefield2042"
sort_by, Sorting method (optional)
Specifies the sorting method for scraped posts. Values include: Hot, Top, New, Rising.
Request examples:
"sort_by": "Hot"
num_of_posts, Maximum number of posts (optional)
Specifies the maximum number of posts to scrape.
Request examples:
"num_of_posts": "10"
sort_by_time, Time sorting (optional)
Specifies the time-based sorting method. Values include: Now, Today, This Week, This Month, This Year, All Time.
Request examples:
"sort_by_time": "All Time"
Product - Scraping Reddit Comment Information:
Reddit - Scrape comments by URL
spider_id, Scraper tool (required)
Defines which scraper tool to use.
Request examples:
spider_id=reddit_comment_by-url
url, URL (required)
Specifies the Reddit post or comment URL to scrape.
Request examples:
"url": "https://www.reddit.com/r/datascience/comments/1cmnf0m/comment/l32204i/?utm_source=share%26utm_medium=web3x%26utm_name=web3xcss%26utm_term=1%26utm_content=share_button"
days_back, Post age limit (optional)
Limits scraping to comments posted within the given number of days.
Request examples:
"days_back": "10"
load_all_replies, Load replies (optional)
Specifies whether to scrape replies to comments. When set to true, all comments and all of their replies are retrieved.
Values: true, false
Request examples:
"load_all_replies": "true"
comment_limit, Reply count limit (optional)
Specifies the maximum number of comments to return.
Request examples:
"comment_limit": "5"
If you need further assistance, please contact us via email at [email protected].