Reddit Scraping Parameters

Use Thordata’s Web Scraper API to configure Reddit scraping parameters, including URL, keyword, date, maximum number of posts, sorting method, subreddit URL, time sorting, post age limit, load replies, and reply count limit.

Unique identifier:

token, Access token (required)

This parameter is the API access token that authenticates the scraping request.

Request examples:

Authorization: Bearer ********************

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer ********************" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=reddit.com" ^
  -d "spider_id=reddit_posts_by-url" ^
  -d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/\"},{\"url\": \"https://reddit.com/r/datascience/comments/1cmnf0m/technical_interview_python_sql_problem_but_not/\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

Product - Scraping Reddit Post Information:

  1. Reddit - Scrape posts by URL

spider_id, Scraper tool (required)

Defines which scraper tool to use.

Request examples:

spider_id=reddit_posts_by-url

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=reddit.com" ^
  -d "spider_id=reddit_posts_by-url" ^
  -d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/\"},{\"url\": \"https://reddit.com/r/datascience/comments/1cmnf0m/technical_interview_python_sql_problem_but_not/\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

url, URL (required)

Specifies the Reddit post URL to scrape.

Request examples:

"url": "https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=reddit.com" ^
  -d "spider_id=reddit_posts_by-url" ^
  -d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/\"},{\"url\": \"https://reddit.com/r/datascience/comments/1cmnf0m/technical_interview_python_sql_problem_but_not/\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

  2. Reddit - Scrape posts by keyword

spider_id, Scraper tool (required)

Defines which scraper tool to use.

Request examples:

spider_id=reddit_posts_by-keywords

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=reddit.com" ^
  -d "spider_id=reddit_posts_by-keywords" ^
  -d "spider_parameters=[{\"keyword\": \"datascience\",\"date\": \"All time\",\"num_of_posts\": \"10\",\"sort_by\": \"Hot\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

keyword, Keyword (required)

Specifies the search keyword for Reddit posts.

Request examples:

"keyword": "datascience"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=reddit.com" ^
  -d "spider_id=reddit_posts_by-keywords" ^
  -d "spider_parameters=[{\"keyword\": \"datascience\",\"date\": \"All time\",\"num_of_posts\": \"10\",\"sort_by\": \"Hot\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

date, Date (optional)

Specifies the time range condition for scraping posts. Values include: All time, Past year, Past month, Past week, Today, Past hour.

Request examples:

"date": "All time"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=reddit.com" ^
  -d "spider_id=reddit_posts_by-keywords" ^
  -d "spider_parameters=[{\"keyword\": \"datascience\",\"date\": \"All time\",\"num_of_posts\": \"10\",\"sort_by\": \"Hot\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

num_of_posts, Maximum number of posts (optional)

Specifies the maximum number of posts to scrape.

Request examples:

"num_of_posts": "10"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=reddit.com" ^
  -d "spider_id=reddit_posts_by-keywords" ^
  -d "spider_parameters=[{\"keyword\": \"datascience\",\"date\": \"All time\",\"num_of_posts\": \"10\",\"sort_by\": \"Hot\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

sort_by, Sorting method (optional)

Specifies the sorting method for scraped posts. Values include: Relevance, Hot, Top, New, Comment count.

Request examples:

"sort_by": "Hot"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=reddit.com" ^
  -d "spider_id=reddit_posts_by-keywords" ^
  -d "spider_parameters=[{\"keyword\": \"datascience\",\"date\": \"All time\",\"num_of_posts\": \"10\",\"sort_by\": \"Hot\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

  3. Reddit - Scrape posts by subreddit URL

spider_id, Scraper tool (required)

Defines which scraper tool to use.

Request examples:

spider_id=reddit_posts_by-subredditurl

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=reddit.com" ^
  -d "spider_id=reddit_posts_by-subredditurl" ^
  -d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042\",\"sort_by\": \"Hot\",\"num_of_posts\": \"10\",\"sort_by_time\": \"All Time\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

url, Subreddit URL (required)

Specifies the subreddit URL to scrape Reddit posts from.

Request examples:

"url": "https://www.reddit.com/r/battlefield2042"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=reddit.com" ^
  -d "spider_id=reddit_posts_by-subredditurl" ^
  -d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042\",\"sort_by\": \"Hot\",\"num_of_posts\": \"10\",\"sort_by_time\": \"All Time\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

sort_by, Sorting method (optional)

Specifies the sorting method for scraped posts. Values include: Hot, Top, New, Rising.

Request examples:

"sort_by": "Hot"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=reddit.com" ^
  -d "spider_id=reddit_posts_by-subredditurl" ^
  -d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042\",\"sort_by\": \"Hot\",\"num_of_posts\": \"10\",\"sort_by_time\": \"All Time\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

num_of_posts, Maximum number of posts (optional)

Specifies the maximum number of posts to scrape.

Request examples:

"num_of_posts": "10"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=reddit.com" ^
  -d "spider_id=reddit_posts_by-subredditurl" ^
  -d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042\",\"sort_by\": \"Hot\",\"num_of_posts\": \"10\",\"sort_by_time\": \"All Time\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

sort_by_time, Time sorting (optional)

Specifies the time-based sorting method. Values include: Now, Today, This Week, This Month, This Year, All Time.

Request examples:

"sort_by_time": "All Time"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=reddit.com" ^
  -d "spider_id=reddit_posts_by-subredditurl" ^
  -d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042\",\"sort_by\": \"Hot\",\"num_of_posts\": \"10\",\"sort_by_time\": \"All Time\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

Product - Scraping Reddit Comment Information:

  1. Reddit - Scrape comments by URL

spider_id, Scraper tool (required)

Defines which scraper tool to use.

Request examples:

spider_id=reddit_comment_by-url

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=reddit.com" ^
  -d "spider_id=reddit_comment_by-url" ^
  -d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/datascience/comments/1cmnf0m/comment/l32204i/?utm_source=share%26utm_medium=web3x%26utm_name=web3xcss%26utm_term=1%26utm_content=share_button\",\"days_back\": \"10\",\"load_all_replies\": \"true\",\"comment_limit\": \"5\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

url, URL (required)

Specifies the Reddit post or comment URL to scrape.

Request examples:

"url": "https://www.reddit.com/r/datascience/comments/1cmnf0m/comment/l32204i/?utm_source=share%26utm_medium=web3x%26utm_name=web3xcss%26utm_term=1%26utm_content=share_button"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=reddit.com" ^
  -d "spider_id=reddit_comment_by-url" ^
  -d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/datascience/comments/1cmnf0m/comment/l32204i/?utm_source=share%26utm_medium=web3x%26utm_name=web3xcss%26utm_term=1%26utm_content=share_button\",\"days_back\": \"10\",\"load_all_replies\": \"true\",\"comment_limit\": \"5\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

days_back, Post age limit (optional)

Restricts scraping to comments posted within the given number of days.

Request examples:

"days_back": "10"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=reddit.com" ^
  -d "spider_id=reddit_comment_by-url" ^
  -d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/datascience/comments/1cmnf0m/comment/l32204i/?utm_source=share%26utm_medium=web3x%26utm_name=web3xcss%26utm_term=1%26utm_content=share_button\",\"days_back\": \"10\",\"load_all_replies\": \"true\",\"comment_limit\": \"5\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

load_all_replies, Load replies (optional)

Specifies whether to scrape the replies to comments. When set to true, all comments and all of their replies are retrieved. Values: true, false.

Request examples:

"load_all_replies": "true"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=reddit.com" ^
  -d "spider_id=reddit_comment_by-url" ^
  -d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/datascience/comments/1cmnf0m/comment/l32204i/?utm_source=share%26utm_medium=web3x%26utm_name=web3xcss%26utm_term=1%26utm_content=share_button\",\"days_back\": \"10\",\"load_all_replies\": \"true\",\"comment_limit\": \"5\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

comment_limit, Reply count limit (optional)

Specifies the maximum number of comments to return.

Request examples:

"comment_limit": "5"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=reddit.com" ^
  -d "spider_id=reddit_comment_by-url" ^
  -d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/datascience/comments/1cmnf0m/comment/l32204i/?utm_source=share%26utm_medium=web3x%26utm_name=web3xcss%26utm_term=1%26utm_content=share_button\",\"days_back\": \"10\",\"load_all_replies\": \"true\",\"comment_limit\": \"5\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

If you need further assistance, please contact us via email at [email protected].
