Reddit Scraping Parameters

Web Scraper API Reddit Scraping Parameters

Use Thordata’s Web Scraper API to configure Reddit scraping parameters, including URL, keyword, date, maximum number of posts, sorting method, subreddit URL, time sorting, post age limit, load replies, and reply count limit.

Unique identifier:

token, Access token (required)

The API access token. It authenticates the scraping request and is sent in the Authorization header.

Request examples:

Authorization: Bearer ********************

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer ********************" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=reddit.com" ^
  -d "spider_id=reddit_posts_by-url" ^
  -d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/\"},{\"url\": \"https://reddit.com/r/datascience/comments/1cmnf0m/technical_interview_python_sql_problem_but_not/\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"
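
For readers who call the API from a script rather than the Windows command line, the request above can be sketched in Python. This is a minimal sketch using only the standard library, not an official client: the helper name `build_request` and the `YOUR_TOKEN` placeholder are illustrative assumptions, and the request is built but not sent.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://scraperapi.thordata.com/builder"


def build_request(token, spider_id, spider_parameters, file_name="{{TasksID}}"):
    """Build (but do not send) a POST request mirroring the curl example.

    build_request is a hypothetical helper, not part of any Thordata SDK.
    """
    form = {
        "spider_name": "reddit.com",
        "spider_id": spider_id,
        # spider_parameters travels as a JSON array inside a single form field.
        "spider_parameters": json.dumps(spider_parameters),
        "spider_errors": "true",
        "file_name": file_name,
    }
    body = urllib.parse.urlencode(form).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


request = build_request(
    token="YOUR_TOKEN",  # placeholder; substitute your own access token
    spider_id="reddit_posts_by-url",
    spider_parameters=[
        {"url": "https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/"},
    ],
)
# To submit the task for real, pass the object to urllib.request.urlopen(request).
```

Keeping construction separate from sending makes it possible to inspect the headers and the encoded body before any task is submitted.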

Product - Scraping Reddit Post Information:

  1. Reddit - Scrape posts by URL

spider_id, Scraper tool (required)

Defines which scraper tool to use.

Request examples:

spider_id=reddit_posts_by-url

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=reddit.com" ^
  -d "spider_id=reddit_posts_by-url" ^
  -d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/\"},{\"url\": \"https://reddit.com/r/datascience/comments/1cmnf0m/technical_interview_python_sql_problem_but_not/\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

url, URL (required)

Specifies the Reddit post URL to scrape.

Request examples:

"url": "https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=reddit.com" ^
  -d "spider_id=reddit_posts_by-url" ^
  -d "spider_parameters=[{\"url\": \"https://www.reddit.com/r/battlefield2042/comments/1cmqs1d/official_update_on_the_next_battlefield_game/\"},{\"url\": \"https://reddit.com/r/datascience/comments/1cmnf0m/technical_interview_python_sql_problem_but_not/\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

  1. Reddit - Scrape posts by keyword

spider_id, Scraper tool (required)

Defines which scraper tool to use.

Request examples:

spider_id=reddit_posts_by-keywords

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=reddit.com" ^
  -d "spider_id=reddit_posts_by-keywords" ^
  -d "spider_parameters=[{\"keyword\": \"datascience\",\"date\": \"All time\",\"num_of_posts\": \"10\",\"sort_by\": \"Hot\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

keyword, Keyword (required)

Specifies the search keyword for Reddit posts.

Request examples:

"keyword": "datascience"

date, Date (optional)

Specifies the time range for scraping posts. Values include: All time, Past year, Past month, Past week, Today, Past hour.

Request examples:

"date": "All time"

num_of_posts, Maximum number of posts (optional)

Specifies the maximum number of posts to scrape.

Request examples:

"num_of_posts": "10"

sort_by, Sorting method (optional)

Specifies the sorting method for scraped posts. Values include: Relevance, Hot, Top, New, Comment count.

Request examples:

"sort_by": "Hot"

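The keyword-search parameters above can be assembled and checked before a request is made. The following is a minimal sketch under stated assumptions: `keyword_parameters` is a hypothetical helper, the allowed values are copied from the lists in this section, and leaving an optional key out of the object entirely is assumed to be acceptable because the parameter is marked optional.

```python
import json

# Allowed values as listed in this section for the keyword scraper.
DATE_VALUES = ("All time", "Past year", "Past month", "Past week", "Today", "Past hour")
SORT_VALUES = ("Relevance", "Hot", "Top", "New", "Comment count")


def keyword_parameters(keyword, date=None, num_of_posts=None, sort_by=None):
    """Return one spider_parameters entry for reddit_posts_by-keywords.

    Hypothetical helper for illustration; not part of any Thordata SDK.
    """
    if not keyword:
        raise ValueError("keyword is required")
    entry = {"keyword": keyword}
    if date is not None:
        if date not in DATE_VALUES:
            raise ValueError(f"date must be one of {DATE_VALUES}, got {date!r}")
        entry["date"] = date
    if num_of_posts is not None:
        # The documented examples pass the number as a string.
        entry["num_of_posts"] = str(int(num_of_posts))
    if sort_by is not None:
        if sort_by not in SORT_VALUES:
            raise ValueError(f"sort_by must be one of {SORT_VALUES}, got {sort_by!r}")
        entry["sort_by"] = sort_by
    return entry


params = [keyword_parameters("datascience", date="All time", num_of_posts=10, sort_by="Hot")]
spider_parameters = json.dumps(params)  # value for the spider_parameters form field
```

Validating locally catches a misspelled value such as "Past Year" before a task is submitted.
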
  1. Reddit - Scrape posts by subreddit URL

spider_id, Scraper tool (required)

Defines which scraper tool to use.

Request examples:

spider_id=reddit_posts_by-subredditurl

url, Subreddit URL (required)

Specifies the subreddit URL to scrape Reddit posts from.

Request examples:

"url": "https://www.reddit.com/r/battlefield2042"

sort_by, Sorting method (optional)

Specifies the sorting method for scraped posts. Values include: Hot, Top, New, Rising.

Request examples:

"sort_by": "Hot"

num_of_posts, Maximum number of posts (optional)

Specifies the maximum number of posts to scrape.

Request examples:

"num_of_posts": "10"

sort_by_time, Time sorting (optional)

Specifies the time-based sorting method. Values include: Now, Today, This Week, This Month, This Year, All Time.

Request examples:

"sort_by_time": "All Time"

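The subreddit scraper accepts a different set of values from the keyword scraper, so a table-driven check can be a convenient way to keep the two apart. This is a sketch under stated assumptions: `subreddit_parameters` and the `ALLOWED` table are hypothetical names, and the values are copied from the descriptions in this section.

```python
import json

# Allowed values copied from the subreddit parameter descriptions above.
ALLOWED = {
    "sort_by": ("Hot", "Top", "New", "Rising"),
    "sort_by_time": ("Now", "Today", "This Week", "This Month", "This Year", "All Time"),
}


def subreddit_parameters(url, **options):
    """Return one spider_parameters entry for reddit_posts_by-subredditurl.

    Hypothetical helper for illustration; not part of any Thordata SDK.
    """
    entry = {"url": url}
    for name, value in options.items():
        if name == "num_of_posts":
            # The documented examples pass the number as a string.
            entry[name] = str(int(value))
        elif name in ALLOWED:
            if value not in ALLOWED[name]:
                raise ValueError(f"{name} must be one of {ALLOWED[name]}, got {value!r}")
            entry[name] = value
        else:
            raise ValueError(f"unknown parameter: {name}")
    return entry


entry = subreddit_parameters(
    "https://www.reddit.com/r/battlefield2042",
    sort_by="Top",
    num_of_posts=10,
    sort_by_time="This Week",
)
spider_parameters = json.dumps([entry])  # value for the spider_parameters form field
```

Note that the capitalization differs between scrapers as documented here: the keyword scraper lists "All time" for date, while this scraper lists "All Time" for sort_by_time. The sketch uses each spelling exactly as listed.
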
Product - Scraping Reddit Comment Information:

  1. Reddit - Scrape comments by URL

spider_id, Scraper tool (required)

Defines which scraper tool to use.

Request examples:

spider_id=reddit_comment_by-url

url, URL (required)

Specifies the Reddit post or comment URL to scrape.

Request examples:

"url": "https://www.reddit.com/r/datascience/comments/1cmnf0m/comment/l32204i/?utm_source=share%26utm_medium=web3x%26utm_name=web3xcss%26utm_term=1%26utm_content=share_button"

days_back, Post age limit (optional)

Limits scraping to comments posted within the given number of days.

Request examples:

"days_back": "10"

load_all_replies, Load replies (optional)

Specifies whether to scrape the replies to comments. Setting it to true retrieves all comments and all of their replies. Values: true, false.

Request examples:

"load_all_replies": "true"

comment_limit, Reply count limit (optional)

Specifies the limit on the number of comments returned.

Request examples:

"comment_limit": "5"

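The comment parameters can be combined into one form body as sketched below. This is a sketch under stated assumptions, not a confirmed client: `comment_parameters` is a hypothetical helper, and the URL is written with plain `&` separators on the assumption that the `%26` sequences in the example above exist only to keep `&` from splitting the form fields. `urllib.parse.urlencode` applies that escaping itself.

```python
import json
import urllib.parse


def comment_parameters(url, days_back=None, load_all_replies=None, comment_limit=None):
    """Return one spider_parameters entry for reddit_comment_by-url.

    Hypothetical helper for illustration; not part of any Thordata SDK.
    """
    entry = {"url": url}
    if days_back is not None:
        entry["days_back"] = str(int(days_back))
    if load_all_replies is not None:
        # The documented example passes the flag as the lowercase string "true".
        entry["load_all_replies"] = "true" if load_all_replies else "false"
    if comment_limit is not None:
        entry["comment_limit"] = str(int(comment_limit))
    return entry


entry = comment_parameters(
    "https://www.reddit.com/r/datascience/comments/1cmnf0m/comment/l32204i/"
    "?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button",
    days_back=10,
    load_all_replies=True,
    comment_limit=5,
)

form = {
    "spider_name": "reddit.com",
    "spider_id": "reddit_comment_by-url",
    "spider_parameters": json.dumps([entry]),
}
# urlencode percent-escapes the "&" characters inside the URL, so they
# cannot be mistaken for separators between form fields.
body = urllib.parse.urlencode(form)
```

Decoding `body` with `urllib.parse.parse_qs` returns the original URL unchanged, which is a quick way to confirm the escaping round-trips.
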
If you need further assistance, please contact us via email at [email protected].
