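Yelp Scraping Parameters

Web Scraper API Yelp Scraping Parameters

Configure Yelp scraping parameters with Thordata's Web Scraper API, including the search URL, business URL, category, location, and other parameters.

Unique identifier:

token: Access token (required)

This parameter is the API access token. It authorizes the scraping request.

Request example: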

Authorization: Bearer ********************

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer ********************" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=yelp.com" ^
  -d "spider_id=yelp_business-overview_by-search-filters" ^
  -d "spider_parameters=[{\"category\": \"cafe\",\"location\": \"new york\",\"business_page_turning\": \"1\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"
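
The same request can be assembled outside of curl. The following is a minimal Python sketch, assuming only the endpoint, headers, and form fields shown in the curl example above; `build_request` and `YOUR_TOKEN` are illustrative names, and the request is built but not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

API_URL = "https://scraperapi.thordata.com/builder"  # endpoint from the curl example


def build_request(token: str, spider_id: str, parameters: list) -> Request:
    """Build (but do not send) the POST request shown in the curl example."""
    form = {
        "spider_name": "yelp.com",
        "spider_id": spider_id,
        # spider_parameters is a JSON array serialized into a single form field.
        "spider_parameters": json.dumps(parameters),
        "spider_errors": "true",
        "file_name": "{{TasksID}}",
    }
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    # urlencode() percent-encodes every value for the form body.
    data = urlencode(form).encode("utf-8")
    return Request(API_URL, data=data, headers=headers, method="POST")


req = build_request(
    "YOUR_TOKEN",  # placeholder, not a real token
    "yelp_business-overview_by-search-filters",
    [{"category": "cafe", "location": "new york", "business_page_turning": "1"}],
)
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```

Passing `req` to `urllib.request.urlopen` would send it; that step is left out here because it needs a valid token.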

Product - Scrape business information

1. Yelp - Scrape business information by search filters

spider_id: Scraper (required)

Defines which scraper to use.

Request example:

spider_id=yelp_business-overview_by-search-filters

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=yelp.com" ^
  -d "spider_id=yelp_business-overview_by-search-filters" ^
  -d "spider_parameters=[{\"category\": \"cafe\",\"location\": \"new york\",\"business_page_turning\": \"1\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

category: Category (required)

This parameter specifies the business category to scrape.

Request example:

"category": "cafe"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=yelp.com" ^
  -d "spider_id=yelp_business-overview_by-search-filters" ^
  -d "spider_parameters=[{\"category\": \"cafe\",\"location\": \"new york\",\"business_page_turning\": \"1\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

location: Location (required)

This parameter specifies the location to scrape.

Request example:

"location": "new york"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=yelp.com" ^
  -d "spider_id=yelp_business-overview_by-search-filters" ^
  -d "spider_parameters=[{\"category\": \"cafe\",\"location\": \"new york\",\"business_page_turning\": \"1\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"
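
business_page_turning: Maximum business listing pages (optional)

This parameter sets the maximum number of business listing pages to collect.

Request example: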

"business_page_turning": "1"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=yelp.com" ^
  -d "spider_id=yelp_business-overview_by-search-filters" ^
  -d "spider_parameters=[{\"category\": \"cafe\",\"location\": \"new york\",\"business_page_turning\": \"1\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"
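
The three search-filter parameters above combine into one object inside the `spider_parameters` array. A minimal Python sketch of that assembly follows; the helper name `search_filter_params` is hypothetical, and sending page counts as strings simply mirrors the examples above rather than a documented requirement.

```python
import json


def search_filter_params(category, location, max_business_pages=None):
    """Build one spider_parameters entry for the search-filters scraper."""
    if not category or not location:
        raise ValueError("category and location are required")
    params = {"category": category, "location": location}
    # business_page_turning is optional, so it is only added when given.
    if max_business_pages is not None:
        params["business_page_turning"] = str(max_business_pages)
    return params


# The form field carries a JSON array, so the entry is wrapped in a list.
payload = json.dumps([search_filter_params("cafe", "new york", 1)])
print(payload)
```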

2. Yelp - Scrape business information by search URL

spider_id: Scraper (required)

Defines which scraper to use.

Request example:

spider_id=yelp_business-overview_by-search-url

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=yelp.com" ^
  -d "spider_id=yelp_business-overview_by-search-url" ^
  -d "spider_parameters=[{\"search_url\": \"https://www.yelp.com/search?find_desc=Cafes%26find_loc=Stowe%2C%2BVT\",\"business_page_turning\": \"1\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

search_url: Search URL (required)

This parameter specifies the search URL to scrape.

Request example:

"search_url": "https://www.yelp.com/search?find_desc=Cafes%26find_loc=Stowe%2C%2BVT"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=yelp.com" ^
  -d "spider_id=yelp_business-overview_by-search-url" ^
  -d "spider_parameters=[{\"search_url\": \"https://www.yelp.com/search?find_desc=Cafes%26find_loc=Stowe%2C%2BVT\",\"business_page_turning\": \"1\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

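business_page_turning: Maximum business listing pages (optional)

This parameter sets the maximum number of business listing pages to collect.

Request example: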

"business_page_turning": "1"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=yelp.com" ^
  -d "spider_id=yelp_business-overview_by-search-url" ^
  -d "spider_parameters=[{\"search_url\": \"https://www.yelp.com/search?find_desc=Cafes%26find_loc=Stowe%2C%2BVT\",\"business_page_turning\": \"1\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"
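
In the examples above, the `search_url` value contains `%26`, `%2C`, and `%2B` because curl's `-d` sends the form body as written, so characters such as `&` must be percent-encoded by hand. The Python sketch below illustrates this, assuming the server applies standard `application/x-www-form-urlencoded` decoding (not confirmed by this page).

```python
import json
from urllib.parse import parse_qs, unquote_plus, urlencode

# Value exactly as it appears in the curl example (already percent-encoded).
encoded_in_doc = "https://www.yelp.com/search?find_desc=Cafes%26find_loc=Stowe%2C%2BVT"

# Standard form decoding turns %26 into "&", %2C into "," and %2B into "+".
search_url = unquote_plus(encoded_in_doc)
print(search_url)

# Going the other way: urlencode() escapes the whole JSON value in one step,
# so no manual percent-encoding of the URL is needed.
parameters = [{"search_url": search_url, "business_page_turning": "1"}]
body = urlencode({"spider_parameters": json.dumps(parameters)})

# Round trip: decoding the body gives back the original parameters.
round_tripped = json.loads(parse_qs(body)["spider_parameters"][0])
print(round_tripped == parameters)
```

With an HTTP client that encodes form fields itself, the plain URL can be passed in; the hand-encoded form is only needed when the body is written literally, as with `curl -d`.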

3. Yelp - Scrape business information by business URL

spider_id: Scraper (required)

Defines which scraper to use.

Request example:

spider_id=yelp_business-overview_by-business-url

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=yelp.com" ^
  -d "spider_id=yelp_business-overview_by-business-url" ^
  -d "spider_parameters=[{\"business_url\": \"https://www.yelp.com/biz/the-round-hearth-caf%C3%A9-and-marketplace-stowe?osq=Sandwiches\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"


business_url: Business URL (required)

This parameter specifies the business URL to scrape.

Request example:

"business_url": "https://www.yelp.com/biz/the-round-hearth-caf%C3%A9-and-marketplace-stowe?osq=Sandwiches"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=yelp.com" ^
  -d "spider_id=yelp_business-overview_by-business-url" ^
  -d "spider_parameters=[{\"business_url\": \"https://www.yelp.com/biz/the-round-hearth-caf%C3%A9-and-marketplace-stowe?osq=Sandwiches\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"
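
Because `business_url` must point at a single business page rather than a search page, a client-side check before submitting a task can catch mix-ups early. The sketch below is a hypothetical helper, not part of the API; it assumes business pages live under `https://www.yelp.com/biz/`, as in the example above, and the API's own validation rules may differ.

```python
from urllib.parse import urlsplit


def is_yelp_business_url(url: str) -> bool:
    """Return True if the URL looks like a Yelp business page (assumed shape)."""
    parts = urlsplit(url)
    return (
        parts.scheme == "https"
        and parts.netloc == "www.yelp.com"
        # A business page has a non-empty slug after /biz/.
        and parts.path.startswith("/biz/")
        and len(parts.path) > len("/biz/")
    )


business_url = (
    "https://www.yelp.com/biz/"
    "the-round-hearth-caf%C3%A9-and-marketplace-stowe?osq=Sandwiches"
)
print(is_yelp_business_url(business_url))
print(is_yelp_business_url("https://www.yelp.com/search?find_desc=Cafes"))
```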

Product - Scrape business reviews

1. Yelp - Scrape business reviews by search filters

spider_id: Scraper (required)

Defines which scraper to use.

Request example:

spider_id=yelp_business-reviews_by-search-filters

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=yelp.com" ^
  -d "spider_id=yelp_business-reviews_by-search-filters" ^
  -d "spider_parameters=[{\"category\": \"cafe\",\"location\": \"Stowe, VT\",\"business_page_turning\": \"1\",\"review_page_turning\": \"1\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

category: Category (required)

This parameter specifies the business category to scrape.

Request example:

"category": "cafe"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=yelp.com" ^
  -d "spider_id=yelp_business-reviews_by-search-filters" ^
  -d "spider_parameters=[{\"category\": \"cafe\",\"location\": \"Stowe, VT\",\"business_page_turning\": \"1\",\"review_page_turning\": \"1\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

location: Location (required)

This parameter restricts scraping to businesses in the given location.

Request example:

"location": "Stowe, VT"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=yelp.com" ^
  -d "spider_id=yelp_business-reviews_by-search-filters" ^
  -d "spider_parameters=[{\"category\": \"cafe\",\"location\": \"Stowe, VT\",\"business_page_turning\": \"1\",\"review_page_turning\": \"1\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

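business_page_turning: Maximum business listing pages (optional)

This parameter sets the maximum number of business listing pages to collect.

Request example: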

"business_page_turning": "1"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=yelp.com" ^
  -d "spider_id=yelp_business-reviews_by-search-filters" ^
  -d "spider_parameters=[{\"category\": \"cafe\",\"location\": \"Stowe, VT\",\"business_page_turning\": \"1\",\"review_page_turning\": \"1\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

review_page_turning: Maximum review pages (optional)

This parameter sets the maximum number of review pages to scrape.

Request example:

"review_page_turning": "1"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=yelp.com" ^
  -d "spider_id=yelp_business-reviews_by-search-filters" ^
  -d "spider_parameters=[{\"category\": \"cafe\",\"location\": \"Stowe, VT\",\"business_page_turning\": \"1\",\"review_page_turning\": \"1\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"
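
The review scraper takes the same search filters plus a second page limit for reviews. The following Python sketch builds such an entry; the helper names are hypothetical, and the positive-integer check is an assumption about sensible input rather than a documented API rule.

```python
import json


def _page_count(value):
    """Convert a page limit to the string form used in the examples."""
    count = int(value)
    if count < 1:
        raise ValueError("page limits must be at least 1")
    return str(count)


def review_search_filter_params(category, location,
                                max_business_pages=None, max_review_pages=None):
    """Build one spider_parameters entry for reviews by search filters."""
    params = {"category": category, "location": location}
    # Both page limits are optional and independent of each other.
    if max_business_pages is not None:
        params["business_page_turning"] = _page_count(max_business_pages)
    if max_review_pages is not None:
        params["review_page_turning"] = _page_count(max_review_pages)
    return params


entry = review_search_filter_params("cafe", "Stowe, VT", 1, 1)
print(json.dumps([entry]))
```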

2. Yelp - Scrape business reviews by search URL

spider_id: Scraper (required)

Defines which scraper to use.

Request example:

spider_id=yelp_business-overview_by-search-url

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=yelp.com" ^
  -d "spider_id=yelp_business-overview_by-search-url" ^
  -d "spider_parameters=[{\"search_url\": \"https://www.yelp.com/search?find_desc=Cafes%26find_loc=Stowe%2C%2BVT\",\"business_page_turning\": \"1\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"


search_url: Search URL (required)

This parameter specifies the search URL to scrape.

Request example:

"search_url": "https://www.yelp.com/search?find_desc=Cafes%26find_loc=Stowe%2C%2BVT"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=yelp.com" ^
  -d "spider_id=yelp_business-overview_by-search-url" ^
  -d "spider_parameters=[{\"search_url\": \"https://www.yelp.com/search?find_desc=Cafes%26find_loc=Stowe%2C%2BVT\",\"business_page_turning\": \"1\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

3. Yelp - Scrape business reviews by business URL

spider_id: Scraper (required)

Defines which scraper to use.

Request example:

spider_id=yelp_business-reviews_by-business-url

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=yelp.com" ^
  -d "spider_id=yelp_business-reviews_by-business-url" ^
  -d "spider_parameters=[{\"business_url\": \"https://www.yelp.com/biz/the-round-hearth-caf%C3%A9-and-marketplace-stowe?osq=Sandwiches\",\"unrecommended_reviews\": \"yes\",\"sort_by\": \"DATE_DESC\",\"review_page_turning\": \"1\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"


business_url: Business URL (required)

This parameter specifies the business URL to scrape.

Request example:

"business_url": "https://www.yelp.com/biz/the-round-hearth-caf%C3%A9-and-marketplace-stowe?osq=Sandwiches"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=yelp.com" ^
  -d "spider_id=yelp_business-reviews_by-business-url" ^
  -d "spider_parameters=[{\"business_url\": \"https://www.yelp.com/biz/the-round-hearth-caf%C3%A9-and-marketplace-stowe?osq=Sandwiches\",\"unrecommended_reviews\": \"yes\",\"sort_by\": \"DATE_DESC\",\"review_page_turning\": \"1\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"
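
unrecommended_reviews: Not-recommended reviews (required)

This parameter specifies whether to scrape reviews that Yelp marks as not recommended.

Request example: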

"unrecommended_reviews": "yes"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=yelp.com" ^
  -d "spider_id=yelp_business-reviews_by-business-url" ^
  -d "spider_parameters=[{\"business_url\": \"https://www.yelp.com/biz/the-round-hearth-caf%C3%A9-and-marketplace-stowe?osq=Sandwiches\",\"unrecommended_reviews\": \"yes\",\"sort_by\": \"DATE_DESC\",\"review_page_turning\": \"1\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

sort_by: Sort order (optional)

This parameter specifies how the scraped reviews are sorted.

Request example:

"sort_by": "DATE_DESC"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=yelp.com" ^
  -d "spider_id=yelp_business-reviews_by-business-url" ^
  -d "spider_parameters=[{\"business_url\": \"https://www.yelp.com/biz/the-round-hearth-caf%C3%A9-and-marketplace-stowe?osq=Sandwiches\",\"unrecommended_reviews\": \"yes\",\"sort_by\": \"DATE_DESC\",\"review_page_turning\": \"1\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"

review_page_turning: Maximum review pages (optional)

This parameter sets the maximum number of review pages to collect.

Request example:

"review_page_turning": "1"

curl -X POST "https://scraperapi.thordata.com/builder" ^
  -H "Authorization: Bearer Token-ID" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -d "spider_name=yelp.com" ^
  -d "spider_id=yelp_business-reviews_by-business-url" ^
  -d "spider_parameters=[{\"business_url\": \"https://www.yelp.com/biz/the-round-hearth-caf%C3%A9-and-marketplace-stowe?osq=Sandwiches\",\"unrecommended_reviews\": \"yes\",\"sort_by\": \"DATE_DESC\",\"review_page_turning\": \"1\"}]" ^
  -d "spider_errors=true" ^
  -d "file_name={{TasksID}}"
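
The four parameters for scraping reviews by business URL can be assembled as in the Python sketch below. The helper name is hypothetical. Two assumptions are worth flagging: `"no"` is assumed to be the counterpart of the documented `"yes"` for `unrecommended_reviews`, and `sort_by` is passed through unchecked because `DATE_DESC` is the only value shown on this page.

```python
import json


def business_review_params(business_url, include_unrecommended,
                           sort_by=None, max_review_pages=None):
    """Build one spider_parameters entry for reviews by business URL."""
    params = {
        "business_url": business_url,
        # "yes" comes from the examples; "no" is an assumed counterpart.
        "unrecommended_reviews": "yes" if include_unrecommended else "no",
    }
    # Optional fields are only included when a value is supplied.
    if sort_by is not None:
        params["sort_by"] = sort_by
    if max_review_pages is not None:
        params["review_page_turning"] = str(max_review_pages)
    return params


entry = business_review_params(
    "https://www.yelp.com/biz/the-round-hearth-caf%C3%A9-and-marketplace-stowe?osq=Sandwiches",
    include_unrecommended=True,
    sort_by="DATE_DESC",
    max_review_pages=1,
)
print(json.dumps([entry], indent=2))
```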

If you need further assistance, contact us by email at [email protected]
