Query Parameters

Universal Scraping API Query Parameters

Learn about the query parameters of Thordata's Universal Scraping API.

token: This parameter is the API token used to authenticate your crawling requests. A valid token is required for a request to succeed.

Parameter    Name     Function
token        Token    API token for verification during crawling

url: This parameter is the URL of the target page to crawl. The default value is google.com. You can replace it with another link, such as a different search engine.

Parameter    Name    Function
url          URL     Target website URL for scraping

Example Request:

Example with token: Token and url: https://www.google.com

curl -X POST https://universalapi.thordata.com/request \
 -H "Authorization: Bearer Token" \
 -H "Content-Type: application/x-www-form-urlencoded" \
 -d "url=https://www.google.com" \
 -d "type=html" \
 -d "js_render=True"
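
The same request can be assembled in Python. The following is a minimal sketch using only the standard library. The helper name build_request and the placeholder YOUR_TOKEN are illustrative assumptions, not part of Thordata's API; only the endpoint, the Bearer header, and the parameter names come from the curl command above. The sketch builds the request object without sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint taken from the curl example above.
API_ENDPOINT = "https://universalapi.thordata.com/request"


def build_request(token: str, url: str, **params: str) -> Request:
    """Build (but do not send) a POST request for the scraping API.

    `build_request` is a hypothetical helper written for this example.
    """
    form = {"url": url, **params}
    # urlencode percent-encodes values for application/x-www-form-urlencoded.
    body = urlencode(form).encode("ascii")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    return Request(API_ENDPOINT, data=body, headers=headers, method="POST")


req = build_request("YOUR_TOKEN", "https://www.google.com",
                    type="html", js_render="True")
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

To actually send the request, pass req to urllib.request.urlopen. Note that urlencode percent-encodes the target URL, whereas curl -d sends it verbatim; a server that parses form bodies in the standard way should treat both the same, but that is an assumption about this API.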

js_render, JS Rendering (Optional)

JS rendering handles dynamically loaded content and Single-Page Applications (SPAs). It supports pages with more complex interaction and rendering requirements, so enabling it is recommended.

Example Request:

Example with js_render: True

curl -X POST https://universalapi.thordata.com/request \
 -H "Authorization: Bearer Token" \
 -H "Content-Type: application/x-www-form-urlencoded" \
 -d "url=https://www.google.com" \
 -d "type=html" \
 -d "js_render=True"

type, Format (Optional)

This parameter sets the output format of the crawling results. The options are html and png, and the default is html.

Example Request:

Example with type: png

curl -X POST https://universalapi.thordata.com/request \
 -H "Authorization: Bearer Token" \
 -H "Content-Type: application/x-www-form-urlencoded" \
 -d "url=https://www.google.com" \
 -d "type=png" \
 -d "js_render=True"

header, Header (Optional)

When enabled, the output includes the request header information.

Example Request:

Example with header: True

curl -X POST https://universalapi.thordata.com/request \
 -H "Authorization: Bearer token" \
 -H "Content-Type: application/x-www-form-urlencoded" \
 -d "url=https://www.google.com" \
 -d "type=html" \
 -d "js_render=True" \
 -d "header=True"

block_resources, Block (Optional)

This parameter blocks the loading of unnecessary resources, which speeds up crawling.

Example Request:

Example with block_resources: script

curl -X POST https://universalapi.thordata.com/request \
 -H "Authorization: Bearer Token" \
 -H "Content-Type: application/x-www-form-urlencoded" \
 -d "url=https://www.google.com" \
 -d "type=html" \
 -d "js_render=True" \
 -d "block_resources=script"

country, Proxies (Optional)

This parameter sets the country/region of the proxy used for crawling. By default, no proxy is used.

Example Request:

Example with country: al

curl -X POST https://universalapi.thordata.com/request \
 -H "Authorization: Bearer Token" \
 -H "Content-Type: application/x-www-form-urlencoded" \
 -d "url=https://www.google.com" \
 -d "type=html" \
 -d "js_render=True" \
 -d "country=al"

clean_content, Remove (Optional)

This parameter removes JS and CSS content from the returned results.

Example Request:

Example with clean_content: js,css

curl -X POST https://universalapi.thordata.com/request \
 -H "Authorization: Bearer token" \
 -H "Content-Type: application/x-www-form-urlencoded" \
 -d "url=https://www.google.com" \
 -d "type=html" \
 -d "js_render=True" \
 -d "clean_content=js,css" \
 -d "header=False"
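
The clean_content value in the request above is a comma-separated list. As a minimal sketch, the hypothetical helper below (clean_content_value is not part of the API) builds that value from individual names. It assumes that js and css are the only accepted values, since those are the only ones this section names, and that the lowercase form used in the curl example is the expected one.

```python
# The only values named in this section; treating them as the full set
# of accepted values is an assumption.
ALLOWED_KINDS = ("js", "css")


def clean_content_value(*kinds: str) -> str:
    """Return a comma-separated clean_content value such as 'js,css'."""
    normalized = []
    for kind in kinds:
        name = kind.strip().lower()
        if name not in ALLOWED_KINDS:
            raise ValueError(f"unsupported clean_content value: {kind!r}")
        if name not in normalized:  # drop duplicates, keep first-seen order
            normalized.append(name)
    return ",".join(normalized)


print(clean_content_value("JS", "CSS"))
```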

waitfor, Wait for (in ms) (Optional)

The time to wait for the page to load its content, in milliseconds. Maximum value: 100,000 milliseconds (100 seconds).

Example Request:

Example with waitfor: 1000

curl -X POST https://universalapi.thordata.com/request \
 -H "Authorization: Bearer token" \
 -H "Content-Type: application/x-www-form-urlencoded" \
 -d "url=https://www.google.com" \
 -d "type=html" \
 -d "js_render=True" \
 -d "header=False" \
 -d "waitfor=1000"
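
Because waitfor is expressed in milliseconds and capped at 100,000, it can help to validate the value before sending it. The sketch below converts seconds to a waitfor string and rejects out-of-range values. The helper name waitfor_value is hypothetical, and the lower bound of 0 is an assumption; only the maximum is stated in this section.

```python
MAX_WAITFOR_MS = 100_000  # maximum stated in this section (100 seconds)


def waitfor_value(seconds: float) -> str:
    """Convert a wait time in seconds to a waitfor value in milliseconds."""
    ms = round(seconds * 1000)
    # The lower bound of 0 is an assumption; the docs only give the maximum.
    if not 0 <= ms <= MAX_WAITFOR_MS:
        raise ValueError(f"waitfor must be between 0 and {MAX_WAITFOR_MS} ms, got {ms}")
    return str(ms)


print(waitfor_value(1))    # the 1000 ms used in the example above
print(waitfor_value(2.5))
```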

headers, Custom Headers (Optional)

Pass custom request headers to the target website. Each HTTP header is a key-value pair separated by a colon (:). Separate multiple headers with a comma (,). The value must be sent in JSON format.

Example Request:

To send the User-Agent and Content-Type headers, format the request as follows:

curl -X POST https://universalapi.thordata.com/request \
 -H "Authorization: Bearer token" \
 -H "Content-Type: application/x-www-form-urlencoded" \
 -d "url=https://www.google.com" \
 -d "type=html" \
 -d "js_render=True" \
 -d "header=False" \
 -d 'headers={"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36", "Content-Type": "application/json"}'
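
Quoting a JSON value by hand inside a shell command is error-prone, so it can be easier to generate it. The sketch below serializes a dictionary with json.dumps and form-encodes it together with the other parameters. It assumes the headers parameter expects a JSON object, which is how the "JSON format" requirement above is read here; the variable names are illustrative.

```python
import json
from urllib.parse import urlencode

# Custom headers to forward to the target website (values from the example above).
custom_headers = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36"
    ),
    "Content-Type": "application/json",
}

# Assumption: the API accepts a JSON object as the value of `headers`.
headers_value = json.dumps(custom_headers)

# urlencode escapes quotes, spaces, and semicolons so the body stays valid.
body = urlencode({
    "url": "https://www.google.com",
    "type": "html",
    "js_render": "True",
    "header": "False",
    "headers": headers_value,
})
print(body)
```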

cookies, Cookies (Optional)

Pass custom cookies to the target website. Separate multiple cookies with a semicolon (;).

A cookie is a small piece of data that a website stores on a user's device through the web browser. Cookies allow websites to retain user information, such as login status, preferences, or tracking details, thereby improving and personalizing the browsing experience.

When making a request, you can add cookies in two ways:

  1. In the headers parameter: Send them as part of the Cookie header in the following format: "Cookie": "cookie_name_1=cookie_value_1; cookie_name_2=cookie_value_2"

  2. Using the dedicated cookies parameter: Pass them directly in this format: cookie_name_1=cookie_value_1; cookie_name_2=cookie_value_2

Example Request:

Example with the dedicated cookies parameter

curl -X POST https://universalapi.thordata.com/request \
 -H "Authorization: Bearer token" \
 -H "Content-Type: application/x-www-form-urlencoded" \
 -d "url=https://www.google.com" \
 -d "type=html" \
 -d "js_render=True" \
 -d "header=False" \
 -d "cookies=cookie_name_1=cookie_value_1; cookie_name_2=cookie_value_2"
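
The two ways of passing cookies described above can be sketched in Python as follows. The helper cookie_string is hypothetical, and wrapping the Cookie header in a JSON object assumes that the headers parameter takes a JSON object.

```python
import json


def cookie_string(cookies: dict) -> str:
    """Join name/value pairs as 'name_1=value_1; name_2=value_2'."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


cookies = {
    "cookie_name_1": "cookie_value_1",
    "cookie_name_2": "cookie_value_2",
}
cookie_value = cookie_string(cookies)

# Option 1: send the cookies as a Cookie header inside the headers parameter.
# Assumption: headers accepts a JSON object.
form_via_headers = {"headers": json.dumps({"Cookie": cookie_value})}

# Option 2: send the cookies through the dedicated cookies parameter.
form_via_cookies = {"cookies": cookie_value}

print(form_via_headers)
print(form_via_cookies)
```

Either dictionary can be merged with the other form fields (url, type, and so on) before the request body is encoded.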

For assistance, contact [email protected].
