Query Parameters

Universal Scraping API Query Parameters

Learn about the request parameters of Thordata's Universal Scraping API.

token: This parameter defines the API token that authenticates your scraping requests. The examples on this page send it in the Authorization: Bearer header.

Parameter | Name | Function
token | Token | API token for verification during crawling

url: This parameter defines the URL of the target website to scrape, with a default value of google.com. You can replace it with the URL of another site, such as a different search engine.

Parameter | Name | Function
url | URL | Target website URL for scraping

Example Request:

Example with token set to Token and url set to https://www.google.com:

curl -X POST https://universalapi.thordata.com/request \
 -H "Authorization: Bearer Token" \
 -H "Content-Type: application/x-www-form-urlencoded" \
 -d "url=https://www.google.com" \
 -d "type=html" \
 -d "js_render=True"
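The same request can be assembled in Python. The following is a minimal sketch using only the standard library; the helper name build_request is hypothetical, and the endpoint, header names, and form fields are taken from the curl example above. It builds the request without sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint copied from the curl example.
API_ENDPOINT = "https://universalapi.thordata.com/request"


def build_request(token: str, url: str, **params: str) -> Request:
    """Build (but do not send) a POST request mirroring the curl example.

    build_request is an illustrative helper, not part of any Thordata SDK.
    """
    form = {"url": url, **params}
    # urlencode percent-encodes the values; curl -d sends them as written.
    body = urlencode(form).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    return Request(API_ENDPOINT, data=body, headers=headers, method="POST")


# "Token" is the placeholder used in the curl example; substitute a real token.
req = build_request("Token", "https://www.google.com", type="html", js_render="True")
print(req.get_method(), req.full_url)
print(req.data.decode())
```

To actually send the request, pass req to urllib.request.urlopen and read the response body.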

js_render, JS Rendering (Optional)

JS rendering handles dynamically loaded content and single-page applications (SPAs). It supports pages with more complex interactions and rendering requirements, so enabling it is recommended.

Example Request:

Example with js_render set to True:

curl -X POST https://universalapi.thordata.com/request \
 -H "Authorization: Bearer Token" \
 -H "Content-Type: application/x-www-form-urlencoded" \
 -d "url=https://www.google.com" \
 -d "type=html" \
 -d "js_render=True"

type, Format (Optional)

This parameter defines the output format of the scraping result. The available formats are HTML (html) and PNG (png); the default is HTML.

Example Request:

Example with type set to png:

curl -X POST https://universalapi.thordata.com/request \
 -H "Authorization: Bearer Token" \
 -H "Content-Type: application/x-www-form-urlencoded" \
 -d "url=https://www.google.com" \
 -d "type=png" \
 -d "js_render=True"

block_resources, Block (Optional)

This parameter blocks the loading of unnecessary resources, which speeds up scraping.

Example Request:

Example with block_resources set to script:

curl -X POST https://universalapi.thordata.com/request \
 -H "Authorization: Bearer Token" \
 -H "Content-Type: application/x-www-form-urlencoded" \
 -d "url=https://www.google.com" \
 -d "type=html" \
 -d "js_render=True" \
 -d "block_resources=script"

country, Proxies (Optional)

This parameter defines the country/region of the proxy used for scraping. By default, no proxy is used.

Example Request:

Example with country set to al:

curl -X POST https://universalapi.thordata.com/request \
 -H "Authorization: Bearer Token" \
 -H "Content-Type: application/x-www-form-urlencoded" \
 -d "url=https://www.google.com" \
 -d "type=html" \
 -d "js_render=True" \
 -d "country=al"

clean_content, Remove (Optional)

This parameter is used to remove JS and CSS content from the returned results.

Example Request:

Example with clean_content set to js,css:

curl -X POST https://universalapi.thordata.com/request \
 -H "Authorization: Bearer token" \
 -H "Content-Type: application/x-www-form-urlencoded" \
 -d "url=https://www.google.com" \
 -d "type=html" \
 -d "js_render=True" \
 -d "clean_content=js,css" \
 -d "header=False"

wait, Wait for (ms) (Optional)

Waits the specified time (in milliseconds) for the page to load its content. Maximum value: 100,000 ms.

Example Request:

Example with wait set to 10000:

curl -X POST https://universalapi.thordata.com/request \
 -H "Authorization: Bearer token" \
 -H "Content-Type: application/x-www-form-urlencoded" \
 -d "url=https://www.google.com" \
 -d "type=html" \
 -d "js_render=True" \
 -d "wait=10000"

wait_for, Wait for selector (Optional)

Waits for an element matching the CSS selector to load in the DOM. If both wait_for and wait are set, wait_for takes precedence and the fixed wait time is ignored. The maximum wait time is 30 seconds; on timeout, the content is returned automatically.

Example Request:

Example with wait_for set to .content (wait is also sent here, but wait_for takes precedence):

curl -X POST https://universalapi.thordata.com/request \
 -H "Authorization: Bearer token" \
 -H "Content-Type: application/x-www-form-urlencoded" \
 -d "url=https://www.google.com" \
 -d "type=html" \
 -d "js_render=True" \
 -d "wait=10000" \
 -d "wait_for=.content"
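The interaction between wait and wait_for can be sketched as a small client-side helper. This is an illustrative assumption about how a caller might prepare the fields, not server behavior confirmed by Thordata: the function name wait_params is hypothetical, the lower bound of 0 is assumed, and only the 100,000 ms maximum and the precedence rule come from the text above.

```python
MAX_WAIT_MS = 100_000  # maximum value stated for the wait parameter


def wait_params(wait_ms=None, wait_for=None):
    """Return the form fields to send for the wait options.

    Hypothetical helper: drops wait when wait_for is given, because the
    documentation says wait_for overrides the fixed wait time.
    """
    if wait_for:
        return {"wait_for": wait_for}
    if wait_ms is None:
        return {}
    # The lower bound of 0 is an assumption; only the maximum is documented.
    if not 0 <= wait_ms <= MAX_WAIT_MS:
        raise ValueError(f"wait must be between 0 and {MAX_WAIT_MS} ms, got {wait_ms}")
    return {"wait": str(wait_ms)}


print(wait_params(wait_ms=10000))
print(wait_params(wait_ms=10000, wait_for=".content"))
```

Dropping wait on the client side only avoids sending a value that would be ignored; sending both, as the curl example does, is described as valid.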

headers, Custom Headers (Optional)

Pass custom request headers to the target website. An HTTP header is a key-value pair separated by a colon (:). Send the parameter in JSON format, and separate multiple headers with a comma (,).

Example request:

To send headers such as User-Agent and Content-Type, replace the name1 and value1 placeholders in the following request with each header's name and value:

curl -X POST https://universalapi.thordata.com/request \
 -H "Authorization: Bearer token" \
 -H "Content-Type: application/x-www-form-urlencoded" \
 -d "url=https://www.google.com" \
 -d "type=html" \
 -d "js_render=True" \
 -d 'headers=[{"name":"name1","value":"value1"}]' \
 -d 'cookies=[{"name":"name2","value":"value2"}]'
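Writing the JSON by hand is error-prone, so the headers value can be generated instead. The sketch below assumes the list-of-objects shape shown in the curl example (each entry has a name and a value key); the helper name headers_field and the sample header values are hypothetical.

```python
import json
from urllib.parse import urlencode


def headers_field(headers: dict) -> str:
    """Serialize a {name: value} mapping into the JSON list used in the curl example.

    Hypothetical helper; the name/value object shape is copied from the example.
    """
    items = [{"name": name, "value": value} for name, value in headers.items()]
    # Compact separators reproduce the formatting of the curl example.
    return json.dumps(items, separators=(",", ":"))


# Sample values are placeholders chosen for illustration only.
field = headers_field({"User-Agent": "example-client/1.0", "Content-Type": "text/html"})
form_body = urlencode(
    {"url": "https://www.google.com", "type": "html", "js_render": "True", "headers": field}
)
print(field)
print(form_body)
```

The resulting form_body is the percent-encoded request body to send with the application/x-www-form-urlencoded content type.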

cookies, Cookies (Optional)

Pass custom cookies to the target website. Separate multiple cookies with a semicolon (;).

A cookie is a small piece of data that a website stores on a user's device through the web browser. Cookies let websites retain user information, such as login status, preferences, or tracking details, to improve and personalize the browsing experience.

When making a request, you can add cookies in two ways:

  1. In the headers parameter: send them as part of the Cookie header, for example headers: {"name":"Cookie","value":"cookie_name_1=cookie_value_1"}

  2. Using the dedicated cookies parameter: pass them directly, for example {"Cookie":"cookie_name_1=cookie_value_1"}

Example Request:

Example using the dedicated cookies parameter. Written as a single string, multiple cookies take the form cookie_name_1=cookie_value_1; cookie_name_2=cookie_value_2. The request below passes a cookie named name2 as a JSON list of name/value objects.

curl -X POST https://universalapi.thordata.com/request \
 -H "Authorization: Bearer token" \
 -H "Content-Type: application/x-www-form-urlencoded" \
 -d "url=https://www.google.com" \
 -d "type=html" \
 -d "js_render=True" \
 -d 'headers=[{"name":"name1","value":"value1"}]' \
 -d 'cookies=[{"name":"name2","value":"value2"}]'
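A cookie string in the semicolon-separated form can be converted into the JSON list used by the request above. This is a sketch under the assumption that the cookies parameter accepts the same name/value object list as the curl example; the helper name cookies_field is hypothetical.

```python
import json


def cookies_field(cookie_string: str) -> str:
    """Convert 'a=1; b=2' into the JSON list of name/value objects.

    Hypothetical helper; the target shape is copied from the curl example.
    """
    items = []
    for pair in cookie_string.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # skip empty segments such as a trailing semicolon
        # Split on the first '=' only, since cookie values may contain '='.
        name, _, value = pair.partition("=")
        items.append({"name": name.strip(), "value": value.strip()})
    return json.dumps(items, separators=(",", ":"))


print(cookies_field("cookie_name_1=cookie_value_1; cookie_name_2=cookie_value_2"))
```

The returned string can be sent as the value of the cookies form field.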

For assistance, contact [email protected].
