Scrape API User Guide

This document is intended for developers. It describes how to connect to the API quickly and integrate its data efficiently.

API Name

Amazon Page Scraping API

API Description

This API scrapes any page from the Amazon front end. It supports scraping with a specified postal code, so the returned page data matches what Amazon presents to consumers in that location. Results are returned asynchronously: you need to deploy a simple HTTP service to receive the data, and we push the scraping results to that service via an HTTP request. At the end of this document, you will find a Java Spring Boot-based receiving service for reference.

Request URL

http://scrape.pangolinfo.com/api/task/receive/v1

Request Method

POST

Parameters

Query Parameters

Parameter Name    Parameter Type    Description
token             String            User authentication information. Contact the administrator to obtain it.

Request Body

{
    "url": "https://www.amazon.com/s?k=baby", // Required, the Amazon page URL to scrape
    "callbackUrl": "http://xxx/xxx", // Required, the developer's service address for receiving data (the page data is pushed to this address after a successful scrape)
    "proxySession": "0502f0d18e034e72bd14b026a3964f54", // Optional, a 32-character UUID that pins scraping to a particular IP; the IP is kept for the current day and expires after midnight
    "callbackHeaders": "k1:v1|k2:v2", // Optional, data to include in the request headers of the callback; make sure the values are correctly encoded
    "bizContext": { // Optional
        "zipcode": "90001" // Optional, Amazon postal code; this example is the postal code for Los Angeles, USA
    }
}
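
The request body can also be assembled programmatically. The following is a minimal Python sketch, not an official client. The helper names (`encode_callback_headers`, `new_proxy_session`, `build_task_payload`) are hypothetical, percent-encoding the header values is only one reading of "correctly encoded", and rejecting postal codes outside the supported list in this guide is a client-side convenience rather than documented API behavior.

```python
import uuid
from typing import Dict, Optional
from urllib.parse import quote

# Postal codes copied from the supported list in this guide, grouped by country.
SUPPORTED_ZIPCODES = {
    "United States": {"10041", "90001", "60601", "84104"},
    "Germany": {"80331", "10115", "20095", "60306"},
    "United Kingdom": {"W1S 3AS", "EH15 1LR", "M13 9PL", "M2 5BQ"},
    "Japan": {"100-0004", "060-8588", "163-8001", "900-8570"},
    "France": {"75000", "69001", "06000", "13000"},
    "Italy": {"20019", "50121", "00042", "30100"},
    "Spain": {"41001", "28001", "08001", "46001"},
    "Canada": {"M4C 4Y4", "V6E 1N2", "H3G 2K8", "T2R 0G5"},
}


def encode_callback_headers(headers: Dict[str, str]) -> str:
    """Join headers into the "k1:v1|k2:v2" form used by callbackHeaders.

    Assumption: values are percent-encoded so that ':' and '|' inside a
    value cannot be mistaken for the separators.
    """
    return "|".join(f"{key}:{quote(value, safe='')}" for key, value in headers.items())


def new_proxy_session() -> str:
    """Return a 32-character hexadecimal UUID for the proxySession field."""
    return uuid.uuid4().hex


def build_task_payload(
    url: str,
    callback_url: str,
    zipcode: Optional[str] = None,
    proxy_session: Optional[str] = None,
    callback_headers: Optional[Dict[str, str]] = None,
) -> dict:
    """Build the JSON request body, adding optional fields only when given."""
    if not url or not callback_url:
        raise ValueError("url and callbackUrl are required")
    payload: dict = {"url": url, "callbackUrl": callback_url}
    if proxy_session is not None:
        if len(proxy_session) != 32:
            raise ValueError("proxySession must be a 32-character UUID")
        payload["proxySession"] = proxy_session
    if callback_headers:
        payload["callbackHeaders"] = encode_callback_headers(callback_headers)
    if zipcode is not None:
        if not any(zipcode in codes for codes in SUPPORTED_ZIPCODES.values()):
            raise ValueError(f"unsupported postal code: {zipcode}")
        payload["bizContext"] = {"zipcode": zipcode}
    return payload
```

Reusing the value returned by `new_proxy_session()` across several requests on the same day should keep them on the same IP, according to the `proxySession` description.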

				
			

Note: The following postal codes are currently supported:

United States:
"10041", "90001", "60601", "84104"

Germany:
"80331", "10115", "20095", "60306"

United Kingdom:
"W1S 3AS", "EH15 1LR", "M13 9PL", "M2 5BQ"

Japan:
"100-0004", "060-8588", "163-8001", "900-8570"

France:
"75000", "69001", "06000", "13000"

Italy:
"20019", "50121", "00042", "30100"

Spain:
"41001", "28001", "08001", "46001"

Canada:
"M4C 4Y4", "V6E 1N2", "H3G 2K8", "T2R 0G5"
				
			

Response Parameters

{
    "code": 0, // System status code
    "message": "ok",
    "data": {
        "data": "57b049c3fdf24e309043f28139b44d05", // Returns the spider task ID; this ID along with the page data will be pushed to the receiving service upon successful scraping
        "bizCode": 0, // Business status code
        "bizMsg": "ok" // Business status message
    }
}
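
A caller usually needs the task ID from this response in order to match it with the data pushed later. The following Python sketch shows one way to read it. The names `SubmitResult` and `parse_submit_response` are hypothetical, and the sketch assumes the real response is plain JSON without the explanatory `//` comments shown in this guide.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class SubmitResult:
    code: int                  # system status code
    message: str
    task_id: Optional[str]     # spider task ID, taken from data.data
    biz_code: Optional[int]    # business status code
    biz_msg: Optional[str]     # business status message


def parse_submit_response(body: str) -> SubmitResult:
    """Parse the JSON text returned when a scraping task is submitted."""
    doc = json.loads(body)
    data = doc.get("data") or {}  # tolerate a missing or null "data" object
    return SubmitResult(
        code=doc["code"],
        message=doc.get("message", ""),
        task_id=data.get("data"),
        biz_code=data.get("bizCode"),
        biz_msg=data.get("bizMsg"),
    )


sample = (
    '{"code": 0, "message": "ok", "data": {'
    '"data": "57b049c3fdf24e309043f28139b44d05", "bizCode": 0, "bizMsg": "ok"}}'
)
result = parse_submit_response(sample)
print(result.task_id)  # prints 57b049c3fdf24e309043f28139b44d05
```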

				
			

Error Codes

1001

  • Meaning: Parameter is empty / Parameter is incorrect
  • Solution: Check if the request parameters are correct

1004

  • Meaning: Access denied / Token is incorrect / Exceeded trial limit
  • Solution: Please check the Token
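
The error codes above can be turned into descriptive exceptions on the client side. This is a minimal Python sketch under two assumptions that this guide does not state explicitly: that a status code of 0 means success (as in the sample response), and that the caller passes in whichever status field carries the error code. `ScrapeApiError` and `raise_for_code` are hypothetical names.

```python
from typing import Dict, Tuple

# (meaning, suggested fix) for each error code listed in this guide.
ERROR_CODES: Dict[int, Tuple[str, str]] = {
    1001: ("Parameter is empty / Parameter is incorrect",
           "Check if the request parameters are correct"),
    1004: ("Access denied / Token is incorrect / Exceeded trial limit",
           "Please check the Token"),
}


class ScrapeApiError(Exception):
    """Raised when the API reports a non-zero status code."""

    def __init__(self, code: int, meaning: str, solution: str) -> None:
        super().__init__(f"[{code}] {meaning}. {solution}")
        self.code = code
        self.meaning = meaning
        self.solution = solution


def raise_for_code(code: int) -> None:
    """Do nothing for code 0 (assumed success); raise ScrapeApiError otherwise."""
    if code == 0:
        return
    # Codes that this guide does not list get a generic placeholder description.
    meaning, solution = ERROR_CODES.get(
        code, ("Error code not listed in this guide", "Contact the administrator")
    )
    raise ScrapeApiError(code, meaning, solution)
```

Calling `raise_for_code(1004)` raises a `ScrapeApiError` whose message points at the token, which tends to be easier to act on than a bare number in a log.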

Example Request

Curl Example

# Request
curl --location 'http://scrape.pangolinfo.com/api/task/receive/v1?token=xxx' \
--header 'Content-Type: application/json' \
--data '{
  "url": "https://www.amazon.com/s?k=baby",
  "callbackUrl": "http://***.***.***.***/callback/data",
  "bizContext": {
    "zipcode": "90001"
  }
}'

# Response
{
    "code": 0, // System status code
    "message": "ok",
    "data": {
        "data": "57b049c3fdf24e309043f28139b44d05", // Returns the spider task ID; this ID along with the page data will be pushed to the receiving service upon successful scraping
        "bizCode": 0, // Business status code
        "bizMsg": "ok" // Business status message
    }
}

				
			

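Java – OkHttp Example

// Request
import okhttp3.MediaType;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.RequestBody;
import okhttp3.Response;

OkHttpClient client = new OkHttpClient.Builder()
  .build();
MediaType mediaType = MediaType.parse("application/json");
// JSON request body: page URL to scrape, callback address, and optional postal code
RequestBody body = RequestBody.create(mediaType, "{\"url\":\"https://www.amazon.com/s?k=baby\",\"callbackUrl\":\"http://***.***.***.***/callback/data\",\"bizContext\":{\"zipcode\":\"90001\"}}");
Request request = new Request.Builder()
  .url("http://scrape.pangolinfo.com/api/task/receive/v1?token=xxx")
  .method("POST", body)
  .addHeader("Content-Type", "application/json")
  .build();
// execute() sends the request synchronously and throws IOException on network failure
Response response = client.newCall(request).execute();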

// Response
{
    "code": 0, // System status code
    "message": "ok",
    "data": {
        "data": "57b049c3fdf24e309043f28139b44d05", // Returns the spider task ID; this ID along with the page data will be pushed to the receiving service upon successful scraping
        "bizCode": 0, // Business status code
        "bizMsg": "ok" // Business status message
    }
}

				
			

Python – Requests Example

				
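# Request
import requests
import json

# The token is passed as a query parameter
url = "http://scrape.pangolinfo.com/api/task/receive/v1?token=xxx"

# JSON request body: page URL to scrape, callback address, and optional postal code
payload = json.dumps({
  "url": "https://www.amazon.com/s?k=baby",
  "callbackUrl": "http://***.***.***.***/callback/data",
  "bizContext": {
    "zipcode": "90001"
  }
})
headers = {
  'Content-Type': 'application/json'
}

# A timeout keeps the call from hanging indefinitely if the service is unreachable
response = requests.post(url, headers=headers, data=payload, timeout=30)

print(response.text)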

# Response
{
    "code": 0, // System status code
    "message": "ok",
    "data": {
        "data": "57b049c3fdf24e309043f28139b44d05", // Returns the spider task ID; this ID along with the page data will be pushed to the receiving service upon successful scraping
        "bizCode": 0, // Business status code
        "bizMsg": "ok" // Business status message
    }
}

				
			

Receiving Service Example

Java Spring Boot Project
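
The Java project itself is not reproduced in this text. To illustrate what a receiving service has to do, here is a minimal Python sketch that uses only the standard library. It is not the official receiver. It assumes the results are pushed as HTTP POST requests, stores each request body as raw bytes because the push payload format is not described in this guide, and answers with a plain 200 response. The names `CallbackHandler`, `received_bodies`, and `start_receiver` are hypothetical.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from typing import List

# Raw bodies of every callback received so far (kept in memory for the sketch).
received_bodies: List[bytes] = []


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Read exactly Content-Length bytes; the payload format is not assumed.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        received_bodies.append(body)

        # Acknowledge receipt. The expected acknowledgement is an assumption.
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, format: str, *args) -> None:
        pass  # silence the default per-request logging


def start_receiver(host: str = "127.0.0.1", port: int = 0) -> ThreadingHTTPServer:
    """Start the receiver in a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer((host, port), CallbackHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

To use something like this, the service would need to listen on an address reachable from the internet (for example `start_receiver("0.0.0.0", 8080)`), and that address would be supplied as `callbackUrl` in the request body. Any headers configured through `callbackHeaders` are available on `self.headers` inside the handler.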

Need help?

We are devoted to your success. Don't hesitate to contact us with any questions!

Our team of experts is committed to helping you troubleshoot and fix any issue that you might experience with our products.

If you want to file a bug report or need technical assistance, email our support team or consult the technical documentation.
