
【Smart Mode】How to set up Paging | Web Scraping Tool | ScrapeStorm

2021-07-15 16:37:10

Abstract: This article shows you how to set up paging in Smart Mode.

In Smart Mode, ScrapeStorm automatically detects the paging type. The paging types are usually the following:

(1) Next Page Button

(2) Scroll to Load

(3) Scroll to Load + Next Page Button

(4) None

 

However, the detection result is sometimes incorrect. The usual reasons are:

(1) The page loads too slowly, so the next page button has not yet appeared when automatic detection runs.

(2) The page contains multiple next page buttons, and the software chooses only one of them.

(3) When both Scroll to Load and Next Page Button apply, the next page button still has not appeared after the software has automatically scrolled several times.

(4) The next page button on the current page is temporarily incompatible with the software.

 

The settings menu for “Paging” is shown in the following figure.

(1) Next Page Button

ⅰ: Auto Detect

Click on the “Auto Detect” option.

The software automatically detects the next page button. Once detection succeeds, the page scrolls to the position of the next page button.

ⅱ: Select Button

If the software does not automatically detect the next page button, you need to select it manually with “Select Button”.

Step 1: Click on the “Select Button” option

Step 2: Click the next page button on the page

ⅲ: Edit XPath

If neither of the two methods above correctly detects the next page button, you need to edit the XPath.
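As a rough illustration of what such an XPath can look like, the sketch below runs two expressions against a small, made-up page fragment using Python's standard library. The markup, the `id` and `class` values, and the variable names are all assumptions for the example; a real page will need an expression written against its own structure. It also shows the "multiple next page buttons" case: a broad expression matches both buttons, while a narrower one pins down a single button.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup with a pager above and below the list.
# Real pages will use different ids, classes and structure.
SAMPLE = """
<div id="content">
  <div id="pager-top"><a class="next" href="/list?page=3">Next</a></div>
  <ul><li>Item 1</li><li>Item 2</li></ul>
  <div id="pager-bottom">
    <a class="prev" href="/list?page=1">Previous</a>
    <a class="next" href="/list?page=3">Next</a>
  </div>
</div>
"""

root = ET.fromstring(SAMPLE)

# Broad expression: matches every link marked as "next", so it is ambiguous.
broad = root.findall(".//a[@class='next']")

# Narrow expression: only the "next" link inside the bottom pager.
narrow = root.findall(".//div[@id='pager-bottom']/a[@class='next']")

print(len(broad), len(narrow), narrow[0].get("href"))
```

Note that `xml.etree.ElementTree` supports only a subset of XPath, so the expressions here are kept deliberately simple. Selecting by attribute rather than by position tends to keep working when the number of page links changes.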

 

(2) Scroll to Load

Suitable for web pages that have no next page button and load more content as you scroll.

 

(3) Scroll to Load + Next Page Button

Suitable for pages where the next page button is not displayed at first and appears only after the page has been scrolled several times.

It also applies when the next page button is already displayed but part of the current page's content is not, so the page must be scrolled several times to display all of the content.

This type of paging is harder to detect. Although the software tries to scroll automatically during detection, the number of scrolls it performs may not match the number the current web page requires. Therefore, this type of paging usually requires some manual steps.
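The mismatch between scroll counts can be pictured with a small simulation. This is only a sketch of the general idea, not how ScrapeStorm works internally: `FakePage`, `detect_next_button`, and the scroll counts are hypothetical names and numbers chosen for the example.

```python
class FakePage:
    """Hypothetical stand-in for a page whose next page button
    only appears after a fixed number of scrolls."""

    def __init__(self, scrolls_needed):
        self.scrolls_needed = scrolls_needed
        self.scrolls_done = 0

    def scroll(self):
        self.scrolls_done += 1

    def has_next_button(self):
        return self.scrolls_done >= self.scrolls_needed


def detect_next_button(page, max_scrolls):
    """Scroll at most max_scrolls times; report whether the button appeared."""
    for _ in range(max_scrolls):
        if page.has_next_button():
            return True
        page.scroll()
    return page.has_next_button()


# The detector gives up after 3 scrolls, but this page needs 5.
found_short = detect_next_button(FakePage(scrolls_needed=5), max_scrolls=3)

# With a large enough scroll budget the button is found.
found_long = detect_next_button(FakePage(scrolls_needed=5), max_scrolls=10)

print(found_short, found_long)
```

When the scroll budget is smaller than what the page needs, detection fails even though the button exists, which is why scrolling the page manually before detecting can help.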

The two common situations are:

ⅰ: “Scroll to Load” was identified, but the next page button was not identified.

Step 1: Scroll the page manually until the next page button appears on the page

Step 2: Select “Auto Detect”.

If automatic detection fails, select “Select Button” and click the next page button on the page.

ⅱ: The next page button was identified, but “Scroll to Load” was not identified.

In this case, just select “Scroll to Load”.

P.S. If the current web page does not need “Scroll to Load” but the software identifies it anyway, the scraping result is not affected. However, deselecting the “Scroll to Load” option can improve the scraping speed.

(4) None

If you do not need paging, select “None”.

P.S. Whether to set up paging does not depend on whether the current web page has a next page button; it depends only on your scraping requirements. Not setting up paging narrows the scraping scope and improves the scraping speed.
