The Website Scraper action extracts data from websites, including emails and social network URLs such as LinkedIn and Twitter. It is particularly useful for gathering contact information and social media links from a specified domain.

Inputs

website
string
required

The URL of the website to scrape. Ensure it is a valid URL.
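As a rough illustration of what "a valid URL" can mean in practice, the sketch below checks inputs before they are submitted. The function name `is_valid_website` is hypothetical, and the rule it applies (an http or https scheme plus a host name) is an assumption; the action's own validation rules are not documented here.

```python
from urllib.parse import urlsplit


def is_valid_website(url: str) -> bool:
    """Return True if url has an http(s) scheme and a host name.

    Assumption: only absolute http/https URLs are accepted.
    """
    try:
        parts = urlsplit(url.strip())
    except ValueError:
        # urlsplit raises ValueError on some malformed inputs
        return False
    return parts.scheme in ("http", "https") and bool(parts.hostname)
```

A bare domain such as `example.com` fails this check because it has no scheme, so it may be worth prefixing `https://` to such rows before submitting them.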

Parameters

get_imgs
checkbox
default: false

When enabled, collects the URLs of all images requested while the website loads.

timeout
number
default: 30000

Number of milliseconds before timing out.
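To make the defaults concrete, here is a minimal sketch that collects the input and the two parameters into a plain dictionary. The helper `build_scraper_config` and the `inputs`/`parameters` layout are hypothetical and simply mirror the headings on this page; they are not the documented request format.

```python
def build_scraper_config(website: str, get_imgs: bool = False, timeout: int = 30000) -> dict:
    """Bundle the documented input and parameters with their defaults.

    Assumption: the dictionary layout is illustrative only.
    """
    if timeout <= 0:
        raise ValueError("timeout must be a positive number of milliseconds")
    return {
        "inputs": {"website": website},
        "parameters": {"get_imgs": get_imgs, "timeout": timeout},
    }


# The timeout is expressed in milliseconds, so 45 seconds is 45 * 1000.
config = build_scraper_config("https://example.com", get_imgs=True, timeout=45 * 1000)
```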

Output Fields

The action returns the data fields extracted from the website, such as emails and social network URLs.

Below is the JSON schema for the output fields:

Specificities

This action navigates only to the root URL (home) of the provided domain.
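Because only the home page is visited, it can help to see which page a given input corresponds to. The sketch below reduces any URL to its root; the helper name `root_url` is hypothetical, and whether the action normalizes inputs in exactly this way is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit


def root_url(url: str) -> str:
    """Drop the path, query string, and fragment, keeping scheme and host."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, "/", "", ""))


print(root_url("https://example.com/about/team?x=1#top"))  # prints https://example.com/
```

Data that only appears on deeper pages, such as a contact page, would therefore not be picked up.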

For optimal performance, keep your input CSV file to 1000 rows or fewer.

1. Batch Processing

This action processes inputs in batches of 500. If you provide 1000 inputs, it will process the first 500 and queue the remaining 500 for later processing.
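The batching behavior described above can be sketched as follows. This is an illustration of splitting inputs into groups of 500, not the platform's actual queueing code; `batches` and `BATCH_SIZE` are hypothetical names.

```python
from typing import Iterator, List, Sequence

BATCH_SIZE = 500  # batch size stated in this documentation


def batches(items: Sequence[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


inputs = [f"https://example{i}.com" for i in range(1000)]
chunks = list(batches(inputs))
print([len(chunk) for chunk in chunks])  # prints [500, 500]
```

With 1000 inputs, the first chunk corresponds to the batch processed immediately and the second to the batch queued for later.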

2. Metadata Tracking

When enriching a list of domains with LinkedIn URLs, consider adding an extra column (e.g., “index_id”) to your data as metadata. This helps you track which domains have been successfully enriched.
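One possible way to add such a column before uploading is sketched below with Python's standard `csv` module. The helper `add_index_id` is hypothetical, the `website` header in the usage example is an assumption about your file, and the sketch assumes the file does not already contain a column with the chosen name.

```python
import csv
import io


def add_index_id(csv_text: str, column: str = "index_id") -> str:
    """Return the CSV text with a 1-based index column prepended to each row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = [column] + list(reader.fieldnames or [])
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for index, row in enumerate(reader, start=1):
        writer.writerow({column: index, **row})
    return out.getvalue()


source = "website\nhttps://a.com\nhttps://b.com\n"
print(add_index_id(source))
```

After the run, comparing the `index_id` values in the results against the original file shows which rows came back enriched and which did not.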