Scrape Content from Links
Extract content from a list of links recursively
The Scrape Content from Links node extracts content from a list of links and can optionally follow the links found on those pages recursively, up to a configurable depth.
Overview
This node enables you to:
- Extract content from links
- Follow nested links
- Set crawl depth
- Filter content
- Handle pagination
Configuration
| Parameter | Type | Description |
|---|---|---|
| Start URLs | Array[String] | Initial URLs to scrape |
| Link Selector | String | CSS selector for finding links |
| Content Selectors | Object | Selectors for content extraction |
| Max Depth | Number | Maximum crawl depth |
| Follow Rules | Object | Rules for following links |
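The parameters in the table can be pictured as a single configuration object. The following is a minimal sketch in Python, not the node's actual schema: the class name `ScrapeLinksConfig`, the field names, the default values, and the keys inside `content_selectors` and `follow_rules` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ScrapeLinksConfig:
    """Hypothetical container mirroring the documented parameters."""

    start_urls: list[str]                       # Start URLs
    link_selector: str = "a[href]"              # Link Selector (assumed default)
    content_selectors: dict[str, str] = field(default_factory=dict)  # Content Selectors
    max_depth: int = 1                          # Max Depth (assumed default)
    follow_rules: dict[str, object] = field(default_factory=dict)    # Follow Rules

    def __post_init__(self) -> None:
        # Basic sanity checks; the real node may validate differently.
        if not self.start_urls:
            raise ValueError("start_urls must contain at least one URL")
        if self.max_depth < 0:
            raise ValueError("max_depth must be non-negative")


# Example values only; the selectors and rule keys are placeholders.
config = ScrapeLinksConfig(
    start_urls=["https://example.com/blog"],
    link_selector="a.post-link",
    content_selectors={"title": "h1", "body": "article"},
    max_depth=2,
    follow_rules={"same_domain": True},
)
```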
Example Usage
Basic Link Scraping
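As a rough illustration of what basic link scraping involves, the sketch below crawls breadth-first from a set of start URLs and stops following links once the maximum depth is reached. It is not the node's implementation. The names `crawl`, `LinkCollector`, and `SITE` are hypothetical, the link selector is simplified to "every `<a href>`", and an in-memory dictionary stands in for real HTTP requests so the example runs offline.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags (a stand-in for a CSS link selector)."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl(start_urls, fetch, max_depth):
    """Breadth-first crawl; returns {url: html} for every page reached."""
    seen = set(start_urls)
    queue = deque((url, 0) for url in start_urls)
    pages = {}
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:          # treat a missing page as a broken link
            continue
        pages[url] = html
        if depth >= max_depth:    # do not follow links beyond the depth limit
            continue
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            absolute = urljoin(url, href)   # resolve relative links
            if absolute not in seen:        # skip pages already queued
                seen.add(absolute)
                queue.append((absolute, depth + 1))
    return pages


# In-memory "site" so the sketch runs without network access.
SITE = {
    "https://example.com/": '<a href="/a">A</a> <a href="/b">B</a>',
    "https://example.com/a": '<a href="/c">C</a>',
    "https://example.com/b": '<a href="/">home</a>',
    "https://example.com/c": "<p>leaf</p>",
}

pages = crawl(["https://example.com/"], SITE.get, max_depth=1)
print(sorted(pages))
```

With `max_depth=1` the start page and the two pages it links to are fetched, while `/c` (two hops away) is not. The `seen` set also keeps the link from `/b` back to the home page from causing a loop.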
Advanced Configuration
Link Following Rules
Pattern Matching
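One plausible way to express follow rules is as include and exclude patterns checked against each candidate URL. The sketch below assumes a rules object with the keys `same_domain`, `include`, and `exclude`; these key names and the function `should_follow` are hypothetical and may not match the node's actual Follow Rules format.

```python
import re
from urllib.parse import urlparse


def should_follow(url, source_url, rules):
    """Return True if `url` passes the (hypothetical) follow rules."""
    target, source = urlparse(url), urlparse(source_url)
    if target.scheme not in ("http", "https"):
        return False              # skip mailto:, javascript:, and similar links
    if rules.get("same_domain") and target.netloc != source.netloc:
        return False
    # Exclude patterns win over include patterns.
    if any(re.search(p, url) for p in rules.get("exclude", [])):
        return False
    include = rules.get("include", [])
    # With no include patterns, every remaining URL is allowed.
    return not include or any(re.search(p, url) for p in include)


rules = {
    "same_domain": True,
    "include": [r"/blog/"],
    "exclude": [r"\.pdf$", r"/tag/"],
}

print(should_follow("https://example.com/blog/post-1", "https://example.com/", rules))
```

Checking exclusions before inclusions is a design choice in this sketch: it lets a broad include pattern such as `/blog/` coexist with narrow exceptions such as tag pages or PDF downloads.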
Content Extraction
Nested Content
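Nested content usually means a repeated container element whose child elements become the fields of one record. The sketch below illustrates the idea with Python's standard `html.parser`, assuming each `<article>` holds an `<h2>` title and one or more `<p>` paragraphs. The class `ArticleExtractor`, the record shape, and the sample markup are assumptions; the node itself is configured with Content Selectors rather than a hand-written parser.

```python
from html.parser import HTMLParser


class ArticleExtractor(HTMLParser):
    """Builds one record per <article>, with nested <h2> and <p> text."""

    def __init__(self) -> None:
        super().__init__()
        self.records: list[dict] = []
        self._current = None   # record for the <article> being parsed
        self._field = None     # "title" or "paragraph" while inside h2 / p
        self._buffer: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self._current = {"title": "", "paragraphs": []}
        elif self._current is not None and tag in ("h2", "p"):
            self._field = "title" if tag == "h2" else "paragraph"
            self._buffer = []

    def handle_data(self, data):
        # Only keep text that belongs to a field we are currently inside.
        if self._field is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag in ("h2", "p") and self._field is not None:
            text = " ".join("".join(self._buffer).split())
            if self._field == "title":
                self._current["title"] = text
            else:
                self._current["paragraphs"].append(text)
            self._field = None
        elif tag == "article" and self._current is not None:
            self.records.append(self._current)
            self._current = None


HTML = """
<article><h2>First post</h2><p>Intro text.</p><p>More <b>bold</b> text.</p></article>
<article><h2>Second post</h2><p>Only paragraph.</p></article>
"""

extractor = ArticleExtractor()
extractor.feed(HTML)
for record in extractor.records:
    print(record["title"], len(record["paragraphs"]))
```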
Data Processing
Content Transformations
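Scraped values typically need light cleanup before they are useful downstream. The following sketch shows a few common transformations: decoding HTML entities, collapsing whitespace, resolving relative URLs, and deriving a word count. The function names and the record fields (`title`, `body`, `url`) are hypothetical and not part of the node's documented output.

```python
import html
import re
from urllib.parse import urljoin


def clean_text(raw):
    """Decode HTML entities and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", html.unescape(raw)).strip()


def transform_record(record, base_url):
    """Apply simple, illustrative transformations to one scraped record."""
    body = clean_text(record.get("body", ""))
    return {
        "title": clean_text(record.get("title", "")),
        "body": body,
        # Resolve relative links against the page they were found on.
        "url": urljoin(base_url, record.get("url", "")),
        "word_count": len(body.split()),
    }


raw = {
    "title": "  Hello &amp; welcome\n",
    "body": "Line one.\n\n  Line   two.",
    "url": "/posts/1",
}
result = transform_record(raw, "https://example.com/blog/")
print(result)
```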
Error Handling
Common issues to watch for:
- Broken links
- Invalid content structure
- Rate limiting
- Depth limits
- Circular references
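Two of the issues above, rate limiting and circular references, lend themselves to small generic helpers. The sketch below retries a fetch with exponential backoff and tracks visited URLs in normalized form. It is an illustration under stated assumptions, not the node's behavior: `RateLimited`, `fetch_with_retry`, `normalize`, and `mark_visited` are hypothetical names, and treating URLs that differ only by a fragment or trailing slash as the same page is a simplification that does not hold for every site.

```python
import time
from urllib.parse import urldefrag


class RateLimited(Exception):
    """Hypothetical signal that the server asked us to slow down (e.g. HTTP 429)."""


def fetch_with_retry(fetch, url, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), backing off exponentially when it raises RateLimited."""
    for attempt in range(retries + 1):
        try:
            return fetch(url)
        except RateLimited:
            if attempt == retries:
                raise                          # give up after the last retry
            sleep(base_delay * 2 ** attempt)   # 1s, 2s, 4s, ...


def normalize(url):
    """Drop the #fragment and any trailing slash so equivalent URLs compare equal."""
    return urldefrag(url).url.rstrip("/")


def mark_visited(url, visited):
    """Return True the first time a URL is seen; False on a revisit (a cycle)."""
    key = normalize(url)
    if key in visited:
        return False
    visited.add(key)
    return True
```

Passing `sleep` as a parameter keeps the helper testable without real delays. In a crawl loop, `mark_visited` would be checked before a URL is queued so that pages linking back to each other are fetched only once.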