The Extract Cards node helps you extract repeated card-style elements from web pages.
Overview
This node enables you to:
- Extract repeated elements
- Parse card structures
- Handle dynamic loading
- Process grid layouts
- Extract media content
Configuration
Parameter | Type | Description |
---|---|---|
URL | String | Target webpage URL |
Card Selector | String | CSS selector for card elements |
Fields | Object | Mapping of card fields to selectors |
Pagination | Object | Pagination configuration |
Example Usage
{
  "url": "https://example.com/products",
  "card_selector": ".product-card",
  "fields": {
    "title": ".card-title",
    "price": ".price",
    "image": "img.product-image",
    "description": ".description"
  }
}
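One way to picture how `card_selector` and `fields` work together: the card selector picks out each repeated element, and each field selector is then looked up inside that card only. The sketch below illustrates that idea and is not the node's implementation. It assumes field selectors are resolved relative to each card; `find_by_class` and `extract_cards` are hypothetical helpers, the markup is made up, and only bare `.class` selectors are handled.

```python
import xml.etree.ElementTree as ET

# Well-formed sample markup standing in for a fetched page (made up for this sketch).
PAGE = """
<html><body>
  <div class="product-card">
    <h2 class="card-title">Blue Mug</h2>
    <span class="price">$12.50</span>
  </div>
  <div class="product-card">
    <h2 class="card-title">Red Mug</h2>
    <span class="price">$9.00</span>
  </div>
</body></html>
"""


def find_by_class(root, selector):
    """Match a bare `.class` selector only; full CSS needs a real selector engine."""
    wanted = selector.lstrip(".")
    return [el for el in root.iter() if wanted in el.get("class", "").split()]


def extract_cards(markup, card_selector, fields):
    """Return one dict per card, resolving each field selector inside that card."""
    root = ET.fromstring(markup)
    records = []
    for card in find_by_class(root, card_selector):
        record = {}
        for name, selector in fields.items():
            matches = find_by_class(card, selector)
            # A field whose selector matches nothing is recorded as None.
            record[name] = matches[0].text if matches else None
        records.append(record)
    return records


cards = extract_cards(
    PAGE,
    ".product-card",
    {"title": ".card-title", "price": ".price", "description": ".description"},
)
```

Each record carries the same keys as `fields`. Here `description` comes back as `None` for both cards because the sample markup has no matching element.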
Advanced Configuration
{
  "url": "https://example.com/products",
  "card_selector": ".product-card",
  "fields": {
    "title": {
      "selector": ".card-title",
      "attribute": "text",
      "transform": "trim"
    },
    "price": {
      "selector": ".price",
      "attribute": "text",
      "transform": "number"
    },
    "image": {
      "selector": "img.product-image",
      "attribute": "src"
    },
    "rating": {
      "selector": ".rating",
      "attribute": "data-rating"
    }
  },
  "pagination": {
    "enabled": true,
    "next_button": ".pagination .next",
    "max_pages": 5
  }
}
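The `pagination` block can be read as a loop: extract the cards on the current page, follow the element matched by `next_button`, and stop when there is no next page or `max_pages` is reached. The sketch below shows that loop under stated assumptions: `paginate` and `fetch_page` are hypothetical names, the first page is assumed to count toward `max_pages`, and a small in-memory dictionary stands in for the site.

```python
def paginate(fetch_page, start_url, max_pages):
    """Collect cards page by page until no next link remains or max_pages is hit."""
    cards = []
    url = start_url
    visited = 0
    while url is not None and visited < max_pages:
        # fetch_page returns (cards on this page, URL of the next page or None).
        page_cards, url = fetch_page(url)
        cards.extend(page_cards)
        visited += 1
    return cards


# Fake three-page site: URL -> (cards on that page, next URL or None).
SITE = {
    "/products?page=1": (["card-a", "card-b"], "/products?page=2"),
    "/products?page=2": (["card-c"], "/products?page=3"),
    "/products?page=3": (["card-d"], None),
}

all_cards = paginate(SITE.__getitem__, "/products?page=1", max_pages=5)
first_two_pages = paginate(SITE.__getitem__, "/products?page=1", max_pages=2)
```

With `max_pages` set to 5 the loop ends early because the third page has no next link; with 2 it stops after the second page even though more pages exist.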
Field Types
Text Content
{
  "field": {
    "selector": ".text-content",
    "attribute": "text",
    "transform": ["trim", "lowercase"]
  }
}
Images
{
  "field": {
    "selector": "img",
    "attributes": ["src", "alt"],
    "download": true
  }
}
Links
{
  "field": {
    "selector": "a",
    "attributes": ["href", "title"],
    "follow": true
  }
}
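Values read from `src` and `href` attributes are often relative to the page they came from. Whether the node resolves them is not stated here, so the sketch below shows one way to do it yourself after extraction, using the standard library's `urljoin`. The `absolutize` helper and the sample records are hypothetical.

```python
from urllib.parse import urljoin

PAGE_URL = "https://example.com/products"


def absolutize(record, page_url, keys=("src", "href")):
    """Return a copy of record with the URL-bearing keys resolved against page_url."""
    return {
        key: urljoin(page_url, value) if key in keys and value else value
        for key, value in record.items()
    }


image = absolutize({"src": "/img/mug.png", "alt": "Blue mug"}, PAGE_URL)
link = absolutize({"href": "mugs/blue", "title": "Blue mug"}, PAGE_URL)
```

Non-URL keys such as `alt` and `title` pass through unchanged, and values that are already absolute are left as they are.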
Data Processing
{
  "transformations": {
    "price": ["remove_currency", "to_number"],
    "description": ["trim", "remove_html"]
  }
}
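To make the transformation lists concrete, here is a rough sketch of what each named step might do and how a list of steps could be applied in order. The function bodies are guesses at the behavior the names suggest, not the node's actual logic: `remove_currency` is assumed to strip everything except digits and separators, and `to_number` is assumed to treat commas as thousands separators.

```python
import re
from html import unescape


def remove_currency(value):
    # Assumption: keep only digits, separators, and a minus sign.
    return re.sub(r"[^\d.,\-]", "", value)


def to_number(value):
    # Assumption: commas are thousands separators, as in "1,299.00".
    return float(value.replace(",", ""))


def trim(value):
    return value.strip()


def remove_html(value):
    # Drop tags, then decode entities such as &amp;.
    return unescape(re.sub(r"<[^>]+>", "", value))


TRANSFORMS = {
    "remove_currency": remove_currency,
    "to_number": to_number,
    "trim": trim,
    "remove_html": remove_html,
}


def apply_transformations(record, transformations):
    """Apply each field's transformation names left to right; skip missing fields."""
    out = dict(record)
    for field, names in transformations.items():
        if out.get(field) is None:
            continue
        for name in names:
            out[field] = TRANSFORMS[name](out[field])
    return out


raw = {"price": "$1,299.00", "description": "  <p>Hand-made &amp; glazed</p> "}
clean = apply_transformations(
    raw,
    {"price": ["remove_currency", "to_number"], "description": ["trim", "remove_html"]},
)
```

Order matters in this sketch: `to_number` would fail on `"$1,299.00"` if it ran before `remove_currency`.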
Validation Rules
{
  "validation": {
    "required": ["title", "price"],
    "types": {
      "price": "number",
      "rating": "float"
    }
  }
}
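The rules above can be thought of as two checks per record: every `required` field must be present and non-empty, and every field listed under `types` must hold a value of the named type. The sketch below is one possible reading of those rules. `validate` and `TYPE_CHECKS` are hypothetical, and treating `number` as "int or float" and `float` as "float only" is an assumption.

```python
RULES = {
    "required": ["title", "price"],
    "types": {"price": "number", "rating": "float"},
}

# Assumed meaning of each type name; booleans are not counted as numbers.
TYPE_CHECKS = {
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "float": lambda v: isinstance(v, float),
}


def validate(record, rules):
    """Return a list of human-readable problems; an empty list means the record passes."""
    errors = []
    for field in rules.get("required", []):
        if record.get(field) in (None, ""):
            errors.append(f"missing required field: {field}")
    for field, type_name in rules.get("types", {}).items():
        value = record.get(field)
        # Absent optional fields are not type-checked.
        if value is not None and not TYPE_CHECKS[type_name](value):
            errors.append(f"{field} is not a {type_name}")
    return errors


good = validate({"title": "Blue Mug", "price": 12.5, "rating": 4.5}, RULES)
bad = validate({"title": "", "price": "12.50"}, RULES)
```

A price left as the string `"12.50"` fails the type check, which is why numeric transformations would need to run before validation in a setup like this.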
Error Handling
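Common issues and solutions:
- Missing elements: a field selector matches nothing inside a card. Check the selector against the card's markup, and list under `required` only the fields that every card has.
- Invalid selectors: a malformed or mistyped CSS selector matches nothing. Test it in the browser's developer tools before using it in the configuration.
- Dynamic content: cards rendered by JavaScript after the initial page load may not be present yet when extraction runs.
- Rate limiting: the target site may throttle or block rapid requests. A lower `max_pages` reduces the number of page loads per run.
- Network errors: timeouts and connection failures. Confirm that the URL is reachable, then run the extraction again.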
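Transient failures such as network errors and rate limiting are commonly handled by retrying with increasing delays. The sketch below shows that general pattern. It does not describe what the node does internally, and `with_retries` and its parameters are hypothetical.

```python
import time


def with_retries(action, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call action(); on OSError wait base_delay * 2**attempt seconds and try again."""
    for attempt in range(attempts):
        try:
            return action()
        except OSError:
            # ConnectionError, TimeoutError, and urllib's URLError all derive from OSError.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# Demonstration with a fake action that fails twice before succeeding.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "page content"


delays = []
result = with_retries(flaky_fetch, attempts=3, base_delay=1.0, sleep=delays.append)
```

Passing `sleep=delays.append` records the waits instead of actually sleeping, which keeps the demonstration instant. In real use the default `time.sleep` applies.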