1149.Batch download -- How to extract all links that contain imprint from a webpage?

User: Slim Johns -- 2013-11-23
Hits: 1600
Type: Batch download   
Description:
Hi Guys,

I am looking for a way to retrieve files one or two levels away from a starting address that contain a certain string, e.g. "Imprint".

Best,
Slim
Input Sample:
www.listing-service.com  (example)

contains links to:
www.company_1.com/files/imprint.html
www.nolink.net/contact/no_content.html
www.company_x.net/service/contact_information/contact.php
Output Sample:
=> Files downloaded:
www.company_1.com/files/imprint.html
www.company_x.net/service/contact_information/contact.php

(Both contain the string "imprint")
Answer:
Hint: You need to download and install "Replace Pioneer" on Windows to complete the following steps.
1. Open the "Tools->Batch Runner" menu.
2. Click the "Extract Links" button.
* Set "File Type Filter" to:

* Set "web page1" to your web page address:

3. Click "Extract" to get a list of all links.
4. If you want to download the linked files, click the "copy/download" button.

Note: before step 4, you can right-click the address list and select "remove duplicate Input" from the context menu, in case the list contains duplicate links.
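
For readers who would rather script this, the same idea can be sketched in Python using only the standard library. This is not how Replace Pioneer works internally; it is a rough illustration that assumes the string should be matched against each page's text, case-insensitively. The names LinkParser, extract_links, fetch and pages_containing are hypothetical helpers made up for this sketch.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute links found in html, in page order, without duplicates."""
    parser = LinkParser()
    parser.feed(html)
    seen = set()
    links = []
    for href in parser.hrefs:
        absolute = urljoin(base_url, href)
        if absolute not in seen:
            seen.add(absolute)
            links.append(absolute)
    return links


def fetch(url):
    """Download a page as text (assumption: undecodable bytes are replaced)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def pages_containing(start_url, needle, fetch_page=fetch, depth=1):
    """Follow links up to `depth` levels from start_url and return the URLs
    of the linked pages whose text contains needle (case-insensitive)."""
    matches = []
    visited = {start_url}
    frontier = [start_url]
    for level in range(depth + 1):
        next_frontier = []
        for url in frontier:
            try:
                html = fetch_page(url)
            except OSError:
                continue  # skip pages that cannot be downloaded
            # Level 0 is the start page itself, so it is never reported.
            if level > 0 and needle.lower() in html.lower():
                matches.append(url)
            if level < depth:
                for link in extract_links(html, url):
                    if link not in visited:
                        visited.add(link)
                        next_frontier.append(link)
        frontier = next_frontier
    return matches
```

Calling pages_containing("http://www.listing-service.com", "imprint", depth=2) would check pages one and two levels away from the example address in the question. The visited set plays the same role as the "remove duplicate Input" step above, and fetch_page can be swapped for another function if you want to save each matching page to disk instead of only listing it.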

Similar Examples:
How to extract all lines that contain words in a list? (76%)
How to extract all lines that contain words A and B and C? (76%)
How to extract all lines that contain specific words in a file? (74%)
How to extract all links from a web page?   (71%)
How to extract all lines that contain "abc" from multiple files? (70%)
How to extract all lines that contain chem, but not any of org and cn? (70%)
How to extract all lines that contain A, but not any of B C D? (70%)
How to extract all lines not containing 'ing' and 'and'? (69%)
