Unix / Linux / Ubuntu Answers

Question Answered By: Adah Miller   on Jan 21

A number of websites, particularly those based on wikis, are now
implementing 'burst triggers', which prevent you downloading too many
pages in a given time period, presumably for this reason (amongst others).

This means that such sites cannot be downloaded by conventional
mirroring programs, such as httrack. However, wget has two switches
which can help to lighten the load on the target servers, making a
successful download more likely. The first is --wait=[seconds], which,
as the name implies, waits for the specified number of seconds between
requests to the server.
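
As a rough sketch of how --wait might be used when mirroring a site: the helper name polite_mirror, the DRY_RUN variable, the 5-second default and the example.com URL below are all made up for illustration, and --mirror is just one plausible way to ask wget for a whole site. With DRY_RUN=1 the function only prints the command rather than contacting the server.

```shell
#!/bin/sh
# polite_mirror is a hypothetical helper name, not part of wget.
# Usage: polite_mirror URL [SECONDS]
# Set DRY_RUN=1 to print the wget command instead of running it.
polite_mirror() {
    url=$1
    wait_seconds=${2:-5}   # pause between requests; 5 is an arbitrary default
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "wget --mirror --wait=$wait_seconds $url"
    else
        wget --mirror --wait="$wait_seconds" "$url"
    fi
}

# Show the command that would be run (example.com is a placeholder).
DRY_RUN=1 polite_mirror https://example.com/ 5
```

A longer wait is kinder to the server but makes the download take correspondingly longer, so the right value depends on the size of the site.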

However, some servers now use log analysis to spot evenly spaced
requests and block the download, so wget also has the --random-wait
switch, which gives a random delay between 0.5 and 1.5 times the value
specified in --wait (the two need to be used together).
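
To get a feel for the numbers, here is a small sketch that works out the range of delays for a given --wait value, using the 0.5 to 1.5 multipliers described above. The helper name random_wait_range is invented for this example and is not a wget feature; it only does the arithmetic.

```shell
#!/bin/sh
# random_wait_range is a hypothetical helper name, not part of wget.
# Prints the shortest and longest delay (in seconds) that --random-wait
# would choose between for the given --wait value.
random_wait_range() {
    awk -v w="$1" 'BEGIN { printf "%.1f %.1f\n", 0.5 * w, 1.5 * w }'
}

random_wait_range 10   # prints "5.0 15.0"
```

So a command along the lines of wget --mirror --wait=10 --random-wait https://example.com/ (placeholder URL) would pause somewhere between about 5 and 15 seconds before each request.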

So far as I am aware, wget is the only program to implement this feature
and so it's the one I would recommend.
