Differences
Below are the differences between two revisions of the page.
httrack [23/10/2016, 02:20] fouadessahlaoui [Usage] |
httrack [19/01/2025, 20:39] (current version) Amiralgaby, old revision (27/01/2024, 10:13) restored |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | {{tag>Lucid Precise Quantal internet développement}} | + | {{tag>Bionic internet programmation BROUILLON}} |
---- | ---- | ||
Line 7: | Line 7: | ||
**Httrack** is a well-known website copier. | **Httrack** is a well-known website copier. | ||
- | === Warning === | + | <note warning> |
- | //Large sites (including the Ubuntu-fr forum and documentation) **must not** be mirrored automatically, or the site may block your IP address. Site mirroring must follow basic ethics and should be used only when there is a real need to access content offline. Mirroring demands far more hardware resources from the target site than simply displaying a web page. Ask the webmaster for permission before proceeding! Keep intellectual property issues in mind as well.// | + | Large sites (including the Ubuntu-fr forum and documentation) **must not** be mirrored automatically, or the site may block your IP address. Site mirroring must follow basic ethics and should be used only when there is a real need to access content offline. Mirroring demands far more hardware resources from the target site than simply displaying a web page. Ask the webmaster for permission before acting! Keep intellectual property issues in mind as well.</note> |
===== Installation ===== | ===== Installation ===== | ||
There are two versions of httrack: | There are two versions of httrack: | ||
- | * The base version: [[:tutoriel:comment_installer_un_paquet|install the package]] **[[apt://httrack|httrack]]** (Universe repository). | + | * The base version: [[:tutoriel:comment_installer_un_paquet|install the package]] **[[apt>httrack]]** |
- | * The graphical version, which uses your preferred browser: [[:tutoriel:comment_installer_un_paquet|install the package]] **[[apt://webhttrack|webhttrack]]** (Universe repository). | + | * The graphical version, which uses your preferred browser: [[:tutoriel:comment_installer_un_paquet|install the package]] **[[apt>webhttrack]]**. |
+ | =====Usage===== | ||
+ | httrack --mirror http://website.com | ||
httrack(1) General Commands Manual httrack(1) | httrack(1) General Commands Manual httrack(1) | ||
Line 189: | Line 190: | ||
-%s update hacks: various hacks to limit re-transfers when updating (identical size, bogus response..) (--updatehack) | -%s update hacks: various hacks to limit re-transfers when updating (identical size, bogus response..) (--updatehack) | ||
- | -%u url hacks: various hacks to limit duplicate URLs (strip //, www.foo.com==foo.com..) (--urlhack) | + | -%u url hacks: various hacks to limit duplicate URLs (strip , www.foo.com==foo.com..) (--urlhack) |
-%A assume that a type (cgi,asp..) is always linked with a mime type (-%A php3,cgi=text/html;dat,bin=application/x-zip) (--assume <param>) | -%A assume that a type (cgi,asp..) is always linked with a mime type (-%A php3,cgi=text/html;dat,bin=application/x-zip) (--assume <param>) | ||
Line 407: | Line 408: | ||
- | Details: Option K | ||
- | -K0 foo.cgi?q=45 -> foo4B54.html?q=45 (relative URI, default) | ||
- | |||
- | -K -> http://www.foobar.com/folder/foo.cgi?q=45 (absolute URL) (--keep-links[=N]) | ||
- | |||
- | -K3 -> /folder/foo.cgi?q=45 (absolute URI) | ||
- | |||
- | -K4 -> foo.cgi?q=45 (original URL) | ||
- | |||
- | -K5 -> http://www.foobar.com/folder/foo4B54.html?q=45 (transparent proxy URL) | ||
- | |||
- | |||
- | Shortcuts: | ||
- | --mirror | ||
- | <URLs> *make a mirror of site(s) (default) | ||
- | |||
- | --get | ||
- | <URLs> get the files indicated, do not seek other URLs (-qg) | ||
- | |||
- | --list | ||
- | <text file> add all URL located in this text file (-%L) | ||
- | |||
- | --mirrorlinks | ||
- | <URLs> mirror all links in 1st level pages (-Y) | ||
- | |||
- | --testlinks | ||
- | <URLs> test links in pages (-r1p0C0I0t) | ||
- | |||
- | --spider | ||
- | <URLs> spider site(s), to test links: reports Errors & Warnings (-p0C0I0t) | ||
- | |||
- | --testsite | ||
- | <URLs> identical to --spider | ||
- | |||
- | --skeleton | ||
- | <URLs> make a mirror, but gets only html files (-p1) | ||
- | |||
- | --update | ||
- | update a mirror, without confirmation (-iC2) | ||
- | |||
- | --continue | ||
- | continue a mirror, without confirmation (-iC1) | ||
- | |||
- | |||
- | --catchurl | ||
- | create a temporary proxy to capture an URL or a form post URL | ||
- | |||
- | --clean | ||
- | erase cache & log files | ||
- | |||
- | |||
- | --http10 | ||
- | force http/1.0 requests (-%h) | ||
- | |||
- | |||
- | Details: Option %W: External callbacks prototypes | ||
- | see htsdefines.h | ||
- | FILES | ||
- | /etc/httrack.conf | ||
- | The system wide configuration file. | ||
- | |||
- | ENVIRONMENT | ||
- | HOME Is being used if you defined in /etc/httrack.conf the line path ~/websites/# | ||
- | |||
- | DIAGNOSTICS | ||
- | Errors/Warnings are reported to hts-log.txt by default, or to stderr if the -v option was specified. | ||
- | |||
- | LIMITS | ||
- | These are the principals limits of HTTrack for that moment. Note that we did not heard about any other utility that would have solved them. | ||
- | |||
- | |||
- | - Several scripts generating complex filenames may not find them (ex: img.src='image'+a+Mobj.dst+'.gif') | ||
- | |||
- | - Some java classes may not find some files on them (class included) | ||
- | |||
- | - Cgi-bin links may not work properly in some cases (parameters needed). To avoid them: use filters like -*cgi-bin* | ||
- | |||
- | BUGS | ||
- | Please reports bugs to <bugs@httrack.com>. Include a complete, self-contained example that will allow the bug to be reproduced, and say which version of | ||
- | httrack you are using. Do not forget to detail options used, OS version, and any other information you deem necessary. | ||
- | |||
- | COPYRIGHT | ||
- | Copyright (C) 1998-2014 Xavier Roche and other contributors | ||
- | |||
- | This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software | ||
- | Foundation, either version 3 of the License, or (at your option) any later version. | ||
- | |||
- | This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS | ||
- | FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. | ||
- | |||
- | You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>. | ||
- | |||
- | |||
- | AVAILABILITY | ||
- | The most recent released version of httrack can be found at: http://www.httrack.com | ||
- | |||
- | AUTHOR | ||
- | Xavier Roche <roche@httrack.com> | ||
- | |||
- | SEE ALSO | ||
- | The HTML documentation (available online at http://www.httrack.com/html/ ) contains more detailed information. Please also refer to the httrack FAQ | ||
- | (available online at http://www.httrack.com/html/faq.html ) | ||
- | |||
- | |||
- | |||
- | httrack website copier 28 July 2014 httrack(1) | ||
===== Command-line usage ===== | ===== Command-line usage ===== | ||
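As a minimal sketch of a restrained command-line mirror, in the spirit of the warning at the top of the page, the helper below wraps `httrack --mirror` with a few limiting options. The function name `mirror_site`, the `DRY_RUN` variable, the output directory, and the specific limit values are assumptions chosen for illustration; only the URL comes from the page's own example. Check the option letters against the man page of your installed version before relying on them.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of httrack): build a conservative mirror command.
# With DRY_RUN=1 the command is only printed, so nothing is downloaded.
mirror_site() {
    local url="$1" out_dir="$2"
    local cmd=(
        httrack --mirror "$url"
        -O "$out_dir"   # where the mirror, cache and log files are written
        -r3             # follow links at most 3 levels deep (illustrative value)
        -c2             # at most 2 simultaneous connections (illustrative value)
        -A25000         # cap the transfer rate at 25000 bytes/s (illustrative value)
        -s2             # always follow robots.txt rules
    )
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}

# Print the command that would be run, without contacting the site.
DRY_RUN=1 mirror_site "http://website.com" "./website-mirror"
```

Calling `mirror_site` without `DRY_RUN=1` runs the command for real, so get the webmaster's permission first. Lower values for the depth, connection count, and rate reduce the load on the target site further.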