22 November, 2014

How To Clone A Website

There are a number of reasons why you might want to clone a site: to keep an offline copy of all the information on an interesting site, to learn more about how a website is organised, to find hidden data on the site such as passwords or email addresses, or for more malicious purposes like phishing and social engineering.

We can use a program called HTTrack (http://www.httrack.com/) to save a local clone of an entire website. HTTrack is available for Windows, Linux and even Android. It is free to download from here:
http://www.httrack.com/page/2/en/index.html
Download the installer or standalone version from the link. To be consistent with the other tutorials, here is an alternative method of installation for Kali Linux:


Navigate to "System Tools" > "Add/Remove Software".
Search for "httrack".

OR use the command line:
Code:
kali > apt-get install httrack

Before we begin, I should tell you that the help documentation is called "httrack-doc.html" and can be opened in any web browser. Consult this for more advanced usage of HTTrack.

WARNING: Some sites are enormous and will take a lot of time and disk space to clone. For example, don't try to clone Facebook or HF.
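
One way to keep a clone from growing out of hand is to use HTTrack's limit options, which cap link depth, total download size and run time. The following is a minimal sketch, assuming the basic "httrack <URL> -O <folder>" form described under "Linux Command Line Usage" further down. The function name and the specific limits (depth 3, roughly 50 MB, 10 minutes) are arbitrary values chosen for illustration, not recommendations from HTTrack itself:

```shell
# Hypothetical helper: clone a site with conservative limits.
#   $1 = URL of the site, $2 = folder to save the clone in
clone_limited() {
    # -r3         follow links at most 3 levels deep
    # -M50000000  stop once about 50 MB have been transferred in total
    # -E600       stop after 600 seconds (10 minutes)
    httrack "$1" -O "$2" -r3 -M50000000 -E600
}

# Example call:
#   clone_limited http://example.com /tmp/exampleclone
```

If a site turns out to be smaller than expected, the limits can simply be raised and the command run again.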

Using the GUI (on Windows):
1) Run WinHTTrack. Click "Next", give the project a name and browse to the location you want to store the project in:

[Image: zq_XDSfn.png]

2) Then click "Next" again, make sure "Download website" is chosen and add the address to the address box:

[Image: Untitled.png]
If you are using a proxy server, you can set the proxy settings by clicking "settings".

3) Then click "Next", and then "Finish". The website will then be cloned, and you can then close HTTrack.

The cloned website will be in a subfolder named after the project, inside the root folder you selected. HTTrack will also have made an "index.html" file to give you an index to all the websites you have cloned so far with it. You can now use the files as you please!
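
The proxy setting mentioned in step 2 also has a command-line counterpart, the -P option. Here is a rough sketch, assuming the command-line form covered in the next section and a proxy listening on 127.0.0.1:8080 (a made-up address, so substitute your own). The function name is hypothetical as well:

```shell
# Hypothetical helper: clone a site through a proxy server.
#   $1 = URL of the site, $2 = folder to save the clone in,
#   $3 = proxy in host:port form (or user:pass@host:port)
clone_via_proxy() {
    # -P tells httrack which proxy to send its requests through
    httrack "$1" -O "$2" -P "$3"
}

# Example call (127.0.0.1:8080 is only a placeholder):
#   clone_via_proxy http://example.com /tmp/exampleclone 127.0.0.1:8080
```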

Linux Command Line Usage:


The basic template command looks like this:
Code:
httrack <the URL of the site> -O <location to save the clone>

In my case, this could look something like this:
Code:
httrack http://example.com -O /tmp/exampleclone

Then, if we open the browser (Iceweasel) and give it the address "/tmp/exampleclone", it will display the same index page as above. We can then check out our site's files as we please!
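
Once the clone is on disk, ordinary command-line tools can also be used to look through it. As a small sketch, this lists every saved HTML file that contains a given string. The function name is hypothetical, and the --include option assumes GNU grep, as shipped with Kali:

```shell
# Hypothetical helper: list cloned HTML files that contain a fixed string.
#   $1 = folder holding the clone, $2 = string to look for
search_clone() {
    # -r recurse into subfolders, -I skip binary files,
    # -l print only the names of matching files,
    # -F treat the search term as a plain string rather than a regex
    grep -rIlF --include='*.html' -- "$2" "$1"
}

# Example call:
#   search_clone /tmp/exampleclone 'mailto:'
```

The example call would print the path of each cloned page that contains a "mailto:" link, which is a quick way to see where a site publishes its contact addresses.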

Thanks for reading, please reply with your thanks if you enjoyed this and/or found it useful.

Watch out for future hacking tutorials from me: further wireless cracking, Metasploit and other common hacking topics! I will soon do a more in-depth tutorial on the usage of Nmap and Wireshark.
