• OTW

Using FOCA to Gather Website Metadata


My preference for Linux as a hacking platform is well documented, and I have even created a series of tutorials to train new hackers. Without being proficient in Linux, you can't really call yourself a hacker.

Every once in while, though, a hacking tool comes out for Windows that makes me stand up and take notice. For instance, Cain and Abel is an excellent tool for password cracking and MitM attacks and is only available on the Windows platform. I could name a few others, but only Havij, the excellent SQL injection tool, comes immediately to mind.

FOCA was released in 2009 and is now in version three. Although a Linux version is included in Kali, it is outdated. Let's download the latest Windows version and use it here to perform some reconnaissance.

Good Reconnaissance Is Critical

FOCA is an excellent website reconnaissance tool with lots of interesting features and capabilities. Remember, before attacking any website or domain, it is critical to gather as much information as possible. From this information, you can determine the attack that is most likely to work against that site or network.

In this tutorial, we will looking at FOCA's ability to find, download, and retrieve files from websites with the file's metadata.

This metadata can give us insight into such information as the users (could be critical in cracking passwords), operating system (exploits are OS-specific), email addresses (possibly for social engineering), the software used (once again, exploits are OS-, and more and more often, application-specific)--and if we are really lucky-- passwords.

Step 1: Download FOCA for Windows

First, let's download FOCA Free 3.0 for Windows.

Step 2: Choose Where You Save Results

When you install FOCA, you will greeted with a screen like that below. The first task we need to do is to start a new project and then tell FOCA where we want to save our results.

I created a new directory at C:\foca and will save all my results there. Of course, you can save your results wherever is convenient for you, or use the default temp directory.

Step 3: Create a Project

In this tutorial, I will be starting with a project named after the information security training company, SANS, which is located at sans.org, and I will be saving my results to C:\foca.

Step 4: Getting the Metadata

Once I create my project, I can go to the object explorer to the far left and select Metadata. This enables us to pull the metadata from the files on the website that contain metadata. Files such as .pdf, .doc, .xls, etc. all contain metadata that could be useful in your hack of your target.

When you select metadata, you will pull up a screen like that below. In our case here, we will be searching sans.org for .doc files, so the syntax to be placed in the search window is:

site:sans.org filetype:doc

This will search the entire sans.org website, looking for .doc files. When I hit the Search button next to the window, it will begin to search and find all the .doc files at sans.org.

Of course, if you were searching for .pdf files or other filetypes, you would put in that filetype. You can also search for multiple filetypes by listing them after filetype, such as:

site:sans.org filetype:pdf,doc,xls

Step 5: Download the Files

Once FOCA is done retrieving a list of all the .doc files, we can then right-click on any file and download the file to our hard drive, download all the files, or analyze the metadata. I chose to download all the .doc files I found at sans.org.

Step 6: Collect & Analyze the Metadata

Now that we have downloaded all the .doc files, I chose to analyze all the metadata in them. Microsoft's Office files collect significant amounts of data as they are being created and edited that we can then extract.

When we expand the Metadata folder in the object explorer, you can see that we have 156 .doc files and 2 .docx files.

The Types of Metadata Collected

Just beneath the Metadata documents folder is another folder titled, Metadata Summary. We can click on it and it reveals the type of metadata is has collected from the files. This metadata is broken into eight (8) categories:

  1. Users

  2. Folders

  3. Printers

  4. Software

  5. Emails

  6. Operating Systems

  7. Passwords

  8. Servers

Let's take a look at Users first. When we click on users, we can see that FOCA has collected the names of every user that worked on those files.

When we click on Software, we can see the various editions of Microsoft Office that has been used, including five (5) users that created their documents with Office '97 (hmm...wonder if there are any Office '97 vulnerabilities still out there?).

We can also look for email addresses that are embedded in the documents as displayed below. Obviously, these folks are making themselves available to a social engineering attack.

We can also gather printer, folder, passwords, and servers from this metadata depending upon the documents we recover. All of this information can then be used to determine what is the best attack against this organization/website.


1,281 views