otw

OSINT: Using Spiderfoot for OSINT Data Gathering


Welcome back, my aspiring OSINT experts!


Open Source Intelligence, or OSINT as it has become known, is a leading-edge field in hacking/pentesting, forensics and data science. OSINT is changing the way private investigators, pentesters and data scientists do their jobs.


Open Source Intelligence uses the resources freely available on the Internet (no illegal activities necessary) to do hacking/pentest reconnaissance, solve mysteries and conduct investigations (for a write-up on how Hackers-Arise and OTW uncovered a global scam using OSINT techniques, check out this article). There are a multitude of tools and techniques to harvest this intelligence. Keep in mind that there is a plethora of resources on the Internet and each one often demands its own tools and techniques.


For hackers and pentesters, OSINT can be invaluable in garnering information on a target for a phishing or spearphishing campaign. The more you know about your target, the higher the probability of success.


In this tutorial we will look at how to use a general-purpose OSINT data gathering tool named spiderfoot. This tool is excellent for starting an investigation as it is capable of gathering information from multiple sources automatically with little or no manual intervention. Once this data has been gathered you will likely need to dig deeper with a specific tool for that particular resource.



Step #1: Download and Install spiderfoot


Spiderfoot is not installed by default in Kali, so you will need to download it from github.com.


kali > git clone https://github.com/smicallef/spiderfoot.git



Once you have completed the download, navigate to the new spiderfoot directory.


kali > cd spiderfoot


Next, you will need to download spiderfoot's requirements.


kali > pip3 install -r requirements.txt


Alternatively, Spiderfoot has a package in the Kali repository that you can install with apt or apt-get.


kali > apt-get install spiderfoot




Step #2: Start Spiderfoot and Open a Web Browser


You can run spiderfoot from the command line, but I prefer to use the browser as it enables easy navigation and graphical results that are easy to decipher. Spiderfoot will open a web server on Kali and serve the spiderfoot application on port 5001.


kali > python3 sf.py -l 127.0.0.1:5001




Once the web server is up and running, open a browser at 127.0.0.1 (or localhost) on port 5001 and you should be greeted with the spiderfoot web interface.
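If the browser shows nothing, it can help to confirm that something is actually listening on the address you passed to -l. The following is a minimal sketch using only Python's standard library; the function name is_listening is a hypothetical helper for illustration and is not part of spiderfoot.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 127.0.0.1:5001 matches the -l argument used in this step.
    print(is_listening("127.0.0.1", 5001))
```

A result of False usually means the sf.py process has exited or was started on a different address or port.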



Step #3: Select the Type of Investigation


The next step is to choose what type of scan you want to conduct. Spiderfoot is capable of using a number of different data types as a "seed target". After you name your scan ("new scan") and click on the "Seed Target" window, you will be greeted by a pull down window with a number of options.


Spiderfoot can use a domain name, IP address (IPv4 or IPv6), hostname, subnet, ASN, email address, phone number, human name or username as a "seed target". I have found spiderfoot to be particularly useful when searching for email addresses, phone numbers, and both human names and usernames. There are a number of other tools capable of finding the other seed targets.


In this case, I was looking for a human name and entered it into the window in double quotation marks.
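To make the idea of seed target types concrete, here is a rough sketch of how a string could be sorted into the categories listed above. It is an illustration only and not spiderfoot's own detection code: the function classify_seed and its rules are assumptions (for example, that a quoted value containing a space is a human name and a quoted value without one is a username), and ASN handling is left out.

```python
import ipaddress
import re


def classify_seed(seed: str) -> str:
    """Guess the seed target type from the shape of the string."""
    s = seed.strip()

    # Assumption: quoted values are names; a space suggests a human name.
    if len(s) >= 2 and s.startswith('"') and s.endswith('"'):
        inner = s[1:-1].strip()
        return "human name" if " " in inner else "username"

    # IPv4 or IPv6 address.
    try:
        ipaddress.ip_address(s)
        return "IP address"
    except ValueError:
        pass

    # CIDR notation such as 10.0.0.0/24.
    try:
        ipaddress.ip_network(s, strict=False)
        return "subnet"
    except ValueError:
        pass

    if re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", s):
        return "email address"
    if re.fullmatch(r"\+\d{7,15}", s):
        return "phone number"
    if re.fullmatch(r"(?i)[a-z0-9-]+(\.[a-z0-9-]+)+", s):
        return "domain or hostname"
    return "unknown"


if __name__ == "__main__":
    for example in ['"John Smith"', '"jsmith2000"', "example.com"]:
        print(example, "->", classify_seed(example))
```

The names and addresses in the examples are placeholders. The point is simply that the quotation marks are what distinguish a name from a hostname-like string.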



When spiderfoot has completed its scan, you can review its results by clicking on the "Browse" tab.



One of the nicer features of spiderfoot is its ability to display the data in graphical form. Each node represents a bit of data on the subject.



If you expand the graph, you can see the individual detail of every node.
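One way to think about this graph is as a simple parent-to-child map, where each finding points back to the element it was discovered from. The sketch below assumes that structure and uses made-up findings; the data, build_graph and descendants are all hypothetical and do not reflect spiderfoot's internal storage or export format.

```python
from collections import defaultdict

# Hypothetical findings: (element it was derived from, discovered element).
findings = [
    ("John Smith", "jsmith@example.com"),
    ("John Smith", "jsmith2000"),
    ("jsmith@example.com", "example.com"),
]


def build_graph(pairs):
    """Map each parent element to the list of elements found from it."""
    graph = defaultdict(list)
    for parent, child in pairs:
        graph[parent].append(child)
    return dict(graph)


def descendants(graph, root):
    """Return every element reachable from root, each listed once."""
    seen, stack = [], [root]
    while stack:
        node = stack.pop()
        for child in graph.get(node, []):
            if child not in seen:
                seen.append(child)
                stack.append(child)
    return seen


if __name__ == "__main__":
    graph = build_graph(findings)
    print(descendants(graph, "John Smith"))
```

Walking the map from the seed target gives every node you would see when the graph is fully expanded.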




Step #4: Using API Keys


Spiderfoot is also capable of using a number of different services that require API keys. An Application Programming Interface (API) is the standard way that users and developers interface with an application and access its resources. The APIs that spiderfoot is capable of using include:


Honeypot Checker - www.projecthoneypot.org - this service checks whether a host appears in the Project Honey Pot database of addresses caught by its honeypots, such as spammers and harvesters.


Shodan - www.shodan.io - Shodan enables you to search the Internet by the banners presented by web servers and other devices, which reveal the underlying technologies.


VirusTotal - www.virustotal.com - this service enables you to check whether a bit of software is known malware.


IBM X-Force Exchange - https://exchange.xforce.ibmcloud.com - this service from IBM provides data on malicious threats that integrates with firewalls, IDS and SIEMs.


Malware Patrol - http://www.malwarepatrol.net - this service tracks active threats presently being used on the Internet.


BotScout - http://www.botscout.com - this service identifies bots and prevents them from joining company forums and other services.


Censys.io - http://www.censys.io - this service scans nearly every IP address for vulnerabilities and entry points.


Hunter.io - http://www.hunter.io - this service is among the best at finding email addresses.


AlienVault OTX - https://otx.alienvault.com - this service shares the latest information about emerging threats, attack methods, and malicious actors, promoting greater security across the entire community.


Clearbit - https://dashboard.clearbit.com - this service provides real-time info on visitors to your web site.


BuiltWith - https://www.builtwith.com - BuiltWith scans the Internet for the technologies behind the web site. An excellent service for quickly identifying targets with a vulnerable technology.


FraudGuard - https://fraudguard.io - this service collects info on honeypots, open proxy servers, Tor exit nodes, geographic IP tracking, botnets, and spam IPs.


IPinfo.io - https://ipinfo.io - this service is capable of tracking the geographic and other data on any IP.


SecurityTrails - SecurityTrails.com - this service enables you to research DNS history, WHOIS data, domain names, website technologies, hostname information and tags.


FullContact - https://fullcontact.com - this service helps advertisers identify the user on every device to optimize advertising campaigns. It can be great for identifying the user of a device.


RiskIQ - https://riskiq.com - RiskIQ specializes in Attack Surface Management. Their database is designed for CISOs to manage their companies' risk by collecting key information that could be useful to an attacker.



Spiderfoot is a powerful tool without these APIs, but given these APIs it could become your go-to tool for automated OSINT reconnaissance.



Summary


OSINT is rapidly becoming a key science and skillset for investigators, pentesters and data scientists. Spiderfoot can save these professionals innumerable hours of working with individual tools by providing an automated scan of a number of open source resources.


For more on OSINT tools and techniques, go to the OSINT page or attend the next OSINT training at Hackers-Arise!




