Recon - Information Gathering

Collecting host metadata about services and users, and checking information about a domain, IP address, phone number, or email address.

1. Ping

# Check that the host is reachable
$ ping <IP>

# Look up registration details for the IP/domain
$ whois <IP>

# Fingerprint the web server and its technologies (verbose, aggression level 3)
$ whatweb -v -a 3 <IP>


2. Finding subdomains

How ?? Common approaches: passive sources (certificate transparency logs, search engines, public DNS datasets) and active DNS brute forcing against the target. Recon-ng (section 8) and dedicated subdomain tools automate both.
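A minimal sketch using certificate transparency via crt.sh, assuming jq is installed and example.com stands in for the target domain:

# Pull subdomains seen in certificate transparency logs (%25 is a URL-encoded %)
$ curl -s "https://crt.sh/?q=%25.example.com&output=json" | jq -r '.[].name_value' | sort -u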


3. Mapping The App

Nmap

Nikto

Metasploit / Searchsploit

ChopChop
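A hedged sketch of typical first-pass commands with these tools (flags are common defaults, not a prescription; <IP> is the target from above):

# Default-script and version scan, saved to a file
$ nmap -sC -sV -oN nmap.txt <IP>

# Web server misconfiguration / known-issue scan
$ nikto -h http://<IP>

# Look up public exploits for a service version reported by nmap
$ searchsploit <service> <version>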


4. Having a Proxy Listener

MITM, Burp, ZAP (Zed Attack Proxy)

Inspecting HTTP headers, HTTP cookies, and URL query strings

Tracking URL and POST body parameters to see how the application interacts with the database
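Besides pointing the browser at the proxy, command-line traffic can be pushed through the intercepting listener too. A small sketch, assuming Burp/ZAP is listening on the default 127.0.0.1:8080; the URL and parameter are placeholders:

# Route a request through the local intercepting proxy (-k ignores the proxy's self-signed certificate)
$ curl -x http://127.0.0.1:8080 -k "http://<target>/login?user=test"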


5. Finding The Architecture Information

Discovering what languages or content management systems (CMS) are running on the backend.

Wappalyzer
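Response headers often leak the stack before any browser extension is needed. A quick sketch; the header names in the grep pattern are just common examples:

# Look for stack hints in response headers (Server, X-Powered-By, X-Generator, cookies)
$ curl -sI "http://<target>/" | grep -iE 'server|x-powered-by|x-generator|set-cookie'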


6. Scraping & Investigation

==> Web scraping ==> Scrapy !!! OctoParse is strong too

Handling Anti-Scraping Mechanisms: some websites have anti-scraping measures in place. Many of these can be bypassed through simple modifications to the crawler (user agent, request rate, robots.txt handling), so pick a web crawler that overcomes these roadblocks with a robust mechanism of its own.
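A minimal sketch of such modifications using Scrapy's command line, assuming Scrapy is installed; the user agent string and delay are illustrative values only:

# Fetch a page while overriding the user agent, throttling requests, and ignoring robots.txt
$ scrapy fetch -s USER_AGENT="Mozilla/5.0" -s DOWNLOAD_DELAY=2 -s ROBOTSTXT_OBEY=False "http://<target>/"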

Note

Take a glance at the page source.

For Every Page !! (a quick grep sketch of these follows below)

  • get all the comments in HTML/CSS/JS
  • get all href attributes (links)
  • hidden inputs
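A quick grep-based sketch of pulling those three things out of a page; the URL and patterns are illustrative, and a proper parser or the proxy's site map does this more reliably:

# HTML comments
$ curl -s "http://<target>/page" | grep -oE '<!--.*-->'

# Links (href attributes)
$ curl -s "http://<target>/page" | grep -oE 'href="[^"]*"'

# Hidden inputs
$ curl -s "http://<target>/page" | grep -i 'type="hidden"'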

Listing all input vectors that potentially talk to the back end

Locating data entry points

Perhaps the most common guidance is to "fully understand how the application behaves"

(HTML input fields such as form fields, hidden fields, drop-down boxes, and radio button lists)

URL parameters, form inputs

Performing client-side HTML and JavaScript functionality review

Identifying the encoding scheme(s) used

Note

Tips

Fuzz non-printable characters in any user input. Could result in regex bypass, account takeover…

  • Raw bytes: 0x00, 0x2F, 0x3A, 0x40, 0x5B, 0x60, 0x7B, 0xFF
  • URL-encoded: %00, %2F, %3A, %40, %5B, %60, %7B, %FF
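A hedged sketch of fuzzing one parameter with those URL-encoded bytes using ffuf; the endpoint and parameter name are hypothetical:

# Build a tiny wordlist of the URL-encoded bytes above
$ printf '%s\n' %00 %2F %3A %40 %5B %60 %7B %FF > bytes.txt

# Append each byte to a parameter value and watch for response differences
$ ffuf -w bytes.txt -u "http://<target>/login?user=adminFUZZ" -mc all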

Skipfish ==> crawler

Scraping from JS

# You can parse and scrape JavaScript content on a target website to look for hidden subdomains or interesting paths
# Often, endpoints are not public but users can still interact with them
# Tools like dirscraper automate this (https://github.com/Cillian-Collins/dirscraper)

# Classic
python dirscraper.py -u <url>

# Output mode
python dirscraper.py -u <url> -o <output>

# Silent mode (you won't see results in the terminal)
python dirscraper.py -u <url> -s -o <output>

# Relative URL Extractor is another good tool to scrape from JS files (https://github.com/jobertabma/relative-url-extractor)
ruby extract.rb https://hackerone.com/some-file.js


7. Directory Enumeration

Wordlists

Tools

ffuf / wfuzz / dirb / gobuster
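A hedged example of each style of run; the wordlist path is the common Kali location, adjust to your setup:

# Directory brute force with gobuster
$ gobuster dir -u http://<target>/ -w /usr/share/wordlists/dirb/common.txt

# Same idea with ffuf; FUZZ marks the injection point
$ ffuf -u http://<target>/FUZZ -w /usr/share/wordlists/dirb/common.txt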

DVCS-Ripper

(Distributed Version Control System)

Web applications are not developed overnight. Developers use all sorts of version control systems (VCS) to keep a centralized source for their application's code base. However, when a developer clones a repository onto a web server, hidden directories related to the VCS are created to keep track of updates, new commits, and configuration settings. Since these hidden directories often sit under the same directory where the application is running, they end up publicly accessible over the web.

DVCS-Ripper is a suite of Perl tools that can discover and download web-accessible version control repositories, including Git, SVN, Mercurial, and more. It crawls this structure and downloads all the files it finds. Repository names, usernames, and even source code can all be fetched from these hidden directories, facilitating the total compromise of an application.
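A sketch of a typical run, assuming the rip-*.pl scripts from the DVCS-Ripper repo are in the current directory and the target exposes a .git/ or .svn/ folder:

# Download an exposed Git repository
$ ./rip-git.pl -v -u http://<target>/.git/

# Download an exposed SVN repository
$ ./rip-svn.pl -v -u http://<target>/.svn/

Once downloaded, normal git/svn commands over the recovered repository reveal the source and commit history.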


8. Understand how the app functions and its logic ==> {NO Scanners}

Automation tools

  • Recon tool ==> Recon-ng (can also find subdomains)
  • Spidering (crawling) tool ==> Burp Spider
  • Vulnerability scanners
  • Brute-forcing tool
  • Site map ==> site map in Burp Suite or ZAP
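A minimal Recon-ng session sketch, assuming Recon-ng v5 with the free hackertarget module installed from the marketplace; example.com is a placeholder target:

$ recon-ng
[recon-ng][default] > marketplace install recon/domains-hosts/hackertarget
[recon-ng][default] > modules load recon/domains-hosts/hackertarget
[recon-ng][default][hackertarget] > options set SOURCE example.com
[recon-ng][default][hackertarget] > run
[recon-ng][default][hackertarget] > show hosts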