Evine - Interactive CLI Web Crawler

Today we will test a new tool from Saeed Dehghan: Evine, an interactive CLI web crawler. It helps you find sensitive and, most importantly, hidden files or directories in web applications. This interactive web crawler and scraper is written in Go and can be useful for a wide range of purposes, such as metadata and data extraction, data mining, reconnaissance, and testing.

Important links:

Follow the project on Twitter.

GitHub page: https://github.com/saeeddhqan/evine

Installation

There are a few ways to install this tool. Please note that Go 1.13.x is required.

First, clone the repository from GitHub and change into the tool's directory. Then build the application with Go and move the resulting binary to a directory on your PATH, so that you can run the tool by name from anywhere.

git clone https://github.com/saeeddhqan/evine.git

cd evine

go build .

mv evine /usr/local/bin

To learn more about the features Evine offers, run the help command:

evine --help

You can also install Evine from a binary:

Pre-built binary releases are also available (suggested).

Or from source with go get:

go get github.com/saeeddhqan/evine

"$GOPATH/bin/evine" -h

Usage

Now that Evine is up and running, let’s take a look at the keyboard shortcuts. Each one is listed below with a short description.

  • Enter – Run crawler (from URL view)
  • Tab – Go to Next view
  • Ctrl+Space – Run crawler
  • Ctrl+S – Save response
  • Ctrl+Z – Quit
  • Ctrl+R – Restore to default values (from Options and Headers views)
  • Ctrl+Q – Close response save view (from Save view)

The following command displays the help for the tool:

evine -h
Flags, each with its description and an example:

  • -url – URL to crawl. Example: evine -url toscrape.com
  • -url-exclude string – Exclude URLs matching this regex (default ".*"). Example: evine -url-exclude ?id=
  • -domain-exclude string – Exclude in-scope domains from the crawl. Separate with commas. Default: root domain. Example: evine -domain-exclude host1.tld,host2.tld
  • -code-exclude string – Exclude responses with these HTTP status codes. Separate with '|' (default ".*"). Example: evine -code-exclude 200,201
  • -delay int – Sleep between requests, in milliseconds. Example: evine -delay 300
  • -depth – Scraper depth search level (default 1). Example: evine -depth 2
  • -thread int – The number of concurrent goroutines for resolving (default 5). Example: evine -thread 10
  • -header – HTTP headers for each request (fields should be separated by \n). Example: evine -header "KEY: VALUE\nKEY1: VALUE1"
  • -proxy string – Proxy in the form scheme://ip:port. Example: evine -proxy http://1.1.1.1:8080
  • -scheme string – Scheme for the requests (default "https"). Example: evine -scheme http
  • -timeout int – Seconds to wait before timing out (default 10). Example: evine -timeout 15
  • -query string – JQuery expression. It can be a file extension (pdf), a key query (url, script, css, ...) or a jQuery selector ($("a[class='hdr']").attr("hdr")). Example: evine -query url,pdf,txt
  • -regex string – Search the page contents for this regular expression. Example: evine -regex 'User.+'
  • -max-regex int – Maximum number of results for the regex search (default 1000). Example: evine -max-regex -1
  • -robots – Scrape robots.txt for URLs and use them as seeds. Example: evine -robots
  • -sitemap – Scrape sitemap.xml for URLs and use them as seeds. Example: evine -sitemap
  • -wayback – Scrape WayBackURLs (web.archive.org) for URLs and use them as seeds. Example: evine -wayback
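
When Evine is launched from a script instead of by hand, the flags listed above can be assembled programmatically. The following is a minimal Python sketch, assuming the evine binary is on your PATH. The helper name build_evine_command is hypothetical, and it only uses flags that appear in the list above. It builds the argument list and prints it without running anything.

```python
from typing import Dict, List, Union

OptionValue = Union[str, int, bool]


def build_evine_command(url: str, options: Dict[str, OptionValue]) -> List[str]:
    """Build an argv list for evine from flag names given without the dash.

    Hypothetical helper for illustration; it does not validate flag names.
    """
    argv = ["evine", "-url", url]
    for name, value in options.items():
        if isinstance(value, bool):
            # Switches such as -robots are passed without a value,
            # and are omitted entirely when set to False.
            if value:
                argv.append(f"-{name}")
        else:
            argv.extend([f"-{name}", str(value)])
    return argv


cmd = build_evine_command(
    "toscrape.com",
    {"depth": 2, "thread": 10, "delay": 300, "robots": True, "sitemap": False},
)
print(" ".join(cmd))
```

Keeping the command as a list rather than a single string means it could be handed to subprocess.run(cmd) without shell quoting problems, which matters for values such as the -header string.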

Crawling

Enter the target URL, press Tab to move to the Options view, and set every option you want the crawler to use to true.

To see the results, you must enter "all" in the key section. The results will appear below, in the response section.


Save Output

To save the output, press “Ctrl+S” in the response section, name the file as you see fit, and close the save view with “Ctrl+Q”.

Extract methods

From Keys

Keys are predefined keywords that specify the kind of data to extract, such as in-scope URLs, out-of-scope URLs, emails, etc. List of all keys:

  • url, to extract IN SCOPE URLs. The URLs are completely sanitized.
  • email, to extract IN SCOPE and out-of-scope emails.
  • query_urls, to extract IN SCOPE URLs that contain a GET query: ?foo=bar.
  • all_urls, to extract OUT SCOPE URLs.
  • phone, to extract a[href] values that contain a phone number.
  • media, to extract addresses of files that are not web executable files, like .exe, .bat, .tar.xz, .zip, etc.
  • css, to extract CSS files.
  • script, to extract JavaScript files.
  • cdn, to extract Content Delivery Network (CDN) addresses, like //api.foo.bar/jquery.min.js
  • comment, to extract HTML comments, <!-- .* -->
  • dns, to extract subdomains that belong to the website.
  • network, to extract social network IDs, like Facebook, Twitter, etc.
  • all, to extract all of the keys (url, query_urls, ...).

Keys are case-sensitive. You can also combine two or three keys, separated by commas.
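
The difference between in-scope URLs, out-of-scope URLs, and URLs with a GET query can be illustrated with a short Python sketch. It assumes that "in scope" means the URL's host is the root domain or one of its subdomains. That definition is an assumption made for illustration, not a description of Evine's internal logic, and the function names is_in_scope and has_get_query are hypothetical.

```python
from urllib.parse import urlparse


def is_in_scope(url: str, root_domain: str) -> bool:
    """Return True if the URL's host is root_domain or a subdomain of it.

    URLs without a scheme have no parsed hostname and count as out of scope.
    """
    host = (urlparse(url).hostname or "").lower()
    root = root_domain.lower()
    return host == root or host.endswith("." + root)


def has_get_query(url: str) -> bool:
    """Return True if the URL carries a query string such as ?foo=bar."""
    return bool(urlparse(url).query)


urls = [
    "https://toscrape.com/index.html",
    "https://books.toscrape.com/catalogue/?page=2",
    "https://example.org/about",
]
in_scope = [u for u in urls if is_in_scope(u, "toscrape.com")]
out_scope = [u for u in urls if not is_in_scope(u, "toscrape.com")]
with_query = [u for u in in_scope if has_get_query(u)]
print(in_scope)
print(out_scope)
print(with_query)
```

Matching on "." + root rather than on a plain suffix keeps a lookalike host such as nottoscrape.com out of scope.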

From Extensions

If you need a file type that is not covered by the keys, you can simply enter the file extension in the Query view, for example png,XML,txt,docx,xlsx,a,mp3, etc.
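
Conceptually, querying by extension is a filter on the path of each discovered URL. Here is a minimal Python sketch under that assumption. The helper filter_by_extension is hypothetical and says nothing about how Evine implements the feature, and the case-insensitive matching is also an assumption.

```python
import posixpath
from typing import Iterable, List
from urllib.parse import urlparse


def filter_by_extension(urls: Iterable[str], extensions: str) -> List[str]:
    """Keep URLs whose path ends in one of the comma-separated extensions."""
    wanted = {
        ext.strip().lower().lstrip(".")
        for ext in extensions.split(",")
        if ext.strip()
    }
    matches = []
    for url in urls:
        # Look only at the path, so a query string does not hide the extension.
        path = urlparse(url).path
        ext = posixpath.splitext(path)[1].lower().lstrip(".")
        if ext in wanted:
            matches.append(url)
    return matches


found = filter_by_extension(
    [
        "https://host1.tld/logo.png",
        "https://host1.tld/report.PDF?download=1",
        "https://host1.tld/about",
    ],
    "png,pdf",
)
print(found)
```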

From JQuery selector

If you have basic jQuery skills, this feature is easy to use; if not, it is not very difficult to learn. For a quick overview of selectors, w3schools is a great source.

Examples:

$("source").attr("src") // To find all of source[src] urls

$("h1").text() // To find h1 values

Template:

$("SELECTOR").METHOD_NAME("arg")

Queries like the ones below, with single quotes or extra spaces, are not supported:

$('SELECTOR').METHOD("arg")

$('SELECTOR').METHOD('arg')

$("SELECTOR"  ).METHOD("arg" )

Methods are described below:

  • text(), returns the content of the SELECTOR without HTML tags.
  • html(), returns the content of the SELECTOR with HTML tags.
  • attr("ATTR"), returns the given attribute of the SELECTOR, e.g. $("a").attr("href")
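
Because the query template is strict about quotes and spacing, it can help to check a query before pasting it into the tool. The sketch below is a hypothetical Python validator that mirrors the template and the three methods described above. It is not Evine's own parser, and the rule that text() and html() take no argument is an assumption based on the examples.

```python
import re
from typing import Optional, Tuple

# Matches $("SELECTOR").METHOD("arg") or $("SELECTOR").METHOD() with
# double quotes only and no extra spaces, as in the template above.
QUERY_PATTERN = re.compile(r'\$\("([^"]+)"\)\.(text|html|attr)\((?:"([^"]*)")?\)')


def parse_query(query: str) -> Optional[Tuple[str, str, Optional[str]]]:
    """Return (selector, method, arg) for a well-formed query, else None."""
    match = QUERY_PATTERN.fullmatch(query)
    if match is None:
        return None
    selector, method, arg = match.groups()
    if method == "attr" and arg is None:
        return None  # attr needs an attribute name
    if method != "attr" and arg is not None:
        return None  # assumed: text() and html() take no argument
    return selector, method, arg


for query in [
    '$("source").attr("src")',
    '$("h1").text()',
    "$('source').attr('src')",
    '$("source"  ).attr("src" )',
]:
    print(query, "->", parse_query(query))
```

The first two queries parse into their parts, while the single-quoted and extra-space variants are rejected, matching the unsupported forms listed above.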

Bugs or Suggestions

To report bugs or suggestions, create an issue.

Evine is heavily inspired by wuzz.

August 25, 2020
© HAKIN9 MEDIA SP. Z O.O. SP. K. 2023