Measuring the Internet – Part I: Distributed nmap

April 16, 2012 by Veronica Valeros

Last month, I participated in a project that involved scanning a whole continent. The goal of the project was to report, within 20 days, how many hosts were running a specific service. This kind of measurement is not an easy task. According to the Internet Systems Consortium, Inc., the number of Internet hosts had reached 888,239,420 by January 2012¹. These hosts are spread across a very large number of networks, which makes the scanning task even more complex.

You may ask, “What is this for? What information could a scan like this give us?”

This type of scanning may provide valuable information for different areas:

  • In marketing, this information could be highly valuable to sell and promote some services or products.
  • In research, this information could help diagnose problems and determine exactly how many services are running.
  • In testing network tools, it can be used for performance improvements.
  • It could open the door for new markets and business.
  • For fun. It is really amusing to be able to know what is happening out there.

Unfortunately, the value of this information decreases over time, which is why the duration of the project is critical to its success. Also, to scan a huge number of networks in a short period of time, it is necessary to develop a methodology to automate the scans.

This is a series of two articles that will cover:

  1. Measuring the Internet – Part I: Distributed nmap
    Most professionals working in the security field have heard of nmap. Nmap is the de facto standard tool for finding active hosts and running services. But when you have to scan thousands of networks, the most important problem to overcome is the bandwidth needed. Bandwidth is limited in most countries of the world, so distributing the scan is a good way to tackle this problem. This first article covers a tool that was developed to get around this limitation: dnmap, the distributed nmap. We will cover its history, details and usage.
  2. Measuring the Internet – Part II: the methodology
    Having a tool to distribute the scans is necessary, but it is not enough. Without a plan to organize the scan, obtaining good results can get really messy. The second part of the series presents a methodology for carrying out these huge scans. The four phases of this methodology, born from experience, will guide you through a service-measurement project using dnmap.

Dnmap, the distributed nmap

How do we share our bandwidth with friends to speed up the network scans? dnmap!
When our project started, we realized that we needed to scan millions of networks. The first decision we made was to scan one country at a time, thus reducing the number of networks to scan. However, the number of networks was still huge. Furthermore, our bandwidth was small, allowing us to run only two nmap commands simultaneously. Consequently, the first estimate of the time needed to scan one small country alone was more than 4 years.
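
An estimate of this kind is simple arithmetic: the number of commands, times the average duration of one command, divided by how many commands can run at once. The sketch below is written in Python 3, and every figure in it is hypothetical and chosen only for illustration. These are not the numbers from our project.

```python
def estimated_days(num_commands, minutes_per_command, parallel_commands):
    """Rough wall-clock estimate, in days, for a batch of scan commands."""
    total_minutes = num_commands * minutes_per_command / parallel_commands
    return total_minutes / (60 * 24)


# Hypothetical figures, for illustration only: 100,000 commands of 20 minutes each.
print(round(estimated_days(100_000, 20, 2)))   # two commands at a time   -> 694
print(round(estimated_days(100_000, 20, 50)))  # fifty commands at a time -> 28
```

The only term that friends with spare bandwidth can change is the last one, the number of commands running in parallel.
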
Because we wanted to deliver the results in 20 days, we clearly needed another approach. Our first attempt to speed up the scans was to create a file of nmap commands and share it with our trustworthy friends. Each of them had to read some commands, execute them, and mark them as ‘scanned’ to let the others know.
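
A file of commands like this one does not need to be written by hand. The following is a minimal Python 3 sketch, not part of dnmap, that splits a larger address block into /24 networks and writes one nmap command per network. The function names, the default port and the choice of nmap options (-sS, -n, -p, -oA) are assumptions made for the example, and the input blocks are expected to be /24 or larger.

```python
import ipaddress


def nmap_commands(cidr, ports="80", new_prefix=24):
    """Yield one nmap command line per /24 inside the given CIDR block."""
    network = ipaddress.ip_network(cidr)
    for subnet in network.subnets(new_prefix=new_prefix):
        base = str(subnet.network_address)
        # -sS: TCP SYN scan, -n: no DNS resolution,
        # -oA: save the output in all formats under the given base name
        yield f"nmap -sS -n -p {ports} {subnet} -oA {base}"


def write_command_file(path, cidrs, ports="80"):
    """Write the commands for every block to a text file, one per line."""
    count = 0
    with open(path, "w") as handle:
        for cidr in cidrs:
            for command in nmap_commands(cidr, ports):
                handle.write(command + "\n")
                count += 1
    return count
```

For example, write_command_file("commands.txt", ["10.0.0.0/22"], ports="22") would write four lines, the first being "nmap -sS -n -p 22 10.0.0.0/24 -oA 10.0.0.0".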

The main advantage of this approach was:

  • Multiple nmap commands were running in parallel. This increased the speed of the global scan.

The main disadvantages of this approach were:

  • The users had to constantly select commands, execute them, and mark them in the file. This is a time-consuming task that would be too much for anyone.
  • The nmap results were distributed among the computers of all our friends, and they then became difficult to track and verify.

To overcome these disadvantages, we opted to automate the process. We needed a tool to automatically distribute nmap commands among the connected users and store the results on the server. That’s how dnmap was born.

About

Dnmap is a free software framework that distributes nmap commands among several clients. It was written by the Argentinean Sebastián García, co-founder of the MatesLab² hackspace, in response to the need to increase the speed and efficacy of huge network scans.

Based on Twisted³, a Python networking engine, dnmap implements a client-server architecture that allows clients to execute the commands provided by the server.

In the latest version of the tool, some basic security measures were implemented to protect clients from executing malicious commands. However, clients should never connect to unknown servers.

The dnmap server contains the logic to automatically distribute a set of nmap commands among all the connected clients and store the results on the server. The following scheme illustrates the behavior of the tool:

Dnmap server

Dnmap server has three main functionalities:

  • Allows clients to remotely connect to it.
  • Distributes nmap commands to the connected clients.
  • Stores the client scanning results locally.

Figure 1 shows a preview of the configurable options that this tool offers:


Figure 1 – Help – dnmap_server.py

Some of the major features dnmap presents are:

  • Server configuration:
    • Allows you to specify the port number to use; by default it uses port 46001.
    • Allows you to set a custom log file name.
    • It is possible to customize the client timeout. The client is considered offline if no responses are received within this time.
    • The data between client and server are encrypted using the TLS protocol.
    • It generates a very detailed log file.
  • Client management:
    • Shows information and statistics in real-time about the clients connected:
      (Figure 2 shows how dnmap presents this information)
      • MET (Mission Elapsed Time).
      • Number of online clients.
      • Alias.
      • Number of executed commands.
      • Last time seen.
      • Time since the last data sent from client.
      • Uptime.
      • Version of the client.
      • Shows if the client was executed with root privileges or not.
      • Running average⁴ of executed commands per minute.
      • Average of executed commands per minute.
      • Client status: Online, Offline, Executing or Storing.
  • Commands executed
    • The server saves the last command sent in a file with the extension ‘.dnmaptrace’. This feature prevents the server from losing information: after a shutdown, the server resumes scanning from the last command sent.
    • New commands can be added to the original command file without stopping the server, which reads them automatically. This allows any process to add new commands dynamically.
    • The server remembers the last command sent to each client, so if a client goes down, the command is re-scheduled for later.
  • Statistics
    • Calculates the running average of commands executed per minute.
    • Calculates the historic average of the number of commands executed per minute.
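
The difference between the two averages can be shown with a short Python 3 sketch. This is not dnmap's code. It assumes that the historic average is the total number of commands divided by the total elapsed time, and that the running average only counts the commands finished within a recent window. The class name and the five-minute window are hypothetical.

```python
import time


class CommandRate:
    """Tracks historic and sliding-window commands-per-minute figures."""

    def __init__(self, window_minutes=5, start=None):
        self.start = time.time() if start is None else start
        self.window_seconds = window_minutes * 60
        self.finished = []  # completion timestamps, in seconds

    def record(self, when=None):
        """Register one finished command."""
        self.finished.append(time.time() if when is None else when)

    def historic_average(self, now=None):
        """Commands per minute since the start of the scan."""
        now = time.time() if now is None else now
        elapsed_minutes = (now - self.start) / 60
        return len(self.finished) / elapsed_minutes if elapsed_minutes > 0 else 0.0

    def running_average(self, now=None):
        """Commands per minute over the most recent window only."""
        now = time.time() if now is None else now
        cutoff = now - self.window_seconds
        recent = sum(1 for t in self.finished if t > cutoff)
        return recent / (self.window_seconds / 60)
```

Under these assumptions, the running average reacts quickly when clients join or leave, while the historic average is the better figure for estimating when the whole scan will finish.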


Figure 2 – dnmap_server.py running.
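
The command handling described above (hand out the next command, remember which client has it, re-schedule it if that client disappears, and keep a trace of the last command sent) can be sketched in a few lines of Python 3. This is an illustration of the idea and not the code of dnmap_server.py. The class, its method names and the format of the trace file are all hypothetical.

```python
from collections import deque


class CommandScheduler:
    """Hands out commands and re-queues the ones whose client was lost."""

    def __init__(self, commands, trace_path=None):
        self.pending = deque(commands)
        self.in_flight = {}  # client alias -> command being executed
        self.trace_path = trace_path

    def next_command(self, client):
        """Give the next pending command to a client, or None if none is left."""
        if not self.pending:
            return None
        command = self.pending.popleft()
        self.in_flight[client] = command
        if self.trace_path:
            # Remember the last command sent so that a restart can resume here.
            with open(self.trace_path, "w") as handle:
                handle.write(command + "\n")
        return command

    def results_received(self, client):
        """The client finished its command and returned the output."""
        self.in_flight.pop(client, None)

    def client_lost(self, client):
        """The client timed out: put its command back in the queue."""
        command = self.in_flight.pop(client, None)
        if command is not None:
            self.pending.append(command)

    def add_commands(self, commands):
        """Append new commands without stopping the scheduler."""
        self.pending.extend(commands)
```

In this sketch a lost command goes to the end of the queue, which matches the idea of re-scheduling it "for later" instead of retrying it immediately.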

Dnmap client


Figure 3 – Help – dnmap_client.py

The main functionality of the dnmap client is simple:

  • Connect to the server and wait for commands to execute.

In the latest version of the tool, the client has become more complex. Figure 3 above shows the most recent options it presents. The client implements some logic to control the nmap scanning rate. The server cannot force the client to perform the scans at a minimum rate: if the command received has the --min-rate nmap option set, the client executes the command ignoring this option. On the other hand, the server can slow down the scan rate by setting a maximum rate with the --max-rate nmap option. The client also has an option to set the maximum rate. If the server and the client set different maximum rates, the client value takes priority.

The prevention of command injection is another of the improvements made. The client checks the received command for certain malicious characters and, if it finds any, drops the command. This logic seeks to protect clients from executing malicious commands that could harm their machines.

Also, the latest version of the dnmap client verifies that nmap is installed on the system before trying to execute commands. In older versions, the lack of this validation caused commands sent to a client without nmap installed to be mistakenly marked as scanned, ruining the results.

These are the major features of the dnmap client:

  • If the server goes down, the client keeps running. When the server is available again, the client reconnects and sends the pending data.
  • The client verifies the commands sent by the server and strips dangerous characters in order to avoid command injection attacks.
  • The client only executes nmap commands. The server cannot force the clients to execute any program other than nmap.
  • It modifies the command sent by the server to use the known nmap binary in the system.
  • It is possible to select an alias for the user. This is very useful for tracking when users run many clients on different machines.
  • Allows you to set the port to which the client connects.
  • Forces the command to always store the nmap output locally.
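
The client-side checks described in this section can be put together in a short Python 3 sketch. It is not the code of dnmap_client.py. The list of forbidden characters, the function name and its parameters are assumptions made for the example, and only the space-separated form of the rate options ("--min-rate 100") is handled.

```python
import shlex
import shutil

# Hypothetical list of shell metacharacters; the real client may check others.
FORBIDDEN = set(";|&`$<>\\\n")


def prepare_command(command, client_max_rate=None, nmap_path=None):
    """Return a safe argument list for nmap, or None if the command is dropped."""
    if any(ch in FORBIDDEN for ch in command):
        return None  # possible command injection
    try:
        args = shlex.split(command)
    except ValueError:
        return None  # unbalanced quotes
    if not args or args[0] != "nmap":
        return None  # only nmap may be executed
    nmap_path = nmap_path or shutil.which("nmap")
    if nmap_path is None:
        raise RuntimeError("nmap is not installed on this system")

    cleaned = [nmap_path]  # always use the locally known nmap binary
    server_max = None
    rest = iter(args[1:])
    for arg in rest:
        if arg == "--min-rate":
            next(rest, None)  # skip its value: no forced minimum rate
        elif arg == "--max-rate":
            server_max = next(rest, None)
        else:
            cleaned.append(arg)

    # The client value has priority over the one sent by the server.
    max_rate = client_max_rate if client_max_rate is not None else server_max
    if max_rate is not None:
        cleaned += ["--max-rate", str(max_rate)]
    return cleaned
```

Returning a list of arguments instead of a string means the result can be passed to subprocess.run() without a shell, which is a second barrier against injection in addition to the character check.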

Figure 4 below shows a dnmap client running.


Figure 4 – dnmap_client.py running.

Usage

Basic Internet usage:

  1. First create an empty file and add some nmap commands. For example, the file could be named ‘commands.txt’.
  2. Be sure the computer that will run the server is reachable from the Internet. Your public IP address could be, for example, x.x.x.x. Also make sure the port you are going to use is not blocked (by default it is TCP/46001).
  3. Make sure you are in the dnmap directory when you execute the server. Otherwise you will need to give the full path of the ‘server.pem’ file.
  4. Start the server with this command: dnmap_server.py -f commands.txt
  5. Make sure your clients can reach the server port. Your clients can use the command: hping3 -S -p 46001 x.x.x.x
  6. Start any number of clients, from any location in the world: dnmap_client.py -s x.x.x.x -a client1
  7. Check that the server shows the clients correctly. Also, monitor its behavior for a while to see if the commands are executed correctly.
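
If hping3 is not available on a client machine, the reachability check of step 5 can be approximated with a plain TCP connection. The following Python 3 sketch is not part of dnmap, and the function name is hypothetical. Unlike hping3 -S, which sends a single SYN packet, it completes the TCP handshake and then closes the connection, so the server may log the attempt.

```python
import socket


def port_is_reachable(host, port=46001, timeout=5.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered, timed out or unresolvable: treat as unreachable.
        return False
```

A client would call port_is_reachable("x.x.x.x") with the public address of the server before starting dnmap_client.py.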

Friend usage:

Some more complex connection structures can be created with your friends. Figure 5 shows a more fun structure. Two friends can connect to their own servers and to their friend’s server to double the scan bandwidth. In this scheme, we have two servers and four clients. Any nmap commands planned by both friends get executed on their clients simultaneously. In this way, they can share their connections, especially if they do not send commands at the same time.

Figure 5 – dnmap friend usage.

Dependencies

For running this framework the following software is needed:

  • Nmap – Network security scanner
    http://nmap.org/
  • Python 2.7.x
    http://www.python.org/
  • Twisted
    http://twistedmatrix.com

Conclusion

Dnmap makes it easier to run thousands of nmap scans automatically. The speed at which you get results depends on the number of clients connected and the bandwidth of each of them. However, as we previously mentioned, this is not enough. In order to get good results, it is necessary to follow a methodology.

It is important to consider the legal aspects of a project like this. In some countries, it is illegal to send more than one TCP packet with the SYN bit activated. It is recommended to get permission before performing such scans.

In the second article of this series, we will describe a four-phase methodology that will guide you from selecting the goal of the project to checking and verifying the results obtained. We will also show and explain how important it is to prepare a good command list.

References

  1. Internet host count history, Internet Systems Consortium, Inc., http://www.isc.org/solutions/survey/history
  2. MatesLab hackspace, http://www.mateslab.com.ar
  3. Twisted framework, http://twistedmatrix.com
  4. Running average, Wikipedia, http://en.wikipedia.org/wiki/Running_average
  5. Dnmap, http://sourceforge.net/projects/dnmap/

Verónica Valeros is in her final year of Computer Engineering at FASTA University, Argentina. She is currently working on her undergraduate thesis in web anomaly analysis for CITEDEF (Institute of Scientific and Technical Research for Defense, Argentina). She is a freelance penetration tester at Talsoft SRL and a researcher at InfoSec Institute. She is also a founding member of the MatesLab hackspace, the first hackspace in Mar del Plata, Argentina. Her areas of interest include web application security, lockpicking and physical security, Python programming, wireless analysis, vulnerability research and others. She is co-author of the following projects: Web Crawler Security Tool & Domain Analyzer Security Tool. In her spare time, she likes to paint, ride her motorcycle, wardrive, cook, travel and fly dual-line kites.

5 responses to “Measuring the Internet – Part I: Distributed nmap”

  1. Sebastián Fink says:

    Great article Veronica!! Congratulations!

    It seems that it was a great project full of hard work and also very fun.

    I’m looking forward to reading the next article from this series.

    Cheers! 😉

  2. None says:

    I would recommend adopting this approach with an established distributed computing client like BOINC. There is already a very large group (roughly 2 million users) in geographically diverse areas using this tool for other projects, and many of them would happily participate in mapping the Internet as well.

    For each of these projects, a central server hands out small bits of work to people in the project, which are run based on client limits (CPU use, RAM, bandwidth). These completed work units are then returned to the originating server. Integrating with BOINC could easily net you a user base in the tens of thousands, almost immediately.

    http://boinc.berkeley.edu/

  3. seba garcia says:

    Well, boinc is a great project but I think it won’t be useful here. The problem is that most boinc projects are CPU-intensive, while dnmap is network-intensive. I really doubt that people would kindly share their bandwidth for this project, but I can be wrong (or hope to be!). The dnmap client lets you manage the bandwidth used, so perhaps we could give boinc a try. I should check whether they have network constraints on the projects.

  4. Art says:

    Have you thought of splitting the scan into non-contiguous blocks so that you don’t end up scanning lots of addresses in series in a certain IP range?

    How do you deal with the fact that you will end up triggering IDS’s that will block your scan?

    Have you received any abuse problems?

  5. We split up the IP ranges into several groups (/24 networks) to speed up the scans, but we didn’t sort them into non-contiguous blocks or in any other way. Nmap optimizes the scans when scanning an entire network; maybe there is some way to tell nmap to scan a network in a non-contiguous way… I don’t know if this option exists.

    We planned the scans very carefully to avoid scanning inactive networks or inactive hosts. Then we only scan a few ports on the active hosts, which is around 6 packets per host.
    We are not sure if we triggered an IDS; maybe we did. We knew from the beginning that we were not going to find all the active hosts and that there were going to be errors. We work with millions of hosts, so we did not try to get perfect results but to reduce the errors as much as possible.
    In the next article I’ll explain the methodology we use and the scanning stages, along with some guidance on how to plan and perform the scans. It may clarify many things!
    We didn’t receive any abuse problems because we avoid sending more than TCP SYN packets and because in our country it is not illegal to perform port scanning tasks. Also, having several clients scanning from several different networks helps mitigate the impact of probable IDS blocking.
