Elevating Your Cybersecurity with Distributed Scanning Tools: An In-Depth Look at Archer

As attack surfaces expand and threats grow more sophisticated, organizations of all sizes need vulnerability scanning that is both fast and scalable. Distributed scanning tools meet that need by spreading the work of a scan across many machines. Among these tools, Archer stands out as a versatile option designed for speed, scalability, and comprehensive network analysis.

Understanding the Power of Distributed Scanning Tools

Traditional network scanning, often performed from a single machine, can be time-consuming and resource-intensive, especially for large networks or internet-wide scans. Distributed scanning tools split the workload across multiple nodes, or workers, that run in parallel. This shortens scan times and makes it feasible to assess vast IP ranges and complex infrastructures.

Key benefits of using distributed scanning tools include:

  • Enhanced Speed: By distributing tasks, scans are completed much faster compared to single-machine setups.
  • Improved Scalability: Easily handle large networks and massive target lists by adding more worker nodes.
  • Increased Efficiency: Optimized resource utilization and reduced bottlenecks through parallel processing.
  • Greater Flexibility: Adaptable to various scanning needs, from targeted assessments to broad internet-wide sweeps.

Archer: A Leading Distributed Scanning Tool for Comprehensive Security Assessments

Archer is a distributed network and vulnerability scanner written in Go and built for speed and scalability. It was originally conceived for internet-wide scans, and its architecture handles scans of any scale, which makes it a good fit for organizations that need to assess extensive digital footprints.

At the heart of Archer’s effectiveness are its Scan Workflows. These workflows orchestrate a sequence of industry-standard security tools, leveraging the output of each stage to refine subsequent scans. This intelligent chaining of tools ensures optimal results and minimizes redundant scanning efforts. Archer also employs Elasticsearch scripting to maintain a unified record for each IP address per scan, simplifying result organization and analysis.

Archer provides both a Command-Line Interface (CLI) and an Application Programming Interface (API), offering users flexible interaction and integration options for seamless incorporation into existing security workflows.

Use Cases: Unleashing the Potential of Archer’s Distributed Scanning

Archer’s capabilities as a distributed scanning tool are applicable across a wide spectrum of cybersecurity use cases:

  • Internet-Wide Scans: Discover and analyze publicly exposed assets and vulnerabilities across the internet, crucial for understanding global threat landscapes and identifying potential risks to your organization’s external presence.
  • Attack Surface Management: Proactively identify and monitor your organization’s external attack surface, encompassing all internet-facing assets. Archer helps pinpoint potential entry points for attackers, enabling timely remediation and risk reduction.
  • Bug Bounties & Penetration Testing: Accelerate vulnerability discovery in bug bounty programs and penetration testing engagements. Archer’s speed and efficiency allow security professionals to cover wider scopes and identify vulnerabilities faster.
  • Distributed Fast Scanning of Large Ranges: Quickly assess the security posture of vast IP address ranges, whether for internal network segmentation checks, cloud infrastructure security audits, or large-scale vulnerability assessments.

Empowering Security Assessments with Integrated Tools and Workflows

Archer distinguishes itself by leveraging established and trusted security tools within its modular framework. Rather than reinventing the wheel with custom scanning techniques, Archer integrates battle-tested industry standards, ensuring reliability and accuracy.

Currently supported tools within Archer include:

  • Masscan: A renowned ultra-fast port scanner, ideal for rapidly identifying open ports across vast networks.
  • httpx: A versatile HTTP probing tool designed for fast and customizable web server enumeration and analysis.
  • Nuclei: A powerful vulnerability scanning engine utilizing customizable templates to detect a wide range of security issues.

This modular design allows for easy expansion. New tools can be integrated into Archer by developing new modules, ensuring the platform remains adaptable to evolving security landscapes and user needs.

Archer’s Workflow system is a key differentiator. It enables the sequential execution of multiple tools, feeding results from one stage into the next for optimized scanning. For instance, a typical workflow in Archer, utilizing all available tools, proceeds as follows:

graph LR
a[Masscan] --> b[Httpx]
a --> c["Nuclei (Network Templates)"]
b --> d["Nuclei (HTTP Templates)"]

This workflow shows how Archer chains tools: Masscan identifies open ports, httpx probes those ports for HTTP services, and Nuclei runs twice, applying network templates to the open ports found by Masscan and HTTP templates to the services found by httpx. Because each stage scans only what the previous stage discovered, redundant probing is kept to a minimum and the assessment is more comprehensive than any single tool would give.

This diagram illustrates Archer’s workflow, showcasing how Masscan, Httpx, and Nuclei are integrated for sequential and efficient distributed scanning.
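The chaining described above can be modelled as a small dependency graph. The following is a minimal Python sketch, not Archer's actual Go implementation: the stage names, the `WORKFLOW` table, and the `runners` callables are all hypothetical stand-ins chosen to mirror the diagram.

```python
from collections import deque

# Hypothetical stage table mirroring the diagram: each stage maps to the
# stage whose output it consumes (None means it consumes the scan targets).
WORKFLOW = {
    "masscan": None,
    "httpx": "masscan",
    "nuclei-network": "masscan",
    "nuclei-http": "httpx",
}


def execution_order(workflow):
    """Return stages so that every stage comes after the stage feeding it."""
    children = {}
    for stage, parent in workflow.items():
        children.setdefault(parent, []).append(stage)
    order = []
    queue = deque(children.get(None, []))
    while queue:
        stage = queue.popleft()
        order.append(stage)
        queue.extend(children.get(stage, []))
    return order


def run_workflow(workflow, runners, targets):
    """Run each stage on the output of its parent stage.

    `runners` maps a stage name to a callable; real runners would invoke
    the scanning tools, here they are placeholders supplied by the caller.
    """
    results = {}
    for stage in execution_order(workflow):
        parent = workflow[stage]
        stage_input = targets if parent is None else results[parent]
        results[stage] = runners[stage](stage_input)
    return results
```

With stub runners, `run_workflow` makes the data flow visible: whatever the `masscan` runner returns is exactly what the `httpx` and `nuclei-network` runners receive.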

Installation and Setup: Getting Archer Up and Running

Setting up Archer is straightforward, offering flexibility with both Docker and manual installation options:

1. Choose your Installation Method:

  • Docker: The quickest method. Use docker compose up -d to launch Archer and its dependencies.
  • Manual: For more control and customization. Requires manual installation of Redis, PostgreSQL, and Elasticsearch.

2. Create Elasticsearch Index:

  • Use the provided elasticsearch.json configuration file (in the configs folder of the Archer repository) to create a new Elasticsearch index named “archer”. For Docker setups, use the curl command provided in the documentation.

3. Build Components:

  • Execute make all to compile Archer’s components.

4. Configure Components:

  • Modify configuration files within the configs folder to tailor each component (Coordinator, Scheduler, Worker) to your environment.

5. Start Services:

  • Launch the Coordinator with database migrations: ./dist/archer-coordinator -migrate
  • Start the Scheduler: ./dist/archer-scheduler
  • Start the Worker: ./dist/archer-worker

6. Initiate Scans:

  • Use the Archer CLI to create new scans.

Docker Setup: Simplified Deployment

For users prioritizing ease of deployment, Docker provides a streamlined setup process. The docker-compose.yml file included in the Archer repository simplifies the orchestration of Archer and its necessary dependencies (Redis, PostgreSQL, Elasticsearch).

Creating the Elasticsearch Index via Docker

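After starting Archer with Docker Compose, run the following curl command from the root of the Archer repository, so that the relative path configs/elasticsearch.json resolves, to create the “archer” Elasticsearch index. The -k flag skips TLS certificate verification, and elastic:elastic are the default credentials: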

curl -X PUT -k  -u "elastic:elastic"  -H "Content-Type: application/json"  -d @"configs/elasticsearch.json"  https://127.0.0.1:9200/archer

This command ensures that Elasticsearch is properly configured to store and index Archer’s scan results.
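For readers who prefer to script this step, the same request can be assembled with Python's standard library. This is only a sketch under assumptions: the function name and parameters are made up for illustration, and the code builds the request without sending it.

```python
import base64
import urllib.request


def build_create_index_request(base_url, index, username, password, mapping_json):
    """Build (but do not send) the PUT request that creates an index.

    `mapping_json` is the raw bytes of the index definition, for example
    the contents of configs/elasticsearch.json.
    """
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}",
        data=mapping_json,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )
```

Sending the request with `urllib.request.urlopen` would additionally need an `ssl` context with certificate verification turned off to match curl's -k flag, which is only appropriate for a local test deployment.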

Accessing and Visualizing Scan Results

Archer leverages Kibana, a powerful data visualization dashboard, to provide insights into scan results stored in Elasticsearch.

To view results:

  1. Navigate to http://127.0.0.1:5601 in your web browser.
  2. Log in with the default credentials: elastic:elastic.
  3. Access “Stack Management” under “Management” in the sidebar menu.
  4. Go to “Data Views” and create a new data view.
  5. Name the data view “archer”.
  6. Select “timestamp” as the timestamp field.
  7. Click “Create data view.”
  8. Open “Discover” under “Analytics” in the sidebar.
  9. Adjust the time range in the top right corner to view scan data.

This Kibana interface allows you to explore scan results, filter data, and create visualizations to gain a deeper understanding of identified vulnerabilities and network insights.

Monitoring Task Progress

Archer provides a dedicated Asynq Web UI for monitoring task queues and worker activity. Access the Asynq Web UI at http://localhost:8080 to observe the real-time status of scanning tasks and system performance.

CLI Usage: Interacting with Archer via Command Line

Archer’s Command-Line Interface (CLI) offers a convenient way to initiate new scans. While current CLI functionalities are focused on scan creation, future updates may expand its capabilities. For detailed scan progress monitoring, users can directly query the PostgreSQL database.

Example CLI Commands:

  • Internet Wide Scan (Specific ports):

    ./dist/archer-cli -c configs/cli.yaml new -m all -t 0.0.0.0/0 -p 80 -p 443

    This command initiates an internet-wide scan using all modules, targeting the entire IPv4 address space (0.0.0.0/0) on ports 80 and 443.

  • Specific Module Scan (Single target, multiple ports):

    ./dist/archer-cli -c configs/cli.yaml new -m masscan -t 1.1.1.1 -p 80 -p 443

    This command runs only the masscan module against the single IP address 1.1.1.1 on ports 80 and 443.

  • Scan with Target List:

    ./dist/archer-cli -c configs/cli.yaml new -m all -l targets.txt

    This command utilizes all modules to scan targets listed in the targets.txt file.
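When scans are launched from scripts, it can help to build the argument list programmatically. The helper below is a hypothetical Python sketch that uses only the flags visible in the examples above (-c, new, -m, -t, -l, -p); it does not assume any other CLI options exist.

```python
def build_scan_command(modules, target=None, target_file=None, ports=(),
                       cli="./dist/archer-cli", config="configs/cli.yaml"):
    """Return an argv list for `archer-cli new`, mirroring the examples above."""
    if target is None and target_file is None:
        raise ValueError("provide a target (-t) or a target list file (-l)")
    argv = [cli, "-c", config, "new", "-m", modules]
    if target is not None:
        argv += ["-t", target]
    if target_file is not None:
        argv += ["-l", target_file]
    for port in ports:
        # The examples repeat -p once per port.
        argv += ["-p", str(port)]
    return argv
```

The resulting list can be passed to `subprocess.run(argv, check=True)`, which avoids shell quoting issues.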

Example Result: Understanding Scan Output

Archer outputs scan results in JSON format, indexed within Elasticsearch. Below is an example of a scan result entry:

{
  "_index" : "archer",
  "_id" : "67215d5bebdc30c792d17d6bef54c0d9695cd902",
  "_version" : 21,
  "_seq_no" : 30,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "ip" : "1.1.1.1",
    "scan" : "666kf2xdpu1k",
    "ports" : [
      {
        "metadata" : {
          "task" : "tjhgdrsphvdd",
          "module" : "masscan",
          "timestamp" : "2023-02-17T02:39:02.012817236Z"
        },
        "port" : 80
      },
      {
        "metadata" : {
          "task" : "tjhgdrsphvdd",
          "module" : "masscan",
          "timestamp" : "2023-02-17T02:39:02.11115321Z"
        },
        "port" : 443
      }
    ],
    "timestamp" : "2023-02-17T02:45:58.23592402Z",
    "http" : [
      {
        "metadata" : {
          "task" : "tmau646n49mg",
          "module" : "httpx",
          "timestamp" : "2023-02-17T02:39:19.932586378Z"
        },
        "scheme" : "http",
        "port" : 443,
        "title" : "400 The plain HTTP request was sent to HTTPS port"
      }
    ],
    "detections" : [
      {
        "severity" : "info",
        "metadata" : {
          "task" : "r339zsmiiib0",
          "module" : "nuclei",
          "timestamp" : "2023-02-17T02:45:56.345803529Z"
        },
        "matched_at" : "http://1.1.1.1:443",
        "port" : 443,
        "matcher_name" : "cloudflare",
        "name" : "Wappalyzer Technology Detection",
        "description" : "",
        "template_id" : "tech-detect",
        "type" : "http",
        "tags" : [
          "tech"
        ]
      },
      {
        "severity" : "info",
        "metadata" : {
          "task" : "r339zsmiiib0",
          "module" : "nuclei",
          "timestamp" : "2023-02-17T02:45:58.235956801Z"
        },
        "matched_at" : "http://1.1.1.1:443",
        "port" : 443,
        "matcher_name" : "permissions-policy",
        "name" : "HTTP Missing Security Headers",
        "description" : "This template searches for missing HTTP security headers. The impact of these missing headers can vary.",
        "template_id" : "http-missing-security-headers",
        "type" : "http",
        "tags" : [
          "misconfig",
          "headers",
          "generic"
        ]
      }
    ]
  }
}

This example showcases the structured output, including IP address, scan metadata, port information, HTTP service details (if applicable), and any detected vulnerabilities or security findings from Nuclei scans.
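A document of this shape is easy to post-process. As a small illustration, the Python sketch below condenses one such entry into a summary; the function name is made up, and the field names are taken directly from the example above.

```python
from collections import Counter


def summarize(hit):
    """Condense one Archer result document into a short summary dict."""
    src = hit["_source"]
    return {
        "ip": src["ip"],
        "scan": src["scan"],
        # Open ports reported by the port-scan stage, de-duplicated.
        "open_ports": sorted({p["port"] for p in src.get("ports", [])}),
        # Page title per probed HTTP port.
        "http_titles": {h["port"]: h.get("title", "") for h in src.get("http", [])},
        # Number of detections per severity level.
        "detections_by_severity": dict(
            Counter(d["severity"] for d in src.get("detections", []))
        ),
    }
```

Applied to the example entry, the summary would list ports 80 and 443 as open and count two detections with severity "info".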

Optimizing Archer for Peak Performance

To maximize the efficiency of your distributed scanning tool infrastructure with Archer, consider these optimization strategies:

  • Masscan Optimization:
    • Dedicated Masscan Workers: Deploy dedicated worker nodes with high network bandwidth specifically for Masscan tasks.
    • Packet Rate Tuning: Experiment to determine the optimal Masscan packet rate for your server environment to achieve maximum scanning speed without packet loss.
    • Concurrency Control: Limit Masscan tasks to a concurrency of 1 per server for optimal resource utilization and to avoid overloading network interfaces.
  • Faster Task Execution:
    • Scale Worker Nodes: Increase the number of worker nodes to parallelize task execution and reduce overall scan time.
    • Increase Concurrency: Adjust worker concurrency settings to allow each worker to handle more tasks simultaneously, based on available resources.
  • Faster Task Scheduling:
    • Scale Schedulers: Deploy multiple scheduler instances to handle increased task scheduling demands.
    • Increase Scheduler Concurrency: Optimize scheduler concurrency to improve task distribution speed.
    • Elasticsearch Monitoring: Regularly monitor Elasticsearch performance to ensure it doesn’t become a bottleneck in the scanning pipeline. Optimize Elasticsearch indexing and query performance as needed.
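To reason about packet rate and worker count together, a back-of-the-envelope estimate can be useful. The Python sketch below rests on simplifying assumptions that are not from Archer itself: one probe packet per address and port, no retries, no excluded ranges, and targets split evenly across dedicated Masscan workers.

```python
def estimate_scan_seconds(num_addresses, num_ports, packets_per_second, workers=1):
    """Rough lower bound on port-scan duration under the stated assumptions."""
    probes = num_addresses * num_ports
    return probes / (packets_per_second * workers)


# Full IPv4 space on two ports at 100,000 packets per second on one worker.
single = estimate_scan_seconds(2**32, 2, 100_000)
# The same scan split evenly across four workers at the same per-worker rate.
split = estimate_scan_seconds(2**32, 2, 100_000, workers=4)
```

Under these assumptions the single-worker scan takes roughly 24 hours and the four-worker scan about 6. Real scans will differ, but the estimate shows why packet rate and worker count are the first settings to tune.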

Archer’s Core Components: Distributed Architecture Explained

Archer’s architecture is built upon a distributed component model, enabling scalability and resilience. The key components are:

Coordinator

The Coordinator acts as the central management hub for all scans and tasks within Archer. It maintains the overall state of scans, tracks task assignments to workers, and orchestrates the scan workflow. The Coordinator is responsible for:

  • Managing scan definitions and configurations.
  • Tracking task status and progress.
  • Interacting with the Heartbeat service to monitor available worker capacity.
  • Updating scan stages and triggering subsequent tasks.
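How the Coordinator divides targets among child tasks is not described here. One plausible approach, shown purely as an illustration, is to split a CIDR block into equal subnets, one per available worker. The function name and the power-of-two rounding are assumptions of this sketch.

```python
import ipaddress


def split_network(cidr, workers):
    """Split a CIDR block into at least `workers` equal subnets.

    The count is rounded up to a power of two because subnets are formed
    by extending the prefix length by a whole number of bits.
    """
    network = ipaddress.ip_network(cidr)
    extra_bits = max(workers - 1, 0).bit_length()
    # Never extend past a single address (/32 for IPv4).
    extra_bits = min(extra_bits, network.max_prefixlen - network.prefixlen)
    if extra_bits == 0:
        return [str(network)]
    return [str(subnet) for subnet in network.subnets(prefixlen_diff=extra_bits)]
```

For example, splitting 10.0.0.0/24 for four workers yields four /26 blocks, each of which could become the target of one child task.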

Scheduler

The Scheduler is responsible for dynamically scheduling new scanning tasks based on scan definitions and available worker resources. Its primary function is to query Elasticsearch for results from previous scan stages and determine the next tasks to be executed. Importantly, the Scheduler shares the same codebase as the Worker but is configured to process scheduling tasks from a dedicated Redis queue. Running multiple Schedulers allows for horizontal scaling of task scheduling capabilities.
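To make the Scheduler's lookup concrete, here is a hedged Python sketch of the kind of Elasticsearch query body it might send to find hosts that have results from an earlier stage. The function is hypothetical; the field names (`scan`, `ip`, `ports`, `http`) come from the sample result document, and treating `scan` as an exact-match keyword field is an assumption about the index mapping.

```python
def next_stage_query(scan_id, required_field, page_size=1000):
    """Build a search body for documents of one scan that contain a field.

    For example, required_field="ports" selects hosts with open ports,
    which could feed an HTTP probing stage.
    """
    return {
        "size": page_size,
        "_source": ["ip", required_field],
        "query": {
            "bool": {
                "filter": [
                    {"term": {"scan": scan_id}},           # assumes a keyword mapping
                    {"exists": {"field": required_field}},
                ]
            }
        },
    }
```

Using `filter` clauses rather than scoring clauses fits this use, since the Scheduler only needs to know which documents match, not how well.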

Worker

The Worker component is the workhorse of Archer, responsible for executing the actual scanning tasks assigned by the Scheduler. Workers are configurable to specialize in specific modules (e.g., Masscan, Nuclei, httpx) or handle a range of modules. Workers:

  • Receive tasks from the Scheduler via Redis queues.
  • Parse task data to identify targets and modules.
  • Signal the Coordinator upon task start and completion.
  • Spawn child processes to execute the chosen scanning tools (Masscan, httpx, Nuclei).
  • Parse tool output and bulk-index results into Elasticsearch.
  • Report task completion status and result counts to the Coordinator.
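The bulk-indexing step can be illustrated with the Elasticsearch bulk API, which takes newline-delimited JSON in action and payload pairs. The Python sketch below is an assumption-laden illustration, not Archer's code: the SHA-1 document ID scheme, the Painless script, and the function names are all hypothetical, chosen so that repeated findings for the same IP and scan land in one document.

```python
import hashlib
import json

# Hypothetical Painless script: append one port entry to the document.
APPEND_PORT = (
    "if (ctx._source.ports == null) { ctx._source.ports = []; } "
    "ctx._source.ports.add(params.entry);"
)


def record_id(scan_id, ip):
    """Hypothetical ID scheme: one document per (scan, IP) pair."""
    return hashlib.sha1(f"{scan_id}:{ip}".encode()).hexdigest()


def bulk_body(scan_id, task_id, findings, index="archer"):
    """Build an NDJSON bulk body from (ip, port, timestamp) tuples."""
    lines = []
    for ip, port, timestamp in findings:
        action = {"update": {"_index": index, "_id": record_id(scan_id, ip)}}
        entry = {
            "port": port,
            "metadata": {"task": task_id, "module": "masscan", "timestamp": timestamp},
        }
        update = {
            "scripted_upsert": True,  # run the script even when the document is new
            "script": {"lang": "painless", "source": APPEND_PORT, "params": {"entry": entry}},
            "upsert": {"ip": ip, "scan": scan_id},
        }
        lines.append(json.dumps(action))
        lines.append(json.dumps(update))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"
```

Scripted updates with an upsert are one way to keep a single record per IP address per scan while several tasks and modules write to it.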

Understanding the Scan Lifecycle in Archer

The following outlines the step-by-step lifecycle of a scan within Archer, illustrating the interaction between its distributed components:

  1. Scan Creation: A new scan is initiated via the CLI or API.
  2. API Request Processing:
    • The API receives the scan request, parsing targets and defined scan stages.
    • A new scan entry is created in the PostgreSQL database.
    • A parent task is created for the first scan stage.
    • The Coordinator queries the Heartbeat service to determine the number of available workers.
    • Child tasks are created in PostgreSQL, corresponding to the available worker capacity.
    • New tasks are submitted to the Redis queue for worker assignment.
  3. Worker Task Execution:
    • Workers retrieve tasks from the Redis queue.
    • A Worker signals the Coordinator about task commencement.
    • A child process is spawned to execute the designated scanning tool.
    • The Worker parses the tool’s output and bulk-indexes results into Elasticsearch.
    • Upon completion, the Worker signals the Coordinator with task completion status and result counts.
  4. Coordinator Task Completion Handling:
    • The Coordinator receives task completion signals.
    • It locks the parent task row in PostgreSQL for data consistency.
    • Child task status and results are updated in the database.
    • The Coordinator checks if all child tasks associated with the parent task are finished.
    • The parent task row in PostgreSQL is unlocked.
  5. Stage Progression:
    • The Coordinator determines if all child tasks for the current stage are complete.
    • The scan’s current stage in PostgreSQL is updated.
    • The Coordinator checks for subsequent stages defined in the scan workflow.
    • Total results from the completed stage are retrieved from Elasticsearch.
    • Steps 3-5 are repeated, utilizing Elasticsearch results as input for the next stage, until all stages in the workflow are completed.
  6. Scan Completion:
    • The Coordinator determines that all defined scan stages are finished.
    • The scan’s current stage in PostgreSQL is set to NULL.
    • The scan status is updated to “complete” in PostgreSQL, signifying the end of the scan lifecycle.
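The stage progression in steps 4 to 6 amounts to a small state machine. The following Python sketch models it in memory under stated assumptions: the class and method names are hypothetical, and the PostgreSQL row locking from step 4 is left out because the sketch is single-threaded.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Scan:
    stages: list                                # ordered stage names
    stage_index: Optional[int] = 0              # None once the scan is complete
    status: str = "running"
    pending: set = field(default_factory=set)   # child task IDs still running

    def start_stage(self, child_task_ids):
        """Register the child tasks created for the current stage."""
        self.pending = set(child_task_ids)

    def complete_task(self, task_id):
        """Record a finished child task; return the next stage name, if any."""
        if self.status == "complete":
            return None
        self.pending.discard(task_id)
        if self.pending:
            return None                         # stage still in progress
        if self.stage_index + 1 < len(self.stages):
            self.stage_index += 1               # advance to the next stage
            return self.stages[self.stage_index]
        self.stage_index = None                 # mirrors setting the stage to NULL
        self.status = "complete"
        return None
```

A caller would start a stage with its child task IDs, call `complete_task` as each worker reports back, and schedule the returned stage name whenever one is returned.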

Conclusion: Embrace Distributed Scanning for Enhanced Cybersecurity

Distributed scanning tools like Archer represent a significant advancement in cybersecurity assessment capabilities. By distributing scanning workloads across workers, Archer brings speed, scalability, and efficiency to security assessments of vast networks and internet-scale environments. Its modular architecture, workflow system, and integration of established tools make it a strong asset for proactive vulnerability management, attack surface reduction, and a better overall cybersecurity posture. Explore the potential of Archer and other security tools at vcdstool.com to strengthen your organization’s defenses against evolving cyber threats.
