Module System
Modules are the building blocks of Pentora scans. Each module performs a specific function within the scan pipeline and can be composed via the DAG engine to create custom workflows.
What is a Module?
A module is a self-contained unit that:
- Reads inputs from DataContext
- Performs a specific operation (discovery, scanning, parsing, reporting)
- Writes outputs to DataContext
- Declares dependencies (what it needs to run)
- Provides metadata (name, description, configuration schema)
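The dependency declarations in the list above are what let an engine decide run order. A minimal sketch of that idea follows, assuming a hypothetical `Spec` type that holds only a name plus its required and provided keys; `Order` is an illustrative helper, not part of Pentora's API, and the `banner_grabber` dependency on `open_ports` is an assumption made for the example.

```go
package main

import "fmt"

// Spec is a hypothetical, trimmed-down view of a module: its name and
// the DataContext keys it requires and provides.
type Spec struct {
    Name     string
    Requires []string
    Provides []string
}

// Order returns module names so that every module runs after the modules
// that provide its required keys. Keys in seed count as already present.
func Order(specs []Spec, seed ...string) ([]string, error) {
    available := make(map[string]bool)
    for _, k := range seed {
        available[k] = true
    }
    done := make([]bool, len(specs))
    var order []string
    for len(order) < len(specs) {
        progressed := false
        for i, s := range specs {
            if done[i] || !allAvailable(s.Requires, available) {
                continue
            }
            done[i] = true
            progressed = true
            order = append(order, s.Name)
            for _, k := range s.Provides {
                available[k] = true
            }
        }
        if !progressed {
            // Nothing runnable is left: a key is missing or there is a cycle.
            return nil, fmt.Errorf("unsatisfied or cyclic dependencies after %v", order)
        }
    }
    return order, nil
}

// allAvailable reports whether every key is already in the available set.
func allAvailable(keys []string, available map[string]bool) bool {
    for _, k := range keys {
        if !available[k] {
            return false
        }
    }
    return true
}

func main() {
    specs := []Spec{
        {Name: "syn_scanner", Requires: []string{"discovered_hosts"}, Provides: []string{"open_ports"}},
        {Name: "banner_grabber", Requires: []string{"open_ports"}, Provides: []string{"banners"}},
        {Name: "icmp_discovery", Requires: []string{"targets"}, Provides: []string{"discovered_hosts"}},
    }
    order, err := Order(specs, "targets")
    fmt.Println(order, err)
}
```

A module whose required key is never provided makes `Order` return an error instead of looping forever, which is the behaviour a validation step would want.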
Module Types
Discovery Modules
Identify live hosts on the network.
Examples:
- icmp_discovery: ICMP echo (ping) probe
- arp_discovery: ARP requests for local network
- tcp_probe_discovery: TCP SYN to common ports
Inputs: targets (parsed target list)
Outputs: discovered_hosts (list of responsive hosts)
Scanner Modules
Probe hosts for open ports and capture banners.
Examples:
- syn_scanner: TCP SYN port scan
- connect_scanner: Full TCP connect scan
- udp_scanner: UDP port scan
- banner_grabber: Connect and read service banners
Inputs: discovered_hosts or targets
Outputs: open_ports, banners
Parser Modules
Extract structured data from raw scan output.
Examples:
- http_parser: Parse HTTP responses (headers, body, status)
- ssh_parser: Parse SSH banners (version, algorithms)
- smtp_parser: Parse SMTP banners and capabilities
Inputs: banners
Outputs: parsed_services
Fingerprint Modules
Identify services, applications, and operating systems.
Examples:
- fingerprint_coordinator: Orchestrate layered detection
- banner_fingerprinter: Match banners against signatures
- http_fingerprinter: Detect web servers and frameworks
- tls_fingerprinter: Identify TLS implementations
Inputs: banners, parsed_services
Outputs: service_fingerprints
Profiler Modules
Build comprehensive asset profiles.
Examples:
- asset_profiler: Fuse signals into device/OS/app profile
- os_classifier: Determine operating system
- device_classifier: Identify device type (server, IoT, etc.)
Inputs: service_fingerprints, open_ports
Outputs: asset_profiles
Evaluation Modules
Assess vulnerabilities and compliance.
Examples:
- cve_matcher: Match versions against CVE database
- misconfig_checker: Detect common misconfigurations
- weak_cipher_checker: Identify weak cryptography
- compliance_evaluator: Check CIS/PCI/NIST rules (Enterprise)
Inputs: asset_profiles, service_fingerprints
Outputs: vulnerabilities, compliance_violations
Reporter Modules
Generate output in various formats.
Examples:
- json_reporter: JSON/JSONL output
- csv_reporter: CSV export
- pdf_reporter: Executive PDF report (Enterprise)
- html_reporter: Interactive HTML dashboard
Inputs: All DataContext keys
Outputs: Files written to workspace or stdout
Module Interface
Go Module Interface
package module
import "context"
// Module interface that all modules must implement
type Module interface {
    // Metadata
    Name() string
    Description() string
    Version() string
    // Configuration
    ConfigSchema() Schema
    Configure(config Config) error
    // Dependencies
    Requires() []string  // DataContext keys needed
    Provides() []string  // DataContext keys produced
    // Execution
    Execute(ctx context.Context, data DataContext) error
    // Lifecycle
    Initialize() error
    Cleanup() error
}
Example Module Implementation
package discovery
import (
    "context"
    "fmt"
    "net"
    "time"
    "golang.org/x/net/icmp"
    "github.com/pentora/pentora/pkg/module"
)
type ICMPModule struct {
    timeout  time.Duration
    retry    int
    icmpConn net.PacketConn
}
func (m *ICMPModule) Name() string {
    return "icmp_discovery"
}
func (m *ICMPModule) Description() string {
    return "Discover live hosts using ICMP echo requests"
}
func (m *ICMPModule) Version() string {
    return "1.0.0"
}
func (m *ICMPModule) ConfigSchema() module.Schema {
    return module.Schema{
        "timeout": {Type: "duration", Default: "2s"},
        "retry":   {Type: "int", Default: 2},
    }
}
func (m *ICMPModule) Configure(config module.Config) error {
    m.timeout = config.GetDuration("timeout")
    m.retry = config.GetInt("retry")
    return nil
}
func (m *ICMPModule) Requires() []string {
    return []string{"targets"}
}
func (m *ICMPModule) Provides() []string {
    return []string{"discovered_hosts"}
}
func (m *ICMPModule) Initialize() error {
    // Open raw ICMP socket
    conn, err := icmp.ListenPacket("ip4:icmp", "0.0.0.0")
    if err != nil {
        return fmt.Errorf("failed to open ICMP socket: %w", err)
    }
    m.icmpConn = conn
    return nil
}
func (m *ICMPModule) Execute(ctx context.Context, data module.DataContext) error {
    // Read targets from context
    targets, err := data.GetTargets("targets")
    if err != nil {
        return err
    }
    var discovered []Host
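    for _, target := range targets {
        // Stop early if the scan was cancelled or timed out
        if err := ctx.Err(); err != nil {
            return err
        }
        // Send ICMP echo
        if m.ping(target) {
            discovered = append(discovered, Host{IP: target.IP})
        }
    }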
    // Write results to context
    return data.Set("discovered_hosts", discovered)
}
func (m *ICMPModule) Cleanup() error {
    if m.icmpConn != nil {
        return m.icmpConn.Close()
    }
    return nil
}
func (m *ICMPModule) ping(target Target) bool {
    // ICMP ping implementation
    // ...
    return true
}
Module Registration
Modules register themselves during package initialization:
package discovery
import "github.com/pentora/pentora/pkg/module"
func init() {
    module.Register("icmp_discovery", &ICMPModule{})
    module.Register("arp_discovery", &ARPModule{})
    module.Register("tcp_probe_discovery", &TCPProbeModule{})
}
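What a registry behind `module.Register` might look like can be sketched as follows. This is an assumption-laden illustration, not Pentora's implementation: the `Registry` type, its duplicate check, and the `stubModule` helper are hypothetical, and the `Module` interface is cut down to a single method.

```go
package main

import (
    "fmt"
    "sort"
    "sync"
)

// Module is a one-method stand-in for the full Module interface.
type Module interface {
    Name() string
}

// Registry maps module type names to module instances.
type Registry struct {
    mu      sync.RWMutex
    modules map[string]Module
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
    return &Registry{modules: make(map[string]Module)}
}

// Register stores m under name and rejects duplicate names, so two
// packages cannot silently overwrite each other during init().
func (r *Registry) Register(name string, m Module) error {
    r.mu.Lock()
    defer r.mu.Unlock()
    if _, exists := r.modules[name]; exists {
        return fmt.Errorf("module %q already registered", name)
    }
    r.modules[name] = m
    return nil
}

// Get looks up a module by its registered name.
func (r *Registry) Get(name string) (Module, bool) {
    r.mu.RLock()
    defer r.mu.RUnlock()
    m, ok := r.modules[name]
    return m, ok
}

// Names returns all registered names in sorted order.
func (r *Registry) Names() []string {
    r.mu.RLock()
    defer r.mu.RUnlock()
    names := make([]string, 0, len(r.modules))
    for name := range r.modules {
        names = append(names, name)
    }
    sort.Strings(names)
    return names
}

// stubModule is a placeholder module used only for this example.
type stubModule struct{ name string }

func (s stubModule) Name() string { return s.name }

func main() {
    r := NewRegistry()
    for _, name := range []string{"icmp_discovery", "arp_discovery"} {
        if err := r.Register(name, stubModule{name: name}); err != nil {
            fmt.Println("register:", err)
        }
    }
    fmt.Println(r.Names())
}
```

The lock matters less for `init()`-time registration, which Go runs sequentially, than for plugins that register while scans are already running.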
Embedded vs External Modules
Embedded Modules (Builtin)
Compiled into Pentora binary:
Advantages:
- Fast (no IPC overhead)
- No external dependencies
- Simpler deployment
- Always available
Disadvantages:
- Requires recompilation to update
- All modules loaded into memory
- Language limited to Go
Usage:
import _ "github.com/pentora/pentora/pkg/modules/discovery"
import _ "github.com/pentora/pentora/pkg/modules/scanner"
import _ "github.com/pentora/pentora/pkg/modules/fingerprint"
All builtin modules auto-register via init().
External Modules (Plugins)
Isolated processes or libraries:
Advantages:
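- Hot-reloadable without Pentora restart (out-of-process plugins; a Go .so cannot be unloaded once it is loaded)
- Isolated failures (a crash in a gRPC or WASM plugin doesn't kill Pentora; Go .so plugins run in-process)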
- Any language (via gRPC)
- Memory efficient (loaded on demand)
- Third-party distribution
Disadvantages:
- IPC overhead (~10-50ms per call)
- More complex deployment
- Requires plugin management
Types:
1. Go Plugins (.so shared objects)
// plugin-vuln/main.go
package main
import "github.com/pentora/pentora/pkg/module"
type CustomVulnChecker struct{}
func (m *CustomVulnChecker) Name() string {
    return "custom_vuln_checker"
}
// ... implement Module interface ...
var Plugin = &CustomVulnChecker{}
Build:
go build -buildmode=plugin -o vuln-checker.so plugin-vuln/main.go
Load:
pentora scan --plugin vuln-checker.so --targets 192.168.1.100
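On the host side, a `.so` built this way would typically be opened with Go's standard `plugin` package. The sketch below shows one way that could work; it is not Pentora's loader. `Named`, `moduleFromSymbol`, `loadModule`, and `fakeChecker` are hypothetical names. The detail worth noting is that `Lookup` returns a pointer to an exported variable, so `var Plugin = &CustomVulnChecker{}` arrives as a pointer to a pointer.

```go
package main

import (
    "fmt"
    "plugin"
    "reflect"
)

// Named is the slice of the Module interface this sketch checks for.
type Named interface {
    Name() string
}

// moduleFromSymbol unwraps a value returned by (*plugin.Plugin).Lookup.
// For an exported variable, Lookup yields a pointer to that variable, so
// one level of indirection is removed before the interface check.
func moduleFromSymbol(sym any) (Named, error) {
    v := reflect.ValueOf(sym)
    if v.Kind() != reflect.Pointer || v.IsNil() {
        return nil, fmt.Errorf("symbol is %T, want pointer to an exported variable", sym)
    }
    m, ok := v.Elem().Interface().(Named)
    if !ok {
        return nil, fmt.Errorf("%s does not implement Name()", v.Elem().Type())
    }
    return m, nil
}

// loadModule opens the shared object at path and resolves its Plugin symbol.
func loadModule(path string) (Named, error) {
    p, err := plugin.Open(path)
    if err != nil {
        return nil, fmt.Errorf("open plugin: %w", err)
    }
    sym, err := p.Lookup("Plugin")
    if err != nil {
        return nil, fmt.Errorf("lookup Plugin symbol: %w", err)
    }
    return moduleFromSymbol(sym)
}

// fakeChecker stands in for a type defined inside a plugin.
type fakeChecker struct{}

func (*fakeChecker) Name() string { return "custom_vuln_checker" }

func main() {
    // Simulate what Lookup returns for `var Plugin = &fakeChecker{}`.
    exported := &fakeChecker{}
    if m, err := moduleFromSymbol(&exported); err == nil {
        fmt.Println("resolved module:", m.Name())
    }
    // A missing file surfaces as an ordinary error.
    if _, err := loadModule("does-not-exist.so"); err != nil {
        fmt.Println("load failed:", err)
    }
}
```

Go plugins must be built with the same toolchain version and the same versions of shared dependencies as the host binary, which is one reason the gRPC route is often easier for third parties.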
2. gRPC Plugins (any language)
// module.proto
syntax = "proto3";
// Empty and Metadata message definitions are omitted here
service ModuleService {
    rpc Execute(ExecuteRequest) returns (ExecuteResponse);
    rpc GetMetadata(Empty) returns (Metadata);
}
message ExecuteRequest {
    map<string, bytes> context = 1;
    bytes config = 2;
}
message ExecuteResponse {
    map<string, bytes> context = 1;
    string error = 2;
}
Python example:
# custom_module.py
import json
import grpc
from pentora_pb2 import ExecuteResponse
from pentora_pb2_grpc import ModuleServiceServicer
class CustomModule(ModuleServiceServicer):
    def Execute(self, request, context):
        # Read inputs (map values are bytes; JSON encoding is assumed here)
        targets = json.loads(request.context.get('targets', b'[]'))
        # Custom logic
        results = self.scan(targets)
        # Write outputs (map values must be bytes)
        return ExecuteResponse(
            context={'custom_results': json.dumps(results).encode()}
        )
Register:
plugins:
  - name: custom_module
    type: grpc
    endpoint: localhost:50051
    timeout: 30s
3. WASM Plugins (experimental)
WebAssembly modules for sandboxed execution:
// custom_scanner.rs
use pentora_sdk::*;
#[no_mangle]
pub extern "C" fn execute(context: *mut Context) -> i32 {
    // Safety: the host is assumed to pass a valid, non-null pointer
    let context = unsafe { &mut *context };
    let targets = context.get("targets");
    let results = scan(targets);
    context.set("results", results);
    0
}
Compile to WASM:
# Recent Rust toolchains name this target wasm32-wasip1
cargo build --target wasm32-wasi --release
Load:
pentora scan --plugin custom_scanner.wasm --targets 192.168.1.100
Module Lifecycle
┌──────────────┐
│ Registration │  (init() or plugin load)
└──────┬───────┘
       │
┌──────▼───────┐
│ Initialize   │  (one-time setup: open sockets, load data)
└──────┬───────┘
       │
┌──────▼───────┐
│ Configure    │  (apply runtime config)
└──────┬───────┘
       │
       ├─────────────┐
       │             │
┌──────▼───────┐    │
│ Execute      │◄───┘ (called per scan, potentially many times)
└──────┬───────┘
       │
┌──────▼───────┐
│ Cleanup      │  (release resources)
└──────────────┘
Registration Phase
Module announces itself to registry:
module.Register("my_module", &MyModule{})
Initialize Phase
One-time setup before any scans:
func (m *MyModule) Initialize() error {
    // Open persistent connections
    m.db = openDatabase()
    // Load data files
    m.signatures = loadSignatures()
    return nil
}
Called once when Pentora starts or plugin loads.
Configure Phase
Apply scan-specific configuration:
func (m *MyModule) Configure(config module.Config) error {
    m.timeout = config.GetDuration("timeout")
    m.concurrency = config.GetInt("concurrency")
    return nil
}
Called before each scan with DAG node config.
Execute Phase
Perform module operation:
func (m *MyModule) Execute(ctx context.Context, data module.DataContext) error {
    // Read inputs
    targets, err := data.GetTargets("targets")
    if err != nil {
        return fmt.Errorf("failed to read targets: %w", err)
    }
    // Perform work
    results := m.scan(targets)
    // Write outputs
    return data.Set("results", results)
}
Called once per scan (or multiple times for parallel instances).
Cleanup Phase
Release resources:
func (m *MyModule) Cleanup() error {
    if m.db != nil {
        m.db.Close()
    }
    return nil
}
Called when Pentora exits or plugin unloads.
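The phase ordering described above can be sketched as a small driver. This is a simplified illustration under stated assumptions: `Lifecycle` is a hypothetical subset of the Module interface, and `Config` and `DataContext` are reduced to plain maps so the example stays self-contained.

```go
package main

import (
    "errors"
    "fmt"
)

// Lifecycle is a hypothetical subset of the Module interface.
type Lifecycle interface {
    Initialize() error
    Configure(config map[string]any) error
    Execute(data map[string]any) error
    Cleanup() error
}

// RunScans initializes m once, then configures and executes it once per
// scan config, and calls Cleanup at the end even if a scan fails.
func RunScans(m Lifecycle, scans []map[string]any) (err error) {
    if err := m.Initialize(); err != nil {
        return fmt.Errorf("initialize: %w", err)
    }
    defer func() {
        // Join so that a cleanup failure never hides an earlier scan error.
        if cerr := m.Cleanup(); cerr != nil {
            err = errors.Join(err, fmt.Errorf("cleanup: %w", cerr))
        }
    }()
    for i, cfg := range scans {
        if err := m.Configure(cfg); err != nil {
            return fmt.Errorf("scan %d configure: %w", i, err)
        }
        if err := m.Execute(map[string]any{}); err != nil {
            return fmt.Errorf("scan %d execute: %w", i, err)
        }
    }
    return nil
}

// recorder is a test double that logs which phases ran.
type recorder struct {
    calls       []string
    failExecute bool
}

func (r *recorder) Initialize() error { r.calls = append(r.calls, "initialize"); return nil }

func (r *recorder) Configure(map[string]any) error {
    r.calls = append(r.calls, "configure")
    return nil
}

func (r *recorder) Execute(map[string]any) error {
    r.calls = append(r.calls, "execute")
    if r.failExecute {
        return errors.New("scan failed")
    }
    return nil
}

func (r *recorder) Cleanup() error { r.calls = append(r.calls, "cleanup"); return nil }

func main() {
    r := &recorder{}
    err := RunScans(r, []map[string]any{{"timeout": "1s"}, {"timeout": "5s"}})
    fmt.Println(r.calls, err)
}
```

Running Cleanup from a deferred call is a design choice for the sketch: it keeps resource release on every exit path without repeating the call before each return.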
Module Configuration
Schema Definition
Modules declare configuration schema:
func (m *ScannerModule) ConfigSchema() module.Schema {
    return module.Schema{
        "ports": {
            Type:        "string",
            Description: "Port list (e.g., '80,443' or '1-1000')",
            Default:     "1-1000",
            Required:    false,
        },
        "rate": {
            Type:        "int",
            Description: "Packets per second",
            Default:     1000,
            Min:         1,
            Max:         100000,
        },
        "timeout": {
            Type:        "duration",
            Description: "Connection timeout",
            Default:     "3s",
        },
        "protocol": {
            Type:        "string",
            Description: "Protocol to scan",
            Enum:        []string{"tcp", "udp"},
            Default:     "tcp",
        },
    }
}
Runtime Configuration
Provided in DAG node definition:
nodes:
  - instance_id: port_scan
    module_type: syn_scanner
    config:
      ports: "1-10000"
      rate: 5000
      timeout: 5s
      protocol: tcp
Configuration Validation
Pentora validates config against schema before execution:
pentora dag validate my-scan.yaml
Checks:
- Required fields present
- Types correct (int, string, duration, bool)
- Values within allowed ranges
- Enum values valid
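The four checks listed above can be sketched in a few lines. This is an illustration only, assuming a hypothetical `Field` type modelled on the schema example and config values that arrive as decoded Go values (`int`, `string`, `bool`); Pentora's real validator may differ.

```go
package main

import (
    "fmt"
    "sort"
    "time"
)

// Field mirrors the schema entries shown earlier; the exact Go types are assumptions.
type Field struct {
    Type     string   // "int", "string", "duration", or "bool"
    Required bool     // the key must be present
    Min, Max int      // checked for "int" only when Max > Min
    Enum     []string // checked for "string" only when non-empty
}

// Validate returns one error per failed check, ordered by field name.
func Validate(schema map[string]Field, config map[string]any) []error {
    keys := make([]string, 0, len(schema))
    for k := range schema {
        keys = append(keys, k)
    }
    sort.Strings(keys)
    var errs []error
    for _, key := range keys {
        f := schema[key]
        v, present := config[key]
        if !present {
            if f.Required {
                errs = append(errs, fmt.Errorf("%s: required field missing", key))
            }
            continue
        }
        if err := checkValue(f, v); err != nil {
            errs = append(errs, fmt.Errorf("%s: %w", key, err))
        }
    }
    return errs
}

// checkValue applies the type, range, and enum checks for a single field.
func checkValue(f Field, v any) error {
    switch f.Type {
    case "int":
        n, ok := v.(int)
        if !ok {
            return fmt.Errorf("want int, got %T", v)
        }
        if f.Max > f.Min && (n < f.Min || n > f.Max) {
            return fmt.Errorf("%d outside [%d, %d]", n, f.Min, f.Max)
        }
    case "string":
        s, ok := v.(string)
        if !ok {
            return fmt.Errorf("want string, got %T", v)
        }
        if len(f.Enum) > 0 && !contains(f.Enum, s) {
            return fmt.Errorf("%q not in %v", s, f.Enum)
        }
    case "duration":
        s, ok := v.(string)
        if !ok {
            return fmt.Errorf("want duration string, got %T", v)
        }
        if _, err := time.ParseDuration(s); err != nil {
            return err
        }
    case "bool":
        if _, ok := v.(bool); !ok {
            return fmt.Errorf("want bool, got %T", v)
        }
    default:
        return fmt.Errorf("unknown schema type %q", f.Type)
    }
    return nil
}

// contains reports whether s is one of the allowed values.
func contains(list []string, s string) bool {
    for _, item := range list {
        if item == s {
            return true
        }
    }
    return false
}

func main() {
    schema := map[string]Field{
        "rate":     {Type: "int", Min: 1, Max: 100000},
        "protocol": {Type: "string", Enum: []string{"tcp", "udp"}},
        "timeout":  {Type: "duration"},
    }
    config := map[string]any{"rate": 0, "protocol": "icmp", "timeout": "5s"}
    for _, err := range Validate(schema, config) {
        fmt.Println(err)
    }
}
```

Collecting every error, rather than stopping at the first, lets a command such as `pentora dag validate` report all problems in one pass.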
Module Communication
DataContext Keys
Modules communicate via shared keys:
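// Producer
err := data.Set("open_ports", []Port{
    {Host: "192.168.1.100", Port: 22},
    {Host: "192.168.1.100", Port: 80},
})
if err != nil {
    return fmt.Errorf("failed to write open_ports: %w", err)
}
// Consumer
value, err := data.Get("open_ports")
if err != nil {
    return fmt.Errorf("missing required input: %w", err)
}
// Check the type instead of letting a bad assertion panic
ports, ok := value.([]Port)
if !ok {
    return fmt.Errorf("open_ports has unexpected type %T", value)
}
for _, port := range ports {
    // Process port
}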
Type Safety
Use typed getters for safety:
// module/context.go
type DataContext interface {
    Get(key string) (interface{}, error)
    Set(key string, value interface{}) error
    // Typed accessors
    GetTargets(key string) ([]Target, error)
    GetHosts(key string) ([]Host, error)
    GetPorts(key string) ([]Port, error)
    GetBanners(key string) ([]Banner, error)
    GetFingerprints(key string) ([]Fingerprint, error)
}
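One way to back typed accessors like these is a single generic helper over an untyped store. The sketch below assumes a hypothetical in-memory `MemoryContext`; the name, the `ErrKeyNotFound` sentinel, and the `typed` helper are illustrative and not part of Pentora's API.

```go
package main

import (
    "errors"
    "fmt"
    "sync"
)

// Port matches the shape used in the examples above.
type Port struct {
    Host string
    Port int
}

// ErrKeyNotFound is returned when a key has never been written.
var ErrKeyNotFound = errors.New("key not found")

// MemoryContext is a hypothetical in-memory key/value store.
type MemoryContext struct {
    mu     sync.RWMutex
    values map[string]any
}

// NewMemoryContext returns an empty store.
func NewMemoryContext() *MemoryContext {
    return &MemoryContext{values: make(map[string]any)}
}

// Set stores value under key, replacing any previous value.
func (c *MemoryContext) Set(key string, value any) error {
    c.mu.Lock()
    defer c.mu.Unlock()
    c.values[key] = value
    return nil
}

// Get returns the untyped value stored under key.
func (c *MemoryContext) Get(key string) (any, error) {
    c.mu.RLock()
    defer c.mu.RUnlock()
    v, ok := c.values[key]
    if !ok {
        return nil, fmt.Errorf("%q: %w", key, ErrKeyNotFound)
    }
    return v, nil
}

// typed fetches key and asserts the stored value to T, returning an
// error instead of panicking when the stored type differs.
func typed[T any](c *MemoryContext, key string) (T, error) {
    var zero T
    v, err := c.Get(key)
    if err != nil {
        return zero, err
    }
    t, ok := v.(T)
    if !ok {
        return zero, fmt.Errorf("key %q holds %T, want %T", key, v, zero)
    }
    return t, nil
}

// GetPorts is a typed accessor built on the generic helper.
func (c *MemoryContext) GetPorts(key string) ([]Port, error) {
    return typed[[]Port](c, key)
}

func main() {
    data := NewMemoryContext()
    _ = data.Set("open_ports", []Port{{Host: "192.168.1.100", Port: 22}})
    _ = data.Set("banners", "not a port list")

    ports, err := data.GetPorts("open_ports")
    fmt.Println(len(ports), err)

    _, err = data.GetPorts("banners")
    fmt.Println(err)
}
```

The benefit over a bare type assertion at each call site is that a mismatched key produces an error naming the key and both types.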
Namespace Conventions
Avoid key collisions:
<module_type>.<instance_id>.<output>
Example:
icmp_discovery.main.discovered_hosts
syn_scanner.port_scan_1.open_ports
banner_grabber.banner_1.banners
For standard keys, use simple names:
targets
discovered_hosts
open_ports
banners
service_fingerprints
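A small helper can keep namespaced keys consistent with the convention above. This is a sketch; `NamespacedKey` and `SplitKey` are hypothetical names, and rejecting dots inside a segment is an assumption made so that keys can be split unambiguously.

```go
package main

import (
    "fmt"
    "strings"
)

// NamespacedKey builds "<module_type>.<instance_id>.<output>" and rejects
// empty segments or segments containing a dot.
func NamespacedKey(moduleType, instanceID, output string) (string, error) {
    for _, part := range []string{moduleType, instanceID, output} {
        if part == "" || strings.Contains(part, ".") {
            return "", fmt.Errorf("invalid key segment %q", part)
        }
    }
    return moduleType + "." + instanceID + "." + output, nil
}

// SplitKey reverses NamespacedKey. Standard keys such as "open_ports"
// come back as the output with namespaced set to false.
func SplitKey(key string) (moduleType, instanceID, output string, namespaced bool) {
    parts := strings.Split(key, ".")
    if len(parts) != 3 {
        return "", "", key, false
    }
    return parts[0], parts[1], parts[2], true
}

func main() {
    key, err := NamespacedKey("syn_scanner", "port_scan_1", "open_ports")
    fmt.Println(key, err)

    _, _, output, namespaced := SplitKey("discovered_hosts")
    fmt.Println(output, namespaced)
}
```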
Error Handling
Return Errors
Modules should return descriptive errors:
func (m *Module) Execute(ctx context.Context, data DataContext) error {
    targets, err := data.GetTargets("targets")
    if err != nil {
        return fmt.Errorf("failed to read targets: %w", err)
    }
    results, err := m.scan(targets)
    if err != nil {
        return fmt.Errorf("scan failed: %w", err)
    }
    if err := data.Set("results", results); err != nil {
        return fmt.Errorf("failed to write results: %w", err)
    }
    return nil
}
Partial Results
Write partial results before returning error:
func (m *Module) Execute(ctx context.Context, data DataContext) error {
    targets, err := data.GetTargets("targets")
    if err != nil {
        return fmt.Errorf("failed to read targets: %w", err)
    }
    var results []Result
    for _, target := range targets {
        result, err := m.scan(target)
        if err != nil {
            // Log error but continue
            log.Warn().Err(err).Str("target", target.IP).Msg("scan failed")
            continue
        }
        results = append(results, result)
    }
    // Write partial results
    if err := data.Set("results", results); err != nil {
        return fmt.Errorf("failed to write results: %w", err)
    }
    if len(results) == 0 {
        return errors.New("all scans failed")
    }
    return nil
}
Context Cancellation
Respect context cancellation:
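func (m *Module) Execute(ctx context.Context, data DataContext) error {
    targets, err := data.GetTargets("targets")
    if err != nil {
        return fmt.Errorf("failed to read targets: %w", err)
    }
    var results []Result
    for _, target := range targets {
        select {
        case <-ctx.Done():
            return ctx.Err()  // Cancelled or timeout
        default:
            result := m.scan(target)
            results = append(results, result)
        }
    }
    return data.Set("results", results)
}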
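Cancellation and partial results can be combined: a loop can stop as soon as the context is done and still hand back what it has collected. The runnable sketch below uses plain strings in place of Pentora's target and result types; `scanUntilCancelled` and `demo` are hypothetical names, and the cancellation is triggered from inside the scan callback only to keep the example deterministic.

```go
package main

import (
    "context"
    "fmt"
)

// scanUntilCancelled scans targets in order and stops as soon as ctx is
// done, returning whatever was collected together with ctx.Err().
func scanUntilCancelled(ctx context.Context, targets []string, scan func(string) string) ([]string, error) {
    var results []string
    for _, t := range targets {
        if err := ctx.Err(); err != nil {
            return results, err
        }
        results = append(results, scan(t))
    }
    return results, nil
}

// demo cancels the context while the second target is being scanned,
// standing in for a user abort or an orchestrator timeout.
func demo() ([]string, error) {
    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()
    scan := func(t string) string {
        if t == "10.0.0.2" {
            cancel()
        }
        return t + ": up"
    }
    return scanUntilCancelled(ctx, []string{"10.0.0.1", "10.0.0.2", "10.0.0.3"}, scan)
}

func main() {
    results, err := demo()
    fmt.Println(results, err)
}
```

Here the third target is never scanned, and the caller receives both the two completed results and the cancellation error, so it can decide whether to persist the partial output.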
Module Distribution
Builtin Modules
Shipped with Pentora:
- Discovery: ICMP, ARP, TCP probe
- Scanner: SYN, Connect, UDP, banner grab
- Parser: HTTP, SSH, SMTP, FTP, TLS
- Fingerprint: Banner matching, HTTP detection
- Profiler: Asset classification
- Reporter: JSON, CSV, JSONL
Always available, no installation required.
Official Plugin Repository
Pentora-maintained plugins:
# List available plugins
pentora plugin list
# Install plugin
pentora plugin install vuln/nmap-nse-wrapper
# Update plugin
pentora plugin update vuln/nmap-nse-wrapper
# Remove plugin
pentora plugin remove vuln/nmap-nse-wrapper
Plugins installed to ~/.local/share/pentora/plugins/.
Third-Party Plugins
Community-developed modules:
# Install from URL
pentora plugin install https://github.com/user/custom-scanner/releases/latest/download/plugin.so
# Install from file
pentora plugin install /path/to/plugin.so
Security: Enterprise deployments can require signature verification for third-party plugins:
plugins:
  require_signature: true
  trusted_publishers:
    - fingerprint: A1B2C3D4E5F6...
      name: TrustedVendor
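The document does not specify how plugin signatures are verified. Purely as an illustration of the trusted-publisher idea, the sketch below assumes Ed25519 signatures and assumes a fingerprint is the upper-case hex SHA-256 of the publisher's public key. Both are assumptions, as are the names `Fingerprint`, `VerifyPlugin`, and `example`.

```go
package main

import (
    "crypto/ed25519"
    "crypto/sha256"
    "encoding/hex"
    "errors"
    "fmt"
    "strings"
)

// Fingerprint returns the upper-case hex SHA-256 of a public key.
// The fingerprint format is an assumption made for this sketch.
func Fingerprint(pub ed25519.PublicKey) string {
    sum := sha256.Sum256(pub)
    return strings.ToUpper(hex.EncodeToString(sum[:]))
}

// VerifyPlugin checks that pub belongs to a trusted publisher and that
// sig is a valid signature over the plugin bytes. It returns the
// publisher name. trusted maps fingerprint to publisher name.
func VerifyPlugin(artifact, sig []byte, pub ed25519.PublicKey, trusted map[string]string) (string, error) {
    // ed25519.Verify panics on a wrong-sized key, so check the length first.
    if len(pub) != ed25519.PublicKeySize {
        return "", errors.New("malformed public key")
    }
    name, ok := trusted[Fingerprint(pub)]
    if !ok {
        return "", errors.New("publisher is not in trusted_publishers")
    }
    if !ed25519.Verify(pub, artifact, sig) {
        return "", fmt.Errorf("signature from %s does not match the plugin", name)
    }
    return name, nil
}

// example builds a signed artifact from a fixed all-zero seed. The seed
// is for demonstration only and must never be used for real keys.
func example() (artifact, sig []byte, pub ed25519.PublicKey, trusted map[string]string) {
    priv := ed25519.NewKeyFromSeed(make([]byte, ed25519.SeedSize))
    pub = priv.Public().(ed25519.PublicKey)
    artifact = []byte("plugin bytes")
    sig = ed25519.Sign(priv, artifact)
    trusted = map[string]string{Fingerprint(pub): "TrustedVendor"}
    return artifact, sig, pub, trusted
}

func main() {
    artifact, sig, pub, trusted := example()
    name, err := VerifyPlugin(artifact, sig, pub, trusted)
    fmt.Println(name, err)
}
```

Checking the publisher before the signature means an untrusted key is rejected without doing any cryptographic work on the artifact.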
Enterprise Plugin Marketplace
Browse and install plugins via UI (Enterprise):
- Navigate to Plugins → Marketplace
- Search/filter by category
- Click Install
- Configure plugin settings
- Enable in scan profiles
Licensing enforced per plugin.
Best Practices
1. Minimize State
Keep modules stateless where possible:
// Bad: Shared state
type Module struct {
    results []Result  // Shared across scans
}
// Good: State in DataContext
func (m *Module) Execute(ctx context.Context, data DataContext) error {
    var results []Result
    // ...
    data.Set("results", results)
    return nil
}
2. Validate Inputs
Check DataContext inputs:
func (m *Module) Execute(ctx context.Context, data DataContext) error {
    targets, err := data.GetTargets("targets")
    if err != nil {
        return fmt.Errorf("missing targets: %w", err)
    }
    if len(targets) == 0 {
        return errors.New("no targets provided")
    }
    // ... proceed with scan
    return nil
}
3. Structured Logging
Use Zerolog with context:
import (
    "time"
    "github.com/rs/zerolog/log"
)
func (m *Module) Execute(ctx context.Context, data DataContext) error {
    start := time.Now()
    logger := log.With().
        Str("module", m.Name()).
        Str("instance", m.instanceID).
        Logger()
    logger.Info().Msg("execution started")
    // ... perform work, collecting results
    var results []Result
    logger.Info().
        Int("results", len(results)).
        Dur("duration", time.Since(start)).
        Msg("execution completed")
    return nil
}
4. Respect Timeouts
Honor context deadlines:
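func (m *Module) scan(ctx context.Context, target Target) (Result, error) {
    // conn is an already-open net.Conn for this target
    // Propagate the context deadline to the connection's I/O calls
    if deadline, ok := ctx.Deadline(); ok {
        if err := conn.SetDeadline(deadline); err != nil {
            return Result{}, fmt.Errorf("failed to set deadline: %w", err)
        }
    }
    // ... perform scan
    return Result{}, nil
}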
5. Handle Concurrency
If module spawns goroutines:
func (m *Module) Execute(ctx context.Context, data DataContext) error {
    targets, err := data.GetTargets("targets")
    if err != nil {
        return fmt.Errorf("failed to read targets: %w", err)
    }
    var wg sync.WaitGroup
    resultsChan := make(chan Result)
    for _, target := range targets {
        wg.Add(1)
        go func(t Target) {
            defer wg.Done()
            result, err := m.scan(ctx, t)
            if err != nil {
                return  // Skip failed targets
            }
            resultsChan <- result
        }(target)
    }
    // Close the channel once every worker has finished
    go func() {
        wg.Wait()
        close(resultsChan)
    }()
    var results []Result
    for result := range resultsChan {
        results = append(results, result)
    }
    return data.Set("results", results)
}
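Starting one goroutine per target can overwhelm the scanner or the network on large target lists. A bounded variant is sketched below, using a buffered channel as a semaphore; this is a common Go pattern rather than anything the document prescribes. Strings stand in for targets and results, and `scanConcurrently` and `demoPool` are hypothetical names.

```go
package main

import (
    "context"
    "errors"
    "fmt"
    "sort"
    "sync"
)

// scanConcurrently runs scan for every target with at most limit scans
// in flight and stops launching new ones once ctx is done.
func scanConcurrently(ctx context.Context, targets []string, limit int,
    scan func(context.Context, string) (string, error)) ([]string, error) {
    if limit < 1 {
        limit = 1
    }
    var (
        wg      sync.WaitGroup
        mu      sync.Mutex
        results []string
    )
    sem := make(chan struct{}, limit) // one slot per allowed in-flight scan
    for _, t := range targets {
        select {
        case <-ctx.Done():
            wg.Wait() // let in-flight scans finish before returning
            return results, ctx.Err()
        case sem <- struct{}{}:
        }
        wg.Add(1)
        go func(t string) {
            defer wg.Done()
            defer func() { <-sem }() // free the slot
            r, err := scan(ctx, t)
            if err != nil {
                return // a real module would log the failure here
            }
            mu.Lock()
            results = append(results, r)
            mu.Unlock()
        }(t)
    }
    wg.Wait()
    return results, nil
}

// demoPool scans five fake hosts, two at a time; host-3 always fails.
func demoPool() []string {
    scan := func(_ context.Context, t string) (string, error) {
        if t == "host-3" {
            return "", errors.New("unreachable")
        }
        return t + ": open", nil
    }
    targets := []string{"host-1", "host-2", "host-3", "host-4", "host-5"}
    results, _ := scanConcurrently(context.Background(), targets, 2, scan)
    sort.Strings(results) // completion order is not deterministic
    return results
}

func main() {
    fmt.Println(demoPool())
}
```

Results arrive in completion order, so anything that depends on ordering should sort them or key them by target, as `demoPool` does.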
Testing Modules
Unit Tests
Test module in isolation:
func TestICMPModule_Execute(t *testing.T) {
    // Setup (opening a raw ICMP socket usually requires elevated privileges)
    // The variable is named m so it does not shadow the module package
    m := &ICMPModule{}
    require.NoError(t, m.Initialize())
    defer m.Cleanup()
    require.NoError(t, m.Configure(module.Config{
        "timeout": "1s",
        "retry":   1,
    }))
    // Create test context
    ctx := context.Background()
    data := module.NewTestDataContext()
    require.NoError(t, data.Set("targets", []Target{
        {IP: "127.0.0.1"},
    }))
    // Execute
    err := m.Execute(ctx, data)
    require.NoError(t, err)
    // Verify
    hosts, err := data.GetHosts("discovered_hosts")
    require.NoError(t, err)
    assert.Len(t, hosts, 1)
    assert.Equal(t, "127.0.0.1", hosts[0].IP)
}
Integration Tests
Test module in DAG:
func TestScanPipeline(t *testing.T) {
    dag := `
nodes:
  - instance_id: targets
    module_type: target_parser
  - instance_id: discover
    module_type: icmp_discovery
    depends_on: [targets]
  - instance_id: scan
    module_type: syn_scanner
    depends_on: [discover]
`
    orchestrator := engine.NewOrchestrator()
    orchestrator.LoadDAG([]byte(dag))
    result, err := orchestrator.Execute(context.Background())
    require.NoError(t, err)
    assert.Equal(t, "completed", result.Status)
}
Next Steps
- DAG Engine - How modules are orchestrated
- Custom Modules - Writing your own modules
- External Plugins - gRPC and WASM plugins
- Module API Reference - Full API documentation