Workspace

The workspace is Pentora's shared storage location for scan data, configuration, cache, and operational state. It provides persistent storage for scan results and enables coordination between CLI and server modes.

What is a Workspace?

A workspace is a directory structure that stores:

Scan results: Complete scan output with metadata
Queue: Scheduled scans waiting for execution (server mode)
Cache: Fingerprint databases, DNS lookups, temporary data
Reports: Generated reports and exports
Audit logs: User actions and system events (Enterprise)
Configuration: Workspace-specific settings

Default Locations

Pentora respects OS-specific conventions for application data:

Linux

Follows XDG Base Directory Specification:

$XDG_DATA_HOME/pentora      (typically ~/.local/share/pentora)

macOS

Uses Application Support directory:

~/Library/Application Support/Pentora

Windows

Uses AppData directory:

%AppData%\Pentora           (e.g., C:\Users\username\AppData\Roaming\Pentora)
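
The lookup rules above can be sketched in a few lines of Python. This is an illustration of the conventions only, not Pentora's actual resolution code; the function name `default_workspace_dir` and its parameters are hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath


def default_workspace_dir(platform, env, home):
    """Return the conventional workspace path for a platform (illustrative)."""
    if platform.startswith("linux"):
        # XDG Base Directory spec: fall back to ~/.local/share when
        # XDG_DATA_HOME is unset or empty.
        base = env.get("XDG_DATA_HOME") or str(PurePosixPath(home) / ".local" / "share")
        return str(PurePosixPath(base) / "pentora")
    if platform == "darwin":
        return str(PurePosixPath(home) / "Library" / "Application Support" / "Pentora")
    if platform == "win32":
        # %AppData% normally points at C:\Users\<name>\AppData\Roaming.
        appdata = env.get("APPDATA") or str(PureWindowsPath(home) / "AppData" / "Roaming")
        return str(PureWindowsPath(appdata) / "Pentora")
    raise ValueError(f"unsupported platform: {platform}")
```

Calling it with `sys.platform`, `os.environ`, and `os.path.expanduser("~")` gives the path for the current machine.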

Directory Structure

pentora/                      # Workspace root
├── scans/                    # Scan results (organized by ID)
│   ├── 20231006-143022-a1b2c3/
│   │   ├── request.json      # Scan parameters
│   │   ├── status.json       # Execution state
│   │   ├── results.jsonl     # Main results
│   │   └── artifacts/        # Additional data
│   │       ├── banners/      # Banner captures
│   │       ├── screenshots/  # Web screenshots
│   │       └── pcaps/        # Packet captures
│   └── 20231006-150000-d4e5f6/
│       └── ...
├── queue/                    # Scheduled scans (server mode)
│   ├── pending/              # Queued jobs
│   ├── running/              # Active jobs
│   └── failed/               # Failed job metadata
├── cache/                    # Cached data
│   ├── fingerprints/         # Fingerprint catalogs
│   │   ├── builtin.yaml
│   │   ├── remote-sync.yaml
│   │   └── custom.yaml
│   ├── dns/                  # DNS resolution cache
│   └── geo/                  # GeoIP database
├── reports/                  # Generated reports
│   ├── executive-2023-10.pdf
│   ├── vuln-summary.csv
│   └── compliance-pci.json
├── audit/                    # Audit logs (Enterprise)
│   ├── 2023-10-01.jsonl
│   └── 2023-10-02.jsonl
└── config/                   # Workspace-specific config
    ├── pentora.yaml          # Workspace configuration
    ├── license.key           # License JWT (Enterprise)
    └── profiles/             # Custom scan profiles
        ├── webapp.yaml
        └── infrastructure.yaml

Scan Directory Structure

Each scan creates a unique directory identified by timestamp and ID:

scans/20231006-143022-a1b2c3/
├── request.json              # Original scan request
├── status.json               # Execution metadata
├── results.jsonl             # Line-delimited JSON results
├── artifacts/
│   ├── banners/              # Raw protocol banners
│   │   ├── 192.168.1.100-22.txt
│   │   └── 192.168.1.100-80.txt
│   ├── screenshots/          # Web screenshots (if enabled)
│   │   └── 192.168.1.100-80.png
│   └── pcaps/                # Network captures (if enabled)
│       └── 192.168.1.100.pcap
└── reports/
    ├── summary.json          # Scan summary
    ├── vulnerabilities.csv   # CVE list
    └── executive.pdf         # Executive report (Enterprise)
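
Judging by the example, a scan ID appears to be `YYYYMMDD-HHMMSS-<suffix>`, with the time matching the UTC timestamp in request.json. A minimal sketch for splitting such an ID, assuming that layout holds (the helper `parse_scan_id` is hypothetical, and the UTC interpretation is an assumption):

```python
from datetime import datetime, timezone
from typing import NamedTuple


class ScanId(NamedTuple):
    started: datetime  # start time encoded in the ID (assumed UTC)
    suffix: str        # trailing identifier, e.g. "a1b2c3"


def parse_scan_id(scan_id):
    """Split 'YYYYMMDD-HHMMSS-suffix' into a timestamp and a suffix."""
    date_part, time_part, suffix = scan_id.split("-", 2)
    started = datetime.strptime(date_part + time_part, "%Y%m%d%H%M%S")
    return ScanId(started.replace(tzinfo=timezone.utc), suffix)
```

Because the timestamp leads the ID, sorting scan directory names alphabetically also sorts them chronologically.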

request.json

Original scan parameters:

{
  "scan_id": "20231006-143022-a1b2c3",
  "timestamp": "2023-10-06T14:30:22Z",
  "user": "operator",
  "targets": ["192.168.1.0/24"],
  "profile": "standard",
  "options": {
    "discover": true,
    "vuln": true,
    "rate": 1000,
    "ports": "1-1000"
  },
  "notifications": ["slack://security-alerts"]
}

status.json

Execution state and timing:

{
  "scan_id": "20231006-143022-a1b2c3",
  "state": "completed",
  "started_at": "2023-10-06T14:30:22Z",
  "completed_at": "2023-10-06T14:35:45Z",
  "duration_seconds": 323,
  "stages": {
    "target_ingestion": {"status": "completed", "duration": 0.5},
    "discovery": {"status": "completed", "duration": 12.3, "hosts_found": 15},
    "port_scan": {"status": "completed", "duration": 45.2, "ports_found": 73},
    "fingerprint": {"status": "completed", "duration": 89.5},
    "profile": {"status": "completed", "duration": 2.1},
    "vulnerability": {"status": "completed", "duration": 156.3, "vulns_found": 12},
    "reporting": {"status": "completed", "duration": 17.1}
  },
  "errors": [],
  "warnings": [
    "Rate limit exceeded, throttling to 500 pps"
  ]
}
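
As a quick sanity check on a status.json file, the per-stage durations can be summed and compared with `duration_seconds`. The sketch below uses only the fields shown in the example; the helper name `stage_summary` is hypothetical.

```python
import json


def stage_summary(status):
    """Return (sum of stage durations, name of the slowest stage)."""
    stages = status["stages"]
    total = sum(stage["duration"] for stage in stages.values())
    slowest = max(stages, key=lambda name: stages[name]["duration"])
    return total, slowest


# Trimmed copy of the status.json example above.
status = json.loads("""
{
  "duration_seconds": 323,
  "stages": {
    "target_ingestion": {"status": "completed", "duration": 0.5},
    "discovery": {"status": "completed", "duration": 12.3},
    "port_scan": {"status": "completed", "duration": 45.2},
    "fingerprint": {"status": "completed", "duration": 89.5},
    "profile": {"status": "completed", "duration": 2.1},
    "vulnerability": {"status": "completed", "duration": 156.3},
    "reporting": {"status": "completed", "duration": 17.1}
  }
}
""")

total, slowest = stage_summary(status)
print(f"stages: {total:.1f}s of {status['duration_seconds']}s, slowest: {slowest}")
```

For the example data the stage durations add up to the reported 323 seconds, and the vulnerability stage accounts for the largest share.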

results.jsonl

Main scan results in line-delimited JSON format (one object per line):

{"timestamp":"2023-10-06T14:31:00Z","host":"192.168.1.100","state":"up","latency":1.2}
{"timestamp":"2023-10-06T14:31:15Z","host":"192.168.1.100","port":22,"protocol":"tcp","state":"open","service":{"name":"ssh","product":"OpenSSH","version":"8.2p1"}}
{"timestamp":"2023-10-06T14:31:16Z","host":"192.168.1.100","port":80,"protocol":"tcp","state":"open","service":{"name":"http","product":"nginx","version":"1.18.0"}}
{"timestamp":"2023-10-06T14:32:30Z","host":"192.168.1.100","port":80,"vulnerability":{"id":"CVE-2021-23017","severity":"critical","cvss":9.8}}

Why JSONL?

Streamable: Process results line-by-line
Append-friendly: Add results as scan progresses
Grep-friendly: Search with standard tools
Resilient: Corruption affects single line, not entire file
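
The streaming and resilience properties are easy to see in a small reader. The sketch below processes results one line at a time and skips lines that fail to parse; field names follow the example records above, and the function name `summarize_results` is hypothetical.

```python
import json
from collections import defaultdict


def summarize_results(lines):
    """Collect open ports and vulnerabilities from an iterable of JSONL lines."""
    open_ports = defaultdict(list)
    vulns = []
    skipped = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            skipped += 1  # a damaged line does not affect the others
            continue
        if not isinstance(record, dict):
            skipped += 1
            continue
        if "vulnerability" in record:
            vulns.append((record["host"], record.get("port"), record["vulnerability"]["id"]))
        elif record.get("state") == "open" and "port" in record:
            open_ports[record["host"]].append(record["port"])
    return dict(open_ports), vulns, skipped
```

Passing an open file object (for example `open("results.jsonl")`) works directly, since files iterate line by line.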

Configuration

Global Configuration

Control workspace behavior in ~/.config/pentora/config.yaml:

workspace:
  # Workspace root directory
  dir: ~/.local/share/pentora

  # Enable workspace persistence
  enabled: true

  # Automatically create directory if missing
  auto_create: true

  # Retention policies
  retention:
    enabled: true
    max_age: 90d              # Delete scans older than 90 days
    max_scans: 1000           # Keep at most 1000 scans
    min_free_space: 10GB      # Delete oldest when space low

  # Scan storage options
  scans:
    compress: false           # Gzip compress results
    keep_artifacts: true      # Store banners/screenshots
    keep_pcaps: false         # Store packet captures (large)

  # Cache settings
  cache:
    fingerprints:
      ttl: 7d                 # Refresh fingerprint catalog after 7 days
    dns:
      ttl: 24h
      max_entries: 10000

Workspace-Specific Configuration

Override global settings per workspace in <workspace>/config/pentora.yaml:

# This workspace uses custom settings
workspace:
  retention:
    max_age: 30d              # Shorter retention for this workspace

scanner:
  # Default profile for this workspace
  default_profile: webapp

notifications:
  # Workspace-specific notification channels
  default_channels:
    - slack://workspace-alerts
    - email://security-team@company.com
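
One plausible reading of "override" is a recursive merge, where the workspace file replaces only the keys it sets and everything else is inherited from the global file. The sketch below illustrates that reading; whether Pentora merges exactly this way is an assumption, and `merge_config` is a hypothetical helper.

```python
def merge_config(base, override):
    """Recursively overlay `override` on `base` without mutating either."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


global_cfg = {"workspace": {"retention": {"enabled": True, "max_age": "90d", "max_scans": 1000}}}
workspace_cfg = {
    "workspace": {"retention": {"max_age": "30d"}},
    "scanner": {"default_profile": "webapp"},
}
effective = merge_config(global_cfg, workspace_cfg)
```

Under this reading, `effective` keeps `max_scans: 1000` from the global file while `max_age` becomes `30d`.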

CLI Control

Specify Workspace Directory

Override the default workspace location:

# Use custom workspace
pentora scan --targets 192.168.1.100 --workspace-dir /data/pentora-workspace

# Use environment variable
export PENTORA_WORKSPACE=/data/pentora-workspace
pentora scan --targets 192.168.1.100

Disable Workspace (Stateless Mode)

Run scans without persisting to workspace:

pentora scan --targets 192.168.1.100 --no-workspace

Use cases for stateless mode:

Quick ad-hoc scans
CI/CD pipelines (ephemeral environments)
Minimal disk usage
Privacy (no scan history)

Results are still written to stdout or to the specified output file, but they are not saved to the workspace.

Workspace Management Commands

# Show workspace information
pentora workspace info

# List all scans
pentora workspace list

# Show specific scan
pentora workspace show <scan-id>

# Garbage collection (cleanup old scans)
pentora workspace gc

# Remove scans older than 30 days
pentora workspace gc --older-than 30d

# Keep only last 100 scans
pentora workspace gc --keep-last 100

# Clean cache
pentora workspace clean-cache

# Export scan
pentora workspace export <scan-id> --format json --output scan.json

# Delete specific scan
pentora workspace delete <scan-id>

# Initialize new workspace
pentora workspace init /path/to/new/workspace

Retention Policies

Automatic cleanup based on configured policies:

Age-Based Retention

Delete scans older than specified duration:

workspace:
  retention:
    max_age: 90d              # 90 days

Supported units: d (days), w (weeks), m (months), y (years)
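
A duration string in this format can be parsed as shown below. The sketch assumes a month counts as 30 days and a year as 365 days, which the text above does not specify; `parse_retention` is a hypothetical helper.

```python
import re
from datetime import timedelta

# Assumed day counts per unit; months and years are approximations.
_UNIT_DAYS = {"d": 1, "w": 7, "m": 30, "y": 365}


def parse_retention(value):
    """Convert strings such as '90d' or '2w' into a timedelta."""
    match = re.fullmatch(r"(\d+)([dwmy])", value.strip())
    if match is None:
        raise ValueError(f"unsupported retention value: {value!r}")
    count, unit = int(match.group(1)), match.group(2)
    return timedelta(days=count * _UNIT_DAYS[unit])
```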

Count-Based Retention

Keep only the N most recent scans:

workspace:
  retention:
    max_scans: 1000           # Keep last 1000 scans

The oldest scans are deleted first.

Space-Based Retention

Delete oldest scans when disk space is low:

workspace:
  retention:
    min_free_space: 10GB      # Maintain at least 10GB free

Pentora monitors the workspace filesystem and triggers cleanup when free space drops below the threshold.
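
To make the interaction of the age and count policies concrete, here is a minimal sketch of a selection step. It assumes a scan is removed if either policy marks it, which is one reasonable interpretation rather than documented behavior; space-based cleanup is left out, and `select_for_deletion` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def select_for_deletion(scans, now, max_age=None, max_scans=None):
    """Return scan IDs to delete, oldest first.

    `scans` maps scan ID -> start time (datetime).
    """
    ordered = sorted(scans, key=scans.get)  # oldest first
    doomed = set()
    if max_age is not None:
        doomed.update(s for s in ordered if now - scans[s] > max_age)
    if max_scans is not None and len(ordered) > max_scans:
        # Everything except the newest `max_scans` entries.
        doomed.update(ordered[: len(ordered) - max_scans])
    return [s for s in ordered if s in doomed]
```

This mirrors what `pentora workspace gc --dry-run` is described as reporting: a list of candidates, with nothing deleted.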

Manual Garbage Collection

Run cleanup on demand:

# Apply configured retention policies
pentora workspace gc

# Override with custom rules
pentora workspace gc --older-than 30d --keep-last 50

# Dry run (show what would be deleted)
pentora workspace gc --dry-run

# Force deletion without confirmation
pentora workspace gc --force

Permissions

File Permissions

Workspace directories and files are created with user-only access:

Directories: 0700 (rwx------)
Files: 0600 (rw-------)

This prevents other users on the system from reading scan results.
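
If a workspace has been copied or restored, its modes may no longer match these defaults. The following sketch walks a directory and lists anything readable, writable, or executable by group or others. It is an independent check written for this page, not a Pentora command, and `find_loose_permissions` is a hypothetical name.

```python
import os
import stat


def find_loose_permissions(root):
    """Return (path, mode) pairs that grant any access to group or others."""
    loose = []
    for dirpath, _dirnames, filenames in os.walk(root):
        candidates = [dirpath] + [os.path.join(dirpath, name) for name in filenames]
        for path in candidates:
            info = os.lstat(path)
            if stat.S_ISLNK(info.st_mode):
                continue  # symlink modes are not meaningful
            mode = stat.S_IMODE(info.st_mode)
            if mode & (stat.S_IRWXG | stat.S_IRWXO):
                loose.append((path, oct(mode)))
    return sorted(loose)
```

An empty result means every directory and file under the root is restricted to its owner.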

Multi-User Scenarios

For shared workspaces:

Use a dedicated user:

sudo useradd -r -s /bin/false pentora
sudo mkdir /var/lib/pentora
sudo chown pentora:pentora /var/lib/pentora

Run server as dedicated user:

sudo -u pentora pentora server start --workspace-dir /var/lib/pentora

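Grant specific users access by adding them to the pentora group:

sudo usermod -a -G pentora alice
sudo usermod -a -G pentora bob

Group membership alone is not sufficient while the workspace keeps the default user-only modes (0700/0600). Group permissions on /var/lib/pentora must also be opened, for example 0750 for directories and 0640 for files.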

Enterprise Multi-Tenant

In Enterprise multi-tenant mode, the workspace structure changes to one subtree per tenant:

pentora/
├── tenants/
│   ├── customer-a/
│   │   ├── scans/
│   │   ├── reports/
│   │   └── config/
│   ├── customer-b/
│   │   └── ...

Tenant-level ACLs enforce isolation. See Multi-Tenant Deployment.

Workspace Lifecycle

Initialization

The workspace is created automatically on first use:

pentora scan --targets 192.168.1.100
# Creates ~/.local/share/pentora/ if missing

Explicit initialization:

pentora workspace init /path/to/workspace

Migration

When upgrading Pentora, workspace format may change:

# Check if migration needed
pentora workspace check

# Back up before migrating (-a preserves permissions and timestamps)
cp -a ~/.local/share/pentora ~/.local/share/pentora.backup

# Perform migration
pentora workspace migrate

Backup

Backup entire workspace:

# Filesystem backup (paths are stored relative to ~ so the archive restores with -C ~/)
tar -czf pentora-backup-$(date +%Y%m%d).tar.gz -C ~ .local/share/pentora

# Incremental backup with rsync
rsync -av ~/.local/share/pentora /mnt/backups/pentora/

Export individual scans:

pentora workspace export <scan-id> --output scan-backup.json

Restoration

Restore from backup:

# Stop server if running
pentora server stop

# Restore workspace
tar -xzf pentora-backup-20231006.tar.gz -C ~/

# Verify integrity
pentora workspace check

# Restart server
pentora server start

Performance Considerations

Disk Space

Monitor workspace disk usage:

# Show workspace size
du -sh ~/.local/share/pentora

# Show per-directory breakdown (-d 1 works with both GNU and BSD/macOS du)
du -h -d 1 ~/.local/share/pentora

Space usage estimates:

Small scan (10 hosts, 100 ports): ~1-5 MB
Medium scan (100 hosts, 1000 ports): ~10-50 MB
Large scan (1000 hosts, 1000 ports): ~100-500 MB
With packet captures: 10-100x larger
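
On systems without `du`, or when the numbers are needed inside a script, a per-directory breakdown can be computed as sketched below. It sums apparent file sizes in bytes, so results can differ slightly from `du`, which counts disk blocks. `directory_sizes` is a hypothetical helper.

```python
import os


def directory_sizes(root):
    """Return {subdirectory name: total size in bytes} for a workspace root."""
    sizes = {}
    for entry in os.scandir(root):
        if not entry.is_dir(follow_symlinks=False):
            continue
        total = 0
        for dirpath, _dirnames, filenames in os.walk(entry.path):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if not os.path.islink(path):
                    total += os.path.getsize(path)
        sizes[entry.name] = total
    return sizes
```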

Compression

Enable compression to save space:

workspace:
  scans:
    compress: true            # Gzip compress results.jsonl

Compression reduces storage by 70-90% for text-based results. Compressed results are decompressed transparently when read.
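
Scripts that read scan directories directly need to handle both cases themselves. The sketch below assumes compressed results are stored as `results.jsonl.gz` next to where `results.jsonl` would be; that file name is a guess and should be checked against an actual workspace. `open_results` is a hypothetical helper.

```python
import gzip
import os


def open_results(scan_dir):
    """Open a scan's results as text, whether or not they are gzip-compressed."""
    plain = os.path.join(scan_dir, "results.jsonl")
    packed = plain + ".gz"  # assumed name for compressed results
    if os.path.exists(packed):
        return gzip.open(packed, "rt", encoding="utf-8")
    return open(plain, "r", encoding="utf-8")
```

Either way the caller gets a text stream that yields one JSON record per line.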

Database Backend (Enterprise)

For very large deployments, use database storage instead of filesystem:

workspace:
  backend: postgresql
  database:
    host: localhost
    port: 5432
    database: pentora
    user: pentora
    password: ${PENTORA_DB_PASSWORD}

Benefits:

Better performance with millions of scans
Transactional integrity
Easier replication and backup
Advanced querying

Server Mode Integration

In server mode, the workspace serves additional roles:

Job Queue

Scheduled scans are stored in queue/, with each job file living in exactly one state directory at a time:

queue/
├── pending/
│   ├── job-002.json
│   └── job-004.json
├── running/
│   └── job-001.json
└── failed/
    └── job-003.json

Worker processes poll queue/pending/, execute scans, and write results to scans/.
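
A common way to implement this kind of directory-based queue is to claim a job by renaming its file from pending/ to running/. On a local POSIX filesystem a rename is atomic, so two workers cannot both claim the same job. The sketch below illustrates the pattern; it is not Pentora's worker code, `claim_next_job` is a hypothetical name, and network filesystems may give weaker guarantees.

```python
import os


def claim_next_job(queue_dir):
    """Move the first pending job into running/ and return its new path."""
    pending = os.path.join(queue_dir, "pending")
    running = os.path.join(queue_dir, "running")
    for name in sorted(os.listdir(pending)):
        src = os.path.join(pending, name)
        dst = os.path.join(running, name)
        try:
            os.rename(src, dst)  # atomic claim on the same filesystem
        except FileNotFoundError:
            continue  # another worker claimed this job first
        return dst
    return None  # nothing left to claim
```

A worker would call this in a loop, run the scan described by the returned file, and then remove the file or move it to failed/ depending on the outcome.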

Distributed Scanning (Enterprise)

Workers on multiple hosts share a centralized workspace (via NFS, object storage, or database):

[CLI] → [Queue] → [Worker Pool] → [Shared Workspace]
                      ↓
                  [Worker 1]
                  [Worker 2]
                  [Worker 3]

See Distributed Scanning for architecture.

Troubleshooting

Workspace Permission Errors

Error: Failed to create workspace directory: permission denied

Solution: Ensure you have write permissions:

mkdir -p ~/.local/share/pentora
chmod 700 ~/.local/share/pentora

Disk Space Exhausted

Error: Failed to write results: no space left on device

Solutions:

Run garbage collection: pentora workspace gc --older-than 7d
Enable compression: Set workspace.scans.compress: true
Reduce artifact retention: Set workspace.scans.keep_artifacts: false
Use different workspace location with more space

Corrupted Scan Data

If a scan directory is incomplete or corrupted:

# Check scan status
pentora workspace show <scan-id>

# Delete corrupted scan
pentora workspace delete <scan-id>

# Re-run scan
pentora scan --targets <original-targets>

Workspace Migration Failed

If migration fails:

# Restore from backup
mv ~/.local/share/pentora ~/.local/share/pentora.failed
tar -xzf pentora-backup.tar.gz -C ~/

# Report issue with logs
pentora workspace check --verbose > migration-error.log 2>&1

Next Steps

Scan Pipeline - How scan data is generated
Server Mode Deployment - Running Pentora as a service
Workspace Configuration - Detailed configuration
Distributed Scanning - Multi-host workspace sharing
