My AI and Kali Adventures: LLM Testing and Proxmox Debugging

Session Overview

This session combined two fascinating homelab challenges: testing local LLM (Large Language Model) deployment on limited hardware, and debugging a stubborn Kali Linux VM that refused to boot on Proxmox. Both problems taught valuable lessons about resource management and virtualization troubleshooting.

Part 1: Local LLM Testing on Consumer Hardware

The Goal

Run open-source LLMs locally on a ThinkPad E14 Gen 2 for:

  • Privacy-preserving AI assistance
  • Offline access to AI capabilities
  • Learning about LLM quantization and optimization
  • Code completion and development support

Hardware Constraints

ThinkPad E14 Gen 2 Specifications:

  • CPU: Intel i5-1135G7 (4 cores, 8 threads)
  • RAM: 16 GB DDR4
  • Storage: 256 GB SSD
  • GPU: Intel Iris Xe (integrated, Vulkan support)
  • Limitations: No dedicated GPU, limited RAM

LLM Model Evaluation

Attempt 1: GPT-OSS-20B

Model Specifications:

  • Size: 21 billion parameters
  • Quantization: 4-bit (Q4)
  • File Size: ~12 GB
  • License: Apache 2.0 (commercial use allowed)

Compatibility Assessment:

Model Requirements:
├── VRAM: ~8 GB (GPU)
├── System RAM: ~16 GB minimum
├── Storage: 12 GB for model
└── Performance: Dedicated GPU recommended

Available Resources:
├── VRAM: ~2 GB (shared with system)
├── System RAM: 16 GB (but OS needs ~4 GB)
├── Storage: 256 GB (adequate)
└── GPU: Integrated Iris Xe

Result: ❌ Incompatible

Reasons:

  1. Insufficient RAM: The model would consume ~12 GB, leaving only ~4 GB for the OS
  2. No Dedicated GPU: Iris Xe can’t handle a 20B-parameter model efficiently
  3. Performance: Would experience extreme slowdowns (10+ seconds per token)
  4. Risk of System Crash: Out-of-memory errors likely

Attempt 2: Mistral 7B (Q4)

Model Options Discovered:

  1. mistral-7b-ielts-evaluator-q4 (specialized for IELTS evaluation)
  2. Mistral-7B-Instruct-v0.1-Q4_K_M (general purpose) ✓ Recommended

Model Specifications:

  • Size: 7 billion parameters
  • Quantization: Q4_K_M (4-bit with mixed precision)
  • File Size: ~4.3 GB
  • Performance: Optimized for CPU inference

Compatibility Assessment:

Model Requirements:
├── RAM: ~6 GB (model + overhead)
├── Storage: 4.3 GB
├── CPU: Multi-core (4+ cores recommended)
└── Acceleration: AVX2, Vulkan optional

Available Resources:
├── RAM: 16 GB ✓
├── Storage: 256 GB ✓
├── CPU: i5-1135G7 (4 cores) ✓
└── Acceleration: AVX2 ✓, Vulkan ✓

Result: ✅ Compatible

Expected Performance:

  • Tokens per second: 2-5 (CPU)
  • Response time: 10-30 seconds for a typical query
  • Memory usage: ~6 GB during inference
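
These figures can be sanity-checked on the actual CPU with llama.cpp’s llama-bench tool. A minimal sketch, assuming llama.cpp has been built locally and the GGUF file name matches your download:

# Benchmark prompt processing (512 tokens) and generation (128 tokens) on 6 CPU threads
./llama-bench -m Mistral-7B-Instruct-v0.1-Q4_K_M.gguf -t 6 -p 512 -n 128

The tokens/sec reported for the generation run is the number that should land in the 2-5 range estimated above.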

LLM Deployment with LM Studio

Installation

# Download LM Studio (AppImage; check lmstudio.ai for the current Linux download URL)
wget https://lmstudio.ai/download/linux/latest -O lmstudio.AppImage

# Make executable
chmod +x lmstudio.AppImage

# Run
./lmstudio.AppImage
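
If the AppImage refuses to launch, a missing FUSE library is the usual culprit on Debian-based systems (an assumption worth checking before deeper debugging):

# AppImages require FUSE; install libfuse2 if ./lmstudio.AppImage fails to start
sudo apt install libfuse2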

Configuration for Low-Resource Systems

Settings → Inference:

  • Context Length: 2048 (reduce from 4096 for faster inference)
  • GPU Offload: 0 layers (CPU-only on the integrated GPU)
  • Threads: 6 (leave 2 threads for the system)
  • Batch Size: 8 (lower = slower but more stable)
  • Use mlock: ✓ (keep the model in RAM, prevent swapping)

Settings → Model:

  • Temperature: 0.7
  • Top P: 0.9
  • Repeat Penalty: 1.1
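
With the model loaded, LM Studio’s built-in local server (OpenAI-compatible, default port 1234, if enabled) makes it easy to exercise these settings from a terminal. A minimal sketch; the model identifier is a placeholder and should match whatever LM Studio reports:

# Send a test prompt to the locally served model
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "mistral-7b-instruct-v0.1",
        "messages": [{"role": "user", "content": "Write a Python one-liner that reverses a string."}],
        "temperature": 0.7,
        "max_tokens": 128
      }'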

Alternative LLM Options for Constrained Hardware

TinyLlama 1.1B (Ultra-Lightweight)

Specifications:

  • Size: 1.1 billion parameters
  • File Size: ~700 MB (Q4)
  • RAM Usage: ~2 GB
  • Performance: 10-20 tokens/sec on CPU

Use Cases:

  • Code completion
  • Simple chat
  • Quick prototyping

Installation:

# In LM Studio, search for:
TinyLlama-1.1B-Chat-v1.0-Q4_K_M

Phi-2 (2.7B) - Microsoft

Specifications:

  • Size: 2.7 billion parameters
  • File Size: ~1.6 GB (Q4)
  • RAM Usage: ~3 GB
  • Performance: 5-10 tokens/sec

Advantages:

  • Trained on high-quality code and reasoning data
  • Excellent for programming tasks
  • Efficient architecture

Installation:

# In LM Studio:
microsoft/phi-2-Q4_K_M

Performance Optimization Techniques

1. Use Quantization

# Quantization levels (from most to least compressed):
Q2_K   # 2-bit: Fastest, lowest quality
Q4_K_M # 4-bit: Balanced (recommended)
Q5_K_M # 5-bit: Better quality, slower
Q8_0   # 8-bit: Best quality, slowest
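
Pre-quantized GGUF files are usually downloaded directly, but for reference, llama.cpp’s quantize tool (named llama-quantize in newer builds) converts a full-precision GGUF into any of these levels. A sketch with placeholder file names:

# Convert an f16 GGUF to Q4_K_M (input and output names are placeholders)
./quantize mistral-7b-instruct-f16.gguf mistral-7b-instruct-Q4_K_M.gguf Q4_K_M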

2. Optimize CPU Inference

# Cap OpenMP/BLAS worker threads (AVX2 support is detected automatically by llama.cpp builds)
export OMP_NUM_THREADS=6

# Use llama.cpp with optimizations
./main -m model.gguf \
    -t 6 \
    --mlock \
    -ngl 0 \
    --ctx-size 2048

3. Monitor Resource Usage

# Install monitoring tools
sudo apt install htop nvtop

# Monitor during inference
htop  # CPU and RAM
watch -n 1 free -h  # Memory usage
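
To keep a record of memory pressure while a long prompt runs, a simple logging loop is enough; a sketch, with the interval and log file adjustable to taste:

# Append used memory (MB) to mem.log once per second during inference
while true; do
    echo "$(date +%T) used_MB=$(free -m | awk 'NR==2 {print $3}')" >> mem.log
    sleep 1
done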

Part 2: Proxmox VM Debugging - Kali Linux Boot Failure

The Error

Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)

Problem Analysis

This cryptic error indicates the kernel can’t find or mount the root filesystem. Common causes in virtualized environments (quick checks for narrowing them down are sketched after the list):

  1. GRUB Misconfiguration: Wrong root= parameter
  2. Missing EFI Partition: UEFI boot without EFI disk
  3. Incompatible Disk Controller: Wrong SCSI controller type
  4. Corrupted Initramfs: Missing drivers in initial ramdisk
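
With unknown-block(0,0) the kernel often panics before reaching a shell, but if boot gets as far as the (initramfs) emergency prompt, a few standard commands help separate these causes — a sketch:

# Inside the (initramfs) emergency shell
cat /proc/cmdline        # shows the root= parameter GRUB passed to the kernel
blkid                    # lists detected block devices, filesystems, and UUIDs
ls /dev/sd* /dev/vd*     # confirms whether the kernel sees the virtual disk at all

If no /dev/sd* or /dev/vd* node exists at all, the disk controller (cause 3) is the prime suspect.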

Debugging Process

Step 1: Verify BIOS Type

Initial Configuration:

  • BIOS: SeaBIOS (legacy BIOS)
  • Machine Type: i440fx
  • Disk: SCSI with iothread enabled

Warning Encountered:

WARN: iothread is only valid with virtio disk or virtio-scsi-single controller, ignoring

Issue: SCSI controller type doesn’t support iothread, but warning was ignored during VM creation.

Step 2: Try OVMF (UEFI) Boot

Changes Made:

  1. BIOS: SeaBIOS → OVMF (UEFI)
  2. Added EFI Disk via Hardware → Add → EFI Disk
  3. Machine Type: i440fx → q35 (modern chipset)

Result: ❌ Still failed to boot

Analysis: OVMF expects an EFI System Partition and an EFI boot loader set up during OS installation. Switching the BIOS type after installation therefore doesn’t work for a VM installed in legacy mode.
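
For reference, the same changes could have been made from the Proxmox host shell — a sketch, assuming VM ID 101 and local-lvm storage (and, as noted above, it did not fix this boot failure):

# Switch to UEFI firmware and the q35 chipset, then add an EFI vars disk
qm set 101 --bios ovmf --machine q35
qm set 101 --efidisk0 local-lvm:1,efitype=4m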

Step 3: Fix SCSI Controller

Root Cause Identified:

The warning about iothread was critical:

  • iothread requires a VirtIO disk or the VirtIO SCSI Single controller
  • The VM was using the standard (default) SCSI controller
  • The guest kernel could not drive the disk properly, producing the unknown-block(0,0) panic

Resolution (a qm command-line equivalent is sketched after these steps):

  1. Shutdown VM

  2. Remove existing disk

  3. Add disk with correct configuration:

    • Bus/Device: SCSI
    • Storage: local-lvm
    • SCSI Controller: VirtIO SCSI Single
    • iothread: ✓
  4. Update VM Configuration:

    # Edit VM config
    nano /etc/pve/qemu-server/101.conf
    
    # Ensure controller is set correctly:
    scsihw: virtio-scsi-single
    
    # Verify disk uses correct controller:
    scsi0: local-lvm:vm-101-disk-0,iothread=1
  5. Reboot VM
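
The same controller fix can also be applied from the Proxmox host shell instead of the web UI — a sketch, assuming VM ID 101 and the existing disk volume from the config above:

# Switch the SCSI controller and re-attach the existing disk with iothread enabled
qm set 101 --scsihw virtio-scsi-single
qm set 101 --scsi0 local-lvm:vm-101-disk-0,iothread=1
qm start 101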

Result: ✅ Boot Successful! (BIOS and machine type were reverted to SeaBIOS and i440fx to match the original legacy-mode installation, as reflected in the final configuration below.)

Final Working Configuration

# /etc/pve/qemu-server/101.conf

agent: 1
bios: seabios
boot: order=scsi0;ide2
cores: 4
cpu: host
ide2: local:iso/kali-linux-2024.4-installer-amd64.iso,media=cdrom
machine: i440fx
memory: 8192
name: kali-pentest
net0: virtio=XX:XX:XX:XX:XX:XX,bridge=vmbr1,firewall=0
numa: 0
ostype: l26
scsi0: local-lvm:vm-101-disk-0,iothread=1,size=80G
scsihw: virtio-scsi-single
smbios1: uuid=XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
sockets: 1
vmgenid: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX

Key Lessons from Debugging

  1. Read Warnings Carefully: The iothread warning was the critical clue
  2. BIOS Type Matters: Can’t switch between SeaBIOS and OVMF post-install
  3. Controller Compatibility: VirtIO SCSI Single required for advanced features
  4. Machine Type: q35 for modern hardware, i440fx for compatibility

Recommendations for Similar Setups

For LLM Testing on Limited Hardware:

Recommended Models (by use case):

Light Tasks (2-4 GB RAM available):
├── TinyLlama 1.1B (~700 MB, fast)
└── Phi-2 2.7B (~1.6 GB, good for code)

Medium Tasks (6-8 GB RAM available):
├── Mistral 7B Q4 (~4.3 GB, balanced)
├── Llama 2 7B Q4 (~3.8 GB, general purpose)
└── CodeLlama 7B Q4 (~4.3 GB, coding focused)

Heavy Tasks (12+ GB RAM available):
└── Upgrade hardware or use cloud services

Optimization Checklist:

  - [ ] Use Q4 quantization for the best size/quality balance
  - [ ] Enable AVX2 CPU acceleration
  - [ ] Set context length to 2048 for faster inference
  - [ ] Use ~75% of available CPU threads
  - [ ] Monitor RAM usage with htop
  - [ ] Consider an external SSD if storage is limited

For Proxmox VM Configuration:

Best Practices:

SeaBIOS (Legacy Boot):
├── Machine: i440fx
├── SCSI Controller: VirtIO SCSI Single
├── Disk: SCSI with iothread=1
└── Use for: Linux, older OSes

OVMF (UEFI Boot):
├── Machine: q35
├── Add EFI Disk during VM creation
├── SCSI Controller: VirtIO SCSI Single
└── Use for: Modern OSes, Windows, secure boot

Troubleshooting Workflow:

  1. Check Proxmox Warnings: Don’t ignore yellow text in logs
  2. Verify Controller Compatibility: Match controller to features
  3. Test Boot Order: Ensure disk is first in boot sequence
  4. Check GRUB Config: Verify root= parameter in VM console
  5. Use Snapshots: Before making major configuration changes
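
For that last point, taking a snapshot before risky changes is a one-liner on the Proxmox host — a sketch, assuming VM ID 101 and snapshot-capable storage such as LVM-thin:

# Snapshot before changing firmware, controllers, or disks
qm snapshot 101 pre-controller-change

# Roll back if the VM no longer boots
qm rollback 101 pre-controller-change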

Tools and Resources

LLM Tools:

  • LM Studio: User-friendly GUI for model management
  • llama.cpp: Command-line inference engine
  • Ollama: Command-line LLM runtime with a Docker-style pull/run workflow (example below)
  • Jan AI: Lightweight alternative to LM Studio
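
As an example of the Ollama workflow mentioned above — a sketch; the exact model tag may differ depending on which quantization the registry serves:

# Pull and chat with a Mistral 7B model via Ollama
ollama pull mistral
ollama run mistral "Summarize the difference between Q4 and Q8 quantization."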

Proxmox Debugging:

  • System Logs: /var/log/pve/tasks/
  • VM Logs: /var/log/qemu-server/
  • Configuration: /etc/pve/qemu-server/

Conclusion

This session demonstrated two critical homelab skills:

  1. Resource Assessment: Understanding hardware limitations and choosing appropriate solutions (7B models vs 20B models)
  2. Systematic Debugging: Following error messages and warnings to identify root causes (iothread + controller compatibility)

Both challenges reinforced the importance of:

  • Reading documentation thoroughly
  • Monitoring resource constraints
  • Testing incrementally
  • Documenting configurations

The working Kali VM and functional LLM setup provide a solid foundation for future security research and AI-assisted development work.

Next steps: Integrate Mistral 7B with coding workflows and deploy vulnerable VMs for Kali pentesting practice.