My AI and Kali Adventures: LLM Testing and Proxmox Debugging
Session Overview
This session combined two fascinating homelab challenges: testing local LLM (Large Language Model) deployment on limited hardware, and debugging a stubborn Kali Linux VM that refused to boot on Proxmox. Both problems taught valuable lessons about resource management and virtualization troubleshooting.
Part 1: Local LLM Testing on Consumer Hardware
The Goal
Run open-source LLMs locally on a ThinkPad E14 Gen 2 for:
- Privacy-preserving AI assistance
- Offline access to AI capabilities
- Learning about LLM quantization and optimization
- Code completion and development support
Hardware Constraints
ThinkPad E14 Gen 2 Specifications:
- CPU: Intel i5-1135G7 (4 cores, 8 threads)
- RAM: 16 GB DDR4
- Storage: 256 GB SSD
- GPU: Intel Iris Xe (integrated, Vulkan support)
- Limitations: No dedicated GPU, limited RAM
LLM Model Evaluation
Attempt 1: GPT-OSS-20B
Model Specifications:
- Size: 21 billion parameters
- Quantization: 4-bit (Q4)
- File Size: ~12 GB
- License: Apache 2.0 (commercial use allowed)
Compatibility Assessment:
Model Requirements:
├── VRAM: ~8 GB (GPU)
├── System RAM: ~16 GB minimum
├── Storage: 12 GB for model
└── Performance: Dedicated GPU recommended
Available Resources:
├── VRAM: ~2 GB (shared with system)
├── System RAM: 16 GB (but OS needs ~4 GB)
├── Storage: 256 GB (adequate)
└── GPU: Integrated Iris Xe
Result: ❌ Incompatible
Reasons:
1. Insufficient RAM: Model would consume ~12 GB, leaving only 4 GB for the OS
2. No Dedicated GPU: Iris Xe can't handle a 20B parameter model efficiently
3. Performance: Would experience extreme slowdowns (10+ seconds per token)
4. Risk of System Crash: Out-of-memory errors likely
Attempt 2: Mistral 7B (Q4)
Model Options Discovered:
1. mistral-7b-ielts-evaluator-q4 (specialized for IELTS evaluation)
2. Mistral-7B-Instruct-v0.1-Q4_K_M (general purpose) ✓ Recommended
Model Specifications:
- Size: 7 billion parameters
- Quantization: Q4_K_M (4-bit with mixed precision)
- File Size: ~4.3 GB
- Performance: Optimized for CPU inference
Compatibility Assessment:
Model Requirements:
├── RAM: ~6 GB (model + overhead)
├── Storage: 4.3 GB
├── CPU: Multi-core (4+ cores recommended)
└── Acceleration: AVX2, Vulkan optional
Available Resources:
├── RAM: 16 GB ✓
├── Storage: 256 GB ✓
├── CPU: i5-1135G7 (4 cores) ✓
└── Acceleration: AVX2 ✓, Vulkan ✓
Result: ✅ Compatible
Expected Performance:
- Tokens per second: 2-5 (CPU)
- Response time: 10-30 seconds for a typical query
- Memory usage: ~6 GB during inference
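Before loading any model, it's worth a quick sanity check that the GGUF file plus a couple of gigabytes of headroom actually fits into available RAM. A minimal sketch (the model path is an assumption; point it at wherever LM Studio stores downloads):
# Hypothetical pre-flight check: model file size vs. currently available RAM
MODEL=~/models/Mistral-7B-Instruct-v0.1-Q4_K_M.gguf   # assumed path
MODEL_GB=$(du -BG "$MODEL" | cut -f1 | tr -d 'G')
FREE_GB=$(free -g | awk '/^Mem:/ {print $7}')
echo "Model: ${MODEL_GB} GB, available RAM: ${FREE_GB} GB"
if [ "$FREE_GB" -ge $((MODEL_GB + 2)) ]; then
    echo "Should fit with ~2 GB of headroom"
else
    echo "Too tight - pick a smaller quantization"
fi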
LLM Deployment with LM Studio
Installation
# Download LM Studio
wget https://lmstudio.ai/download/linux/latest -O lmstudio.AppImage
# Make executable
chmod +x lmstudio.AppImage
# Run
./lmstudio.AppImage
Configuration for Low-Resource Systems
Settings → Inference:
- Context Length: 2048 (reduced from 4096 for faster inference)
- GPU Offload: 0 layers (CPU-only on the integrated GPU)
- Threads: 6 (leave 2 threads for the system)
- Batch Size: 8 (lower = slower but more stable)
- Use mlock: ✓ (keep the model in RAM, prevent swapping)
Settings → Model:
- Temperature: 0.7
- Top P: 0.9
- Repeat Penalty: 1.1
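LM Studio can also expose these settings through its local OpenAI-compatible server, which makes the model scriptable. A minimal sketch, assuming the server feature is enabled on its default port 1234 with the Mistral model loaded:
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Summarize what Q4_K_M quantization means."}],
    "temperature": 0.7,
    "top_p": 0.9,
    "max_tokens": 256
  }'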
Alternative LLM Options for Constrained Hardware
TinyLlama 1.1B (Ultra-Lightweight)
Specifications:
- Size: 1.1 billion parameters
- File Size: ~700 MB (Q4)
- RAM Usage: ~2 GB
- Performance: 10-20 tokens/sec on CPU
Use Cases:
- Code completion
- Simple chat
- Quick prototyping
Installation:
# In LM Studio, search for:
TinyLlama-1.1B-Chat-v1.0-Q4_K_M
Phi-2 (2.7B) - Microsoft
Specifications:
- Size: 2.7 billion parameters
- File Size: ~1.6 GB (Q4)
- RAM Usage: ~3 GB
- Performance: 5-10 tokens/sec
Advantages:
- Trained on high-quality code and reasoning data
- Excellent for programming tasks
- Efficient architecture
Installation:
# In LM Studio:
microsoft/phi-2-Q4_K_M
Performance Optimization Techniques
1. Use Quantization
# Quantization levels (from most to least compressed):
Q2_K # 2-bit: Fastest, lowest quality
Q4_K_M # 4-bit: Balanced (recommended)
Q5_K_M # 5-bit: Better quality, slower
Q8_0    # 8-bit: Best quality, slowest
2. Optimize CPU Inference
# Enable AVX2 acceleration
export OMP_NUM_THREADS=6
# Use llama.cpp with optimizations
./main -m model.gguf \
-t 6 \
--mlock \
-ngl 0 \
--ctx-size 2048
3. Monitor Resource Usage
# Install monitoring tools
sudo apt install htop nvtop
# Monitor during inference
htop # CPU and RAM
watch -n 1 free -h   # Memory usage
Part 2: Proxmox VM Debugging - Kali Linux Boot Failure
The Error
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
Problem Analysis
This cryptic error indicates the kernel can’t find or mount the root filesystem. Common causes in virtualized environments:
- GRUB Misconfiguration: Wrong root= parameter
- Missing EFI Partition: UEFI boot without an EFI disk
- Incompatible Disk Controller: Wrong SCSI controller type
- Corrupted Initramfs: Missing drivers in initial ramdisk
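For the GRUB-related causes, a quick check from the Kali installer ISO's rescue shell is to compare the root= parameter GRUB passes to the kernel against the partition that actually holds the root filesystem (device names below are assumptions for a typical single-disk VM):
# Boot the installer ISO, open a shell, then:
blkid                                     # list partitions and their UUIDs
mount /dev/sda1 /mnt                      # assumed root partition
grep -o 'root=[^ ]*' /mnt/boot/grub/grub.cfg | sort -u   # what GRUB hands to the kernel
# The root= value (device or UUID) should match the partition mounted above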
Debugging Process
Step 1: Verify BIOS Type
Initial Configuration:
- BIOS: SeaBIOS (legacy BIOS)
- Machine Type: i440fx
- Disk: SCSI with iothread enabled
Warning Encountered:
WARN: iothread is only valid with virtio disk or virtio-scsi-single controller, ignoring
Issue: The configured SCSI controller type doesn't support iothread, but the warning was ignored during VM creation.
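The same mismatch is visible from the Proxmox host shell with qm; the point is to check whether the disk's iothread flag is paired with a controller that actually supports it:
qm config 101 | grep -E 'scsihw|scsi0'
# iothread=1 on scsi0 only takes effect with scsihw: virtio-scsi-single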
Step 2: Try OVMF (UEFI) Boot
Changes Made:
1. BIOS: SeaBIOS → OVMF (UEFI)
2. Added EFI Disk via Hardware → Add → EFI Disk
3. Machine Type: i440fx → q35 (modern chipset)
Result: ❌ Still failed to boot
Analysis: OVMF requires a specific partition layout and an EFI boot loader installed during OS installation. Switching the BIOS type after installation doesn't work.
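A quick way to confirm whether an existing install can even boot via OVMF is to look for an EFI System Partition from the installer's rescue shell (the disk name is an assumption):
lsblk -o NAME,SIZE,FSTYPE,PARTTYPE,MOUNTPOINT /dev/sda
# An OVMF-bootable install needs a small FAT32 "EFI System" partition
# (GPT type c12a7328-f81f-11d2-ba4b-00a0c93ec93b) with GRUB's EFI loader on it;
# a legacy MBR/ext4-only layout has to stay on SeaBIOS.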
Step 3: Fix SCSI Controller
Root Cause Identified:
The warning about iothread was critical:
- iothread requires the VirtIO SCSI Single controller
- The VM was using a standard SCSI controller
- The kernel couldn't communicate with the disk properly
Resolution:
1. Shutdown VM
2. Remove existing disk
3. Add disk with correct configuration:
   - Bus/Device: SCSI
   - Storage: local-lvm
   - SCSI Controller: VirtIO SCSI Single
   - iothread: ✓
4. Update VM Configuration:
# Edit VM config
nano /etc/pve/qemu-server/101.conf
# Ensure controller is set correctly:
scsihw: virtio-scsi-single
# Verify disk uses correct controller:
scsi0: local-lvm:vm-101-disk-0,iothread=1
5. Reboot VM
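The same change can be made non-interactively from the Proxmox host with qm set; a sketch assuming VM ID 101 and the existing disk volume shown above:
qm shutdown 101
qm set 101 --scsihw virtio-scsi-single
qm set 101 --scsi0 local-lvm:vm-101-disk-0,iothread=1
qm start 101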
Result: ✅ Boot Successful!
Final Working Configuration
# /etc/pve/qemu-server/101.conf
agent: 1
bios: seabios
boot: order=scsi0;ide2
cores: 4
cpu: host
ide2: local:iso/kali-linux-2024.4-installer-amd64.iso,media=cdrom
machine: i440fx
memory: 8192
name: kali-pentest
net0: virtio=XX:XX:XX:XX:XX:XX,bridge=vmbr1,firewall=0
numa: 0
ostype: l26
scsi0: local-lvm:vm-101-disk-0,iothread=1,size=80G
scsihw: virtio-scsi-single
smbios1: uuid=XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
sockets: 1
vmgenid: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
Key Lessons from Debugging
- Read Warnings Carefully: The iothread warning was the critical clue
- BIOS Type Matters: Can’t switch between SeaBIOS and OVMF post-install
- Controller Compatibility: VirtIO SCSI Single required for advanced features
- Machine Type: q35 for modern hardware, i440fx for compatibility
Recommendations for Similar Setups
For LLM Testing on Limited Hardware:
Recommended Models (by use case):
Light Tasks (2-4 GB RAM available):
├── TinyLlama 1.1B (~700 MB, fast)
└── Phi-2 2.7B (~1.6 GB, good for code)
Medium Tasks (6-8 GB RAM available):
├── Mistral 7B Q4 (~4.3 GB, balanced)
├── Llama 2 7B Q4 (~3.8 GB, general purpose)
└── CodeLlama 7B Q4 (~4.3 GB, coding focused)
Heavy Tasks (12+ GB RAM available):
└── Upgrade hardware or use cloud services
Optimization Checklist:
- [ ] Use Q4 quantization for best size/quality balance
- [ ] Enable AVX2 CPU acceleration
- [ ] Set context length to 2048 for faster inference
- [ ] Use 75% of available CPU threads (see the snippet below)
- [ ] Monitor RAM usage with htop
- [ ] Consider an external SSD if storage is limited
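For the thread-count item, a small helper that derives 75% of the logical CPUs (rounded down) works on any Linux box and plugs straight into the llama.cpp invocation from earlier:
THREADS=$(( $(nproc) * 3 / 4 ))   # 8 logical CPUs -> 6 threads
echo "Using $THREADS inference threads"
./main -m model.gguf -t "$THREADS" --ctx-size 2048 --mlock -ngl 0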
For Proxmox VM Configuration:
Best Practices:
SeaBIOS (Legacy Boot):
├── Machine: i440fx
├── SCSI Controller: VirtIO SCSI Single
├── Disk: SCSI with iothread=1
└── Use for: Linux, older OSes
OVMF (UEFI Boot):
├── Machine: q35
├── Add EFI Disk during VM creation
├── SCSI Controller: VirtIO SCSI Single
└── Use for: Modern OSes, Windows, secure boot
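For reference, the SeaBIOS branch above collapses into a single qm create call. This is only a sketch: the VM ID (102), name, and disk size are placeholders, the ISO path reuses the Kali installer from earlier, and --machine is omitted because i440fx is Proxmox's default machine type.
qm create 102 --name kali-test --memory 8192 --cores 4 --cpu host \
  --bios seabios --ostype l26 \
  --scsihw virtio-scsi-single --scsi0 local-lvm:80,iothread=1 \
  --ide2 local:iso/kali-linux-2024.4-installer-amd64.iso,media=cdrom \
  --net0 virtio,bridge=vmbr1 --boot order='scsi0;ide2'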
Troubleshooting Workflow:
- Check Proxmox Warnings: Don’t ignore yellow text in logs
- Verify Controller Compatibility: Match controller to features
- Test Boot Order: Ensure disk is first in boot sequence
- Check GRUB Config: Verify root= parameter in VM console
- Use Snapshots: Before making major configuration changes
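The snapshot advice is cheap insurance before experiments like the BIOS switch above; a sketch using VM 101:
qm snapshot 101 pre-bios-change --description "before SeaBIOS -> OVMF test"
# ...experiment...
qm rollback 101 pre-bios-change      # revert if the VM no longer boots
qm delsnapshot 101 pre-bios-change   # or drop the snapshot once it's stable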
Tools and Resources
LLM Tools:
- LM Studio: User-friendly GUI for model management
- llama.cpp: Command-line inference engine
- Ollama: Command-line LLM runner with Docker-style model pulls
- Jan AI: Lightweight alternative to LM Studio
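Of these, Ollama is probably the quickest to script against; a minimal sketch of pulling and querying Mistral (exact tags and sizes in the Ollama library may differ):
ollama pull mistral      # the default tag is a ~4 GB 4-bit build
ollama run mistral "Explain what iothread does in Proxmox"
# Ollama also exposes a local REST API on port 11434:
curl http://localhost:11434/api/generate -d '{"model": "mistral", "prompt": "hello", "stream": false}'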
Proxmox Debugging:
- System Logs: /var/log/pve/tasks/
- VM Logs: /var/log/qemu-server/
- Configuration: /etc/pve/qemu-server/
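A few host-side commands that pair well with those paths when a VM refuses to start (again assuming VM ID 101):
qm status 101                       # current VM state
qm showcmd 101                      # full QEMU command line Proxmox would run
tail -f /var/log/pve/tasks/active   # follow the task index while starting the VM
journalctl -u pvedaemon -f          # daemon-side errors during qm start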
Conclusion
This session demonstrated two critical homelab skills:
- Resource Assessment: Understanding hardware limitations and choosing appropriate solutions (7B models vs 20B models)
- Systematic Debugging: Following error messages and warnings to identify root causes (iothread + controller compatibility)
Both challenges reinforced the importance of:
- Reading documentation thoroughly
- Monitoring resource constraints
- Testing incrementally
- Documenting configurations
The working Kali VM and functional LLM setup provide a solid foundation for future security research and AI-assisted development work.
Next steps: Integrate Mistral 7B with coding workflows and deploy vulnerable VMs for Kali pentesting practice.