What is the AI security score for CrewAI?

CrewAI scored 13/100 (Grade: F) in our AI security assessment. The score is based on enforcement maturity, context hygiene, and automation readiness. Assessed on 2026-03-11.

Is CrewAI EU AI Act compliant?

CrewAI has an EU AI Act readiness score of 12%. With enforcement beginning August 2, 2026, organizations using CrewAI will need to build additional compliance layers to meet regulatory requirements.


CrewAI AI Security Score

How does CrewAI handle context enforcement?

CrewAI scores 10/100 on Enforcement Maturity (Grade: F) and 10/100 on Context Hygiene (Grade: F). Zero pre-commit hooks or Claude Code hooks. AI agents can modify any file in the framework without structural gatekeeping. Security-critical agent orchestration logic and tool-use pathways have no modification guards.

What are the key AI security findings for CrewAI?

No Hook Enforcement: Zero pre-commit hooks or Claude Code hooks. AI agents can modify any file in the framework without structural gatekeeping. Security-critical agent orchestration logic and tool-use pathways have no modification guards.
No Test Coverage at Root Level: Zero test files detected at root level. No unified test command validates the entire framework. Contributors have no clear testing contract for a framework handling autonomous AI decision-making.
56 Potential Hardcoded Secrets: The highest count in our audit portfolio. No automated secret scanning in CI. API keys, tokens, or credentials may be embedded in source files with no convention for test-only credentials.

The leading multi-agent AI framework scores lowest in our governance portfolio.

25,000+ GitHub stars. Assessed: 2026-03-11.

Boundary Truth

Keep saved framework context separate from the next repo action

This page marks the saved scan, the right next step, and the limits as distinct zones.

Shown On This Page

Saved public scan from 2026-03-11

This page preserves a saved public-framework scan for CrewAI captured on 2026-03-11.
The score, findings, and raw stats show what the public default-branch scan surfaced for CrewAI at that time.
Use it as comparison context for how a major framework exposes AI security gaps, not as a current read on your own repository.

Next Step

Run the free scan before treating this as current repo findings

Use this saved framework example to decide whether the pattern is relevant enough to justify checking your own repository now.
Run the free scan on your repo before treating this page as current delivery context or a paid-services trigger.
Escalate to the baseline sprint only after a repo-level signal confirms a real gap, and reserve monitoring until baseline work exists.

Limit

Useful explanation that still does not settle your repo

This page does not show what your repo looks like right now or whether your controls already differ from this framework.
It does not provide a repo-specific owner map, remediation order, or implementation promise for your codebase.
The analysis and offer copy below explain the saved scan, but they do not extend the findings beyond the captured snapshot.

Overall Score: 13/100 saved snapshot (Grade: F)

This score is preserved from the public scan captured on 2026-03-11. It is comparative evidence for CrewAI, not current findings for your repository.

Enforcement Maturity: 10/100 (Grade: F)
Context Hygiene: 10/100 (Grade: F)
Automation Readiness: 22/100 (Grade: D)

Portfolio average: 29/100
CrewAI: 13/100

Need Current Repo Findings?

Use the free scan when you need current findings on your own repository instead of this saved framework example.


Key Findings

No Hook Enforcement [CRITICAL]

Zero pre-commit hooks or Claude Code hooks. AI agents can modify any file in the framework without structural gatekeeping. Security-critical agent orchestration logic and tool-use pathways have no modification guards.

No Test Coverage at Root Level [CRITICAL]

Zero test files detected at root level. No unified test command validates the entire framework. Contributors have no clear testing contract for a framework handling autonomous AI decision-making.

56 Potential Hardcoded Secrets [CRITICAL]

The highest count in our audit portfolio. No automated secret scanning in CI. API keys, tokens, or credentials may be embedded in source files with no convention for test-only credentials.
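As a rough illustration of what "automated secret scanning" with a "convention for test-only credentials" could look like, here is a minimal sketch. The two patterns, the `test-only-` prefix, and the function name `find_potential_secrets` are assumptions made for this example; they are not the rules used in the saved scan, and a production scanner would use a much larger rule set.

```python
import re

# Illustrative patterns only; real scanners ship far larger rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_assignment": re.compile(
        r"(?i)\b(api[_-]?key|secret|token|password)\b\s*[:=]\s*['\"]([^'\"]{12,})['\"]"
    ),
}

# Hypothetical convention: values carrying this prefix are test-only and skipped.
TEST_ONLY_PREFIX = "test-only-"


def find_potential_secrets(text):
    """Return (line_number, rule_name) pairs for lines that look like secrets."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            match = pattern.search(line)
            if match is None:
                continue
            if TEST_ONLY_PREFIX in match.group(0):
                continue  # explicitly marked as a test credential
            hits.append((lineno, name))
    return hits
```

Run in CI over changed files, a non-empty result would fail the build. The prefix convention gives contributors a sanctioned way to keep fixture credentials without hiding real ones.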

Why CrewAI's Governance Score Matters

CrewAI lets enterprises build teams of autonomous AI agents. With 25,000+ GitHub stars, it has rapidly become a default choice for organizations deploying multi-agent architectures in production. Yet a framework designed to orchestrate AI agents scores F (13/100) on the measures needed to govern those same agents.

Governance gaps in agent orchestration infrastructure are especially dangerous because they cascade. Every system built on CrewAI inherits its enforcement posture -- or lack thereof. When the orchestration layer has no structural guardrails, the agents it manages have no foundation to build guardrails upon.

Enforcement Ladder Analysis

CrewAI's enforcement distribution reveals a critical pattern: the only structural enforcement comes from 11 GitHub Actions workflows at L3. No L5 hooks prevent dangerous commits. No L4 tests gate critical paths. No L2 prose (CLAUDE.md) guides AI contributors.

For a framework whose purpose is orchestrating autonomous AI agents, this absence of self-governance creates a compounding risk. The agents CrewAI orchestrates may have more structural guardrails than CrewAI's own development process.

What This Means for Teams Using CrewAI

If your organization deploys CrewAI-orchestrated agents in production, you are inheriting a governance posture that scores 13/100. This does not mean CrewAI is unsafe to use -- it means your team must build the governance layer that CrewAI does not provide. Key areas to address:

Add pre-commit hooks that validate agent configuration changes before they reach your main branch
Create a CLAUDE.md for your project that documents how AI agents should interact with CrewAI's API
Implement secret scanning in your CI pipeline, since CrewAI's own patterns may normalize embedding credentials in code
Build integration tests that verify agent behavior boundaries, not just functional correctness
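The first item above, validating agent configuration changes before they reach the main branch, could be sketched as a small checker that a pre-commit hook calls. The required keys, the tool allowlist, and the dictionary shape are all hypothetical policy choices for this example; they are not CrewAI's configuration schema, so adapt them to however your project declares agents.

```python
# Hypothetical policy: tool names an agent may be granted in this project.
ALLOWED_TOOLS = {"search", "read_file"}
# Hypothetical minimum fields every agent definition must declare.
REQUIRED_KEYS = ("role", "goal", "tools")


def validate_agent_config(name, config):
    """Return a list of problems; an empty list means the config passes."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in config:
            problems.append(f"{name}: missing required key '{key}'")
    for tool in config.get("tools", []):
        if tool not in ALLOWED_TOOLS:
            problems.append(f"{name}: tool '{tool}' is not on the allowlist")
    return problems


def validate_all(agents):
    """Validate a mapping of agent name -> config and collect every problem."""
    problems = []
    for name, config in sorted(agents.items()):
        problems.extend(validate_agent_config(name, config))
    return problems
```

A hook would load the changed agent definitions into a dictionary, call `validate_all`, and reject the commit if the returned list is non-empty. The same function can back an integration test that pins the allowed behavior boundary.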

EU AI Act Compliance Impact

CrewAI is not itself a high-risk AI system, but it is the infrastructure on which autonomous AI agent teams are built. Organizations deploying CrewAI-orchestrated agents in regulated contexts inherit CrewAI's governance gaps directly. With an estimated 12% EU AI Act readiness -- the lowest in our portfolio -- teams building on CrewAI face significant compliance work before the August 2, 2026 enforcement date.

The most critical gaps are in Article 9 (Risk Management System) and Article 15 (Accuracy, Robustness and Cybersecurity), where readiness scores range from 5% to 10%. For organizations subject to the EU AI Act, these gaps require immediate remediation in your own deployment layer.

Recommendations

Immediate (Week 1): Create CLAUDE.md with agent architecture overview, core module boundaries, and critical enforcement rules (1 hour). Add secret scanning to CI pipeline and audit all 56 potential secrets (2 hours). Add 3 pre-commit hooks for agent orchestration module guards (2 hours).
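One possible shape for an "agent orchestration module guard" is a hook that blocks commits touching protected paths unless the change was explicitly approved. This is a minimal sketch: the glob patterns, the `approved` mechanism, and the function names are assumptions for illustration, not paths or conventions taken from the CrewAI repository.

```python
import fnmatch

# Hypothetical globs for security-critical paths; adjust to your repository.
PROTECTED_GLOBS = ("src/orchestration/*", "src/tools/*", ".github/workflows/*")


def blocked_paths(staged_files, approved=()):
    """Return staged files that touch protected paths without approval."""
    approved = set(approved)
    return [
        path
        for path in staged_files
        if path not in approved
        and any(fnmatch.fnmatchcase(path, glob) for glob in PROTECTED_GLOBS)
    ]


def main(staged_files, approved=()):
    """Return a process exit code: 0 allows the commit, 1 blocks it."""
    blocked = blocked_paths(staged_files, approved)
    for path in blocked:
        print(f"blocked: {path} is protected; request review before committing")
    return 1 if blocked else 0
```

When wired into a hook runner that passes staged filenames as arguments, the entry point would be roughly `sys.exit(main(sys.argv[1:]))`. How approvals are recorded (a reviewed allowlist file, a signed label, or similar) is left open here.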

Short-term (Month 1): Deploy L5 enforcement hooks for security-critical agent orchestration paths. Create unified test orchestration with root-level runner across all packages. Implement TODO triage to separate documentation artifacts from genuine debt.

Strategic (Quarter): Build enforcement ladder documentation mapping to EU AI Act requirements. Establish violation tracking across contributor AI tool usage. Implement autoresearch optimization to auto-tune enforcement rules based on violation patterns.

Saved Public Scan Data

These counts are preserved from the public framework scan on 2026-03-11. They are useful comparative evidence, not a current read on your repository.

Test Files: (count not preserved in this capture)
Source Files: 747
GitHub Actions: 11
Potential Secrets: 56
TODO/FIXME: 13,838
Dead Code Markers: 152
CLAUDE.md Files: 0
L5 Hooks: 0

EU AI Act Readiness

12%

Estimated saved-snapshot readiness based on enforcement posture, documentation, and automated quality controls in the assessed public repo. EU AI Act enforcement begins August 2, 2026.

Next Step Path

Use the framework page to choose the right next move

These framework pages are saved comparison context. The free scan is the first current-state check for your repo. When the signal is real, the baseline sprint is the first paid move, and its request page reviews fit before delivery starts. Monitoring uses that same review path only after baseline work exists. This page is comparative context, not current repo findings.

Current Page State

Saved framework snapshot only

This page preserves comparison context from 2026-03-11. It does not settle what your repo looks like today or whether a paid engagement fits yet.

Right Next Move

Run the free scan on your repo

That gives the first current-state signal. Move to the baseline sprint only after a repo-level signal confirms a real gap, and reserve monitoring until baseline work exists.

Plain Next-Step Path

From this saved framework page, the next step is the free scan on your own repo. Request the baseline sprint only if that repo-level signal confirms a real gap, and reserve monitoring until baseline work is in place.

1. Free Scan

Start Here

Use the free scan when you need current findings on your own repository instead of this saved framework example.

This page only gives saved framework evidence, so the free scan is the first current-state check for your repo.

Start here when a framework score is useful context but not current enough to act on.

2. Baseline Sprint

After Repo Proof

Use this after your own scan or equivalent repo signal shows a real gap and you need a bounded remediation order. The request page reviews fit before any sprint is booked.

Keep this for after your own scan or equivalent repo signal confirms a real gap that needs a fix order.

This is the first paid move. The request page checks fit so current repo signal can turn into a concrete fix path before delivery starts.

3. Monitor

After Baseline

Keep this for continuity after baseline work exists, not as the first paid move from a saved framework page. The request page reviews fit first.

Monitoring is continuity work only after baseline enforcement exists, not the first move from a saved framework page.

If all you have is comparative framework context, skip this for now and start with the free scan.

If all you have is this saved framework page, start with the free scan. The baseline sprint is the first paid move only after the signal is real, and monitoring only fits after baseline work exists.

This governance assessment was generated by walseth.ai using automated enforcement posture scanning on 2026-03-11. Findings are based on static analysis of the repository structure, configuration files, and code patterns. Scores reflect a point-in-time assessment and may change as the project evolves.