Linter

The cxas_scrapi.utils.linter module is a rule-based lint engine for validating CXAS agent repositories against best practices and structural requirements. It's inspired by tools like Ruff and pylint: rules are first-class objects with IDs and configurable severity, auto-registered via a @rule decorator, and run against your app's local file tree.

The lint engine is also what powers the cxas lint CLI command — so everything here is available to you programmatically if you want to build custom tooling or integrate linting into a CI step.

Key Components

| Component | What it does |
| --- | --- |
| Severity | Enum: ERROR, WARNING, INFO, OFF. Controls whether a rule failure blocks CI. |
| LintResult | Dataclass: holds the file path, rule ID, severity, message, line number, and optional fix suggestion. |
| LintContext | Dataclass: shared context passed to every rule — agent names, tool names, directories, and config options. |
| Rule | Abstract base class for all lint rules. Subclass it and implement check(). |
| @rule(category) | Decorator that auto-registers a Rule subclass into the global registry. |
| run_rules() | The main runner: discovers files, dispatches rules by category, and collects results into a LintReport. |
| LintConfig | Loaded from cxaslint.yaml — controls which rules are active and at what severity. |

Quick Example

from pathlib import Path
from cxas_scrapi.utils.linter import (
    build_registry,
    build_context,
    run_rules,
    LintConfig,
    LintReport,
    Discovery,
)

project_root = Path(".")
config = LintConfig.load(project_root)
discovery = Discovery(
    app_dir=project_root / config.app_dir,
    evals_dir=project_root / config.evals_dir,
)
registry = build_registry()
context = build_context(project_root, config, discovery)

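report = LintReport()
run_rules(registry, config, context, discovery, report)

# Print the summary, then exit with code 1 if there are errors (great for CI).
# print_and_exit() calls print_summary() itself, so a separate
# print_summary() call beforehand would print the results twice.
report.print_and_exit(show_fixes=True)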

Writing a custom rule:

from pathlib import Path

# Severity is needed for default_severity below.
from cxas_scrapi.utils.linter import Rule, LintResult, LintContext, Severity, rule

@rule("instructions")
class NoPlaceholderInstructions(Rule):
    id = "C001"
    name = "No placeholder instructions"
    description = "Instruction files should not contain TODO placeholders."
    default_severity = Severity.WARNING

    def check(self, file_path: Path, content: str, context: LintContext) -> list[LintResult]:
        results = []
        for i, line in enumerate(content.splitlines(), start=1):
            if "TODO" in line:
                results.append(self.make_result(str(file_path), "Found TODO placeholder", line=i))
        return results

Reference

Severity

Bases: Enum

LintResult dataclass

LintResult(file, rule_id, severity, message, line=None, fix_suggestion='')

LintContext dataclass

LintContext(project_root, app_dir, evals_dir, all_agent_names=set(), all_agent_display_names=set(), all_tool_names=set(), all_tool_dirs=dict(), platform_tools=(lambda: {'end_session', 'customize_response'})(), options=dict())

Shared context passed to rules for cross-referencing.

Rule

Bases: ABC

Base class for all lint rules.

Each rule has:

- id: unique identifier (e.g., "I001")
- name: human-readable name
- description: what the rule checks
- default_severity: severity when not overridden by config
- category: set by the @rule decorator
- target: which file type this rule operates on. run_rules uses it to dispatch rules in the structure and schema categories to the right files. Values: "app_config", "instruction", "agent_config"; the run_rules source further down also maps "tool_config", "toolset_config", "guardrail_config", "evaluation_config", and "eval_expectation_config". Rules that don't set target receive the default files for their category.

check abstractmethod

check(file_path, content, context)

Run this rule against a file. Returns list of LintResults.

Source code in src/cxas_scrapi/utils/linter.py
@abstractmethod
def check(
    self, file_path: Path, content: str, context: "LintContext"
) -> list[LintResult]:
    """Run this rule against a file. Returns list of LintResults."""
    ...

LintReport dataclass

LintReport(results=list())

print_and_exit

print_and_exit(json_output=False, show_fixes=False)

Print results and exit with code 1 if errors, 0 otherwise.

Source code in src/cxas_scrapi/utils/linter.py
def print_and_exit(
    self, json_output: bool = False, show_fixes: bool = False
) -> None:
    """Print results and exit with code 1 if errors, 0 otherwise."""
    import sys  # noqa: PLC0415

    if json_output:
        print(self.to_json())
    else:
        print("\n" + "=" * 60)
        print("LINT RESULTS")
        print("=" * 60)
        self.print_summary(show_fixes=show_fixes)

        if self.errors:
            print(f"\nLint FAILED with {len(self.errors)} error(s).")
        else:
            print("\nLint PASSED (no errors).")

    sys.exit(1 if self.errors else 0)
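
Because print_and_exit() ends with sys.exit(), it raises SystemExit rather than returning. A wrapper that needs to do cleanup, or a test that wants to inspect the exit code, can catch that exception. The sketch below is a minimal illustration using a hypothetical stand-in class (FakeReport) that mimics only the documented exit rule; it does not import the real LintReport.

```python
import sys


class FakeReport:
    """Hypothetical stand-in for LintReport; mimics only the exit rule."""

    def __init__(self, errors):
        self.errors = errors

    def print_and_exit(self):
        # Same rule as the documented method: 1 if errors, 0 otherwise.
        sys.exit(1 if self.errors else 0)


def run_and_capture_exit_code(report):
    """Call print_and_exit() and return the exit code instead of exiting."""
    try:
        report.print_and_exit()
    except SystemExit as exc:
        return exc.code
    return 0  # not reached in practice: print_and_exit() always exits


failing = run_and_capture_exit_code(FakeReport(errors=["C001: Found TODO placeholder"]))
passing = run_and_capture_exit_code(FakeReport(errors=[]))
print(failing, passing)
```

Only results at ERROR severity count here, since the exit code depends on self.errors alone.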

LintConfig dataclass

LintConfig(app_dir='.', evals_dir='evals/', rules=dict(), options=dict(), ignore=list(), per_file=dict())

Linter configuration loaded from cxaslint.yaml.

get_severity

get_severity(rule_obj, file_path='')

Get the effective severity for a rule.

Considers per-file overrides.

Source code in src/cxas_scrapi/utils/linter.py
def get_severity(self, rule_obj: Rule, file_path: str = "") -> Severity:
    """Get the effective severity for a rule.

    Considers per-file overrides.
    """
    for pattern, overrides in self.per_file.items():
        if fnmatch.fnmatch(file_path, pattern):
            if rule_obj.id in overrides:
                return Severity.from_str(overrides[rule_obj.id])

    if rule_obj.id in self.rules:
        return self.rules[rule_obj.id]

    return rule_obj.default_severity
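
The lookup order above is: per-file override first, then the global rules mapping, then the rule's own default. The following standalone sketch reproduces that order so it can be tried without the library. The Severity stand-in, its string values, the rule IDs, and the glob pattern are all assumptions for illustration; the real code converts strings with Severity.from_str.

```python
import fnmatch
from enum import Enum


class Severity(Enum):
    # Stand-in mirroring the four documented levels; the values are assumed.
    ERROR = "error"
    WARNING = "warning"
    INFO = "info"
    OFF = "off"


def effective_severity(rule_id, default, file_path, rules, per_file):
    """Resolve severity: per-file override, then global override, then default."""
    for pattern, overrides in per_file.items():
        if fnmatch.fnmatch(file_path, pattern) and rule_id in overrides:
            return Severity(overrides[rule_id])
    if rule_id in rules:
        return rules[rule_id]
    return default


# Hypothetical configuration: C001 is an error everywhere except legacy agents.
rules = {"C001": Severity.ERROR}
per_file = {"agents/legacy_*/*": {"C001": "off"}}

legacy = effective_severity(
    "C001", Severity.WARNING, "agents/legacy_bot/instruction.txt", rules, per_file
)
current = effective_severity(
    "C001", Severity.WARNING, "agents/support/instruction.txt", rules, per_file
)
untouched = effective_severity(
    "C002", Severity.INFO, "agents/support/instruction.txt", rules, per_file
)
print(legacy, current, untouched)
```

A resolved severity of OFF is what run_rules treats as "skip this rule for this file".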

is_ignored

is_ignored(file_path)

Check if a file matches any ignore pattern.

Source code in src/cxas_scrapi/utils/linter.py
def is_ignored(self, file_path: str) -> bool:
    """Check if a file matches any ignore pattern."""
    return any(fnmatch.fnmatch(file_path, p) for p in self.ignore)
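
Since matching goes through fnmatch rather than a path-aware glob, a `*` in an ignore pattern also matches across directory separators. The short sketch below shows the practical effect with the same one-line check; the patterns and file paths are made-up examples, and paths are written relative to the project root, which is how run_rules passes them.

```python
import fnmatch


def is_ignored(file_path, patterns):
    """Same check as LintConfig.is_ignored, taking the patterns explicitly."""
    return any(fnmatch.fnmatch(file_path, p) for p in patterns)


# Hypothetical ignore list.
ignore = ["*.bak", "evals/drafts/*"]

# "*" spans "/" in fnmatch, so "*.bak" matches files at any depth.
nested_backup = is_ignored("agents/support/instruction.txt.bak", ignore)
# Likewise "evals/drafts/*" covers nested subdirectories.
deep_draft = is_ignored("evals/drafts/2024/smoke.yaml", ignore)
# Files outside the ignored patterns are still linted.
kept = is_ignored("evals/smoke.yaml", ignore)
print(nested_backup, deep_draft, kept)
```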

get_options

get_options(rule_id)

Get rule-specific options.

Source code in src/cxas_scrapi/utils/linter.py
def get_options(self, rule_id: str) -> dict:
    """Get rule-specific options."""
    return self.options.get(rule_id, {})

Discovery

Discovery(app_dir, evals_dir)

Discovers agents, tools, callbacks, evals, and configs.

Scans an app directory for all resources.

Source code in src/cxas_scrapi/utils/linter.py
def __init__(self, app_dir: Path, evals_dir: Path):
    self.app_dir = app_dir
    self.evals_dir = evals_dir
    self.app_root = self._find_app_root()

discover_global_instruction

discover_global_instruction()

Return path to global_instruction.txt if it exists.

Source code in src/cxas_scrapi/utils/linter.py
def discover_global_instruction(self) -> Optional[Path]:
    """Return path to ``global_instruction.txt`` if it exists."""
    if not self.app_root:
        return None
    p = self.app_root / "global_instruction.txt"
    return p if p.exists() else None

discover_agents

discover_agents()

Return {dir_name: instruction_or_config_path} for all agents.

Source code in src/cxas_scrapi/utils/linter.py
def discover_agents(self) -> dict[str, Path]:
    """Return ``{dir_name: instruction_or_config_path}`` for all agents."""
    if not self.app_root:
        return {}
    agents_dir = self.app_root / "agents"
    if not agents_dir.exists():
        return {}
    result = {}
    for d in sorted(agents_dir.iterdir()):
        if d.is_dir():
            inst = d / "instruction.txt"
            if inst.exists():
                result[d.name] = inst
            else:
                json_file = d / f"{d.name}.json"
                if json_file.exists():
                    result[d.name] = json_file
    return result
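
To make the lookup order concrete: an agent directory is reported with its instruction.txt when one exists, falls back to `<dir_name>.json` otherwise, and is omitted if it has neither. The sketch below re-implements that logic as a standalone function (find_agent_files is a hypothetical name, not part of the library) and runs it against a temporary directory with invented agent names.

```python
import tempfile
from pathlib import Path


def find_agent_files(app_root: Path) -> dict[str, Path]:
    """Standalone copy of the lookup order: instruction.txt, then <dir>.json."""
    result = {}
    agents_dir = app_root / "agents"
    if not agents_dir.exists():
        return result
    for d in sorted(agents_dir.iterdir()):
        if not d.is_dir():
            continue
        for candidate in (d / "instruction.txt", d / f"{d.name}.json"):
            if candidate.exists():
                result[d.name] = candidate
                break
    return result


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical agents: both files, JSON config only, and an empty directory.
    (root / "agents" / "billing").mkdir(parents=True)
    (root / "agents" / "billing" / "instruction.txt").write_text("Help with invoices.")
    (root / "agents" / "billing" / "billing.json").write_text("{}")
    (root / "agents" / "router").mkdir()
    (root / "agents" / "router" / "router.json").write_text("{}")
    (root / "agents" / "empty").mkdir()

    found = {name: path.name for name, path in find_agent_files(root).items()}

print(found)
```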

discover_tools

discover_tools()

Return {tool_name: code_path} for all tools.

Source code in src/cxas_scrapi/utils/linter.py
def discover_tools(self) -> dict[str, Path]:
    """Return ``{tool_name: code_path}`` for all tools."""
    if not self.app_root:
        return {}
    tools_dir = self.app_root / "tools"
    if not tools_dir.exists():
        return {}
    result = {}
    for d in sorted(tools_dir.iterdir()):
        if d.is_dir():
            code = d / "python_function" / "python_code.py"
            if code.exists():
                result[d.name] = code
            else:
                json_files = list(d.glob("*.json"))
                if json_files:
                    result[d.name] = json_files[0]
    return result

discover_callbacks

discover_callbacks()

Return [(agent_name, cb_type, cb_name, code_path), ...].

Source code in src/cxas_scrapi/utils/linter.py
def discover_callbacks(self) -> list[tuple[str, str, str, Path]]:
    """Return ``[(agent_name, cb_type, cb_name, code_path), ...]``."""
    if not self.app_root:
        return []
    agents_dir = self.app_root / "agents"
    if not agents_dir.exists():
        return []
    result = []
    cb_types = [
        "before_model_callbacks",
        "after_model_callbacks",
        "before_agent_callbacks",
        "after_agent_callbacks",
        "before_tool_callbacks",
        "after_tool_callbacks",
    ]
    for agent_dir in sorted(agents_dir.iterdir()):
        if not agent_dir.is_dir():
            continue
        for cb_type in cb_types:
            cb_dir = agent_dir / cb_type
            if not cb_dir.exists():
                continue
            for cb in sorted(cb_dir.iterdir()):
                code = cb / "python_code.py"
                if code.exists():
                    result.append((agent_dir.name, cb_type, cb.name, code))
    return result

discover_evals

discover_evals()

Return {filename: path} for all eval YAMLs.

Source code in src/cxas_scrapi/utils/linter.py
def discover_evals(self) -> dict[str, Path]:
    """Return ``{filename: path}`` for all eval YAMLs."""
    result = {}
    if not self.evals_dir.exists():
        return result
    for yaml_path in sorted(self.evals_dir.rglob("*.yaml")):
        rel = str(yaml_path.relative_to(self.evals_dir))
        result[rel] = yaml_path
    return result
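
Two details of this method are easy to miss: the search is recursive, and it only matches the `.yaml` extension, so `.yml` files are not picked up. The sketch below demonstrates both with pathlib alone, using a temporary directory and invented file names. It normalises keys with as_posix() so the output is the same on every platform, whereas the method above uses str(), which yields OS-specific separators.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    evals_dir = Path(tmp) / "evals"
    (evals_dir / "regression").mkdir(parents=True)
    # Hypothetical eval files.
    (evals_dir / "smoke.yaml").write_text("name: smoke\n")
    (evals_dir / "regression" / "refunds.yaml").write_text("name: refunds\n")
    (evals_dir / "notes.yml").write_text("ignored: true\n")

    # Same pattern as discover_evals: recursive, "*.yaml" only.
    keys = sorted(
        p.relative_to(evals_dir).as_posix() for p in evals_dir.rglob("*.yaml")
    )

print(keys)
```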

discover_app_config

discover_app_config()

Return path to app.json or app.yaml.

Source code in src/cxas_scrapi/utils/linter.py
def discover_app_config(self) -> Optional[Path]:
    """Return path to ``app.json`` or ``app.yaml``."""
    if not self.app_root:
        return None
    for name in ("app.json", "app.yaml"):
        p = self.app_root / name
        if p.exists():
            return p
    return None

discover_agent_configs

discover_agent_configs()

Return {agent_name: json_path} for all agent configs.

Source code in src/cxas_scrapi/utils/linter.py
def discover_agent_configs(self) -> dict[str, Path]:
    """Return ``{agent_name: json_path}`` for all agent configs."""
    if not self.app_root:
        return {}
    agents_dir = self.app_root / "agents"
    if not agents_dir.exists():
        return {}
    result = {}
    for d in sorted(agents_dir.iterdir()):
        if d.is_dir():
            json_file = d / f"{d.name}.json"
            if json_file.exists():
                result[d.name] = json_file
    return result

dir_name_to_display

dir_name_to_display(dir_name)

Convert directory name to display name.

Source code in src/cxas_scrapi/utils/linter.py
def dir_name_to_display(self, dir_name: str) -> str:
    """Convert directory name to display name."""
    return dir_name.replace("_", " ")

build_registry

build_registry()

Build the rule registry by importing all rule modules.

Importing cxas_scrapi.utils.lint_rules triggers the @rule decorator on every rule class, populating _RULE_REGISTRY. We then copy those into a RuleRegistry for config-aware lookups.

Source code in src/cxas_scrapi/utils/linter.py
def build_registry() -> RuleRegistry:
    """Build the rule registry by importing all rule modules.

    Importing ``cxas_scrapi.utils.lint_rules`` triggers the ``@rule``
    decorator on every rule class, populating ``_RULE_REGISTRY``.  We
    then copy those into a ``RuleRegistry`` for config-aware lookups.
    """
    import cxas_scrapi.utils.lint_rules  # noqa: F401,PLC0415

    registry = RuleRegistry()
    for _category, rules in get_registered_rules().items():
        registry.register_all(rules)
    return registry

build_context

build_context(project_root, config, discovery)

Build the shared lint context from discovered app resources.

Source code in src/cxas_scrapi/utils/linter.py
def build_context(
    project_root: Path,
    config: LintConfig,
    discovery: Discovery,
) -> LintContext:
    """Build the shared lint context from discovered app resources."""
    agents = discovery.discover_agents()
    tools = discovery.discover_tools()

    return LintContext(
        project_root=project_root,
        app_dir=discovery.app_dir,
        evals_dir=project_root / config.evals_dir,
        all_agent_names=set(agents.keys()),
        all_agent_display_names={
            discovery.dir_name_to_display(name) for name in agents
        },
        all_tool_names=set(tools.keys()),
        all_tool_dirs={name: path.parent for name, path in tools.items()},
        options=config.options,
    )
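
One consequence of building all_tool_dirs from path.parent is worth keeping in mind when writing rules. For a Python tool, discover_tools() returns the path to python_function/python_code.py, so the recorded directory is the python_function subdirectory rather than the tool's top-level directory. For a JSON-only tool it is the tool directory itself. A small pathlib sketch, with hypothetical tool names and paths:

```python
from pathlib import Path

# Shapes of the two kinds of path that discover_tools() can return.
tools = {
    "lookup_order": Path("app/tools/lookup_order/python_function/python_code.py"),
    "crm_api": Path("app/tools/crm_api/crm_api.json"),
}

# Same expression as in build_context.
all_tool_dirs = {name: path.parent for name, path in tools.items()}

tool_dir_names = {name: d.name for name, d in all_tool_dirs.items()}
print(tool_dir_names)
```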

run_rules

run_rules(registry, config, context, discovery, report, categories=None, specific_rules=None)

Run lint rules against discovered files.

Source code in src/cxas_scrapi/utils/linter.py
def run_rules(  # noqa: C901
    registry: RuleRegistry,
    config: LintConfig,
    context: LintContext,
    discovery: Discovery,
    report: LintReport,
    categories: Optional[list[str]] = None,
    specific_rules: Optional[set[str]] = None,
):
    """Run lint rules against discovered files."""

    def should_run(rule_obj):
        if specific_rules and rule_obj.id not in specific_rules:
            return False
        if categories and rule_obj.category not in categories:
            return False
        return True

    def _get_severity(rule_obj, file_rel):
        sev = config.get_severity(rule_obj, file_rel)
        return sev if sev != Severity.OFF else None

    def _lint_files(rules: list[Rule], files: dict[str, Path]):
        """Apply rules to a set of discovered files or directories."""
        for _name, file_path in files.items():
            rel = str(file_path.relative_to(context.project_root))
            if config.is_ignored(rel):
                continue
            content = file_path.read_text() if file_path.is_file() else ""
            for rule_obj in rules:
                sev = _get_severity(rule_obj, rel)
                if sev is None:
                    continue
                for result in rule_obj.check(file_path, content, context):
                    result.severity = sev
                    report.add(result)

    def _get_rules(category: str) -> list[Rule]:
        if categories and category not in categories:
            return []
        return [
            r for r in registry.rules_for_category(category) if should_run(r)
        ]

    # Instructions — instruction.txt files + global_instruction.txt
    instruction_files = {
        k: v
        for k, v in discovery.discover_agents().items()
        if v.name == "instruction.txt"
    }
    global_inst = discovery.discover_global_instruction()
    if global_inst:
        instruction_files["global_instruction"] = global_inst
    _lint_files(_get_rules("instructions"), instruction_files)

    # Callbacks
    cb_files = {
        f"{agent}_{cb_type}_{cb_name}": code_path
        for agent, cb_type, cb_name, code_path in discovery.discover_callbacks()
    }
    _lint_files(_get_rules("callbacks"), cb_files)

    # Tools
    _lint_files(_get_rules("tools"), discovery.discover_tools())

    # Evals
    _lint_files(_get_rules("evals"), discovery.discover_evals())

    # Config — app config + agent configs
    config_rules = _get_rules("config")
    if config_rules:
        app_cfg = discovery.discover_app_config()
        config_files = {}
        if app_cfg:
            config_files["app"] = app_cfg
        config_files.update(discovery.discover_agent_configs())
        _lint_files(config_rules, config_files)

    # Target-dispatched rules (structure + schema categories)
    # Rules with a ``target`` property get matched to discovered files.
    target_dispatched = _get_rules("structure") + _get_rules("schema")
    if target_dispatched:
        target_files = {
            "app_config": {},
            "instruction": instruction_files,
            "agent_config": discovery.discover_agent_configs(),
            "tool_config": discovery._discover_resource_dirs("tools"),
            "toolset_config": discovery.discover_toolsets(),
            "guardrail_config": discovery.discover_guardrails(),
            "evaluation_config": discovery.discover_evaluations(),
            "eval_expectation_config": (
                discovery.discover_evaluation_expectations()
            ),
        }
        app_cfg = discovery.discover_app_config()
        if app_cfg:
            target_files["app_config"] = {"app": app_cfg}

        by_target: dict[str, list[Rule]] = {}
        for r in target_dispatched:
            by_target.setdefault(r.target, []).append(r)

        for target_name, rules in by_target.items():
            files = target_files.get(target_name, {})
            if files:
                _lint_files(rules, files)
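
Two pieces of the runner are easier to follow in isolation: the should_run filter, where categories and specific_rules must both allow a rule when both are given, and the grouping of structure and schema rules by target. The sketch below copies that logic into standalone functions. FakeRule and the rule IDs are hypothetical stand-ins that carry only the attributes the runner inspects.

```python
from dataclasses import dataclass


@dataclass
class FakeRule:
    # Hypothetical stand-in for Rule with just id, category, and target.
    id: str
    category: str
    target: str = ""


def should_run(rule_obj, categories=None, specific_rules=None):
    """Same filter as in run_rules: every filter that is set must pass."""
    if specific_rules and rule_obj.id not in specific_rules:
        return False
    if categories and rule_obj.category not in categories:
        return False
    return True


def group_by_target(rules):
    """Bucket rules by their target, preserving registration order."""
    by_target = {}
    for r in rules:
        by_target.setdefault(r.target, []).append(r)
    return by_target


rules = [
    FakeRule("S001", "structure", "app_config"),
    FakeRule("S002", "structure", "agent_config"),
    FakeRule("V001", "schema", "agent_config"),
    FakeRule("C001", "instructions"),
]

selected = [
    r.id
    for r in rules
    if should_run(
        r,
        categories=["structure", "schema"],
        specific_rules={"S002", "V001", "C001"},
    )
]

dispatched = [r for r in rules if r.category in ("structure", "schema")]
grouped = {
    target: [r.id for r in bucket]
    for target, bucket in group_by_target(dispatched).items()
}
print(selected, grouped)
```

C001 is dropped from selected even though it is named in specific_rules, because its category is not in the requested categories.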

rule

rule(category)

Class decorator that auto-registers a Rule into its category.

Duplicate rule IDs are silently ignored so that repeated imports (or test-time @rule usage) never produce duplicates.

Usage:

@rule("config")
class InvalidJson(Rule):
    id = "A001"
    ...

Source code in src/cxas_scrapi/utils/linter.py
def rule(category: str):
    """Class decorator that auto-registers a Rule into its category.

    Duplicate rule IDs are silently ignored so that repeated imports
    (or test-time ``@rule`` usage) never produce duplicates.

    Usage::

        @rule("config")
        class InvalidJson(Rule):
            id = "A001"
            ...
    """

    def decorator(cls):
        cls.category = category
        instance = cls()
        if instance.id not in _REGISTERED_IDS:
            _REGISTERED_IDS.add(instance.id)
            _RULE_REGISTRY[category].append(instance)
        return cls

    return decorator