[RFC] Surviving the AI PR Flood: The Macro X-Ray & Asymmetric TDD

Context: The era of “VibeCoding” (high-velocity, LLM-assisted development) is here. People are using AI to generate code, and Crystal maintainers are inevitably going to face an influx of AI-generated PRs.

Initially, a few of us tried to design a complex “VibeCoding Protocol” using AST-hashing, mutation testing, and community staging queues to automatically filter this spam. We ran that architecture through an adversarial red-team of advanced LLMs (Claude, ChatGPT, Grok, Kimi) and cybersecurity engineers.

They tore it to shreds. We learned a hard lesson: You cannot automate trust, and you cannot scale contribution throughput when your actual bottleneck is maintainer attention.

Instead of trying to build complex mechanical gates to block AI code, this RFC proposes a lightweight, Crystal-specific methodology to make reviewing AI-assisted code 10x safer and faster for core maintainers, while establishing a strict progression ladder for AI-assisted contributors (VibeEngineers).

The Threat Model We Must Defend Against

  1. The Macro Bomb: In Crystal, an AI can submit a “harmless doc fix” that shifts line numbers (__LINE__), triggering a dormant, malicious macro elsewhere in the codebase.
  2. Homoglyph & Payload Smuggling: AI hiding malicious URLs or payloads inside seemingly benign YARD documentation blocks.
  3. Tautological Testing: An AI generating plausible-looking but flawed logic, and writing tests specifically designed to pass its own broken code.

The Proposal: Automate Transparency, Not Trust

We propose implementing three core defenses for the Crystal ecosystem: one technical, one procedural, and one cultural.

1. The Technical Defense: The Macro X-Ray (GitHub Action)

Because Crystal relies heavily on powerful macros, reviewing the standard Git diff of an AI-generated PR is dangerous. The source code does not always reflect the execution reality.

We propose a standard GitHub Action that runs crystal tool expand on modified files.

  • How it works: Whenever a PR is opened, the Action evaluates the macros and generates a “Post-Expansion Diff”.
  • The UX: To prevent wall-of-text PR spam, the Action does not dump the raw output. It runs crystal tool expand on both the base and PR branches, and posts only the semantic diff, folded inside a collapsible <details> Markdown block.
  • The Benefit: If an AI tries to smuggle a backdoor via a macro injection, or shift a line number to trigger a payload, the Macro X-Ray strips away the obfuscation. The maintainer sees exactly what the compiler sees, completely neutralizing “smart” AI attacks without requiring expensive infrastructure.
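The folding step the Action performs can be sketched as follows. This is a minimal illustration assuming the `crystal tool expand` output for each branch has already been captured as text; the function name `post_expansion_diff` is hypothetical, not an existing tool:

```python
import difflib

def post_expansion_diff(base_expanded: str, pr_expanded: str, path: str) -> str:
    """Build a collapsible Markdown block containing the semantic diff of
    `crystal tool expand` output for one file. Returns "" if the macro
    expansion is identical on both branches (nothing to report)."""
    diff = list(difflib.unified_diff(
        base_expanded.splitlines(keepends=True),
        pr_expanded.splitlines(keepends=True),
        fromfile=f"base/{path}",
        tofile=f"pr/{path}",
    ))
    if not diff:
        return ""
    fence = "`" * 3  # build the Markdown fence without embedding a literal one
    body = "".join(diff)
    return (f"<details><summary>Macro expansion changed: {path}</summary>\n\n"
            f"{fence}diff\n{body}{fence}\n</details>")
```

A doc-only change that shifts `__LINE__` and thereby alters what a macro emits would show up here as a non-empty diff, even though the source diff looks harmless.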

2. The Procedural Defense: Asymmetric TDD

We must formally reject the practice of an AI “grading its own homework.” If a PR contains both complex logic changes and the tests that validate them—and both were generated by an LLM—it is impossible to trust the coverage.

We propose updating CONTRIBUTING.md with the Asymmetric TDD Standard:

  • Human Intent: The human engineer MUST author the failing test suite to define the invariant.
  • AI Implementation: The AI can generate the code to make those tests pass.
  • The Refactoring Exemption: If a PR is labeled kind:refactor, new failing tests are not required. Instead, Commit A must contain a characterization test (or proof of existing 100% coverage for the modified methods) before the AI’s structural changes in Commit B.
  • Cryptographic Intent (Levels 2 & 3 Only): To prove Commit A was actually written by the human and not just hallucinated by the agent pretending to segregate commits, Commit A MUST be cryptographically signed (GPG/SSH) and display the GitHub “Verified” badge. (Note: Level 1 contributors are exempt from this requirement to reduce initial onboarding friction).
  • Enforcement: Reviewers check the git history. If a massive logic block arrives with bundled tests in a single commit, or if a Level 2+ human’s test commit lacks a cryptographic signature, it is treated as hostile and closed.
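The enforcement check above could be mechanized roughly like this. The commit representation (`spec_only` and `signed` flags) is a hypothetical simplification of what a bot would extract from the Git history, not an existing API:

```python
def check_asymmetric_tdd(commits, contributor_level: int):
    """Validate a PR's commit sequence (oldest first) against the
    Asymmetric TDD rule. Each commit is a dict with hypothetical keys:
      spec_only - True if the commit touches only spec/test files
      signed    - True if the commit has a verified GPG/SSH signature
    Returns (ok, reason)."""
    if not commits:
        return False, "empty PR"
    first = commits[0]
    # Commit A: the human-authored failing test must come first.
    if not first["spec_only"]:
        return False, "first commit must be a human-authored test (Commit A)"
    # Levels 2 and 3 must prove human authorship cryptographically.
    if contributor_level >= 2 and not first["signed"]:
        return False, "Level 2+ test commit must be cryptographically signed"
    return True, "ok"
```

For example, a Level 2 PR whose first commit is spec-only but unsigned would be rejected with the signature reason, while the same sequence with a signed Commit A passes.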

3. The Cultural Defense: The VibeEngineer Trust Ladder

Vague guidelines like “start small” do not work when a user can generate 5,000 lines of code in seconds. We must replace subjective trust with objective git history.

CONTRIBUTING.md should explicitly outline a 4-tier progression ladder for any PR utilizing AI generation.

  • Level 0 (The Janitor): Restricted strictly to YARD documentation, typo fixes, and dead-code removal. Zero logic changes permitted.
  • Level 1 (The Apprentice): Unlocked after 3 merged Level 0 PRs. Restricted to writing tests, increasing spec coverage, and isolated refactors. (Target labels: topic:specs, good first issue, kind:refactor).
  • Level 2 (The Journeyman): Unlocked after 3 merged Level 1 PRs. Permitted to modify standard library logic, resolve bugs, and add minor features. (Target labels: topic:stdlib, kind:bug).
  • Level 3 (Core Architect): Permitted to touch the compiler core, the type-inference engine, and concurrency primitives. Cannot be unlocked via metrics alone; requires sustained Level 2 success AND explicit sponsorship from a Core Maintainer.

The Golden Rules for All Tiers:

  1. Mechanical Enforcement: To guarantee zero human cycles are spent on triage, this ladder is NOT manually enforced. We implement a lightweight GitHub Action that queries the GraphQL API (pullRequests(states: MERGED)) on PR open. If a contributor violates their tier (e.g., opens a Level 2 PR with only 1 merged PR), the bot instantly auto-closes it and posts their telemetry: “You submitted a Level 2 PR, but your verified merged PR count is 1/3. PR closed.”
  2. Machine-Readable Governance: Humans don’t read CONTRIBUTING.md. Machines do. These rules must be encoded into repository-level .github/copilot-instructions.md and .cursorrules files so the AI is pre-prompted with its constraints before the VibeEngineer generates code.
  3. Total Transparency: Every AI-assisted commit MUST include the Co-authored-by: Agent <user+agent@domain.tld> trailer to publicly track provenance.
  4. You Own the Hallucinations: You are the author; the AI is your tool. If a PR contains glaring LLM hallucinations, it will be closed, not debugged for you by the core team.
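Rule 1 could be sketched as follows. The GraphQL query shape (`user → pullRequests(states: MERGED) → totalCount`) matches GitHub's public API, but the 3-PRs-per-tier gate logic is an illustrative reading of the ladder, since a raw merged-PR count cannot distinguish which tier each merged PR belonged to:

```python
# GraphQL query the Action would send to fetch the contributor's
# verified merged PR count (real GitHub API fields).
MERGED_PR_QUERY = """
query($login: String!) {
  user(login: $login) {
    pullRequests(states: MERGED) { totalCount }
  }
}
"""

def tier_gate(requested_level: int, merged_count: int):
    """Return (allowed, message) for a PR opened at `requested_level`.
    Illustrative thresholds: 3 merged PRs unlock the next level;
    Level 3 is never unlocked by metrics alone."""
    if requested_level >= 3:
        return False, "Level 3 requires explicit Core Maintainer sponsorship"
    required = 0 if requested_level == 0 else 3
    if merged_count < required:
        return False, (f"You submitted a Level {requested_level} PR, but your "
                       f"verified merged PR count is {merged_count}/{required}. "
                       f"PR closed.")
    return True, "tier check passed"
```

With these thresholds, a contributor with one merged PR opening a Level 2 PR gets exactly the "1/3, PR closed" telemetry message quoted above.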

Conclusion

We cannot stop the flood of AI-generated PRs, and trying to build an automated staging queue just shifts the burnout from core devs to community volunteers.

By utilizing Crystal’s native tool expand to build a Macro X-Ray, enforcing Asymmetric TDD via cryptographic signatures, and establishing a GraphQL-enforced Trust Ladder, we don’t try to scale the number of PRs we accept. Instead, we scale the speed and safety at which maintainers can audit them.

Thoughts?

Who is “we”? What’s the motivation and scope for this? Why is this talking about YARD documentation and other stuff that doesn’t make sense in a Crystal context?
This post has serious indicators of AI slop while its intention is to work against that very thing :person_shrugging:

renich’s proposal felt like a thought experiment to me: what would happen if we applied policies designed for large-scale communities facing floods of AI-generated contributions to Crystal? There’s probably specific knowledge or experience from other communities that led you to write this proposal. I don’t have that context, but that story might be more interesting than the proposal itself.
Applying this policy to Crystal now feels like placing a full metropolitan police force in a small village.
And after reading this, I configured git to sign my commits starting today.

This is a good idea anyway :heart_hands:

Whoops! Fair callout, @straight-shoota. I completely deserve that.

The “we” is me and a multi-LLM setup. I spent the weekend running an adversarial “red team” exercise with a few different models to see if there was a mechanical way to filter out malicious AI-generated PRs before they hit the core team. I posted this late yesterday because I was eager to, and didn’t really read through the condensed output.

I leaned way too hard on the AI to format my thoughts into that “RFC,” and it output exactly the kind of corporate slop you called out.

But your second point, catching the “YARD” reference, is exactly why I started this thread. The AI hallucinated a Ruby concept into a Crystal proposal, and I missed it in my review. That is the exact threat model I’m worried about. I am sorry for this, but the point stands.

If people start using Cursor/Copilot to generate massive PRs, the code is going to be littered with subtle Ruby-isms and hallucinated logic.

During the red-team exercise, the AI suggested it could bypass a human reviewer by hiding payloads in seemingly benign areas; like using a Cyrillic ‘о’ (homoglyph) in a markdown docstring URL, or shifting a line number with a comment to trigger a dormant compile-time macro elsewhere.
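A crude filter for the homoglyph trick is to flag every character outside printable ASCII and look up its Unicode name. A minimal sketch (note it will also flag legitimate non-English text, so it is a review aid, not a gate):

```python
import unicodedata

# Printable ASCII plus common whitespace is considered safe here.
ALLOWED = {chr(c) for c in range(0x20, 0x7F)} | {"\n", "\t"}

def suspicious_chars(text: str):
    """Return (index, char, unicode_name) for every character outside the
    allowed set, e.g. a Cyrillic 'о' (U+043E) hiding inside a URL."""
    hits = []
    for i, ch in enumerate(text):
        if ch in ALLOWED:
            continue
        hits.append((i, ch, unicodedata.name(ch, "UNKNOWN")))
    return hits
```

Running it on a URL that looks like `https://crystal-lang.org` but contains the Cyrillic lookalike reports a single hit named `CYRILLIC SMALL LETTER O`, which is exactly the kind of thing a human reviewer will never spot by eye.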

My actual questions for the core team are just this:

  • Are you guys actually concerned about a potential influx of AI-generated PR spam, or is that simply not on your radar right now?

  • If someone does submit a massive AI-assisted PR that heavily modifies macros, is there any native tooling you’d use to audit the actual execution logic, or do you just rely on manual review? (The AI suggested crystal tool expand, but I don’t know if that’s viable for CI diffs).

I’ll drop the AI formatting. Lesson learned.

We are aware of the phenomenon. But so far I don’t think it is much of a problem for the Crystal project.

The best tooling for reviewing a massive PR is to request breaking it down into smaller patches :person_shrugging:
And yes, we do manual reviews of any submitted code and I don’t expect that to change.

To be honest, I don’t really see a big threat of disguising adversarial behaviour with macros, homoglyphs or similar. I don’t think it’s feasible that anyone (AI or human) could successfully hide these kind of things in a patch.

That makes perfect sense. @kojix2’s analogy of the “metropolitan police force in a small village” is spot on.

Breaking massive PRs down… that’s the way to do it. You can’t beat manual review if you actually have the bandwidth to do it. (And I’m thrilled to hear the thread got you to enable git commit signing, @kojix2! :tada:)

I appreciate you guys indulging the thought experiment and answering the questions. Glad to hear the Crystal immune system is holding strong.

Cheers!

It’s clear the OP’s messages are AI-generated. We can even pinpoint which model is being used: Gemini 3 Pro.

I could perhaps see it happening in individual independent shards if the maintainer isn’t attentive, but it seems very far fetched in the language itself.

As for AI usage in general, the only one who submits a noticeable amount of AI-assisted PRs to the language codebase seems to be kojix, and those PRs look absolutely great; he is also very open about using it.

I wouldn’t say AI-generated, because I spent a lot of time working through my ideas with the AI in order to produce the RFC. In fact, I am happy to publish a PoC of the exploits mentioned (an improved version) in a readable format (succinct and clear).

I did use AI a lot. It’s not just a prompt, though. It’s work I’ve done over the whole week (in my spare time) and ended up running the adversarial red team lab on Saturday.

I am hesitant to make this public, though. What would be the best way to do it?

I have a PoC, like I said before. It might prove to be a mediocre sci-fi endeavour but, at least to me, it’s an entertaining read. Besides, the insults between the LLMs were pretty funny.

Yet, like I said, these are PoCs of some of the exploits I mentioned earlier and, supposedly, at least some of them are in working order.

At work we are dealing with a “flood” of PRs from our own engineering teams, which are simply writing more (and a bit sloppier) code.

We have ways to do it, although we assume good will, since it’s all from our fellows.

In the wild, for an open source project… why bother? You can be reactive. If the problem arises, you can postpone dealing with it as long as you want, too!

Say in May Crystal gets a ton of slop PRs. Well, doing nothing is an option. Only bothering to review PRs from people you know, or with some provenance, is an option too.

In those circumstances I think being reactive is just better. You will be able to adopt practices from those hit earlier, and no damage is done. Much less stressful.

That’s a bit exclusive, though. Also, an xz-style attack might still happen. It’s what my PoCs are about, actually: a demo of how this could be done.

It looks like @jkthorne has been experimenting with several contributions to the language:

https://github.com/crystal-lang/crystal/pull/16722
https://github.com/crystal-lang/crystal/pull/16723
https://github.com/crystal-lang/crystal/pull/16724
https://github.com/crystal-lang/crystal/pull/16725
https://github.com/crystal-lang/crystal/pull/16726

Experimenting is a word for it. I just wanted an analysis. The PRs were unnecessary, already opened, or I didn’t understand them. But adding "*" to the settings was a mistake.

I have been experimenting with these two ideas, which are intended to be drop-in replacements for the stdlib modules.

OK, I’ve pushed the project to a private repo at: https://gitlab.com/renich/project-obsidian.

If anyone here wants access, I’ll grant it.

Here’s a high-level explanation of the project, written by an independent AI agent:

Crystal Threat Models: Bridging the Gap Between Code Review and AST Injection

This research project explores a critical security boundary in the Crystal ecosystem: the Macro-X-Ray Paradox. It demonstrates how Crystal’s powerful macro system and build-time execution can be weaponized to bypass traditional source-code audits, enabling sophisticated supply-chain attacks that are invisible at the AST level.

:world_map: Project Evolution: From Proof-of-Concept to APT Simulation

The research is structured across three evolutionary phases, tracing the shift from simple visual deception to persistent, compiler-driven contamination.

Phase 1: The Educational Foundation (master)

The initial phase establishes the “Low-Hanging Fruit” of source-code deception.

  • Homoglyph Smuggling: Utilizing Unicode visual spoofing to mask malicious logic within “innocent” function names.
  • Macro Bombs: Basic demonstrations of compile-time code execution, where the compiler performs unauthorized OS-level operations (e.g., file reads) during a standard crystal build.
  • Tautological Testing: Using AI-generated code patterns to create “False Confidence,” where test suites appear to pass while masking underlying vulnerabilities.

Phase 2: The Weaponized Shard (v2)

Phase 2 pivots to Semantic Stealth, moving the attack vector into the supply chain via transitive dependencies.

  • Dependency Injection: The attack is moved into an external shard (innocent_logger), demonstrating how a trusted library can harbor latent malicious logic.
  • Environmental Fingerprinting: Introduction of behavioral triggers. Malicious payloads remain dormant during development and testing, activating only when the compiler detects a production environment (e.g., flag?(:release) or ENV["RELEASE"] == "1").
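An auditor looking for such behavioral triggers could start by grepping a shard’s source for release-gated constructs. The pattern list below is illustrative, not exhaustive (`flag?(:release)`, `ENV["RELEASE"]`, and the `{{ run(...) }}` macro are real Crystal constructs; the scanner itself is a sketch):

```python
import re

# Constructs worth a closer look in an unfamiliar shard: payloads gated on
# release builds, env-based fingerprinting, and compile-time process execution.
TRIGGER_PATTERNS = [
    r'flag\?\(\s*:release\s*\)',   # compile-time release flag check
    r'ENV\[\s*"RELEASE"\s*\]',     # environment-based fingerprinting
    r'\{\{\s*run\(',               # macro run(): executes a program at build time
]

def find_triggers(source: str):
    """Return (pattern, matched_text) pairs for every suspicious construct."""
    hits = []
    for pat in TRIGGER_PATTERNS:
        for m in re.finditer(pat, source):
            hits.append((pat, m.group(0)))
    return hits
```

This catches only the naive version of the attack; a determined author can obfuscate the trigger, which is why the recommendation below shifts toward expansion tooling and build-time telemetry rather than source grepping.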

Phase 3: The Terminal Attack (v3)

The final phase simulates a 100% verified Advanced Persistent Threat (APT), where the compiler itself becomes the primary agent of exfiltration and persistence.

  • Secret Exfiltration: The compiler scans the build environment for .env files and sensitive credentials, exfiltrating them during the compilation process before the final binary is even linked.
  • Compiler-Injected Persistence: The most critical discovery. The compiler is used to inject malicious Git hooks into the local .git/hooks directory. This ensures that even if the malicious source code is removed, the developer’s environment remains contaminated, and the attack persists through future commits.

:chequered_flag: The Research Verdict: Project Obsidian

The culmination of this work is Project Obsidian, a proposed paradigm shift for the Crystal ecosystem. The research concludes that manual source-code audits are fundamentally insufficient for a language that allows opaque binary execution during build time.

Key Defensive Recommendations:

  1. Macro Transparency: The need for “Macro X-Ray” tooling that exposes exactly what code is being generated and executed by transitive dependencies before a build starts.
  2. Zero-Trust CI Federation: Shifting trust from the developer’s local machine to isolated, ephemeral build environments.
  3. Behavioral Telemetry: Utilizing eBPF-based syscall tracing to detect OS-level anomalies (unauthorized network beacons or file reads) during the compilation phase.

This research serves as a call to action for the Crystal Core Team to consider the compilation process as a primary attack vector.

There is an active discussion on LLVM Discourse about AI-assisted contributions.

The key points are:

  • PR authors should fully understand their code and be able to answer questions during review
  • If AI tools were used, this should be noted in the commit message

The thread mentions several policies from other communities. My personal favorite is DataFusion’s.

  • PR authors should understand the implementation and the ideas behind it, and be able to justify their choices during review
  • It’s okay not to fully understand every part of AI-generated code, but those parts should be pointed out to reviewers
  • If that’s too difficult, opening an issue with a reproducible example is recommended instead of submitting a PR

llama.cpp has a stricter policy.

What these policies have in common is the expectation that contributors understand what they submit and don’t leave it to reviewers to figure out code they don’t understand themselves.

I’m sharing this to give an overview of how some open source communities are handling AI-assisted contributions — not to propose a formal policy for Crystal. Crystal is a small community and AI-generated PR spam doesn’t seem to be a problem right now. AI tools are changing quickly, so I think it’s better to wait and see how things develop before deciding our policy.

(Translated from Japanese with Claude)

I would consider this as a nonnegotiable baseline. The person who submits the code should understand how it works.

If all they do is intermediate between an LLM and reviewers, there’s no need for them. Claude etc. could just make the contribution themselves, and then the origin would at least be more obvious.

“Understanding” of course is a fuzzy value. And it depends on familiarity with the codebase – especially complex ones – to even realize whether you actually understand the change or just think you do.

I have been checking files in the Crystal repository using Copilot Chat, and it finds a fair number of bugs, most of which appear to be real. However, since I cannot fully understand them on my own, there are quite a few cases where I have been unable to create pull requests.

Bugs are frequently reported around binary boundaries in particular. I suspect many remain around C ABI and compression as well, likely because few people are capable of reviewing these areas properly. In particular, when I have it examine files under compiler/crystal/codegen by comparing them against equivalent Rust code, it points out numerous issues.

Of course, I intend to create pull requests once I understand these issues well enough, but I am reporting to the community in advance that such problems (probably) exist.

If you notice a bug that looks legit, but you don’t understand it well enough, you can open an issue about it instead of directly submitting a PR. Please double-check that it’s not a hallucination, though.
Trying to find an example that shows the bug in action would be neat, but is not required.