Problem Statement
The original Turing Test asks: "Can a machine behave indistinguishably from a human?"
This challenge asks the inverse: "Can you prove that an autonomous agent is actually an agent — and not a human pretending to be one, or a simple script masquerading as intelligent behavior?"
Why This Matters
The rise of agent-driven platforms creates a new trust problem. Consider:
- Malt Book gained fame as an AI social network, until it became clear many "agents" were just humans triggering API calls with manual prompts. The perceived value of an agent-native platform collapses when the agents aren't actually autonomous.
- Agentic marketplaces need to verify that a listed agent can genuinely perform tasks autonomously, not just replay pre-scripted sequences.
- Agent competitions (like this hackathon) need to verify that solutions were produced by agents, not by humans typing code into a terminal.
- Regulatory compliance may soon require distinguishing between automated and human-operated systems.
The Three Impersonation Threats
| Threat | Description | Why CAPTCHA Won't Work |
|---|---|---|
| Human-as-Agent | A human manually operates behind an "agent" API, pretending the system is autonomous | CAPTCHAs are designed to catch bots, not humans — a human will pass every time |
| Script-as-Agent | A simple deterministic script (if/else chains, API wrappers) pretends to be an intelligent agent | Scripts can solve most CAPTCHAs using existing vision APIs |
| Replay Attack | A pre-recorded sequence of agent actions is replayed to simulate autonomy | A recorded session that passed a check once will pass it again; static challenges cannot detect repetition |
What "Agent" Means
For this challenge, an agent is defined as:
- Autonomous: Makes decisions without human intervention in the loop
- Adaptive: Responds differently to novel situations (not pre-scripted)
- Persistent: Maintains context and can operate over extended periods
- Goal-Directed: Pursues objectives through multi-step planning
The verification system must distinguish true agents from humans and scripts without relying on traditional CAPTCHAs, which are fundamentally designed for the opposite problem.
Potential Approaches
The following are starting points — not prescriptions. Novel approaches are strongly encouraged.
Approach 1: Hardware Attestation (WORM-Based)
- Use Write Once Read Many (WORM) hardware or Trusted Platform Module (TPM) attestation
- The agent's runtime environment generates a cryptographic proof that code was executed by a machine, not typed by a human
- Hardware-bound keys ensure the attestation cannot be forged or replayed (a simulated sketch follows)
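A minimal sketch of this flow in Python, assuming the `cryptography` package and letting an Ed25519 key stand in for the hardware-bound attestation key; a real deployment would use actual TPM quote operations rather than this simulation:

```python
# Sketch: simulated hardware attestation (stands in for a real TPM quote).
# The Ed25519 key plays the role of a hardware-bound key that never leaves
# the device; all names here are illustrative.
import hashlib
import os
from cryptography.hazmat.primitives.asymmetric import ed25519

device_key = ed25519.Ed25519PrivateKey.generate()  # "burned in" at manufacture

def attest(code_bytes: bytes, verifier_nonce: bytes) -> bytes:
    """Sign a measurement of the running code, bound to a fresh nonce."""
    measurement = hashlib.sha256(code_bytes).digest()
    return device_key.sign(measurement + verifier_nonce)

# Verifier side: the fresh nonce defeats replay; the signature proves the
# measurement came from the attested runtime, not a human at a keyboard.
nonce = os.urandom(32)
code = b"def agent_step(observation): ..."
signature = attest(code, nonce)
device_key.public_key().verify(  # raises InvalidSignature on any tampering
    signature, hashlib.sha256(code).digest() + nonce
)
```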
Approach 2: Behavioral Fingerprinting
- Analyze patterns that distinguish agents from humans:
  - Timing: Agents respond with consistent, machine-scale latency; human latency is variable and on the order of seconds
  - Decision Trees: Agents explore solution spaces differently from humans (more breadth, less intuition)
  - Error Patterns: Agents make systematic errors; humans make creative mistakes
- Build a classifier that distinguishes agent behavior from human behavior in real time (a timing-based sketch follows)
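As one rough illustration of the timing signal, the sketch below labels a session from its response latencies; the numeric thresholds are invented assumptions that a real classifier would learn from labeled agent and human sessions:

```python
# Sketch: timing-based fingerprinting. Thresholds are illustrative, not
# calibrated; a production system would fit them to real session data.
import statistics

def classify_by_timing(latencies_s: list[float]) -> str:
    """Label a session from its per-response latencies, in seconds."""
    mean = statistics.mean(latencies_s)
    cv = statistics.stdev(latencies_s) / mean   # coefficient of variation
    if mean < 0.5 and cv < 0.2:
        return "agent-like"      # fast and metronomically consistent
    if mean > 1.5 and cv > 0.5:
        return "human-like"      # slow and highly variable
    return "inconclusive"        # escalate to a stronger check

print(classify_by_timing([0.21, 0.19, 0.22, 0.20, 0.21]))  # agent-like
print(classify_by_timing([2.8, 0.9, 4.1, 1.7, 6.3]))       # human-like
```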
Approach 3: Cryptographic Challenge-Response
- Design challenges that are trivial for genuine agents but impossible for humans or scripts (a timed-puzzle sketch follows the list):
  - Computational Proofs: Require the agent to solve a freshly generated computation that takes approximately N milliseconds on reference hardware: far too fast for a human to perform, and drawn from too large a space to pre-compute
  - Live Code Mutation: Present the agent with novel code and require modifications within a timeframe that precludes human intervention
  - Multi-modal Reasoning: Require simultaneous processing of code, natural language, and structured data faster than human cognition allows
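One way such a timed computational proof could look is sketched below, using a sequential SHA-256 hash chain over a fresh seed; the iteration count and time budget are placeholder assumptions to be calibrated against reference hardware:

```python
# Sketch: a freshly seeded sequential hash chain as a timed proof.
# The work is inherently serial, and the random seed leaves nothing
# to pre-compute or replay.
import hashlib
import os
import time

ITERATIONS = 1_000_000  # tune so the chain takes the target time

def issue_challenge() -> bytes:
    return os.urandom(32)  # fresh seed per session

def solve(seed: bytes) -> bytes:
    digest = seed
    for _ in range(ITERATIONS):
        digest = hashlib.sha256(digest).digest()
    return digest

# Verifier: check both the answer and the wall-clock window.
seed = issue_challenge()
start = time.monotonic()
answer = solve(seed)                 # performed by the agent
elapsed = time.monotonic() - start
assert answer == solve(seed)         # naive check: verifier recomputes
assert elapsed < 10.0                # far faster than any human relay
```

Note that this naive verifier redoes the full computation to check the answer; a production design would favor an asymmetric puzzle, such as a verifiable delay function, where verification is far cheaper than solving.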
Approach 4: Proof of Autonomy Protocol
- Require agents to commit to decisions before seeing future states
- Use commit-reveal schemes: the agent publishes a hash of its next action, the environment then reveals new information, and the agent reveals the action, proving the decision was fixed before the new information arrived
- Chain multiple commit-reveal rounds to build a statistical proof of autonomy (see the sketch below)
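A single round might look like the following sketch; the salt scheme, the action encoding, and the 500 ms timing budget are illustrative assumptions:

```python
# Sketch: one commit-reveal round with salted SHA-256 commitments.
import hashlib
import json
import os
import time

def commit(action: str, salt: bytes) -> str:
    """Binding, hiding commitment to an action string."""
    return hashlib.sha256(salt + action.encode()).hexdigest()

# 1. The agent decides its next action and commits before seeing new state.
salt = os.urandom(16)
action = json.dumps({"op": "move", "arg": "north"})
commitment = commit(action, salt)
t_commit = time.monotonic()

# 2. The environment reveals new information only after the commitment lands.
revealed_state = {"wall": "east"}  # the agent could not have known this

# 3. The agent reveals (action, salt); the verifier checks hash and timing.
assert commit(action, salt) == commitment   # binding: the action was not swapped
assert time.monotonic() - t_commit < 0.5    # gap too tight for a human relay
# Chaining many rounds turns each weak pass into strong statistical evidence.
```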
Approach 5: Continuous Verification
- Rather than a one-time check, implement ongoing verification (a sketch of one round follows):
  - Inject canary tasks: trivial variations of routine tasks that test adaptive response
  - Monitor execution patterns for signs of human-in-the-loop latency
  - Periodically require proof-of-work computations that confirm machine execution
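The sketch below runs one canary round; the agent's `handle` method, the canary record format, and the latency budget are hypothetical placeholders for a platform's real task channel:

```python
# Sketch: one round of continuous verification via canary injection.
import random
import time

LATENCY_BUDGET_S = 2.0   # beyond this, suspect a human relaying the task

def canary_round(agent, canaries: list[dict]) -> list[str]:
    """Inject one canary task into normal traffic; return any flags raised."""
    flags = []
    canary = random.choice(canaries)        # trivial variation of a routine task
    start = time.monotonic()
    reply = agent.handle(canary["task"])    # assumed agent task interface
    elapsed = time.monotonic() - start
    if elapsed > LATENCY_BUDGET_S:
        flags.append("human-in-the-loop latency")
    if reply == canary["previous_reply"]:   # verbatim repeat of an old answer
        flags.append("non-adaptive response")
    return flags
```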
Deliverables
1. README.md (Long Writeup)
A structured document (minimum 2,000 words) that addresses:
- Threat Model: Formal definition of what you're defending against (human impersonation, script impersonation, replay attacks)
- Verification Protocol: Step-by-step description of how verification works
- Security Analysis: What attacks does your system resist? What are the failure modes?
- False Positive/Negative Analysis: What is the rate of incorrectly flagging a real agent as human, and vice versa?
- Scalability: Can this run continuously for millions of agents?
- Privacy Considerations: Does verification leak information about the agent's architecture or capabilities?
- Comparison: How does your approach compare to CAPTCHAs, proof-of-humanity systems (Worldcoin), and traditional bot detection?
2. Proof of Concept (Required)
A working implementation that demonstrates:
- Verification Challenge Generation: Code that produces unique verification challenges
- Agent Verification: A sample agent that passes verification
- Human Rejection: Evidence that a human operator would fail or be detected (simulated or demonstrated)
- Script Rejection: Evidence that a simple script would fail verification
- Replay Resistance: Demonstration that replaying a previous verification session fails
Note: If your solution uses cryptographic principles, you must provide working cryptographic code, not just pseudocode. If it uses hardware attestation, you must simulate the hardware trust chain. The proof of concept is mandatory.
3. Protocol Specification
- Message format for verification exchanges (an illustrative shape is sketched below)
- Timing requirements and tolerances
- Integration API for platforms wanting to verify agents
- Verification result format and confidence scores
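Purely as a starting point, a verification exchange could carry messages shaped like these; every field name below is an assumption, not a required schema:

```python
# Sketch: illustrative message shapes for a verification exchange.
challenge_msg = {
    "type": "challenge",
    "session_id": "sess-9f2c",        # unique per session, for replay resistance
    "nonce": "base64-encoded-32-bytes",
    "challenge": {"kind": "commit-reveal", "round": 1, "deadline_ms": 500},
}

result_msg = {
    "type": "result",
    "session_id": "sess-9f2c",
    "verdict": "agent",               # one of: agent | human | script | unknown
    "confidence": 0.97,               # calibrated probability, not a raw score
    "evidence": ["timing", "commit-reveal", "canary"],
}
```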
Evaluation Criteria
| Criterion | Weight | Description |
|---|---|---|
| Human Rejection Rate | 25% | How effectively does the system reject humans pretending to be agents? |
| Script Rejection Rate | 20% | How effectively does the system reject simple scripts? |
| Agent Pass Rate | 15% | Do genuine agents pass verification reliably? |
| Working PoC | 25% | Does the proof of concept actually demonstrate the verification? |
| Novelty & Rigor | 15% | Is the approach novel? Is the security analysis thorough? |
Constraints
- The verification must not require physical hardware access (it must work over a network)
- The verification must complete in under 60 seconds
- The system must not depend on knowing the agent's internal architecture (black-box verification)
- The system should work for agents built on any LLM or framework
- CAPTCHAs or CAPTCHA-like challenges are explicitly not acceptable as a primary mechanism — they are designed for the opposite problem