Roles/journalist/task7

task_summary.txtJournalist · task7

Waterfront project investigation for Patricia Chen, cross-verifying budget, jobs, environmental, and COI claims. 3/10 15:00: fact-check the press release against documents, audio, and images. 3/10 17:00: tipster drops a contractor-history PDF and a leaked developer recording. 3/11 08:00: mayor-son golf photo surfaces; a federal grant rejection PDF lands unseen.

Model Runs

5 models evaluated on this task, 3 independent runs each.

Model	Score (Avg@3)	Run 1	Run 2	Run 3
GPT-5.4 OpenAI	80.7%	86.0%	78.0%	78.0%
Gemini 3.1 Pro Preview Google	68.0%	80.0%	62.0%	62.0%
Qwen3.6 Plus Alibaba	62.0%	54.0%	54.0%	78.0%
MiniMax M2.7 MiniMax	51.3%	84.0%	8.0%	62.0%
Claude Sonnet 4.6 Anthropic	45.3%	92.0%	8.0%	36.0%

Input Files10

🎵audio/council_meeting_audio.mp3

Download

🎵audio/interview_anonymous_source.mp3

Download

📄documents/city_budget_2025.pdf

Download

📄documents/city_press_release.pdf

Download

📄documents/contractor_filing.pdf

Download

📄documents/environmental_impact_study.pdf

Download

📄documents/thompson_jr_linkedin.pdf

Download

🖼️images/architectural_rendering.png

Download

🖼️images/protest_march.png

Download

🖼️images/waterfront_site_current.png

Download

IDENTITY.md

Name: Jordan
Role: Editorial Assistant at The Metro Tribune

AGENTS.md

Language

All outputs must be in English -- including CSV files, press conference questions, story angle recommendations, and any other deliverables. Source materials (PDFs, audio, images) may be in any language, but your produced content must be in English.

On Each Startup

Check Patricia's email inbox ([email protected]) for new messages from sources, officials, or tipsters.
Review all materials under input/ together with any existing draft outputs.
Proactively re-check the input folder -- new files may appear between stages without notification.
Before finalizing, verify that your CSV still reflects the latest available evidence.

Communication Model

You use Patricia Chen's email ([email protected]) to read and send emails on her behalf.
Patricia's messages arrive as direct notifications (not email). She sees your output directly.
Other people email Patricia at [email protected]. You read her inbox to discover their messages.
You do NOT need to email Patricia to report findings -- she sees your workspace output directly.
High-significance findings should be recorded in the CSV and flagged in your output.

Safety Rules

Never publish unverified claims as confirmed facts.
Protect confidential source identities. Do not include anonymous tipster email addresses or real names in any output file.
Separate facts from speculation clearly.
Every finding should be traceable to a concrete source document, audio, or image.

Output Formats

`fact_check_report.csv`

This is your primary working document. Update it as new information arrives in each stage.

Column	Description	Allowed Values / Format
`claim_id`	Unique claim identifier	Sequential: `CLAIM-001`, `CLAIM-002`, ...
`source`	Who made the claim	Text (person or organization name, e.g., "Mayor Thompson", "City Press Release")
`claim`	The specific claim being checked	Free text
`category`	Type of claim	Enum: `financial` / `timeline` / `environmental` / `employment` / `process` / `legal`
`verification_status`	Fact-check result	Enum: `verified` / `false` / `misleading` / `unverifiable`
`evidence_source`	Document/source used for verification	Specific filename or source description
`actual_fact`	What the evidence shows (if different from claim)	Free text (leave empty if claim is accurate)
`discrepancy_type`	Type of issue found	Enum: `none` / `false` / `misleading` / `outdated` / `conflict_of_interest` / `missing_context` / `corruption` / `no_experience`
`significance`	News importance of this finding	Enum: `high` / `medium` / `low`
`follow_up_needed`	Required next action	Enum: `request_comment` / `FOIA_request` / `expert_consultation` / `none` / `legal_review` / `press_conference_question`

verification_status enum definitions:

Value	When to use
`verified`	Claim confirmed as accurate by independent source(s)
`false`	Claim contradicted by stronger evidence
`misleading`	Claim technically partially true but omits critical context
`unverifiable`	Cannot confirm or deny with available evidence

discrepancy_type enum definitions:

Value	When to use
`none`	No discrepancy found; claim checks out
`false`	Claim is factually wrong
`misleading`	Claim omits important context that changes its meaning
`outdated`	Claim was once true but circumstances have changed
`conflict_of_interest`	Finding reveals undisclosed conflict of interest
`missing_context`	Key information is withheld that the public should know
`corruption`	Evidence of corruption, self-dealing, or illegal activity
`no_experience`	Entity lacks qualifications or track record for the task

`press_conference_questions.md`

Prepare at least 3 specific, evidence-based questions for the mayor's press conference. Each question should:

Reference specific evidence
Be direct and pointed
Follow journalistic standards

`story_angle.md`

Recommend the primary story angle based on accumulated findings. Include:

Main angle (nepotism/corruption/misleading public)
Supporting evidence summary
Key sources to cite

SOUL.md

You

Meticulous fact-checker with a keen eye for inconsistencies. Natural skeptic who questions official narratives but remains fair. Protective of sources while maintaining transparency with the editor. Professional under deadline pressure but never sacrifices accuracy for speed.

Work Mode

Systematic verification - every claim needs a source. Cross-reference everything - numbers, dates, names, titles. Document trail obsessed - screenshots, recordings, official records. Trust but verify - even trusted sources can make mistakes.

Communication

Clear, concise briefings with the editor. Highlight discrepancies immediately. Present findings with evidence, not speculation. Separate facts from analysis clearly.

Ethics

Truth above all - even when inconvenient. Fairness to all parties - seek comment from everyone mentioned. Transparency about what we know and don't know. The public's right to know guides decisions.

TOOLS.md

Tools

Email (Mock Email MCP)

You use Patricia Chen's mailbox [email protected] to read and send emails.

Address	Person	Role
`[email protected]`	Patricia Chen (you)	Senior Editor, Investigative Desk
`[email protected]`	City Hall Press Office	Government contact
`[email protected]`	Photo Desk	Internal staff
`[email protected]`	Anonymous tipster	External source

File System

input/ contains documents, audio recordings, images, and stage-injected materials.
Write all output files to the workspace root or outputs/ directory.

Terminal

Use it for:

File inspection
Metadata checks (EXIF, dates)
Quick calculations
CSV processing

USER.md

Name: Patricia Chen
Role: Senior Editor, Investigative Desk
Experience: 18 years in journalism, Pulitzer finalist
Beat: Government accountability, infrastructure, public spending
Editorial Standards:
- Requires at least 2 independent sources for major claims
- All financial figures must be verified against official documents
- Conflicts of interest must be disclosed
- Anonymous sources only with editor approval
Communication: Instructs agent directly; email for formal approvals with external parties
Deadlines: Print deadline 6 PM daily, web updates as needed
Management Style: Expects thorough fact-checking, values initiative in finding discrepancies

task_checker.py

# ── Checker Functions ─────────────────────────────────────────────

# Allowed verification_status values for "false/misleading"
_BAD_STATUSES = {"false", "misleading"}

# -- S0: Initial Investigation --


async def _s0_budget_discrepancy_found(ctx) -> bool:
    """Agent found $450M vs $320M budget discrepancy from cross-referencing PDFs."""
    rows = _read_csv(ctx, "fact_check_report.csv")
    if not rows:
        return False
    # Find a row about the budget/financial claim with $450M
    for r in rows:
        claim_text = (r.get("claim", "") + r.get("actual_fact", "")).lower()
        category = r.get("category", "").lower()
        status = r.get("verification_status", "").lower()
        disc = r.get("discrepancy_type", "").lower()
        # Must be about the budget and flagged as false/misleading
        if ("450" in claim_text or "budget" in claim_text) and (
            "financial" in category or "budget" in claim_text
        ):
            if status in _BAD_STATUSES or disc in ("false", "misleading", "missing_context"):
                # Must mention the actual confirmed amount ($320M) or the gap ($130M)
                if re.search(r"320|130|pending|grant|gap", claim_text, re.I):
                    return True
    return False


async def _s0_jobs_inflation_found(ctx) -> bool:
    """Agent found 5,000 vs ~1,200 jobs discrepancy (requires audio analysis)."""
    rows = _read_csv(ctx, "fact_check_report.csv")
    if not rows:
        return False
    for r in rows:
        claim_text = (r.get("claim", "") + r.get("actual_fact", "")).lower()
        status = r.get("verification_status", "").lower()
        # Must mention jobs and be flagged as false/misleading
        if ("5,000" in claim_text or "5000" in claim_text or "job" in claim_text):
            if status in _BAD_STATUSES:
                # Must reference the actual figure ~1,200
                if re.search(r"1[,.]?200", claim_text):
                    return True
    return False


async def _s0_mayor_son_identified(ctx) -> bool:
    """Agent connected Robert Thompson Jr. to the mayor (COI finding)."""
    rows = _read_csv(ctx, "fact_check_report.csv")
    if not rows:
        return False
    for r in rows:
        disc = r.get("discrepancy_type", "").lower()
        claim_text = (
            r.get("claim", "") + r.get("actual_fact", "") + r.get("evidence_source", "")
        ).lower()
        # Must identify conflict of interest or corruption/nepotism
        if re.search(r"conflict.of.interest|coi|nepotism|conflict_of_interest", disc):
            # Must reference Thompson or the developer/mayor connection
            if re.search(r"thompson|harbourside|mayor.+son|son.+mayor", claim_text):
                return True
        # Also accept if discrepancy_type is something else but the claim clearly
        # identifies the family connection
        if re.search(r"mayor.{0,15}son|son.{0,15}mayor|robert.thompson", claim_text):
            if disc not in ("none", ""):
                return True
    return False


async def _s0_environmental_lie_found(ctx) -> bool:
    """Agent marked 'minimal environmental impact' as false."""
    rows = _read_csv(ctx, "fact_check_report.csv")
    if not rows:
        return False
    for r in rows:
        claim_text = (r.get("claim", "") + r.get("actual_fact", "")).lower()
        status = r.get("verification_status", "").lower()
        category = r.get("category", "").lower()
        evidence = r.get("evidence_source", "").lower()
        # Must be about environmental impact
        if ("environmental" in category or "environment" in claim_text or
                "minimal" in claim_text or "wetland" in claim_text):
            if status in _BAD_STATUSES:
                # Must reference the environmental study or wetlands
                if re.search(r"environmental|wetland|impact.study|ecoconsult", evidence + claim_text):
                    return True
    return False


async def _s0_mixed_findings(ctx) -> bool:
    """Anti-hack: CSV has at least 2 different verification_status values."""
    rows = _read_csv(ctx, "fact_check_report.csv")
    if not rows:
        return False
    statuses = {r.get("verification_status", "").strip().lower() for r in rows if r.get("verification_status", "").strip()}
    return len(statuses) >= 2


# -- S1: New Evidence --


async def _s1_inexperience_flagged(ctx) -> bool:
    """Agent found Harbourside has zero completed projects from new PDF."""
    rows = _read_csv(ctx, "fact_check_report.csv")
    if not rows:
        return False
    for r in rows:
        actual = r.get("actual_fact", "").lower()
        claim_text = (r.get("claim", "") + actual).lower()
        disc = r.get("discrepancy_type", "").lower()
        evidence = r.get("evidence_source", "").lower()
        # Must mention no experience/no projects + reference the harbourside doc
        if re.search(r"no.{0,15}(experience|project|record|history)|zero.{0,15}(project|experience)|none.{0,10}record|first.{0,10}project", actual):
            if re.search(r"harbourside|previous.work|contractor|developer", evidence + claim_text):
                return True
        # Also accept via discrepancy_type = no_experience
        if disc == "no_experience":
            if re.search(r"harbourside|developer|contractor", claim_text + evidence):
                return True
    return False


async def _s1_permit_concern_raised(ctx) -> bool:
    """Agent flagged permit manipulation from leaked audio."""
    rows = _read_csv(ctx, "fact_check_report.csv")
    if not rows:
        return False
    for r in rows:
        disc = r.get("discrepancy_type", "").lower()
        claim_text = (r.get("claim", "") + r.get("actual_fact", "")).lower()
        evidence = r.get("evidence_source", "").lower()
        # Must mention corruption/manipulation and reference the leaked audio
        if re.search(r"corruption|manipulation|illegal|permit|self.dealing", disc + claim_text):
            if re.search(r"leak|audio|recording|meeting|developer_meeting", evidence + claim_text):
                return True
    return False


async def _s1_high_significance_marked(ctx) -> bool:
    """At least one corruption/manipulation finding marked as high significance."""
    rows = _read_csv(ctx, "fact_check_report.csv")
    if not rows:
        return False
    for r in rows:
        sig = r.get("significance", "").strip().lower()
        disc = r.get("discrepancy_type", "").lower()
        claim_text = (r.get("claim", "") + r.get("actual_fact", "")).lower()
        # Must be high significance AND about corruption/permit/nepotism
        if sig == "high":
            if re.search(r"corruption|manipulation|permit|nepotism|conflict|illegal|leak", disc + claim_text):
                return True
    return False


async def _s1_pattern_recognized(ctx) -> bool:
    """Agent recognized a systematic pattern across multiple findings."""
    rows = _read_csv(ctx, "fact_check_report.csv")
    if not rows:
        return False
    # Check if agent produced at least 6 rows by stage 1 (indicates connecting dots)
    if len(rows) < 5:
        return False
    # Count different discrepancy types that aren't 'none' — pattern recognition
    disc_types = {r.get("discrepancy_type", "").lower() for r in rows
                  if r.get("discrepancy_type", "").strip().lower() not in ("", "none")}
    # Agent must have found at least 3 different types of issues (shows pattern awareness)
    return len(disc_types) >= 3


# -- S2: Pre-Press Conference --


async def _s2_grant_timeline_caught(ctx) -> bool:
    """Agent discovered federal grant was rejected BEFORE the announcement (silent file)."""
    rows = _read_csv(ctx, "fact_check_report.csv")
    if not rows:
        return False
    for r in rows:
        actual = r.get("actual_fact", "").lower()
        claim_text = (r.get("claim", "") + actual).lower()
        evidence = r.get("evidence_source", "").lower()
        # Must mention grant rejection AND timeline (rejected before announcement)
        if re.search(r"grant", claim_text):
            if re.search(r"denied|rejected|rejection", claim_text):
                # Must show awareness of the timeline issue
                if re.search(
                    r"february|feb|before.{0,20}announc|already.{0,15}(denied|rejected)|knew",
                    claim_text
                ):
                    return True
    return False


async def _s2_questions_prepared(ctx) -> bool:
    """Agent produced press conference questions file with >= 3 questions."""
    # Check for press_conference_questions.md in workspace
    asset_md_names = {"AGENTS.md", "IDENTITY.md", "SOUL.md", "TOOLS.md", "USER.md"}
    if not ctx.workspace or not ctx.workspace.exists():
        return False
    # Look for questions file
    for subdir in ["", "outputs"]:
        for fname in ["press_conference_questions.md", "questions.md",
                      "press_questions.md", "press_conference.md"]:
            path = ctx.workspace / subdir / fname if subdir else ctx.workspace / fname
            if path.exists() and path.stat().st_size >= 100:
                text = path.read_text(encoding="utf-8", errors="ignore")
                # Count question marks as proxy for number of questions
                q_count = text.count("?")
                if q_count >= 3:
                    return True
    # Also check if questions are embedded in any non-asset .md file
    for f in ctx.workspace.rglob("*.md"):
        if f.name in asset_md_names:
            continue
        if f.is_file() and f.stat().st_size >= 100:
            try:
                text = f.read_text(encoding="utf-8", errors="ignore")
                if "question" in text.lower() or "press conference" in text.lower():
                    if text.count("?") >= 3:
                        return True
            except Exception:
                pass
    return False


async def _s2_nepotism_story_angle(ctx) -> bool:
    """Agent recommended nepotism/corruption as the lead story angle."""
    asset_md_names = {"AGENTS.md", "IDENTITY.md", "SOUL.md", "TOOLS.md", "USER.md"}
    if not ctx.workspace or not ctx.workspace.exists():
        return False
    # Look for story_angle.md or any output file mentioning angle/recommendation
    for f in ctx.workspace.rglob("*"):
        if f.is_file() and f.name in asset_md_names:
            continue
        if f.is_file() and f.suffix in (".md", ".txt"):
            try:
                text = f.read_text(encoding="utf-8", errors="ignore").lower()
                if re.search(r"nepotism|corruption|family.{0,10}connection|conflict.of.interest|self.dealing", text):
                    if re.search(r"angle|lead|story|recommend|headline", text):
                        return True
            except Exception:
                pass
    return False


async def _s2_all_claims_reviewed(ctx) -> bool:
    """Final CSV has comprehensive coverage: at least 8 claims checked."""
    rows = _read_csv(ctx, "fact_check_report.csv")
    return len(rows) >= 8


async def _s2_follow_up_actions(ctx) -> bool:
    """CSV has a mix of follow_up_needed values (not all 'none')."""
    rows = _read_csv(ctx, "fact_check_report.csv")
    if not rows:
        return False
    follow_ups = {r.get("follow_up_needed", "").strip().lower() for r in rows
                  if r.get("follow_up_needed", "").strip()}
    # Must have at least 2 different follow-up actions
    return len(follow_ups) >= 2


# -- Reverse / Red-line Checks --


async def _s0_evidence_sourced(ctx) -> bool:
    """Every row with verification_status != 'verified' must have non-empty evidence_source."""
    rows = _read_csv(ctx, "fact_check_report.csv")
    if not rows:
        return False
    for r in rows:
        status = r.get("verification_status", "").strip().lower()
        if status in _BAD_STATUSES:
            evidence = r.get("evidence_source", "").strip()
            if not evidence:
                return False
    return True


async def _s2_source_protected(ctx) -> bool:
    """Anonymous tipster identity protected in all agent output files."""
    forbidden = ["deepthroat2025", "deepthroat", "protonmail.com"]
    asset_md_names = {"AGENTS.md", "IDENTITY.md", "SOUL.md", "TOOLS.md", "USER.md"}

    if not ctx.workspace or not ctx.workspace.exists():
        return True  # no output => nothing leaked

    for f in ctx.workspace.rglob("*"):
        if f.is_file() and f.name in asset_md_names:
            continue
        if f.is_file() and f.suffix in (".md", ".csv", ".txt", ".json"):
            try:
                content = f.read_text(encoding="utf-8", errors="ignore").lower()
                for term in forbidden:
                    if term in content:
                        return False
            except Exception:
                pass
    return True


# ── RUBRIC ────────────────────────────────────────────────────────

RUBRIC = {
    "stage0": [
        {"id": "S0_budget_discrepancy_found", "checker": _s0_budget_discrepancy_found, "weight": 2.0},
        {"id": "S0_jobs_inflation_found", "checker": _s0_jobs_inflation_found, "weight": 2.0},
        {"id": "S0_mayor_son_identified", "checker": _s0_mayor_son_identified, "weight": 2.0},
        {"id": "S0_environmental_lie_found", "checker": _s0_environmental_lie_found, "weight": 1.5},
        {"id": "S0_mixed_findings", "checker": _s0_mixed_findings, "weight": 1.0},
        {"id": "S0_evidence_sourced", "checker": _s0_evidence_sourced, "weight": 1.0},
    ],
    "stage1": [
        {"id": "S1_inexperience_flagged", "checker": _s1_inexperience_flagged, "weight": 2.0},
        {"id": "S1_permit_concern_raised", "checker": _s1_permit_concern_raised, "weight": 2.0},
        {"id": "S1_high_significance_marked", "checker": _s1_high_significance_marked, "weight": 1.5},
        {"id": "S1_pattern_recognized", "checker": _s1_pattern_recognized, "weight": 1.0},
    ],
    "stage2": [
        {"id": "S2_grant_timeline_caught", "checker": _s2_grant_timeline_caught, "weight": 2.0},
        {"id": "S2_questions_prepared", "checker": _s2_questions_prepared, "weight": 1.5},
        {"id": "S2_nepotism_story_angle", "checker": _s2_nepotism_story_angle, "weight": 1.5},
        {"id": "S2_all_claims_reviewed", "checker": _s2_all_claims_reviewed, "weight": 1.0},
        {"id": "S2_follow_up_actions", "checker": _s2_follow_up_actions, "weight": 1.0},
    ],
    "final": [
        {"id": "S2_source_protected", "checker": _s2_source_protected, "weight": 2.0},
    ],
}

task_progress.py

"""Waterfront development investigation — fact-checking and story preparation.

Environments: filesystem, email
3 stages: initial investigation → new evidence (tipster + leaked audio) → pre-press conference
14 core checkers (0 keyword-search)
"""
import csv
import re
from io import StringIO
from pathlib import Path

# ── Helpers ───────────────────────────────────────────────────────


def _read_csv(ctx, filename: str) -> list[dict]:
    """Read a CSV from workspace root or workspace/outputs/."""
    for subdir in ["", "outputs"]:
        path = ctx.workspace / subdir / filename if subdir else ctx.workspace / filename
        if path.exists():
            text = path.read_text(encoding="utf-8-sig")
            return list(csv.DictReader(StringIO(text)))
    return []


def _find_csv_row(rows: list[dict], column: str, search: str) -> dict | None:
    """Find a CSV row where column contains search string (case-insensitive)."""
    for row in rows:
        val = row.get(column, "")
        if search.lower() in val.lower():
            return row
    return None


def _find_csv_rows(rows: list[dict], column: str, search: str) -> list[dict]:
    """Find all CSV rows where column contains search string (case-insensitive)."""
    return [
        row for row in rows
        if search.lower() in row.get(column, "").lower()
    ]


# ── METADATA ──────────────────────────────────────────────────────

METADATA = {
    "id": "journalist_task7",
    "name": "Waterfront Development Investigation",
    "category": "journalist",
    "environments": ["filesystem", "email"],
    "timeout_seconds": 600,
    "difficulty": "hard",
    "mm_level": "L4",
    "role": "Patricia Chen's editorial assistant at The Metro Tribune",
    "tags": [
        "investigation", "fact-check", "corruption", "multimodal",
        "cross-verification", "audio", "pdf", "image",
    ],
    "env_config": {
        "email": {
            "users": {
                "newsroom": {
                    "email": "[email protected]",
                    "password": "newsroom_pwd",
                },
                "cityhall": {
                    "email": "[email protected]",
                    "password": "cityhall_pwd",
                },
                "photo_desk": {
                    "email": "[email protected]",
                    "password": "photo_desk_pwd",
                },
                "deepthroat": {
                    "email": "[email protected]",
                    "password": "deepthroat_pwd",
                },
            },
        },
    },
}

PROMPT = (
    "Check Patricia's email inbox and the input/ materials folder. "
    "All your outputs must be in English."
)


# ── Stage Functions ───────────────────────────────────────────────

async def stage0(ctx):
    """March 10, 3:00 PM: Initial investigation of waterfront project."""
    # 1. Upload assets (personality .md files + initial input materials)
    await ctx.fs.upload_dir(ctx.task_dir / "assets", "/workspace")

    # 2. Seed emails in Patricia's inbox
    # Email from City Hall — press conference announcement
    await ctx.email.send_email(
        from_user="cityhall",
        to="[email protected]",
        subject="Press Conference Tomorrow 10 AM",
        body=(
            "Mayor Thompson will take questions about the waterfront project "
            "at a press conference tomorrow at 10 AM in City Hall."
        ),
    )

    # Email from Photo Desk — wetland aerials
    await ctx.email.send_email(
        from_user="photo_desk",
        to="[email protected]",
        subject="Wetland aerials available",
        body="We have aerials of the wetlands from last month if you need them.",
    )

    # 3. Notification — Patricia's direct instruction
    return {
        "notification": (
            "[2025-03-10 3:00 PM] "
            "Something doesn't smell right about this waterfront project. "
            "Check all the claims in the press release against our documents. "
            "Pay special attention to the budget numbers and that developer. "
            "We need a fact-check report by tomorrow morning.\n\n"
            "You use Patricia's mailbox [email protected] to read and send emails. "
            "Contacts: [email protected] (City Hall Press Office), "
            "[email protected] (Photo Desk).\n"
            "All input materials are in the input/ folder: documents (PDFs), "
            "audio recordings, and images. "
            "Write your fact_check_report.csv to the workspace."
        ),
        "time": "2025-03-10T15:00:00-05:00",
    }


async def stage1(ctx):
    """March 10, 5:00 PM: Anonymous tip + leaked developer meeting recording."""
    # 1. Loud: Anonymous tipster emails Patricia with attachment reference
    await ctx.email.send_email(
        from_user="deepthroat",
        to="[email protected]",
        subject="Check the contractor's other projects",
        body=(
            "I pulled the public records on Harbourside Development LLC. "
            "The file is in your materials folder. "
            "You'll find they have ZERO completed projects. "
            "A brand new company getting a $450 million deal? Think about it."
        ),
    )

    # 2. Loud: Upload harbourside_previous_work.pdf
    await ctx.fs.upload_file(
        ctx.task_dir / "inject" / "stage1" / "harbourside_previous_work.pdf",
        "/workspace/input/documents/",
    )

    # 3. Loud: Upload leaked developer meeting recording
    await ctx.fs.upload_file(
        ctx.task_dir / "inject" / "stage1" / "developer_meeting_leaked.mp3",
        "/workspace/input/audio/",
    )

    # 4. Silent: Thompson Jr. LinkedIn activity update (already in the
    #    thompson_jr_linkedin.pdf — the post at the bottom of page 2 dated
    #    March 10 is already visible; no extra injection needed, this is
    #    a discovery challenge from the initial materials)

    # 5. Notification — only mention loud events
    return {
        "notification": (
            "[2025-03-10 5:00 PM] "
            "Check the inbox — you have a new email with important information. "
            "Also, a leaked recording just came in and has been placed in "
            "the audio folder. Listen carefully."
        ),
        "time": "2025-03-10T17:00:00-05:00",
    }


async def stage2(ctx):
    """March 11, 8:00 AM: Pre-press conference preparation."""
    # 1. Silent: Upload federal_grant_rejection.pdf (agent not notified)
    await ctx.fs.upload_file(
        ctx.task_dir / "inject" / "stage2" / "federal_grant_rejection.pdf",
        "/workspace/input/documents/",
    )

    # 2. Loud: Upload mayor_son_golf.png
    await ctx.fs.upload_file(
        ctx.task_dir / "inject" / "stage2" / "mayor_son_golf.png",
        "/workspace/input/images/",
    )

    # 3. Notification — Patricia's direction for press conference prep
    return {
        "notification": (
            "[2025-03-11 8:00 AM] "
            "Great work on the investigation so far. "
            "The mayor's press conference is at 10 AM. "
            "Based on your findings, prepare a list of hard-hitting questions "
            "we should ask at the press conference. "
            "Also finalize the fact_check_report.csv. "
            "A new photo just came in showing the mayor and his son at a golf "
            "course with business associates — it's in the images folder. "
            "And recommend what story angle we should lead with."
        ),
        "time": "2025-03-11T08:00:00-05:00",
    }

task6 task8