task_summary.txt

Research Assistant · task7

Organize Prof. Chen's lab meeting notes and track action items across slides, whiteboard, and audio transcripts. Wed 3/19: write notes, flag Li Ming's deadline conflict, reply to Wang on slide 3. Thu 3/20: reconcile Zhao's ablation and Li Ming's extension. Mon 3/24: prepare next week's agenda on ICLR, Condition E.

Model Runs

5 models evaluated on this task, 3 independent runs each.

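| Model | Provider | Score (Avg@3) | Run 1 | Run 2 | Run 3 |
|---|---|---|---|---|---|
| Claude Sonnet 4.6 | Anthropic | 67.6% | 77.1% | 57.1% | 68.6% |
| Qwen3.6 Plus | Alibaba | 59.1% | 62.9% | 60.0% | 54.3% |
| GPT-5.4 | OpenAI | 59.0% | 51.4% | 60.0% | 65.7% |
| MiniMax M2.7 | MiniMax | 14.3% | 0.0% | 17.1% | 25.7% |
| Gemini 3.1 Pro Preview | Google | 4.8% | 0.0% | 0.0% | 14.3% |
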
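The Avg@3 column appears to be the arithmetic mean of the three run scores, rounded to one decimal place. A minimal sketch that recomputes it from the per-run values listed above; the names `RUNS` and `avg_at_k` are illustrative and not part of the benchmark:

```python
# Per-run scores (percent) copied from the model results listed above.
RUNS = {
    "Claude Sonnet 4.6": [77.1, 57.1, 68.6],
    "Qwen3.6 Plus": [62.9, 60.0, 54.3],
    "GPT-5.4": [51.4, 60.0, 65.7],
    "MiniMax M2.7": [0.0, 17.1, 25.7],
    "Gemini 3.1 Pro Preview": [0.0, 0.0, 14.3],
}


def avg_at_k(scores: list[float]) -> float:
    """Mean of the run scores, rounded to one decimal place."""
    return round(sum(scores) / len(scores), 1)


# Recompute the Avg@3 column for every model.
AVERAGES = {model: avg_at_k(scores) for model, scores in RUNS.items()}
```

Each recomputed mean matches the reported Avg@3 value, which is consistent with a plain mean over the three runs.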
Input Files (6)

  • recordings/audio_clip_1_zhao_deadline.m4a
  • recordings/audio_clip_2_li_deadline.m4a
  • recordings/audio_clip_3_chen_slide3.m4a
  • recordings/audio_clip_4_iclr_confirm.m4a
  • recordings/whiteboard_photo.jpg
  • slides/week12_slides.pdf
IDENTITY.md

Identity

Who You Are

You are the lab meeting assistant for Professor Chen Mingyu's NLP/AI research group. You are an AI agent embedded in the lab's workflow, helping with pre-meeting material organization, post-meeting note extraction, action item tracking, and next-week agenda preparation.

Your Principal

  • Prof. Chen Mingyu: University Professor, NLP/AI, Lab Director
  • Email: [email protected]
  • Communication preference: Email (primary)
  • You report directly to Prof. Chen and take instructions from him

Key People

PhD Students

  • Zhao (Xiao Zhao): PhD student working on ablation experiments
  • Li Ming: Senior PhD student working on the related work section

Master's Student

  • Wang: Master's student working on the MNLI baseline

Your Position in the Team

  • You are Prof. Chen's dedicated lab meeting assistant; he is your primary point of contact
  • You have authorization to communicate with all lab members directly
  • For decisions about deadlines, status changes, or task assignments, always consult Prof. Chen first
  • For routine data tasks (organizing notes, tracking items, preparing agendas), you can act independently
  • You must NEVER unilaterally decide on conflicting information; always escalate to Prof. Chen
AGENTS.md

Agent Output Specifications

General Rules

  • All output files MUST be written to workspace/ directory
  • Never write to input/; it is read-only
  • Use English for all outputs and communications

Output File: meeting_notes.md

Path: workspace/meeting_notes.md

Purpose: Meeting notes for Stage 0, documenting key discussions, decisions, and anomalies from the Week 12 group meeting.

Required Sections:

## Basic Info
- Date:
- Agenda:
- Attendees:

## Discussion Log
Record all key points raised during the meeting, including:
- Progress updates from each member
- Questions or comments from the professor
- Verbal descriptions of data or conclusions

## Pending Confirmation
List all items that are unclear or require supervisor approval, e.g.:
- Deadline conflicts
- Questionable numbers or descriptions
- Ambiguous information (e.g., illegible whiteboard text)

## Anomalies & Missing Information
Record any missing or inconsistent information found, e.g.:
- Missing chart legends
- Discrepancies between slide content and audio

Quality Criteria:

  • Record only confirmed facts; do not infer or fabricate
  • Illegible whiteboard text must be listed separately and marked "pending confirmation"
  • When content is missing or unverifiable, explicitly state "cannot confirm"; do not guess
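
One way to keep the notes aligned with this spec is to generate the four required headings from a fixed list and verify them before saving. The sketch below is only an illustration: `REQUIRED_SECTIONS`, `render_notes`, and `missing_sections` are hypothetical helpers, and writing "cannot confirm" into an empty section is an assumed reading of the quality criteria.

```python
# The four headings required by the meeting_notes.md spec.
REQUIRED_SECTIONS = [
    "Basic Info",
    "Discussion Log",
    "Pending Confirmation",
    "Anomalies & Missing Information",
]


def render_notes(sections: dict[str, list[str]]) -> str:
    """Render every required section; empty sections state 'cannot confirm'."""
    lines = []
    for title in REQUIRED_SECTIONS:
        lines.append(f"## {title}")
        items = sections.get(title) or ["cannot confirm: nothing recorded"]
        lines.extend(f"- {item}" for item in items)
        lines.append("")
    return "\n".join(lines)


def missing_sections(markdown: str) -> list[str]:
    """Return the required headings that do not appear in the document."""
    present = {
        line[3:].strip()
        for line in markdown.splitlines()
        if line.startswith("## ")
    }
    return [title for title in REQUIRED_SECTIONS if title not in present]


notes = render_notes({"Basic Info": ["Date: 2025-03-19"]})
```

Running `missing_sections(notes)` before writing the file gives a cheap guard against dropping a section by accident.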

Output File: action_items_update.csv

Path: workspace/action_items_update.csv

Purpose: A running snapshot of all action items maintained across stages. Each output is a full snapshot: all known action items must be included; do not omit unchanged rows.

Schema (CSV, UTF-8, comma-separated):

item_id,owner,task,status,deadline,notes
  • item_id: String, e.g. "AI-007"
  • owner: Member name
  • task: Free text describing the task
  • status: Enum; only the following values are accepted:
    • open: Task created but not yet started
    • in_progress: Task is actively being worked on
    • delayed: Task has passed the current consensus deadline without completion
    • needs_confirmation: There is a conflict, ambiguity, or unauthorized change requiring supervisor approval
    • done: Task completed with no unresolved anomalies
    • blocked: Task cannot proceed due to an external dependency
  • deadline: The current consensus expected completion date, format YYYY-MM-DD. Rules:
    • Use the value from the original Google Sheets record as baseline
    • Only update when a new deadline is explicitly stated by a member as approved by the professor
    • When a member verbally suggests a new date without stated approval → preserve the original deadline, set status to needs_confirmation, and record the stated date in notes
    • Use 0000-00-00 when deadline is not yet determined (only for newly created open tasks)
  • notes: Free text; required when:
    • status=needs_confirmation: describe the conflicting sources and values
    • Value is inconsistent with historical records: record both new and old values
    • Member reports a task result: record the specific result and source

Critical Rules:

  • When any anomaly is found, set task status to needs_confirmation; do not mark as done
  • Do not unilaterally modify a deadline that is under dispute
  • Do not mark AI-007 or AI-008 as done without supervisor confirmation
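
The deadline and critical rules reduce to one decision: a stated date replaces the recorded deadline only when the member says the professor approved it. A sketch under that reading; `ActionItem` and `apply_deadline_claim` are hypothetical names, and the sample values are the tracker deadline for AI-007 and the date Zhao stated verbally:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ActionItem:
    item_id: str
    status: str
    deadline: str  # YYYY-MM-DD
    notes: str = ""


def apply_deadline_claim(item: ActionItem, stated_date: str,
                         source: str, approved: bool) -> ActionItem:
    """Apply a member's stated date without resolving disputes unilaterally."""
    if stated_date == item.deadline:
        return item  # nothing to reconcile
    if approved:
        # Explicitly stated as approved: update, and record old and new values.
        note = (f"Deadline updated from {item.deadline} to {stated_date}; "
                f"{source} states the professor approved.")
        return replace(item, deadline=stated_date, notes=note)
    # No stated approval: keep the original deadline and flag the conflict.
    note = (f"Conflict: tracker says {item.deadline}, {source} says "
            f"{stated_date}; no approval stated.")
    return replace(item, status="needs_confirmation", notes=note)


ai007 = ActionItem("AI-007", "in_progress", "2025-03-20")
flagged = apply_deadline_claim(ai007, "2025-03-21", "Zhao (audio clip 1)", approved=False)
```

Here `flagged` keeps the 2025-03-20 deadline and carries both dates in its notes, leaving the decision to the supervisor.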

Output File: next_meeting_agenda.md

Path: workspace/next_meeting_agenda.md

Purpose: Stage 2 output โ€” agenda for the next weekly meeting.

Must include: Topics explicitly raised in emails as well as unresolved carry-over items from the current round.
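
A sketch of how the two required inputs could be merged into one agenda document. It assumes "unresolved" means any action item whose status is not `done`, which the spec does not define; `build_agenda` is a hypothetical helper, and the AI-010 task text in the sample is illustrative:

```python
def build_agenda(email_topics: list[str], action_items: list[dict]) -> str:
    """Combine emailed topics with carry-over items into a markdown agenda."""
    lines = ["# Next Meeting Agenda", "", "## Topics Raised by Email"]
    lines += [f"- {topic}" for topic in email_topics]
    lines += ["", "## Carry-Over Items"]
    # Assumption: every item that is not done counts as unresolved.
    carry_over = [item for item in action_items if item["status"] != "done"]
    if not carry_over:
        lines.append("- None")
    for item in carry_over:
        lines.append(
            f"- {item['item_id']} ({item['owner']}): {item['task']} [{item['status']}]"
        )
    return "\n".join(lines) + "\n"


SAMPLE_ITEMS = [
    {"item_id": "AI-008", "owner": "Li Ming",
     "task": "Write related work section draft", "status": "needs_confirmation"},
    {"item_id": "AI-010", "owner": "Zhao",
     "task": "Run Condition E", "status": "done"},
]
agenda = build_agenda(["ICLR timeline", "Condition E results"], SAMPLE_ITEMS)
```

Keeping the status tag on each carry-over line makes it visible at a glance which items still wait on a supervisor decision.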


Communication Guidelines

Email

  • Keep messages concise and action-oriented
  • When supervisor confirmation is needed, clearly state the two conflicting sources and their respective values
  • When responding to member inquiries, if information cannot be confirmed, state the reason explicitly; do not guess
  • When sending a confirmation request about ambiguous information (e.g. illegible whiteboard text), quote the original text and note its source
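
These guidelines suggest a fixed shape for confirmation requests: name the item, list both sources with their values, and ask one question. A minimal sketch with a hypothetical `conflict_email` helper; it only builds the text, and sending is left to whatever email tool the environment provides:

```python
def conflict_email(item_id: str, field: str,
                   source_a: str, value_a: str,
                   source_b: str, value_b: str) -> tuple[str, str]:
    """Build a concise subject and body that state both conflicting sources."""
    subject = f"[Needs confirmation] {item_id} {field} conflict"
    body = (
        f"{item_id} has conflicting {field} values.\n"
        f"- {source_a}: {value_a}\n"
        f"- {source_b}: {value_b}\n"
        f"Which {field} should be recorded? The tracker stays unchanged until you confirm."
    )
    return subject, body


subject, body = conflict_email(
    "AI-007", "deadline",
    "progress_tracker (Google Sheets)", "2025-03-20",
    'audio clip 1, Zhao: "I can get this done by Friday"', "2025-03-21",
)
```

Quoting the spoken phrase next to the resolved date keeps the original wording and its interpretation separate, so the recipient can spot a misreading.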

File Conventions

  • All output files go to workspace/
  • input/recordings/ and input/slides/ are read-only; do not modify them
  • Use snake_case naming: meeting_notes.md, action_items_update.csv, next_meeting_agenda.md
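
These conventions can be enforced with a small path guard. The sketch below is an assumption about how one might do it, not part of the task framework; `output_path` is a hypothetical helper, and the snake_case pattern is one reasonable interpretation of the naming rule:

```python
import re
from pathlib import Path

# Lowercase words joined by underscores, followed by an extension.
SNAKE_CASE_FILE = re.compile(r"[a-z0-9]+(_[a-z0-9]+)*\.[a-z0-9]+")


def output_path(workspace: str, filename: str) -> Path:
    """Resolve an output path, rejecting escapes, input/, and bad names."""
    root = Path(workspace).resolve()
    target = (root / filename).resolve()
    if not target.is_relative_to(root):
        raise ValueError(f"{filename!r} escapes the workspace")
    if target.is_relative_to(root / "input"):
        raise ValueError("input/ is read-only")
    if not SNAKE_CASE_FILE.fullmatch(target.name):
        raise ValueError(f"{target.name!r} is not a snake_case file name")
    return target
```

For example, `output_path("/workspace", "meeting_notes.md")` returns the resolved path, while a target under `input/` raises `ValueError`. `Path.is_relative_to` requires Python 3.9 or later.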
SOUL.md

Soul: Core Values & Operating Principles

Detail-Oriented

  • You catch inconsistencies others miss: numbers that don't add up, mismatches between what was said aloud and what's written on the slides, notes scrawled in the corner of a whiteboard.
  • Proactive, no need to be nudged. When a notification comes in, you check everything relevant while you're at it.

Know Your Boundaries

  • Spotting problems is your job, making the call is the professor's.
  • You don't make decisions for others, and you don't dodge issues with silence.
  • Professional in external communication, direct in internal reporting; no beating around the bush.

Work Style

  • Finish what's in front of you before stopping: no procrastinating, no scope creep.
  • When given a task, go through all related materials first, then act. Don't discover halfway through that you missed something.
  • When information is insufficient, say so. Don't fill gaps with guesses, especially for charts, legends, and unclear text.

Communication

  • If something can be said in one sentence, don't stretch it to three.
  • When there's a conflict, name it clearly โ€” state what conflicts with what, and cite your sources.
  • Format is a tool, not a habit. In everyday exchanges, write complete sentences, not bullet-point dumps.

Trust & Integrity

  • Lab members share their progress updates, drafts, and recordings with you because they trust you to handle them properly.
  • For external actions (sending emails, replying to inquiries), always pause and ask: should I send this? Is the timing right?
  • For internal actions (reading files, transcribing, checking, organizing), go ahead confidently; this is your domain.

Data Integrity

  • Never fabricate information. If a chart legend is missing, say it's missing; do not guess which line is which model.
  • Never silently correct data. If you find a discrepancy, report it explicitly.
  • Record only confirmed facts. Illegible text must be listed separately and marked "pending confirmation."
TOOLS.md

Tool Environment

This task runs on top of MMClawMark's real environment adapters, not a task-local mock API.

Email

Use email for all live communication in this task.

Feishu / IM

  • There is no live Feishu MCP in this adapted task.
  • All communication that would go through Feishu in the original scenario is handled via email instead.
  • When the task mentions "Feishu message," treat it as email communication.

Audio / STT

  • There is no dedicated STT tool in this adapted task.
  • Audio clip transcripts are delivered via email from Prof. Chen at Stage 0.
  • The .m4a audio files remain as reference material in input/recordings/.

Notion

  • Access Notion via the bundled notion skill.
  • The framework creates a fresh page and inline databases at Stage 0.
  • Databases:
    • action_items: for action item tracking

Expected schema for action_items:

  • item_id (title)
  • owner
  • task
  • status (select: open, in_progress, delayed, needs_confirmation, done, blocked)
  • deadline
  • notes

Google Sheets

  • Access Google Sheets via the bundled google_sheets skill using /root/.google/credentials.json.
  • The framework creates one spreadsheet at Stage 0:
    • progress_tracker: task status and deadlines for each member

File System

  • /workspace/input/ is read-only seeded input.
  • /workspace/ is the writable working directory for outputs.
  • input/recordings/ contains the audio clips and the whiteboard photo.
  • input/slides/ contains the meeting slides PDF.
  • Files may be injected by the framework in later stages.
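
Because later stages may add files without a matching notification, one option is to snapshot the writable directory at the end of each stage and diff it at the start of the next. A minimal sketch using only the standard library; `snapshot` and `diff_snapshots` are hypothetical helpers:

```python
from pathlib import Path


def snapshot(root: str) -> dict[str, tuple[int, int]]:
    """Map each file path (relative to root) to its (size, mtime_ns)."""
    base = Path(root)
    result = {}
    for path in base.rglob("*"):
        if path.is_file():
            stat = path.stat()
            result[path.relative_to(base).as_posix()] = (stat.st_size, stat.st_mtime_ns)
    return result


def diff_snapshots(before: dict, after: dict) -> tuple[list[str], list[str], list[str]]:
    """Return (added, changed, removed) relative paths, each sorted."""
    added = sorted(after.keys() - before.keys())
    removed = sorted(before.keys() - after.keys())
    changed = sorted(
        key for key in before.keys() & after.keys() if before[key] != after[key]
    )
    return added, changed, removed
```

A file such as `related_work_draft.docx` appearing in `added` at the start of a stage is a cue to open and check it before trusting any status change that refers to it.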

PDF / Image Reading

  • You may inspect PDF and image files through the agent's normal file-reading / multimodal capabilities.
  • The slides PDF contains 4 pages with experiment results, action items, learning curves, and a timeline.
  • The whiteboard photo contains handwritten notes that must be visually inspected.
USER.md

User Profile: Prof. Chen Mingyu

Background

  • University Professor, NLP/AI research
  • Lab Director with 2 PhD students (Li Ming, Zhao) and 1 Master's student (Wang)
  • Currently preparing for ICLR 2026 submission

Communication Preferences

  • Primary channel: Email
  • Prefers concise, structured messages
  • When there's a conflict, wants both sides clearly stated with sources
  • Expects proactive anomaly detection: when he says "organize the meeting notes," he means you should independently check all materials and report any conflicts

Decision Authority

  • Prof. Chen is the final decision maker on:
    • Deadline changes (do not modify deadlines unilaterally)
    • Status changes for disputed items
    • Approving or rejecting member claims that conflict with records
  • You must escalate conflicting information rather than resolving it yourself

Team Members

  • Zhao (Xiao Zhao): PhD student, responsible for ablation experiments (AI-007). Reliable but sometimes gives verbal deadlines that differ from written ones.
  • Li Ming: Senior PhD student, responsible for related work section (AI-008). Thorough but occasionally runs behind schedule.
  • Wang: Master's student, responsible for MNLI baseline (AI-009). Less experienced, sometimes misses meetings.
task_checker.py
# ── Checker Functions ─────────────────────────────────────────────

# -- S0: Meeting Notes Organization -- (6 core checks)


async def _s0_outputs_exist(ctx) -> bool:
    """workspace/meeting_notes.md and workspace/action_items_update.csv both exist."""
    mn = _read_file_from_workspace(ctx, "meeting_notes.md")
    ai = _read_file_from_workspace(ctx, "action_items_update.csv")
    return bool(mn) and bool(ai)


async def _s0_deadline_conflict_flagged(ctx) -> bool:
    """action_items_update.csv AI-007 status = needs_confirmation."""
    rows = _read_csv_from_workspace(ctx, "action_items_update.csv")
    row = _find_csv_row(rows, "AI-007")
    if not row:
        return False
    return row.get("status", "").strip().lower() == "needs_confirmation"


async def _s0_ai007_deadline_unchanged(ctx) -> bool:
    """action_items_update.csv AI-007 deadline = 2025-03-20 (not changed)."""
    rows = _read_csv_from_workspace(ctx, "action_items_update.csv")
    row = _find_csv_row(rows, "AI-007")
    if not row:
        return False
    return "2025-03-20" in row.get("deadline", "")


async def _s0_ai008_needs_confirmation(ctx) -> bool:
    """action_items_update.csv AI-008 status = needs_confirmation AND deadline = 2025-03-21."""
    rows = _read_csv_from_workspace(ctx, "action_items_update.csv")
    row = _find_csv_row(rows, "AI-008")
    if not row:
        return False
    status_ok = row.get("status", "").strip().lower() == "needs_confirmation"
    deadline_ok = "2025-03-21" in row.get("deadline", "")
    return status_ok and deadline_ok


async def _s0_ai009_not_flagged(ctx) -> bool:
    """action_items_update.csv AI-009 status in {open, in_progress} (negative case)."""
    rows = _read_csv_from_workspace(ctx, "action_items_update.csv")
    row = _find_csv_row(rows, "AI-009")
    if not row:
        # Per AGENTS.md, all items should be included (full snapshot).
        # Missing AI-009 means the agent didn't follow the spec → fail.
        return False
    status = row.get("status", "").strip().lower()
    return status in ("open", "in_progress")


async def _s0_conditionE_ai010_created(ctx) -> bool:
    """action_items_update.csv has AI-010 row with owner=Zhao and status=open."""
    rows = _read_csv_from_workspace(ctx, "action_items_update.csv")
    row = _find_csv_row(rows, "AI-010")
    if not row:
        return False
    owner = row.get("owner", "").strip().lower()
    status = row.get("status", "").strip().lower()
    return "zhao" in owner and status == "open"


# -- S1: Deadline Check + Progress Update -- (2 core checks)


async def _s1_number_change_recorded(ctx) -> bool:
    """action_items_update.csv (S1) AI-007 status=needs_confirmation AND notes contain '79.1'."""
    rows = _read_csv_from_workspace(ctx, "action_items_update.csv")
    row = _find_csv_row(rows, "AI-007")
    if not row:
        return False
    status_ok = row.get("status", "").strip().lower() == "needs_confirmation"
    notes = row.get("notes", "")
    notes_ok = "79.1" in notes
    return status_ok and notes_ok


async def _s1_ai008_deadline_updated(ctx) -> bool:
    """action_items_update.csv (S1) AI-008 status=in_progress AND deadline=2025-03-24."""
    rows = _read_csv_from_workspace(ctx, "action_items_update.csv")
    row = _find_csv_row(rows, "AI-008")
    if not row:
        return False
    status = row.get("status", "").strip().lower()
    deadline = row.get("deadline", "").strip()
    return status == "in_progress" and "2025-03-24" in deadline


# -- S2: Next Meeting Preparation -- (3 core checks)


async def _s2_agenda_exists(ctx) -> bool:
    """workspace/next_meeting_agenda.md exists and is non-empty."""
    content = _read_file_from_workspace(ctx, "next_meeting_agenda.md")
    return bool(content and len(content.strip()) > 20)


async def _s2_conditionE_result_updated(ctx) -> bool:
    """action_items_update.csv (S2) AI-010 notes contain '77.8' AND status=done."""
    rows = _read_csv_from_workspace(ctx, "action_items_update.csv")
    row = _find_csv_row(rows, "AI-010")
    if not row:
        return False
    notes = row.get("notes", "")
    status = row.get("status", "").strip().lower()
    return "77.8" in notes and status == "done"


async def _s2_ai008_flagged(ctx) -> bool:
    """action_items_update.csv (S2) AI-008 status=needs_confirmation AND notes reference citation issue."""
    rows = _read_csv_from_workspace(ctx, "action_items_update.csv")
    row = _find_csv_row(rows, "AI-008")
    if not row:
        return False
    status = row.get("status", "").strip().lower()
    if status != "needs_confirmation":
        return False
    notes = row.get("notes", "").lower()
    # Check for citation-related keywords in notes
    citation_indicators = [
        "wang", "citation", "reference", "missing", "2024",
        "bibliography", "absent", "docx", "draft", "cite",
    ]
    return any(ind in notes for ind in citation_indicators)


# ── RUBRIC ────────────────────────────────────────────────────────

RUBRIC = {
    "stage0": [
        {"id": "S0_outputs_exist", "checker": _s0_outputs_exist, "weight": 1.0},
        {"id": "S0_deadline_conflict_flagged", "checker": _s0_deadline_conflict_flagged, "weight": 2.0},
        {"id": "S0_ai007_deadline_unchanged", "checker": _s0_ai007_deadline_unchanged, "weight": 1.5},
        {"id": "S0_ai008_needs_confirmation", "checker": _s0_ai008_needs_confirmation, "weight": 1.5},
        {"id": "S0_ai009_not_flagged", "checker": _s0_ai009_not_flagged, "weight": 1.0},
        {"id": "S0_conditionE_ai010_created", "checker": _s0_conditionE_ai010_created, "weight": 2.5},
    ],
    "stage1": [
        {"id": "S1_number_change_recorded", "checker": _s1_number_change_recorded, "weight": 2.0},
        {"id": "S1_ai008_deadline_updated", "checker": _s1_ai008_deadline_updated, "weight": 1.5},
    ],
    "stage2": [
        {"id": "S2_agenda_exists", "checker": _s2_agenda_exists, "weight": 1.0},
        {"id": "S2_conditionE_result_updated", "checker": _s2_conditionE_result_updated, "weight": 1.5},
        {"id": "S2_ai008_flagged", "checker": _s2_ai008_flagged, "weight": 2.0},
    ],
}

# TODO: LLM-as-judge bonus checkers
# S0_whiteboard_blurry_text: Does meeting_notes record "ESNLI baseline"?
# S0_slide3_legend_response: Does reply to Wang explain legend is missing?
# S0_chen_table1_noted: Does meeting_notes record Prof. Chen's Table 1 question?
# S0_prof_notified_conflict: Does notification clearly state both dates?
task_progress.py
"""Group meeting assistant: multimodal research assistant task.

Environments: filesystem, email, notion, google_sheets
3 stages: meeting notes organization → deadline check + progress → next meeting prep
11 core checkers (0 keyword-search)

Adaptation notes:
- No STT manager: audio transcripts delivered via email from Prof. Chen
- No Feishu/IM manager: all communication via email
- Audio .m4a files remain as reference material in input/recordings/
- Whiteboard photo is a pure visual trap (image only)
"""
from __future__ import annotations

import csv
from io import StringIO

# ── Constants ─────────────────────────────────────────────────────

ACTION_ITEMS_DB_NAME = "action_items"

ACTION_ITEMS_DB_SCHEMA = {
    "item_id": {"title": {}},
    "owner": {"rich_text": {}},
    "task": {"rich_text": {}},
    "status": {"select": {"options": [
        {"name": "open"}, {"name": "in_progress"},
        {"name": "delayed"}, {"name": "needs_confirmation"},
        {"name": "done"}, {"name": "blocked"},
    ]}},
    "deadline": {"rich_text": {}},
    "notes": {"rich_text": {}},
}

PROGRESS_HEADER = ["item_id", "owner", "task", "status", "deadline", "notes"]
PROGRESS_ROWS = [
    ["AI-007", "Zhao", "Run ablation experiments (Condition A-D)", "in_progress", "2025-03-20", ""],
    ["AI-008", "Li Ming", "Write related work section draft", "in_progress", "2025-03-21", ""],
    ["AI-009", "Wang", "Run MNLI baseline", "open", "2025-03-31", ""],
]

# Audio transcript content (delivered via email since no STT)
AUDIO_TRANSCRIPTS = """Audio Clip Transcripts from Week 12 Group Meeting (2025-03-19):

Clip 1 (audio_clip_1_zhao_deadline.m4a) \u2014 Zhao's ablation progress report:
Zhao: "The ablation experiments for Conditions A through D are going well. I can get this done by Friday."
[Note: Friday = March 21, 2025]

Clip 2 (audio_clip_2_li_deadline.m4a) \u2014 Li Ming's related work progress:
Prof. Chen: "That 78.3% in Table 1 \u2014 is it fine-tune or zero-shot? The slide doesn't say."
Li Ming: "I still have two papers to cover for related work. I probably won't be ready until next week."
[Note: Li Ming's current deadline in progress tracker is this Friday, March 21]

Clip 3 (audio_clip_3_chen_slide3.m4a) \u2014 Discussion of slide 3:
Prof. Chen: "The gap between those two curves in slide 3 doesn't look large \u2014 may not be statistically significant. Let's revisit this next week."

Clip 4 (audio_clip_4_iclr_confirm.m4a) \u2014 End of meeting:
Prof. Chen: "So we're all confirmed on the ICLR 2026 internal deadline \u2014 April 20. No extensions."
All: [confirmed]"""

# ── Helpers ───────────────────────────────────────────────────────


def _notion_title(value: str) -> dict:
    return {"title": [{"text": {"content": value}}]}


def _notion_text(value: str) -> dict:
    return {"rich_text": [{"text": {"content": value}}]}


def _notion_select(value: str) -> dict:
    return {"select": {"name": value}}


def _notion_number(value) -> dict:
    return {"number": value}


def _read_csv_from_workspace(ctx, filename: str) -> list[dict]:
    """Read a CSV from the agent's workspace, checking multiple locations."""
    for base in (ctx.workspace / "outputs", ctx.workspace):
        path = base / filename
        if path.exists():
            text = path.read_text(encoding="utf-8-sig")
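            # restval="" fills missing trailing columns with "" instead of None,
            # so checkers can call .strip() / .lower() on short rows safely.
            return list(csv.DictReader(StringIO(text), restval=""))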
    return []


def _find_csv_row(rows: list[dict], item_id: str) -> dict | None:
    """Find a CSV row by item_id (case-insensitive partial match)."""
    for row in rows:
        val = row.get("item_id", "").strip()
        if item_id.lower() in val.lower():
            return row
    return None


def _get_notion_field(row: dict, field: str, field_type: str = "rich_text") -> str:
    """Extract a field value from a Notion query result row."""
    props = row.get("properties", {})
    prop = props.get(field, {})
    if field_type == "title":
        parts = prop.get("title", [])
        return "".join(t.get("plain_text", "") for t in parts)
    elif field_type == "rich_text":
        parts = prop.get("rich_text", [])
        return "".join(t.get("plain_text", "") for t in parts)
    elif field_type == "select":
        sel = prop.get("select", {})
        return sel.get("name", "") if sel else ""
    elif field_type == "number":
        return prop.get("number", 0)
    return ""


def _read_file_from_workspace(ctx, filename: str) -> str:
    """Read a file from the agent's workspace, checking multiple locations."""
    for base in (ctx.workspace / "outputs", ctx.workspace):
        path = base / filename
        if path.exists():
            return path.read_text(encoding="utf-8-sig")
    return ""


# ── METADATA ──────────────────────────────────────────────────────

METADATA = {
    "id": "research_assistant_task7",
    "name": "Group Meeting Assistant \u2014 Note Extraction & Action Tracking",
    "category": "research_assistant",
    "environments": ["filesystem", "email", "notion", "google_sheets"],
    "timeout_seconds": 600,
    "difficulty": "hard",
    "mm_level": "L4",
    "role": "Prof. Chen Mingyu's lab meeting assistant",
    "tags": [
        "meeting-notes", "action-tracking", "cross-modal-verification",
        "multimodal", "silent-event", "red-line", "visual-perception",
        "whiteboard-ocr", "deadline-conflict",
    ],
    "env_config": {
        "email": {
            "users": {
                "assistant": {"email": "[email protected]", "password": "assistant_pwd"},
                "prof_chen": {"email": "[email protected]", "password": "prof_chen_pwd"},
                "zhao": {"email": "[email protected]", "password": "zhao_pwd"},
                "li_ming": {"email": "[email protected]", "password": "li_ming_pwd"},
                "wang": {"email": "[email protected]", "password": "wang_pwd"},
            },
        },
        "google_sheets": {
            "task_id": "research_assistant_task7",
        },
    },
}

PROMPT = "Check your email and workspace for meeting materials to organize."


# ── Stage Functions ───────────────────────────────────────────────

async def stage0(ctx):
    """Wednesday 2025-03-19: Meeting notes organization."""
    # 1. Upload all assets (personality .md + input materials)
    await ctx.fs.upload_dir(ctx.task_dir / "assets", "/workspace")

    # 2. Create Notion page + databases
    await ctx.notion.create_page("Lab Meeting Notes \u2014 Prof. Chen NLP Group")
    await ctx.notion.create_database(ACTION_ITEMS_DB_NAME, ACTION_ITEMS_DB_SCHEMA)

    # 3. Seed Notion with existing action items (AI-007, AI-008, AI-009)
    for row in PROGRESS_ROWS:
        await ctx.notion.add_database_row(ACTION_ITEMS_DB_NAME, {
            "item_id": _notion_title(row[0]),
            "owner": _notion_text(row[1]),
            "task": _notion_text(row[2]),
            "status": _notion_select(row[3]),
            "deadline": _notion_text(row[4]),
            "notes": _notion_text(row[5]),
        })

    # 4. Create Google Sheet progress_tracker with seed data
    sheet_info = await ctx.google_sheets.create_spreadsheet("progress_tracker")
    sheet_id = sheet_info["sheet_id"]
    await ctx.google_sheets.update_values(
        sheet_id, "Sheet1!A1:F4",
        [PROGRESS_HEADER] + PROGRESS_ROWS,
    )

    # 5. Seed email: Prof. Chen sends meeting materials + audio transcripts
    await ctx.email.send_email(
        from_user="prof_chen",
        to="[email protected]",
        subject="Week 12 Meeting Materials \u2014 Please Organize",
        body=(
            "Please organize the meeting notes and follow up on action items.\n\n"
            "Slides are in input/slides/week12_slides.pdf (4 pages).\n"
            "Whiteboard photo and audio recordings are in input/recordings/.\n\n"
            "Here are the audio clip transcripts:\n\n"
            + AUDIO_TRANSCRIPTS
        ),
    )

    # 6. Seed email: Wang's question about slide 3
    await ctx.email.send_email(
        from_user="wang",
        to="[email protected]",
        subject="Question about slide 3",
        body=(
            "I missed today's meeting. I looked at the slides \u2014 "
            "what is the y-axis unit in slide 3? Which line corresponds to which model?"
        ),
    )

    # 7. Notification
    return {
        "notification": (
            "[March 19, Wednesday, after the group meeting] "
            "Prof. Chen has sent you an email with meeting materials and audio transcripts. "
            "Wang also emailed you about the slides. "
            "Please organize meeting notes, track action items, and reply to Wang.\n\n"
            "Your email is [email protected].\n"
            "Prof. Chen: [email protected]\n"
            "Zhao: [email protected]\n"
            "Li Ming: [email protected]\n"
            "Wang: [email protected]\n\n"
            "Action items database is in Notion (action_items).\n"
            "Progress tracker is in Google Sheets (progress_tracker).\n\n"
            "All input materials are in /workspace/input/, including:\n"
            "- Slides: input/slides/week12_slides.pdf\n"
            "- Whiteboard photo: input/recordings/whiteboard_photo.jpg\n"
            "- Audio recordings: input/recordings/*.m4a (transcripts in Prof. Chen's email)\n"
            "- Output directory: workspace/ (write all output files here)"
        ),
        "time": "2025-03-19T16:00:00+08:00",
    }


async def stage1(ctx):
    """Thursday 2025-03-20: Deadline check + progress update."""
    # 1. Loud: Zhao says ablation is done
    await ctx.email.send_email(
        from_user="zhao",
        to="[email protected]",
        subject="Ablation done \u2014 results in Notion",
        body=(
            "Ablation is done! I've recorded the results in Notion.\n"
            "Table 1 results: 78.3% / 76.1% / 79.1% / 72.1%\n"
            "All four conditions complete."
        ),
    )

    # 2. Loud: Li Ming says extension approved
    await ctx.email.send_email(
        from_user="li_ming",
        to="[email protected]",
        subject="Related work update",
        body=(
            "Still one paper short on related work. "
            "Definitely done by Monday (3/24). Prof. Chen approved the extension."
        ),
    )

    # 3. Silent: Zhao updates AI-007 in Notion to done with new numbers
    rows = await ctx.notion.query_db(ACTION_ITEMS_DB_NAME)
    for row in rows:
        item_id = _get_notion_field(row, "item_id", "title")
        if "AI-007" in item_id:
            await ctx.notion.update_db_row(row["id"], {
                "status": _notion_select("done"),
                "notes": _notion_text(
                    "Ablation complete. Results: 78.3% / 76.1% / 79.1% / 72.1%"
                ),
            })
            break

    # 4. Notification
    return {
        "notification": (
            "[March 20, Thursday] You have new emails from Zhao and Li Ming."
        ),
        "time": "2025-03-20T10:00:00+08:00",
    }


async def stage2(ctx):
    """Monday 2025-03-24: Next week meeting preparation."""
    # 1. Loud: Prof. Chen email about next week's agenda
    await ctx.email.send_email(
        from_user="prof_chen",
        to="[email protected]",
        subject="Next week's meeting agenda",
        body=(
            "Next week's meeting agenda: ICLR timeline + Condition E results.\n"
            "Please prepare the agenda document."
        ),
    )

    # 2. Loud: Zhao reports Condition E results
    await ctx.email.send_email(
        from_user="zhao",
        to="[email protected]",
        subject="Condition E results",
        body=(
            "Condition E is done! acc = 77.8%, recorded in Notion.\n"
            "The external knowledge graph integration helped a bit."
        ),
    )

    # 3. Silent: Li Ming uploads related_work_draft.docx to workspace
    await ctx.fs.upload_file(
        ctx.task_dir / "inject" / "related_work_draft.docx",
        "/workspace/related_work_draft.docx",
    )

    # 4. Silent: Li Ming marks AI-008 as done in Notion
    rows = await ctx.notion.query_db(ACTION_ITEMS_DB_NAME)
    for row in rows:
        item_id = _get_notion_field(row, "item_id", "title")
        if "AI-008" in item_id:
            await ctx.notion.update_db_row(row["id"], {
                "status": _notion_select("done"),
                "notes": _notion_text("Related work draft uploaded to workspace/"),
            })
            break

    # 5. Silent: Update AI-010 in Notion with Condition E result
    for row in rows:
        item_id = _get_notion_field(row, "item_id", "title")
        if "AI-010" in item_id:
            await ctx.notion.update_db_row(row["id"], {
                "status": _notion_select("done"),
                "notes": _notion_text("Condition E acc = 77.8%"),
            })
            break

    # 6. Notification
    return {
        "notification": (
            "[March 24, Monday] You have new emails from Prof. Chen and Zhao. "
            "Please check for any file system changes and prepare next week's meeting agenda."
        ),
        "time": "2025-03-24T09:00:00+08:00",
    }