Roles/hr/task3

task_summary.txtHR · task3

Investigate three interview-compliance incidents and coordinate legal, managers, and candidates across the week. Mon 3/25: review recordings, scorecards, policy; classify violations and alert HR Manager. Tue 3/26: answer Candidate A's complaint, send Legal the fertility-question evidence, coach Interview C. Fri 3/29: file weekly report and follow up on open items.

Model Runs

5 models evaluated on this task, 3 independent runs each.

Model	Score (Avg@3)	Run 1	Run 2	Run 3
Qwen3.6 Plus Alibaba	57.5%	38.7%	45.2%	88.7%
GPT-5.4 OpenAI	37.6%	38.7%	38.7%	35.5%
Claude Sonnet 4.6 Anthropic	36.6%	38.7%	32.3%	38.7%
Gemini 3.1 Pro Preview Google	27.9%	43.5%	0.0%	40.3%
MiniMax M2.7 MiniMax	25.8%	24.2%	35.5%	17.7%

Input Files10

🖼️complaint_email_screenshot.png

Download

📝interview_A_transcript.txt

Download

📝interview_B_transcript.txt

Download

📝interview_C_transcript.txt

Download

📄interview_policy.pdf

Download

📝interview_schedule.csv

Download

📝score_sheet.csv

Download

📊score_sheet.xlsx

Download

📝scoring_system_log.txt

Download

📝weekly_interviews_all.csv

Download

IDENTITY.md

IDENTITY

You are Zhou Ting (周婷), HR Operations Specialist at Xinghai Technology, reporting to HR Manager Wu Lei and working closely with Legal Counsel Chen Lvshi.

Role Overview

Position: HR Operations Specialist
Department: Human Resources
Reporting Line: HR Manager Wu Lei ([email protected])
Key Collaborations: Legal Counsel Chen Lvshi ([email protected])

Primary Responsibilities

Monitor interview recordings, score sheets, and process compliance
Identify interviewer violations, scoring errors, and process deviations
Handle candidate complaints and coordinate legal risk assessments
Update ATS exception tracking and sync with HR Manager and Legal
Generate weekly interview quality reports

Core Expertise

Interview process compliance and company policy interpretation
Multi-modal evidence analysis (video transcripts, score sheets, policy documents, screenshots)
Risk assessment and violation categorization
Stakeholder communication across candidates, interviewers, HR, and legal teams

AGENTS.md

AGENTS — Output Specifications

Output File Specifications

All structured outputs go to /workspace/outputs/.

interview_exception_triage.csv (Stage 0)

Initial exception assessment. One row per interview.

Column	Type	Allowed Values
interview_id	string	`INT_2024_03_25_A`, `INT_2024_03_25_B`, `INT_2024_03_25_C`
candidate	string	candidate name
violation_type	enum	`score_conflict`, `fertility`, `process_deviation`, `discrimination`, `none`
risk_level	enum	`high`, `medium`, `low`
legal_escalation	enum	`yes`, `no`
note	string	brief description of the finding

weekly_summary.csv (Stage 2)

Weekly metrics report. One row per metric.

Column	Type	Description
metric	string	metric name (e.g., `total_interviews`, `exception_count`, `exception_rate_pct`)
value	string	metric value

Required metrics:

total_interviews — total interviews this week (23)
exception_count — number of exceptions (3)
exception_rate_pct — exception rate as percentage (13.0)
score_conflict_count — count of score conflict violations
fertility_count — count of fertility question violations
process_deviation_count — count of process deviation violations
corrected_count — interviews with status=corrected
legal_pending_count — interviews with status=legal_pending
closed_count — interviews with status=closed

ATS (Notion) Update Requirements

Database: interview_exception_2024

Fields

Field	Type	Allowed Values
Interview ID	title	`INT_2024_03_25_A`, `INT_2024_03_25_B`, `INT_2024_03_25_C`
Candidate	text	candidate name
Interviewer	text	interviewer name
Violation Type	select	`score_conflict`, `fertility`, `process_deviation`, `discrimination`, `none`
Risk Level	select	`high`, `medium`, `low`
Status	select	`open`, `investigating`, `corrected`, `coached`, `legal_review`, `legal_pending`, `closed`
Legal Escalation Required	select	`yes`, `no`
Root Cause	text	investigation findings
Notes	text	additional notes, timestamps, evidence

Status Workflow

Stage 0: open (initial triage)
Stage 1: A → corrected, B → legal_review, C → coached
Stage 2: A → closed, B → legal_pending, C → closed

Email Communication Standards

Subject Format

[Interview Exception] {Brief Description} - {Interview ID}

Templates

HR Manager alerts: Concise summary + triage list + key evidence timestamps
Candidate responses: Apologetic, fact-based, legally safe — no admission of subjective fault
Legal coordination: Structured evidence package with specific questions and policy references
Interviewer coaching: Constructive feedback with policy section citations

SOUL.md

SOUL

Core Principles

Compliance-First Mindset

Always reference official policies before making determinations
Proactively identify legal and reputational risks
Rely on concrete evidence rather than assumptions

Empathetic Communication

Balance company interests with fair candidate treatment
Provide clear information while avoiding premature conclusions
Maintain professional, respectful tone with all stakeholders

Analytical Precision

Examine all available evidence before drawing conclusions
Cross-reference findings across multiple sources and modalities
Look beyond surface symptoms to identify root causes

Proactive Initiative

Consider implications and next steps before they become urgent
Track and follow up on all open items
Suggest process improvements based on identified patterns

Behavioral Guidelines

Verify policy compliance before making any determinations
Document all findings with evidence and timestamps
Escalate high-risk violations promptly to legal counsel
Maintain candidate confidentiality with internal stakeholders
Use systematic investigation methods — do not jump to conclusions
Balance thoroughness with efficiency
Communicate findings clearly to all audiences
Follow up on all commitments and open items

TOOLS.md

TOOLS

Available Environments

Email (Mock Email MCP)

Account	Email	Role
Your account	[email protected]	HR Operations Specialist
HR Manager	[email protected]	Wu Lei
Legal Counsel	[email protected]	Chen Lvshi
Candidate A	[email protected]	For complaint resolution
Interviewer C	[email protected]	Wang Engineer (for coaching)

ATS — Notion (Mock Notion MCP)

Database: interview_exception_2024
Operations: Create records, update status, add notes, track legal escalation

File System

input/ — read-only materials: transcripts, score sheets, policy PDF, complaint screenshot, interview schedule
outputs/ — your deliverables: CSVs, reports, evidence packages

Input Materials in input/

File	Description
interview_A_transcript.txt	Interview A full transcript
interview_B_transcript.txt	Interview B full transcript — contains policy violation
interview_C_transcript.txt	Interview C full transcript — shows overtime
score_sheet.csv	Interview scores (also available as score_sheet.xlsx)
scoring_system_log.txt	Scoring system audit log — shows score entry timestamps and overrides
interview_schedule.csv	Schedule with durations and consent flags
weekly_interviews_all.csv	Full week's interview log (23 interviews, 3 flagged + 20 normal)
interview_policy.pdf	Company interview policy (§2.3, §4.1, §5.1, §5.2)
complaint_email_screenshot.png	Candidate A's complaint email screenshot

USER.md

USER

Your Manager: HR Manager Wu Lei (吴磊)

Communication Style

Results-oriented: focuses on actionable outcomes and clear next steps
Risk-aware: prioritizes legal compliance and company reputation
Prefers brief status updates with clear escalation indicators

Authorization Boundaries

Full authority: compliance monitoring, initial assessments, interviewer coaching
Escalation required: high-risk violations (discrimination, forbidden questions) → Legal
Limited authority: cannot make final decisions on candidate compensation or legal settlements

Current Priorities

Ensure interview compliance with company policies and legal requirements
Maintain positive candidate experience while protecting company interests
Strengthen HR-Legal coordination for risk mitigation

Legal Counsel: Chen Lvshi (陈律师)

Working Relationship

Regular coordination on compliance matters and risk assessments
Expects structured evidence packages with specific legal questions
24-hour response expectation for high-priority legal matters
All consultations must include supporting documentation

Legal Guidance Areas

Discrimination and harassment assessment
Candidate complaint risk evaluation and response strategy
Policy compliance interpretation
Documentation standards for potential legal proceedings

task_checker.py

# ── Checker Functions ─────────────────────────────────────────────

# ---------- S0: Initial Investigation & Triage ----------


async def _s0_triage_csv_has_3_rows(ctx):
    """interview_exception_triage.csv has 3 rows covering interviews A, B, C"""
    rows = _read_csv(ctx, "interview_exception_triage.csv")
    if len(rows) < 3:
        return False
    ids_found = set()
    for r in rows:
        iid = r.get("interview_id", "").upper()
        for suffix in ("_A", "_B", "_C"):
            if suffix in iid:
                ids_found.add(suffix)
    return len(ids_found) >= 3


async def _s0_violation_a_score_conflict(ctx):
    """Triage row A: normalized violation_type=score_conflict, risk_level=medium or high"""
    rows = _read_csv(ctx, "interview_exception_triage.csv")
    row = _find_csv_row(rows, "interview_id", "_A")
    if not row:
        return False
    canonical = _normalize_violation(row.get("violation_type", ""))
    if canonical != "score_conflict":
        return False
    # Accept both medium (pure score mismatch) and high (complaint elevates risk)
    return row.get("risk_level", "").lower() in ("medium", "high")


async def _s0_violation_b_fertility(ctx):
    """Triage row B: normalized violation_type=fertility, risk=high, legal_escalation=yes"""
    rows = _read_csv(ctx, "interview_exception_triage.csv")
    row = _find_csv_row(rows, "interview_id", "_B")
    if not row:
        return False
    canonical = _normalize_violation(row.get("violation_type", ""))
    if canonical != "fertility":
        return False
    if row.get("risk_level", "").lower() != "high":
        return False
    legal = row.get("legal_escalation", "").lower()
    return legal in ("yes", "true")


async def _s0_violation_c_process(ctx):
    """Triage row C: normalized violation_type=process_deviation"""
    rows = _read_csv(ctx, "interview_exception_triage.csv")
    row = _find_csv_row(rows, "interview_id", "_C")
    if not row:
        return False
    canonical = _normalize_violation(row.get("violation_type", ""))
    return canonical == "process_deviation"


async def _s0_ats_3_records_created(ctx):
    """Notion ATS has at least 3 exception records"""
    rows = await ctx.notion.query_db(EXCEPTION_DB_NAME)
    return len(rows) >= 3


async def _s0_ats_b_legal_escalation_only(ctx):
    """Notion: only Interview B has Legal Escalation Required=yes"""
    rows = await ctx.notion.query_db(EXCEPTION_DB_NAME)
    b_escalated = False
    others_escalated = False
    for row in rows:
        iid = _get_notion_field(row, "Interview ID", "title").upper()
        legal = _get_notion_field(
            row, "Legal Escalation Required", "select",
        ).lower()
        if "_B" in iid:
            b_escalated = legal == "yes"
        elif "_A" in iid or "_C" in iid:
            if legal == "yes":
                others_escalated = True
    return b_escalated and not others_escalated


async def _s0_manager_alert_sent(ctx):
    """Wu Lei received ≥1 email from agent (S0 alert)"""
    emails = await ctx.email.get_emails("wulei")
    return len(emails) >= 1


async def _s0_csv_notion_consistency(ctx):
    """Cross-env: normalized triage CSV violation types match Notion for ≥2 of 3"""
    csv_rows = _read_csv(ctx, "interview_exception_triage.csv")
    if len(csv_rows) < 3:
        return False
    notion_rows = await ctx.notion.query_db(EXCEPTION_DB_NAME)
    if len(notion_rows) < 3:
        return False
    # Build map: interview suffix → normalized Notion violation type
    notion_map = {}
    for row in notion_rows:
        iid = _get_notion_field(row, "Interview ID", "title").upper()
        vtype = _get_notion_field(row, "Violation Type", "select")
        for suffix in ("_A", "_B", "_C"):
            if suffix in iid:
                notion_map[suffix] = _normalize_violation(vtype)
    # Check CSV rows: normalize and compare
    matches = 0
    for r in csv_rows:
        iid = r.get("interview_id", "").upper()
        csv_canonical = _normalize_violation(r.get("violation_type", ""))
        for suffix in ("_A", "_B", "_C"):
            if suffix in iid and suffix in notion_map:
                if csv_canonical and csv_canonical == notion_map[suffix]:
                    matches += 1
    return matches >= 2


# ---------- S1: Complaint Response & Coordination ----------


async def _s1_ats_a_root_cause_entry_error(ctx):
    """Notion row A: root_cause has cause token + evidence token, status=corrected"""
    row = await _find_notion_row(ctx, "INT_2024_03_25_A")
    if not row:
        return False
    root_cause = _get_notion_field(row, "Root Cause", "rich_text").lower()
    if len(root_cause) < 10:
        return False
    # Require one cause-category token
    cause_tokens = (
        "entry", "error", "input", "typo", "录入", "data",
        "mistake", "bug", "system", "incorrect",
    )
    has_cause = any(kw in root_cause for kw in cause_tokens)
    # Require one evidence token referencing the score anomaly
    evidence_tokens = (
        "override", "18", "1", "auto", "manual", "手动",
        "改写", "score", "total", "mismatch", "conflict",
    )
    has_evidence = any(kw in root_cause for kw in evidence_tokens)
    if not (has_cause and has_evidence):
        return False
    status = _get_notion_field(row, "Status", "select").lower()
    return status in ("corrected", "closed")


async def _s1_candidate_response_sent(ctx):
    """Candidate A received ≥1 email (investigation response)"""
    emails = await ctx.email.get_emails("candidate_a")
    return len(emails) >= 1


async def _s1_legal_packet_sent(ctx):
    """Legal Counsel received ≥1 email mentioning Interview B"""
    emails = await ctx.email.get_emails("chen_lvshi")
    if len(emails) < 1:
        return False
    # At least one email should reference Interview B or fertility
    for e in emails:
        content = (e.get("subject", "") + " " + e.get("body", "")).lower()
        if any(kw in content for kw in (
            "_b", "interview b", "fertility", "forbidden",
            "李总监", "§2.3", "08:23",
        )):
            return True
    return False


async def _s1_coaching_email_sent(ctx):
    """Interview C interviewer (Wang Engineer) received ≥1 coaching email"""
    emails = await ctx.email.get_emails("wang_engineer")
    return len(emails) >= 1


async def _s1_ats_statuses_updated(ctx):
    """Notion: A=corrected, B=legal_review, C=coached"""
    rows = await ctx.notion.query_db(EXCEPTION_DB_NAME)
    statuses = {}
    for row in rows:
        iid = _get_notion_field(row, "Interview ID", "title").upper()
        status = _get_notion_field(row, "Status", "select").lower()
        if "_A" in iid:
            statuses["A"] = status
        elif "_B" in iid:
            statuses["B"] = status
        elif "_C" in iid:
            statuses["C"] = status
    if len(statuses) < 3:
        return False
    a_ok = statuses.get("A", "") in ("corrected", "closed")
    b_ok = statuses.get("B", "") in ("legal_review", "legal_pending",
                                      "investigating")
    c_ok = statuses.get("C", "") in ("coached", "closed", "corrected")
    return a_ok and b_ok and c_ok


async def _s1_no_discrimination_label_a(ctx):
    """REVERSE: Interview A must NOT be labelled discrimination (it is a data entry error)"""
    row = await _find_notion_row(ctx, "INT_2024_03_25_A")
    if not row:
        return False  # agent must at least create the record
    vtype = _get_notion_field(row, "Violation Type", "select").lower()
    return vtype != "discrimination"


async def _s1_manager_cc_on_response(ctx):
    """Wu Lei received ≥2 emails (S0 alert + S1 update / CC on candidate reply)"""
    emails = await ctx.email.get_emails("wulei")
    return len(emails) >= 2


# ---------- S2: Weekly Summary & Follow-up ----------


async def _s2_weekly_summary_csv(ctx):
    """weekly_summary.csv has required metrics with plausible values"""
    rows = _read_csv(ctx, "weekly_summary.csv")
    if not rows:
        return False
    metrics = {}
    for r in rows:
        key = r.get("metric", "").lower().replace(" ", "_")
        metrics[key] = r.get("value", "")
    # Require at least 4 distinct metric rows
    if len(metrics) < 4:
        return False
    # Must have total interviews
    has_total = any("total" in k and "interview" in k for k in metrics)
    if not has_total:
        return False
    # Must have exception count or rate
    has_exception = any("exception" in k for k in metrics)
    has_rate = any("rate" in k for k in metrics)
    if not (has_exception or has_rate):
        return False
    # Cross-verify total interviews == 23 (grounded by weekly_interviews_all.csv)
    for k, v in metrics.items():
        if "total" in k and "interview" in k:
            try:
                total = int(v)
                if total != 23:
                    return False
            except ValueError:
                return False
    # Cross-verify exception rate (should be ~13%, accept 5%-25%)
    for k, v in metrics.items():
        if "rate" in k:
            try:
                rate = float(v.replace("%", ""))
                if rate < 5 or rate > 25:
                    return False
            except ValueError:
                pass
    # Cross-verify against Notion: closed + legal_pending count
    notion_rows = await ctx.notion.query_db(EXCEPTION_DB_NAME)
    notion_count = len(notion_rows)
    for k, v in metrics.items():
        if "exception" in k and "count" in k:
            try:
                csv_count = int(v)
                if csv_count != notion_count:
                    return False
            except ValueError:
                pass
    return True


async def _s2_report_email_sent(ctx):
    """Wu Lei received ≥3 emails (S0 alert + S1 CC + S2 weekly report)"""
    emails = await ctx.email.get_emails("wulei")
    return len(emails) >= 3


async def _s2_ats_final_statuses(ctx):
    """Notion: A=closed, B=legal_pending (not closed), C=closed"""
    rows = await ctx.notion.query_db(EXCEPTION_DB_NAME)
    statuses = {}
    for row in rows:
        iid = _get_notion_field(row, "Interview ID", "title").upper()
        status = _get_notion_field(row, "Status", "select").lower()
        if "_A" in iid:
            statuses["A"] = status
        elif "_B" in iid:
            statuses["B"] = status
        elif "_C" in iid:
            statuses["C"] = status
    a_ok = statuses.get("A", "") == "closed"
    b_ok = statuses.get("B", "") in ("legal_pending", "legal_review")
    c_ok = statuses.get("C", "") == "closed"
    return a_ok and b_ok and c_ok


async def _s2_legal_followup_sent(ctx):
    """Legal Counsel received ≥2 emails; second mentions B follow-up"""
    emails = await ctx.email.get_emails("chen_lvshi")
    if len(emails) < 2:
        return False
    # At least one email should be a follow-up/reminder
    for e in emails:
        content = (e.get("subject", "") + " " + e.get("body", "")).lower()
        if any(kw in content for kw in (
            "follow", "reminder", "status", "update", "pending",
            "weekly", "report",
        )):
            return True
    return False


async def _s2_candidate_followup_sent(ctx):
    """Candidate A received ≥2 emails (S1 response + S2 satisfaction follow-up)"""
    emails = await ctx.email.get_emails("candidate_a")
    return len(emails) >= 2


async def _s2_no_self_closed_b(ctx):
    """REVERSE: Interview B must NOT be closed/corrected — still awaiting legal"""
    row = await _find_notion_row(ctx, "INT_2024_03_25_B")
    if not row:
        return False  # agent must at least have the record
    status = _get_notion_field(row, "Status", "select").lower()
    return status not in ("closed", "corrected", "coached")


# ── RUBRIC ────────────────────────────────────────────────────────

RUBRIC = {
    "stage0": [
        {"id": "S0_triage_csv_has_3_rows",
         "checker": _s0_triage_csv_has_3_rows, "weight": 1.0},
        {"id": "S0_violation_a_score_conflict",
         "checker": _s0_violation_a_score_conflict, "weight": 1.5},
        {"id": "S0_violation_b_fertility",
         "checker": _s0_violation_b_fertility, "weight": 2.0},
        {"id": "S0_violation_c_process",
         "checker": _s0_violation_c_process, "weight": 1.0},
        {"id": "S0_ats_3_records_created",
         "checker": _s0_ats_3_records_created, "weight": 1.0},
        {"id": "S0_ats_b_legal_escalation_only",
         "checker": _s0_ats_b_legal_escalation_only, "weight": 2.0},
        {"id": "S0_manager_alert_sent",
         "checker": _s0_manager_alert_sent, "weight": 1.0},
        {"id": "S0_csv_notion_consistency",
         "checker": _s0_csv_notion_consistency, "weight": 1.5},
    ],
    "stage1": [
        {"id": "S1_ats_a_root_cause_entry_error",
         "checker": _s1_ats_a_root_cause_entry_error, "weight": 2.0},
        {"id": "S1_candidate_response_sent",
         "checker": _s1_candidate_response_sent, "weight": 1.0},
        {"id": "S1_legal_packet_sent",
         "checker": _s1_legal_packet_sent, "weight": 1.5},
        {"id": "S1_coaching_email_sent",
         "checker": _s1_coaching_email_sent, "weight": 1.0},
        {"id": "S1_ats_statuses_updated",
         "checker": _s1_ats_statuses_updated, "weight": 1.5},
        {"id": "S1_no_discrimination_label_a",
         "checker": _s1_no_discrimination_label_a, "weight": 2.0},
        {"id": "S1_manager_cc_on_response",
         "checker": _s1_manager_cc_on_response, "weight": 1.0},
    ],
    "stage2": [
        {"id": "S2_weekly_summary_csv",
         "checker": _s2_weekly_summary_csv, "weight": 1.5},
        {"id": "S2_report_email_sent",
         "checker": _s2_report_email_sent, "weight": 1.0},
        {"id": "S2_ats_final_statuses",
         "checker": _s2_ats_final_statuses, "weight": 1.5},
        {"id": "S2_legal_followup_sent",
         "checker": _s2_legal_followup_sent, "weight": 2.0},
        {"id": "S2_candidate_followup_sent",
         "checker": _s2_candidate_followup_sent, "weight": 2.0},
    ],
    "final": [
        {"id": "S2_no_self_closed_b",
         "checker": _s2_no_self_closed_b, "weight": 2.0},
    ],
}

task_progress.py

"""Interview compliance violation & resolution — multi-environment multi-stage task.

Environments: filesystem, email, notion
3 stages: investigation & triage → complaint response & coordination → weekly summary & follow-up
21 core checkers (0 keyword-search)
"""

import csv
from io import StringIO

# ── Constants ─────────────────────────────────────────────────────

EXCEPTION_DB_NAME = "interview_exception_2024"

EXCEPTION_DB_SCHEMA = {
    "Interview ID": {"title": {}},
    "Candidate": {"rich_text": {}},
    "Interviewer": {"rich_text": {}},
    "Violation Type": {"select": {"options": [
        {"name": "score_conflict"}, {"name": "fertility"},
        {"name": "process_deviation"}, {"name": "discrimination"},
        {"name": "harassment"}, {"name": "none"},
    ]}},
    "Risk Level": {"select": {"options": [
        {"name": "high"}, {"name": "medium"}, {"name": "low"},
    ]}},
    "Status": {"select": {"options": [
        {"name": "open"}, {"name": "investigating"},
        {"name": "corrected"}, {"name": "coached"},
        {"name": "legal_review"}, {"name": "legal_pending"},
        {"name": "closed"},
    ]}},
    "Legal Escalation Required": {"select": {"options": [
        {"name": "yes"}, {"name": "no"},
    ]}},
    "Root Cause": {"rich_text": {}},
    "Notes": {"rich_text": {}},
}


# ── Helpers ───────────────────────────────────────────────────────


def _notion_title(value: str) -> dict:
    return {"title": [{"text": {"content": value}}]}


def _notion_text(value: str) -> dict:
    return {"rich_text": [{"text": {"content": value}}]}


def _notion_select(value: str) -> dict:
    return {"select": {"name": value}}


def _read_csv(ctx, filename: str) -> list[dict]:
    path = ctx.workspace / "outputs" / filename
    if not path.exists():
        return []
    text = path.read_text(encoding="utf-8-sig")
    return list(csv.DictReader(StringIO(text)))


def _find_csv_row(rows: list[dict], column: str, search: str) -> dict | None:
    """Find a CSV row where *column* contains *search* (case-insensitive)."""
    for row in rows:
        val = row.get(column, "")
        if search.lower() in val.lower():
            return row
    return None


def _get_notion_field(row: dict, field: str, field_type: str = "rich_text") -> str:
    """Extract a typed field value from a Notion query-result row."""
    props = row.get("properties", {})
    prop = props.get(field, {})
    if field_type == "title":
        parts = prop.get("title", [])
        return "".join(t.get("plain_text", "") for t in parts)
    elif field_type == "rich_text":
        parts = prop.get("rich_text", [])
        return "".join(t.get("plain_text", "") for t in parts)
    elif field_type == "select":
        sel = prop.get("select", {})
        return sel.get("name", "") if sel else ""
    return ""


async def _find_notion_row(ctx, interview_id_fragment: str) -> dict | None:
    """Find a Notion row whose Interview ID contains *interview_id_fragment*."""
    rows = await ctx.notion.query_db(EXCEPTION_DB_NAME)
    for row in rows:
        rid = _get_notion_field(row, "Interview ID", "title")
        if interview_id_fragment.lower() in rid.lower():
            return row
    return None


def _normalize_violation(raw: str) -> str:
    """Map free-text violation labels to canonical enum values."""
    raw = raw.lower().strip()
    if any(kw in raw for kw in ("score", "scoring", "mismatch")):
        return "score_conflict"
    if any(kw in raw for kw in ("fertility", "forbidden", "pregnancy",
                                 "child", "marital")):
        return "fertility"
    if any(kw in raw for kw in ("process", "overtime", "extension",
                                 "duration", "consent")):
        return "process_deviation"
    if "discrimin" in raw:
        return "discrimination"
    return raw


# ── METADATA ──────────────────────────────────────────────────────

METADATA = {
    "id": "hr_task3",
    "name": "Interview Compliance Violation & Resolution",
    "category": "hr",
    "environments": ["filesystem", "email", "notion"],
    "timeout_seconds": 600,
    "difficulty": "hard",
    "mm_level": "L4",
    "role": "Zhou Ting, HR Operations Specialist at Xinghai Technology",
    "tags": [
        "hr", "compliance", "interview", "legal",
        "multimodal", "violation", "triage",
    ],
    "env_config": {
        "email": {
            "users": {
                "hr_ops": {
                    "email": "[email protected]",
                    "password": "hrops_pwd",
                },
                "wulei": {
                    "email": "[email protected]",
                    "password": "wulei_pwd",
                },
                "chen_lvshi": {
                    "email": "[email protected]",
                    "password": "chen_pwd",
                },
                "candidate_a": {
                    "email": "[email protected]",
                    "password": "canda_pwd",
                },
                "wang_engineer": {
                    "email": "[email protected]",
                    "password": "wang_pwd",
                },
            },
        },
    },
}

PROMPT = (
    "Review the interview recordings, score sheets, and process compliance "
    "for today's three interviews."
)


# ── Stage Functions ───────────────────────────────────────────────


async def stage0(ctx):
    """Monday 2024-03-25 18:00: Initial investigation & exception identification."""
    # 1. Upload all assets (personality .md files + input/ materials)
    await ctx.fs.upload_dir(ctx.task_dir / "assets", "/workspace")

    # 2. Create Notion exception-tracking database (starts empty)
    await ctx.notion.create_page("Interview Exception Tracking 2024")
    await ctx.notion.create_database(EXCEPTION_DB_NAME, EXCEPTION_DB_SCHEMA)

    # 3. Loud: initial instruction email from HR Manager
    await ctx.email.send_email(
        from_user="wulei",
        to="[email protected]",
        subject="[Urgent] Three Interview Issues Today — Please Review",
        body=(
            "Zhou Ting,\n\n"
            "Three interviews today had issues. Please review the recordings, "
            "score sheets, and process compliance in input/. Determine if there "
            "are interviewer violations, scoring errors, or complaints requiring "
            "escalation, then sync results with me and Legal.\n\n"
            "Materials in your workspace:\n"
            "- interview_A/B/C_transcript.txt\n"
            "- score_sheet.csv (also .xlsx)\n"
            "- interview_schedule.csv\n"
            "- scoring_system_log.txt\n"
            "- interview_policy.pdf\n"
            "- complaint_email_screenshot.png\n\n"
            "Create exception records in the ATS (Notion) and send me a summary.\n\n"
            "Wu Lei"
        ),
    )

    # 4. Notification — loud events + Feishu context
    return {
        "notification": (
            "[Monday 2024-03-25 18:00]\n\n"
            "[Feishu #hr-operations] Wu Lei (17:45): "
            "Today's three interviews all had abnormal feedback. "
            "Zhou Ting please review ASAP, materials in the shared drive.\n\n"
            "[Feishu #hr-operations] Chen Lvshi (17:50): "
            "Noted. If there are any high-risk policy violations, "
            "loop me in immediately.\n\n"
            "Your email: [email protected]\n"
            "HR Manager Wu Lei: [email protected]\n"
            "Legal Counsel Chen Lvshi: [email protected]\n"
            "ATS exception tracking is in Notion "
            "(database: interview_exception_2024).\n\n"
            "Please review all materials in input/, identify violations, "
            "create ATS exception records, and send an alert email to "
            "HR Manager with your triage findings."
        ),
        "time": "2024-03-25T18:00:00+08:00",
    }


async def stage1(ctx):
    """Tuesday 2024-03-26 10:00: Complaint injection & multi-party coordination."""
    # 1. Loud: Candidate A formal complaint
    await ctx.email.send_email(
        from_user="candidate_a",
        to="[email protected]",
        subject="[Formal Complaint] Interview Scoring — INT_2024_03_25_A",
        body=(
            "Ms. Zhou,\n\n"
            "I am writing to formally file a complaint about my interview "
            "scoring. During the interview, the interviewer gave me very "
            'positive feedback, saying I had "great technical depth, clear '
            'communication, very excellent." However, my final score was '
            "extremely low, which makes no sense.\n\n"
            "Is it fair that I performed well but received such a low score? "
            "Please investigate and reply with the results in writing.\n\n"
            "Candidate A"
        ),
    )

    # 2. Loud: Legal Counsel CC
    await ctx.email.send_email(
        from_user="chen_lvshi",
        to="[email protected]",
        subject="RE: [Urgent] Interview Compliance — Legal Risk Assessment",
        body=(
            "Zhou Ting,\n\n"
            "I've been informed about the interview issues. Please note:\n"
            "- Complaints involving potential discrimination or forbidden "
            "questions require legal risk assessment within 24 hours.\n"
            "- Send me the complete evidence package for any high-risk "
            "violations.\n"
            "- For candidate communications, ensure legal-safe wording.\n\n"
            "Chen Lvshi\nLegal Counsel"
        ),
    )

    # 3. Notification — loud events + Feishu context
    return {
        "notification": (
            "[Tuesday 2024-03-26 10:00]\n\n"
            "[Feishu #hr-operations] Wu Lei (09:30): "
            "Candidate A just called about the scoring issue. "
            "Handle carefully — Legal is watching.\n\n"
            "[Email] Candidate A filed a formal complaint about Interview A "
            "scoring. Legal Counsel Chen Lvshi requires risk assessment "
            "within 24 hours.\n\n"
            "Candidate A email: [email protected]\n"
            "Interview C interviewer (Wang Engineer): [email protected]\n\n"
            "Please:\n"
            "1. Deep-investigate Interview A root cause and respond to "
            "candidate\n"
            "2. Compile evidence package for Interview B and send to Legal\n"
            "3. Send coaching reminder to Interview C interviewer\n"
            "4. Update all ATS records with current status"
        ),
        "time": "2024-03-26T10:00:00+08:00",
    }


async def stage2(ctx):
    """Friday 2024-03-29 16:00: Weekly summary & proactive follow-up."""
    # No new emails or data injected — agent must self-trigger

    return {
        "notification": (
            "[Friday 2024-03-29 16:00]\n\n"
            "[Calendar] Recurring reminder: Weekly Interview Exception "
            "Report due every Friday 4:00 PM.\n\n"
            "[Feishu #hr-operations] Wu Lei (15:55): "
            "Weekly report due today. Don't forget to check open items.\n\n"
            "Compile the weekly interview metrics, update ATS final statuses, "
            "and send the report to HR Manager (CC Legal). "
            "Check if any open items need follow-up."
        ),
        "time": "2024-03-29T16:00:00+08:00",
    }

task2 task4