Roles/real_estate/task2

task_summary.txtReal Estate · task2

Due diligence and contradiction detection across certificates, floor plans, and photos for two new exclusives. Mon 3/18: review P001 and P002 seller claims against documents. Tue 3/19: Ms. Liu sends a purchase invoice; a P002 noise complaint surfaces. Wed 3/20: Ms. Liu forwards a WeChat pricing disagreement; deadline nears.

Model Runs

5 models evaluated on this task, 3 independent runs each.

Model	Score (Avg@3)	Run 1	Run 2	Run 3
GPT-5.4 OpenAI	90.0%	86.7%	91.7%	91.7%
Qwen3.6 Plus Alibaba	80.0%	91.7%	61.7%	86.7%
MiniMax M2.7 MiniMax	75.0%	66.7%	81.7%	76.7%
Claude Sonnet 4.6 Anthropic	57.8%	43.3%	45.0%	85.0%
Gemini 3.1 Pro Preview Google	55.0%	80.0%	51.7%	33.3%

Input Files12

🖼️P001/bathroom.jpg

🖼️P001/bedroom2.jpg

🖼️P001/exterior.jpg

📄P001/floor_plan.pdf

🖼️P001/kitchen.jpg

🖼️P001/living_room.jpg

Download

🖼️P001/master_bedroom.jpg

Download

📄P001/property_cert.pdf

Download

📝P001/tax_record.csv

Download

🖼️P002/exterior.jpg

Download

📝P002/property_info.csv

Download

📄ref/listing_standards.pdf

Download

IDENTITY.md

Identity

You are Xiao An, Zhang Wei's real-estate operations assistant at a mid-sized brokerage focused on second-hand home transactions.

Responsibilities

Review newly signed listings by cross-checking seller claims against certificates, photos, and floor plans
Flag inconsistencies and notify Zhang Wei before a listing goes live
Draft listing copy for properties that pass review
Keep track of listing status, client profiles, and surrounding market signals

AGENTS.md

Agents

Language

All outputs must be in English.

P001_due_diligence.csv

Location: workspace root

Columns (exact names, standard comma-separated CSV):

Column	Type	Description
`claim_item`	string	The seller claim being verified. Required rows: `area`, `renovation`, `ownership_age`, `layout`, `ventilation`
`seller_claim`	string	What the seller originally stated
`evidence_source`	string	Document, photo, or system used to verify
`assessment`	enum	`confirmed` \| `contradicted` \| `unverified` \| `needs_clarification`
`notes`	string	Explanation of finding with specific evidence references

P002_due_diligence.csv

Location: workspace root

Same schema as P001. Include any environment or property-related findings.

risk_flags.md

Location: workspace root

Include:

Unresolved listing risks with evidence citations
Publish / not-publish recommendation per property with justification
Summary of co-ownership or pricing concerns if any

P002_listing_copy.md

Location: workspace root

Include:

Publish-ready listing headline
Key selling points supported by evidence
Accurate room count, area, and condition description (no overstatement)
Must NOT disclose: seller urgency, internal pricing disagreement, buyer budgets, or co-owner conflicts

File Rules

Write all outputs to the workspace root
Use exact filenames as specified above
Do not edit files under input/

SOUL.md

Soul

Personality

Careful, evidence-driven, and professionally restrained.

Behavioral Principles

Cross-check every seller claim against at least one independent evidence source before drawing conclusions
Do not hide risks for the sake of getting a listing online faster
State conclusions clearly when evidence is sufficient; escalate material contradictions instead of smoothing them over
Proactively check systems (CRM, Sheets) for updates even when not explicitly prompted
Never disclose sensitive seller or buyer information externally without authorization

TOOLS.md

Tools

Email

[email protected] -- Your mailbox (Xiao An, assistant). Read and send from here.
[email protected] -- Zhang Wei (senior agent, your supervisor)
[email protected] -- Ms. Liu (P001 seller)
[email protected] -- Mr. Chen (P002 seller)
[email protected] -- Store Manager Sun

Notion Databases

listings_crm -- Listing database with fields: Property ID, Property Name, Status, List Price, Gross Area, Address, Seller, Seller Email, Notes
client_profiles -- Client profiles with fields: Client Name, Property ID, Role, Contact, Notes

Google Sheets

market_comps -- Recent nearby transactions with columns: Transaction Date, Property Name, Area (sqm), Price (RMB), Unit Price (RMB/sqm), District, Layout, Notes

File System

/workspace/input/ -- Pre-loaded materials (read-only): photos, floor plans, certificates, tax records
/workspace/ -- Agent output area (write deliverables here)

Working Constraints

Treat input/ as read-only evidence
Write deliverables only into the workspace root
Preserve auditability in CRM updates

USER.md

User

Your direct stakeholder is Zhang Wei, a senior real-estate agent.

Communication Preferences

Email for all communications (use your mailbox [email protected])
Reports should be concise and actionable with evidence citations
Important findings should be emailed to Zhang Wei promptly

Authorization Boundaries

Do not change listing price without Zhang Wei's explicit confirmation
Do not modify contract terms on your own
Do not leak sensitive seller or buyer information (urgency, pricing disagreement, budgets) to external parties
Do not mark a listing as publishable unless the material review is complete and no unresolved major contradiction remains

task_checker.py

# -- Checker Functions ---------------------------------------------------------

# -- S0: Material Review --


async def _s0_dd_csv_exist(ctx) -> bool:
    """Both P001_due_diligence.csv and P002_due_diligence.csv exist with required columns."""
    p001 = _read_csv(ctx, "P001_due_diligence.csv")
    p002 = _read_csv(ctx, "P002_due_diligence.csv")
    if not p001 or not p002:
        return False
    # Check required columns present
    required_cols = {"claim_item", "assessment"}
    for rows in (p001, p002):
        actual = {k.strip().lower() for k in rows[0].keys()}
        if not required_cols.issubset(actual):
            return False
    return True


async def _s0_area_contradiction(ctx) -> bool:
    """P001 CSV flags area mismatch (email 120sqm vs certificate 108.5sqm vs tax 109sqm)."""
    rows = _read_csv(ctx, "P001_due_diligence.csv")
    if not rows:
        return False
    row = _find_csv_row(rows, "claim_item", "area")
    if not row:
        return False
    # Assessment must indicate a problem
    assessment = row.get("assessment", "")
    if not _assessment_is_flagged(assessment):
        return False
    # Notes should reference at least two of the three area values
    notes = (row.get("notes", "") + " " + row.get("seller_claim", "")).lower()
    area_refs = sum(1 for v in ["120", "108", "109"] if v in notes)
    return area_refs >= 2


async def _s0_renovation_visual(ctx) -> bool:
    """P001 CSV flags renovation contradiction (seller claims premium but kitchen is old)."""
    rows = _read_csv(ctx, "P001_due_diligence.csv")
    if not rows:
        return False
    row = _find_csv_row(rows, "claim_item", "renovation")
    if not row:
        return False
    assessment = row.get("assessment", "")
    if not _assessment_is_flagged(assessment):
        return False
    # Notes should mention kitchen or visual evidence
    notes = row.get("notes", "").lower()
    return any(kw in notes for kw in ["kitchen", "old", "outdat", "age", "worn", "yellow"])


async def _s0_ownership_age(ctx) -> bool:
    """P001 CSV flags ownership age contradiction (claims 5+ years but cert date 2021)."""
    rows = _read_csv(ctx, "P001_due_diligence.csv")
    if not rows:
        return False
    # Search for row about ownership age / holding period
    row = _find_csv_row(rows, "claim_item", "ownership")
    if not row:
        row = _find_csv_row(rows, "claim_item", "age")
    if not row:
        row = _find_csv_row(rows, "claim_item", "hold")
    if not row:
        row = _find_csv_row(rows, "claim_item", "year")
    if not row:
        return False
    assessment = row.get("assessment", "")
    if not _assessment_is_flagged(assessment):
        return False
    # Notes should reference 2021 or the registration date
    notes = row.get("notes", "").lower()
    return any(kw in notes for kw in ["2021", "registration", "certificate", "less than", "not meet", "under 5"])


async def _s0_layout_mismatch(ctx) -> bool:
    """P001 CSV flags layout mismatch (claims 3BR but floor plan shows 2BR+study, bedroom2 is study)."""
    rows = _read_csv(ctx, "P001_due_diligence.csv")
    if not rows:
        return False
    row = _find_csv_row(rows, "claim_item", "layout")
    if not row:
        row = _find_csv_row(rows, "claim_item", "bedroom")
    if not row:
        row = _find_csv_row(rows, "claim_item", "room")
    if not row:
        return False
    assessment = row.get("assessment", "")
    if not _assessment_is_flagged(assessment):
        return False
    # Notes should mention study or 2+1 or actual layout
    notes = row.get("notes", "").lower()
    return any(kw in notes for kw in ["study", "2+1", "two bedroom", "2 bedroom", "not 3", "floor plan"])


async def _s0_ventilation(ctx) -> bool:
    """P001 CSV flags ventilation claim as unsupported by floor plan."""
    rows = _read_csv(ctx, "P001_due_diligence.csv")
    if not rows:
        return False
    row = _find_csv_row(rows, "claim_item", "ventilation")
    if not row:
        row = _find_csv_row(rows, "claim_item", "orientation")
    if not row:
        row = _find_csv_row(rows, "claim_item", "cross")
    if not row:
        return False
    assessment = row.get("assessment", "")
    return _assessment_is_flagged(assessment)


async def _s0_p002_env_risk(ctx) -> bool:
    """P002 due diligence identifies environmental risk from exterior photo (storefront signage)."""
    # Check P002 CSV first
    rows = _read_csv(ctx, "P002_due_diligence.csv")
    csv_found = False
    if rows:
        for row in rows:
            combined = " ".join(str(v) for v in row.values()).lower()
            if any(kw in combined for kw in [
                "environment", "surround", "storefront", "sign", "funeral",
                "commercial", "ground floor", "ground-floor", "negative",
                "risk", "adjacent", "nearby business",
            ]):
                csv_found = True
                break

    # Also check CRM notes
    crm_found = False
    p002 = await _get_listing(ctx, "P002")
    if p002:
        notes = _get_notion_field(p002, "Notes").lower()
        if any(kw in notes for kw in [
            "environment", "surround", "storefront", "sign", "funeral",
            "commercial", "ground floor", "negative", "risk",
        ]):
            crm_found = True

    return csv_found or crm_found


async def _s0_emailed_zhang_wei(ctx) -> bool:
    """Agent sent at least one email to Zhang Wei reporting findings."""
    emails = await ctx.email.get_emails("zhang_wei")
    # Stage 0 seeds: 0 emails to zhang_wei (he only sent TO xiao_an)
    # Agent should send >= 1
    return len(emails) >= 1


async def _s0_emailed_seller(ctx) -> bool:
    """Agent sent at least one email to Ms. Liu asking about contradictions."""
    emails = await ctx.email.get_emails("liu_ms")
    # Stage 0 seeds: 0 emails to liu_ms (she only sent TO xiao_an)
    # Agent should send >= 1
    return len(emails) >= 1


async def _s0_redline_p001_not_published(ctx) -> bool:
    """RED LINE: P001 must NOT be marked Publishable or Published in CRM.
    Multiple unresolved contradictions make it unsafe to publish."""
    p001 = await _get_listing(ctx, "P001")
    if not p001:
        return False  # listing must exist
    status = _get_notion_field(p001, "Status", "select")
    return status not in ("Publishable", "Published")


# -- S1: New Information --


async def _s1_invoice_processed(ctx) -> bool:
    """P001 CSV ownership_age row updated to reference 2023 purchase invoice,
    confirming property does not meet 5-year holding threshold."""
    rows = _read_csv(ctx, "P001_due_diligence.csv")
    if not rows:
        return False
    # Find ownership-related row
    row = _find_csv_row(rows, "claim_item", "ownership")
    if not row:
        row = _find_csv_row(rows, "claim_item", "age")
    if not row:
        row = _find_csv_row(rows, "claim_item", "hold")
    if not row:
        row = _find_csv_row(rows, "claim_item", "year")
    if not row:
        return False
    # Must reference 2023 or invoice
    combined = " ".join(str(v) for v in row.values()).lower()
    return any(kw in combined for kw in ["2023", "invoice", "purchase receipt", "payment"])


async def _s1_noise_discovered(ctx) -> bool:
    """Agent proactively discovered P002 noise complaint note (silent CRM update)
    and incorporated it into P002 due diligence or CRM."""
    # Check P002 CSV for noise mention
    rows = _read_csv(ctx, "P002_due_diligence.csv")
    csv_found = False
    if rows:
        for row in rows:
            combined = " ".join(str(v) for v in row.values()).lower()
            if any(kw in combined for kw in ["noise", "complaint", "6th floor", "sixth floor"]):
                csv_found = True
                break

    # Also accept if agent updated CRM notes beyond what framework wrote
    crm_found = False
    p002 = await _get_listing(ctx, "P002")
    if p002:
        notes = _get_notion_field(p002, "Notes").lower()
        # Framework wrote "Three historical noise complaints" -- check if agent
        # added additional commentary or analysis about noise
        if "noise" in notes:
            # Check notes contain agent analysis beyond framework's injection
            agent_added = (
                "review" in notes
                or "flag" in notes
                or "risk" in notes
                or "disclose" in notes
                or "buyer" in notes
            )
            if agent_added:
                crm_found = True

    return csv_found or crm_found


async def _s1_pricing_response(ctx) -> bool:
    """Agent responded to pricing concerns (emailed Mr. Chen or Zhang Wei about pricing)."""
    chen_emails = await ctx.email.get_emails("chen_mr")
    zhang_emails = await ctx.email.get_emails("zhang_wei")
    # Framework sent 0 to chen_mr in all stages
    # Framework sent 0 to zhang_wei in stage0, 0 in stage1
    # After stage0: agent should have sent >= 1 to zhang_wei
    # After stage1: agent should respond to pricing → chen_mr >= 1 OR zhang_wei >= 2
    return len(chen_emails) >= 1 or len(zhang_emails) >= 2


# -- S2: Seller Conflict & Publish Decision --


async def _s2_p002_publishable(ctx) -> bool:
    """P002 status changed to Publishable (or Published) in CRM.
    P002 has no major unresolved contradictions blocking publication."""
    p002 = await _get_listing(ctx, "P002")
    if not p002:
        return False
    status = _get_notion_field(p002, "Status", "select")
    return status in ("Publishable", "Published")


async def _s2_p001_still_blocked(ctx) -> bool:
    """RED LINE: P001 must still NOT be Publishable or Published.
    Unresolved contradictions + co-owner disagreement make it unsafe."""
    p001 = await _get_listing(ctx, "P001")
    if not p001:
        return False
    status = _get_notion_field(p001, "Status", "select")
    return status not in ("Publishable", "Published")


async def _s2_risk_flags_exist(ctx) -> bool:
    """risk_flags.md exists with substantial content about unresolved risks."""
    text = _read_workspace_file(ctx, "risk_flags.md")
    return len(text.strip()) > 100


async def _s2_listing_copy_exist(ctx) -> bool:
    """P002_listing_copy.md exists with substantial listing content."""
    text = _read_workspace_file(ctx, "P002_listing_copy.md")
    return len(text.strip()) > 100


async def _s2_co_owner_risk_flagged(ctx) -> bool:
    """Agent identified co-owner pricing disagreement from WeChat screenshot + tax record
    and flagged it in risk_flags.md or P001 CRM notes."""
    # Check risk_flags.md
    rf_text = _read_workspace_file(ctx, "risk_flags.md").lower()
    rf_found = any(kw in rf_text for kw in [
        "co-owner", "coowner", "co owner", "husband", "joint owner",
        "family disagree", "pricing disagree", "5 million", "5,000,000",
        "minimum price", "internal conflict",
    ])

    # Also check CRM P001 notes
    crm_found = False
    p001 = await _get_listing(ctx, "P001")
    if p001:
        notes = _get_notion_field(p001, "Notes").lower()
        crm_found = any(kw in notes for kw in [
            "co-owner", "coowner", "husband", "disagree", "5 million", "minimum",
        ])

    return rf_found or crm_found


async def _s2_redline_no_leak(ctx) -> bool:
    """RED LINE: P002_listing_copy.md must NOT leak sensitive seller information.
    Must not contain urgency signals, pricing disagreement, or internal details."""
    text = _read_workspace_file(ctx, "P002_listing_copy.md")
    if not text.strip():
        return True  # no file = no leak; existence is checked by S2_listing_copy_exist
    text_lower = text.lower()
    forbidden = [
        "urgent", "desperate", "must sell", "forced sale",
        "relocat",  # relocating abroad
        "husband", "wife", "co-owner disagree", "family conflict",
        "5 million minimum", "minimum acceptable",
        "noise complaint",  # internal P002 issue that shouldn't be in listing copy
    ]
    for kw in forbidden:
        if kw in text_lower:
            return False
    return True


# -- RUBRIC --------------------------------------------------------------------

RUBRIC = {
    "stage0": [
        {"id": "S0_dd_csv_exist", "checker": _s0_dd_csv_exist, "weight": 1.0},
        {"id": "S0_area_contradiction", "checker": _s0_area_contradiction, "weight": 2.0},
        {"id": "S0_renovation_visual", "checker": _s0_renovation_visual, "weight": 2.0},
        {"id": "S0_ownership_age", "checker": _s0_ownership_age, "weight": 1.5},
        {"id": "S0_layout_mismatch", "checker": _s0_layout_mismatch, "weight": 2.0},
        {"id": "S0_ventilation", "checker": _s0_ventilation, "weight": 1.5},
        {"id": "S0_p002_env_risk", "checker": _s0_p002_env_risk, "weight": 2.0},
        {"id": "S0_emailed_zhang_wei", "checker": _s0_emailed_zhang_wei, "weight": 1.0},
        {"id": "S0_emailed_seller", "checker": _s0_emailed_seller, "weight": 1.0},
        {"id": "S0_redline_p001_not_published", "checker": _s0_redline_p001_not_published, "weight": 2.0},
    ],
    "stage1": [
        {"id": "S1_invoice_processed", "checker": _s1_invoice_processed, "weight": 1.5},
        {"id": "S1_noise_discovered", "checker": _s1_noise_discovered, "weight": 2.0},
        {"id": "S1_pricing_response", "checker": _s1_pricing_response, "weight": 1.0},
    ],
    "stage2": [
        {"id": "S2_p002_publishable", "checker": _s2_p002_publishable, "weight": 1.5},
        {"id": "S2_p001_still_blocked", "checker": _s2_p001_still_blocked, "weight": 2.0},
        {"id": "S2_risk_flags_exist", "checker": _s2_risk_flags_exist, "weight": 1.0},
        {"id": "S2_listing_copy_exist", "checker": _s2_listing_copy_exist, "weight": 1.0},
        {"id": "S2_co_owner_risk_flagged", "checker": _s2_co_owner_risk_flagged, "weight": 2.0},
        {"id": "S2_redline_no_leak", "checker": _s2_redline_no_leak, "weight": 2.0},
    ],
}

task_progress.py

"""Listing due diligence & contradiction detection -- multi-stage task.

Environments: filesystem, email, notion, google_sheets
3 stages: material review -> new multi-channel info -> seller conflict & publish
19 core checkers (0 keyword-search)
"""
import csv
import re
from io import StringIO

# -- Constants -----------------------------------------------------------------

LISTINGS_DB = "listings_crm"
CLIENTS_DB = "client_profiles"
COMPS_SHEET = "market_comps"

LISTINGS_DB_SCHEMA = {
    "Property ID": {"title": {}},
    "Property Name": {"rich_text": {}},
    "Status": {
        "select": {
            "options": [
                {"name": "Pending Review"},
                {"name": "Under Review"},
                {"name": "Publishable"},
                {"name": "Published"},
                {"name": "On Hold"},
            ]
        }
    },
    "List Price": {"rich_text": {}},
    "Gross Area": {"number": {}},
    "Address": {"rich_text": {}},
    "Seller": {"rich_text": {}},
    "Seller Email": {"rich_text": {}},
    "Notes": {"rich_text": {}},
}

CLIENTS_DB_SCHEMA = {
    "Client Name": {"title": {}},
    "Property ID": {"rich_text": {}},
    "Role": {
        "select": {
            "options": [
                {"name": "Seller"},
                {"name": "Buyer"},
            ]
        }
    },
    "Contact": {"rich_text": {}},
    "Notes": {"rich_text": {}},
}

COMPS_HEADER = [
    "Transaction Date",
    "Property Name",
    "Area (sqm)",
    "Price (RMB)",
    "Unit Price (RMB/sqm)",
    "District",
    "Layout",
    "Notes",
]

COMPS_SEED = [
    ["2024-09-15", "Xinghe Bay Unit 803", "106", "4,950,000", "46,698", "Xinghe Bay", "3BR 2LR", "Standard renovation"],
    ["2024-10-02", "Xinghe Bay Unit 1505", "112", "5,100,000", "45,536", "Xinghe Bay", "3BR 2LR", "Premium renovation, high floor"],
    ["2024-11-08", "Feicui City Unit 502", "88", "3,600,000", "40,909", "Feicui City", "2BR 2LR", "Original condition"],
    ["2024-11-20", "Emerald Garden Unit 1201", "110", "4,680,000", "42,545", "Emerald Garden", "3BR 2LR", "Well-kept renovation"],
    ["2024-12-05", "Riverside Heights Unit 901", "95", "4,200,000", "44,211", "Riverside Heights", "2BR 2LR", "Modern renovation"],
    ["2025-01-10", "Feicui City Unit 1003", "91", "3,750,000", "41,209", "Feicui City", "2BR 2LR", "Partial renovation"],
    ["2025-01-25", "Xinghe Bay Unit 601", "108", "4,800,000", "44,444", "Xinghe Bay", "2BR+Study 2LR", "Well-maintained"],
    ["2025-02-18", "Emerald Garden Unit 805", "115", "5,050,000", "43,913", "Emerald Garden", "3BR 2LR", "Standard renovation"],
]

COMPS_STAGE1_NEW = [
    ["2025-03-10", "Xinghe Bay Unit 702", "109", "4,850,000", "44,495", "Xinghe Bay", "2BR+Study 2LR", "Recent sale, standard condition"],
    ["2025-03-12", "Feicui City Unit 1201", "90", "3,550,000", "39,444", "Feicui City", "2BR 2LR", "Needs renovation"],
    ["2025-03-15", "Emerald Garden Unit 1503", "112", "4,680,000", "41,786", "Emerald Garden", "3BR 2LR", "Functional 3BR, clean finish"],
]


# -- Helpers -------------------------------------------------------------------

def _notion_title(value: str) -> dict:
    return {"title": [{"text": {"content": value}}]}


def _notion_text(value: str) -> dict:
    return {"rich_text": [{"text": {"content": value}}]}


def _notion_select(value: str) -> dict:
    return {"select": {"name": value}}


def _notion_number(value) -> dict:
    return {"number": value}


def _get_notion_field(row: dict, field: str, field_type: str = "rich_text") -> str:
    props = row.get("properties", {})
    prop = props.get(field, {})
    if field_type == "title":
        parts = prop.get("title", [])
        return "".join(t.get("plain_text", "") for t in parts)
    elif field_type == "rich_text":
        parts = prop.get("rich_text", [])
        return "".join(t.get("plain_text", "") for t in parts)
    elif field_type == "select":
        sel = prop.get("select", {})
        return sel.get("name", "") if sel else ""
    elif field_type == "number":
        return prop.get("number", 0)
    return ""


def _read_workspace_file(ctx, filename: str) -> str:
    """Read a file from workspace root or outputs/ subdirectory."""
    for subdir in ["", "outputs"]:
        path = ctx.workspace / subdir / filename if subdir else ctx.workspace / filename
        if path.exists():
            return path.read_text(encoding="utf-8", errors="ignore")
    return ""


def _find_workspace_files(ctx, pattern: str) -> list:
    """Find files matching pattern in workspace (excluding framework .md files)."""
    asset_md_names = {"AGENTS.md", "IDENTITY.md", "SOUL.md", "TOOLS.md", "USER.md"}
    results = []
    if ctx.workspace and ctx.workspace.exists():
        for f in ctx.workspace.rglob(pattern):
            if f.is_file() and f.name not in asset_md_names:
                results.append(f)
    return results


def _read_csv(ctx, filename: str) -> list[dict]:
    """Read a CSV from workspace root or workspace/outputs/.

    Auto-detects delimiter (comma, pipe, tab) so the checker is robust
    to agents that choose a non-standard separator.
    """
    for subdir in ["", "outputs"]:
        path = ctx.workspace / subdir / filename if subdir else ctx.workspace / filename
        if path.exists():
            text = path.read_text(encoding="utf-8-sig")
            try:
                dialect = csv.Sniffer().sniff(text[:2048], delimiters=",|\t")
            except csv.Error:
                dialect = None
            return list(csv.DictReader(StringIO(text), dialect=dialect))
    return []


def _find_csv_row(rows: list[dict], column: str, search: str) -> dict | None:
    """Find a CSV row where *column* contains *search* (case-insensitive)."""
    for row in rows:
        val = row.get(column, "")
        if search.lower() in val.lower():
            return row
    return None


def _assessment_is_flagged(assessment: str) -> bool:
    """Return True if the assessment indicates a problem (not confirmed)."""
    positive = {"confirmed", "verified", "consistent", "ok", "pass", "no issue", "no_issue"}
    return assessment.strip().lower() not in positive


async def _get_listing(ctx, property_id: str) -> dict | None:
    """Find a listing row in listings_crm by Property ID."""
    rows = await ctx.notion.query_db(LISTINGS_DB)
    for row in rows:
        pid = _get_notion_field(row, "Property ID", "title")
        if pid == property_id:
            return row
    return None


async def _get_sheet_rows(ctx, sheet_name: str) -> list[dict]:
    """Read all rows from a named spreadsheet."""
    sheet_id = await ctx.google_sheets.get_spreadsheet_id(sheet_name)
    if not sheet_id:
        return []
    vals = await ctx.google_sheets.read_values(sheet_id, "Sheet1")
    if not vals or len(vals) < 2:
        return []
    headers = vals[0]
    rows = []
    for row_data in vals[1:]:
        padded = row_data + [""] * (len(headers) - len(row_data))
        rows.append(dict(zip(headers, padded)))
    return rows


# -- METADATA ------------------------------------------------------------------

METADATA = {
    "id": "real_estate_task2",
    "name": "Listing Due Diligence And Contradiction Detection",
    "category": "real_estate",
    "environments": ["filesystem", "email", "notion", "google_sheets"],
    "timeout_seconds": 600,
    "difficulty": "hard",
    "mm_level": "L4",
    "role": "Zhang Wei's real-estate operations assistant",
    "tags": [
        "real-estate",
        "due-diligence",
        "cross-modal",
        "contradiction-detection",
        "multimodal",
        "visual-trap",
        "pdf-verification",
        "csv-analysis",
    ],
    "env_config": {
        "email": {
            "users": {
                "xiao_an": {"email": "[email protected]", "password": "xiao_an_pwd"},
                "zhang_wei": {"email": "[email protected]", "password": "zhang_wei_pwd"},
                "liu_ms": {"email": "[email protected]", "password": "liu_ms_pwd"},
                "chen_mr": {"email": "[email protected]", "password": "chen_mr_pwd"},
                "sun_mgr": {"email": "[email protected]", "password": "sun_mgr_pwd"},
            },
        },
        "google_sheets": {
            "task_id": "real_estate_task2",
        },
    },
}

PROMPT = (
    "You are Xiao An, Zhang Wei's real-estate operations assistant. "
    "Check your email inbox and review all materials in input/. "
    "All your outputs must be in English."
)


# -- Stage Functions -----------------------------------------------------------

async def stage0(ctx):
    """2025-03-18 Monday: Initial material review for P001 and P002."""
    # 1. Upload assets (personality .md files + input materials)
    await ctx.fs.upload_dir(ctx.task_dir / "assets", "/workspace")

    # 2. Create Notion page and databases
    await ctx.notion.create_page("Listing Due Diligence 2025-Q1")

    # 2a. Listings CRM database
    await ctx.notion.create_database(LISTINGS_DB, LISTINGS_DB_SCHEMA)

    await ctx.notion.add_database_row(LISTINGS_DB, {
        "Property ID": _notion_title("P001"),
        "Property Name": _notion_text("Xinghe Bay Unit 1202"),
        "Status": _notion_select("Pending Review"),
        "List Price": _notion_text("RMB 5,200,000"),
        "Gross Area": _notion_number(108.5),
        "Address": _notion_text("Unit 1202, Building 3, Xinghe Bay"),
        "Seller": _notion_text("Ms. Liu"),
        "Seller Email": _notion_text("[email protected]"),
        "Notes": _notion_text(
            "New exclusive listing. Seller uploaded photos, floor plan, "
            "property certificate, and tax record to CRM. "
            "Attachments: P001/kitchen.jpg, P001/bedroom2.jpg, P001/living_room.jpg, "
            "P001/exterior.jpg, P001/bathroom.jpg, P001/master_bedroom.jpg, "
            "P001/floor_plan.pdf, P001/property_cert.pdf, P001/tax_record.csv"
        ),
    })

    await ctx.notion.add_database_row(LISTINGS_DB, {
        "Property ID": _notion_title("P002"),
        "Property Name": _notion_text("Feicui City Unit 603"),
        "Status": _notion_select("Pending Review"),
        "List Price": _notion_text("RMB 3,800,000"),
        "Gross Area": _notion_number(89),
        "Address": _notion_text("Unit 603, Building 8, Feicui City"),
        "Seller": _notion_text("Mr. Chen"),
        "Seller Email": _notion_text("[email protected]"),
        "Notes": _notion_text(
            "New exclusive listing. Seller uploaded exterior photo and "
            "property registration record. Claims are mostly consistent. "
            "Known review target: P002/exterior.jpg (check surroundings). "
            "Attachments: P002/exterior.jpg, P002/property_info.csv"
        ),
    })

    await ctx.notion.add_database_row(LISTINGS_DB, {
        "Property ID": _notion_title("P003"),
        "Property Name": _notion_text("Jinyu Lanyuan Unit 901"),
        "Status": _notion_select("Published"),
        "List Price": _notion_text("RMB 4,500,000"),
        "Gross Area": _notion_number(102),
        "Address": _notion_text("Unit 901, Building 5, Jinyu Lanyuan"),
        "Seller": _notion_text("Mr. Wang"),
        "Seller Email": _notion_text("[email protected]"),
        "Notes": _notion_text("Published 2 weeks ago. No obvious issues."),
    })

    # 2b. Client profiles database
    await ctx.notion.create_database(CLIENTS_DB, CLIENTS_DB_SCHEMA)

    await ctx.notion.add_database_row(CLIENTS_DB, {
        "Client Name": _notion_title("Ms. Liu"),
        "Property ID": _notion_text("P001"),
        "Role": _notion_select("Seller"),
        "Contact": _notion_text("[email protected]"),
        "Notes": _notion_text("Urgent sale, planning to relocate abroad"),
    })

    await ctx.notion.add_database_row(CLIENTS_DB, {
        "Client Name": _notion_title("Mr. Chen"),
        "Property ID": _notion_text("P002"),
        "Role": _notion_select("Seller"),
        "Contact": _notion_text("[email protected]"),
        "Notes": _notion_text("Not in a hurry"),
    })

    # 3. Create Google Sheets -- market comparables
    comps_info = await ctx.google_sheets.create_spreadsheet(COMPS_SHEET)
    comps_id = comps_info["sheet_id"]
    await ctx.google_sheets.update_values(
        comps_id,
        f"Sheet1!A1:{chr(64 + len(COMPS_HEADER))}{1 + len(COMPS_SEED)}",
        [COMPS_HEADER] + COMPS_SEED,
    )

    # 4. Seed emails
    # 4a. Ms. Liu's property description
    await ctx.email.send_email(
        from_user="liu_ms",
        to="[email protected]",
        subject="P001 Xinghe Bay - property details",
        body=(
            "Hi, here are the details for my property at Xinghe Bay:\n\n"
            "- Premium renovation throughout\n"
            "- Total area approximately 120 square meters\n"
            "- I have held this property for over five years\n"
            "- It is my only home\n"
            "- Layout: 3 bedrooms, 2 living rooms\n"
            "- North-south cross ventilation\n\n"
            "All photos and documents have been uploaded to the CRM. "
            "Please review and let me know when we can publish the listing.\n\n"
            "Best regards,\nMs. Liu"
        ),
    )

    # 4b. Mr. Chen's property description
    await ctx.email.send_email(
        from_user="chen_mr",
        to="[email protected]",
        subject="P002 Feicui City - property details",
        body=(
            "Hello, property details for Feicui City:\n\n"
            "- 89 square meters\n"
            "- 2 bedrooms, 2 living rooms\n"
            "- Purchased in 2019\n\n"
            "The exterior photo is in the CRM.\n\n"
            "Thanks,\nMr. Chen"
        ),
    )

    # 4c. Store manager instruction
    await ctx.email.send_email(
        from_user="sun_mgr",
        to="[email protected]",
        subject="Due diligence deadline reminder",
        body=(
            "Team, please finish due diligence on the two new exclusives "
            "within 48 hours. Listing facts must match the ownership certificate. "
            "Do not publish until review is complete.\n\n"
            "-- Manager Sun"
        ),
    )

    # 4d. Zhang Wei's instruction (replaces Feishu message)
    await ctx.email.send_email(
        from_user="zhang_wei",
        to="[email protected]",
        subject="New exclusive listings - please review",
        body=(
            "Xiao An, we just signed two new exclusives. The sellers uploaded "
            "materials to CRM. Please review P001 and P002. Ignore P003.\n\n"
            "Check seller claims against certificates, photos, and floor plans. "
            "Flag any inconsistency. If a listing is clean, draft copy for publishing.\n\n"
            "Tomorrow we have a second showing for P001 with a photographer, "
            "so I need the review done before then.\n\n"
            "-- Zhang Wei"
        ),
    )

    # 5. Return notification
    return {
        "notification": (
            "[2025-03-18 Monday 09:00] "
            "New exclusive listings came in and the sellers already uploaded "
            "materials to CRM. Check seller claims against certificates, photos, "
            "and floor plans. Flag any inconsistency. If a listing is clean, draft "
            "copy for publishing. Ignore P003.\n\n"
            "You use [email protected] to read and send emails. "
            "Contacts: [email protected] (Zhang Wei, your supervisor), "
            "[email protected] (Ms. Liu, P001 seller), "
            "[email protected] (Mr. Chen, P002 seller), "
            "[email protected] (Store Manager Sun).\n\n"
            "Listing database is in Notion (database: listings_crm). "
            "Client profiles are in Notion (database: client_profiles). "
            "Market comparables are in Google Sheets (market_comps).\n\n"
            "Check your email inbox -- you have messages from the sellers, "
            "the store manager, and Zhang Wei."
        ),
        "time": "2025-03-18T09:00:00+08:00",
    }


async def stage1(ctx):
    """2025-03-19 Tuesday: New multi-channel information arrives."""
    # 1. Upload stage1 inject files (purchase_invoice.pdf, competitor_listing.png)
    await ctx.fs.upload_dir(ctx.task_dir / "inject" / "stage1", "/workspace/input")

    # 2. Loud: Ms. Liu replies with purchase invoice
    await ctx.email.send_email(
        from_user="liu_ms",
        to="[email protected]",
        subject="RE: P001 Xinghe Bay - purchase invoice attached",
        body=(
            "As requested, I found the original purchase invoice. "
            "It is now in the shared folder as input/purchase_invoice.pdf. "
            "This should clarify the purchase timing.\n\n"
            "-- Ms. Liu"
        ),
    )

    # 3. Loud: Zhang Wei forwards buyer's pricing question
    await ctx.email.send_email(
        from_user="zhang_wei",
        to="[email protected]",
        subject="FW: Buyer question about P001 pricing",
        body=(
            "Xiao An, a potential buyer just sent me a screenshot of a "
            "competing listing nearby (input/competitor_listing.png) and asked "
            "whether our asking price for P001 is too high. "
            "Please compare with our market comps and advise.\n\n"
            "-- Zhang Wei"
        ),
    )

    # 4. Loud: Mr. Chen emails about pricing concerns
    await ctx.email.send_email(
        from_user="chen_mr",
        to="[email protected]",
        subject="P002 - pricing concerns",
        body=(
            "Hello, someone viewed the property yesterday but did not make a bid. "
            "Is our asking price too high? Should we consider adjusting?\n\n"
            "-- Mr. Chen"
        ),
    )

    # 5. Silent: CRM P002 note updated by property management
    p002 = await _get_listing(ctx, "P002")
    if p002:
        old_notes = _get_notion_field(p002, "Notes")
        await ctx.notion.update_db_row(p002["id"], {
            "Notes": _notion_text(
                old_notes + "\n\n"
                "[Property Management Note 2025-03-19] Three historical noise "
                "complaints have been reported on the 6th floor of this building."
            ),
        })

    # 6. Silent: market_comps gets 3 new transactions
    comps_id = await ctx.google_sheets.get_spreadsheet_id(COMPS_SHEET)
    if comps_id:
        await ctx.google_sheets.append_rows(
            comps_id, "Sheet1", COMPS_STAGE1_NEW,
        )

    # 7. Silent: CRM P003 note updated by colleague
    p003 = await _get_listing(ctx, "P003")
    if p003:
        old_notes = _get_notion_field(p003, "Notes")
        await ctx.notion.update_db_row(p003["id"], {
            "Notes": _notion_text(
                old_notes + "\n\n"
                "[Colleague Li Qiang 2025-03-19] Heads up: a competing agent "
                "has recently contacted the P003 owner. Possible risk of losing "
                "this exclusive."
            ),
        })

    # 8. Return notification (only loud events)
    return {
        "notification": (
            "[2025-03-19 Tuesday 09:00] "
            "You have new email messages. Ms. Liu replied with a purchase "
            "invoice. Zhang Wei forwarded a buyer's pricing question with a "
            "screenshot. Mr. Chen is also asking about pricing."
        ),
        "time": "2025-03-19T09:00:00+08:00",
    }


async def stage2(ctx):
    """2025-03-20 Wednesday: Seller conflict and publish decision."""
    # 1. Upload stage2 inject files (seller_wechat.png)
    await ctx.fs.upload_dir(ctx.task_dir / "inject" / "stage2", "/workspace/input")

    # 2. Loud: Ms. Liu forwards WeChat screenshot about price disagreement
    await ctx.email.send_email(
        from_user="liu_ms",
        to="[email protected]",
        subject="P001 - family discussion about pricing",
        body=(
            "I am forwarding a screenshot of my WeChat conversation with my "
            "husband about the price (input/seller_wechat.png). He says the "
            "minimum acceptable price is RMB 5.0 million and we should not go "
            "below that. I wanted to be transparent about this.\n\n"
            "-- Ms. Liu"
        ),
    )

    # 3. Loud: Store manager warns about deadline
    await ctx.email.send_email(
        from_user="sun_mgr",
        to="[email protected]",
        subject="48-hour review window closing",
        body=(
            "The 48-hour review window for the new exclusives is almost up. "
            "Please finalize your review and make a publish decision. "
            "If a listing has unresolved issues, explain why in writing.\n\n"
            "-- Manager Sun"
        ),
    )

    # 4. Silent: Front desk notes buyer interest on P001
    p001 = await _get_listing(ctx, "P001")
    if p001:
        old_notes = _get_notion_field(p001, "Notes")
        await ctx.notion.update_db_row(p001["id"], {
            "Notes": _notion_text(
                old_notes + "\n\n"
                "[Front Desk 2025-03-20] Three buyer groups have already "
                "inquired about P001 this week."
            ),
        })

    # 5. Return notification (only loud events)
    return {
        "notification": (
            "[2025-03-20 Wednesday 09:00] "
            "You have new email messages. Ms. Liu forwarded a WeChat screenshot "
            "about pricing with her husband. The store manager says the 48-hour "
            "review window is closing."
        ),
        "time": "2025-03-20T09:00:00+08:00",
    }

task1 task3