fix up status and deliverable docs

2026-04-06 17:02:05 -04:00
commit edcffbcc78
parent 112e389f71
2 changed files with 24 additions and 19 deletions
--- a/docs/STATUS.md
+++ b/docs/STATUS.md
@@ -92,7 +92,7 @@
 - [x] Nuke admin account, joey is admin
 - [x] Quiz is one-time (at onboarding), warmup resets each login session
 - [ ] Run migration + seed (`la:db:migrate` then `la:seed`)
-- [ ] Generate new BIBD assignments (3 of 5 annotators per paragraph)
+- [ ] Generate new BIBD assignments (3 of 6 annotators per paragraph)
 ### 8. Parallel Labeling
 - [ ] Humans: annotators label v2 holdout (~600 per annotator, 2-3 days)
--- a/docs/archive/planning/signoff-deliverable.md
+++ b/docs/archive/planning/signoff-deliverable.md
@@ -10,30 +10,35 @@ Our construct of interest is **cybersecurity disclosure quality** in SEC filings
 **Dimension 1: Content Category** (single-label, multi-class). Each paragraph receives exactly one of seven mutually exclusive categories derived from [SEC Release 33-11216](https://www.sec.gov/files/rules/final/2023/33-11216.pdf) (July 2023). The rule's mandated content domains map to six substantive categories — with Third-Party Risk separated from Risk Management Process because the rule specifically enumerates third-party oversight as a distinct disclosure requirement under 106(b) — plus a None/Other catch-all:
-| Category                | SEC Basis | Covers                                                                              |
+| Category                | SEC Basis       | Primary Question                                        | Covers                                                                                                         |
-| ----------------------- | --------- | ----------------------------------------------------------------------------------- |
+| ----------------------- | --------------- | ------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------- |
-| Board Governance        | 106(c)(1) | Board/committee oversight, briefing frequency, board cyber expertise                |
+| Board Governance        | 106(c)(1)       | How does the board oversee cybersecurity?              | Board/committee oversight, briefing cadence, board cyber expertise, governance-chain (Board → Committee → Officer) paragraphs |
-| Management Role         | 106(c)(2) | CISO/CTO identification, qualifications, reporting structure                        |
+| Management Role         | 106(c)(2)       | How is management organized to handle cybersecurity?  | Cybersecurity leadership roles, qualifications, reporting lines, management-level committee structure — assessed via the **person-removal test** vs. RMP |
-| Risk Management Process | 106(b)    | Assessment methodology, framework adoption (NIST, ISO), monitoring, ERM integration |
+| Risk Management Process | 106(b)          | What does the cybersecurity program do?                | Assessment methodology, framework adoption (NIST, ISO), vulnerability management, monitoring, IR planning, training, ERM integration |
-| Third-Party Risk        | 106(b)    | Vendor oversight, external assessors, supply chain risk                             |
+| Third-Party Risk        | 106(b)          | How are third-party cyber risks managed?               | Vendor/supplier oversight, contractual security requirements, supply chain risk (requirements imposed ON vendors; assessors serving the company are RMP) |
-| Incident Disclosure     | 8-K 1.05  | Incident nature, scope, timing, material impact, remediation                        |
+| Incident Disclosure     | 8-K 1.05 / 8.01 | What happened in a cybersecurity incident?            | Actual incidents only — nature, scope, timing, impact, remediation (hypothetical "we may experience" language is not ID) |
-| Strategy Integration    | 106(b)(2) | Material impact on business strategy/financials, cyber insurance                    |
+| Strategy Integration    | 106(b)(2)       | How does cybersecurity affect the business or finances?| Materiality assessments (including negative assertions using 106(b)(2) language), cyber insurance, budget/investment, cost of incidents |
-| None/Other              | —         | Boilerplate intros, legal disclaimers, non-cybersecurity content                    |
+| None/Other              | —               | *(none of the above)*                                  | Forward-looking disclaimers, cross-references, non-cybersecurity content, SPACs with no program; always Specificity 1 |
-**Dimension 2: Disclosure Specificity** (ordinal, 1–4). Measures how informative a paragraph is: (1) Generic Boilerplate — could apply to any company unchanged; (2) Sector-Adapted — references named frameworks but no firm-specific detail; (3) Firm-Specific — names unique roles, committees, or programs; (4) Quantified-Verifiable — includes metrics, dates, dollar amounts, or independently confirmable facts.
+**Category assignment** uses a single primary test: **"What question does this paragraph primarily answer?"** Each question maps to exactly one category (e.g., "How does the board oversee cybersecurity?" → Board Governance; "What does the cybersecurity program do?" → Risk Management Process). This question-first rule replaces keyword matching and resolves governance-chain and person-attribution ambiguities that plagued earlier drafts.
-**Decision rules for borderline cases:** Does it name any framework or standard? (yes → 2). Does it mention anything unique to this company? (yes → 3). Does it contain two or more specific, verifiable facts? (yes → 4).
+**Dimension 2: Disclosure Specificity** (ordinal, 1–4). Measures how informative a paragraph is: (1) Generic Boilerplate — only general business/risk language, no cybersecurity domain terminology; (2) Domain-Adapted — uses cybersecurity domain terminology (NIST CSF, SIEM, pen testing, zero trust, etc.) that would not appear in a generic enterprise risk management document, but nothing unique to THIS company; (3) Firm-Specific — contains at least one fact identifying something unique to this company (named CISO/CTO, named non-generic committee, named internal program); (4) Quantified-Verifiable — contains at least one independently verifiable hard fact (specific number/date, named external entity, or certification held).
 **Decision test** (check in order, stop at first "yes"): (1) Any QV-eligible fact? → Level 4. (2) Any firm-specific fact? → Level 3. (3) Any cybersecurity domain terminology? → Level 2. (4) None of the above → Level 1. The threshold for Level 4 is **1+ QV fact** (revised from an earlier 2+ rule) and the Level 2 test is the **"ERM test"**: would this term appear naturally in a generic enterprise risk management document? If no, it is domain terminology.
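The ordered decision test above reads as a first-match cascade. The following is a minimal TypeScript sketch, not the project's implementation: the `SpecificitySignals` shape and the `specificityLevel` function are hypothetical names introduced for illustration, and deciding whether a paragraph actually contains each kind of fact is left to the annotator or model.

```typescript
// Hypothetical input shape: one flag per question in the decision test.
interface SpecificitySignals {
  hasQvFact: boolean;            // specific number/date, named external entity, or certification held
  hasFirmSpecificFact: boolean;  // named CISO/CTO, named non-generic committee, named internal program
  hasDomainTerminology: boolean; // term fails the "ERM test" (would not appear in a generic ERM document)
}

type SpecificityLevel = 1 | 2 | 3 | 4;

// Check in order and stop at the first "yes", as the decision test prescribes.
function specificityLevel(s: SpecificitySignals): SpecificityLevel {
  if (s.hasQvFact) return 4;            // 1+ QV fact is enough
  if (s.hasFirmSpecificFact) return 3;
  if (s.hasDomainTerminology) return 2;
  return 1;                             // none of the above
}

// A paragraph with domain terminology only, and no firm-specific or QV facts.
const level = specificityLevel({
  hasQvFact: false,
  hasFirmSpecificFact: false,
  hasDomainTerminology: true,
});
console.log(level); // 2
```

Because the checks run from the highest level down, a paragraph that contains both a QV fact and domain terminology resolves to Level 4 without the lower tests being consulted.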
 **Example annotations from our codebook:**
-> _"Our Board of Directors recognizes the critical importance of maintaining the trust and confidence of our customers, and cybersecurity risk is an area of increasing focus for our Board."_
+> _"The Audit Committee receives quarterly reports from the CISO on the Company's cybersecurity posture, including threat landscape assessments and vulnerability management results."_
-> → Board Governance, Specificity 1 (could apply to any company — generic statement of intent)
+> → Board Governance, Specificity 2 (answers "how does the board oversee?"; "vulnerability management" is domain terminology but no firm-unique facts)
-> _"We assessed 312 vendors in fiscal 2024 through our Third-Party Risk Management program. All Tier 1 vendors are required to provide annual SOC 2 Type II reports. In fiscal 2024, 14 vendors were placed on remediation plans and 3 vendor relationships were terminated."_
+> _"We require all critical vendors to maintain SOC 2 Type II certification and conduct annual security assessments of our top 50 service providers."_
-> → Third-Party Risk, Specificity 4 (specific numbers, specific actions, specific criteria — all verifiable)
+> → Third-Party Risk, Specificity 4 (requirements imposed on vendors; "50 service providers" is a QV-eligible quantified fact)
-> _"Our CISO, Sarah Chen, leads a dedicated cybersecurity team of 35 professionals. Ms. Chen joined the Company in 2019 after serving as Deputy CISO at a Fortune 100 financial services firm."_
+> _"On January 15, 2024, we detected unauthorized access to our customer support portal. We activated our incident response plan and engaged Mandiant for forensic investigation."_
-> → Management Role, Specificity 4 (named individual, team size, prior role — multiple verifiable facts)
+> → Incident Disclosure, Specificity 4 (describes what happened; specific date and named external firm are QV-eligible)
 > _"Risks from cybersecurity threats have not materially affected, and are not reasonably likely to materially affect, our business strategy, results of operations, or financial condition."_
 > → Strategy Integration, Specificity 1 (materiality assessment using SEC Item 106(b)(2) language, but boilerplate)
 ## 2. Sources and Citations
@@ -47,6 +52,6 @@ The **methodological foundation** is the [Ringel (2023)](https://papers.ssrn.com
 **What data:** Paragraphs extracted from SEC EDGAR filings — specifically, Item 1C of annual 10-K filings (cybersecurity risk management, strategy, and governance) and Items 1.05/8.01/7.01 of 8-K filings (cybersecurity incident disclosures). Two full annual filing cycles exist (FY2023–FY2024), covering ~9,000 10-K filings containing Item 1C and 207 cybersecurity 8-K filings.
-**How acquired:** All data is publicly available through the [SEC EDGAR system](https://www.sec.gov/search-filings/edgar-application-programming-interfaces). We built a TypeScript extraction pipeline that bulk-downloads filings via the EDGAR API, parses Item 1C sections from 10-K HTML across 14 identified filing generators, and segments into paragraphs (20–500 words, with bullet-list merging and continuation-line detection). For 8-K incident filings, a separate scanner processes the SEC's bulk `submissions.zip` to deterministically capture all cybersecurity 8-Ks, including the post-May 2024 shift from Item 1.05 to Items 8.01/7.01. The corpus currently contains **72,045 paragraphs**. We will label ~48,000 paragraphs for training via a three-model GenAI panel following the Ringel (2023) pipeline, with a locked 1,200-paragraph holdout to be human-labeled by 6 annotators (3 per paragraph) for validation.
+**How acquired:** All data is publicly available through the [SEC EDGAR system](https://www.sec.gov/search-filings/edgar-application-programming-interfaces). We built a TypeScript extraction pipeline that bulk-downloads filings via the EDGAR API, parses Item 1C sections from 10-K HTML across 14 identified filing generators, and segments into paragraphs (20–500 words, with bullet-list merging and continuation-line detection). For 8-K incident filings, a separate scanner processes the SEC's bulk `submissions.zip` to deterministically capture all cybersecurity 8-Ks, including the post-May 2024 shift from Item 1.05 to Items 8.01/7.01. The corpus contains **72,045 paragraphs**. All 72,045 were labeled for training via Grok 4.1 Fast ×3 self-consistency consensus with a GPT-5.4 judge for tiebreakers (Ringel 2023 pipeline + Wang et al. 2022 self-consistency), yielding 86.8% unanimous / 12.9% majority / 0.3% judge-resolved labels. A locked stratified **1,200-paragraph holdout** (185 per non-ID category, 90 ID; ≥100 per specificity level; max 2 paragraphs per company per stratum) is reserved for human gold labeling — 6 annotators via a balanced incomplete block design with 3 annotators per paragraph — and is excluded from training.
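One way to realize the holdout assignment (6 annotators, 3 per paragraph) is sketched below. The text does not say which balanced incomplete block design is used, so this sketch assumes the simplest one: every 3-subset of the 6 annotators is a block, and paragraphs are dealt across the 20 blocks in rotation. All function and variable names are hypothetical.

```typescript
// A block is the trio of annotator indices (0..5) assigned to one paragraph.
type Block = [number, number, number];

// Enumerate every 3-subset of annotators 0..n-1. For n = 6 this yields 20 blocks;
// each annotator appears in 10 of them and each pair of annotators in 4,
// which is the balance property a BIBD requires.
function allTriples(n: number): Block[] {
  const blocks: Block[] = [];
  for (let a = 0; a < n; a++)
    for (let b = a + 1; b < n; b++)
      for (let c = b + 1; c < n; c++) blocks.push([a, b, c]);
  return blocks;
}

// Give paragraph i the block at index i mod 20, so every block is used equally often
// whenever the paragraph count is a multiple of 20.
function assignAnnotators(paragraphCount: number, annotatorCount = 6): Block[] {
  const blocks = allTriples(annotatorCount);
  return Array.from({ length: paragraphCount }, (_, i) => blocks[i % blocks.length]);
}

// Tally how many paragraphs each annotator receives for the 1,200-paragraph holdout.
const assignments = assignAnnotators(1200);
const load = [0, 0, 0, 0, 0, 0];
for (const block of assignments) for (const annotator of block) load[annotator]++;
console.log(load); // [600, 600, 600, 600, 600, 600]
```

Under these assumptions each annotator labels 600 paragraphs and every pair of annotators shares 240, which gives equal overlap for pairwise agreement statistics. In practice the paragraph order would likely be shuffled before assignment so that strata are not aligned with particular trios.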
 **Why at scale:** Every publicly traded U.S. company must now file Item 1C annually, generating thousands of new disclosures each cycle. Investors, compliance teams, regulators, and cybersecurity consultants need to assess disclosure quality across hundreds or thousands of filings simultaneously — infeasible by manual reading. A validated classifier enables longitudinal trend analysis (are disclosures becoming more specific over time?), cross-sectional benchmarking (which industries lag in governance disclosure?), and event-driven monitoring of incidents. The [iXBRL CYD taxonomy](https://xbrl.sec.gov/cyd/2024/cyd-taxonomy-guide-2024-09-16.pdf) (effective December 2024) further increases the volume of machine-parseable filings. The data is abundant, recurring annually, and the classification task is too nuanced for keyword dictionaries but well-defined enough for a fine-tuned specialist model — the textbook case for a vertical AI.