SEC-cyBERT/docs/archive/planning/signoff-deliverable.md
2026-04-06 17:02:05 -04:00


Construct of Interest and Data Sign-off

Team: S1 Team 4 | Construct: Project 3 — Cybersecurity Governance and Incident Disclosure Quality (SEC-Aligned)


1. Construct Definition

Our construct of interest is cybersecurity disclosure quality in SEC filings, operationalized as two simultaneous classification dimensions applied to each paragraph.

Dimension 1: Content Category (single-label, multi-class). Each paragraph receives exactly one of seven mutually exclusive categories derived from SEC Release 33-11216 (July 2023). The rule's mandated content domains map to six substantive categories — with Third-Party Risk separated from Risk Management Process because the rule specifically enumerates third-party oversight as a distinct disclosure requirement under 106(b) — plus a None/Other catch-all:

| Category | SEC Basis | Primary Question | Covers |
| --- | --- | --- | --- |
| Board Governance | 106(c)(1) | How does the board oversee cybersecurity? | Board/committee oversight, briefing cadence, board cyber expertise, governance-chain (Board → Committee → Officer) paragraphs |
| Management Role | 106(c)(2) | How is management organized to handle cybersecurity? | Cybersecurity leadership roles, qualifications, reporting lines, management-level committee structure; assessed via the person-removal test vs. RMP |
| Risk Management Process | 106(b) | What does the cybersecurity program do? | Assessment methodology, framework adoption (NIST, ISO), vulnerability management, monitoring, IR planning, training, ERM integration |
| Third-Party Risk | 106(b) | How are third-party cyber risks managed? | Vendor/supplier oversight, contractual security requirements, supply chain risk (requirements imposed ON vendors; assessors serving the company are RMP) |
| Incident Disclosure | 8-K 1.05 / 8.01 | What happened in a cybersecurity incident? | Actual incidents only: nature, scope, timing, impact, remediation (hypothetical "we may experience" language is not ID) |
| Strategy Integration | 106(b)(2) | How does cybersecurity affect the business or finances? | Materiality assessments (including negative assertions using 106(b)(2) language), cyber insurance, budget/investment, cost of incidents |
| None/Other | (none) | (none of the above) | Forward-looking disclaimers, cross-references, non-cybersecurity content, SPACs with no program; always Specificity 1 |

Category assignment uses a single primary test: "What question does this paragraph primarily answer?" Each question maps to exactly one category (e.g., "How does the board oversee cybersecurity?" → Board Governance; "What does the cybersecurity program do?" → Risk Management Process). This question-first rule replaces keyword matching and resolves governance-chain and person-attribution ambiguities that plagued earlier drafts.
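The question-first rule amounts to a lookup: each primary question keys exactly one label, with the catch-all as the default. A minimal sketch (the `QUESTION_TO_CATEGORY` name and `category_for` helper are ours for illustration, not the codebook's):

```python
# Sketch of the question-first routing rule; identifiers are illustrative.
QUESTION_TO_CATEGORY = {
    "How does the board oversee cybersecurity?": "Board Governance",
    "How is management organized to handle cybersecurity?": "Management Role",
    "What does the cybersecurity program do?": "Risk Management Process",
    "How are third-party cyber risks managed?": "Third-Party Risk",
    "What happened in a cybersecurity incident?": "Incident Disclosure",
    "How does cybersecurity affect the business or finances?": "Strategy Integration",
}

def category_for(primary_question: str) -> str:
    # Paragraphs answering none of the six questions fall to the catch-all.
    return QUESTION_TO_CATEGORY.get(primary_question, "None/Other")
```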

Dimension 2: Disclosure Specificity (ordinal, 1-4). Measures how informative a paragraph is:

1. Generic Boilerplate: only general business/risk language, no cybersecurity domain terminology.
2. Domain-Adapted: uses cybersecurity domain terminology (NIST CSF, SIEM, pen testing, zero trust, etc.) that would not appear in a generic enterprise risk management document, but nothing unique to THIS company.
3. Firm-Specific: contains at least one fact identifying something unique to this company (named CISO/CTO, named non-generic committee, named internal program).
4. Quantified-Verifiable: contains at least one independently verifiable hard fact (specific number/date, named external entity, or certification held).

Decision test (check in order, stop at the first "yes"): (1) Any QV-eligible fact? → Level 4. (2) Any firm-specific fact? → Level 3. (3) Any cybersecurity domain terminology? → Level 2. (4) None of the above → Level 1. The Level 4 threshold is one or more QV facts (revised from an earlier 2+ rule). The Level 2 test is the "ERM test": would this term appear naturally in a generic enterprise risk management document? If not, it is domain terminology.
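The ordered short-circuit test translates directly into code. A minimal sketch, assuming the three feature checks are computed upstream (the boolean parameters are our illustration, not a codebook interface):

```python
def specificity_level(has_qv_fact: bool,
                      has_firm_specific_fact: bool,
                      has_domain_term: bool) -> int:
    """Apply the codebook's decision test in order; the first "yes" wins."""
    if has_qv_fact:                # 1+ independently verifiable hard fact
        return 4                   # Quantified-Verifiable
    if has_firm_specific_fact:     # fact unique to this company
        return 3                   # Firm-Specific
    if has_domain_term:            # term fails the generic-ERM test
        return 2                   # Domain-Adapted
    return 1                       # Generic Boilerplate
```

Because the checks short-circuit, a paragraph with both a named CISO and a quantified fact lands at Level 4, matching the "stop at first yes" rule.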

Example annotations from our codebook:

"The Audit Committee receives quarterly reports from the CISO on the Company's cybersecurity posture, including threat landscape assessments and vulnerability management results." → Board Governance, Specificity 2 (answers "how does the board oversee?"; "vulnerability management" is domain terminology but no firm-unique facts)

"We require all critical vendors to maintain SOC 2 Type II certification and conduct annual security assessments of our top 50 service providers." → Third-Party Risk, Specificity 4 (requirements imposed on vendors; "50 service providers" is a QV-eligible quantified fact)

"On January 15, 2024, we detected unauthorized access to our customer support portal. We activated our incident response plan and engaged Mandiant for forensic investigation." → Incident Disclosure, Specificity 4 (describes what happened; specific date and named external firm are QV-eligible)

"Risks from cybersecurity threats have not materially affected, and are not reasonably likely to materially affect, our business strategy, results of operations, or financial condition." → Strategy Integration, Specificity 1 (materiality assessment using SEC Item 106(b)(2) language, but boilerplate)

2. Sources and Citations

The construct is theoretically grounded in disclosure theory (Verrecchia, 2001) and regulatory compliance as an information-provision mechanism. The SEC's final rule provides the taxonomic backbone: it specifies four content domains — governance, risk management, strategy integration, and incident disclosure — creating a natural multi-class classification task directly from the regulatory text. Our categories further map to NIST CSF 2.0 functions (GOVERN, IDENTIFY, PROTECT, DETECT, RESPOND, RECOVER) for independent academic grounding.

The specificity dimension draws on the disclosure quality literature. Hope, Hu, and Lu (2016) demonstrate that boilerplate risk-factor disclosures are uninformative to investors, while specific disclosures predict future outcomes. Gordon, Loeb, and Sohail (2010) establish that voluntary IT security disclosures vary in informativeness and that more specific disclosures correlate with market valuations. Von Solms and Von Solms (2004) provide the information security governance framework connecting board oversight to operational risk management. The Gibson Dunn annual surveys of S&P 100 cybersecurity disclosures empirically document the variation in quality across firms, confirming that the specificity gradient is observable in practice.

The methodological foundation is the Ringel (2023) synthetic experts pipeline — frontier LLMs generate training labels, then a small open-weights model is fine-tuned to approximate the GenAI labeler at near-zero marginal cost. Ma et al. (2026) provide the multi-model consensus labeling architecture we adopt for quality assurance. No validated classifier or public labeled dataset for SEC cybersecurity disclosure quality currently exists — this is the gap our project fills.
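The multi-model consensus scheme (three self-consistency votes, judge tiebreaker) can be sketched as follows. This is a simplified illustration of the resolution logic only; the production labeling pipeline's interfaces are not specified in this document:

```python
from collections import Counter

def consensus_label(votes: list[str], judge_pick: str) -> tuple[str, str]:
    """Resolve three labeler votes: unanimous, majority, or judge-resolved."""
    label, count = Counter(votes).most_common(1)[0]
    if count == len(votes):
        return label, "unanimous"
    if count > len(votes) // 2:
        return label, "majority"
    return judge_pick, "judge-resolved"   # e.g. a three-way split
```

The three outcome tags correspond to the unanimous / majority / judge-resolved proportions reported for the corpus in Section 3.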

3. Data Description

What data: Paragraphs extracted from SEC EDGAR filings, specifically Item 1C of annual 10-K filings (cybersecurity risk management, strategy, and governance) and Items 1.05/8.01/7.01 of 8-K filings (cybersecurity incident disclosures). Two full annual filing cycles exist (FY2023-FY2024), covering ~9,000 10-K filings containing Item 1C and 207 cybersecurity 8-K filings.

How acquired: All data is publicly available through the SEC EDGAR system. We built a TypeScript extraction pipeline that bulk-downloads filings via the EDGAR API, parses Item 1C sections from 10-K HTML across 14 identified filing generators, and segments into paragraphs (20-500 words, with bullet-list merging and continuation-line detection). For 8-K incident filings, a separate scanner processes the SEC's bulk submissions.zip to deterministically capture all cybersecurity 8-Ks, including the post-May 2024 shift from Item 1.05 to Items 8.01/7.01. The corpus contains 72,045 paragraphs. All 72,045 were labeled for training via Grok 4.1 Fast ×3 self-consistency consensus with a GPT-5.4 judge for tiebreakers (Ringel 2023 pipeline + Wang et al. 2022 self-consistency), yielding 86.8% unanimous / 12.9% majority / 0.3% judge-resolved labels. A locked stratified 1,200-paragraph holdout (185 per non-ID category, 90 ID; ≥100 per specificity level; max 2 paragraphs per company per stratum) is reserved for human gold labeling via a balanced incomplete block design (6 annotators, 3 per paragraph) and is excluded from training.
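The segmentation step (word-count window, bullet-list merging, continuation-line detection) can be illustrated in miniature. The actual pipeline is TypeScript and handles 14 filing-generator HTML dialects, so the heuristics below (the `is_fragment` test, the bullet markers, the 20-500 word defaults) are our simplification of the idea, not the production code:

```python
def segment_paragraphs(blocks: list[str],
                       min_words: int = 20,
                       max_words: int = 500) -> list[str]:
    """Merge bullet/continuation fragments into the prior block, then
    keep only paragraphs inside the word-count window."""
    merged: list[str] = []
    for block in blocks:
        text = " ".join(block.split())   # normalize internal whitespace
        if not text:
            continue
        # Bullets and lowercase continuations belong to the prior block.
        is_fragment = text.startswith(("•", "-")) or text[0].islower()
        if merged and is_fragment:
            merged[-1] += " " + text.lstrip("•- ")
        else:
            merged.append(text)
    return [p for p in merged if min_words <= len(p.split()) <= max_words]
```

Short stand-alone fragments (under the minimum word count) are dropped rather than labeled, mirroring the pipeline's paragraph filter.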

Why at scale: Every publicly traded U.S. company must now file Item 1C annually, generating thousands of new disclosures each cycle. Investors, compliance teams, regulators, and cybersecurity consultants need to assess disclosure quality across hundreds or thousands of filings simultaneously — infeasible by manual reading. A validated classifier enables longitudinal trend analysis (are disclosures becoming more specific over time?), cross-sectional benchmarking (which industries lag in governance disclosure?), and event-driven monitoring of incidents. The iXBRL CYD taxonomy (effective December 2024) further increases the volume of machine-parseable filings. The data is abundant, recurring annually, and the classification task is too nuanced for keyword dictionaries but well-defined enough for a fine-tuned specialist model — the textbook case for a vertical AI.