docs restructuring
parent 3e010a6d0c
commit 745172adb8
@@ -15,9 +15,12 @@ Bun workspace monorepo. Three packages:
| Codebook ethos (reasoning behind every codebook decision) | `docs/CODEBOOK-ETHOS.md` |
| Project narrative (decisions, roadblocks, lessons) | `docs/NARRATIVE.md` |
| Project status & todo list | `docs/STATUS.md` |
| v1 codebook (preserved) | `docs/LABELING-CODEBOOK-v1.md` |
| v1 narrative (preserved) | `docs/NARRATIVE-v1.md` |
| Implementation plan for labelapp | `docs/labelapp-plan.md` |
| Specificity improvement plan (pending threshold tuning) | `docs/SPECIFICITY-IMPROVEMENT-PLAN.md` |
| Training docs (DAPT procedure, data quality audit, strategy notes) | `docs/training/` |
| Data pipeline reference (tech guide, HTML cleaning, filing generators) | `docs/data-pipeline/` |
| v1 archive (codebook, narrative, iteration logs, analyses) | `docs/archive/v1/` |
| Planning archive (project overview, implementation plan, labelapp plan) | `docs/archive/planning/` |
| Professor-provided reference materials | `docs/reference/` |
| Labelapp-specific agent guide | `labelapp/AGENTS.md` |
| Docker compose (Postgres) | `docker-compose.yaml` (root) |
| DB credentials | `sec_cybert` / `sec_cybert` / `sec_cybert` on localhost:5432 |

@@ -194,3 +194,18 @@ Option 2 is defensible: "Human inter-annotator agreement on specificity (alpha=0
The F1 threshold is achievable. The project is strong. The specificity distribution is the only structural problem, and it's fixable by aligning the codebook with the professor's construct (which we drifted from by being too precise). Everything else (the T5 ambiguity, the representative sample, the small classes) is manageable.

The worst thing to do right now is panic and pivot. The second worst thing is to agonize and delay. Pick a path, execute, get real numbers.

---

## Decision Made: Option A (executed)

**Chosen:** Option A: broaden Level 2 + loosen Level 4 to 1+ QV fact, full v2 codebook reboot.

**What happened:**
- v2 codebook approved 2026-04-04 (5/6 group approval)
- Stage 1 re-run: Grok 4.1 Fast ×3 self-consistency panel, $135.51
- Specificity distribution shifted to L1=41.1%, L2=22.7%, L3=24.9%, L4=11.4% (healthy)
- Independent threshold heads replaced CORAL, solving the spec F1 bottleneck (0.517 → 0.945)
- Final model: Cat F1=0.943, Spec F1=0.945, both well above 0.80 target

See `docs/STATUS.md` for full pipeline status and `docs/SPECIFICITY-IMPROVEMENT-PLAN.md` for the architecture iteration.