docs restructuring

Joey Eamigh 2026-04-05 21:00:40 -04:00
parent 3e010a6d0c
commit 745172adb8
GPG Key ID: CE8C05DFFC53C9CB
18 changed files with 21 additions and 3 deletions

@@ -15,9 +15,12 @@ Bun workspace monorepo. Three packages:
| Codebook ethos (reasoning behind every codebook decision) | `docs/CODEBOOK-ETHOS.md` |
| Project narrative (decisions, roadblocks, lessons) | `docs/NARRATIVE.md` |
| Project status & todo list | `docs/STATUS.md` |
| v1 codebook (preserved) | `docs/LABELING-CODEBOOK-v1.md` |
| v1 narrative (preserved) | `docs/NARRATIVE-v1.md` |
| Implementation plan for labelapp | `docs/labelapp-plan.md` |
| Specificity improvement plan (pending threshold tuning) | `docs/SPECIFICITY-IMPROVEMENT-PLAN.md` |
| Training docs (DAPT procedure, data quality audit, strategy notes) | `docs/training/` |
| Data pipeline reference (tech guide, HTML cleaning, filing generators) | `docs/data-pipeline/` |
| v1 archive (codebook, narrative, iteration logs, analyses) | `docs/archive/v1/` |
| Planning archive (project overview, implementation plan, labelapp plan) | `docs/archive/planning/` |
| Professor-provided reference materials | `docs/reference/` |
| Labelapp-specific agent guide | `labelapp/AGENTS.md` |
| Docker compose (Postgres) | `docker-compose.yaml` (root) |
| DB credentials | `sec_cybert` / `sec_cybert` / `sec_cybert` on localhost:5432 |

@@ -194,3 +194,18 @@ Option 2 is defensible: "Human inter-annotator agreement on specificity (alpha=0
The F1 threshold is achievable. The project is strong. The specificity distribution is the only structural problem, and it's fixable by aligning the codebook with the professor's construct (which we drifted from by being too precise). Everything else — the T5 ambiguity, the representative sample, the small classes — is manageable.
The worst thing to do right now is panic and pivot. The second worst thing is to agonize and delay. Pick a path, execute, get real numbers.

---

## Decision Made: Option A (executed)

**Chosen:** Option A (broaden Level 2 and loosen Level 4 to 1+ QV fact, with a full v2 codebook reboot).

**What happened:**
- v2 codebook approved 2026-04-04 (5/6 group approval)
- Stage 1 re-run: Grok 4.1 Fast ×3 self-consistency panel, $135.51
- Specificity distribution shifted to L1=41.1%, L2=22.7%, L3=24.9%, L4=11.4%, a healthy spread
- Independent threshold heads replaced CORAL, solving the spec F1 bottleneck (0.517 → 0.945)
- Final model: Cat F1=0.943, Spec F1=0.945, both well above the 0.80 target

See `docs/STATUS.md` for full pipeline status and `docs/SPECIFICITY-IMPROVEMENT-PLAN.md` for the architecture iteration.
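
For readers unfamiliar with the "independent threshold heads" mentioned above, here is a minimal sketch of how such heads could be decoded into a specificity level. It is not taken from the project code: the function names, the three-head layout for levels L1 to L4, and the default 0.5 cutoffs are all assumptions for illustration. The idea is that each head independently scores "level is above k", and the predicted level is one plus the number of heads that fire. CORAL, by contrast, shares one weight vector across thresholds and varies only the bias.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_level(
    head_logits: Sequence[float],
    thresholds: Sequence[float] = (0.5, 0.5, 0.5),
) -> int:
    """Map K-1 binary 'level > k' logits to an ordinal level in 1..K.

    Hypothetical helper: each head is treated as an independent binary
    classifier, and the level is 1 plus the count of heads whose
    probability exceeds its (assumed, tunable) threshold.
    """
    if len(head_logits) != len(thresholds):
        raise ValueError("expected one threshold per head")
    exceeded = sum(sigmoid(z) > t for z, t in zip(head_logits, thresholds))
    return 1 + exceeded


if __name__ == "__main__":
    # Heads for ">L1" and ">L2" fire, the head for ">L3" does not.
    print(decode_level([2.0, 1.0, -3.0]))
```

Counting the heads that fire is only one possible decoding rule; because the heads are independent, their outputs need not be monotone, and the actual model may resolve such cases differently. The `thresholds` argument marks where per-head threshold tuning would plug in under this sketch.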