SEC-cyBERT Prompt Pilot Report — 2026-03-28T05:14:43.318Z
Prompt version: v1.0
Sample: 0 paragraphs, seed=42
Models: google/gemini-3.1-flash-lite-preview, xiaomi/mimo-v2-flash, x-ai/grok-4.1-fast
Total cost: $0.0000

═══ PER-MODEL STATS ═══

google/gemini-3.1-flash-lite-preview:
Cost: $0.0000 ($NaN/para)
Tokens: 0 in, 0 out, 0 reasoning
Avg latency: 0ms
Categories:
Specificity:
Category confidence:
Specificity confidence:

xiaomi/mimo-v2-flash:
Cost: $0.0000 ($NaN/para)
Tokens: 0 in, 0 out, 0 reasoning
Avg latency: 0ms
Categories:
Specificity:
Category confidence:
Specificity confidence:

x-ai/grok-4.1-fast:
Cost: $0.0000 ($NaN/para)
Tokens: 0 in, 0 out, 0 reasoning
Avg latency: 0ms
Categories:
Specificity:
Category confidence:
Specificity confidence:

═══ AGREEMENT ANALYSIS ═══
Paragraphs with all 3 models: 0

Content Category Agreement:
3/3 unanimous: 0/0 (NaN%)
2/3 majority: 0/0 (NaN%)
All disagree: 0/0 (NaN%)

Specificity Level Agreement:
3/3 unanimous: 0/0 (NaN%)
2/3 majority: 0/0 (NaN%)
All disagree: 0/0 (NaN%)

Both dimensions 3/3: 0/0 (NaN%)

Pairwise category agreement:
gemini-3.1-flash-lite-preview × mimo-v2-flash: 0/0 (NaN%)
gemini-3.1-flash-lite-preview × grok-4.1-fast: 0/0 (NaN%)
mimo-v2-flash × grok-4.1-fast: 0/0 (NaN%)

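Note on the agreement figures: every ratio above is 0/0 because the sample contained 0 paragraphs, which is why each percentage prints as NaN. The following is a minimal sketch, not the pipeline's actual code, of how three labels for one paragraph could be bucketed and how the share could be printed with a guard for an empty sample. The function names and the "n/a" placeholder are assumptions made for illustration.

```python
from collections import Counter


def agreement_bucket(labels):
    """Classify the three model labels for one paragraph.

    Returns 'unanimous' (3/3), 'majority' (2/3), or 'disagree' (all differ).
    Hypothetical helper; the report does not show how buckets are computed.
    """
    # Size of the largest group of identical labels among the three.
    top_count = Counter(labels).most_common(1)[0][1]
    return {3: "unanimous", 2: "majority"}.get(top_count, "disagree")


def fmt_share(count, total):
    """Render 'count/total (pct%)', printing n/a instead of NaN when total is 0."""
    if total == 0:
        # Assumed placeholder: an empty sample has no defined percentage.
        return f"{count}/{total} (n/a)"
    return f"{count}/{total} ({100 * count / total:.1f}%)"
```

With a guard like this, an empty run would report "0/0 (n/a)" rather than "0/0 (NaN%)", which makes it clearer that no paragraphs were scored.
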
═══ DISAGREEMENT DETAILS ═══

═══ COST PROJECTIONS (50K paragraphs) ═══
TOTAL Stage 1 (all 3 models): ~$NaN

Observed disagreement rate: NaN%
Estimated Stage 2 judge calls: ~NaN
(Judge cost depends on Sonnet 4.6 pricing; see OpenRouter)
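
Note on the projections: with a sample of 0 paragraphs, the per-paragraph cost and the disagreement rate are both undefined, so the projected figures come out as NaN. The sketch below shows one plausible way to scale pilot numbers to the 50K target while returning no value for an empty sample. It assumes linear scaling and assumes that each disagreeing paragraph triggers one Stage 2 judge call; neither assumption is confirmed by this report, and the function names are hypothetical.

```python
def project_stage1_cost(sample_cost, sample_size, target_paragraphs=50_000):
    """Scale the pilot's total cost linearly to the target corpus size.

    Returns None when the sample is empty, since cost per paragraph is undefined.
    """
    if sample_size == 0:
        return None
    return sample_cost / sample_size * target_paragraphs


def estimate_judge_calls(disagreements, sample_size, target_paragraphs=50_000):
    """Estimate Stage 2 judge calls from the observed disagreement rate.

    Assumes one judge call per paragraph on which the models did not all agree.
    """
    if sample_size == 0:
        return None
    return round(disagreements / sample_size * target_paragraphs)
```

Both helpers return None for this run, which signals that the pilot needs a non-empty sample before the projections mean anything.
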