SEC EDGAR Filing Generator Reference
Reference for identifying which software generated a given SEC 10-K HTML filing. Built from direct inspection of EDGAR filings and market research (March 2026).
1. Major Vendors and HTML Signatures
Workiva (Wdesk) -- Market Leader for 10-K/10-Q
Filing agent CIK: 0001628280
HTML comment signature (lines 1-4):
<?xml version='1.0' encoding='ASCII'?>
<!--XBRL Document Created with the Workiva Platform-->
<!--Copyright 2025 Workiva-->
<!--r:{uuid},g:{uuid},d:{hex-id}-->
Detection heuristics:
- HTML comment: XBRL Document Created with the Workiva Platform
- HTML comment: Copyright \d{4} Workiva
- Third comment line contains r:, g:, d: UUIDs (document/generation tracking)
- xml:lang="en-US" attribute on <html> tag
- Body uses inline styles exclusively (no CSS classes on content elements)
- Heavy use of <span> with inline styles containing background-color, font-family, font-size, font-weight, line-height in every span
- Div IDs follow pattern: i{hex32}_{number} (e.g., id="i56b78781f7c84a038f6ae0f6244f7dd8_1")
- Tables use display:inline-table and vertical-align:text-bottom
- iXBRL fact IDs follow pattern: F_{uuid} (e.g., id="F_d8dc1eb1-109d-445d-a55a-3dde1a81ca63")
- No <meta name="generator"> tag
- No CSS classes on body content (purely inline styles)
Structural patterns:
- Span-heavy: nearly every text fragment wrapped in <span style="...">
- Font specified as font-family:'Times New Roman',sans-serif (note: sans-serif fallback, unusual)
- Line-height specified on every span (e.g., line-height:120%)
- Background color explicitly set: background-color:#ffffff
Known quality issues:
- Extremely verbose HTML; simple paragraphs become deeply nested span trees
- Text extraction is clean because span boundaries align with word boundaries
- Large file sizes due to inline style repetition
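The three header comments can be read with a few regular expressions. The following is a minimal sketch, not Workiva's own format definition: the helper name parse_workiva_header is hypothetical, and the tracking values are assumed to be hex strings with optional hyphens, which the placeholder notation above does not confirm.

```python
import re

WORKIVA_MARKER = re.compile(
    r"<!--\s*XBRL Document Created with the Workiva Platform\s*-->"
)
WORKIVA_COPYRIGHT = re.compile(r"<!--\s*Copyright (\d{4}) Workiva\s*-->")
# Assumption: r:/g:/d: values are hex digits, optionally hyphenated.
WORKIVA_TRACKING = re.compile(
    r"<!--\s*r:([0-9a-fA-F-]+),g:([0-9a-fA-F-]+),d:([0-9a-fA-F]+)\s*-->"
)


def parse_workiva_header(html, max_chars=2000):
    """Return Workiva header fields, or None if the marker comment is absent."""
    head = html[:max_chars]  # the signature sits in the first few lines
    if not WORKIVA_MARKER.search(head):
        return None
    fields = {"generator": "Workiva"}
    year = WORKIVA_COPYRIGHT.search(head)
    if year:
        fields["copyright_year"] = int(year.group(1))
    tracking = WORKIVA_TRACKING.search(head)
    if tracking:
        fields["r"], fields["g"], fields["d"] = tracking.groups()
    return fields
```

The marker comment alone decides the match; the copyright year and tracking IDs are optional extras, so a document that lacks them is still reported as Workiva.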
DFIN / Donnelley Financial Solutions (ActiveDisclosure)
DFIN files under separate agent CIKs for its two HTML output formats.
DFIN "New" ActiveDisclosure (primary)
Filing agent CIK: 0000950170 (also 0000950130)
HTML comment signature:
<?xml version='1.0' encoding='ASCII'?>
<!-- DFIN New ActiveDisclosure (SM) Inline XBRL Document - http://www.dfinsolutions.com/ -->
<!-- Creation Date :2025-02-18T12:36:24.4008+00:00 -->
<!-- Copyright (c) 2025 Donnelley Financial Solutions, Inc. All Rights Reserved. -->
Detection heuristics:
- HTML comment: DFIN New ActiveDisclosure
- HTML comment: http://www.dfinsolutions.com/
- HTML comment: Copyright (c) \d{4} Donnelley Financial Solutions
- HTML comment: Creation Date : with ISO timestamp
- Body style: padding:8px;margin:auto!important;
- Inline styles use font-kerning:none;min-width:fit-content; on most spans
- Extensive use of white-space:pre-wrap on spans
- CSS class item-list-element-wrapper and page-border-spacing present
- iXBRL fact IDs follow pattern: F_{uuid}
Structural patterns:
- Every text span carries min-width:fit-content (distinctive)
- Uses &nbsp; for spacing extensively
- Uses <p> tags with inline margins for all paragraphs
- Tables use explicit padding-top:0in;vertical-align:top;padding-bottom:0in cell styles
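When the header comments have been stripped, the min-width:fit-content habit can serve as a fallback signal. A rough sketch that measures how many span tags carry it; the function names and the 0.5 threshold are illustrative assumptions, not measured values.

```python
import re

SPAN_OPEN = re.compile(r"<span\b[^>]*>", re.IGNORECASE)
FIT_CONTENT = re.compile(r"min-width\s*:\s*fit-content", re.IGNORECASE)


def fit_content_ratio(html):
    """Fraction of <span> opening tags that mention min-width:fit-content."""
    spans = SPAN_OPEN.findall(html)
    if not spans:
        return 0.0
    hits = sum(1 for tag in spans if FIT_CONTENT.search(tag))
    return hits / len(spans)


def looks_like_dfin_new(html, threshold=0.5):
    # threshold is an illustrative guess, not a calibrated cutoff
    return fit_content_ratio(html) >= threshold
```

A ratio is used rather than a single substring hit so that one stray fit-content declaration in another generator's output does not trigger a match.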
DFIN Legacy (RR Donnelley heritage)
Filing agent CIK: 0001193125
HTML signature:
<?xml version='1.0' encoding='ASCII'?>
<html xmlns:link="..." xmlns:xbrldi="..." ...>
<head>
<title>10-K</title>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type"/>
</head>
<body style="line-height:normal;background-color:white;">
<h5 style="font-size:10pt;font-weight:bold"><a href="#toc">Table of Contents</a></h5>
Detection heuristics:
- No identifying HTML comments (no generator/copyright comment)
- Accession number prefix 0001193125 is definitive
- <body style="line-height:normal;background-color:white;">
- Immediately starts with <h5> Table of Contents link
- Uses deprecated namespace aliases: xmlns:xl, xmlns:xbrll, xmlns:deprecated
- iXBRL fact IDs follow pattern: Fact_{large_number} (e.g., id="Fact_129727210")
- Uses <FONT> tags (HTML 3.2 style) in some documents
- Uppercase HTML tags in older filings (<P>, <B>, <DIV>)
Structural patterns:
- Cleaner HTML than ActiveDisclosure New
- Uses semantic <h5> for table of contents
- Inline styles are simpler and more standard
- File description filenames follow pattern: d{number}d10k.htm
Toppan Merrill (Bridge)
Filing agent CIKs: 0001104659 (primary), 0001558370 (secondary)
HTML comment signature:
<?xml version='1.0' encoding='ASCII'?>
<!-- iXBRL document created with: Toppan Merrill Bridge iXBRL 10.9.0.3 -->
<!-- Based on: iXBRL 1.1 -->
<!-- Created on: 2/21/2025 8:11:11 PM -->
<!-- iXBRL Library version: 1.0.9062.16423 -->
<!-- iXBRL Service Job ID: {uuid} -->
Detection heuristics:
- HTML comment: iXBRL document created with: Toppan Merrill Bridge iXBRL
- HTML comment: iXBRL Library version:
- HTML comment: iXBRL Service Job ID:
- Includes version number in comment (e.g., 10.9.0.3)
- <title> tag contains company name + period end date (e.g., Sunstone Hotel Investors, Inc._December 31, 2024)
- Uses xmlns:xs alongside xmlns:xsi (both XML Schema namespaces)
- Body starts with <div style="margin-top:30pt;"></div> (distinctive)
- iXBRL hidden div uses display:none; (no additional styles on the div)
Structural patterns:
- Context IDs use descriptive names with GUIDs: As_Of_12_31_2024_{base64-like}, From_01_01_2024_to_12_31_2024_{guid}
- Hidden fact IDs follow pattern: Hidden_{base64-like}
- Unit ref IDs follow pattern: Unit_Standard_USD_{base64-like}
- No CSS classes used on content elements
- Relatively clean HTML structure
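Because each Bridge header comment is a "key: value" pair, the whole header can be collected into a dictionary. A minimal sketch, assuming the comment layout matches the sample above; parse_bridge_header is a hypothetical helper, and the timestamp format is inferred from that single sample.

```python
import re
from datetime import datetime

COMMENT = re.compile(r"<!--\s*(.*?)\s*-->", re.DOTALL)


def parse_bridge_header(html, max_chars=2000):
    """Collect 'key: value' comments from the top of a Bridge document."""
    fields = {}
    for body in COMMENT.findall(html[:max_chars]):
        key, sep, value = body.partition(":")  # split at the first colon only
        if sep:
            fields[key.strip()] = value.strip()
    created_with = fields.get("iXBRL document created with", "")
    if "Toppan Merrill Bridge" not in created_with:
        return None
    fields["version"] = created_with.rsplit(" ", 1)[-1]
    try:
        # Layout assumed from the sample: month/day/year, 12-hour clock.
        fields["created"] = datetime.strptime(
            fields.get("Created on", ""), "%m/%d/%Y %I:%M:%S %p"
        )
    except ValueError:
        pass  # leave "created" unset when the timestamp is missing or differs
    return fields
```

Splitting at the first colon keeps the time portion of "Created on" intact, since only the key/value separator is consumed.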
RDG Filings (ThunderDome Portal)
Filing agent CIK: 0001437749
HTML signature:
<?xml version='1.0' encoding='ASCII'?>
<html xmlns:thunderdome="http://www.RDGFilings.com" ...>
<head>
<title>avpt20241231_10k.htm</title>
<!-- Generated by ThunderDome Portal - 2/27/2025 6:06:48 PM -->
<meta http-equiv="Content-Type" content="text/html"/>
</head>
<body style="cursor: auto; padding: 0in 0.1in; font-family: "Times New Roman", Times, serif; font-size: 10pt;">
Detection heuristics:
- XML namespace: xmlns:thunderdome="http://www.RDGFilings.com"
- HTML comment: Generated by ThunderDome Portal
- <title> contains the filing filename
- Body style includes cursor: auto; padding: 0in 0.1in
- iXBRL fact IDs prefixed with thunderdome- (e.g., id="thunderdome-EntityCentralIndexKey")
- Context ref IDs use simple date ranges: d_2024-01-01_2024-12-31
- Other fact IDs follow ixv-{number} or c{number} pattern
Market presence: ~14,000 filings/year (rank #9 among filing agents), and an estimated ~5% of 10-K/10-Q filings.
Broadridge Financial Solutions (PROfile)
Filing agent CIKs: 0001140361 (primary), 0001133228 (secondary)
HTML comment signature:
<!-- Licensed to: Broadridge
Document created using Broadridge PROfile 25.1.1.5279
Copyright 1995 - 2025 Broadridge -->
Detection heuristics:
- HTML comment: Licensed to: Broadridge
- HTML comment: Document created using Broadridge PROfile with version number
- HTML comment: Copyright 1995 - \d{4} Broadridge
- CSS classes with BRPF prefix: BRPFPageBreak, BRPFPageBreakArea, BRPFPageFooter, BRPFPageHeader, BRPFPageNumberArea
- CSS class: DSPFListTable
- CSS class: cfttable
- CSS class: Apple-interchange-newline (suggests Mac/WebKit origin)
- Context ref IDs use XBRL-standard descriptive format: c20240101to20241231_AxisName_MemberName
Note: Broadridge acquired CompSci Resources LLC in July 2024 and is integrating CompSci's Transform platform. Filings may transition to Broadridge branding over time.
CompSci / Novaworks (Transform and GoFiler)
CompSci Resources (Transform) and Novaworks (GoFiler) each produce a tool; the two leave distinct signatures.
CompSci Transform (now Broadridge)
Filed via: EdgarAgents LLC (0001213900) or other agents
HTML comment signature:
<?xml version='1.0' encoding='ASCII'?>
<!-- Generated by CompSci Transform (tm) - http://www.compsciresources.com -->
<!-- Created: Mon Mar 17 19:46:10 UTC 2025 -->
Detection heuristics:
- HTML comment: Generated by CompSci Transform
- HTML comment: http://www.compsciresources.com
- XML namespace: xmlns:compsci="http://compsciresources.com"
- Body wrapped in: <div style="font: 10pt Times New Roman, Times, Serif">
- Uses <!-- Field: Rule-Page --> and <!-- Field: /Rule-Page --> HTML comments as structural markers
- Empty <div> tags used as spacers between paragraphs
- iXBRL context refs use simple sequential IDs: c0, c1, c2, ...
- iXBRL fact IDs follow ixv-{number} pattern
- Uses shorthand CSS: font: 10pt Times New Roman, Times, Serif (combined property)
- Margin shorthand: margin: 0pt 0
Known quality issues:
- Words can be broken across <span> tags mid-word
- Heavy use of &nbsp; for spacing
- Empty divs between every paragraph create parsing noise
- <!-- Field: ... --> comments interspersed throughout document body
Novaworks GoFiler (XDX format)
Filed via: SECUREX Filings (0001214659) or self-filed
HTML signature:
<head>
<title></title>
<meta http-equiv="Content-Type" content="text/html"/>
</head>
<!-- Field: Set; Name: xdx; ID: xdx_021_US%2DGAAP%2D2024%2D... -->
<!-- Field: Set; Name: xdx; ID: xdx_03B_... -->
Detection heuristics:
- HTML comments with pattern: <!-- Field: Set; Name: xdx; ID: xdx_{code}_{data} -->
- XDX comments appear between </head> and <body> (unusual placement)
- Body style: font: 10pt Times New Roman, Times, Serif (same shorthand as CompSci)
- Empty <title></title> tag
- iXBRL fact IDs use xdx2ixbrl{number} pattern (e.g., id="xdx2ixbrl0102")
- Standard fact IDs use Fact{number:06d} pattern (e.g., id="Fact000003")
- Context refs use From{date}to{date} or AsOf{date} format (no separators within date)
XDX explained: XDX (XBRL Data Exchange) is GoFiler's proprietary format that uses HTML tag ID attributes ("engrams") to embed XBRL metadata. The xdx_ comments carry taxonomy, entity, period, and unit definitions that GoFiler uses to generate the final iXBRL.
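The xdx_ comments can be listed and decoded before they are stripped. A small sketch, assuming the %2D sequences in the sample IDs are ordinary URL percent-encoding (so %2D decodes to a hyphen); extract_xdx_fields is a hypothetical helper and makes no attempt to interpret what each {code} means.

```python
import re
from urllib.parse import unquote

XDX_COMMENT = re.compile(
    r"<!--\s*Field:\s*Set;\s*Name:\s*xdx;\s*ID:\s*xdx_([0-9A-Za-z]+)_(.*?)\s*-->"
)


def extract_xdx_fields(html):
    """Return (code, decoded_data) pairs for XDX comments found before <body>."""
    # Only look ahead of the body, where the XDX comments are placed.
    head = re.split(r"<body\b", html, maxsplit=1, flags=re.IGNORECASE)[0]
    return [(code, unquote(data)) for code, data in XDX_COMMENT.findall(head)]
```

For an input containing `<!-- Field: Set; Name: xdx; ID: xdx_021_US%2DGAAP%2D2024 -->` (a shortened, made-up ID), the function returns `[("021", "US-GAAP-2024")]`.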
Discount EDGAR / NTDAS (XBRLMaster / EDGARMaster)
Filing agent CIK: 0001477932
HTML signature:
<head>
<title>crona_10k.htm</title>
<!--Document Created by XBRLMaster-->
<meta http-equiv="Content-Type" content="text/html"/>
</head>
<body style="text-align:justify;font:10pt times new roman">
Detection heuristics:
- HTML comment: Document Created by XBRLMaster
- Body style: text-align:justify;font:10pt times new roman
- Hidden iXBRL div has id="XBRLDIV"
- Additional body styles include margin-left:7%;margin-right:7%
- Uses lowercase times new roman (no capitalization)
- iXBRL fact IDs use ixv-{number} pattern
EdgarAgents LLC
Filing agent CIK: 0001213900
EdgarAgents is a filing agent service, not a document creation tool. The HTML they submit is typically generated by CompSci Transform, GoFiler, or other tools. Check the HTML comments to identify the actual generator.
DFIN Legacy (pre-iXBRL / SGML-era)
Filing agent CIK: 0001193125
Older filings (pre-2019) from this CIK may appear in <DOCUMENT> SGML wrapper format:
<DOCUMENT>
<TYPE>10-K
<SEQUENCE>1
<FILENAME>d913213d10k.htm
<DESCRIPTION>10-K
<TEXT>
<HTML><HEAD>
<TITLE>10-K</TITLE>
</HEAD>
<BODY BGCOLOR="WHITE" STYLE="line-height:Normal">
<Center><DIV STYLE="width:8.5in" align="left">
Detection heuristics:
- Uppercase HTML tags: <HTML>, <HEAD>, <BODY>, <P>, <B>
- BGCOLOR="WHITE" attribute (deprecated HTML)
- <Center> tag with capital C
- <DIV STYLE="width:8.5in" (page-width container)
- <FONT> tags for styling
- Filename pattern: d{number}d10k.htm
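The SGML wrapper fields can be read line by line before the HTML itself is parsed. A minimal sketch, assuming each header tag sits on its own line as in the sample above; parse_sgml_document_header is a hypothetical helper, not part of any EDGAR library.

```python
import re

HEADER_TAG = re.compile(r"^<(TYPE|SEQUENCE|FILENAME|DESCRIPTION)>(.*)$")


def parse_sgml_document_header(text):
    """Read the header fields that precede <TEXT> in a <DOCUMENT> wrapper."""
    fields = {}
    in_document = False
    for raw in text.splitlines():
        line = raw.strip()
        if line == "<DOCUMENT>":
            in_document = True
        elif in_document and line.startswith("<TEXT>"):
            break  # header ends where the document body begins
        elif in_document:
            match = HEADER_TAG.match(line)
            if match:
                fields[match.group(1).lower()] = match.group(2).strip()
    return fields


def has_dfin_legacy_filename(fields):
    # Filename pattern from the heuristics above: d{number}d10k.htm
    return re.fullmatch(r"d\d+d10k\.htm", fields.get("filename", "")) is not None
```

Only the first <DOCUMENT> block is read, which is enough to check the primary 10-K filename against the d{number}d10k.htm pattern.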
2. Filing Agent Market Share
Based on secfilingdata.com total filings across all form types:
| Rank | Filing Agent | CIK | 2025 Filings | Total (All Time) |
|---|---|---|---|---|
| 1 | Donnelley Financial (DFIN) | 0001193125 | 65,180 | 1,872,890 |
| 2 | EdgarAgents LLC | 0001213900 | 48,021 | 367,211 |
| 3 | Quality Edgar (QES) | 0001839882 | 38,017 | 151,031 |
| 4 | Toppan Merrill | 0001104659 | 48,260 | 988,715 |
| 5 | WallStreetDocs Ltd | 0001918704 | 22,387 | 56,431 |
| 6 | Workiva (Wdesk) | 0001628280 | 21,606 | 141,795 |
| 7 | M2 Compliance LLC | 0001493152 | 13,810 | 164,603 |
| 8 | Davis Polk & Wardwell LLP | 0000950103 | 16,231 | 326,359 |
| 9 | RDG Filings (ThunderDome) | 0001437749 | 14,209 | 187,270 |
| 10 | Morgan Stanley | 0001950047 | 12,822 | 56,468 |
| 11 | Broadridge | 0001140361 | -- | 597,664 |
| 14 | SECUREX Filings | 0001214659 | -- | 115,218 |
| 19 | Blueprint | 0001654954 | -- | 62,250 |
| 20 | FilePoint | 0001398344 | -- | 76,218 |
| 38 | Discount EDGAR | 0001477932 | -- | 37,422 |
For 10-K/10-Q specifically (estimated from biotech IPO data and market research):
- DFIN: ~40-50% of annual/quarterly filings
- Workiva: ~25-35% (has been gaining share from DFIN since ~2010)
- Toppan Merrill: ~10-15%
- RDG Filings: ~5%
- Broadridge/CompSci: ~5%
- Others (law firms, self-filed, smaller agents): ~5-10%
3. XBRL/iXBRL Tool Signatures
The iXBRL tagging tool is often the same as the filing generator, but not always. Key distinguishing patterns in the iXBRL layer:
| Tool | Context Ref Pattern | Fact ID Pattern | Unit Ref Pattern |
|---|---|---|---|
| Workiva | C_{uuid} | F_{uuid} | U_{uuid} |
| DFIN New | C_{uuid} | F_{uuid} | Standard names |
| DFIN Legacy | Fact_{large_int} | Fact_{large_int} | Standard names |
| Toppan Merrill | As_Of_{date}_{guid} / From_{date}_to_{date}_{guid} | Hidden_{guid} | Unit_Standard_USD_{guid} |
| ThunderDome | d_{date_range} / i_{date} | thunderdome-{name} or ixv-{n} or c{n} | Standard names |
| CompSci Transform | c0, c1, c2 ... | ixv-{number} | Standard names |
| GoFiler (XDX) | From{date}to{date} / AsOf{date} | xdx2ixbrl{number} | Standard names |
| XBRLMaster | From{date}to{date} | ixv-{number} | Standard names |
| Broadridge PROfile | c{date}to{date}_{axis}_{member} | Descriptive | Standard names |
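The fact ID column translates directly into a small pattern table. A sketch under the assumption that the placeholders mean what they appear to ({uuid} as a hyphenated 8-4-4-4-12 hex string, {number} as decimal digits); classify_fact_id is a hypothetical helper, and several patterns are shared by more than one tool, so the labels say so rather than guessing.

```python
import re

UUID = r"[0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}"

# Ordered list of (label, pattern); labels name every tool sharing the pattern.
FACT_ID_PATTERNS = [
    ("Workiva or DFIN New", re.compile("F_" + UUID)),
    ("DFIN Legacy", re.compile(r"Fact_\d+")),
    ("GoFiler (XDX)", re.compile(r"xdx2ixbrl\d+")),
    ("GoFiler (XDX)", re.compile(r"Fact\d{6}")),
    ("ThunderDome", re.compile(r"thunderdome-\w+")),
    ("Toppan Merrill", re.compile(r"Hidden_\S+")),
    ("CompSci Transform, XBRLMaster, or ThunderDome", re.compile(r"ixv-\d+")),
]


def classify_fact_id(fact_id):
    """Return the label of the first pattern matching the whole ID, else None."""
    for label, pattern in FACT_ID_PATTERNS:
        if pattern.fullmatch(fact_id):
            return label
    return None
```

Because F_{uuid} and ixv-{number} are each used by several tools, a fact ID is best treated as confirmation of a signature found elsewhere in the document, not as identification on its own.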
4. Detection Priority (Recommended Heuristic Order)
For maximum reliability, check signatures in this order:
- HTML comments (first 10 lines) -- most generators embed identifying comments
  - Workiva Platform --> Workiva
  - DFIN New ActiveDisclosure --> DFIN New
  - Toppan Merrill Bridge --> Toppan Merrill
  - ThunderDome Portal --> RDG Filings
  - CompSci Transform --> CompSci/Broadridge
  - Broadridge PROfile --> Broadridge
  - XBRLMaster --> Discount EDGAR / NTDAS
- XML namespaces on <html> tag
  - xmlns:thunderdome="http://www.RDGFilings.com" --> RDG
  - xmlns:compsci="http://compsciresources.com" --> CompSci
- XDX comments between head and body --> GoFiler/Novaworks
- Accession number prefix (first 10 digits) --> identifies filing agent CIK
- Body style patterns as fallback
- iXBRL fact ID patterns as secondary confirmation
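The first four steps of this ordering can be sketched as a simple cascade. This is an illustrative outline, not a complete detector: detect_generator is a hypothetical name, the 8000-character head window stands in for "first 10 lines", the CIK map lists only agents named in this reference, and the body-style and fact-ID steps are left out.

```python
import re

COMMENT_SIGNATURES = [
    ("Workiva Platform", "Workiva"),
    ("DFIN New ActiveDisclosure", "DFIN New"),
    ("Toppan Merrill Bridge", "Toppan Merrill"),
    ("ThunderDome Portal", "RDG Filings"),
    ("CompSci Transform", "CompSci/Broadridge"),
    ("Broadridge PROfile", "Broadridge"),
    ("XBRLMaster", "Discount EDGAR / NTDAS"),
]
NAMESPACE_SIGNATURES = [
    ('xmlns:thunderdome="http://www.RDGFilings.com"', "RDG Filings"),
    ('xmlns:compsci="http://compsciresources.com"', "CompSci"),
]
AGENT_CIKS = {
    "0001628280": "Workiva",
    "0000950170": "DFIN New",
    "0001193125": "DFIN Legacy",
    "0001104659": "Toppan Merrill",
    "0001437749": "RDG Filings",
    "0001140361": "Broadridge",
    "0001477932": "Discount EDGAR",
}


def detect_generator(html, accession=None, head_chars=8000):
    """Return (name, evidence) using comments, namespaces, XDX, then accession."""
    head = html[:head_chars]
    comments = " ".join(re.findall(r"<!--(.*?)-->", head, flags=re.DOTALL))
    for needle, name in COMMENT_SIGNATURES:
        if needle in comments:
            return name, "comment"
    for needle, name in NAMESPACE_SIGNATURES:
        if needle in head:
            return name, "namespace"
    if "Field: Set; Name: xdx" in comments:
        return "GoFiler/Novaworks", "xdx"
    if accession:
        agent = AGENT_CIKS.get(accession[:10])
        if agent:
            # The prefix names the filing agent, which may differ from the generator.
            return agent, "filing-agent"
    return None, None
```

Returning the evidence type alongside the name lets a caller treat a "filing-agent" result as weaker than a "comment" result, since an agent may submit HTML produced by another tool.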
5. Known Quality Issues by Generator
CompSci Transform
- Words broken across spans: Text is split at arbitrary character boundaries, not word boundaries. A single word like "cybersecurity" may be split across 2-3 <span> tags. This breaks naive text extraction that operates per-element.
- Empty div spacers: <div>\n\n</div> between every paragraph adds noise.
- Field comments in body: <!-- Field: Rule-Page --> markers interspersed with content.
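One way to cope with all three issues is to join text nodes without separators and break lines only at block-level tags. The following is a sketch using Python's standard html.parser; the class name, the extract_text helper, and the choice of block tags are assumptions for illustration.

```python
from html.parser import HTMLParser

# Tags treated as line breaks; everything else (e.g. <span>) is inline.
BLOCK_TAGS = {"p", "div", "br", "tr", "td", "th", "li", "table",
              "h1", "h2", "h3", "h4", "h5", "h6"}


class FilingTextExtractor(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        # No separator is added, so a word split across spans is rejoined.
        self.parts.append(data)

    # handle_comment is not overridden, so <!-- Field: ... --> markers are dropped.


def extract_text(html):
    parser = FilingTextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts).replace("\xa0", " ")
    lines = (" ".join(line.split()) for line in text.splitlines())
    return "\n".join(line for line in lines if line)  # skip empty spacer lines
```

Empty spacer divs contribute only whitespace, so they disappear when blank lines are filtered out at the end.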
Workiva
- Extreme span nesting: Every text run gets its own <span> with full inline style. A simple bold sentence may have 5+ spans.
- Large file sizes: Inline style repetition causes 10-K files to be 2-5x larger than equivalent DFIN filings.
- Clean word boundaries: Despite heavy span usage, spans align with word/phrase boundaries, making text extraction reliable.
DFIN New ActiveDisclosure
- min-width:fit-content everywhere: Unusual CSS property on every span; may cause rendering inconsistencies in older browsers.
- font-kerning:none: Explicit kerning disable on all text spans.
- Generally clean: Text extraction works well; word boundaries respected.
DFIN Legacy
- Uppercase HTML tags: Older filings use <P>, <B>, <FONT> -- need case-insensitive parsing.
- Mixed HTML versions: Some documents mix HTML 3.2 and 4.0 constructs.
- SGML wrappers: Some filings wrapped in <DOCUMENT> SGML envelope.
GoFiler / Novaworks
- XDX comment noise: Multiple <!-- Field: Set; ... --> comments that must be stripped.
- Generally clean HTML: Body content is straightforward.
Toppan Merrill Bridge
- Clean output: Among the cleanest generators. Minimal inline style bloat.
- GUID-heavy IDs: Context and unit refs use base64-like GUIDs that are less human-readable.
6. Self-Filed / In-House Filings
Some large filers submit directly using their own CIK as the accession number prefix. These filings have no generator comment and variable HTML quality.
Detection: Accession number prefix matches the filer's own CIK (e.g., Halliburton CIK 0000045012 files with accession 0000045012-25-000010).
However: Even self-filed companies typically use a commercial tool. Halliburton's self-filed 10-K contains the Workiva comment signature, indicating they use Workiva but submit directly rather than through a filing agent.
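The prefix comparison is a one-line check once both values are normalized. A minimal sketch, assuming the usual 10-2-6 digit accession layout; both function names are hypothetical.

```python
def accession_prefix(accession):
    """First 10 digits of an accession number, with or without dashes."""
    digits = accession.replace("-", "")
    if len(digits) != 18 or not digits.isdigit():
        raise ValueError(f"unexpected accession number: {accession!r}")
    return digits[:10]


def is_self_filed(accession, filer_cik):
    """True when the accession prefix equals the filer's zero-padded CIK."""
    return accession_prefix(accession) == str(filer_cik).zfill(10)
```

With the Halliburton example, `is_self_filed("0000045012-25-000010", 45012)` is true. A true result only says who submitted the filing; the generator still has to be read from the HTML signatures.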
Truly in-house HTML (no commercial tool) is rare among 10-K filers. When it occurs:
- No identifying comments
- No consistent structural patterns
- May use Word-to-HTML conversion (look for mso- CSS prefixes from Microsoft Office)
- May have minimal or no iXBRL tagging
7. Law Firm Filings
Several large law firms act as filing agents:
- Davis Polk & Wardwell (0000950103) -- 326K total filings
- Paul Weiss (0000950142) -- 56K total filings
- Foley & Lardner (0000897069) -- 30K total filings
- Sidley Austin (0000905148) -- 39K total filings
- Seward & Kissel (0000919574) -- 107K total filings
Law firms typically file transactional documents (S-1, proxy, 8-K) rather than periodic 10-K filings. The HTML in law-firm-filed documents often comes from Word conversion and lacks commercial generator signatures.
8. Summary: Quick Detection Regex Table
Pattern | Generator
-----------------------------------------------------|------------------
/Workiva Platform/ | Workiva
/DFIN New ActiveDisclosure/ | DFIN (New)
/Donnelley Financial Solutions/ | DFIN (New)
/Toppan Merrill Bridge/ | Toppan Merrill
/ThunderDome Portal/ | RDG Filings
/CompSci Transform/ | CompSci/Broadridge
/Broadridge PROfile/ | Broadridge
/XBRLMaster/ | Discount EDGAR
/xmlns:thunderdome="http:\/\/www\.RDGFilings\.com"/ | RDG Filings
/xmlns:compsci="http:\/\/compsciresources\.com"/ | CompSci
/Field: Set; Name: xdx/ | GoFiler/Novaworks
/dfinsolutions\.com/ | DFIN
/min-width:fit-content/ | DFIN (New)
/BRPFPage/ | Broadridge PROfile
/id="XBRLDIV"/ | XBRLMaster
Sources
- Direct inspection of SEC EDGAR filings (March 2026)
- secfilingdata.com/top-filing-agents -- filing agent rankings
- newstreetir.com -- Top SEC Filing Agents for Biotech IPOs -- biotech IPO market share
- houseblend.io -- SEC Filing Software Platforms -- vendor comparison
- novaworkssoftware.com/inlinexbrl -- XDX format documentation
- rdgfilings.com/thunderdome -- ThunderDome Portal
- toppanmerrill.com/bridge -- Toppan Merrill Bridge
- edgarmaster.com -- EDGARMaster / XBRLMaster by NTDAS
- pernasresearch.com -- DFIN analysis -- market share dynamics