# SEC EDGAR Filing Generator Reference

Reference for identifying which software generated a given SEC 10-K HTML filing. Built from direct inspection of EDGAR filings and market research (March 2026).

---

## 1. Major Vendors and HTML Signatures

### Workiva (Wdesk) -- Market Leader for 10-K/10-Q

**Filing agent CIK:** `0001628280`

**HTML comment signature (lines 1-3):**

```html
```

**Detection heuristics:**
- HTML comment: `XBRL Document Created with the Workiva Platform`
- HTML comment: `Copyright \d{4} Workiva`
- Third comment line contains `r:`, `g:`, `d:` UUIDs (document/generation tracking)
- `xml:lang="en-US"` attribute on `<html>` tag
- Body uses inline styles exclusively (no CSS classes on content elements)
- Heavy use of `<span>` with inline styles containing `background-color`, `font-family`, `font-size`, `font-weight`, `line-height` in every span
- Div IDs follow pattern: `i{hex32}_{number}` (e.g., `id="i56b78781f7c84a038f6ae0f6244f7dd8_1"`)
- Tables use `display:inline-table` and `vertical-align:text-bottom`
- iXBRL fact IDs follow pattern: `F_{uuid}` (e.g., `id="F_d8dc1eb1-109d-445d-a55a-3dde1a81ca63"`)
- No `` tag

**Structural patterns:**
- Span-heavy: nearly every text fragment wrapped in `<span>`
- Font specified as `font-family:'Times New Roman',sans-serif` (note: sans-serif fallback, unusual)
- Line-height specified on every span (e.g., `line-height:120%`)
- Background color explicitly set: `background-color:#ffffff`

**Known quality issues:**
- Extremely verbose HTML; simple paragraphs become deeply nested span trees
- Large file sizes due to inline style repetition
- Despite the verbosity, text extraction is clean because span boundaries align with word boundaries

---

### DFIN / Donnelley Financial Solutions (ActiveDisclosure)

DFIN files under **two main filing agent CIKs**, each tied to a **different HTML output format**.
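Because each DFIN output format is tied to its own filing agent CIK, the first 10 digits of the accession number give a cheap first guess at which variant produced a filing. The following is a minimal sketch, assuming accession numbers in the usual `NNNNNNNNNN-YY-NNNNNN` form (dashes optional). The names `DFIN_AGENT_CIKS` and `dfin_variant_from_accession` are hypothetical; the CIK values are the ones listed in the two subsections below.

```python
import re
from typing import Optional

# Filing agent CIKs as listed in this reference (not an exhaustive list).
DFIN_AGENT_CIKS = {
    "0000950170": "DFIN New ActiveDisclosure",
    "0000950130": "DFIN New ActiveDisclosure",
    "0001193125": "DFIN Legacy",
}

# 10-digit agent CIK, 2-digit year, 6-digit sequence; dashes are optional.
_ACCESSION_RE = re.compile(r"^(\d{10})-?(\d{2})-?(\d{6})$")


def dfin_variant_from_accession(accession: str) -> Optional[str]:
    """Return the DFIN variant implied by the accession prefix, or None."""
    match = _ACCESSION_RE.match(accession.strip())
    if match is None:
        raise ValueError(f"not an accession number: {accession!r}")
    return DFIN_AGENT_CIKS.get(match.group(1))
```

A `None` result only means the prefix is not one of the DFIN agent CIKs listed here; the HTML-level heuristics are still needed to confirm the generator.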

#### DFIN "New" ActiveDisclosure (primary)

**Filing agent CIK:** `0000950170` (also `0000950130`)

**HTML comment signature:**

```html
```

**Detection heuristics:**
- HTML comment: `DFIN New ActiveDisclosure`
- HTML comment: `http://www.dfinsolutions.com/`
- HTML comment: `Copyright (c) \d{4} Donnelley Financial Solutions`
- HTML comment: `Creation Date :` with ISO timestamp
- Body style: `padding:8px;margin:auto!important;`
- Inline styles use `font-kerning:none;min-width:fit-content;` on most spans
- Extensive use of `white-space:pre-wrap` on spans
- CSS classes `item-list-element-wrapper` and `page-border-spacing` present
- iXBRL fact IDs follow pattern: `F_{uuid}`

**Structural patterns:**
- Every text span carries `min-width:fit-content` (distinctive)
- Uses `&nbsp;` for spacing extensively
- Uses `<p>` tags with inline margins for all paragraphs
- Tables use explicit `padding-top:0in;vertical-align:top;padding-bottom:0in` cell styles

#### DFIN Legacy (RR Donnelley heritage)

**Filing agent CIK:** `0001193125`

**HTML signature:**

```html
10-K

Table of Contents
```

**Detection heuristics:**
- No identifying HTML comments (no generator/copyright comment)
- Accession number prefix `0001193125` is definitive
- Immediately starts with a Table of Contents link
- Uses deprecated namespace aliases: `xmlns:xl`, `xmlns:xbrll`, `xmlns:deprecated`
- iXBRL fact IDs follow pattern: `Fact_{large_number}` (e.g., `id="Fact_129727210"`)
- Uses `<font>` tags (HTML 3.2 style) in some documents
- Uppercase HTML tags in older filings

**Structural patterns:**
- Cleaner HTML than ActiveDisclosure New
- Uses a semantic element for the table of contents
- Inline styles are simpler and more standard
- Filenames follow pattern: `d{number}d10k.htm`

---

### Toppan Merrill (Bridge)

**Filing agent CIKs:** `0001104659` (primary), `0001558370` (secondary)

**HTML comment signature:**

```html
```

**Detection heuristics:**
- HTML comment: `iXBRL document created with: Toppan Merrill Bridge iXBRL`
- HTML comment: `iXBRL Library version:`
- HTML comment: `iXBRL Service Job ID:`
- Includes version number in comment (e.g., `10.9.0.3`)
- `<title>` tag contains company name + period end date (e.g., `Sunstone Hotel Investors, Inc._December 31, 2024`)
- Uses `xmlns:xs` alongside `xmlns:xsi` (both XML Schema namespaces)
- Body starts with `<div style="margin-top:30pt;"></div>` (distinctive)
- iXBRL hidden div uses `display:none;` (no additional styles on the div)

**Structural patterns:**
- Context IDs use descriptive names with GUIDs: `As_Of_12_31_2024_{base64-like}`, `From_01_01_2024_to_12_31_2024_{guid}`
- Hidden fact IDs follow pattern: `Hidden_{base64-like}`
- Unit ref IDs follow pattern: `Unit_Standard_USD_{base64-like}`
- No CSS classes used on content elements
- Relatively clean HTML structure

---

### RDG Filings (ThunderDome Portal)

**Filing agent CIK:** `0001437749`

**HTML signature:**

```html
<?xml version='1.0' encoding='ASCII'?>
<html xmlns:thunderdome="http://www.RDGFilings.com" ...>
<head>
<title>avpt20241231_10k.htm
```

**Detection heuristics:**
- XML namespace: `xmlns:thunderdome="http://www.RDGFilings.com"`
- HTML comment: `Generated by ThunderDome Portal`
- `<title>` contains the filing filename
- Body style includes `cursor: auto; padding: 0in 0.1in`
- iXBRL fact IDs prefixed with `thunderdome-` (e.g., `id="thunderdome-EntityCentralIndexKey"`)
- Context ref IDs use simple date ranges: `d_2024-01-01_2024-12-31`
- Other fact IDs follow `ixv-{number}` or `c{number}` pattern

**Market presence:** ~14,000 filings/year across all form types, rank #9 among filing agents. About 5% of 10-K/10-Q filings (see Section 2).
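
The ThunderDome heuristics above lend themselves to a simple signal check. The following is a rough sketch, assuming the raw filing HTML is already loaded into a string; `THUNDERDOME_SIGNALS` and `thunderdome_signals` are hypothetical names, and the regexes are one possible encoding of the listed heuristics rather than a vetted detector.

```python
import re
from typing import List

# One regex per heuristic listed above; the keys are illustrative labels.
THUNDERDOME_SIGNALS = {
    "namespace": re.compile(r'xmlns:thunderdome="http://www\.RDGFilings\.com"'),
    "comment": re.compile(r"Generated by ThunderDome Portal"),
    "fact_id_prefix": re.compile(r'id="thunderdome-[^"]+"'),
    # Duration contexts such as d_2024-01-01_2024-12-31
    "duration_context": re.compile(r'"d_\d{4}-\d{2}-\d{2}_\d{4}-\d{2}-\d{2}"'),
}


def thunderdome_signals(html: str) -> List[str]:
    """Return the names of all ThunderDome heuristics that match the HTML."""
    return [
        name
        for name, pattern in THUNDERDOME_SIGNALS.items()
        if pattern.search(html)
    ]
```

Returning the list of matched signals, rather than a single boolean, lets the caller decide how many independent hits are enough; the namespace or comment alone is likely sufficient, while the ID patterns are better treated as confirmation.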

---

### Broadridge Financial Solutions (PROfile)

**Filing agent CIKs:** `0001140361` (primary), `0001133228` (secondary)

**HTML comment signature:**

```html
<!-- Licensed to: Broadridge
Document created using Broadridge PROfile 25.1.1.5279
Copyright 1995 - 2025 Broadridge -->
```

**Detection heuristics:**
- HTML comment: `Licensed to: Broadridge`
- HTML comment: `Document created using Broadridge PROfile` with version number
- HTML comment: `Copyright 1995 - \d{4} Broadridge`
- CSS classes with `BRPF` prefix: `BRPFPageBreak`, `BRPFPageBreakArea`, `BRPFPageFooter`, `BRPFPageHeader`, `BRPFPageNumberArea`
- CSS class: `DSPFListTable`
- CSS class: `cfttable`
- CSS class: `Apple-interchange-newline` (suggests Mac/WebKit origin)
- Context ref IDs use XBRL-standard descriptive format: `c20240101to20241231_AxisName_MemberName`

**Note:** Broadridge acquired CompSci Resources LLC in July 2024 and is integrating CompSci's Transform platform. Filings may transition to Broadridge branding over time.

---

### CompSci / Novaworks (Transform and GoFiler)

CompSci Resources (Transform) and Novaworks (GoFiler) each produce a tool that leaves a distinct signature.

#### CompSci Transform (now Broadridge)

**Filed via:** EdgarAgents LLC (`0001213900`) or other agents

**HTML comment signature:**

```html
<?xml version='1.0' encoding='ASCII'?>
<!-- Generated by CompSci Transform (tm) - http://www.compsciresources.com -->
<!-- Created: Mon Mar 17 19:46:10 UTC 2025 -->
```

**Detection heuristics:**
- HTML comment: `Generated by CompSci Transform`
- HTML comment: `http://www.compsciresources.com`
- XML namespace: `xmlns:compsci="http://compsciresources.com"`
- Body wrapped in: `<div style="font: 10pt Times New Roman, Times, Serif">`
- Uses `<!-- Field: Rule-Page -->` and `<!-- Field: /Rule-Page -->` HTML comments as structural markers
- Empty `<div>` tags used as spacers between paragraphs
- iXBRL context refs use simple sequential IDs: `c0`, `c1`, `c2`, ...
- iXBRL fact IDs follow `ixv-{number}` pattern
- Uses shorthand CSS: `font: 10pt Times New Roman, Times, Serif` (combined property)
- Margin shorthand: `margin: 0pt 0`

**Known quality issues:**
- Words can be broken across `<span>` tags mid-word
- Heavy use of `&nbsp;` for spacing
- Empty divs between every paragraph create parsing noise
- `<!-- Field: ... -->` comments interspersed throughout document body

#### Novaworks GoFiler (XDX format)

**Filed via:** SECUREX Filings (`0001214659`) or self-filed

**HTML signature:**

```html
<head>
<title>
```

**Detection heuristics:**
- HTML comments containing `Field: Set; Name: xdx` (the pattern used in the Section 8 regex table)
- XDX comments appear between `</head>` and `<body>` (unusual placement)
- Body style: `font: 10pt Times New Roman, Times, Serif` (same shorthand as CompSci)
- Empty `<title>` tag
- iXBRL fact IDs use `xdx2ixbrl{number}` pattern (e.g., `id="xdx2ixbrl0102"`)
- Standard fact IDs use `Fact{number:06d}` pattern (e.g., `id="Fact000003"`)
- Context refs use `From{date}to{date}` or `AsOf{date}` format (no separators within date)

**XDX explained:** XDX (XBRL Data Exchange) is GoFiler's proprietary format that uses HTML tag ID attributes ("engrams") to embed XBRL metadata. The `xdx_` comments carry taxonomy, entity, period, and unit definitions that GoFiler uses to generate the final iXBRL.

---

### Discount EDGAR / NTDAS (XBRLMaster / EDGARMaster)

**Filing agent CIK:** `0001477932`

**HTML signature:**

```html
crona_10k.htm
```

**Detection heuristics:**
- HTML comment: `Document Created by XBRLMaster`
- Body style: `text-align:justify;font:10pt times new roman`
- Hidden iXBRL div has `id="XBRLDIV"`
- Additional body styles include `margin-left:7%;margin-right:7%`
- Uses lowercase `times new roman` (no capitalization)
- iXBRL fact IDs use `ixv-{number}` pattern

---

### EdgarAgents LLC

**Filing agent CIK:** `0001213900`

EdgarAgents is a filing agent service, not a document creation tool. The HTML they submit is typically generated by CompSci Transform, GoFiler, or other tools. Check the HTML comments to identify the actual generator.

---

### DFIN Legacy (pre-iXBRL / SGML-era)

**Filing agent CIK:** `0001193125`

Older filings (pre-2019) from this CIK may appear in an SGML wrapper format:

```html
10-K 1 d913213d10k.htm 10-K 10-K
```

**Detection heuristics:**
- Uppercase HTML tags
- `BGCOLOR="WHITE"` attribute (deprecated HTML)
- `` tag with capital C
- `` tags for styling
- Filename pattern: `d{number}d10k.htm`

---

## 2. Filing Agent Market Share

Based on [secfilingdata.com](https://www.secfilingdata.com/top-filing-agents/) total filings across all form types:

| Rank | Filing Agent | CIK | 2025 Filings | Total (All Time) |
|------|-------------|-----|-------------|-----------------|
| 1 | Donnelley Financial (DFIN) | 0001193125 | 65,180 | 1,872,890 |
| 2 | EdgarAgents LLC | 0001213900 | 48,021 | 367,211 |
| 3 | Quality Edgar (QES) | 0001839882 | 38,017 | 151,031 |
| 4 | Toppan Merrill | 0001104659 | 48,260 | 988,715 |
| 5 | WallStreetDocs Ltd | 0001918704 | 22,387 | 56,431 |
| 6 | Workiva (Wdesk) | 0001628280 | 21,606 | 141,795 |
| 7 | M2 Compliance LLC | 0001493152 | 13,810 | 164,603 |
| 8 | Davis Polk & Wardwell LLP | 0000950103 | 16,231 | 326,359 |
| 9 | RDG Filings (ThunderDome) | 0001437749 | 14,209 | 187,270 |
| 10 | Morgan Stanley | 0001950047 | 12,822 | 56,468 |
| 11 | Broadridge | 0001140361 | -- | 597,664 |
| 14 | SECUREX Filings | 0001214659 | -- | 115,218 |
| 19 | Blueprint | 0001654954 | -- | 62,250 |
| 20 | FilePoint | 0001398344 | -- | 76,218 |
| 38 | Discount EDGAR | 0001477932 | -- | 37,422 |

**For 10-K/10-Q specifically (estimated from biotech IPO data and market research):**
- DFIN: ~40-50% of annual/quarterly filings
- Workiva: ~25-35% (has been gaining share from DFIN since ~2010)
- Toppan Merrill: ~10-15%
- RDG Filings: ~5%
- Broadridge/CompSci: ~5%
- Others (law firms, self-filed, smaller agents): ~5-10%

---

## 3. XBRL/iXBRL Tool Signatures

The iXBRL tagging tool is often the same as the filing generator, but not always.
Key distinguishing patterns in the iXBRL layer:

| Tool | Context Ref Pattern | Fact ID Pattern | Unit Ref Pattern |
|------|-------------------|----------------|-----------------|
| Workiva | `C_{uuid}` | `F_{uuid}` | `U_{uuid}` |
| DFIN New | `C_{uuid}` | `F_{uuid}` | Standard names |
| DFIN Legacy | `Fact_{large_int}` | `Fact_{large_int}` | Standard names |
| Toppan Merrill | `As_Of_{date}_{guid}` / `From_{date}_to_{date}_{guid}` | `Hidden_{guid}` | `Unit_Standard_USD_{guid}` |
| ThunderDome | `d_{date_range}` / `i_{date}` | `thunderdome-{name}` or `ixv-{n}` or `c{n}` | Standard names |
| CompSci Transform | `c0`, `c1`, `c2` ... | `ixv-{number}` | Standard names |
| GoFiler (XDX) | `From{date}to{date}` / `AsOf{date}` | `xdx2ixbrl{number}` | Standard names |
| XBRLMaster | `From{date}to{date}` | `ixv-{number}` | Standard names |
| Broadridge PROfile | `c{date}to{date}_{axis}_{member}` | Descriptive | Standard names |

---

## 4. Detection Priority (Recommended Heuristic Order)

For maximum reliability, check signatures in this order:

1. **HTML comments** (first 10 lines) -- most generators embed identifying comments
   - `Workiva Platform` --> Workiva
   - `DFIN New ActiveDisclosure` --> DFIN New
   - `Toppan Merrill Bridge` --> Toppan Merrill
   - `ThunderDome Portal` --> RDG Filings
   - `CompSci Transform` --> CompSci/Broadridge
   - `Broadridge PROfile` --> Broadridge
   - `XBRLMaster` --> Discount EDGAR / NTDAS
2. **XML namespaces** on `<html>` tag
   - `xmlns:thunderdome="http://www.RDGFilings.com"` --> RDG
   - `xmlns:compsci="http://compsciresources.com"` --> CompSci
3. **XDX comments** between head and body --> GoFiler/Novaworks
4. **Accession number prefix** (first 10 digits) --> identifies filing agent CIK
5. **Body style patterns** as fallback
6. **iXBRL fact ID patterns** as secondary confirmation

---

## 5. Known Quality Issues by Generator

### CompSci Transform
- **Words broken across spans**: Text is split at arbitrary character boundaries, not word boundaries. A single word like "cybersecurity" may be split across 2-3 `<span>` tags. This breaks naive text extraction that operates per-element.
- **Empty div spacers**: `<div>\n\n</div>` between every paragraph adds noise.
- **Field comments in body**: `<!-- Field: ... -->` markers interspersed with content.

### Workiva
- **Extreme span nesting**: Every text run gets its own `<span>` with full inline style. A simple bold sentence may have 5+ spans.
- **Large file sizes**: Inline style repetition causes 10-K files to be 2-5x larger than equivalent DFIN filings.
- **Clean word boundaries**: Despite heavy span usage, spans align with word/phrase boundaries, making text extraction reliable.

### DFIN New ActiveDisclosure
- **`min-width:fit-content` everywhere**: Unusual CSS property on every span; may cause rendering inconsistencies in older browsers.
- **`font-kerning:none`**: Explicit kerning disable on all text spans.
- **Generally clean**: Text extraction works well; word boundaries respected.

### DFIN Legacy
- **Uppercase HTML tags**: Older filings use uppercase tag names -- need case-insensitive parsing.
- **Mixed HTML versions**: Some documents mix HTML 3.2 and 4.0 constructs.
- **SGML wrappers**: Some filings wrapped in an SGML envelope.

### GoFiler / Novaworks
- **XDX comment noise**: Multiple XDX comments that must be stripped.
- **Generally clean HTML**: Body content is straightforward.

### Toppan Merrill Bridge
- **Clean output**: Among the cleanest generators. Minimal inline style bloat.
- **GUID-heavy IDs**: Context and unit refs use base64-like GUIDs that are less human-readable.

---

## 6. Self-Filed / In-House Filings

Some large filers submit directly using their own CIK as the accession number prefix. These filings have **no generator comment** and variable HTML quality.

**Detection:** Accession number prefix matches the filer's own CIK (e.g., Halliburton CIK `0000045012` files with accession `0000045012-25-000010`).

**However:** Even self-filed companies typically use a commercial tool. Halliburton's self-filed 10-K contains the Workiva comment signature, indicating they use Workiva but submit directly rather than through a filing agent.

**Truly in-house HTML** (no commercial tool) is rare among 10-K filers. When it occurs:
- No identifying comments
- No consistent structural patterns
- May use Word-to-HTML conversion (look for `mso-` CSS prefixes from Microsoft Office)
- May have minimal or no iXBRL tagging

---

## 7. Law Firm Filings

Several large law firms act as filing agents:
- Davis Polk & Wardwell (`0000950103`) -- 326K total filings
- Paul Weiss (`0000950142`) -- 56K total filings
- Foley & Lardner (`0000897069`) -- 30K total filings
- Sidley Austin (`0000905148`) -- 39K total filings
- Seward & Kissel (`0000919574`) -- 107K total filings

Law firms typically file transactional documents (S-1, proxy, 8-K) rather than periodic 10-K filings. The HTML in law-firm-filed documents often comes from Word conversion and lacks commercial generator signatures.

---

## 8. Summary: Quick Detection Regex Table

```
Pattern                                              | Generator
-----------------------------------------------------|------------------
/Workiva Platform/                                   | Workiva
/DFIN New ActiveDisclosure/                          | DFIN (New)
/Donnelley Financial Solutions/                      | DFIN (New)
/Toppan Merrill Bridge/                              | Toppan Merrill
/ThunderDome Portal/                                 | RDG Filings
/CompSci Transform/                                  | CompSci/Broadridge
/Broadridge PROfile/                                 | Broadridge
/XBRLMaster/                                         | Discount EDGAR
/xmlns:thunderdome="http:\/\/www\.RDGFilings\.com"/  | RDG Filings
/xmlns:compsci="http:\/\/compsciresources\.com"/     | CompSci
/Field: Set; Name: xdx/                              | GoFiler/Novaworks
/dfinsolutions\.com/                                 | DFIN
/min-width:fit-content/                              | DFIN (New)
/BRPFPage/                                           | Broadridge PROfile
/id="XBRLDIV"/                                       | XBRLMaster
```

---

## Sources

- Direct inspection of SEC EDGAR filings (March 2026)
- [secfilingdata.com/top-filing-agents](https://www.secfilingdata.com/top-filing-agents/) -- filing agent rankings
- [newstreetir.com -- Top SEC Filing Agents for Biotech IPOs](https://newstreetir.com/2025/05/14/who-are-the-top-sec-filing-agents-for-biotech-ipos/) -- biotech IPO market share
- [houseblend.io -- SEC Filing Software Platforms](https://www.houseblend.io/articles/sec-filing-software-platforms-pricing-compliance) -- vendor comparison
- [novaworkssoftware.com/inlinexbrl](https://www.novaworkssoftware.com/inlinexbrl.php) -- XDX format documentation
- [rdgfilings.com/thunderdome](https://rdgfilings.com/thunderdome-client-portal/) -- ThunderDome Portal
- [toppanmerrill.com/bridge](https://www.toppanmerrill.com/bridge/) -- Toppan Merrill Bridge
- [edgarmaster.com](https://edgarmaster.com/) -- EDGARMaster / XBRLMaster by NTDAS
- [pernasresearch.com -- DFIN analysis](https://pernasresearch.com/research-vault/donnelley-financial-initiation/) -- market share dynamics
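
## Appendix: Detection Sketch

The quick-detection table in Section 8, combined with the priority order in Section 4, can be sketched as a small lookup routine. This is an illustrative sketch under stated assumptions, not a reference implementation: the names `PRIMARY_SIGNATURES`, `FALLBACK_SIGNATURES`, `AGENT_CIKS`, and `detect_generator` are hypothetical, the patterns are searched over the whole document rather than only the first 10 lines, and the accession-prefix step identifies a filing agent rather than a confirmed generator.

```python
import re
from typing import Optional, Tuple

# Steps 1-3 of Section 4: generator comments, XML namespaces, XDX comments.
PRIMARY_SIGNATURES = [
    (r"Workiva Platform", "Workiva"),
    (r"DFIN New ActiveDisclosure", "DFIN (New)"),
    (r"Donnelley Financial Solutions", "DFIN (New)"),
    (r"Toppan Merrill Bridge", "Toppan Merrill"),
    (r"ThunderDome Portal", "RDG Filings"),
    (r"CompSci Transform", "CompSci/Broadridge"),
    (r"Broadridge PROfile", "Broadridge"),
    (r"XBRLMaster", "Discount EDGAR"),
    (r'xmlns:thunderdome="http://www\.RDGFilings\.com"', "RDG Filings"),
    (r'xmlns:compsci="http://compsciresources\.com"', "CompSci"),
    (r"Field: Set; Name: xdx", "GoFiler/Novaworks"),
    (r"dfinsolutions\.com", "DFIN"),
]

# Step 4: accession prefix. EdgarAgents (0001213900) is left out on purpose,
# because it is a filing service rather than a document generator.
AGENT_CIKS = {
    "0001628280": "Workiva",
    "0000950170": "DFIN (New)",
    "0001193125": "DFIN Legacy",
    "0001104659": "Toppan Merrill",
    "0001437749": "RDG Filings",
    "0001140361": "Broadridge",
    "0001477932": "Discount EDGAR",
}

# Step 5: body style and class patterns, used only as a fallback.
FALLBACK_SIGNATURES = [
    (r"min-width:fit-content", "DFIN (New)"),
    (r"BRPFPage", "Broadridge PROfile"),
    (r'id="XBRLDIV"', "XBRLMaster"),
]


def detect_generator(
    html: str, accession: Optional[str] = None
) -> Optional[Tuple[str, str]]:
    """Return (label, evidence) for the first matching signature, else None."""
    for pattern, label in PRIMARY_SIGNATURES:
        if re.search(pattern, html):
            return label, pattern
    if accession:
        prefix = accession.strip()[:10]
        if prefix in AGENT_CIKS:
            return AGENT_CIKS[prefix], "accession prefix " + prefix
    for pattern, label in FALLBACK_SIGNATURES:
        if re.search(pattern, html):
            return label, pattern
    return None
```

Returning the matched evidence alongside the label makes it easier to audit weak matches, such as a filing identified only by its accession prefix or by a single style pattern.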