Nick George

Simple and Efficient SEC Filing Parsing Using Go

First published: December 30, 2025
Last updated: December 30, 2025

Wave Life Sciences insider trading patterns. Arcs represent 10b5-1 plan initiation and execution. Data from 2025; see RxDataLab.com for more information.

There are many SEC filing parsers, but this one is mine.

Public companies in the US are required to file standardized disclosures with the SEC. Tracking those filings and extracting structured data from them is extremely useful but surprisingly difficult. I run an application focused on biotech company activity, and SEC filings are a core input that can tell you a lot about what companies are up to. For example, the cover image above is a screenshot of an interactive plot of insider trades, built from Form 4 filings for Wave Life Sciences.

SEC filings should be easy to parse, especially with standards like inline XBRL, but in practice they are very challenging. Filings contain inconsistent formatting, edge cases, footnotes that matter, and ambiguous fields that require interpretation. This complexity is one reason data vendors charge tens of thousands of dollars per year for “fundamental” data that is public but hard to work with at scale.

After surveying existing tools and testing several parsers, I came to the conclusion that SEC parsing is an inherently personal problem. Most libraries try to solve too much at once or assume a language-specific workflow, rather than just providing the data and getting out of your way. For example, a common pattern is to output pandas data frames and assume the rest of your pipeline is in Python. My application is written in Go and optimized for minimal dependencies. I want a fast, lightweight parser that returns exactly the fields I care about, in a format I can extend as needed. With modern tooling and LLMs to help handle edge cases, it is much more practical to build incrementally than to adapt a heavy, opinionated stack.

go-edgar

go-edgar is a small Go library and CLI for downloading and parsing SEC filings into clean, structured JSON. I wrote the initial version over a weekend, focusing on getting the core plumbing right: fetching filings, parsing XML, and emitting predictable output.
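To make the "fetching filings" step concrete, here is a minimal sketch of how a filing document URL can be assembled from a CIK and an accession number. It follows the shape of the `source` URL in the Form 4 example in this post (leading zeros trimmed from the CIK, dashes removed from the accession number). The function names `filingURL` and `newFilingRequest` and the contact string are illustrative assumptions, not go-edgar's actual API.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// filingURL builds the EDGAR archive URL for one document in a filing.
// Hypothetical helper: the CIK loses its leading zeros and the accession
// number loses its dashes, matching the "source" URL shown in this post.
func filingURL(cik, accession, document string) string {
	trimmedCIK := strings.TrimLeft(cik, "0")
	flatAccession := strings.ReplaceAll(accession, "-", "")
	return fmt.Sprintf("https://www.sec.gov/Archives/edgar/data/%s/%s/%s",
		trimmedCIK, flatAccession, document)
}

// newFilingRequest prepares a GET request with a descriptive User-Agent.
// The request is only constructed here, not sent.
func newFilingRequest(url, userAgent string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("User-Agent", userAgent)
	return req, nil
}

func main() {
	url := filingURL("0001631574", "0001193125-25-314736", "ownership.xml")
	fmt.Println(url)

	// Placeholder contact string; replace with your own details.
	req, err := newFilingRequest(url, "Example Research admin@example.com")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.Header.Get("User-Agent"))
}
```

The SEC asks automated clients to identify themselves with a User-Agent header, which is why the request helper sets one explicitly rather than relying on Go's default.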

The first form I implemented was Form 4, which reports insider transactions. Form 4 has a relatively simple structure and is immediately useful for understanding executive behavior. I recently relied on this data when analyzing unusual insider activity at Wave Life Sciences.
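As a rough illustration of why Form 4 is a good first target, the sketch below decodes a hand-trimmed ownership document with Go's standard `encoding/xml` package and re-emits it as JSON. The sample XML is reconstructed from the values in this post, not copied from the real filing, and the element names reflect my understanding of the ownership XML layout, so check them against an actual document. The struct and function names are assumptions and are not go-edgar's real types.

```go
package main

import (
	"encoding/json"
	"encoding/xml"
	"fmt"
)

// sampleXML is a trimmed, hand-written stand-in for a Form 4 ownership document.
const sampleXML = `<?xml version="1.0"?>
<ownershipDocument>
  <schemaVersion>X0508</schemaVersion>
  <periodOfReport>2025-12-08</periodOfReport>
  <issuer>
    <issuerCik>0001631574</issuerCik>
    <issuerName>Wave Life Sciences Ltd.</issuerName>
    <issuerTradingSymbol>WVE</issuerTradingSymbol>
  </issuer>
  <nonDerivativeTable>
    <nonDerivativeTransaction>
      <securityTitle><value>Ordinary Shares</value></securityTitle>
      <transactionDate><value>2025-12-08</value></transactionDate>
      <transactionCoding><transactionCode>S</transactionCode></transactionCoding>
      <transactionAmounts>
        <transactionShares><value>60000</value></transactionShares>
        <transactionPricePerShare><value>13.2</value></transactionPricePerShare>
        <transactionAcquiredDisposedCode><value>D</value></transactionAcquiredDisposedCode>
      </transactionAmounts>
    </nonDerivativeTransaction>
  </nonDerivativeTable>
</ownershipDocument>`

// transaction flattens the nested <value> wrappers using "a>b" tag paths.
type transaction struct {
	SecurityTitle    string  `xml:"securityTitle>value" json:"securityTitle"`
	TransactionDate  string  `xml:"transactionDate>value" json:"transactionDate"`
	TransactionCode  string  `xml:"transactionCoding>transactionCode" json:"transactionCode"`
	Shares           float64 `xml:"transactionAmounts>transactionShares>value" json:"shares"`
	PricePerShare    float64 `xml:"transactionAmounts>transactionPricePerShare>value" json:"pricePerShare"`
	AcquiredDisposed string  `xml:"transactionAmounts>transactionAcquiredDisposedCode>value" json:"acquiredDisposed"`
}

// issuer holds the company the filing is about.
type issuer struct {
	CIK    string `xml:"issuerCik" json:"cik"`
	Name   string `xml:"issuerName" json:"name"`
	Ticker string `xml:"issuerTradingSymbol" json:"ticker"`
}

// form4 covers only the handful of fields used in this sketch.
type form4 struct {
	XMLName        xml.Name      `xml:"ownershipDocument" json:"-"`
	SchemaVersion  string        `xml:"schemaVersion" json:"schemaVersion"`
	PeriodOfReport string        `xml:"periodOfReport" json:"periodOfReport"`
	Issuer         issuer        `xml:"issuer" json:"issuer"`
	Transactions   []transaction `xml:"nonDerivativeTable>nonDerivativeTransaction" json:"transactions"`
}

// parseForm4 decodes the XML into the reduced form4 struct.
func parseForm4(data []byte) (*form4, error) {
	var doc form4
	if err := xml.Unmarshal(data, &doc); err != nil {
		return nil, fmt.Errorf("parse form 4: %w", err)
	}
	return &doc, nil
}

func main() {
	doc, err := parseForm4([]byte(sampleXML))
	if err != nil {
		panic(err)
	}
	out, err := json.MarshalIndent(doc, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Mapping nested elements straight onto flat struct fields keeps the output shape under your control, which is the main appeal of writing a small parser instead of adapting a general one.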

Below is a section of the parsed output from a Form 4 filed by the CFO of Wave Life Sciences, covering an insider sale under a pre-arranged trading plan:

{
  "formType": "4",
  "data": {
    "metadata": {
      "cik": "0001631574",
      "accessionNumber": "0001193125-25-314736",
      "formType": "4",
      "periodOfReport": "2025-12-08",
      "filingDate": "",
      "reportDate": "",
      "source": "https://www.sec.gov/Archives/edgar/data/1631574/000119312525314736/ownership.xml"
    },
    "schemaVersion": "X0508",
    "has10b51Plan": true,
    "issuer": {
      "cik": "0001631574",
      "name": "Wave Life Sciences Ltd.",
      "ticker": "WVE"
    },
    "reportingOwners": [
      {
        "cik": "0001657765",
        "name": "Moran Kyle",
        "relationship": {
          "isDirector": false,
          "isOfficer": true,
          "isTenPercentOwner": false,
          "isOther": false,
          "officerTitle": "Chief Financial Officer"
        }
      }
    ],
    "transactions": [
      {
        "securityTitle": "Ordinary Shares",
        "transactionDate": "2025-12-08",
        "transactionCode": "S",
        "shares": 60000,
        "pricePerShare": 13.2,
        "acquiredDisposed": "D",
        "sharesOwnedFollowing": 89218,
        "directIndirect": "D",
        "equitySwapInvolved": false,
        "is10b51Plan": true,
        "plan10b51AdoptionDate": "2025-03-13",
        "footnotes": ["F1"]
      },
      //...
    ]
  }
}

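Interpretation: in the transaction shown, the CFO sold 60,000 ordinary shares at $13.20 per share on December 8, 2025, under a 10b5-1 plan adopted on March 13, 2025, and directly held 89,218 shares afterward.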

This JSON is emitted directly by the parser and is designed to be consumed as-is by downstream services.

I built this library to return exactly the data I need, in the simplest form possible. In the RxDataLab research platform, it allows filings to be automatically tagged with meaningful attributes, enabling quick “what just happened?” views across companies:

Screenshot from the RxDataLab application: the recent filings view, tagged with sale/purchase/exercise based on the filing content.

Roadmap

This is not intended to be a general-purpose or fully comprehensive SEC parsing library. I will continue to add support for additional forms as time allows and as they become relevant to my work. The focus will remain on filings that matter most for public and private biotech companies.

Next on the list are likely Form D private placement filings, followed by 10-Q, 10-K, and 13F filings.