Changelog

All notable changes to markymark are documented here. Each release links to the full diff on GitHub.

Completes the extraction pipeline migration started in v0.5.0. All remaining symbol types — code spans, XML tags, tasks, embeds, callouts, block refs, link definitions, query blocks, and properties — are now extracted through the full Zig→FFI→Rust→LSP/MCP stack. The legacy regex-based Rust extractors have been removed; zig-kernels is now mandatory. The RealmIndex has been rebuilt as v2 with string interning and incremental document updates.

Note: Agents like Claude Code use both LSP and MCP simultaneously — they are complementary protocols, not alternative paths. This release completes MCP parity with LSP for all extracted element types.

  • Complete extraction pipeline (ix3) — 9 new element types extracted by the Zig md4c ExtractionRenderer, serialized in blob v2 format (128-byte header with backward compatibility), and wired through from_blob, from_scan, ScanBackend, and LSP/MCP
  • RealmIndex v2 (n7wx) — String interning via lasso deduplicates URI and heading allocations; new update_document() computes contribution diffs incrementally; O(1) stem_to_uris index for wiki link resolution; lazy tag_to_docs built on first access
  • Blob v2 header — 128-byte header with v1/v2 backward compatibility
  • MCP batch indexing migrated from from_ast to from_scan
  • Preserve frontmatter and mask it before engine/scan parsing
  • Pick earliest frontmatter close delimiter in mixed-ending files
  • Property scan past non-property lines and CRLF frontmatter handling
  • Correct offset recovery for code spans and tasks
  • Windows: append .exe to binary path and download URL
  • VSCode: return bare name for PATH fallback on unsupported platforms
  • zig-kernels is now mandatory — regex extractors have been removed
  • Zig 0.15.2+ required for all builds (enforced by build.rs)
  • scanner.rs split into 4 submodules
  • from_blob.rs converted to module directory with header.rs, owned.rs, and 4 test submodules
  • document.zig split into helpers, free functions, stored types, and FFI types
  • extraction_renderer_tests.zig split into 4 thematic test files
  • extract.rs converted to submodule directory with frontmatter, tasks, blocks, links, and tags
  • Zig 0.15.2 archive corruption fix — archives repacked with system ar on Linux targets
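
The string interning mentioned for RealmIndex v2 can be illustrated with a minimal, std-only sketch. The release itself uses the lasso crate; the `Interner` and `Symbol` types below are hypothetical stand-ins, not markymark's actual types, and only show why interning deduplicates repeated URI and heading strings.

```rust
use std::collections::HashMap;
use std::convert::TryFrom;

/// Hypothetical small handle that stands in for an interned string.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Symbol(u32);

/// Hypothetical std-only interner (the real index uses the lasso crate).
#[derive(Default)]
struct Interner {
    lookup: HashMap<String, Symbol>,
    strings: Vec<String>,
}

impl Interner {
    /// Returns the existing symbol for `s`, or stores `s` and returns a new one.
    fn get_or_intern(&mut self, s: &str) -> Symbol {
        if let Some(&sym) = self.lookup.get(s) {
            return sym;
        }
        let id = u32::try_from(self.strings.len()).expect("too many interned strings");
        let sym = Symbol(id);
        // This sketch keeps two copies of each string for simplicity;
        // a production interner avoids the duplicate.
        self.strings.push(s.to_owned());
        self.lookup.insert(s.to_owned(), sym);
        sym
    }

    /// Maps a symbol back to the string it was created from.
    fn resolve(&self, sym: Symbol) -> &str {
        &self.strings[sym.0 as usize]
    }
}
```

Every document that refers to the same URI then holds the same 4-byte `Symbol` instead of its own heap-allocated copy, and equality checks become integer comparisons.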

Full diff: v0.5.1…v0.6.0


Focused improvements to the Zig md4c parsing pipeline and Rust error handling.

  • O(n) autolink paren trimming — the GFM autolink parenthesis-balancing loop in the Zig md4c port was quadratic; rewritten as a single-pass O(n) scan
  • BlobError now implements Display and Error traits, enabling Box<dyn Error>, ? operator, and anyhow integration with per-variant diagnostic messages
  • Extract map_md4c_heading / map_md4c_link helpers in Md4cScanBackend
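
As a rough illustration of the single-pass paren trimming, the sketch below follows the GFM extended-autolink rule: count all parentheses in the candidate, then drop trailing `)` characters only while closers outnumber openers. It is written in Rust for readability; the actual fix lives in the Zig md4c port, and the function name `trim_unbalanced_parens` is hypothetical.

```rust
/// Trims unmatched trailing ')' from an autolink candidate.
/// One counting pass plus one trailing scan, so the work is linear in the length.
fn trim_unbalanced_parens(link: &str) -> &str {
    let mut opens = 0usize;
    let mut closes = 0usize;
    for b in link.bytes() {
        match b {
            b'(' => opens += 1,
            b')' => closes += 1,
            _ => {}
        }
    }

    let bytes = link.as_bytes();
    let mut end = bytes.len();
    // Only ASCII ')' bytes are removed, so `end` stays on a char boundary.
    while closes > opens && end > 0 && bytes[end - 1] == b')' {
        end -= 1;
        closes -= 1;
    }
    &link[..end]
}
```

A quadratic variant would recount the parentheses after each trimmed character; keeping running counters avoids that rescan.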

Full diff: v0.5.0…v0.5.1


Replaces the tree-sitter incremental indexing pipeline with a new md4c-based DocumentEngine. The new pipeline vendors Bun’s md4c Zig parser for single-pass markdown extraction, serializes results to a compact binary blob format, and crosses the FFI boundary into Rust — eliminating double-parse overhead.

  • New md4c parsing pipeline — Vendored Bun’s md4c CommonMark parser (Zig), streaming ExtractionRenderer extracts headings, links, tags, and block IDs in a single pass, exposed through C ABI with Rust FFI bindings
  • DocumentEngine with blob serialization — Zig DocumentEngine produces a compact binary blob; Rust-side DocumentIndex::from_blob() deserializes without re-parsing
  • LSP pipeline overhaul: scan_all replaces tree-sitter incremental indexing, eliminating the previous double-parse (tree-sitter parse + separate extraction pass)
  • Async debounce: did_change notifications debounced at 75ms with generation counters preventing stale batches after close/reopen cycles
  • Removed undefined behavior in DocumentIndex arena_ref mutex escape
  • Removed unsound Sync impl on DocumentIndex
  • Fixed extractFromMarkdown double-free and heading text leak on OOM
  • Fixed toOwnedSlice cascade leak and append double-frees in parseAll
  • Fixed normalizeLabel memory leak in vendored md4c parser
  • Wiki link alias detection: compare text != target
  • Slug truncation no longer returns empty string
  • Tree-sitter incremental indexing replaced by md4c-based DocumentEngine
  • New document processing pipeline — plugins relying on tree-sitter internals will need updates
  • Release process formalized in RELEASING.md with 7-crate publish order
  • prepare-release skill added for guided 4-phase release workflow
  • Golden blob roundtrip test catches unilateral Zig/Rust format drift
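
The generation-counter idea behind the debounce can be sketched without an async runtime. The `Generations` type below is hypothetical and assumes one counter per document URI: each did_change (and each close or reopen) bumps the counter, and a delayed batch is applied only if the generation it captured is still current.

```rust
use std::collections::HashMap;

/// Hypothetical per-document generation tracker.
#[derive(Default)]
struct Generations {
    by_uri: HashMap<String, u64>,
}

impl Generations {
    /// Call on every did_change, close, or reopen; returns the new generation.
    fn bump(&mut self, uri: &str) -> u64 {
        let generation = self.by_uri.entry(uri.to_owned()).or_insert(0);
        *generation += 1;
        *generation
    }

    /// A debounced batch should run only if its captured generation is still the latest.
    fn is_current(&self, uri: &str, generation: u64) -> bool {
        self.by_uri.get(uri) == Some(&generation)
    }
}
```

In a server, the handler would capture the value returned by `bump`, wait out the debounce interval, and then drop the batch if `is_current` returns false.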

Full diff: v0.4.2…v0.5.0


  • get-diagnostics MCP tool — new MCP tool with file:// URI validation and structured doc support
  • Include new entries from large insertions in incremental merge
  • Correct assertion message in realm_stats preview check
  • Create parent dirs in TempWorkspace::write for nested paths
  • Eliminate eager allocation of heading names in SearchSymbols
  • Split incremental/tests.rs into submodules
  • Split runtime_engine_tests.rs into 9 submodules
  • Move realm.rs to realm/ module directory with types.rs, helpers.rs, and tests.rs
  • Extract shared TempWorkspace test helper and migrate across test suites
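
The parent-directory fix for nested paths amounts to calling `create_dir_all` before writing. The sketch below assumes a simplified `TempWorkspace` with a single `root` field; the real test helper's fields and constructor are not shown in this changelog and may differ.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical, simplified stand-in for the shared test helper.
struct TempWorkspace {
    root: PathBuf,
}

impl TempWorkspace {
    /// Creates the workspace root under the system temp directory.
    fn new(name: &str) -> io::Result<Self> {
        let root = std::env::temp_dir().join(name);
        fs::create_dir_all(&root)?;
        Ok(TempWorkspace { root })
    }

    /// Writes `contents` to `rel`, creating any missing parent directories first.
    fn write(&self, rel: impl AsRef<Path>, contents: &str) -> io::Result<PathBuf> {
        let path = self.root.join(rel);
        if let Some(parent) = path.parent() {
            fs::create_dir_all(parent)?;
        }
        fs::write(&path, contents)?;
        Ok(path)
    }
}
```

Without the `create_dir_all` call, writing to a path such as `notes/deep/a.md` fails when `notes/deep` does not exist yet.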

Full diff: v0.4.1…v0.4.2


  • LSP debounce: did_change notifications debounced with 75ms async cancellation
  • Detect edits in large gaps between extractor entries during incremental indexing
  • Deduplicate link edges per document in graph analysis
  • Improve markdown link resolution with path-relative lookup
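
Path-relative lookup can be pictured as resolving a link target against the directory of the document that contains it. The helper below is a hypothetical sketch that works on path components only (no filesystem access) and is not markymark's actual resolver.

```rust
use std::path::{Component, Path, PathBuf};

/// Resolves `target` relative to the directory containing `source_doc`,
/// collapsing `.` and `..` components lexically.
fn resolve_relative(source_doc: &Path, target: &str) -> PathBuf {
    let base = source_doc.parent().unwrap_or(Path::new(""));
    let mut resolved = PathBuf::new();
    for component in base.join(target).components() {
        match component {
            Component::ParentDir => {
                resolved.pop();
            }
            Component::CurDir => {}
            other => resolved.push(other.as_os_str()),
        }
    }
    resolved
}
```

For a document at `notes/a/b.md`, the link `../c.md` resolves to `notes/c.md`, which can then be looked up in the index instead of matching on the bare file name.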

Full diff: v0.4.0…v0.4.1


Major release introducing Zig SIMD acceleration kernels, MCP intelligence tools, incremental indexing, and a VSCode extension.

Note: Agents like Claude Code use both LSP and MCP simultaneously — they are complementary protocols, not alternative paths. This release adds MCP tools that pair with the existing LSP capabilities.

  • Zig SIMD kernels — Format scanners (env_scan, ini_scan, toml_scan, yaml_scan, json_keys) with SIMD-accelerated key extraction; link graph engine; batched fuzzy match; Aho-Corasick multi-pattern scanner; slug generation via C ABI
  • MCP tools: search-workspace (full-text search with frontmatter/property/tag filtering), search-for-pattern (regex search with glob filtering and context lines), graph-analysis (link graph intelligence with orphans, hubs, broken links, clusters)
  • Incremental indexing (Phase 3) — All 5 extractors carry byte offsets for selective merge; range intersection with neighbor window and tail-boundary guard; benchmarked 1.23x speedup
  • VSCode extension — Marketplace-ready extension with binary discovery and LSP client
  • Implement ___chkstk_ms for Windows x86_64 to resolve stack frame issues
  • CRLF offset drift fix in incremental indexing
  • Insertion-point boundary correction
  • New feature-gated Zig dependency (zig-kernels feature flag)
  • Split state.rs (1170 lines) into state/{mod,completion,navigation,rename}.rs
  • Renamed runtime_engine to engine with submodules
  • Extracted tool handlers into tools/ submodule
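
The selective merge in incremental indexing depends on knowing which extracted entries an edit can affect. The sketch below shows one plausible reading of "range intersection with neighbor window": widen the edited byte range by a window and keep the entries whose half-open ranges overlap it. The `Entry` type, the function name, and the window parameter are assumptions for illustration; the tail-boundary guard is not modeled.

```rust
/// Hypothetical extractor entry carrying a half-open byte range [start, end).
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    start: usize,
    end: usize,
}

/// Returns the indices of entries that overlap the edit range once it is
/// widened by `window` bytes on each side.
fn entries_to_reextract(
    entries: &[Entry],
    edit_start: usize,
    edit_end: usize,
    window: usize,
) -> Vec<usize> {
    let lo = edit_start.saturating_sub(window);
    let hi = edit_end.saturating_add(window);
    entries
        .iter()
        .enumerate()
        .filter(|&(_, entry)| entry.start < hi && lo < entry.end)
        .map(|(index, _)| index)
        .collect()
}
```

Entries outside the widened range keep their previous results (after offset adjustment), which is where the speedup over a full re-extraction would come from.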

Full diff: v0.3.0…v0.4.0


Introduces the Zig SIMD kernel foundation and incremental tree-sitter parsing.

  • Zig kernel scaffold: zig/ directory with build.zig, source structure, and markymark-kernels Rust crate with FFI bridge
  • SIMD kernels: heading_scan, link_scan, tag_scan, block_scan, token_estimate, content_hash implemented in Zig with Rust FFI wrappers
  • Shared BRZA kernels — Similarity, normalize, entities, quantize, and embedding index kernels ported with Rust FFI wrappers
  • Incremental tree-sitter parsing: MarkdownTree stored per document, TextDocumentSyncKind::INCREMENTAL enabled, incremental parsing wired end-to-end
  • Core traits: ScanBackend and EmbeddingProvider traits added to markymark-core
  • Enable PIC in Zig static library for Linux x86_64 linking
  • Defensive FFI hardening: replace as u32 casts with try_from at FFI boundary
  • Initialize written parameter in scan FFI functions
  • Skip invalid incremental edits when old_end < start
  • First Zig dependency introduced (optional via zig-kernels feature flag)
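
The FFI hardening item replaces silent truncation with a checked conversion. The sketch below contrasts the two using a hypothetical `ffi_len` helper; the actual call sites in markymark are not shown here.

```rust
use std::convert::TryFrom;
use std::num::TryFromIntError;

/// Converts a length to the u32 expected across the FFI boundary,
/// returning an error instead of silently wrapping.
fn ffi_len(len: u64) -> Result<u32, TryFromIntError> {
    u32::try_from(len)
}

/// What an `as u32` cast does with the same input: it keeps only the low 32 bits.
fn truncating_len(len: u64) -> u32 {
    len as u32
}
```

With an input one past `u32::MAX`, `truncating_len` yields 0 while `ffi_len` returns an error the caller can surface before any buffer is handed to the Zig side.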

Full diff: v0.2.0…v0.3.0


Introduces arena allocation, multi-format structured document support, and security hardening.

  • Arena allocation: bumpalo-based arena infrastructure in markymark-core, migrated through parser and index layers with DocumentArena and ArenaHashMap
  • Multi-format support — JSON, JSONC, JSON5, JSONL, YAML, TOML, .env, and INI parsers with StructuredDocumentIndex, AnyDocumentIndex, and RealmIndex integration
  • LSP multi-format: DocumentSymbols, hover, and find-references for structured documents
  • Tree-sitter migration — Upgraded from tree-sitter 0.19 to 0.26 with tree-sitter-md wrapper
  • Security — Advisory security workflow with SARIF uploads, lefthook pre-commit hooks, custom semgrep rules with fixture validation
  • Plugin distribution: marketplace.json for self-hosted plugin distribution
  • Resolve frontmatter_and_properties SIGSEGV
  • Preserve duplicate block IDs across documents
  • Ignore XML-like syntax inside fenced code blocks
  • BlockEntry range propagation for go-to-definition
  • Optimize remove_from_cross_doc_indexes to O(doc size)
  • Real corpus benchmarks added
  • Split types.rs into submodules below 1000 LOC
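
Ignoring XML-like syntax inside fenced code blocks comes down to tracking fence state while scanning. The sketch below is a deliberately simplified, line-based illustration: it toggles on any line starting with three backticks or tildes and does not match fence character or length as CommonMark requires. The function name is hypothetical.

```rust
/// Returns lines that start with '<' and sit outside fenced code blocks.
/// Simplification: any fence line toggles the state, regardless of its
/// marker character or length.
fn xml_like_lines_outside_fences(text: &str) -> Vec<&str> {
    let mut in_fence = false;
    let mut found = Vec::new();
    for line in text.lines() {
        let trimmed = line.trim_start();
        if trimmed.starts_with("```") || trimmed.starts_with("~~~") {
            in_fence = !in_fence;
            continue;
        }
        if !in_fence && trimmed.starts_with('<') {
            found.push(line);
        }
    }
    found
}
```

Tags that appear only inside a code sample are skipped, so they no longer show up as document symbols.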

Full diff: v0.1.0-alpha.2…v0.2.0


Second alpha release focused on plugin distribution improvements.

  • CI per-platform pre-packaging for plugin binary distribution
  • Download-on-first-run fallback for marketplace installs

Full diff: v0.1.0-alpha.1…v0.1.0-alpha.2


Initial alpha release of markymark — a high-performance Markdown LSP server built in Rust. Includes core LSP capabilities: document symbols, go-to-definition, find-references, rename, hover, completion, and diagnostics for Markdown files with wiki-link support.

Full changelog: v0.1.0-alpha.1